
HapCUT

Allows haplotype assembly for diverse sequencing technologies. HapCUT is a maximum-likelihood-based algorithm that can assemble haplotypes from a diverse array of data modalities, whereas other tools are specialized for certain subsets of them. The software implements an iterative approach for modeling and estimating h-trans error probabilities de novo, which reduces errors in assembled Hi-C haplotypes. It was assessed using data from fosmid-based dilution pool sequencing, 10X Genomics linked-read sequencing, single molecule real-time (SMRT) sequencing, and proximity ligation sequencing.

AltHap

Formulates haplotype assembly as sparse tensor decomposition. AltHap is a haplotype assembly method for diploid, polyploid, and polyallelic polyploid organisms. The framework exploits structural properties of the problem to find tensor factors efficiently and thus assembles haplotypes iteratively, alternating between two computationally tractable optimization tasks. It can reconstruct haplotypes with polyallelic sites, making it useful in a number of applications involving plant genomes.

phASER / phasing and Allele Specific Expression from RNA-seq

A fast and accurate approach for phasing variants that are overlapped by sequencing reads, including those from RNA-sequencing (RNA-seq), which often span multiple exons due to splicing. phASER provides 1) dramatically more accurate phasing of rare and de novo variants compared to population-based phasing; 2) phasing of variants in the same gene up to hundreds of kilobases away which cannot be obtained from DNA-sequencing reads; 3) high confidence measures of haplotypic expression, greatly improving power for allelic expression studies.

HapCol

A fast and memory-efficient method for haplotype assembly from long gapless reads. HapCol implements a fixed-parameter algorithm for the k-constrained Minimum Error Correction problem (k-cMEC), a variant of the well-known MEC problem in which the maximum number of corrections per column is bounded by an integer k. HapCol is as accurate as other exact state-of-the-art combinatorial approaches while being significantly faster and more memory-efficient. Moreover, HapCol can process datasets that combine long reads (over 100,000 bp) with coverages up to 25x on standard workstations or small servers, whereas the other approaches cannot handle long reads or coverages greater than 20x.
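The MEC objective and the per-column bound k described above can be illustrated with a minimal Python sketch. It only scores a given candidate haplotype and is not HapCol's fixed-parameter algorithm. It assumes a diploid sample with all sites heterozygous (so the second haplotype is the complement of the first), reads encoded as strings over `0`, `1`, and `-` for uncovered sites, and hypothetical function names chosen for illustration.

```python
from typing import List, Tuple

GAP = "-"  # assumed symbol for a site the read does not cover


def mismatches(read: str, hap: str) -> List[int]:
    """Column indices where the read disagrees with the haplotype (gaps ignored)."""
    return [j for j, (r, h) in enumerate(zip(read, hap)) if r != GAP and r != h]


def complement(hap: str) -> str:
    """Second haplotype of a diploid pair, assuming every site is heterozygous."""
    return "".join("1" if c == "0" else "0" for c in hap)


def mec_corrections(reads: List[str], hap: str) -> Tuple[int, List[int]]:
    """Return the total MEC cost and the number of corrections in each column.

    Each read is assigned to whichever haplotype it disagrees with least;
    the disagreeing entries are the corrections that would be needed.
    """
    other = complement(hap)
    per_column = [0] * len(hap)
    for read in reads:
        to_hap = mismatches(read, hap)
        to_other = mismatches(read, other)
        best = to_hap if len(to_hap) <= len(to_other) else to_other
        for j in best:
            per_column[j] += 1
    return sum(per_column), per_column


def satisfies_k_constraint(per_column: List[int], k: int) -> bool:
    """k-cMEC feasibility: no column needs more than k corrections."""
    return max(per_column, default=0) <= k


if __name__ == "__main__":
    reads = ["0011", "001-", "1100", "-101", "0111"]
    cost, per_column = mec_corrections(reads, "0011")
    print(cost, per_column, satisfies_k_constraint(per_column, 1))
```

In this toy input, two reads each need one correction, in different columns, so the candidate is feasible for k = 1 but not for k = 0.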

HapTree

A maximum-likelihood estimation framework for polyploid haplotype assembly of an individual genome using NGS read datasets. We evaluate the performance of HapTree on simulated polyploid sequencing read data modeled after Illumina sequencing technologies. For triploid and higher ploidy genomes, we demonstrate that HapTree substantially improves haplotype assembly accuracy and efficiency over the state-of-the-art; moreover, HapTree is the first scalable polyplotyping method for higher ploidy.

HARSH / HAplotype inference using Reference and Sequencing tecHnology

An efficient method that combines multi-SNP read information with reference panels of haplotypes for improved genotype and haplotype inference in sequencing data. Unlike previous phasing methods that use read counts at each SNP as input, our method takes into account the information from reads spanning multiple SNPs. HARSH efficiently finds the likely haplotypes in terms of the marginal probability over the genotype data. Using simulations from HapMap and 1000 Genomes data, we show that our method achieves higher accuracy than existing approaches with lower computational requirements.

SV-AUTOPILOT / Structural Variation AUTOmated PIpeLine Optimization Tool

Obsolete
Standardizes the Structural Variation (SV) detection pipeline. SV-AUTOPILOT is a pipeline that can be used on existing computing infrastructure in the form of a Virtual Machine (VM) image. It provides a “meta-tool” platform for running multiple SV tools, standardizing the benchmarking of those tools, and offering an easy, out-of-the-box SV detection program. In addition, the user can choose which of several alignment algorithms is used in the analysis.

HapEdit

An accuracy assessment tool for viewing haplotype assemblies produced from massively parallel sequencing technologies and for editing misassembled haplotypes. HapEdit offers a graphical user interface to navigate haplotype assemblies and helps the user fit the composition rates of the reads sequenced by the (up to) six different sequencing technologies to the ideal composition rates. As inputs, HapEdit currently takes reads from the Polonator, Illumina, SOLiD, 454 and Sanger sequencing technologies.

FastHap

A fast and accurate haplotype reconstruction approach that is up to one order of magnitude faster than state-of-the-art haplotype inference algorithms while also delivering higher accuracy. FastHap leverages a new similarity metric that allows us to precisely measure distances between pairs of fragments. These distances are then used to build fuzzy conflict graphs of fragments. Given that optimal haplotype reconstruction based on minimum error correction is known to be NP-hard, we use our fuzzy conflict graphs to develop a fast heuristic for fragment partitioning and haplotype reconstruction.
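The idea of a weighted conflict graph over fragments can be sketched as follows. The description does not define FastHap's similarity metric, so the distance used here (the fraction of disagreeing alleles among the sites two fragments both cover) is an assumed stand-in, and the function names and the `-` gap symbol are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

GAP = "-"  # assumed symbol for a site the fragment does not cover


def fragment_distance(a: str, b: str) -> Optional[float]:
    """Fraction of disagreeing alleles over sites covered by both fragments.

    Returns None when the fragments share no covered site, because no
    distance can be measured. This is an illustrative metric, not FastHap's.
    """
    shared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not shared:
        return None
    return sum(x != y for x, y in shared) / len(shared)


def weighted_conflict_graph(fragments: List[str]) -> Dict[Tuple[int, int], float]:
    """Edge weights between every pair of overlapping fragments.

    Weights near 1 suggest the fragments come from different haplotypes;
    weights near 0 suggest the same haplotype.
    """
    graph: Dict[Tuple[int, int], float] = {}
    for i, j in combinations(range(len(fragments)), 2):
        d = fragment_distance(fragments[i], fragments[j])
        if d is not None:
            graph[(i, j)] = d
    return graph


if __name__ == "__main__":
    fragments = ["0011", "001-", "1100", "--00"]
    for edge, weight in sorted(weighted_conflict_graph(fragments).items()):
        print(edge, weight)
```

A partitioning heuristic could then place fragments joined by high-weight edges on opposite sides of a bipartition, although how FastHap does this is not specified here.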

ProbHap

A software package for reconstructing parental haplotypes from long reads. ProbHap works best with very long reads at a relatively shallow coverage (<= 12X). The main algorithmic idea of ProbHap is a new dynamic programming algorithm that exactly optimizes a likelihood function specified by a probabilistic graphical model; this likelihood generalizes a popular objective called minimum error correction. In addition to being accurate, ProbHap also provides confidence scores at phased positions.
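To illustrate what a likelihood objective for haplotype assembly looks like, here is a minimal Python sketch. It is not ProbHap's graphical model or its dynamic program. It assumes a single uniform per-allele error rate, independent errors, an equal prior on which haplotype a read came from, and reads encoded over `0`, `1`, and `-`; all names are hypothetical.

```python
import math
from typing import List

GAP = "-"  # assumed symbol for a site the read does not cover


def read_likelihood(read: str, hap: str, error_rate: float) -> float:
    """P(read | haplotype) with independent per-allele errors."""
    p = 1.0
    for r, h in zip(read, hap):
        if r == GAP:
            continue  # uncovered sites carry no information
        p *= (1.0 - error_rate) if r == h else error_rate
    return p


def log_likelihood(reads: List[str], hap1: str, hap2: str,
                   error_rate: float = 0.01) -> float:
    """Log-likelihood of all reads given a haplotype pair.

    Each read is modeled as drawn from either haplotype with probability 1/2.
    error_rate must lie strictly between 0 and 1 to keep the log defined.
    """
    total = 0.0
    for read in reads:
        mixture = (0.5 * read_likelihood(read, hap1, error_rate)
                   + 0.5 * read_likelihood(read, hap2, error_rate))
        total += math.log(mixture)
    return total


if __name__ == "__main__":
    reads = ["0011", "001-", "1100", "-100"]
    print(log_likelihood(reads, "0011", "1100"))
    print(log_likelihood(reads, "0101", "1010"))
```

Under these assumptions, a haplotype pair that needs fewer corrected alleles receives a higher likelihood, which is the sense in which a likelihood objective relates to minimum error correction.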

HAPLOWSER / HAPLotype brOWSER

A browser for comparing haplotypes inferred from genome assemblies or metagenome assemblies. HAPLOWSER offers a convenient way to navigate haplotype sequences and functional annotations, which are kept synchronized. Along with zooming, a user can navigate any region of haplotypes and functional annotations at any resolution. Functional annotations and custom tracks that are projected onto haplotypes are saved as multiple files in FASTA format.