

Estimates haplotype phase either within a genotyped cohort or using a phased reference panel. Eagle2 is a phasing algorithm that attains high accuracy across a broad range of cohort sizes by efficiently leveraging information from large external reference panels, using a new data structure based on the positional Burrows-Wheeler transform. Eagle2 differs from other hidden Markov model-based (HMM-based) methods in two key ways: (i) it represents the full haplotype structure in a way that losslessly condenses locally matching haplotypes, and (ii) it selectively explores the space of diplotypes so that computation is spent only on the most likely phase paths.
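As an illustration of the positional Burrows-Wheeler transform mentioned above, the following minimal sketch computes the positional prefix ordering of a small binary haplotype panel. It shows only the generic ordering step of the transform, not Eagle2's own data structure or code; the function name `pbwt_orderings` and the toy panel are assumptions made for this example.

```python
def pbwt_orderings(haplotypes):
    """Return the haplotype ordering after each site.

    `haplotypes` is a list of equal-length lists of 0/1 alleles.
    After site k, haplotypes are sorted by their reversed prefixes
    ending at k, so haplotypes that match over a long stretch ending
    at k end up adjacent in the ordering.
    """
    order = list(range(len(haplotypes)))
    orderings = []
    n_sites = len(haplotypes[0]) if haplotypes else 0
    for k in range(n_sites):
        zeros, ones = [], []
        # Stable partition by the allele at site k: this keeps the
        # previous ordering within each allele class.
        for idx in order:
            (zeros if haplotypes[idx][k] == 0 else ones).append(idx)
        order = zeros + ones
        orderings.append(list(order))
    return orderings


# Hypothetical toy panel: 4 haplotypes, 3 sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
orderings = pbwt_orderings(panel)
```

Each pass over a site is linear in the number of haplotypes, which is what makes orderings of this kind practical for large reference panels.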

HARSH / HAplotype inference using Reference and Sequencing tecHnology

An efficient method that combines multi-SNP read information with reference panels of haplotypes for improved genotype and haplotype inference in sequencing data. Unlike previous phasing methods that use read counts at each SNP as input, HARSH takes into account the information from reads spanning multiple SNPs. HARSH efficiently finds the likely haplotypes in terms of the marginal probability over the genotype data. In simulations based on HapMap and 1000 Genomes data, the method achieves higher accuracy than existing approaches with decreased computational requirements.


Reconstructs local tree topologies for a set of population single-nucleotide polymorphism (SNP) haplotypes undergoing recombination. Due to recombination, tree topologies change as one moves across the genome. The main idea of this tool is to jointly refine a set of local trees at the SNP sites by several justifiable rules. RENT+ extends the previous program RENT and uses a novel search method to infer the local trees, one for each genomic region near a SNP site. The key benefit of using RENT+ is that it allows the inference to use the joint information contained in multiple nearby SNPs (i.e. linkage disequilibrium).


A fast haplotype phasing algorithm based on scalable sliding windows and the parsimony principle, which runs at a speed similar to that of 2SNP while achieving much higher accuracy. In the first step, the initial haplotypes of each individual genotype in the dataset are obtained with a simplified 2SNP method. In the second step, the haplotypes are improved within scalable sliding windows in which one type of haplotype pair forms the majority. A scalable sliding window is composed of consecutive SNPs, which may be heterozygous, homozygous or missing. In the final step, the number of haplotypes is iteratively reduced, following the parsimony principle, by allowing at most one recombination between the two haplotypes of each genotype.

GERBIL / GEnotype Resolution and Block Identification using Likelihood

An algorithm for haplotype resolution and block partitioning. The algorithm uses a stochastic model for genotype generation, based on the biological finding that genotypes can be partitioned into blocks of low recombination rate, and in each block, a small number of common haplotypes is found. Our model uses the notion of a probabilistic common haplotype, which can have different forms in different genotypes, thereby accommodating errors, rare recombination events, and mutations. GERBIL was shown to be quick and accurate even when applied to many hundreds of individuals.

ISHAPE / Iterative Segmented HAPlotyping by Em

Allows a rapid and accurate inference of haplotypes from population genotype data. ISHAPE first uses a rapid bootstrap step of iterative EM (IEM) haplotyping in randomly chosen samples of the population in order to delimit the space of haplotype pairs that will be assigned to the population. It then uses a Phase 2.1-like algorithm to precisely infer haplotypes in the population, within the limits of the previously defined haplotype space. The first bootstrap-IEM step is very rapid, while the second Phase 2.1-like step is also quite rapid since it works on a small working space of haplotypes.
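To make the EM haplotyping idea concrete, here is a minimal sketch of a plain EM estimate of haplotype frequencies from unphased genotypes. It is not the ISHAPE implementation: it omits the bootstrap sampling and the Phase 2.1-like second step, and the function names, the 0/1/2 genotype coding and the toy data are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with a genotype.

    Genotypes are coded per SNP as 0 or 2 (homozygous) and 1 (heterozygous).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # 0 -> 0, 2 -> 1; het sites set below
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    pairs_per_genotype = [compatible_pairs(g) for g in genotypes]
    haps = {h for pairs in pairs_per_genotype for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pairs_per_genotype:
            # E-step: posterior weight of each compatible pair.
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts over 2n chromosomes.
        freq = {h: counts[h] / (2 * len(genotypes)) for h in haps}
    return freq


# Hypothetical toy data: 4 individuals, 2 SNPs.
freq = em_haplotype_frequencies([(0, 0), (0, 0), (2, 2), (1, 1)])
```

In this toy data set the doubly heterozygous individual is resolved in favor of the haplotypes already seen in the homozygous individuals. The enumeration grows exponentially with the number of heterozygous sites, which is one reason practical tools restrict the haplotype space they consider.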


Infers haplotypes for multiple populations using a hierarchical Dirichlet process mixture. Haploi is a Bayesian approach that incorporates a hierarchical Dirichlet process (HDP) prior which couples multiple heterogeneous populations. It also facilitates sharing of mixture components (i.e., haplotype founders) across multiple Dirichlet process mixtures. This method can infer the true haplotypes in a multi-subpopulation dataset with an accuracy superior to that of state-of-the-art haplotype inference algorithms.