Genotype imputation software tools | Genome-wide association study data analysis
Genotype imputation has been widely adopted in the post-genome-wide association study (GWAS) era. Because it can accurately predict the genotypes of untyped variants, imputation greatly boosts variant density, enabling fine-mapping of GWAS loci and large-scale meta-analysis across different genotyping arrays.
Imputes genotypes into phased haplotypes. Minimac is an application that can handle large reference panels with hundreds or thousands of haplotypes. It is based on MaCH, an algorithm for genotype imputation, and supports imputation of genotypes on the X chromosome. It relies on a two-step approach: (i) the samples to be analyzed are first phased into a series of estimated haplotypes, and (ii) imputation is then carried out directly into these phased haplotypes.
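The second step of the two-step approach described above can be illustrated with a toy sketch. This is not Minimac's or MaCH's actual algorithm, which is model-based; it is a deliberately simplified nearest-haplotype stand-in, assuming the study haplotype is already phased and that alleles are coded 0/1. The function name `impute_haplotype` and the small panel are hypothetical.

```python
from typing import Dict, List, Sequence


def impute_haplotype(typed: Dict[int, int],
                     reference: Sequence[Sequence[int]]) -> List[int]:
    """Fill untyped sites of one phased haplotype (toy illustration).

    typed     -- marker index -> observed allele (0/1) on the study haplotype
    reference -- reference haplotypes covering every marker
    """
    if not reference:
        raise ValueError("reference panel is empty")

    def mismatches(ref_hap: Sequence[int]) -> int:
        # Count disagreements at the markers that were actually typed.
        return sum(ref_hap[pos] != allele for pos, allele in typed.items())

    # Pick the closest reference haplotype (ties go to the first in the panel).
    best = min(reference, key=mismatches)

    # Keep observed alleles; copy the rest from the chosen reference haplotype.
    return [typed.get(pos, best[pos]) for pos in range(len(best))]


# Hypothetical data: three reference haplotypes over five markers,
# and a study haplotype typed only at markers 0, 2 and 4.
reference_panel = [
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 0],
]
study_haplotype = {0: 1, 2: 0, 4: 0}
imputed = impute_haplotype(study_haplotype, reference_panel)
print(imputed)
```

Because the input is already a haplotype rather than an unphased genotype, each imputation only has to compare one sequence against the panel, which is the practical appeal of separating phasing from imputation.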
A program for the analysis of single-SNP association in genome-wide studies. Its features include (1) tests for binary (case-control) phenotypes and for single and multiple quantitative phenotypes, (2) Bayesian and frequentist tests, (3) the ability to condition on an arbitrary set of covariates and/or SNPs, and (4) several methods for dealing with imputed SNPs.
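As a rough illustration of what a frequentist single-SNP test on a binary phenotype involves, the sketch below runs a 1-degree-of-freedom allelic chi-square test on a 2x2 table of allele counts. It is a generic textbook test, not the program's own implementation; the function name `allelic_chi_square`, the 0/1/2 genotype coding, and the example data are assumptions made for this example.

```python
import math
from typing import Sequence, Tuple


def allelic_chi_square(case_genotypes: Sequence[int],
                       control_genotypes: Sequence[int]) -> Tuple[float, float]:
    """Allelic case-control test for one SNP.

    Genotypes are coded as the number of alternate alleles (0, 1 or 2).
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    a = sum(case_genotypes)                  # alternate alleles in cases
    b = 2 * len(case_genotypes) - a          # reference alleles in cases
    c = sum(control_genotypes)               # alternate alleles in controls
    d = 2 * len(control_genotypes) - c       # reference alleles in controls

    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("SNP is monomorphic or a group is empty")

    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotypes for eight cases and eight controls.
cases = [2, 1, 1, 2, 1, 0, 2, 1]
controls = [0, 0, 1, 0, 1, 0, 0, 1]
chi2, p_value = allelic_chi_square(cases, controls)
print(f"chi2={chi2:.3f}, p={p_value:.4f}")
```

This sketch takes hard genotype calls and ignores covariates; handling imputed SNPs would additionally require accounting for genotype uncertainty, which is outside what it shows.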
A computer program for phasing observed genotypes and imputing missing genotypes. IMPUTE increases accuracy and combines information across multiple reference panels while remaining computationally feasible. IMPUTE v2 attains higher accuracy than other methods when HapMap provides the sole reference panel, although the size of the panel constrains the improvements that can be made.
A statistical model for patterns of genetic variation in samples of unrelated individuals from natural populations. fastPHASE is based on the idea that, over short regions, haplotypes in a population tend to cluster into groups of similar haplotypes. For imputing missing genotypes, methods based on this model are as accurate as or more accurate than existing methods. For haplotype estimation, the point estimates are slightly less accurate than those from the best existing methods but require a small fraction of the computational cost.
Performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection. Beagle can be applied to thousands of samples across genome-wide single nucleotide polymorphism (SNP) data. It can retrieve short tracts of identity by descent (IBD). This tool utilizes composite reference haplotypes to model large genomic regions with a parsimonious statistical model.
A free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.
A method for the analysis of dense genetic maps in pedigree data that provides extremely fast solutions to common problems such as allele-sharing analyses and haplotyping. Merlin is a computer program that uses sparse inheritance trees for pedigree analysis. It performs rapid haplotyping, genotype error detection and affected pair linkage analyses and can handle more markers than other pedigree analysis packages.