Computational protocol: Current software for genotype imputation

[…] All three programs considered here use a hidden Markov model (HMM) to predict the missing genotypes of SNP markers. BEAGLE uses a localised haplotype-cluster model []. In this model, the reference haplotypes are grouped into clusters at each SNP. Because the clustering is local, the number of clusters, and hence the model complexity, can differ from one location to another. IMPUTE and MACH implement variants of the 'product of approximate conditionals' (PAC) model []. The performance of both programs with regard to precision of prediction (accuracy) and efficacy is indistinguishable for populations that are well represented by HapMap. There are, however, subtle differences between the algorithms. For example, IMPUTE relies on user-specified recombination rates, whereas MACH estimates them as part of the algorithm. Although IMPUTE's approach may save computation time, it makes the program sensitive to model misspecification []. This may be an important issue when imputation is carried out for populations that are less well represented by HapMap. The three programs also differ in the methods used to infer haplotype phase and/or to model recombination and mutation events. An insightful review of the underlying algorithms has been published recently []. […]
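To make the HMM idea concrete, the following is a minimal sketch of a PAC-style copying model, not the implementation used by IMPUTE, MACH or BEAGLE. It assumes a single phased (haploid) target sequence, biallelic markers coded 0/1 with -1 for missing, a uniform prior over reference haplotypes, and two fixed parameters that are hypothetical names chosen for illustration: `switch` (per-interval probability of jumping to a random reference haplotype, standing in for a user-specified recombination rate) and `err` (probability that the observed allele differs from the copied one, standing in for mutation). The function name `impute_haploid` is likewise an assumption.

```python
import numpy as np


def impute_haploid(ref, target, switch=0.05, err=0.01):
    """Posterior probability of allele 1 at each marker of `target`.

    ref    : (K, M) array of 0/1 alleles, K reference haplotypes, M markers
    target : length-M array of 0/1 alleles, with -1 marking missing markers
    Observed markers are returned unchanged (as 0.0 or 1.0).
    """
    ref = np.asarray(ref)
    target = np.asarray(target)
    K, M = ref.shape
    missing = target < 0

    # emit[k, m] = P(observed allele at m | target copies haplotype k at m)
    emit = np.where(ref == target[None, :], 1.0 - err, err)
    emit[:, missing] = 1.0  # a missing marker carries no information

    def step(v):
        # Transition: stay on the same haplotype with probability 1 - switch,
        # otherwise jump to one of the K haplotypes chosen uniformly.
        # The transition matrix is symmetric, so the same function serves
        # the forward and the backward recursion.
        return (1.0 - switch) * v + switch * v.sum() / K

    # Forward pass, normalised at each marker to avoid underflow.
    fwd = np.empty((K, M))
    fwd[:, 0] = emit[:, 0] / K
    fwd[:, 0] /= fwd[:, 0].sum()
    for m in range(1, M):
        fwd[:, m] = step(fwd[:, m - 1]) * emit[:, m]
        fwd[:, m] /= fwd[:, m].sum()

    # Backward pass, also normalised at each marker.
    bwd = np.empty((K, M))
    bwd[:, M - 1] = 1.0
    for m in range(M - 2, -1, -1):
        bwd[:, m] = step(bwd[:, m + 1] * emit[:, m + 1])
        bwd[:, m] /= bwd[:, m].sum()

    # Posterior over which reference haplotype is being copied at each marker.
    post = fwd * bwd
    post /= post.sum(axis=0, keepdims=True)

    # Mix the copied alleles, allowing for the mismatch probability `err`.
    p_one = (post * np.where(ref == 1, 1.0 - err, err)).sum(axis=0)
    return np.where(missing, p_one, target.astype(float))


if __name__ == "__main__":
    reference = [[0, 0, 0, 0, 0],
                 [1, 1, 1, 1, 1]]
    observed = [1, 1, -1, 1, 1]  # the third marker is untyped
    print(impute_haploid(reference, observed))
```

In this toy example the flanking alleles match the second reference haplotype, so the missing marker is imputed as allele 1 with high probability. Fixing `switch` in advance mirrors the user-specified recombination rates described above; estimating it from the data instead would require an additional fitting loop around this function, which the sketch omits.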

