LASER / Locating Ancestry using SEquencing Reads

A program to estimate individual ancestry by directly analyzing shotgun sequence reads without calling genotypes. LASER uses principal components analysis (PCA) and Procrustes analysis to analyze sequence reads of each sample and place the sample into a reference PCA space constructed using genotypes of a set of reference individuals. With an appropriate reference panel, the estimated coordinates of the sequence samples reflect their ancestral background and can be used to correct for population stratification in association studies. LASER can accurately estimate ancestry even with modest amounts of data, such as the off-target sequence data generated by targeted sequencing experiments. LASER can be used to improve case-control matching in genetic association studies and to reduce the risk of spurious findings due to population structure.
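As a rough illustration of the Procrustes step described above, the sketch below fits a similarity transformation (rotation, scaling and translation) that maps coordinates from a sample-specific PCA onto a reference PCA space, then applies it to a new sample. It is not LASER's implementation: it assumes two-dimensional coordinates, does not allow reflections, and uses made-up coordinates. The function names `fit_procrustes_2d` and `apply_transform` are hypothetical.

```python
# Simplified 2D Procrustes sketch (rotation + scaling + translation, no reflection).
# Points are treated as complex numbers so the least-squares fit has a closed form.
# This is an illustrative assumption, not LASER's actual algorithm.

def fit_procrustes_2d(source, target):
    """Return complex (a, b) minimising sum |a*s + b - t|^2 over paired 2D points."""
    src = [complex(x, y) for x, y in source]
    tgt = [complex(x, y) for x, y in target]
    n = len(src)
    src_mean = sum(src) / n
    tgt_mean = sum(tgt) / n
    num = sum((t - tgt_mean) * (s - src_mean).conjugate() for s, t in zip(src, tgt))
    den = sum(abs(s - src_mean) ** 2 for s in src)
    a = num / den                  # rotation and scaling
    b = tgt_mean - a * src_mean    # translation
    return a, b

def apply_transform(point, a, b):
    """Map one 2D point with the fitted transformation."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Reference individuals in the reference PCA space (made-up coordinates).
reference_space = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# The same individuals in a sample-specific PCA: rotated 90 degrees, doubled, shifted.
sample_space = [(3.0, 1.0), (3.0, 3.0), (1.0, 1.0), (1.0, 3.0)]

# Fit on the shared reference individuals, then place the sequenced sample.
a, b = fit_procrustes_2d(sample_space, reference_space)
placed = apply_transform((2.0, 2.0), a, b)
```

The transformation is fitted only on the reference individuals, which appear in both coordinate systems, and is then reused to place the sequenced sample in the reference space.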


Ohana

Infers global ancestry and population covariances, and constructs population trees using Gaussian models. The Ohana tool suite includes the following innovations: (i) An optimization method for STRUCTURE-style admixture inference in a maximum likelihood estimation (MLE) framework. This method is applicable both to called genotypes and to next-generation sequencing (NGS) data in which the true genotypes are uncertain. It uses sequential quadratic programming (QP) based on the Active Set algorithm, and tends to find higher likelihood values than ADMIXTURE in similar computational time. (ii) A method for estimating population relationships from ancestry components using a Gaussian approximation. Ohana estimates the best covariance matrix compatible with a tree, thereby estimating the tree itself, and provides simple algorithms and visualization tools for obtaining the evolutionary trees.

ASAFE / Ancestry Specific Allele Frequency Estimation

Estimates ancestry-specific allele frequencies for bi-allelic markers. ASAFE uses an expectation-maximization (EM) algorithm to estimate ancestry-specific allele frequencies from data on a 3-way admixed population. Its major advantage over alternative ancestry-specific allele frequency estimation approaches is that it is applicable to markers in the admixed sample that are absent from a reference panel. Furthermore, ASAFE takes advantage of linkage-disequilibrium-based information by using local ancestry calls.
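To make the EM idea concrete, here is a minimal single-marker sketch. It assumes that the local ancestry of each individual's two haplotypes is known, and that for heterozygotes it is unknown which haplotype carries the "1" allele. The function name `em_ancestry_frequencies`, the ancestry labels and the toy data are hypothetical, and the code is a simplification rather than ASAFE's own algorithm.

```python
# Simplified single-marker EM sketch for ancestry-specific allele frequencies.
# Each individual: (ancestry of haplotype 1, ancestry of haplotype 2, count of allele "1").
# Assumption: ancestry calls are given; allele-to-haplotype assignment is unknown
# for heterozygotes whose two haplotypes have different ancestries.

def em_ancestry_frequencies(individuals, ancestries, n_iter=50, start=0.5):
    freq = {k: start for k in ancestries}
    for _ in range(n_iter):
        ones = {k: 0.0 for k in ancestries}    # expected "1" alleles per ancestry
        total = {k: 0.0 for k in ancestries}   # haplotypes per ancestry
        for a1, a2, genotype in individuals:
            total[a1] += 1
            total[a2] += 1
            if genotype == 2:
                ones[a1] += 1
                ones[a2] += 1
            elif genotype == 1:
                # E-step: probability that the "1" allele sits on haplotype 1.
                w1 = freq[a1] * (1 - freq[a2])
                w2 = freq[a2] * (1 - freq[a1])
                p1 = w1 / (w1 + w2) if (w1 + w2) > 0 else 0.5
                ones[a1] += p1
                ones[a2] += 1 - p1
        # M-step: update each frequency from the expected counts.
        freq = {k: (ones[k] / total[k] if total[k] > 0 else freq[k]) for k in ancestries}
    return freq

# Toy data for a 3-way admixed sample (labels A, B, C are placeholders).
toy = [("A", "A", 2), ("B", "B", 0), ("C", "C", 1), ("A", "B", 1)]
estimates = em_ancestry_frequencies(toy, ["A", "B", "C"])
```

In this toy data only the last individual is ambiguous. The E-step splits that individual's single "1" allele between ancestries A and B in proportion to the current frequency estimates.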


Popanc

A statistical method to estimate local ancestry frequencies in admixed populations when ongoing gene flow from source populations is rare or absent, but before genome stabilization is complete. Popanc combines discriminant analysis with a continuous correlated beta process model to jointly estimate local ancestry within individuals and local ancestry frequencies at the population level. The accuracy of the method was assessed by applying it and a traditional HMM approach to simulated data sets. Its reliability and utility were further demonstrated by using it to analyze genetic ancestry in an admixed human population (Uyghur) and in three admixed populations from a house mouse hybrid zone.


ALStructure

Assists in modeling admixture and decomposing systematic variation due to population structure. ALStructure is a computationally efficient and statistically accurate method that estimates the low-dimensional linear subspace of the population admixture components. It then searches for a model within this subspace that is consistent with the admixture model’s natural probabilistic constraints. The method was developed as a unification of likelihood-based and PCA-based approaches.


POPBAM

A comprehensive set of computational tools for evolutionary analysis of whole-genome alignments consisting of multiple individuals from multiple populations or species. POPBAM works directly from BAM-formatted assembly files, calls variant sites, and calculates a variety of commonly used evolutionary sequence statistics. It is designed primarily to perform analyses in sliding windows across chromosomes or scaffolds. POPBAM measures nucleotide diversity, population divergence, linkage disequilibrium, and the frequency spectrum of mutations from two or more populations. It can also produce phylogenetic trees of all samples in a BAM file.
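The sliding-window idea can be illustrated with nucleotide diversity computed over fixed windows of a small alignment. This sketch works on plain strings rather than BAM files, and ignores missing data, base quality and variant calling. The function names and the toy sequences are hypothetical and do not reflect POPBAM's interface.

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site among aligned sequences."""
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    diffs = sum(sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def sliding_window_diversity(sequences, window, step):
    """Yield (window start, diversity) for each full window along the alignment."""
    length = len(sequences[0])
    for start in range(0, length - window + 1, step):
        chunk = [seq[start:start + window] for seq in sequences]
        yield start, nucleotide_diversity(chunk)

# Made-up toy alignment of three equal-length sequences.
alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
windows = list(sliding_window_diversity(alignment, window=4, step=4))
```

Setting `step` smaller than `window` produces overlapping windows, which smooths the statistic along the chromosome at the cost of correlated estimates.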

REAP / Relatedness Estimation in Admixed Populations

Estimates relatedness in samples from structured populations with admixed ancestry. REAP estimates autosomal kinship coefficients and identity-by-descent (IBD) sharing probabilities from single nucleotide polymorphism (SNP) genotype data. The application lets users specify a subset of individuals to include in the relatedness estimation analysis and also provides an inbreeding-coefficient estimator.
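The general idea of a kinship estimator for admixed samples can be sketched as a moment-style estimator in which population-wide allele frequencies are replaced by individual-specific ones derived from admixture proportions. The formula below is an illustrative assumption in that spirit and may differ from REAP's exact estimator. The function names and all numbers are hypothetical.

```python
from math import sqrt

def individual_allele_frequencies(admixture, pop_freqs):
    """mu[s] = sum over populations k of admixture[k] * pop_freqs[k][s]."""
    n_snps = len(pop_freqs[0])
    return [sum(a * p[s] for a, p in zip(admixture, pop_freqs)) for s in range(n_snps)]

def kinship(geno_i, geno_j, mu_i, mu_j):
    """Kinship estimate from allele counts (0/1/2) and individual-specific frequencies."""
    num = sum((gi - 2 * mi) * (gj - 2 * mj)
              for gi, gj, mi, mj in zip(geno_i, geno_j, mu_i, mu_j))
    den = 4 * sum(sqrt(mi * (1 - mi) * mj * (1 - mj)) for mi, mj in zip(mu_i, mu_j))
    return num / den

# Made-up allele frequencies for two ancestral populations at three SNPs.
pop_freqs = [[0.2, 0.5, 0.8], [0.8, 0.5, 0.2]]
# An individual with equal ancestry from both populations.
mu = individual_allele_frequencies([0.5, 0.5], pop_freqs)
phi_same = kinship([2, 1, 0], [2, 1, 0], mu, mu)
phi_opposite = kinship([2, 1, 0], [0, 1, 2], mu, mu)
```

Centering each genotype on the individual's own expected allele count is what keeps shared ancestry from being mistaken for recent relatedness.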