Positive selection identification software tools | Population genomics data analysis
Natural selection is a significant force that shapes the architecture of the human genome and introduces diversity across global populations. Whether advantageous mutations have arisen in the human genome through single or multiple mutation events remains largely unanswered, except for a handful of genes such as those that confer lactase persistence, affect skin pigmentation, or cause sickle cell anemia.
Carries out the Hudson-Kreitman-Aguadé (HKA) test, a widely used statistical test for natural selection. HKA is a computer program that can handle very large numbers of loci and sample sizes, and conducts tests via coalescent simulation as well as by the conventional chi-square approximation. The simulations can also be used to conduct other tests of natural selection, including tests of Tajima's D statistic and Fu and Li's D statistic.
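For readers unfamiliar with Tajima's D, the statistic compares two estimates of genetic diversity: the mean number of pairwise differences and the number of segregating sites. The following is a minimal Python sketch of the standard formula, not code from the HKA program; the function name `tajimas_d` and the input format (a list of aligned haplotype strings with no missing data) are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for equal-length aligned haplotypes (hypothetical helper).

    Assumes complete data: every character is an observed allele.
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if any(len(h) != len(haplotypes[0]) for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    seg_sites = 0
    pairwise_diffs = 0
    for column in zip(*haplotypes):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pairwise_diffs += (n * n - sum(c * c for c in counts.values())) // 2
    if seg_sites == 0:
        return float("nan")  # no variation, D is undefined

    pi = pairwise_diffs / (n * (n - 1) / 2)  # mean pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    print(tajimas_d(["TAA", "ATA", "AAT", "AAA"]))
```

A sample made up only of singleton variants, as in the usage line, yields a negative D, the pattern expected after a selective sweep or population expansion. Assessing significance still requires a null distribution, for example from coalescent simulations.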
Implements three complementary methods for detecting sites under selection. Datamonkey is a popular web-based suite of phylogenetic analysis tools for use in evolutionary biology. This web app is linked to a cluster of computers so that analyses which would take a long time to run on a desktop computer can be run quickly. It provides a user-friendly web interface to a wide collection of state-of-the-art statistical techniques for estimating dS and dN and identifying codons and lineages under selection, even in the presence of recombinant sequences.
Assists users with the identification of candidate loci under natural selection from genetic data. BayeScan is an application that uses differences in allele frequencies between populations. This method is based on the multinomial-Dirichlet model. Three different types of data can be used: (i) codominant data such as single nucleotide polymorphisms (SNPs) or microsatellites, (ii) dominant binary data such as amplified fragment length polymorphisms (AFLPs), and (iii) AFLP amplification intensity, which is considered neither dominant nor codominant.
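To illustrate what "differences in allele frequencies between populations" means in practice, the sketch below computes a simple heterozygosity-based FST for one biallelic locus. This is not BayeScan's Bayesian multinomial-Dirichlet model; it is a plain descriptive estimate assuming equally weighted populations, and the function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(allele_freqs):
    """Wright's FST = (H_T - H_S) / H_T for one biallelic locus.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    k = len(allele_freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    if any(p < 0 or p > 1 for p in allele_freqs):
        raise ValueError("frequencies must lie in [0, 1]")

    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / k
    # Expected heterozygosity of the pooled population.
    p_bar = sum(allele_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic overall
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_biallelic([0.1, 0.9]))  # strongly differentiated locus
    print(fst_biallelic([0.3, 0.3]))  # no differentiation
```

Loci with unusually high FST relative to the genome-wide background are the kind of outliers that differentiation-based scans flag as candidates for selection.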
Implements an assembly of different evolutionary models, which allow for statistical testing of the hypothesis that a protein has undergone positive selection. Selecton is a server for detecting evolutionary forces at a single amino-acid site. This tool is an effective, user-friendly and freely available web server which implements up-to-date methods for computing site-specific selection forces, and the visualization of these forces on the protein’s sequence and structure.
Detecting selective sweeps from genomic SNP data is complicated by the intricate ascertainment schemes used to discover SNPs, and by the confounding influence of the underlying complex demographics and varying mutation and recombination rates. SweepFinder can be used to detect the location of a selective sweep based on SNP data. It will also estimate the frequency spectrum of observed SNP data in the presence of missing data.
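As a rough illustration of the frequency spectrum mentioned above, the following sketch tabulates an unfolded site frequency spectrum from derived-allele counts and folds it when the ancestral state is unknown. It assumes complete data at every SNP, so it does not reproduce SweepFinder's handling of missing data; the function names are hypothetical.

```python
def site_frequency_spectrum(derived_counts, n):
    """Unfolded SFS: entry k-1 counts SNPs whose derived allele
    appears in exactly k of the n sampled chromosomes."""
    sfs = [0] * (n - 1)
    for k in derived_counts:
        if 0 < k < n:  # skip sites that are fixed or absent in the sample
            sfs[k - 1] += 1
    return sfs


def fold_spectrum(sfs):
    """Folded SFS: combine classes k and n-k into minor-allele counts."""
    n = len(sfs) + 1
    folded = [0] * (n // 2)
    for k, count in enumerate(sfs, start=1):
        minor = min(k, n - k)
        folded[minor - 1] += count
    return folded


if __name__ == "__main__":
    # Derived-allele counts at six sites in a sample of 4 chromosomes.
    spectrum = site_frequency_spectrum([1, 1, 2, 3, 0, 4], n=4)
    print(spectrum)
    print(fold_spectrum(spectrum))
```

A recent sweep tends to distort the local spectrum relative to the genome-wide one, with an excess of rare and high-frequency derived alleles near the selected site.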
A web application that has been developed to display the results of a scan for positive selection in the human genome using the HapMap data. Haplotter can be used as a resource to examine various population genetic measures in a genomic region.
A population genomics package for the R software environment (a de facto standard for statistical analyses). PopGenome can efficiently process genome-scale data as well as large sets of individual loci. PopGenome offers a wide range of diverse population genetics analyses, including neutrality tests as well as statistics for population differentiation, linkage disequilibrium, and recombination. PopGenome is linked to Hudson's MS and Ewing's MSMS programs to assess statistical significance based on coalescent simulations. PopGenome's integration in R facilitates effortless and reproducible downstream analyses as well as the production of publication-quality graphics.