STRUCTURE is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs. fastSTRUCTURE estimates approximate posterior distributions on ancestry proportions two orders of magnitude faster than STRUCTURE, with ancestry estimates and prediction accuracies that are comparable to those of ADMIXTURE.

1 - 50 of 62
results


Uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.

Identifies and visualizes fine-scale population structures from a genetic relationship matrix among individuals or populations. NetView is an analysis pipeline that combines three different software tools to generate a high-definition network visualization of population structures: Super Paramagnetic Clustering (SPC), Network analysis Tool (NeAT) and CYTOSCAPE. This pipeline is computationally efficient and can be easily applied to large-scale genome-wide data sets to assign individuals to particular populations and to reproduce fine-scale population structures without prior knowledge of individual ancestry.

A software tool written in C++ for quick and accurate estimation of individual ancestry coefficients of a dataset exhibiting population structure. PSIKO takes as input a file in the .geno format, with each row corresponding to a SNP and each column to an individual. It then estimates the number of founder populations and outputs ancestry estimates as well as the principal components of the dataset for subsequent use in association studies.

Allows users to infer pedigree and population structure. GENESIS is an R package with two main modules: (i) PC-AiR, which uses genome-wide single nucleotide polymorphism (SNP) data to determine the structure of a population from a sample that potentially includes known or cryptic relatedness; and (ii) PC-Relate, which provides estimates of genetic relatedness and improves relationship classification.

Identifies potential transmission in the context of epidemiological diseases. MinDistB fixes the distance between viral populations as the minimum Hamming distance between their representatives. It is able to take into account the sizes of relative borders of each pair of viral populations. This tool was tested on experimental outbreak sequencing data. It employs minimal distances between intra-host viral populations to proceed.
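
The minimum-distance idea described above can be sketched in a few lines. This is only an illustration, assuming each intra-host population is given as a list of aligned sequences of equal length; the function names and example sequences are hypothetical and are not taken from MinDistB.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def min_population_distance(pop_a, pop_b) -> int:
    """Smallest Hamming distance between any pair of representatives."""
    return min(hamming(a, b) for a, b in product(pop_a, pop_b))


# Made-up viral haplotypes sampled from two hosts.
host_1 = ["ACGTAC", "ACGTTC"]
host_2 = ["ACGAAC", "TCGTTC"]
print(min_population_distance(host_1, host_2))
```

A small minimum distance between two hosts would then be treated as a hint of a possible transmission link. The threshold used in practice is not specified here.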

Generates a population of 3D genome structures where each domain is represented as a sphere. PGS is a user-friendly software package that runs on local machines and high performance computing platforms. The software automatically generates an analysis of the structure population, including a description of the model quality based on its contact probability agreement with experiments and various structural genome features, including the radial nuclear positions of individual chromatin domains. The individual genome structures also contain a wealth of information and can be used to detect higher-order structural patterns of chromatin regions.

Determines population structure in which the number of populations is a random variable. Structurama is based on a Bayesian method and on a hierarchical variant of the Dirichlet Process prior model. It can summarize the results of a Bayesian analysis of population structure and of a Markov chain Monte Carlo (MCMC) analysis employing the mean partition. This tool can discover a partitioning of individuals among populations.

Represents a predictive model for microbiome composition data. BioMiCo facilitates interpretation of a community structure in light of user-defined feature labels. It is a hierarchical model that can be used to simultaneously learn how assemblages of operational taxonomic units (OTUs) contribute to microbiome structure, and how multiple assemblages might be related to the known features of the samples.

Assists users in manipulation of large multilocus molecular datasets. Functionality can be divided among diagnostic-, manipulation-, sampling-, simulation-, and transformation-based tools. Metadata from large genomic data sets can be efficiently extracted, without the need to view data in a text-editing program. genepopedit works cross-platform and can easily integrate into existing population genomics workflows either directly through R or in combination with other genomic analysis software. Importantly, genepopedit provides a simple yet robust code-based tool for repeatable genomic data manipulation, which has been proven to be stable for data sets in excess of 200 000 single nucleotide polymorphisms (SNPs).

Analyses and visualises population structure. pophelper supports output run files generated by population analysis programs such as STRUCTURE and TESS, as well as numeric delimited formats such as those of ADMIXTURE or fastSTRUCTURE. The pophelper package can be used to tabulate runs, summarise runs, estimate K using the Evanno method, export files for CLUMPP, export files for DISTRUCT and generate barplot figures. The pophelper R package and web app are available to assist users working with molecular markers to investigate population structure.

Calculates “K” estimators. StructureSelector is web-based software that helps users select and visualize the best estimators for a given input file. The software includes MedMedK, MedMeaK, MaxMedK and MaxMeaK and two other estimators. It also generates graphical representations of the results, which eases data submission and the rapid import of plots.

Provides a substantial decrease in the time required to validate and conduct hybrid detection by enabling the parallelization of analyses using NEWHYBRIDS. parallelnewhybrid enables the exploration of hybrid class assignment power and the use of larger datasets than previously feasible with NEWHYBRIDS. This tool consists of an example data set, a readme and three operating system-specific functions to execute parallel NEWHYBRIDS analyses on each of a computer's c cores.

Allocates and simulates populations using amplified fragment length polymorphism (AFLP) markers. AFLPOP is an adaptation of Paetkau’s method for co-dominant alleles. It can provide information on the rates and types of incorrect allocations and on empirical distributions of likelihood statistics. The tool uses a filtering procedure that allows the selection of loci according to user-defined criteria.

Developed for the detection, with estimates of efficiency and accuracy, of multi-generational hybrid individuals using genetic or genomic data in conjunction with the program NEWHYBRIDS. hybriddetective includes functions for the development and testing of diagnostic panels of markers, the simulation of multi-generational hybrids, and the quantification and visualization of the accuracy with which (simulated) hybrids can be detected. Overall, this package delivers a streamlined hybrid analysis platform, providing improvements in speed, ease of use and repeatability over current ad hoc approaches.

Models jointly genetic recombination (with mutation) and population structure. Spectrum is a Bayesian method that describes the underlying genetic process of recombination and mutation explicitly in terms of the association between ancestors and modern individuals. It can also infer a number of important genetic variables, such as recombination hotspots and ancestor patterns.

A method for a principal component analysis (PCA) analogue on binomial data via estimation of latent structure in the natural parameter. LFA seeks to directly model the logit transformation of probabilities underlying observed genotypes in terms of latent variables that capture population structure. The authors demonstrate these advances on data from the Human Genome Diversity Panel and 1000 Genomes Project, where they identify SNPs that are highly differentiated with respect to structure while making minimal modeling assumptions.

Adapts techniques from image reconstruction that encourage smoothness without requiring rigidly parameterized allele frequency surfaces. OriGen is model-based and fast. It can infer the geographic origin of Europeans in the POPRES dataset to much less than 100 km. Its speed is achieved by focusing on the most informative markers, sometimes as few as 1% of all markers, and relying on new minorization–maximization (MM) algorithms for parameter estimation.

A program for automatically inferring the population structure and number of clusters from a sample of admixed genotype data. StructHDP extends the model used by STRUCTURE to allow for a potentially infinite number of populations and then chooses the number of populations that best explains the data.

Provides a method for evaluating principal components (PCs) directly from read count data. TASER-PC is standalone software that works on sequencing reads directly and infers population structure while taking into account differences in read depth and error rates. Moreover, the program can parallelize the multiple repeats of subsampling and read flipping across multiple machines.

Finds recombination hotspots from population genetic data. SequenceLDhot is based on an approximate marginal likelihood method. It scans through a chromosomal region of interest and considers fitting a recombination hotspot at a set of possible locations. This tool considers a grid of possible hotspot positions and assesses the evidence for the presence of the hotspot at each of these positions.

Generates theoretical distributions of FST and dXY under the neutral coalescent model for two populations, accounting for demographic parameters in a probabilistic framework. GppFst relies on posterior predictive simulation, a popular method for evaluating model fit within a Bayesian framework that has been used to test a variety of evolutionary models. This method allows users to explicitly test the null hypothesis of genetic drift when conducting genomic scans.

Enables distributed, likelihood-free inference for computationally demanding models. pyABC is a distributed and scalable ABC-Sequential Monte Carlo (ABC-SMC) framework that features adaptive population size selection, distributed model selection, and web-based visualizations. The software is modular and extensible and permits users to experiment with and to develop new ABC-SMC schemes.

Integrates STRUCTURE analysis with post-processing using a pipeline approach in addition to implementing parallel computation. StrAuto is a Python program to streamline population structure analysis using parallel computing. It implements a pipeline that combines STRUCTURE analysis with the Evanno K analysis and visualization of results using STRUCTURE HARVESTER. This method runs over multiple processors using GNU Parallel. These functionalities make StrAuto ideal for deployment on high performance computing clusters and multi-core personal workstations, to reduce the computational time.

Includes implementations of the autopolyploid (diseq), allopolyploid (alloSNP), Hardy-Weinberg (hwe), and GATK-like (gatk) models for genotyping in polyploids. ebg is software for estimating genotypes from high-throughput sequencing data and works on both diploids and polyploids. Input data include (i) a matrix of total read counts mapping to each site for each individual, (ii) a matrix of reference read counts mapping to each site for each individual, and (iii) a vector of sequencing error values for each locus.
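
To illustrate how the three inputs listed above could combine, here is a minimal sketch of a binomial genotype likelihood for a single site and individual. It assumes that the genotype g is the number of reference allele copies out of the ploidy m, and that a read shows the reference allele with probability (g/m)(1 - e) + (1 - g/m)e, where e is the sequencing error. This is a common GATK-style formulation and is an assumption here, not a transcription of ebg's models; `genotype_likelihoods` is a hypothetical helper.

```python
from math import comb


def genotype_likelihoods(total_reads, ref_reads, error, ploidy=2):
    """Likelihood of the read counts for each genotype g = 0..ploidy."""
    likelihoods = []
    for g in range(ploidy + 1):
        frac = g / ploidy
        # Probability that a single read shows the reference allele.
        p_ref = frac * (1 - error) + (1 - frac) * error
        likelihoods.append(
            comb(total_reads, ref_reads)
            * p_ref ** ref_reads
            * (1 - p_ref) ** (total_reads - ref_reads)
        )
    return likelihoods


# Made-up counts: 10 reads at a site, 5 of them carrying the reference allele.
liks = genotype_likelihoods(total_reads=10, ref_reads=5, error=0.01)
best = max(range(len(liks)), key=liks.__getitem__)
print(best, liks)
```

With an even split of reads, the heterozygous genotype (one reference copy) has the highest likelihood in the diploid case.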

Infers patterns of population splits and mixtures from genome-wide allele frequency data. TreeMix is a unified model that uses a composite likelihood to search for the maximum likelihood graph. Estimation involves two major steps: (i) for a given graph topology, it needs to find the maximum likelihood branch lengths and migration weights, and (ii) it needs to search the space of possible graphs. This model can be thought of as a complement to methods for the identification of population structure.

Allows users to infer population trees from data on unlinked markers. BANANAS rests on a model that assumes that the assignment of individuals to populations is known. After loading their own data, users can analyze the data either using all populations at once in a fixed tree topology or run pair-wise analysis where all pairs of populations are analyzed simultaneously. After this analysis, this software summarizes the results using tables and graphs.

Investigates recombination within and between populations. PopNet employs an all-against-all approach that colors segments according to clades which share common ancestry. It leans on patterns of single nucleotide polymorphism (SNP) distributions to fix and display population structure based on a series of graph clustering steps. This tool is useful for revealing novel subgroupings, as well as genotype-phenotype associations.

Maximizes the proportion of total genetic variance due to differences between groups of populations. SAMOVA is a method based on a simulated annealing procedure. It defines groups of populations that are geographically homogeneous and maximally differentiated from each other. This algorithm also leads to the identification of genetic barriers between these groups.

Provides a way to use genetic data to identify species hybrids. NewHybrids is applicable not only to loci with fixed differences between species, but also to loci without fixed differences. Though prior knowledge may be incorporated into the model, the method is able to cluster individuals in a mixed population without any a priori genetic knowledge of the species. The model-based approach is extendable to special sampling scenarios and different types of genetic markers.

Analyzes single nucleotide polymorphism (SNP) data to detect fine-scale structure in samples. IPCAPS is based on iterative pruning Principal Component Analysis (ipPCA). The package clusters individuals without prior knowledge, flags groups with few members as outlying individuals and finally identifies top discriminators between clusters. Its routines permit relatively easy extension to input data derived from transcriptome or epigenome experiments.

Infers population structure from single nucleotide polymorphism (SNP) genotype data. fastSTRUCTURE can predict ancestry proportions with accuracies comparable to those of ADMIXTURE. It is able to produce a reasonable range of values for the model complexity required to explain structure underlying the data, without the need for a cross-validation scheme. This tool does not explicitly account for linkage disequilibrium (LD) between genetic markers.

A software tool for the detection of population structure in the presence of admixing and mutations from multi-locus genotype data. mStruct is a mixed membership model (also referred to as an admixture model) which incorporates a mutation process on the observed genetic markers.

Analyzes genetic data in order to infer population structure. It is based on network theory concepts such as community and modularity. NetStruct works with individual genetic data (single nucleotide polymorphisms (SNPs), microsatellites, etc.) and provides a partition of the individuals into subpopulations. An additional analysis, called SAD (strength of association distribution) analysis, allows inferring more details regarding the detected structure. There are currently two versions of NetStruct: a Mathematica version and a Python version.

Compares microbial community structures in a rapid, easy-to-use, and streamlined manner. TreeClimber is an adaptation of a method used in population genetics, the parsimony test, to determine the relatedness of communities. This tool was developed to determine whether the observed differences between the phylogenies of multiple communities are due to an accumulation of evolutionary variation or some perturbation.

Assists users in investigating population genetic structure. CNVice is a script composed of an assortment of approaches including Hardy-Weinberg equilibrium (HWE) and an expectation-maximization (EM) algorithm. This application supplies four main inference methods for assessing allele frequencies, population genotype frequencies and individual genotype frequencies, as well as the multiallelic population structure parameter.

Estimates the direction of transmissions of epidemiological diseases. ReD is based on a deterministic hierarchical clustering method. It focuses on a k-clustered intersection of viral populations. This tool also employs a standard Jukes-Cantor distance which is based on the simplest substitution-based evolutionary model. It is accurate in estimation of transmission clusters, directions and sources.
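
For reference, the standard Jukes-Cantor distance between two aligned sequences is d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites. The sketch below implements that textbook formula only. How ReD applies it inside its clustering is not shown, and `jukes_cantor_distance` is a hypothetical name.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        # The correction diverges once 3/4 or more of the sites differ.
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


# Made-up sequences that differ at 1 of 10 sites (p = 0.1).
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
print(round(d, 4))
```

The corrected distance is slightly larger than the raw proportion of differences because the model accounts for multiple substitutions at the same site.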

Aims to find directions of transmission in the context of epidemiological diseases. VOICE is able to deduce important epidemiological characteristics, including genetic relatedness and transmission clusters. It employs a simulation method that models viral evolution as a Markov process in the space of observed viral haplotypes. This tool can determine the number of generations needed to acquire the genetic heterogeneity observed in the recipient.

Computes various genetic assignment criteria to assign or exclude reference populations as the origin of diploid or haploid individuals, as well as of groups of individuals, on the basis of multilocus genotype data. GeneClass allows the specific task of first-generation migrant detection. It includes several Monte Carlo resampling algorithms that compute for each individual its probability of belonging to each reference population or to be a resident (i.e., not a first-generation migrant) in the population where it was sampled. A user-friendly interface facilitates the treatment of large datasets.
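
One widely used assignment criterion is the frequency-based likelihood in the style of Paetkau: under Hardy-Weinberg proportions, a homozygous genotype has probability p^2 and a heterozygous one 2pq, multiplied across loci. The sketch below illustrates that criterion for diploids only. The data layout, the `floor` value used for unseen alleles and the function names are assumptions made for this example and are not GeneClass's interface.

```python
import math


def genotype_log_likelihood(genotype, freqs, floor=0.01):
    """Log-likelihood of a diploid multilocus genotype in one population.

    `genotype` maps locus -> (allele, allele); `freqs` maps
    locus -> {allele: frequency}. Alleles unseen in the reference
    population get the assumed frequency `floor` to avoid log(0).
    """
    total = 0.0
    for locus, (a1, a2) in genotype.items():
        p = max(freqs[locus].get(a1, 0.0), floor)
        q = max(freqs[locus].get(a2, 0.0), floor)
        total += math.log(p * p if a1 == a2 else 2 * p * q)
    return total


def assign(genotype, populations):
    """Return the best-scoring population and all log-likelihoods."""
    scores = {
        name: genotype_log_likelihood(genotype, freqs)
        for name, freqs in populations.items()
    }
    return max(scores, key=scores.get), scores


# Made-up reference allele frequencies for two populations and two loci.
populations = {
    "north": {"L1": {"A": 0.9, "B": 0.1}, "L2": {"C": 0.8, "D": 0.2}},
    "south": {"L1": {"A": 0.2, "B": 0.8}, "L2": {"C": 0.3, "D": 0.7}},
}
individual = {"L1": ("A", "A"), "L2": ("C", "D")}
best, scores = assign(individual, populations)
print(best, scores)
```

Resampling approaches such as those mentioned above would then compare an individual's score with the distribution of scores of simulated genotypes. That step is omitted here.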

Infers population stratification and individual admixture. PSMIX is an R package based on a maximum likelihood method that uses the expectation-maximization algorithm. PSMIX can be used in population genetics and disease gene mapping. Compared with other available similar programs, PSMIX has several advantages: (i) it is computationally efficient and provides similar accuracy under realistic situations, and (ii) it performs a little better under some conditions involving a small number of ancestors and markers.

Uses gain of informativeness for assignment (GIA) to build informative haplotypes for population assignment. HaploPOP needs reference individuals whose population of origin is known. Based on these reference individuals, the algorithm uses GIA to construct informative haplotypes.

Performs Mantel tests for detecting population structure. MANTEL-STRUCT is a program that computes inter-observational similarity/distance measures. It is intended for analyzing the binary data generated by random amplified polymorphic DNA (RAPD) or amplified fragment length polymorphism (AFLP) procedures. Moreover, this tool can handle three types of data: binary data, multiple state data, and user-provided similarity/dissimilarity matrices.

Draws STRUCTURE bar plots. Structure Plot generates publication-ready, aesthetic STRUCTURE bar plots from the individual Q matrix in STRUCTURE or CLUMPP output. The program is simple to use and includes a variety of options, such as sorting bars by original order or by K and selecting colors from R colors or an RColorBrewer palette. Individual or population labels can be printed below or above the plot at any angle. The size of the graph and labels can be defined, and the plot can be saved in a variety of picture formats at a user-defined resolution. The program is implemented as a web application for online users and also as a standalone Shiny application.

Allows users to infer population structure. POPSTR is a Bayesian integrative modeling approach that integrates the two types of genomic measurements - single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) - from individual samples, to differentiate population ancestries. It permits researchers to obtain estimation of the population admixture coefficients and to better understand population structure.

Offers a graph-construction method. CONE can identify an appropriate model for a neighborhood selection scheme. It is able to select an appropriate tuning parameter for LASSO with stability-based sub-sampling. This tool permits analysis of the full data set using the selected neighborhood selection procedure and the tuning parameter determined, allows division of individuals into different communities using a suitable community detection algorithm, and can produce estimation of ancestry coefficients.

Allows pairing optimisation using demographic and genetic data. PaM is a genetic-based tool that optimises pairing assignments a priori and/or a posteriori to the trial. The software models individual genomes as consisting of gene pools (or admixture components) that correspond to their recent demographic history and then matches individuals based on their age, gender, and the similarity of their admixture components. It also allows users to infer the ancestry of the responders and develop a precision medicine approach to treatment.

Helps users estimate genetic relatedness between members of heterogeneous populations of closely related genomic variants. SignatureSJ supports two metrics: edit distance and Hamming distance. This tool accepts two input types, each yielding different information: for a single sample, it finds all related sequence pairs within that sample; for multiple samples, it finds all related sequence pairs between all given samples.
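
As a point of reference for the single-sample mode, the task can be stated as finding all pairs of sequences whose edit distance is at most a threshold. The sketch below solves it by brute force with a standard Levenshtein dynamic program. It is a baseline for illustration only and makes no claim about the algorithm SignatureSJ actually uses; the function names and sequences are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using two rolling rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def related_pairs(sequences, threshold):
    """Index pairs (i, j), i < j, whose edit distance is <= threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(sequences), 2)
        if edit_distance(a, b) <= threshold
    ]


# Made-up variants from one sample.
sample = ["ACGT", "AGT", "TTTT"]
print(related_pairs(sample, threshold=1))
```

The brute-force version compares every pair, so its cost grows quadratically with the number of sequences.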

Estimates the ancestries and leverages them for their association with phenotypes. POPSICLE separates the genome into non-overlapping sliding windows of user defined size. It attributes the local profiles to the existing clades or the new clades as needed. This tool deduces ancestries and determinants of phenotype. It is useful to unravel cross-species genome-wide similarities.

Collates results generated by the program STRUCTURE. Structure Harvester provides a fast way to assess and visualize likelihood values across multiple values of K and hundreds of iterations for easier detection of the number of genetic groups that best fit the data. In addition, Structure Harvester will reformat data for use in downstream programs, such as CLUMPP, and, when possible, executes the “Evanno” method. This program is available online or as a stand-alone version for local use.
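
The Evanno statistic mentioned above can be sketched as follows. This is a minimal illustration, assuming the common formulation in which ΔK is the absolute second-order difference of the mean log-likelihood L(K) divided by the standard deviation of L(K) across replicate runs. The function name `delta_k` and the example values are hypothetical and are not taken from Structure Harvester.

```python
from statistics import mean, stdev


def delta_k(runs):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    `runs` maps each K to a list of log-likelihoods from replicate runs.
    """
    means = {k: mean(values) for k, values in runs.items()}
    result = {}
    for k in sorted(runs):
        if k - 1 in means and k + 1 in means:
            spread = stdev(runs[k])
            if spread == 0:
                continue  # delta K is undefined without variation between runs
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / spread
    return result


# Made-up log-likelihoods for three replicate runs at K = 1..4.
example = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-850.0, -852.0, -848.0],
    4: [-845.0, -840.0, -850.0],
}
scores = delta_k(example)
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

Because the statistic needs both neighbours, it cannot be computed for the smallest or largest K tested, which is one reason a range of K values with several replicates each is required.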

Offers a scalable Bayesian likelihood-free inference method for population genetics models. This approach is based on a deep learning method and can be applied to both multiple population genetic problems and general simulator-based machine learning tasks. The method allows users to directly perform their analysis on the raw population genetic data, without the need for ad hoc summary statistics.
