
STRUCTURE

A free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed. It can be applied to most of the commonly used genetic markers, including SNPs, microsatellites, RFLPs and AFLPs. A related program, fastSTRUCTURE, estimates approximate posterior distributions on ancestry proportions two orders of magnitude faster than STRUCTURE, with ancestry estimates and prediction accuracies that are comparable to those of ADMIXTURE.

TreeMix

Infers patterns of population splits and mixtures from genome-wide allele frequency data. TreeMix is a unified model that uses a composite likelihood to search for the maximum likelihood graph. Estimation involves two major steps: (i) for a given graph topology, it needs to find the maximum likelihood branch lengths and migration weights, and (ii) it needs to search the space of possible graphs. This model can be thought of as a complement to methods for the identification of population structure.

hybriddetective

Developed for the detection, with estimates of efficiency and accuracy, of multi-generational hybrid individuals using genetic or genomic data in conjunction with the program NEWHYBRIDS. hybriddetective includes functions for the development and testing of diagnostic panels of markers, the simulation of multi-generational hybrids, and the quantification and visualization of the accuracy with which (simulated) hybrids can be detected. Overall, this package delivers a streamlined hybrid analysis platform, providing improvements in speed, ease of use and repeatability over current ad hoc approaches.

LFA / Logistic Factor Analysis

A method for a principal component analysis (PCA) analogue on binomial data via estimation of latent structure in the natural parameter. LFA seeks to directly model the logit transformation of probabilities underlying observed genotypes in terms of latent variables that capture population structure. We demonstrate these advances on data from the Human Genome Diversity Panel and 1000 Genomes Project, where we are able to identify SNPs that are highly differentiated with respect to structure while making minimal modeling assumptions.

genepopedit

Assists users in manipulation of large multilocus molecular datasets. Functionality can be divided among diagnostic-, manipulation-, sampling-, simulation-, and transformation-based tools. Metadata from large genomic data sets can be efficiently extracted, without the need to view data in a text-editing program. genepopedit works cross-platform and can easily integrate into existing population genomics workflows either directly through R or in combination with other genomic analysis software. Importantly, genepopedit provides a simple yet robust code-based tool for repeatable genomic data manipulation, which has been proven to be stable for data sets in excess of 200 000 single nucleotide polymorphisms (SNPs).

NetStruct

Analyzes genetic data in order to infer population structure. It is based on network theory concepts such as community and modularity. NetStruct works with individual genetic data (single nucleotide polymorphisms (SNPs), microsatellites, etc.) and provides a partition of the individuals into subpopulations. An additional analysis, called SAD (strength of association distribution) analysis, allows inferring more details regarding the detected structure. There are currently two versions of NetStruct: a Mathematica version and a Python version.

GENESIS / GENetic EStimation and Inference in Structured samples

Allows users to infer population structure and pedigree relatedness. GENESIS is an R package that provides functionality to infer, estimate, and account for them through two main modules: (i) PC-AiR, which uses genome-wide single nucleotide polymorphism (SNP) data to determine the structure of a population from a sample that potentially includes known or cryptic relatedness; and (ii) PC-Relate, which provides estimates of genetic relatedness and improves relationship classification.

IPCAPS / Iterative Pruning to CApture Population Structure

Analyzes single nucleotide polymorphism (SNP) data to detect fine-scale structure in samples. IPCAPS is based on iterative pruning Principal Component Analysis (ipPCA). The package clusters individuals without prior knowledge, treats clusters with few members as outlying individuals, and finally identifies the top discriminators between clusters. Its routines permit relatively easy extension to input data derived from transcriptome or epigenome experiments.

StrAuto

Integrates STRUCTURE analysis with post-processing using a pipeline approach in addition to implementing parallel computation. StrAuto is a Python program to streamline population structure analysis using parallel computing. It implements a pipeline that combines STRUCTURE analysis with the Evanno K analysis and visualization of results using STRUCTURE HARVESTER. This method runs over multiple processors using GNU Parallel. These functionalities make StrAuto ideal for deployment on high performance computing clusters and multi-core personal workstations, to reduce the computational time.

parallelnewhybrid

Provides a substantial decrease in the time required to validate and conduct hybrid detection by enabling the parallelization of analyses using NEWHYBRIDS. parallelnewhybrid enables the exploration of hybrid class assignment power and the utilization of larger datasets than previously feasible with NEWHYBRIDS. This tool consists of an example data set, a readme and three operating system-specific functions to execute parallel NEWHYBRIDS analyses on each of a computer's c cores.

PSIKO / Population Structure Inference using Kernel-pca and Optimisation

A software tool written in C++ for quick and accurate estimation of individual ancestry coefficients of a dataset exhibiting population structure. PSIKO takes as input a file in the .geno format, with each row corresponding to a SNP and each column to an individual. It then estimates the number of founder populations and outputs ancestry estimates as well as the principal components of the dataset for subsequent use in association studies.

PaM / Pair Matcher

Allows pairing optimisation using demographic and genetic data. PaM is a genetic-based tool that optimises pairing assignments a priori and/or a posteriori to the trial. The software models individual genomes as consisting of gene pools (or admixture components) that correspond to their recent demographic history and then matches individuals based on their age, gender, and the similarity of their admixture components. It also allows users to infer the ancestry of the responders and develop a precision medicine approach to treatment.

EBG

Includes implementations of the autopolyploid (diseq), allopolyploid (alloSNP), Hardy-Weinberg (hwe), and GATK-like (gatk) models for genotyping in polyploids. ebg is software for estimating genotypes from high-throughput sequencing data and works on both diploids and polyploids. Input data include (i) a matrix of total read counts mapping to each site for each individual, (ii) a matrix of reference read counts mapping to each site for each individual, and (iii) a vector of sequencing error values for each locus.

PGS / Population-based Genome Structure

Generates a population of 3D genome structures where each domain is represented as a sphere. PGS is a user-friendly software package that runs on local machines and high performance computing platforms. The software automatically generates an analysis of the structure population, including a description of the model quality based on its contact probability agreement with experiments and various structural genome features, including the radial nuclear positions of individual chromatin domains. The individual genome structures also contain a wealth of information and can be used to detect higher-order structural patterns of chromatin regions.

CONE / Community Oriented Network Estimation

Offers a graph-construction method. CONE can identify an appropriate model for a neighborhood selection scheme. It is able to select an appropriate tuning parameter for LASSO with stability-based sub-sampling. This tool permits analysis of the full data set using the selected neighborhood selection procedure and the tuning parameter determined, allows division of individuals into different communities using a suitable community detection algorithm, and can produce estimation of ancestry coefficients.

PSMix / Population Structure inference via MIXture model

Used for inferring population stratification and individual admixture. PSMIX is an R package based on a maximum likelihood method using the expectation-maximization algorithm. PSMIX can be used in population genetics and disease gene mapping. Compared with other available similar programs, PSMIX has several advantages: (i) it is computationally efficient and provides similar accuracy under realistic situations, and (ii) it performs a little better under some conditions involving a small number of ancestors and markers.

GeneClass

Computes various genetic assignment criteria to assign or exclude reference populations as the origin of diploid or haploid individuals, as well as of groups of individuals, on the basis of multilocus genotype data. GeneClass also supports the specific task of first-generation migrant detection. It includes several Monte Carlo resampling algorithms that compute, for each individual, its probability of belonging to each reference population or of being a resident (i.e., not a first-generation migrant) in the population where it was sampled. A user-friendly interface facilitates the treatment of large datasets.

NetView

Identifies and visualizes fine-scale population structures from a genetic relationship matrix among individuals or populations. NetView is an analysis pipeline that combines three different software tools to generate a high-definition network visualization of population structures: Super Paramagnetic Clustering (SPC), Network Analysis Tools (NeAT), and Cytoscape. This pipeline is computationally efficient and can be easily applied to large-scale genome-wide data sets to assign individuals to particular populations and to reproduce fine-scale population structures without prior knowledge of individual ancestry.

strplot / Structure Plot

Draws STRUCTURE bar plots. Structure Plot generates publication-ready, aesthetic STRUCTURE bar plots from the individual Q matrix in STRUCTURE or CLUMPP output. The program is simple to use and includes a variety of options, such as sorting bars by original order or by K and selecting colors from R colors or an RColorBrewer palette. Individual or population labels can be printed below or above the plot at any angle. The size of the graph and labels can be defined, and an option is provided to save the plot in a variety of picture formats at a user-defined resolution. The program is implemented as a web application for online users and also as a standalone Shiny application.

Structure Harvester

Collates results generated by the program STRUCTURE. Structure Harvester provides a fast way to assess and visualize likelihood values across multiple values of K and hundreds of iterations for easier detection of the number of genetic groups that best fit the data. In addition, Structure Harvester will reformat data for use in downstream programs, such as CLUMPP, and when possible, executes the "Evanno" method. This program is available online or as a stand-alone version for local use.
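
As a rough illustration of the "Evanno" method mentioned above, the sketch below computes the ΔK statistic from replicate log-likelihood values per K. It is not Structure Harvester's own code: the function name `evanno_delta_k` and the input layout (a dict mapping K to a list of replicate values) are assumptions made for this example, and the second-order difference is taken on the mean log-likelihoods, one common way of computing it.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Evanno's delta K from replicate log-likelihoods.

    lnp_by_k: dict mapping K (int) to a list of ln P(D) values, one per
    replicate run (hypothetical layout; at least two replicates per K).
    Returns {K: delta_K} for every K that has both neighbours K-1 and K+1.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the ends of the K range
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # identical replicates: the ratio is undefined
        # Absolute second-order rate of change of the mean log-likelihood
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta[k] = second_diff / sd
    return delta


# Made-up replicate values, for illustration only
runs = {
    1: [-1000.0, -1002.0],
    2: [-800.0, -802.0],
    3: [-790.0, -794.0],
    4: [-785.0, -795.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

The K with the largest ΔK is usually taken as the most likely number of groups. Because the statistic needs both neighbours, it cannot be evaluated for the smallest or largest K tested.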

Genodive

Obsolete
Analyses clonal diversity in asexually reproducing organisms. First, differences in clonal diversity between pairs of populations can be tested through bootstrapping, that is, resampling genotypes with replacement within the populations. Second, differences in clonal composition (i.e. whether two populations could be random samples from the same pool of genotypes) can be tested through randomizing genotypes over populations. Third, the bias in the estimation of the diversity indices that results from small sample sizes can be assessed by calculating the indices for subsamples of the dataset of increasing size.
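
The bootstrap comparison described first can be sketched as follows. This is a minimal illustration, not Genodive's implementation: the function names are hypothetical, Simpson's diversity index is an assumed choice of statistic, and the percentile interval is one simple way to summarise the bootstrap distribution.

```python
import random
from collections import Counter


def simpson_diversity(genotypes):
    """Simpson's diversity: 1 minus the sum of squared genotype frequencies."""
    n = len(genotypes)
    counts = Counter(genotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def bootstrap_diversity_difference(pop_a, pop_b, n_boot=1000, seed=None):
    """Bootstrap the difference in clonal diversity between two populations.

    pop_a, pop_b: lists of clonal genotype labels, one per individual
    (hypothetical input layout).
    Returns (observed difference, lower bound, upper bound), where the
    bounds form a 95% percentile interval of the bootstrap differences.
    """
    rng = random.Random(seed)
    observed = simpson_diversity(pop_a) - simpson_diversity(pop_b)
    diffs = []
    for _ in range(n_boot):
        # Resample genotypes with replacement within each population
        resample_a = rng.choices(pop_a, k=len(pop_a))
        resample_b = rng.choices(pop_b, k=len(pop_b))
        diffs.append(simpson_diversity(resample_a) - simpson_diversity(resample_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    return observed, lower, upper


# Made-up genotype labels, for illustration only
diverse = ["A", "B", "C", "D"]
monoclonal = ["A", "A", "A", "A"]
result = bootstrap_diversity_difference(diverse, monoclonal, n_boot=200, seed=1)
```

An interval that excludes zero suggests the two populations differ in clonal diversity under the chosen index.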

pophelper

Obsolete
Analyses and visualises population structure. pophelper supports output run files generated by population analysis programs such as STRUCTURE and TESS, as well as numeric delimited formats such as those of ADMIXTURE or fastSTRUCTURE. The pophelper package can be used to tabulate runs, summarise runs, estimate K using the Evanno method, export files for CLUMPP, export files for DISTRUCT and generate barplot figures. The pophelper R package and web app are available to assist users working with molecular markers to investigate population structure.

Genotype

Obsolete
Assigns individuals to clonal lineages. Genotype can handle data from different kinds of genetic markers, both codominant and dominant, such as allozymes, microsatellites, amplified fragment length polymorphisms (AFLPs) and random amplified polymorphic DNA (RAPD). Genotype draws a histogram of clonal differences, and gives for every threshold the number of clones recognized to help users in choosing a threshold for genotype assignment, which can then be performed by the program. It can be used for detecting genotyping errors in studies of sexual organisms.

hybridlab

Obsolete
Simulates intraspecific hybrids from population samples of nuclear genetic markers such as microsatellites, allozymes or SNPs (single nucleotide polymorphisms). From standard population genetic data of individual multilocus genotypes, hybridlab first estimates allele frequencies at each locus in each of the parental populations specified in the input file. Then multilocus F1 hybrid genotypes are created by randomly drawing one allele at each locus, as a function of their calculated frequency distributions, from each of two user-specified hybridizing populations. Linkage equilibrium among loci, neutrality of markers and random mating are assumed.
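
The two steps described above (estimating allele frequencies, then drawing one allele per locus from each parental population) can be sketched in a few lines. This is an illustrative sketch only, not hybridlab's code: the function names and the in-memory genotype layout are assumptions, and loci are drawn independently in line with the stated linkage-equilibrium assumption.

```python
import random
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate per-locus allele frequencies from diploid genotypes.

    genotypes: list of individuals, each a list of (allele, allele) tuples,
    one tuple per locus (hypothetical layout).
    Returns one {allele: frequency} dict per locus.
    """
    n_loci = len(genotypes[0])
    freqs = []
    for locus in range(n_loci):
        counts = Counter()
        for individual in genotypes:
            counts.update(individual[locus])
        total = sum(counts.values())
        freqs.append({allele: c / total for allele, c in counts.items()})
    return freqs


def draw_allele(freq, rng):
    """Draw one allele at random according to its estimated frequency."""
    alleles = list(freq)
    weights = [freq[a] for a in alleles]
    return rng.choices(alleles, weights=weights, k=1)[0]


def simulate_f1(freqs_a, freqs_b, n_hybrids, seed=None):
    """Create F1 genotypes: one allele from each parental population per locus."""
    rng = random.Random(seed)
    return [
        [(draw_allele(fa, rng), draw_allele(fb, rng))
         for fa, fb in zip(freqs_a, freqs_b)]
        for _ in range(n_hybrids)
    ]


# Made-up parental samples, each fixed for different alleles at two loci
pop_a = [[(1, 1), (3, 3)], [(1, 1), (3, 3)]]
pop_b = [[(2, 2), (4, 4)], [(2, 2), (4, 4)]]
freqs_a = allele_frequencies(pop_a)
freqs_b = allele_frequencies(pop_b)
hybrids = simulate_f1(freqs_a, freqs_b, n_hybrids=5, seed=1)
```

With parental populations fixed for different alleles, every simulated F1 is heterozygous at every locus, which makes a convenient sanity check before moving to real frequency data.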