Discrete trait evolution software tools | Phylogenomics data analysis
The study of discrete characters is crucial for the understanding of evolutionary processes. Even though great advances have been made in the analysis of nucleotide sequences, computer programs for non-DNA discrete characters are often dedicated to specific analyses and lack flexibility. Discrete characters often have different transition rate matrices, variable rates among sites and sometimes contain unobservable states.
Enables the accurate estimation of rates of gene family evolution when there are errors in the observed gene family sizes. By allowing users to marginalize over the uncertainty in the observed gene family sizes, CAFE 3 provides a platform for expanding comparative genomic analyses into clades consisting solely of draft genome sequences.
A software tool to determine the evolutionary histories of gene families over a phylogenetic tree. Given a set of gene family sizes and a phylogenetic tree, DupliPHY will calculate the ancestral family sizes at each internal node within the tree. DupliPHY returns a list of ancestral family sizes and a phylogenetic tree for each family with the ancestral family sizes listed as internal node labels.
DupliPHY-ML is a software tool to determine the evolutionary histories of gene families using maximum likelihood. Given a set of gene family sizes and a phylogenetic tree, DupliPHY-ML will calculate the ancestral family sizes at each internal node within the tree, the average number of events per branch, estimates of the birth and death rates, and an evolutionary rate for each family. DupliPHY-ML will return a list of ancestral family sizes for each family, a phylogenetic tree where branch lengths represent the average number of events along that branch, estimates of the birth and death parameters, and a list of evolutionary rates for each family.
A user-friendly web server that accurately infers branch-specific and site-specific gain and loss events. The novel inference methodology is based on a stochastic mapping approach utilizing models that reliably capture the underlying evolutionary processes. A variety of features are available including the ability to analyze the data with various evolutionary models, to infer gain and loss events using either stochastic mapping or maximum parsimony, and to estimate gain and loss rates for each character analyzed.
A computer package for performing analyses of trait evolution among groups of species for which a phylogeny or sample of phylogenies is available; such phylogenies can be created using BayesPhylogenies. BayesTraits can be applied to the analysis of traits that adopt a finite number of discrete states, or to the analysis of continuously varying traits. The methods can be used to take into account uncertainty about the model of evolution and the underlying phylogeny. BayesTraits uses Markov chain Monte Carlo (MCMC) methods to derive posterior distributions and maximum likelihood (ML) methods to derive point estimates of log-likelihoods, the parameters of statistical models, and the values of traits at ancestral nodes of phylogenies.
A software package for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer (all three potentially leading to species-tree/gene-tree discordance), and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific) that can be fixed or sampled from a priori statistical distributions. SimPhy can help in understanding interactions among different evolutionary processes, for example through a simulation study that characterizes the systematic overestimation of the duplication time when using standard reconciliation methods.
A software package for the analysis of numerical profiles on a phylogeny. It is primarily designed to deal with profiles derived from the phyletic distribution of homologous gene families, but is suited to study any other integer-valued evolutionary characters. Count performs ancestral reconstruction, and infers family- and lineage-specific characteristics along the evolutionary tree. It implements popular methods employed in gene content analysis such as Dollo and Wagner parsimony, propensity for gene loss, as well as probabilistic methods involving a phylogenetic birth-and-death model.
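To make the parsimony side of such an analysis concrete, here is a minimal sketch of Wagner parsimony scoring for one integer-valued profile, using the classic interval pass from the tips to the root. It is not Count's code: the nested-tuple tree encoding, the function name `wagner_score`, and the example family sizes are all assumptions made for illustration.

```python
def wagner_score(tree, sizes):
    """Return ((low, high), cost) for a rooted binary tree under Wagner parsimony.

    tree  : a leaf name (str) or a 2-tuple (left_subtree, right_subtree)
    sizes : dict mapping leaf name -> observed integer family size
    The interval holds the most parsimonious state set at the node;
    cost is the minimum total amount of change in the subtree.
    """
    if isinstance(tree, str):
        n = sizes[tree]
        return (n, n), 0

    (lo1, hi1), cost1 = wagner_score(tree[0], sizes)
    (lo2, hi2), cost2 = wagner_score(tree[1], sizes)

    lo, hi = max(lo1, lo2), min(hi1, hi2)
    if lo <= hi:
        # The child intervals overlap: keep the intersection, no extra cost.
        return (lo, hi), cost1 + cost2
    # The child intervals are disjoint: keep the gap between them and pay its width.
    return (hi, lo), cost1 + cost2 + (lo - hi)


# Hypothetical four-species profile for a single gene family.
example_tree = (("A", "B"), ("C", "D"))
example_sizes = {"A": 1, "B": 3, "C": 5, "D": 5}
print(wagner_score(example_tree, example_sizes))  # → ((3, 5), 4)
```

Wagner parsimony treats gains and losses symmetrically. Dollo parsimony and the probabilistic birth-and-death models mentioned above rest on different assumptions and are not covered by this sketch.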
A software tool to estimate family turnover rates, as well as the number of elements in internal phylogenetic nodes, by likelihood-based methods and parsimony. BadiRate implements two stochastic population models, which provide the appropriate statistical framework for testing hypotheses, such as lineage-specific gene family expansions or contractions.
Allows the reconstruction of ancestral traits and a number of criteria to select phylotypes. PhyloType is a web application permitting the analysis of phylogenies comprising several thousands of taxa (strains). It recovers a number of already-identified transmission chains, contradicts a few others, and suggests some alternative routes. In summary, it assists in exploring and interpreting the large virus phylogenies and in focusing on protein families associated with differentiated or specialized functions.
Allows for fitting of maximum likelihood models using Markov chains on phylogenetic trees for analysis of discrete character data. Examples of such discrete character data include restriction sites, gene family presence/absence, intron presence/absence, and gene family size data. Hypothesis-driven user-specified substitution rate matrices can be estimated. markophylo allows for biologically realistic models combining constrained substitution rate matrices, site rate variation, site partitioning, branch-specific rates, non-stationary prior root probabilities, and correction for sampling bias.
Serves for mapping character histories on phylogenies. SIMMAP uses a Bayesian approach and enables researchers to address a wide variety of questions important in evolutionary studies. It can be used as a teaching tool for explaining stochastic models, Bayesian inference, and character histories. Its functions include: (1) calculating the conditional likelihood for each character state at each node of the tree; and (2) simulating ancestral states at each internal node by sampling from the posterior distribution of states.
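The conditional likelihoods in item (1) are typically obtained with a pruning pass from the tips to the root. The sketch below shows that pass for a binary character under a symmetric two-state model; it is not SIMMAP's code. The tree encoding, the function names, the equal root prior, and the example rate and branch lengths are assumptions chosen for illustration.

```python
import math


def p_matrix(rate, t):
    """Transition probabilities of a symmetric two-state Markov model over time t."""
    decay = math.exp(-2.0 * rate * t)
    same, diff = 0.5 + 0.5 * decay, 0.5 - 0.5 * decay
    return [[same, diff], [diff, same]]


def conditional_likelihoods(node, tips, rate):
    """Return [L(0), L(1)]: the likelihood of the tip data below `node`
    given that `node` is in state 0 or state 1.

    node : a leaf name (str), or a tuple of (child, branch_length) pairs
    tips : dict mapping leaf name -> observed state (0 or 1)
    """
    if isinstance(node, str):
        return [1.0 if state == tips[node] else 0.0 for state in (0, 1)]

    like = [1.0, 1.0]
    for child, t in node:
        child_like = conditional_likelihoods(child, tips, rate)
        p = p_matrix(rate, t)
        for s in (0, 1):
            like[s] *= sum(p[s][x] * child_like[x] for x in (0, 1))
    return like


def root_posterior(like, prior=(0.5, 0.5)):
    """Posterior probabilities of the root states (assumed equal prior by default)."""
    weighted = [like[s] * prior[s] for s in (0, 1)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical three-taxon tree ((A:0.1, B:0.1):0.2, C:0.3).
inner = (("A", 0.1), ("B", 0.1))
root = ((inner, 0.2), ("C", 0.3))
like = conditional_likelihoods(root, {"A": 0, "B": 0, "C": 1}, rate=1.0)
print(root_posterior(like))
```

Drawing a root state from `root_posterior` and then sampling states down the tree conditional on each parent would correspond to item (2); that step is omitted here.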
Performs maximum likelihood estimation for evolutionary rates of discrete characters on a provided phylogeny with the options that correct for unobservable data, rate variations, and unknown prior root probabilities from the empirical data. DiscML gives users options to customize the instantaneous transition rate matrices, or to choose pre-determined matrices from models such as birth-and-death (BD), birth-death-and-innovation (BDI), equal rates (ER), symmetric (SYM), general time-reversible (GTR) and all rates different (ARD). DiscML is ideal for the analyses of binary (1s/0s) patterns, multi-gene families, and multistate discrete morphological characteristics.
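The following sketch illustrates what an instantaneous transition rate matrix looks like for two of the model families named above, and how it turns into transition probabilities along a branch. It is written in plain Python and does not use DiscML or its R interface. The function names are hypothetical, and the birth-and-death matrix assumes constant per-step rates between adjacent states, which may differ from DiscML's own parameterization.

```python
import math


def er_matrix(k, rate):
    """Equal-rates (ER) matrix on k states: one shared off-diagonal rate."""
    return [[-rate * (k - 1) if i == j else rate for j in range(k)]
            for i in range(k)]


def bd_matrix(k, birth, death):
    """Birth-and-death style matrix on states 0..k-1 (assumed constant rates):
    only moves between adjacent states are allowed."""
    q = [[0.0] * k for _ in range(k)]
    for i in range(k):
        if i + 1 < k:
            q[i][i + 1] = birth
        if i - 1 >= 0:
            q[i][i - 1] = death
        q[i][i] = -sum(q[i])  # rows of a rate matrix sum to zero
    return q


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def transition_probs(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) with a scaled Taylor series and repeated squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms + 1):
        term = [[x / m for x in row] for row in matmul(term, a)]  # a^m / m!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


# Probability of each end state after a branch of length 0.5, starting in state 0.
print(transition_probs(bd_matrix(4, birth=0.3, death=0.6), 0.5)[0])
```

In a maximum likelihood analysis, the entries of such a matrix are the free parameters being optimized; the hand-rolled matrix exponential here only keeps the sketch dependency-free and would normally be replaced by a numerical library routine.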
A Bayesian method for estimating the evolutionary history of gene families. BEGFE implements a Markov Chain Monte Carlo algorithm to estimate the posterior probability distribution of the birth and death rate parameter and the numbers of gene copies at the internodes of the phylogenetic tree. In addition, it can simulate gene family data under the birth and death model.
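As a rough illustration of simulating gene family data under a birth and death model, the sketch below evolves a family size along a single branch with a Gillespie-style simulation. It is not BEGFE's implementation: the function name, the use of one shared per-copy rate for both duplication and loss, and the example values are assumptions.

```python
import random


def simulate_family_size(n0, rate, t, rng):
    """Simulate gene family size after time t under a linear birth-death process.

    n0   : starting number of gene copies
    rate : per-copy rate, used for both duplication and loss (assumed equal)
    t    : branch length in the same time units as the rate
    rng  : a random.Random instance, so that runs can be reproduced
    """
    if rate <= 0.0:
        return n0
    n, elapsed = n0, 0.0
    while n > 0:  # a family with zero copies cannot recover
        total_rate = 2.0 * rate * n
        elapsed += rng.expovariate(total_rate)  # waiting time to the next event
        if elapsed >= t:
            break
        n += 1 if rng.random() < 0.5 else -1  # gain or loss, equally likely
    return n


rng = random.Random(42)
sizes = [simulate_family_size(10, rate=0.5, t=1.0, rng=rng) for _ in range(5)]
print(sizes)
```

Applying the same function branch by branch down a phylogeny, with each child starting from its parent's simulated size, would give family sizes at the tips of a tree.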
Provides an R-based implementation of an analytical approach to obtain accurate, per-branch expectations of numbers of state transitions and dwelling times. SFREEMAP also introduces an intuitive way of visualizing the results by integrating over the posterior distribution and summarizing the parameters onto a target reference topology (such as a consensus or MAP tree) provided by the user. SFREEMAP's performance was benchmarked against make.simmap, a popular R-based implementation of stochastic mapping. SFREEMAP confirmed theoretical expectations, outperforming make.simmap in every experiment and reducing the computation time for relatively modest datasets from hours to minutes. On simulated data, SFREEMAP returned estimates that were not only similar to those obtained by averaging across make.simmap mappings, but also more accurate.
A modular, extendible software tool for evolutionary biology, designed to help biologists organize and analyze comparative data about organisms. Its emphasis is on phylogenetic analysis, but some of its modules concern population genetics, while others do non-phylogenetic multivariate analysis. Because it is modular, the analyses available depend on the modules installed.
Is dedicated to the testing of non-homogeneous processes in sequence evolution. TestNH provides a convenient and reliable approach where branches get clustered by their pattern of molecular evolution alone, with no need for prior knowledge about the data set under study. Model selection is achieved in a statistical framework and therefore avoids overparameterization. The package contains two major programs: mapnh for performing the substitution mapping and clustering of branches, and partnh for fitting substitution models on the resulting subsets. TestNH also contains the randnh program that was used to generate random models in the simulation analysis. These programs should be very useful to study patterns of molecular evolution and reveal new correlations between sequence and species evolution. TestNH can run on DNA, RNA, codon, or amino acid sequences with a large set of possible substitution models.