The task of resolving the tree of life of extant species remains one of the grand challenges in evolutionary biology. Because the number of possible trees grows super-exponentially with the number of species included, tree inference is considered a hard problem in computer science. The plethora of algorithmic challenges associated with phylogenetic trees and their efficient computation gave rise to the discipline of “phyloinformatics.”
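To make the growth in the number of trees concrete, the following minimal sketch counts fully resolved (binary) tree topologies on n labeled species using the standard double-factorial formulas: (2n-5)!! unrooted trees for n ≥ 3 and (2n-3)!! rooted trees for n ≥ 2. The function names are illustrative and do not come from any of the tools listed here.

```python
def double_factorial(k: int) -> int:
    """Return k * (k - 2) * (k - 4) * ... down to 1 or 2 (1 if k <= 1)."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def num_unrooted_trees(n: int) -> int:
    """Number of unrooted binary tree topologies on n labeled leaves."""
    if n < 3:
        raise ValueError("need at least 3 leaves for an unrooted binary tree")
    return double_factorial(2 * n - 5)


def num_rooted_trees(n: int) -> int:
    """Number of rooted binary tree topologies on n labeled leaves."""
    if n < 2:
        raise ValueError("need at least 2 leaves for a rooted binary tree")
    return double_factorial(2 * n - 3)


if __name__ == "__main__":
    for n in (4, 10, 20, 50):
        print(n, num_unrooted_trees(n))
```

Four species already allow 3 unrooted topologies and ten species allow 2,027,025, which is why exhaustive search is abandoned in favor of heuristics for all but the smallest data sets.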
Assists users in examining DNA and protein sequence data from different species and populations. MEGA is composed of several tools that allow researchers to work on phylogenomics and phylomedicine. The software includes features for identifying gene duplication events in gene family trees. It is available through both a graphical user interface (GUI) and a command line interface.
Estimates maximum likelihood phylogenies from alignments of nucleotide or amino acid sequences. PhyML is a phylogeny software package based on the maximum-likelihood principle. It implements algorithms to search the space of tree topologies with user-defined intensities. The software provides a wide range of options designed to facilitate standard phylogenetic analyses. It also implements two methods to evaluate branch support in a sound statistical framework.
Estimates the posterior distribution of model parameters. MrBayes is a program that performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo (MCMC). It can infer ancestral states while accommodating uncertainty about the phylogenetic tree and model parameters. It also implements several methods for relaxing the assumption of equal rates across sites, including gamma-distributed rate variation. It can be used to test various topological hypotheses or substitution models against each other.
Allows analysis of gene and species trees. AnGST is a phylogenomic method that compares individual gene phylogenies with the phylogeny of the organisms. It works from the topology of the gene tree and can infer the direction of gene transfer in addition to gene duplication. It accounts for uncertainty in gene trees by incorporating reconciliation into the tree-building process.
Allows identification of populations as groups of related strains sharing a common projected habitat, which reflects their relative abundance in the measured environmental categories. ADAPTML maps changes in environmental preference onto the tree by predicting projected habitats for each extant and ancestral strain in the phylogeny. It builds a hidden Markov model for the evolution of habitat associations.
Allows maximum likelihood analysis of large phylogenetic data. IQ-TREE explores the tree space efficiently and often achieves higher likelihoods than RAxML and PhyML. Other key features of IQ-TREE are (i) a very fast model selection procedure, including partition scheme finding, (ii) partitioned analysis for phylogenomic data, (iii) ultrafast bootstrap approximation, (iv) implementation of several branch tests and (v) tree topology tests. W-IQ-TREE, an intuitive and user-friendly web interface and server for IQ-TREE, is also available.
A cross-platform program for Bayesian phylogenetic analysis of molecular sequences. BEAST estimates rooted, time-measured phylogenies using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST 2 uses Markov chain Monte Carlo (MCMC) to average over tree space, so that each tree is weighted proportional to its posterior probability. BEAST 2 includes a graphical user interface for setting up standard analyses and a suite of programs for analysing the results. It uses an XML input format that allows the user to design and run a large range of models. A program that converts NEXUS files into this format is also included.
A package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). PAML may be used to compare and test phylogenetic trees, but its main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates (dS and dN) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations.
Involves repeated alignment and tree searching operations. SATé is a software package for inferring a sequence alignment and phylogenetic tree. It searches for a tree/alignment pair with an optimal maximum likelihood (ML) score by performing hill-climbing searches from a collection of starting tree/alignment pairs. For each starting alignment, SATé estimates an ML tree (general time reversible + gamma model) with RAxML. The “second stage” then uses an iterative, greedy search heuristic to find tree/alignment pairs with better ML scores.
Degenerates nucleotides to IUPAC ambiguity codes. Degen reads each DNA sequence as a string of codons, three consecutive nucleotides at a time. It then replaces every codon with a fully degenerated codon, using IUPAC ambiguity codes at the positions that can vary. The program is available as a downloadable desktop application and as a web application.
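The codon-by-codon replacement described for Degen can be illustrated with a simplified sketch. It assumes the standard genetic code and collapses each codon to the IUPAC codes covering all codons that encode the same amino acid. This is an assumption made for illustration, not Degen's actual rule set: for amino acids with split codon families such as serine, the pattern produced here (WSN) also matches codons of other amino acids, and the real program may treat such cases differently. All names below are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_codon(codon: str) -> str:
    """Collapse a codon to IUPAC codes spanning all its synonymous codons.

    Stop codons, incomplete codons, and codons containing characters other
    than A, C, G, T are returned unchanged (simplifying assumption).
    """
    codon = codon.upper()
    aa = CODON_TABLE.get(codon)
    if aa is None or aa == "*":
        return codon
    synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
    return "".join(
        IUPAC[frozenset(c[pos] for c in synonyms)] for pos in range(3)
    )


def degenerate_sequence(seq: str) -> str:
    """Degenerate an in-frame coding sequence codon by codon."""
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return "".join(degenerate_codon(c) for c in codons)


if __name__ == "__main__":
    print(degenerate_sequence("ATGGCACTGAAATAA"))  # ATGGCNYTNAARTAA
```

Methionine (ATG) has a single codon and stays unchanged, alanine becomes GCN, leucine becomes YTN, and lysine becomes AAR, so only positions at which a synonymous change is possible are made ambiguous.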
Estimates a species tree from a set of unrooted gene trees. ASTRAL runs in polynomial time by constraining the search space to a set of allowed bipartitions. It seeks the species tree that maximizes the number of induced quartet trees shared with the set of gene trees. It can be used with large datasets, with a focus on those that include a large number of gene trees (k) and many polytomies.
Facilitates phylogenomic analyses on microeukaryotes. GPSit is an automated method that is compatible with data from genome sequencing and transcriptome sequencing, including data from single cells. The software can help automate and scale the collection of extended DNA barcodes and specimen identification after genome skimming or single-cell sequencing. It is useful for molecular systematics and molecular ecological investigations.
Allows statistical phylogenetic modeling and functional element identification. PHAST is a collection of programs and supporting libraries for comparative genomics. The software also provides methods for detecting departures from neutrality in rates and patterns of molecular evolution. It is well-suited for analyzing patterns of conservation and acceleration in aligned sequences, and for extracting data from or exporting data to the UCSC Genome Browser and related resources, such as Galaxy.
Allows users to reconstruct phylogenetic trees and ancestral genomes from gene order. GRAPPA is a standalone program built on parsimony-based methods; it scores each tree by solving the associated median problems. It can calculate the inversion distance between two signed permutations. The program also contains a reversal median solver and a DCJ median solver.
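As an illustration of what the inversion distance between two signed permutations measures, the sketch below finds the minimum number of signed reversals by breadth-first search. A signed reversal reverses a contiguous segment and flips the sign of every gene in it. This brute-force search is an assumption chosen for clarity and is only feasible for very small permutations; it is not the method GRAPPA uses, and polynomial-time algorithms exist for this problem. Function names are hypothetical.

```python
from collections import deque


def signed_reversals(perm):
    """Yield every permutation reachable from perm by one signed reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            flipped = tuple(-g for g in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def inversion_distance(source, target):
    """Minimum number of signed reversals turning source into target.

    Exhaustive breadth-first search over up to 2^n * n! states, so this is
    only practical for tiny gene orders (roughly n <= 7).
    """
    source, target = tuple(source), tuple(target)
    if sorted(map(abs, source)) != sorted(map(abs, target)):
        raise ValueError("permutations must contain the same genes")
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, dist = queue.popleft()
        if perm == target:
            return dist
        for nxt in signed_reversals(perm):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise RuntimeError("target not reachable")  # cannot happen for valid input


if __name__ == "__main__":
    print(inversion_distance((1, -2, 3), (1, 2, 3)))  # 1
    print(inversion_distance((2, 1), (1, 2)))         # 3
```

The second example shows why signs matter: swapping two adjacent genes that share the same orientation takes three signed reversals, not one.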
Deduces and investigates reticulate evolutionary histories. PhyloNet is based on a Bayesian method that samples the posterior of phylogenetic networks and their associated parameters from bi-allelic data. It assists users in investigating large data sets, as well as in studying the performance of evolutionary network reconstruction methods. This tool allows compact representation of evolutionary networks. It also includes a method for co-estimating gene and species trees from sequence data of multiple, unlinked loci.
A free package of programs for inferring phylogenies. PHYLIP can infer phylogenies by parsimony, compatibility, distance matrix methods, and likelihood. It can also compute consensus trees, compute distances between trees, draw trees, resample data sets by bootstrapping or jackknifing, edit trees, and compute distance matrices. It handles data that are nucleotide sequences, protein sequences, gene frequencies, restriction sites, restriction fragments, distances, discrete characters, and continuous characters.
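The bootstrap resampling mentioned above can be sketched generically: a bootstrap replicate of an alignment is built by drawing alignment columns with replacement until the replicate has the same length as the original. The code below is a minimal illustration under that definition, with the alignment represented as a plain dictionary of equal-length strings; it does not reproduce PHYLIP's file formats or program interface, and the function name is hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence string.
    rng: a random.Random instance, passed in so results are reproducible.
    Columns are sampled with replacement; every sequence receives the same
    sampled columns, so each replicate column is an intact original column.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    aln = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "TCGAAC"}
    rng = random.Random(42)  # fixed seed for a reproducible example
    for _ in range(3):
        print(bootstrap_alignment(aln, rng))
```

In a typical workflow a tree is inferred from each replicate, and the fraction of replicates that recover a given branch is reported as its bootstrap support.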
Provides distance algorithms to infer phylogenies. FastME is based on balanced minimum evolution, the principle underlying neighbor joining (NJ). The software improves trees through topological moves, including nearest neighbor interchange (NNI) and subtree pruning and regrafting (SPR). The program also provides a number of facilities: distance estimation for DNA and proteins with various models and options, bootstrapping, and parallel computations.
Provides an interactive environment for statistical computation in phylogenetics. RevBayes is an application for specifying models, simulating data, and performing Bayesian inference in evolutionary biology, particularly phylogenetics. It is based entirely on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. This environment is quite general and can be useful in any field dealing with complex stochastic models.
Enables analysis of nucleotide or amino acid sequences. PUZZLE is a PHYLIP-compatible maximum likelihood tree reconstruction program that implements quartet puzzling (QP), a heuristic tree search procedure for maximum-likelihood trees, as well as likelihood-mapping analysis.
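Distance methods start from pairwise evolutionary distances estimated under a substitution model. As a simple stand-in for the "various models" mentioned above, the sketch below computes the observed proportion of differing sites (p-distance) and the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3). The choice of JC69, the handling of gaps, and the function names are assumptions made for illustration and are not taken from FastME.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, T
    (for example a gap or an ambiguity code) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in "ACGT" and y in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance in expected substitutions per site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf  # saturation: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    a, b = "ACGTACGT", "ACGTACGA"
    print(p_distance(a, b), jc69_distance(a, b))
```

The corrected distance is always at least as large as the p-distance because it accounts for multiple substitutions at the same site; a matrix of such distances is the input on which minimum-evolution and neighbor-joining methods operate.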