SATé / Simultaneous Alignment and Tree Estimation

Involves repeated alignment and tree searching operations. SATé is a software package for inferring a sequence alignment and phylogenetic tree. It searches for a tree/alignment pair with an optimal maximum likelihood (ML) score by performing hill-climbing searches from a collection of starting tree/alignment pairs. For each starting alignment, SATé estimates an ML tree (general time reversible + gamma model) with RAxML. The “second stage” then uses an iterative, greedy search heuristic to find tree/alignment pairs with better ML scores.


A Matlab package based on a Bayesian inference method for reconstructing transmission events in a densely sampled outbreak using time-labeled genomic data. TransPhylo works by colouring the branches of the phylogeny using a separate colour for each host, sampled or not. Each section of the tree coloured in a unique colour represents the pathogen evolution happening within a distinct host. Changes of colours on branches therefore correspond to transmission events from one host to another. Authors conclude that genomics cannot wholly replace traditional epidemiology but that Bayesian reconstructions derived from sequence data may form a useful starting point for a genomic epidemiology investigation.

PHAST / PHylogenetic Analysis with Space/Time models

Allows statistical phylogenetic modeling and functional element identification. PHAST is a collection of programs and supporting libraries for comparative genomics. The software also provides methods for detecting departures from neutrality in rates and patterns of molecular evolution. It is well-suited for analyzing patterns of conservation and acceleration in aligned sequences, and for extracting data from or exporting data to the UCSC Genome Browser and related resources, such as Galaxy.


Allows maximum likelihood analysis of large phylogenetic data sets. IQ-TREE explores the tree space efficiently and often achieves higher likelihoods than RAxML and PhyML. Other key features of IQ-TREE are (i) a very fast model selection procedure including partition scheme finding, (ii) partitioned analysis for phylogenomic data, (iii) ultrafast bootstrap approximation, (iv) implementation of several branch tests and (v) tree topology tests. W-IQ-TREE, an intuitive and user-friendly web interface and server for IQ-TREE, is also available.

RWTY / R We There Yet

Implements various tests, visualizations, and metrics for diagnosing convergence and mixing of Markov chain Monte Carlo (MCMC) chains in phylogenetics. RWTY implements and automates many of the functions of the AWTY package in the R environment. It also adds a number of functions not available in AWTY. RWTY provides a single environment in which to analyse the convergence of all parameters in a phylogenetic MCMC analysis, including continuous parameters and those associated with the tree topology. RWTY accepts input from popular phylogenetic MCMC packages, currently including MrBayes, BEAST and RevBayes.


Provides distance algorithms to infer phylogenies. FastME is based on balanced minimum evolution, which is the very principle of Neighbor Joining (NJ). FastME improves over NJ by performing topological moves using fast, sophisticated algorithms. The first version of FastME only included Nearest Neighbor Interchange. The new version also includes Subtree Pruning and Regrafting, while remaining as fast as NJ and providing a number of facilities: distance estimation for DNA and proteins with various models and options, bootstrapping, and parallel computations. FastME is available through several interfaces: command-line (to be integrated in pipelines), PHYLIP-like, and a Web server.
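
Distance methods of this kind start from a matrix of pairwise evolutionary distances. As a minimal sketch of that input step only, the Python below computes Jukes-Cantor (JC69) distances from aligned DNA sequences. The function names and the choice of JC69 are illustrative assumptions; this is not FastME's own code or interface, and FastME offers other models.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping columns with gaps or ambiguity codes."""
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differing / compared


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs):
    """Pairwise JC69 distances for a dict mapping taxon name to aligned sequence."""
    names = sorted(seqs)
    return {(x, y): jc69_distance(seqs[x], seqs[y]) for x in names for y in names}


if __name__ == "__main__":
    toy = {"t1": "ACGTACGT", "t2": "ACGTACGA", "t3": "ACCTACGA"}
    for pair, d in sorted(distance_matrix(toy).items()):
        print(pair, round(d, 4))
```

A matrix like this would then be handed to a tree-building step such as NJ or balanced minimum evolution, which is not shown here.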

CSI Phylogeny / Call SNPs & Infer Phylogeny

Identifies variations in whole genome sequencing (WGS) reads and conducts phylogenetic analysis of isolates. CSI Phylogeny is a webserver which calls and filters single nucleotide polymorphisms (SNPs), performs site validation and infers a phylogeny based on the concatenated alignment of the SNPs. The method was evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent) and overcomes the systematic biases caused by the sequencers.
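
The idea of a concatenated SNP alignment can be sketched as follows, assuming the isolates have already been placed in a common coordinate system. The function name `variable_sites` and the validation rule (an unambiguous base in every isolate) are illustrative assumptions, not the server's actual filters.

```python
def variable_sites(alignment):
    """Keep only columns where every isolate has an unambiguous base and
    at least two different bases occur.

    alignment: dict mapping isolate name to an aligned sequence.
    Returns (positions, snp_alignment) with 0-based column positions.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all sequences must have the same length")

    positions = []
    for i in range(length):
        column = [alignment[n][i].upper() for n in names]
        # Simplified stand-in for site validation: drop columns with N or gaps.
        if all(c in "ACGT" for c in column) and len(set(column)) > 1:
            positions.append(i)

    snps = {n: "".join(alignment[n][i].upper() for i in positions) for n in names}
    return positions, snps


if __name__ == "__main__":
    toy = {"iso1": "ACGTAC", "iso2": "ACGAAC", "iso3": "ATGTNC"}
    print(variable_sites(toy))
```

The resulting SNP-only alignment is what a tree-inference program would receive as input.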

REALPHY / Reference sequence Alignment-based Phylogeny builder

Aligns raw short-sequence reads to one or more reference sequences. REALPHY can successfully avoid biases from mapping to a single reference by implementing a procedure for merging alignments obtained by mapping to multiple reference genomes into a single nonredundant alignment. It was designed to reconstruct phylogenies for microbial genomes, that is, bacterial genomes and single-celled eukaryotes such as fungi, but it can in principle be equally applied to data from higher eukaryotic organisms.


Allows users to specify an effectively countless number of diversification models, where each model describes an alternative scenario for the diversification of the tree. TESS can be used to efficiently simulate under and/or infer parameters of the models. Additionally, TESS provides robust methods for assessing the relative fit of competing models to a given tree, providing users with an extremely flexible yet intuitive framework for testing hypotheses regarding the patterns and processes of lineage diversification.


Simplifies post-processing of MCMC traces with, for example, automatic burn-in estimation. VMCMC is a tool for phylogenetic MCMC analysis that supports analysis and exploration of chain convergence, burn-in estimation, parameter estimation, and graphical display of parameter traces. It can run both as a command-line tool and as an application with a graphical user interface (GUI). It can be applied to trace files from several molecular phylogenetics MCMC tools.


Discovers phylogenetic markers from orthologous sequences. DiscoMark is a bioinformatics program designed to make it easy to develop phylogenetic markers from orthologous DNA sequences. One of the more tedious tasks in phylogenetics is picking the right markers and designing PCR primers so that they can be amplified in the samples of interest. DiscoMark supports researchers in this process and scales it up to the genome level by automating the steps from multiple sequence alignment to PCR primer design.


A Python extension for phylogenetic analysis. Phycas specializes in Bayesian model selection for nucleotide sequence data, particularly the estimation of marginal likelihoods, which are central to computing Bayes factors. Marginal likelihoods can be estimated using methods (Thermodynamic Integration and Generalized Steppingstone) that are more accurate than the widely used Harmonic Mean estimator. Phycas provides for analyses in which the prior on tree topologies allows polytomous trees as well as fully resolved trees, and provides for several choices for edge length priors, including a hierarchical model as well as the recently described compound Dirichlet prior, which helps avoid overly informative induced priors on tree length.
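
For orientation, the harmonic mean estimator mentioned above can be sketched in a few lines, assuming a list of log-likelihood values sampled from the posterior by an MCMC run. The function names are hypothetical and this is not the Phycas API. The sketch shows only the baseline estimator that the more accurate methods are meant to replace.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods sampled from the posterior.

    Computed in log space as log(N) - logsumexp(-log L_i) to avoid underflow.
    """
    n = len(log_likelihoods)
    if n == 0:
        raise ValueError("need at least one sample")
    negated = [-x for x in log_likelihoods]
    shift = max(negated)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in negated))
    return math.log(n) - log_sum


def log_bayes_factor(log_ml_model_1, log_ml_model_2):
    """Log Bayes factor of model 1 over model 2 from two log marginal likelihoods."""
    return log_ml_model_1 - log_ml_model_2


if __name__ == "__main__":
    # Made-up log-likelihood samples for two hypothetical models.
    model_1 = [-1002.3, -1001.8, -1003.1, -1002.0]
    model_2 = [-1005.9, -1006.4, -1005.2, -1006.0]
    print(log_bayes_factor(log_harmonic_mean(model_1), log_harmonic_mean(model_2)))
```

Because the estimate is dominated by the lowest sampled likelihoods, it tends to be unstable, which is one reason thermodynamic integration and steppingstone methods are preferred.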

MixTreEM / Mixture of Trees using Expectation Maximization

A package to reconstruct species trees. MixTreEM uses a probabilistic generative mixture model to reconstruct a set of k candidate species trees given a set of n monocopy gene families. In the first phase, a set of probable species trees is inferred from the gene family data. In the second phase, each of the species trees, along with the gene families, is fed into the DLRS (Duplication, Loss, Rate and Sequence) model, which ultimately yields the best species tree.

BayesCAT / Bayesian Co-estimation of Alignment and Tree

Implements a joint model for co-estimating phylogeny and sequence alignment. The BayesCAT software allows arbitrary-length overlapping indel events and a general distribution for indel fragment size. The implemented methods for joint estimation of phylogeny and sequence alignment infer the phylogeny while accounting for uncertainty in the alignment, and summarize alignment samples. They also infer more information about the indel process.


Estimates speciation, extinction, and preservation rates from fossil occurrence data using a Bayesian framework. PyRate includes several methods to understand how rates vary through time and whether they correlate with traits (e.g. body size) or respond to continuous variables (e.g. climate proxies) or competitive effects (through diversity dependence). Macroevolutionary rates are jointly estimated with preservation rates, which describe the processes of fossilization, sampling and identification of organisms.

VICTOR / Virus Classification and Tree Building Online Resource

Compares bacterial and archaeal viruses using their genome or proteome sequences. The VICTOR results include phylogenomic trees inferred using the Genome-BLAST Distance Phylogeny method (GBDP), with branch support, as well as suggestions for the classification at the species, genus and family level. Based on the results of the VICTOR service, users can make an informed decision on the evolutionary relationships between prokaryotic viruses. The method was thoroughly optimized against a large reference dataset of genome-sequenced taxa recognized by the International Committee on Taxonomy of Viruses (ICTV) and showed a high agreement with the classification, particularly at the species and genus level.


A bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a "foundation" phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, "extension" phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names such that each corresponding foundation-tree tip would branch into its new "extension tree" child.
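
The grafting step described above can be sketched on Newick strings, under the assumption that foundation-tree tips and extension trees are matched by identical names. The function name `graft` and the toy trees are illustrative, not the tool's actual interface.

```python
import re

# A tip label is a run of label characters directly after "(" or ",".
# Internal node labels follow ")" and branch lengths follow ":", so neither matches.
TIP = re.compile(r"(?<=[(,])([^(),:;]+)")


def graft(foundation, extensions):
    """Replace each foundation tip that has an extension tree with that subtree.

    foundation: Newick string of the foundation tree.
    extensions: dict mapping a foundation tip name to a Newick extension tree.
    The tip's original branch length, if any, becomes the branch leading to
    the root of the grafted subtree.
    """
    def swap(match):
        name = match.group(1)
        subtree = extensions.get(name.strip())
        if subtree is None:
            return name  # no extension tree for this tip, keep it as is
        return subtree.rstrip(";")

    return TIP.sub(swap, foundation)


if __name__ == "__main__":
    foundation = "((Aspergillus:0.2,Penicillium:0.3):0.1,Saccharomyces:0.5);"
    extensions = {
        "Aspergillus": "(A_niger:0.01,(A_flavus:0.02,A_oryzae:0.02):0.01);",
    }
    print(graft(foundation, extensions))
```

This sketch does not rescale branch lengths between the two markers, a step that a real pipeline would likely need before the combined tree is used for diversity analyses.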


Selects the most representative organisms, following a set of simple rules based on taxonomy and assembly quality. phyloSkeleton allows users to retrieve the genomes from public databases (NCBI, JGI), to annotate them if necessary, to identify given markers in these, and to prepare files for multiple sequence alignment. phyloSkeleton is also useful to place a novel, unknown organism in a backbone tree, to resolve a particular region of a large tree, or to explore the monophyly of certain taxa.

ERaBLE / Evolutionary Rates and Branch Length Estimation

A phylogenomic distance-based method to estimate the branch lengths of a given reference topology, and the relative evolutionary rates of the genes employed in the analysis. ERaBLE uses as input data a potentially very large collection of distance matrices, where each matrix is obtained from a different genomic region — either directly from its sequence alignment, or indirectly from a gene tree inferred from the alignment. Our experiments show that ERaBLE is very fast and fairly accurate when compared to other possible approaches for the same tasks. Specifically, it efficiently and accurately deals with large data sets, such as the OrthoMaM v8 database, composed of 6,953 exons from up to 40 mammals.

AAF / Alignment and Assembly Free

Provides an efficient way of estimating the phylogenetic relationships using raw sequence data from whole genomes. The AAF method is a robust tool for phylogeny reconstruction especially when only low-coverage and heterogeneous genome data are available. This application was developed, explained, and validated using a combination of sequence evolution models, mathematical calculations and simulated short read sequence (SRS) data from published genomes for 11 primates.


Infers the pattern of chromosome number change along a phylogeny. ChromEvol facilitates the inference of the expected number of polyploidy and dysploidy transitions along each branch of a phylogeny and estimates ancestral chromosome numbers at internal nodes. It features an extension of the model accounting for general multiplication events, other than doubling of the number of chromosomes. This allows the monoploid number (commonly referred to as x, or the base-number) of a group of interest to be inferred in a statistical framework.

PAML / Phylogenetic Analysis by Maximum Likelihood

A package of programs for phylogenetic analyses of DNA and protein sequences using maximum likelihood (ML). PAML may be used to compare and test phylogenetic trees, but its main strengths lie in the rich repertoire of evolutionary models implemented, which can be used to estimate parameters in models of sequence evolution and to test interesting biological hypotheses. Uses of the programs include estimation of synonymous and nonsynonymous rates (dS and dN) between two protein-coding DNA sequences, inference of positive Darwinian selection through phylogenetic comparison of protein-coding genes, reconstruction of ancestral genes and proteins for molecular restoration studies of extinct life forms, combined analysis of heterogeneous data sets from multiple gene loci, and estimation of species divergence times incorporating uncertainties in fossil calibrations.


Estimates the posterior distribution of model parameters. MrBayes is a program that performs Bayesian inference of phylogeny using a variant of Markov chain Monte Carlo (MCMC). It can infer ancestral states while accommodating uncertainty about the phylogenetic tree and model parameters. It also implements several methods for relaxing the assumption of equal rates across sites, including gamma-distributed rate variation. It can be used to test various topological hypotheses or substitution models against each other.


Provides a set of C++ libraries for bioinformatics, including sequence analysis, phylogenetics, molecular evolution and population genetics. Bio++ is object-oriented and is designed to be both easy to use and computationally efficient. Bio++ is intended to help programmers write computationally expensive programs by providing them with a set of reusable tools. Bio++ provides built-in access to sequence databases and data structures for handling and manipulating sequences from the omics era, such as multiple genome alignments and sequencing read libraries. More complex models of sequence evolution, such as mixture models and generic n-tuple alphabets, are also included.

Gubbins / Genealogies Unbiased By recomBinations In Nucleotide Sequences

Identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions. Gubbins uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer. Gubbins is also a practical tool in the context of current sequencing capacity; it is typically able to converge on a result for an alignment of 100 two-megabase sequences in well under an hour, whereas other, more sophisticated model-fitting approaches can take weeks to analyze a much smaller number of genomes.
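
The notion of an "elevated density of base substitutions" can be illustrated with a much simpler stand-in for a spatial scan statistic: count substitutions in fixed windows and flag windows that exceed a multiple of the genome-wide expectation. The function name, window size and threshold are assumptions made for illustration; this is not Gubbins' algorithm.

```python
def dense_windows(snp_positions, genome_length, window=1000, fold=3.0):
    """Flag non-overlapping windows with unusually many substitutions.

    snp_positions: 0-based genome coordinates of inferred substitutions.
    Returns a list of (start, end, count) for windows whose count exceeds
    fold times the count expected if substitutions were spread uniformly.
    """
    if genome_length <= 0 or window <= 0:
        raise ValueError("genome_length and window must be positive")

    expected = len(snp_positions) * window / genome_length

    counts = {}
    for pos in snp_positions:
        index = pos // window
        counts[index] = counts.get(index, 0) + 1

    flagged = []
    for index in sorted(counts):
        if counts[index] > fold * expected:
            start = index * window
            flagged.append((start, min(start + window, genome_length), counts[index]))
    return flagged


if __name__ == "__main__":
    snps = [100, 5000, 5010, 5020, 5030, 5040, 9000]
    print(dense_windows(snps, genome_length=10000))
```

In a recombination-aware workflow, substitutions inside flagged regions would be set aside and the tree built from the remaining sites.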


A Perl script combining the various steps necessary to produce a phylome. PhyloGenie comprises a set of programs that automates the steps from seed sequence to phylogeny, and a utility to extract all phylogenies that match specific topological constraints from a database of trees. BLAST or PSI-BLAST searches are performed for all input sequences, and the HSPs (High Scoring Pairs) meeting user-defined selection criteria (E-value, coverage, score per column, identity) are extracted and used as a basis for multiple sequence alignment.