Supertree building software tools | Phylogenomics data analysis
Phylogenetic tree-building methods use molecular data to represent the evolutionary history of genes and taxa. A recurrent problem is to reconcile the various phylogenies built from different genomic sequences into a single one. This task is generally conducted by a two-step approach whereby a binary representation of the initial trees is first inferred and then a maximum parsimony (MP) analysis is performed on it. This binary representation uses a decomposition of all source trees that is usually based on clades, but that can also be based on triplets or quartets.
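The clade-based binary representation described above can be sketched in a few lines. This is a minimal illustration, assuming rooted source trees written as nested Python tuples with string leaf names; the function names (`leaves`, `clades`, `mrp_matrix`) are hypothetical and do not belong to any tool listed here. Each non-root clade of each source tree becomes one column: "1" for taxa inside the clade, "0" for taxa in the tree but outside the clade, and "?" for taxa missing from that tree.

```python
def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Yield the leaf set of every internal node except the root."""
    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            yield leaves(node)
        for child in node:
            yield from walk(child, False)
    yield from walk(tree, True)


def mrp_matrix(source_trees):
    """Build a clade-based binary matrix: one row per taxon, one column per clade."""
    all_taxa = sorted(set().union(*(leaves(t) for t in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon in clade:
                    rows[taxon].append("1")
                elif taxon in present:
                    rows[taxon].append("0")
                else:
                    rows[taxon].append("?")  # taxon absent from this source tree
    return {taxon: "".join(cells) for taxon, cells in rows.items()}


if __name__ == "__main__":
    trees = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
    for taxon, row in mrp_matrix(trees).items():
        print(taxon, row)
```

The resulting matrix would then be passed to a maximum parsimony program, which is the second step of the approach and is not shown here.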
A software tool for constructing supertrees from source phylogenies. SuperFine is a meta-method that utilizes a novel two-step procedure in order to improve the accuracy and scalability of supertree methods. SuperFine-boosted supertree methods produce more accurate trees than standard supertree methods, and run quickly on very large data sets with thousands of sequences.
Enables the user to quickly summarize consensual information of a set of trees and localize groups of taxa for which the data require consolidation. For k input trees spanning a set of n taxa, this method produces a supertree that satisfies the above-mentioned properties in O(kn^3 + n^4) computing time. The polytomies of the produced supertree are also tagged by labels indicating areas of conflict as well as those with insufficient overlap.
Allows users to reconstruct phylogenies for very large protein families. QuickTree is a program simplifying activities such as bootstrapping and the investigation of more sophisticated distance measures for the phylogenetic research community. It makes it feasible to construct trees for large databases of sequence alignments such as Pfam when only limited resources are available.
Constructs supertrees and explores the underlying phylogenomic information from partially overlapping datasets. Clann has been developed to provide implementations of several supertree methods, all of which allow the investigation of data in a phylogenomic context. Four supertree methods are implemented in Clann: Matrix Representation using Parsimony (MRP), Most Similar Supertree (MSSA), Maximum Quartet Fit (QFIT) and Maximum Splits Fit (SFIT). The software is designed to perform a number of different tasks; however, the interpretation of the results is left entirely to the user.
A triplet-based supertree approach to phylogenomics. SuperTriplets infers supertrees with branch support values. The method avoids several practical limitations of the triplet-based binary matrix representation, making it suitable for large datasets. When the correct resolution of every triplet appears more often in the source trees than the incorrect ones, SuperTriplets is guaranteed to reconstruct the correct phylogeny. Both simulations and case studies on mammalian phylogenomics confirm the advantages of this approach.
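To make the triplet condition concrete, the sketch below decomposes rooted source trees into triplets and reports, for each set of three taxa, the resolution seen most often. It is only an illustration of the decomposition and the majority count, not the SuperTriplets algorithm itself; trees are assumed to be nested tuples of string leaf names, and all function names are hypothetical. A tuple `(a, b, c)` stands for the triplet ab|c, meaning a and b are closer to each other than either is to c.

```python
from collections import Counter
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def internal_leaf_sets(tree):
    """Return the leaf set below every internal node, root included."""
    if isinstance(tree, str):
        return []
    sets = [leaves(tree)]
    for child in tree:
        sets.extend(internal_leaf_sets(child))
    return sets


def triplets(tree):
    """Return all resolved triplets (a, b, c), read as ab|c, displayed by the tree."""
    taxa = leaves(tree)
    found = set()
    for cluster in internal_leaf_sets(tree):
        outside = taxa - cluster
        # a and b share a clade that excludes c, so the tree displays ab|c
        for a, b in combinations(sorted(cluster), 2):
            for c in outside:
                found.add((a, b, c))
    return found


def majority_triplets(source_trees):
    """For each three taxa, keep the resolution seen in the most source trees.

    Ties are broken arbitrarily (first resolution encountered wins).
    """
    counts = Counter()
    for tree in source_trees:
        counts.update(triplets(tree))
    best = {}
    for triplet, n in counts.items():
        key = frozenset(triplet)
        if key not in best or n > best[key][1]:
            best[key] = (triplet, n)
    return {key: triplet for key, (triplet, n) in best.items()}
```

For example, with two source trees supporting AB|C and one supporting AC|B, `majority_triplets` returns AB|C for the taxa A, B and C.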
Provides an application for the construction of large phylogenetic trees. QuickJoin uses heuristics to speed up the neighbour-joining algorithm while still constructing the same tree as the original neighbour-joining algorithm. This makes it possible to construct trees for 8000 species on a single desktop computer. It can perform bootstrap validation of the constructed tree, and the bootstrap values for the individual edges are written as annotations on the output tree.
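For readers unfamiliar with neighbour-joining, the sketch below shows the pair-selection step of the standard (unaccelerated) algorithm: the pair minimising Q(i, j) = (n - 2) d(i, j) - r_i - r_j is joined, where r_i is the sum of distances from i to all other taxa. It does not reproduce QuickJoin's heuristics; the function names and the small distance matrix are illustrative assumptions.

```python
from itertools import combinations


def symmetric(pairs):
    """Expand {(i, j): distance} into a symmetric dict-of-dicts."""
    d = {}
    for (i, j), value in pairs.items():
        d.setdefault(i, {})[j] = value
        d.setdefault(j, {})[i] = value
    return d


def nj_pick_pair(d):
    """Return (Q, i, j) for the pair with the smallest neighbour-joining Q value."""
    names = sorted(d)
    n = len(names)
    row_sum = {i: sum(d[i][j] for j in names if j != i) for i in names}
    best = None
    for i, j in combinations(names, 2):
        q = (n - 2) * d[i][j] - row_sum[i] - row_sum[j]
        if best is None or q < best[0]:
            best = (q, i, j)
    return best


# Illustrative distances between five hypothetical taxa.
EXAMPLE = symmetric({
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
})

if __name__ == "__main__":
    print(nj_pick_pair(EXAMPLE))
```

Note that the pair with the smallest raw distance (d and e) is not the one selected here; the row-sum correction is what distinguishes neighbour-joining from simple closest-pair clustering.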
A dynamic programming method that finds an exact solution to the Robinson-Foulds Supertree problem within a constrained search space. FastRFS has excellent accuracy in terms of criterion scores and topological accuracy of the resultant trees, substantially improving on competing methods on a large collection of biological and simulated data. In addition, FastRFS is extremely fast, finishing in minutes on even very large datasets, and in under an hour on a biological dataset with 2228 species.
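The criterion being optimised can be illustrated with a short scoring function: the Robinson-Foulds supertree score of a candidate supertree is the sum, over all source trees, of the RF distance between the source tree and the supertree restricted to that source tree's taxa. The sketch below is a simplified rooted, cluster-based variant chosen for brevity (an assumption, not a description of FastRFS internals); trees are nested tuples and all names are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Return the leaf sets of all internal nodes except the root."""
    found = set()

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            found.add(leaves(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def restrict(tree, keep):
    """Prune the tree to the taxa in `keep`, suppressing single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)


def rf_distance(tree_a, tree_b):
    """Rooted RF distance: number of clusters found in exactly one of the trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


def rfs_score(supertree, source_trees):
    """Sum of RF distances between each source tree and the restricted supertree."""
    return sum(
        rf_distance(restrict(supertree, leaves(source)), source)
        for source in source_trees
    )
```

A search procedure, exact or heuristic, would then look for the supertree that minimises `rfs_score`; that search is the hard part and is not shown.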
Provides independent C++ implementations of the SUPERB algorithm for counting trees on a phylogenetic terrace. Terraphast offers an independent C++ library to facilitate the integration of this important phylogenetic post-processing step into popular phylogenetic inference tools, which are predominantly written in C or C++. This resource yields exactly the same results as the SUPERB algorithm.
Constructs a supernetwork from partial trees based on simulated annealing. SNSA generates a planar network, whereas Z-closure and Q-imputation potentially produce non-planar networks. The algorithm retains a high percentage of the information in the input trees.
A Python framework that constructs split-based supertrees by computing three majority-rule (MR) supertree variants from the input trees (MR(-), MR(+) and MR(+)g). PluMiST searches tree space by NNI (nearest-neighbor interchange) and TDR (taxa-deletion-reinsertion). Only fully resolved input trees and supertrees are considered, although multifurcating trees may be returned as the strict consensus of equally best-scoring trees.
Aims at inferring supertrees that satisfy the same appealing theoretical properties as with PhySIC, while being as informative as possible under this constraint. The informativeness of a supertree is estimated using a variation of the CIC (cladistic information content) criterion that takes into account both the presence of multifurcations and the absence of some taxa.
Approximates maximum likelihood supertree inference. L.U.St allows the calculation of the approximate likelihood of a supertree, given a set of input trees, performs heuristic searches to look for the supertree of highest likelihood, and performs statistical tests of two or more supertrees. To this end, L.U.St implements a winning sites test allowing ranking of a collection of a-priori selected hypotheses, given as a collection of input supertree topologies. It also outputs a file of input-tree-wise likelihood scores that can be used as input to CONSEL for calculation of standard tests of two trees.
Serves as a greedy strict consensus merger supertree algorithm for rooted input trees. GSCM provides a command-line tool and a Java library to generate different GSCM supertrees that increase the probability of both detecting all reliable clades and excluding all bogus clades. The main goal of this software is to exploit a conservative supertree method as preprocessing for better-resolving supertree methods.
Provides assistance for predicting the multilocus sequence type (MLST) of genomes. MLSTar includes three main functions: (1) it uses genome assemblies and predicted genes from any number of strains, (2) it performs sequence typing using a previously selected scheme from PubMLST and (3) it uses standard phylogenetic approaches to analyze the data. This software broadens the possibilities for performing allele-based genetic characterization.
Software for inferring an optimal species tree under the duplication and duplication-loss costs from a set of unrooted gene trees. Fasturec provides the following algorithms: a local search that computes scores from a set of unrooted gene trees, a generator of initial species trees and a hill-climbing method for inferring a locally optimal species tree.
Allows the construction of a tree directly from raw reads or from assembled sequences. SaffronTree uses a k-mer analysis to construct a phylogenetic neighbor-joining tree in Newick format. This tool supports assembled sequences, third-generation data or next-generation sequencing (NGS) data. It is able to return a graphical representation of the clustering of the data.
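One common way to turn a k-mer analysis into input for neighbor-joining is a pairwise distance based on shared k-mers. The sketch below uses the Jaccard distance between k-mer sets; whether SaffronTree uses this exact measure is an assumption, and the function names and default k are illustrative only.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard_distance(seq_a, seq_b, k=4):
    """1 - |shared k-mers| / |all k-mers|; 0.0 when both sets are empty."""
    kmers_a, kmers_b = kmers(seq_a, k), kmers(seq_b, k)
    union = kmers_a | kmers_b
    if not union:
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


def distance_matrix(samples, k=4):
    """Pairwise k-mer distances for a dict {sample name: sequence}."""
    names = sorted(samples)
    return {
        (i, j): jaccard_distance(samples[i], samples[j], k)
        for i in names
        for j in names
    }
```

The resulting matrix could be handed to any neighbor-joining implementation to obtain a tree; with raw reads, the k-mer sets would be collected over all reads of a sample rather than over a single sequence.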
Creates a gene tree by combining a set of trees on partial, possibly overlapping data. SuGeT employs the super-tree principle combined with reconciliation frameworks for processing tree correction. It deletes the higher duplication nodes and then chooses the super-tree fitting the species tree and their hierarchical position in the initial gene tree. This tool can conserve a set of “trusted” subtrees, as well as the relative phylogenetic distance between them.
Helps count and enumerate the trees on a terrace. SUPERB is an algorithm that takes a set of rooted binary trees and then constructs, when possible, all rooted binary supertrees that are compatible with every tree in the input set. The algorithm starts with all leaves (taxa). For each leaf, it determines whether the leaf belongs to the left or the right subtree of the root. In the recursion, the algorithm then divides the leaves among the children of the next node in the same way.
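The recursion described above can be sketched as a counting routine. This minimal version assumes the input trees have already been decomposed into rooted triplets `(a, b, c)`, read as ab|c; that input format, the function name and the use of memoisation are illustrative choices rather than details taken from a SUPERB implementation. At each step, taxa that some triplet forces onto the same side of the root are grouped into components, and every way of splitting the components into a left and a right subtree is counted recursively.

```python
from functools import lru_cache


def count_supertrees(taxa, triplets):
    """Count rooted binary trees on `taxa` that display every triplet ab|c."""
    triplets = tuple(triplets)

    def components(leaf_set):
        # Union-find: a and b must stay together if ab|c lies inside leaf_set.
        parent = {x: x for x in leaf_set}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for a, b, c in triplets:
            if a in leaf_set and b in leaf_set and c in leaf_set:
                parent[find(a)] = find(b)
        groups = {}
        for x in leaf_set:
            groups.setdefault(find(x), set()).add(x)
        return [frozenset(group) for group in groups.values()]

    @lru_cache(maxsize=None)
    def count(leaf_set):
        if len(leaf_set) <= 2:
            return 1
        comps = components(leaf_set)
        m = len(comps)
        if m == 1:
            return 0  # no root split is compatible with the triplets
        total = 0
        # The first component always goes left, so each split is counted once.
        for mask in range(1, 2 ** (m - 1)):
            left, right = set(comps[0]), set()
            for i, comp in enumerate(comps[1:]):
                if (mask >> i) & 1:
                    right |= comp
                else:
                    left |= comp
            total += count(frozenset(left)) * count(frozenset(right))
        return total

    return count(frozenset(taxa))
```

With no constraints, four taxa give all 15 rooted binary trees; adding the single triplet AB|C leaves 5 of them, and contradictory triplets give 0.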
Includes several learning techniques for classification and regression. CORElearn consists of a suite of machine learning algorithms that can be used for feature selection or discretization of numeric attributes.