Allows users to reconstruct and analyse phylogenetic relationships between molecular sequences. Phylogeny.fr is composed of (1) a pipeline to reconstruct a phylogenetic tree from a set of sequences through an automated process, (2) a Blast module to search for sequences similar to a query sequence, and (3) a suite of standalone phylogenetic programs. New features and new programs can be easily added to the pipeline. The tool is able to analyze both DNA and protein sequences.
A free package of programs for inferring phylogenies. It can infer phylogenies by parsimony, compatibility, distance matrix methods, and likelihood. It can also compute consensus trees, compute distances between trees, draw trees, resample data sets by bootstrapping or jackknifing, edit trees, and compute distance matrices. PHYLIP handles data that are nucleotide sequences, protein sequences, gene frequencies, restriction sites, restriction fragments, distances, discrete characters, and continuous characters.
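As an illustration of what resampling a data set by bootstrapping means for sequence data, here is a minimal sketch in Python. It is not PHYLIP's own code: the function name `bootstrap_alignment` and the toy alignment are assumptions made for this example. Alignment columns are drawn with replacement until the replicate has as many sites as the original.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps taxon names to equal-length aligned sequences.
    Columns (sites) are sampled with replacement, and the same sampled
    columns are used for every taxon so that sites stay aligned.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols)
            for name, seq in alignment.items()}

# Toy alignment (hypothetical data, for illustration only)
alignment = {
    "taxonA": "ACGTACGT",
    "taxonB": "ACGTTCGT",
    "taxonC": "ACGAACGA",
}
rng = random.Random(42)  # fixed seed so the replicate is reproducible
replicate = bootstrap_alignment(alignment, rng)
```

In a typical bootstrap analysis, many such replicates are generated, a tree is inferred from each, and the replicate trees are summarized, for example as a consensus tree.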
Enables detailed evolutionary analyses of single-cell cancer sequencing data. SPhyR is a method for tumor phylogeny estimation based on a coordinate-ascent approach that infers a k-Dollo phylogeny from single-cell sequencing data with errors. Under the k-Dollo model, each single-nucleotide variant (SNV) can be gained only once but lost up to k times. It provides an explanation of the evolutionary history of a metastatic colorectal cancer.
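The k-Dollo constraint can be made concrete with a small check. The sketch below is not SPhyR's algorithm; it only tests whether one character, whose 0/1 state is already known at every node of a rooted tree, has been gained at most once and lost at most k times. The names `is_k_dollo`, `parent` and `state` are assumptions made for this example.

```python
def count_gains_and_losses(parent, state):
    """Count 0->1 (gain) and 1->0 (loss) changes along tree edges.

    `parent` maps each non-root node to its parent node.
    `state` maps every node to 0 (SNV absent) or 1 (SNV present).
    """
    gains = losses = 0
    for child, par in parent.items():
        if state[par] == 0 and state[child] == 1:
            gains += 1
        elif state[par] == 1 and state[child] == 0:
            losses += 1
    return gains, losses

def is_k_dollo(parent, state, k):
    """True if the character is gained at most once and lost at most k times."""
    gains, losses = count_gains_and_losses(parent, state)
    return gains <= 1 and losses <= k

# Hypothetical tree: root -> a -> (b -> d, c); the SNV is gained on
# the edge root->a and lost again on the edge a->b.
parent = {"a": "root", "b": "a", "c": "a", "d": "b"}
state = {"root": 0, "a": 1, "b": 0, "c": 1, "d": 0}

print(is_k_dollo(parent, state, k=1))  # True: one gain, one loss
print(is_k_dollo(parent, state, k=0))  # False: a loss is not allowed when k = 0
```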
Identifies variations in whole genome sequencing (WGS) reads and conducts phylogenetic analysis of isolates. CSI Phylogeny is a webserver that calls and filters single nucleotide polymorphisms (SNPs), performs site validation and infers a phylogeny based on the concatenated alignment of the SNPs. The method was evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent) and overcomes the systematic biases caused by the sequencers.
Allows phylogenetic analyses (using Phylip) of ribosomal RNA (rRNA). rRNA phylogeny uses a matrix of empirical substitution rates and the OTRNA model. This method makes it possible to obtain a large number of count matrices, corresponding to many different levels of sequence divergence, and to ensure that each count matrix contains a high number of counts.
Provides access to phylogenetic tree generation methods from the ClustalW2 package. Simple Phylogeny is an online method to perform basic phylogenetic analysis on a multiple sequence alignment. It aims to model the substitutions that have occurred over evolutionary time and derive and represent the evolutionary relationships between sequences. It uses an alignment directly entered into the input box in a supported format.
Provides users with a platform for phylogenetic analysis. AQUA PONY offers four options: “Scale”, “Branch”, “Node” and “Annotation”. Results can be displayed in linear or circular form. This software allows the visualization of the statistics of the leading tree and secondary tree.
Aligns raw short-sequence reads to one or more reference sequences. REALPHY avoids biases from mapping to a single reference by implementing a procedure for merging alignments obtained by mapping to multiple reference genomes into a single nonredundant alignment. It was designed to reconstruct phylogenies for microbial genomes, that is, bacterial genomes and single-celled eukaryotes such as fungi, but it can in principle be equally applied to data from higher eukaryotic organisms.
Calculates ancestral gene orders by taking into account the phylogenetic tree and the gene orders assigned to its leaves. DupLoCut aims to find the most parsimonious assignment of gene orders under the duplication-loss evolutionary model. This software handles the capture of inverted duplications and is appropriate for dealing with larger instances or pairs of rather distant genomes.
Assists in comparing whole genome sequences. NexABP is an anchor-based approach for calculating distances among next generation sequencing (NGS) raw data sets. It can be applied to low coverage data and can generate trees without requiring a fully assembled reference genome. It can also resolve inner branches and allows statistical testing using bootstrap analysis.
Consists of a phylogenetic tree-based method for modeling the on/off rates of transcription factor binding (TFB) events. TFBphylo uses an expectation-maximization algorithm to estimate model parameters. This tool has been developed to study TFB evolution using multi-species ChIP-Seq data.
Enables copy number profiling and downstream analyses in disease genetic studies. MARATHON is a pipeline that gathers statistical software: CODEX and CODEX2 perform read depth normalization for total copy number profiling, iCNV receives read depth normalized by CODEX/CODEX2, FALCON and FALCON-X perform allele-specific copy number (ASCN) analysis and Canopy receives input from FALCON/FALCON-X to perform tumor phylogeny reconstruction. The pipeline adapts to different study designs and research goals.
A combinatorial method that infers clonal populations and their frequencies while satisfying phylogenetic constraints and is able to exploit data from multiple samples. Using simulated datasets and deep sequencing data from two cancer studies, CITUP has predicted clonal frequencies and the underlying phylogeny with high accuracy.
Provides a phylogeny-aware alignment method for query sequences (QS). PaPaRa maps short reads against a fixed reference alignment (RA), given as a multiple sequence alignment (MSA), and the corresponding phylogenetic reference tree (RT). It relies on routines for parsing alignment files and trees and uses a custom-built sequential dynamic programming implementation. This tool also uses a simple model for ancestral states and an ‘ad hoc’ scoring scheme.
Infers a multi-state perfect phylogeny describing the evolutionary history of the somatic mutations (single-nucleotide variants (SNVs) and copy-number aberrations (CNAs)) of a tumor given multi-sample bulk sequencing data. SPRUCE addresses complexities in the simultaneous analysis of SNVs and CNAs. Importantly, this tool relies on the infinite alleles, or no-homoplasy, assumption. Finally, SPRUCE gives additional insights into intra-tumor heterogeneity.
Addresses the problem of alignment of very large datasets, potentially containing fragmentary data. UPP can align datasets with up to 1,000,000 sequences. UPP produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences.
Helps users build maximum likelihood (ML) trees more rapidly. MetaPiga is able to provide visualization and statistical analyses of trees. The algorithm can also run a rapid exploration of the search space and identify the optimal trees and multiple optima. Moreover, it can generate a probability index for each branch.
Automates and standardizes the analyses of RAD-seq data for phylogenetic inference. Users of RADIS can let their raw Illumina data be processed up to phylogenetic tree inference, or stop (and restart) the process at some point. Different values for key parameters can be explored in a single analysis (e.g. loci building, sample/loci selection), making possible a thorough exploration of data. RADIS relies on Stacks for demultiplexing of data, removing PCR duplicates and building individual and catalog loci. Scripts have been specifically written for trimming of reads and loci/sample selection. Finally, RAxML is used for phylogenetic inferences, though other software may be utilised.
Allows users to infer tumor progression, including mutation losses, from single cell sequencing data. gpps uses an integer linear programming (ILP) approach that employs a maximum likelihood search to retrieve the best tree that explains the input, starting from single cell data. This software produces the ILP formulation and relies on an ILP solver to obtain the optimal solution.
Contains data on the evolutionary relationship of each cichlid protein to its nearest human and zebrafish protein. Cichlid phylogeny search tool can help cichlid researchers predict biological and protein function for a given cichlid gene, understand the evolutionary history of a given cichlid gene, identify recently duplicated cichlid genes, or perform genome-wide analyses in cichlids that rely on databases generated from other species.
Allows the complete Ensembl gene database to be queried using phylogenetic patterns. PhyloPat offers the possibility of querying with binary phylogenetic patterns or regular expressions, or through a phylogenetic tree of the 39 included species. Users can also input a list of Ensembl, EMBL, EntrezGene or HGNC IDs to check which phylogenetic lineage any gene belongs to.
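To illustrate what querying binary phylogenetic patterns with regular expressions can look like, here is a minimal Python sketch. It does not use PhyloPat's interface or data: the species order, gene names, patterns and the `query` helper are all hypothetical. Each gene is represented by a string with one character per species, `1` if the gene is present and `0` if it is absent.

```python
import re

# Hypothetical species order; position i of every pattern refers to species[i].
species = ["sp1", "sp2", "sp3", "sp4", "sp5"]

# Hypothetical presence/absence patterns, one character per species.
patterns = {
    "geneA": "11111",
    "geneB": "00111",
    "geneC": "11000",
    "geneD": "10101",
}

def query(patterns, regex):
    """Return the sorted names of genes whose whole pattern matches `regex`."""
    compiled = re.compile(regex)
    return sorted(gene for gene, pattern in patterns.items()
                  if compiled.fullmatch(pattern))

# Present in the last three species, any state in the first two.
print(query(patterns, r"[01]{2}111"))  # ['geneA', 'geneB']

# Present in one or more leading species, then absent in all the rest.
print(query(patterns, r"1+0+"))        # ['geneC']
```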
Describes phylogenetic analysis operations and their related metadata. PhylOnt is a terminology intended to encompass methods dealing with phylogenetic reconstruction, giving users a resource for annotating both data and services in workflows. It includes programs and models as well as the possibility of incorporating associated metadata from various sources. This ontology aims to simplify access to, and exchange of, materials from phylogenetic studies.
Depicts phylogenetic analyses. PHAGE is an ontology of more than 45,000 classes, partly collected from repositories such as UniProtKB and Gene Ontology. This ontology is divided into seven parts, each of which stands for a phylogenetic step and merges resource annotations and programs.
Provides aligned and annotated ribosomal RNA (rRNA) gene sequence data, along with tools that allow researchers to analyze their own rRNA gene sequences. RDP offers tools for browsing and searching the data collections, for taxonomic classification and nearest neighbor search, for primer-probe testing and for tree building. RDP data and tools are utilized in fields as diverse as human health, microbial ecology, environmental microbiology, nucleic acid chemistry, taxonomy and phylogenetics.
Provides information about phylogenetic trees and the data used to generate them. TreeBASE covers 4,076 publications. The available studies provide 8,233 matrices and 12,817 trees, including 761,460 taxon labels divided into 104,593 distinct taxa. Searches can be made using five criteria (author, citation, study accession number, matrix accession number, and taxon).
Allows biologists to design custom analyses of data on their local computers. TreeBASEdmp is an online database designed for permitting users to search phylogenetic topologies using nested sets and closure tables. This source can be used for investigating meta-analysis patterns in the field of phylogenetics, such as trends in usage of software programs, algorithms, or taxonomic coverage.
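The nested-set idea mentioned above can be sketched in a few lines of Python. This is an illustration of the general technique, not TreeBASEdmp's schema or code: the function names and the toy tree are assumptions. Each node receives a (left, right) pair from a depth-first traversal, after which "all descendants of a node" becomes a simple interval-containment test, the kind of condition that is easy to express in a database query.

```python
def nested_sets(tree, root):
    """Assign (left, right) numbers to every node by depth-first traversal.

    `tree` maps a node to the list of its children; leaves may be omitted.
    """
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in tree.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root)
    return bounds

def descendants(bounds, node):
    """Nodes whose interval lies strictly inside the interval of `node`."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left < l and r < right)

# Hypothetical topology: root has a leaf A and a clade containing B and C.
tree = {"root": ["A", "clade1"], "clade1": ["B", "C"]}
bounds = nested_sets(tree, "root")

print(bounds["clade1"])                # (4, 9)
print(descendants(bounds, "clade1"))   # ['B', 'C']
```

A closure table takes the complementary approach of storing every ancestor-descendant pair explicitly, trading storage space for even simpler lookups.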
Displays GC content in eukaryotic genomes. GCevobase aims to provide a comprehensive map of how GC content evolves throughout the entire phylogeny of eukaryotic organisms. The database contains data organized in five sections: Ensembl, Ensembl_Metazoa, Ensembl_Plants, Ensembl_Fungi and Ensembl_Protists. It has been created to facilitate research centered on the evolutionary dynamics of genomic composition.
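For readers unfamiliar with the measure, GC content is the fraction of G and C bases among the unambiguous bases of a sequence. The short Python sketch below shows one common way to compute it; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions of this example, not a description of how GCevobase computes its values.

```python
def gc_content(seq):
    """Fraction of G and C among the A, C, G and T bases of `seq`.

    Ambiguous characters (for example N) are ignored. Returns 0.0 when
    the sequence contains no unambiguous base.
    """
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / unambiguous

print(gc_content("ATGC"))    # 0.5
print(gc_content("ggccNN"))  # 1.0
```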
Provides aligned, annotated and phylogenetically ordered sequences of the Signal recognition particle (SRP) components. SRPDB is a database that provides alphabetically and phylogenetically ordered lists of SRP RNA and SRP protein sequences. This online resource is also organized for additional proteins that play a role in SRP-mediated protein transport or are related to SRP components. The SRPDB lists 115 sequences from all phylogenetic groups.
Provides a reference phylogenetic dataset. This resource is a tunicate phylogenomic dataset including all major tunicate groups. It consists of about 250 orthologous nuclear genes for 63 taxa including representative deuterostome species and all major chordate lineages. This repository was built with transcriptomic data obtained through high-throughput sequencing technologies (Roche 454 and Illumina HiSeq).
Provides a database of sequence evolutionary rate profiles linked to amino acid and nucleotide data. EvoDB is composed of five datasets and uses the CODEML program to estimate evolutionary rates. Queries can be run on whole sequences (by protein or by domain), on lineages within a phylogenetic tree or at particular codon sites. It can be used for testing hypotheses in molecular evolution.
An online database for comparative browsing of Borrelia genomes. BorreliaBase is currently populated with sequences from 35 genomes of eight Lyme-borreliosis group Borrelia species and seven relapsing-fever group Borrelia species. Distinct from aggregator databases, this tool serves manually curated comparative-genomic data including genome-based phylogeny, genome synteny, and sequence alignments of orthologous genes and intergenic spacers. It also implements a novel graphical user interface design that encourages comparisons of bacterial genomes under a framework of their shared phylogenetic history.
A resource for bird genomics, which provides access to data released by the Avian Phylogenomics Consortium. This bird portal can be tailored to the needs of the individual bird research communities. It can list available resources and support collaboration within and between research teams by providing and sharing data that can be used to improve the assembly (resequencing projects) or the annotation (variation and transcriptome data) for the genome of interest.
Provides access to the genomic alignments of public ribo-seq reads in conjunction with mRNA-seq reads along with relevant annotation tracks. GWIPS-viz is a specialized ribo-seq browser that allows researchers to look for ribo-seq evidence supporting alternative proteoforms inferred from phylogenetic analysis or detected with proteomics or other experimental techniques. It can be used as a support tool for predictions based on other approaches and for generating hypotheses that can be tested using methods other than ribo-seq.
Provides orthologous groups (OGs) of proteins at different taxonomic levels. eggNOG is a database dedicated to orthologous groups and functional annotation. It provides pairwise orthology relationships within OGs based on analysis of phylogenetic trees. It also contains a framework for mapping novel sequences to OGs based on precomputed hidden Markov model (HMM) profiles.
Provides gene, protein and sequence information for multiple Candida species. CGD contains web-based tools for accessing, analyzing and exploring these data, to facilitate and accelerate research into Candida pathogenesis and biology. Locus pages comprise a summary view along with several additional tabs that display more detailed information, including phenotype details, Gene Ontology term curation, protein product details for coding genes, notes on changes to the sequence or structure of the gene, a comprehensive reference list and the Homology Information tab, a place where phylogeny- and similarity-related data may be examined and evaluated.
Allows users to study and visualize large phylogenetic trees. NCBI Influenza Virus Resource provides public access to influenza sequence data and a convenient interface. This platform is useful for constructing and viewing multiple sequence alignments and phylogenetic and clustering trees. This approach is based on a sequence-level representation of the data.
Provides a repository dedicated to ocean biodiversity and biogeographic data. OBIS is a community database that aims to integrate the most complete and standardized collection of data about marine organisms. The platform intends to develop international collaboration and supplies a variety of features for supporting research such as the identification of biologically important coastal habitats.