Gene orthology detection software tools | High-throughput sequencing data analysis
Gene orthology analysis aims to identify evolutionary relationships between genes from different species. Identification of orthologous gene sets typically involves phylogenetic tree analysis, heuristic algorithms based on sequence conservation, synteny analysis, or some combination of these approaches.
Searches a protein database using a translated nucleotide query. BLASTX is a BLAST search application that compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database. This application can also work in Blast2Sequences mode and can send BLAST searches over the network to the public NCBI server if desired.
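The six-frame conceptual translation mentioned above can be illustrated with a minimal sketch. This is not BLASTX code; it only assumes the standard genetic code (NCBI translation table 1), and the function names and frame labels (`+1` to `-3`) are chosen here for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(nucleotides):
    """Translate three reading frames on each strand (hypothetical helper)."""
    seq = nucleotides.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Each of the six translated products would then be compared against the protein database; that search step is not shown here.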
Aims to identify orthology relationships between genes across many species from all of life. OMA offers a web browser through which new or updated genomes can be requested easily. It integrates domain annotations from Gene3D for individual protein entries. The database prioritizes general-purpose, high-quality genomes, with a special effort toward better sampling of life’s diversity.
Consists of a software suite for inferring orthology relations. OrthoInspector is a portal providing access to precomputed orthology databases and enabling inference of orthologous relationships between protein-coding genes. It also implements comparative genomics tools that are useful to anyone interested in comparative genomics and evolutionary studies of protein families. This software can be used to find intra-domain orthologs in several species and to find interdomain orthologs in fewer, well-studied species.
A program for identifying orthologous protein sequence families. Using real benchmark datasets, we demonstrate that OrthoFinder is more accurate than other orthogroup inference methods by between 8% and 33%.
Allows comparison and separation of orthology relationships. CAT uses reference-free, duplication-aware multiple genome alignments to annotate multiple genomes simultaneously and symmetrically. This software can construct orthology mappings and name equivalence classes of orthologs based on an initial reference annotation. It works with RNA-seq data and polyA-selected libraries.
Converts genome coordinates and genome annotation files between assemblies. UCSC LiftOver is a pipeline that can be used to convert coordinate ranges between genome assemblies. It supports forward/reverse conversions, batch conversions, and conversions between species.
Allows orthology inference. Hieranoid is a hybrid between tree-based and graph-based approaches that infers groups of orthologs in a hierarchical structure. The software uses a species tree as a guide tree, which reduces the number of proteome comparisons to N-1 for N proteomes. Hieranoid can be parallelized on a compute cluster. One of its benefits is its linearly scaling compute time, which makes it useful for applications with large multispecies sets of proteomes.
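The N-1 figure follows from walking a binary guide tree bottom-up and performing one comparison at each internal node. The sketch below only plans those comparisons and is not Hieranoid's implementation; the nested-tuple tree encoding, the species names, and `plan_comparisons` are assumptions made for illustration.

```python
def plan_comparisons(tree):
    """Return (label, comparisons) for a binary guide tree.

    A leaf is a species name (str); an internal node is a 2-tuple of subtrees.
    Each internal node contributes exactly one comparison between its children.
    """
    if isinstance(tree, str):
        return tree, []
    left, right = tree
    left_label, left_cmp = plan_comparisons(left)
    right_label, right_cmp = plan_comparisons(right)
    label = f"({left_label},{right_label})"
    # Post-order: children are resolved before the node that joins them.
    return label, left_cmp + right_cmp + [(left_label, right_label)]


# Hypothetical guide tree with N = 5 proteomes.
guide_tree = ((("human", "mouse"), "chicken"), ("fly", "worm"))
root, comparisons = plan_comparisons(guide_tree)

n_species = 5
print(len(comparisons))                  # N - 1 comparisons along the tree
print(n_species * (n_species - 1) // 2)  # all-against-all pairs, for contrast
for pair in comparisons:
    print(pair)
```

Comparisons in disjoint subtrees, such as (human, mouse) and (fly, worm), do not depend on each other, which is what makes the plan easy to distribute across cluster nodes.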
A toolkit that implements an adjusted MCScan algorithm for detection of synteny and collinearity and incorporates 14 computer programs for visualizing and analyzing identified synteny and collinearity. MCScanX scans multiple genomes or subgenomes in order to identify putative homologous chromosomal regions, and aligns these regions using genes as anchors. MCScanX can be used to effectively analyze chromosome structural changes, and reveal the history of gene family expansions that might contribute to the adaptation of lineages and taxa. An integrated view of various modes of gene duplication can supplement the traditional gene tree analysis in specific families.
A web platform for comparison and annotation of orthologous gene clusters among multiple species. OrthoVenn provides comprehensive coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for identifying orthologous gene clusters, and lets users upload customized protein sequences for user-defined species. It has an efficient, interactive graphical tool that provides a Venn diagram view for comparing the protein sequences of two to six species. Users only need to choose species or upload protein sequences.
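The numbers shown in such a Venn diagram can be derived by counting, for each cluster, the exact set of species it contains. The following is a minimal sketch with made-up data, not OrthoVenn's code; the cluster identifiers, species labels, and `venn_region_counts` are hypothetical.

```python
from collections import Counter


def venn_region_counts(clusters):
    """Count clusters per exact species combination (one Venn region each).

    clusters maps a cluster id to a list of (species, protein_id) members.
    """
    regions = Counter()
    for members in clusters.values():
        species_present = frozenset(species for species, _protein in members)
        regions[species_present] += 1
    return regions


# Made-up clusters for three species A, B and C.
clusters = {
    "c1": [("A", "a1"), ("B", "b1"), ("C", "c1")],
    "c2": [("A", "a2"), ("B", "b2")],
    "c3": [("A", "a3"), ("A", "a4")],  # species-specific (in-paralog) cluster
    "c4": [("A", "a5"), ("B", "b3")],
}

regions = venn_region_counts(clusters)
for species_set, count in regions.items():
    print(sorted(species_set), count)
```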
Provides a variation of the BLAST tool for detecting orthologous groups. BLASTO is a web application that aims to assist users in deducing the function of a sequence and its hypothetical phylogenetic relationships to other sequences through comparison against various databases. Users can set a wide range of parameters, such as genetic codes or substitution matrices, and choose between five different databases: NCBI COG, NCBI KOG, OrthoMCL DB, MultiParanoid and TIGR EGO.
An approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains.
Predicts genes that act together with a gene of interest in specific cellular processes. EvoCor integrates profiles of sequence divergence derived by a Hidden Markov Model (HMM) and tissue-wide gene expression patterns to determine putative functional linkages between pairs of genes.
A database of orthologous mammalian markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies.
Provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB. BUSCO assessments are implemented in open-source software, with comprehensive lineage-specific sets of benchmarking universal single-copy orthologs for arthropods, vertebrates, metazoans, fungi, eukaryotes, and bacteria.
Computes, for a pair of genes, the probability that they are orthologous. primeGEM is a program that performs probabilistic orthology analysis. This framework is based on the gene evolution model, which models gene duplication and loss using a generalized birth-death model with three parameters. The parameters can either be integrated over using Markov chain Monte Carlo (MCMC) or be estimated by maximum likelihood (ML).
A package to compute orthology probabilities. DLRSOrthology enables orthology inference with respect to the DLRS (Duplication, Loss, Rate and Sequence) model by extending the original PrIME-DLR framework with new algorithms, and comes in two variants: the variable-tree variant and fixed-tree variant. DLRSOrthology outperforms competing approaches on synthetic data as well as on biological data sets and is robust to incomplete taxon sampling artifacts.
Enables multi-species orthology inference. SonicParanoid is a program that detects orthologous relationships among multiple species. The software treats groups as elements of numeric sets. It permits addition and deletion of proteomes by reusing the results from previous runs, which is beneficial to users who need to maintain their own orthology databases. This tool can help annotate new genomes and find target genes in medical and biotechnological applications.
Quantifies and displays all the clusters formed by the genes, given a list of protein-coding genes. Cluster Locator is an analysis and visualization software that enables the study of how the genes on a list of interest are distributed in clusters and whether the percentage of gene clustering found in the list is statistically significant. It can analyze lists of protein-coding genes from any organism, and provides preloaded genomes of organisms that are among the most studied.
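The idea of measuring the percentage of clustered genes and testing its significance can be sketched as follows. This is not Cluster Locator's algorithm: the single-chromosome simplification, the `max_gap` definition of a cluster, and the permutation test used for significance are all assumptions made for illustration.

```python
import random


def find_clusters(positions, max_gap=0):
    """Group listed genes on one chromosome into clusters of two or more.

    positions are indices of the listed genes in chromosome gene order.
    Consecutive listed genes share a cluster when at most max_gap unlisted
    genes lie between them (an assumed definition).
    """
    clusters, current = [], []
    for pos in sorted(set(positions)):
        if current and pos - current[-1] - 1 > max_gap:
            if len(current) > 1:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) > 1:
        clusters.append(current)
    return clusters


def percent_clustered(positions, max_gap=0):
    """Percentage of listed genes that belong to some cluster."""
    unique = set(positions)
    if not unique:
        return 0.0
    clustered = sum(len(c) for c in find_clusters(unique, max_gap))
    return 100.0 * clustered / len(unique)


def permutation_p_value(positions, n_genes, max_gap=0, trials=1000, seed=0):
    """Estimate how often random gene lists cluster at least as strongly."""
    rng = random.Random(seed)
    observed = percent_clustered(positions, max_gap)
    k = len(set(positions))
    hits = sum(
        percent_clustered(rng.sample(range(n_genes), k), max_gap) >= observed
        for _ in range(trials)
    )
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)


# Made-up list of six genes on a chromosome with 1000 genes.
gene_list = [3, 4, 5, 20, 41, 42]
print(find_clusters(gene_list))
print(percent_clustered(gene_list))
print(permutation_p_value(gene_list, n_genes=1000))
```

A real analysis would handle multiple chromosomes and could use an analytical test instead of permutations; the sketch only shows how the two quantities relate.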
Identifies homologous genes and interprets their relationships (orthology or paralogy). The user can specify the topology of the tree pattern, and set constraints on its nodes and leaves. Then, this pattern is compared with all the phylogenetic trees of the database, to retrieve the families in which one or several occurrences of this pattern are found.