Gene Ontology annotation software tools | High-throughput sequencing data analysis
Gene ontologies are unified vocabularies and representations for genes and gene products across all living organisms. Gene annotation is essential for identifying a gene's function or host species, particularly after genome sequencing. Gene ontology software tools are used for management, information retrieval, organization, visualization and statistical analysis of large sets of genes.
Searches a protein database using a translated nucleotide query. BLASTX is a BLAST search application that compares the six-frame conceptual translation products of a nucleotide query sequence (both strands) against a protein sequence database. This application can also work in Blast2Sequences mode and can send BLAST searches over the network to the public NCBI server if desired.
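The six-frame conceptual translation mentioned above can be illustrated with a minimal sketch. It assumes the standard genetic code and an unambiguous DNA alphabet; the function names are illustrative and are not part of BLASTX itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption:
# BLASTX lets the user choose other genetic codes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Each of the six translated products would then be compared against the protein database; that search step is not shown here.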
Allows users to obtain biological features/meaning associated with large gene or protein lists. DAVID can determine gene-gene similarity, based on the assumption that genes sharing global functional annotation profiles are functionally related to each other. It groups related genes or terms into functional groups using the resulting similarity distances. This tool takes into account the redundant and network nature of biological annotation contents.
Permits functional annotation, management, and data mining of novel sequence data. Blast2GO is based on a common controlled vocabulary schema, the Gene Ontology (GO). It takes into consideration similarity, the extension of the homology, the database of choice, the GO hierarchy, and the quality of the original annotations. This tool is suitable for plant genomics research. It generates functional annotation and helps users assess the functional meaning of their experimental results.
Predicts the function of genes and gene sets. GeneMANIA is used for probing gene function and revealing pairwise connections linking genes in yeast, fly, worm, human and other species. It allows users to construct networks from gene lists for custom organisms and network data. Its predictions provide a way to leverage functionally informative associations to explore bacterial gene function.
Gives access to many free software tools for sequence analysis. EMBOSS aims to serve the molecular biology community. It permits the creation and the release of software in an open source spirit. This tool integrates packages and tools for sequence analysis into a seamless whole. It is free of charge and open source.
Predicts functions of cis-regulatory regions. Many coding genes are well annotated with their biological functions. Non-coding regions typically lack such annotation. GREAT assigns biological meaning to a set of non-coding genomic regions by analyzing the annotations of the nearby genes. Thus, it is particularly useful in studying cis functions of sets of non-coding genomic regions. Cis-regulatory regions can be identified via both experimental methods (e.g. ChIP-seq) and by computational methods (e.g. comparative genomics).
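The idea of annotating non-coding regions through nearby genes can be sketched as follows. This is a deliberately simplified illustration, not GREAT's actual association rule: it assumes a single chromosome, assigns each region to the single gene with the closest transcription start site (TSS), and uses hypothetical gene names, coordinates and term labels.

```python
from collections import Counter


def nearest_gene(position, tss_by_gene):
    """Return the gene whose TSS is closest to the given position."""
    return min(tss_by_gene, key=lambda gene: abs(tss_by_gene[gene] - position))


def annotate_regions(regions, tss_by_gene, terms_by_gene):
    """Count how often each annotation term is reached via the nearest gene.

    regions: list of (start, end) tuples on one chromosome (assumption).
    tss_by_gene: gene name -> TSS coordinate (hypothetical data).
    terms_by_gene: gene name -> iterable of annotation terms.
    """
    counts = Counter()
    for start, end in regions:
        midpoint = (start + end) // 2
        gene = nearest_gene(midpoint, tss_by_gene)
        counts.update(terms_by_gene.get(gene, ()))
    return counts


if __name__ == "__main__":
    tss = {"geneA": 1000, "geneB": 5000}                     # hypothetical
    terms = {"geneA": ["term_1"], "geneB": ["term_2"]}       # hypothetical
    peaks = [(900, 1100), (1200, 1400), (4800, 5000)]        # e.g. ChIP-seq peaks
    print(annotate_regions(peaks, tss, terms))
```

The resulting term counts would normally be compared against a background expectation to decide which terms are enriched; that statistical step is omitted here.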
Provides a suite of methods important for the prediction of protein structural and functional features. PredictProtein is a web server that incorporates over 30 tools. This software searches up-to-date public sequence databases, creates alignments, and predicts aspects of protein structure and function. It can help when little is known about the protein in question. For medium-to-high throughput analyses, downloadable software packages and the PredictProtein Machine Image (PPMI) are available.
Permits the management, information retrieval, organization, visualization and statistical analysis of large sets of genes. WebGestalt integrates functional enrichment analysis and information visualization. It supports about 12 organisms, more than 320 gene identifiers from various databases and technology platforms, and about 151 000 functional categories from public databases and computational analyses.
Allows interactive visualization and retrieval of Gene Ontology (GO) terms and genes. NaviGO analyzes functional similarity and associations of GO terms and genes. It constructs similarity matrices based on the input GO terms. From these matrices it either computes functional similarity among gene products/proteins or performs an enrichment analysis by calculating p-values for the overrepresented GO terms in the input.
Serves for visualizing, comparing and plotting gene ontology (GO) annotation results. WEGO allows researchers to work with the directed acyclic graph structure of GO to simplify histogram creation of GO annotation results. Moreover, this program can be used for understanding GO annotations and supports the comparison between several gene datasets.
Allows users to annotate a set of genes or proteins by mapping to genes with known pathways in the KEGG PATHWAY database. KOBAS permits users to detect enriched pathways using a hypergeometric test. This tool has been used for pathway analysis in plants, animals and bacteria. Furthermore, it uses all genes in the whole genome as the default background distribution.
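A hypergeometric enrichment test of the kind described above can be sketched in a few lines. This is a generic illustration of the test, not KOBAS's own code; the parameter names and the numbers in the usage example are assumptions.

```python
from math import comb


def hypergeom_enrichment_pvalue(background_size, pathway_size, list_size, overlap):
    """Upper-tail probability P(X >= overlap) for a hypergeometric variable.

    background_size: number of genes in the background (e.g. whole genome).
    pathway_size:    background genes annotated to the pathway.
    list_size:       genes in the user's input list.
    overlap:         input genes that are annotated to the pathway.
    """
    max_overlap = min(pathway_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20000 background genes, 150 in the pathway,
    # 300 genes in the input list, 10 of which fall in the pathway.
    p = hypergeom_enrichment_pvalue(20000, 150, 300, 10)
    print(f"enrichment p-value: {p:.3g}")
```

When many pathways are tested at once, the resulting p-values would usually be corrected for multiple testing before interpretation.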
A community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the Gene Ontology (GO) Consortium (GOC) has implemented several processes to increase the quantity, quality and specificity of GO annotations.
Provides a set of tools for searching and browsing the Gene Ontology (GO) database. AmiGO visualizes speciation, duplication and horizontal gene transfer events, sequence alignments and descriptive data and external links for both proteins and annotations. The annotation workflow is a two-step process: (i) curators create a model of evolution that is consistent with the observed experimental annotations of modern-day sequences; (ii) once constructed, this model is used to create inferred annotations over the entire tree. The search box was designed to involve strict selection of terms, which results in coherent annotations within protein families, as well as across families implicated in a single process.
Provides a web portal for gene annotation and analysis that helps biologists make sense of one or multiple gene lists. Metascape provides an automated meta-analysis tool to understand common and unique pathways within a group of orthogonal target-discovery studies. It also supports protein-protein interaction (PPI) analysis based on BioGRID, interactive visualization of Gene Ontology (GO) networks and generation of enrichment heatmaps.
Combines literature indices of selected public biological resources in a flexible text-mining system designed for the analysis of groups of genes. TXTGate is a platform that offers multiple 'views' on vast amounts of gene-based free-text information available in selected curated database entries and scientific publications. It enables detailed functional analysis of interesting gene groups by displaying key terms extracted from the associated literature and by offering options to link out to other resources or to sub-cluster the genes on the basis of text.
Allows users to perform comparative genomics that combines functional information and protein families with taxonomic classification. GOTaxExplorer offers users four query types: (1) selection of sets, (2) comparison of sets of Pfam families, (3) semantic comparison of sets of gene ontology (GO) terms and (4) functional comparison of sets of gene products. This software enables users to customize sets of GO terms, families or taxonomic groups.
Creates a binary matrix, filled with 0s and 1s, with gene IDs as row names and GO IDs/terms as column names. If a gene X is annotated with a GO term Y, then the cell XY contains 1, otherwise 0. This application uses the biomaRt package of Bioconductor to query the data.
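The matrix layout described above can be sketched as follows. This illustration takes the gene-to-GO mapping as an in-memory dictionary instead of querying BioMart, and the gene names in the usage example are hypothetical.

```python
def build_annotation_matrix(annotations):
    """Build a binary gene x GO-term matrix.

    annotations: dict mapping gene ID -> set of GO IDs (assumed input format).
    Returns (row_names, column_names, matrix), where matrix[i][j] is 1 if
    gene i is annotated with GO term j, and 0 otherwise.
    """
    genes = sorted(annotations)
    terms = sorted({term for gene_terms in annotations.values() for term in gene_terms})
    matrix = [
        [1 if term in annotations[gene] else 0 for term in terms]
        for gene in genes
    ]
    return genes, terms, matrix


if __name__ == "__main__":
    example = {  # hypothetical genes and their annotations
        "geneA": {"GO:0006915", "GO:0008150"},
        "geneB": {"GO:0008150"},
    }
    genes, terms, matrix = build_annotation_matrix(example)
    print("\t".join(["gene"] + terms))
    for gene, row in zip(genes, matrix):
        print("\t".join([gene] + [str(cell) for cell in row]))
```

Sorting the row and column names keeps the output deterministic, which makes matrices from different runs directly comparable.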
Classifies sequences into protein families and predicts the presence of important domains and sites. InterPro is an integrated resource of protein families, domains and sites which are combined from a number of different protein signature databases, including: Gene3D, PANTHER, PIRSF, Pfam, PRINTS, PROSITE, ProDom, SMART, SUPERFAMILY and TIGRFAMs. InterPro2GO creates annotations from InterPro data. Gene Ontology terms assigned by InterPro2GO are cross-referenced more than 168 million times in UniProtKB, providing terms for almost 50 million individual proteins.
Generates Gene Ontology annotation graphs for protein sets and their associated statistics from simple frequencies to enrichment values and information content based metrics. GRYFUN is a freely available web application that allows GO annotation visualization of protein sets and which can be used for annotation coherence and cohesiveness analysis and annotation extension assessments within under-annotated protein sets.
Accommodates noise and errors in the selected gene set and Gene Ontology (GO). GenGO analyses the GO hierarchy for yeast and humans. This platform is effective in minimizing false positives while at the same time it can accurately balance the set of categories it returns, including both high level and specific categories. GenGO consistently outperforms both the original hypergeometric method and the methods considering only local structural dependencies, in some cases dramatically so.