Stores non-canonical splice site sequences extracted from mammalian annotated genes. SpliceDB is a database that includes (i) the GenBank accession number, (ii) intron number, (iii) positions of donor and acceptor splice junctions in the InfoGene sequence, (iv) sequence around splice sites, and (v) type of splice site in the classification as well as the information about expressed sequence tag (EST) used to support the splice pair.
Collects into a single resource heterogeneous information about splicing regulatory proteins, their binding sites and context-specific activity. Information on binding sites includes the corresponding genes, their genomic coordinates, the splicing effect, the experimental procedures used to assess binding and the related references. All these data have been manually extracted and collected through extensive database and literature screening, yielding information on 71 splicing proteins: to our knowledge, all those investigated so far for which some information is available on their RNA targets.
DEDB / Drosophila melanogaster Exon Database
A database of Drosophila melanogaster exons obtained from FlyBase and arranged in splicing graph form, which permits the creation of simple rules for classifying alternative splicing events. Pfam domains were also mapped onto the protein sequences, allowing users to assess the impact of alternative splicing events on domain organization.
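As an illustration of how a splicing graph lends itself to simple classification rules, the sketch below detects one event type, exon skipping. It is not DEDB's actual rule set: the edge representation, the rule, and the function name `skipped_exons` are assumptions made for this example.

```python
def skipped_exons(edges):
    """Return (upstream, skipped, downstream) exon triples.

    `edges` is a list of (exon, next_exon) splice junctions, i.e. a
    splicing graph. Hypothetical rule: exon b is skipped when the graph
    contains both the path a -> b -> c and the direct junction a -> c.
    """
    successors = {}
    for a, b in edges:
        successors.setdefault(a, set()).add(b)

    events = []
    for a, nexts in successors.items():
        for b in nexts:
            for c in successors.get(b, ()):
                if c in nexts:  # direct junction a -> c also exists
                    events.append((a, b, c))
    return sorted(events)


# Toy graph (invented): exon 2 is included in one isoform, skipped in another.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
toy_events = skipped_exons(toy_edges)
```

Other event types (alternative donor or acceptor sites, mutually exclusive exons) would need their own rules over the same graph.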
Primate Orthologous Exon Database
Provides a catalog of unique, non-overlapping, orthologous exon regions in the genomes of human, chimpanzee, and rhesus macaque. The database can be used in analysis of multi-species RNA-seq expression data, allowing for comparisons of exon-level expression across primates, as well as comparative examination of alternative splicing and transcript isoforms.
A cross-species protein interaction database inferred from three-dimensional (3D) protein structure complexes using 3D-domain interologs and a novel scoring function. For a query protein, the 3D-Interologs database uses BLAST to identify homologous proteins and their interacting partners in multiple species. Based on the scoring function and the structure complexes, 3D-Interologs provides the statistical significance, the interaction models (e.g. hydrogen bonds and conserved amino acids), and functional annotations of the interacting partners of a query protein. The identification of orthologous proteins across multiple species can be used to study protein-protein evolution, protein function, and the cross-referencing of proteins.
Its purpose is to provide direct and free access to the experimental affinity of a given complex structure.
AtPID / Arabidopsis thaliana Protein Interactome Database
Depicts and integrates the information pertaining to protein-protein interaction networks, domain architecture, ortholog information and GO annotation in the Arabidopsis thaliana proteome. AtPID predicts protein-protein interaction pairs by integrating several methods with a naive Bayesian classifier. All other related information curated in AtPID is manually extracted from the published literature and other resources by expert biologists. AtPID collects 5564 mutants with significant morphological alterations which were manually curated to 167 plant ontology (PO) morphology categories and predicts 4457 high confidence gene-PO pairs with 1369 genes as the complement. These single/multiple-gene mutants are indexed and linked to 3919 genes.
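The idea of integrating several prediction methods with a naive Bayesian classifier can be sketched as follows. This is a generic illustration under the usual naive Bayes assumption that the evidence sources are independent; the prior, the likelihood ratios and the function name are invented for the example and are not AtPID's actual parameters.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence sources in odds form.

    posterior odds = prior odds * product of per-method likelihood ratios,
    where each ratio is P(evidence | interacting) / P(evidence | not interacting).
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Invented numbers: two methods that each favour an interaction.
example_posterior = posterior_probability(prior=0.5, likelihood_ratios=[2.0, 3.0])
```

Working in odds form keeps each method's contribution a single multiplicative factor, so adding or removing a method does not change the rest of the calculation.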
A database integrating physical (protein-protein) and functional interactions within the context of an E. coli knowledgebase. Tools are provided which allow the user to select and visualize functional, evolutionary and structural relationships between groups of interacting proteins and to focus on genes of interest.
Binding MOAD / Binding Mother of All Databases
Provides high-resolution structures from the PDB with ligand annotation and protein classifications. Binding MOAD contains protein-ligand structures with their corresponding experimental binding affinities. It lets users display the protein-ligand complex and the ligand’s two-dimensional structure, and perform structure-based searches of ligands. This database represents a large collection of well-resolved protein crystal structures.
BioGRID / Biological General Repository for Interaction Datasets
Assists in the capture of biological interaction data from the primary biomedical literature. BioGRID builds collections and creates annotations of genetic and protein interaction data for all major model organism species and humans. It allows users to investigate the function of individual genes and pathways, as well as to analyze the properties of large biological networks. This database is curated with an automated random re-curation procedure.
BISC / BInary SubComplex Database
A protein-protein interaction (PPI) database linking up the two communities most active in their characterization: structural biology and functional genomics researchers. The BISC resource offers users (i) a structural perspective and related information about binary subcomplexes (i.e. physical direct interactions between proteins) that are either structurally characterized or modellable entries in the main functional genomics PPI databases BioGRID, IntAct and HPRD; (ii) selected web services to further investigate the validity of postulated PPI by inspection of their hypothetical modelled interfaces.
CPDB / ConsensusPathDB
Allows searching, visualizing and retrieving of integrated interaction data. CPDB is an integrative interaction database that gathers molecular interaction data from 32 different public repositories and provides a set of computational methods and visualization tools to explore these data. Its applications comprise over-representation analysis to characterize diverse sets of molecules, gene set enrichment analysis (GSEA) and identification of upstream regulators spanning various biological contexts. It is also used as a database by other tools, for instance by Cytoscape and Chipster.
CORUM / Comprehensive Resource of Mammalian protein complexes
Provides a resource of manually annotated protein complexes from mammalian organisms. Annotation includes protein complex function, localization, subunit composition, literature references and more. All information is obtained from individual experiments published in scientific articles; data from high-throughput experiments is excluded.
DIP / Database of Interacting Proteins
Catalogs experimentally determined interactions between proteins. DIP combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database were curated both manually, by expert curators, and automatically, using computational approaches that draw on knowledge about protein-protein interaction networks extracted from the most reliable, core subset of the DIP data.
A database of known and predicted protein domain (domain-domain) interactions. It contains interactions inferred from PDB entries, and those that are predicted by 13 different computational approaches using Pfam domain definitions. DOMINE contains a total of 26,219 domain-domain interactions (among 5,410 domains) out of which 6,634 are inferred from PDB entries, and 21,620 are predicted by at least one computational approach.
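The two subtotals quoted above add up to more than the overall total, which implies that some interactions are both inferred from PDB entries and predicted computationally. A short inclusion-exclusion check makes this explicit. It assumes that every one of the 26,219 interactions falls into at least one of the two categories, which the description suggests but does not state outright.

```python
# Counts quoted in the DOMINE description.
total_interactions = 26_219   # all domain-domain interactions
inferred_from_pdb = 6_634     # inferred from PDB entries
predicted = 21_620            # predicted by at least one computational approach


def overlap_count(union_size, size_a, size_b):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return size_a + size_b - union_size


# Assumption: the total is the union of the two categories.
both = overlap_count(total_interactions, inferred_from_pdb, predicted)
pdb_only = inferred_from_pdb - both
predicted_only = predicted - both
```

Under that assumption, 2,035 interactions are supported by both a PDB entry and at least one prediction method.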
Many protein interactions are mediated by small protein modules binding to short linear peptides. DOMINO is an open-access database comprising more than 3900 annotated experiments describing interactions mediated by protein-interaction domains. DOMINO can be searched with a versatile search tool and the interaction networks can be visualized with a convenient graphic display applet that explicitly identifies the domains/sites involved in the interactions.
DroID / Drosophila Interactions Database
Provides information about Drosophila gene and protein interactions. DroID is a resource that combines Drosophila protein–protein interactions (PPIs) from all available sources as well as predicted PPIs based on experimentally detected PPIs in other organisms. It integrates gene expression data with interaction data, which allows users to assess the potential relevance of interaction data for specific tissues and specific developmental stages. Data can be browsed via a web site, a graphical web interface and a Cytoscape plug-in.
Aims at collecting and annotating in a structured format all the interactions between human and viral proteins and to integrate this information in the human protein interaction network. The curation effort has focused on manuscripts reporting interactions between human proteins and proteins encoded by some of the most medically relevant viruses: papilloma viruses, human immunodeficiency virus 1, Epstein-Barr virus, hepatitis B virus, hepatitis C virus, herpes viruses and Simian virus 40.
VirHostNet / Virus-Host Network
Supplies a collection of integrated virus-virus, virus-host and host-host interaction networks coupled to their functional annotations. VirHostNet simplifies systems biology and gene-centered analysis of infectious diseases and assists users in the recognition of new molecular targets for antiviral drug design. It aims to improve knowledge of the molecular mechanisms involved in the antiviral response mediated by the cell and of the strategies selected by viruses to hijack the host immune system.
UniHI / Unified Human Interactome
A database for retrieval, analysis and visualization of human molecular interaction networks. UniHI's primary aim is to provide a comprehensive and easy-to-use platform for network-based investigations to a wide community of researchers in biology and medicine. A distinctive feature of UniHI 7 is its user-friendly interface designed to be used intuitively, enabling researchers less acquainted with network analysis to perform state-of-the-art network-based investigations.
TRIP Database / TRansient receptor potential channel-Interacting Protein Database
Gathers comprehensive information on protein-protein interactions (PPIs) of mammalian Transient receptor potential (TRP) channels. The TRIP Database provides a search engine, an interaction map and a function for cross-referencing useful external databases. It represents a valuable resource to assist in understanding the molecular regulatory network of TRP channels.
Associates experimentally-identified protein-protein interactions (PPIs) with human tissues. TissueNet allows users to select a protein and a tissue, and to obtain a network view of the query protein and its tissue-associated PPIs. TissueNet v.2 is an updated version of the TissueNet database. It includes over 40 human tissues profiled via RNA-sequencing or protein-based assays. Users can select their preferred expression data source and interactively set the expression threshold for determining tissue-association. The output of TissueNet v.2 emphasizes qualitative and quantitative features of query proteins and their PPIs. The tissue-specificity view highlights tissue-specific and globally-expressed proteins, and the quantitative view highlights proteins that were differentially expressed in the selected tissue relative to all other tissues. Together, these views allow users to quickly assess the unique versus global functionality of query proteins. Thus, TissueNet v.2 offers an extensive, quantitative and user-friendly interface to study the roles of human proteins across tissues.
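The user-set expression threshold described above can be pictured with a small filter. This is a sketch, not TissueNet's implementation: it assumes an interaction counts as tissue-associated when both partners reach the threshold in the chosen tissue, and the protein names, expression values and function name are invented.

```python
def tissue_ppis(ppis, expression, tissue, threshold):
    """Keep interactions whose two partners are both expressed in `tissue`.

    `ppis` is a list of (protein_a, protein_b) pairs and `expression`
    maps protein -> {tissue: expression level}. A protein with no value
    for the tissue is treated as not expressed (assumption).
    """
    def expressed(protein):
        return expression.get(protein, {}).get(tissue, 0.0) >= threshold

    return [(a, b) for a, b in ppis if expressed(a) and expressed(b)]


# Invented toy data.
toy_expression = {
    "P1": {"liver": 12.0, "brain": 0.5},
    "P2": {"liver": 8.0, "brain": 3.0},
    "P3": {"liver": 0.2, "brain": 9.0},
}
toy_ppis = [("P1", "P2"), ("P2", "P3"), ("P1", "P3")]

liver_network = tissue_ppis(toy_ppis, toy_expression, "liver", threshold=1.0)
brain_network = tissue_ppis(toy_ppis, toy_expression, "brain", threshold=1.0)
```

Raising the threshold shrinks each tissue network, which is why an interactively adjustable cut-off changes which interactions appear tissue-associated.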
SynSysNet / Synaptic Proteins Database
Provides a platform that creates a comprehensive 4D network of synaptic interactions. Protein-protein interactions and drug-target interactions can be viewed as networks; corresponding PubMed IDs or sources are given.
Aims to compile known and predicted post-translational modification (PTM) associations to provide a framework that would enable hypothesis-driven experimental or computational analysis of various scales. In order to make data available for understudied organisms, PTMcode v2 includes a strategy to propagate PTMs from validated modified sites through orthologous proteins.
Provides a freely available, open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
SNP Function Portal
Designed to be a clearing house for all public domain SNP functional annotation data, as well as in-house functional annotations derived from different data sources.
Genome Trax
A comprehensive compilation of variant knowledge that is made available for download for easy integration into your own custom variant analysis pipeline for human whole genome, exome and targeted sequences. With Genome Trax™ content you can confidently identify known pathogenic variants or explore novel, as-yet-uncharacterized variants found within your sequence samples that are predicted to have a deleterious effect by virtue of a change in amino acid, disruption of a regulatory motif, or the disease-, drug-, or pathway-association of the affected gene. The database includes the world’s most comprehensive collection of inherited disease causing mutations from HGMD® Professional and pharmacogenomic variants from PGMD™, as well as regulatory sites from TRANSFAC®, and disease genes, drug targets and pathways from PROTEOME™. It integrates the best public data-sets on somatic mutations, allele frequencies and clinical variants, in their most up-to-date version, for a total of more than 165 million annotations.
Provides a much-needed compendium of genomic variants and their annotations for M. tuberculosis complex (MTBC) and a first step toward accelerating genotype–phenotype correlations in the closely related pathogens. tbvar provides a user-friendly interface, closely integrated and interlinked with other major resources in the field. The tool also provides an interface for annotation of known variants and identification of novel variants obtained from genome sequencing data sets and could potentially lead to application in clinical settings.
A maize expression compendium, making use of an integration methodology and a consistent probe to gene mapping based on the 5b.60 sequence release of Zea mays.
DBM-DB / Diamondback moth Genome Database
A central online repository for storing and integrating genomic data of diamondback moth (DBM), Plutella xylostella (L.).
GONUTS / Gene Ontology Normal Usage Tracking System
A community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins.
OLS / Ontology Lookup Service
A spin-off of the PRIDE project, which required a centralized query interface for ontology and controlled vocabulary lookup. The OLS provides a web service interface to query multiple ontologies from a single location with a unified output format. The OLS can integrate any ontology available in the Open Biomedical Ontology (OBO) format.
Delivers primer and probe information. RTPrimerDB integrates primer sequences, target gene and organism information, mapping data, in silico evaluation characteristics, user feedback, links to citations, assays for the same target and external databases. It aims to solve the problem of laborious primer design and assay evaluation for quantification or detection of the same nucleic acid target sequences by different individuals.
Provides polymerase chain reaction (PCR) and quantitative PCR (qPCR) primers. PrimerBank is a public database for the retrieval of PCR and qPCR primers containing about 400 000 primers that cover 36 928 human and mouse genes, corresponding to around 94% of all known protein-coding gene sequences. Users can access all stored information (primer sequences and annotations, primer validation data, as well as gene annotations) from the search interface.
Provides optimized primers for human and mouse RefSeq genes. qPrimerDepot contains primer sets designed to be used under uniform annealing temperatures. This database is a valuable resource for quantitative real-time RT-PCR (qRT-PCR) applications, especially in circumstances requiring high throughput detection of rare transcripts in curated and/or patient-derived samples that often contain unavoidable contamination with genomic DNA.
Whole Transcriptome qPCR Primers
A set of rules to automatically design primers for all possible exon-exon and intron-exon junctions in the human and mouse transcriptomes.
Offers a collection of post-translational modifications (PTMs). dbPTM contains a dataset of experimentally verified PTMs supported by the literature and gives access to databases and tools associated with PTM analysis. It integrates the emerging S-nitrosylation, S-glutathionylation and succinylation modifications, drawn from approximately 500 research articles extracted by text mining.
Stores annotations and structures for protein pre-, co- and post-translational modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link modifications. RESID provides information such as dates for database entry and modification, systematic name and Chemical Abstracts Service registry number, alternate names, atomic formula and mass, enzyme activities; indicators for amino-terminal, carboxyl-terminal or peptide chain cross-link modifications; literature citations, keywords and feature table representations for the modification in the Protein Information Resource (PIR) and SWISS-PROT protein sequence databases.
Stores experimentally verified phosphorylation sites in eukaryotic proteins. Phospho.ELM is a manually curated web-based resource that contains more than 42 000 non-redundant instances of phosphorylated residues (tyrosine, serine and threonine) in more than 11 000 different protein sequences. The database also provides links to other resources. Users can participate in the curation of the Phospho.ELM resource by submitting their own data.
A database of glycoproteins with O-linked glycosylation sites. The criterion for inclusion is at least one experimentally verified O- or C-glycosylation site. The terminal sugar linked to serine or threonine is cited when known. O-GlycBase is non-redundant in the sense that it contains no identical sequences, unless there is conflicting glycosylation data.
A database on disulphide bonds in proteins that provides information on native disulphides and those which are stereochemically possible between pairs of residues in a protein.
A repository of functional predictions for protein post-translational modifications (PTMs).
PSP / PhosphoSitePlus
Provides information and resources to facilitate phosphorylation research. PhosphoSite is a curated database dedicated to aggregating information on physiological protein phosphorylation sites. Its goal is to identify and organize information about all in vivo phosphorylation sites in human and mouse proteomes. It contains over 330 000 non-redundant post-translational modifications (PTMs), including phospho, acetyl, ubiquityl and methyl groups.
Allows the retrieval of phosphorylation, acetylation, and N-glycosylation data of any protein of interest. PHOSIDA lists posttranslational modification sites associated with particular projects and proteomes or, alternatively, displays posttranslational modifications found for any protein or protein group of interest. In addition, structural and evolutionary information on each modified protein and posttranslational modification site is integrated. Importantly, PHOSIDA links extensive peptide information to the sites, such as several peptides implicating the same site and temporal profiles of each site in response to stimulus (e.g., EGF stimulation).
A knowledge base of ubiquitylated proteins. UbiProt contains retrievable information about overall characteristics of a particular protein, ubiquitylation features, related ubiquitylation and de-ubiquitylation machinery and literature references reflecting experimental evidence of ubiquitylation. UbiProt can serve as a general reference source both for researchers in the ubiquitin field and for those who work with particular ubiquitylated proteins of interest.
UniProt / Universal Protein Resource
Stores protein sequence and annotation data. UniProt provides complete coverage of sequence space at several resolutions while hiding redundant sequences. It can be used as a ‘gold standard’ reference proteome dataset for orthologue benchmarking. This database groups several entities: the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), the UniProt Archive (UniParc) and the UniProt Metagenomic and Environmental Sequences (UniMES) database.
HPRD / Human Protein Reference Database
Provides access to experimentally derived information about the human proteome including protein–protein interactions (PPIs), post-translational modifications (PTMs) and tissue expression. HPRD is an integrated knowledgebase for genomic and proteomic investigators. The database also includes (i) PhosphoMotif Finder, which contains known kinase/phosphatase substrate and binding motifs, (ii) links to a signaling pathway resource called NetPath, and (iii) a distributed annotation system called Human Proteinpedia for enhanced community participation. It also allows the use of BLAST for querying mRNA/protein data.
Annotates eukaryotic genomes with peptide sequences obtained from mass spectrometry (MS) experiments. PeptideAtlas is a compendium of observations of peptides and associated annotations, based on a large number of contributed data sets. It also supports targeted proteomics workflows, such as selected reaction monitoring (SRM), by allowing the researcher to identify suitable proteotypic peptides to target and to estimate approximate retention time for the target peptides.
Hosts a library of glycoproteins with their corresponding N-glycosites. SRMAtlas permits users to validate biomarker candidates by providing a collection of selected reaction monitoring (SRM) assays produced through three different mass spectrometric platforms. Users browse the database through five panels allowing transitions querying, transitions lists downloading, the browsing of specific builds and access to PeptideAtlas SRM Experiment Library (PASSEL) data and experiments.
PRIDE / PRoteomics IDEntifications database
Covers proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE allows public deposition of MS proteomics data. It supports automatic and manual curation of the related experimental metadata and allows the assessment of data quality. This database can be searched by sample details, instrumentation, keywords and other provided annotations.
GPMDB / Global Proteome Machine and Database
A database of proteomics experimental information emphasizing biological context, data reuse & validation. GPMDB is based on a combination of data analysis servers, a user interface, and a relational database. The database was designed to store the minimum amount of information necessary to search and retrieve data obtained from the publicly available data analysis servers. Collectively, this system was referred to as the global proteome machine (GPM). The components of the system have been made available as open source development projects.
Aims to fully represent all relevant aspects of a proteomics experiment and to make them easily accessible to the user.
Simple web application for storing, sharing, visualizing, and analyzing spectrometry files.
NIST Libraries of Peptide Tandem Mass Spectra
Gives access to comprehensive, annotated mass spectral reference collections from various organisms and proteins. NIST Libraries of Peptide Tandem Mass Spectra are useful for the rapid matching and identification of acquired mass spectrometry (MS)/MS spectra. Peptide mass spectrum libraries can be used for direct peptide identification, validation of peptides identified by sequence search programs, organization and identification of recurring, unidentified spectra, detection of internal standards, biomarkers, and target proteins and subtraction of a component from a mixture spectrum.
Public repository of mass spectral data for sharing among the scientific research community.
Current tandem mass spectral libraries for lipid annotations in metabolomics are limited in size and diversity. LipidBlast is a freely available computer-generated tandem mass spectral library of 212,516 spectra covering 119,200 compounds from 26 lipid compound classes, including phospholipids, glycerolipids, bacterial lipoglycans and plant glycolipids.
An open, publicly free database of natural lipids including fatty acids, glycerolipids, sphingolipids, steroids, and various vitamins. LipidBank contains more than 6000 unique molecular structures (ChemDraw cdx format, MDL MOL format), their lipid names (common name, IUPAC), spectral information (mass, UV, IR, NMR and others), and most importantly, literature information.
LIPID MAPS / LIPID Metabolites And Pathways Strategy
Organizes lipids into eight well-defined categories that cover eukaryotic and prokaryotic sources. LIPID MAPS filters data in functional hierarchies involving lipids, reactions, and pathways. It is a multi-institutional effort created in 2003 that uses a systems biology approach and sophisticated mass spectrometers to identify and quantitate all of the major, and many minor, lipid species in mammalian cells. The database also quantitates the changes in these species in response to perturbation.
An evolving pathway map for sphingolipid biosynthesis that includes many of the known sphingolipids and glycosphingolipids arranged according to their biosynthetic origin(s). SphinGOMAP is intended to promote dialog about the “knowns” and “unknowns” of sphingolipid biosynthesis and to lead to experiments that refine this model.
GMD / Golm Metabolome Database
Facilitates the search for and dissemination of mass spectra from biologically active metabolites quantified using gas chromatography (GC) coupled to mass spectrometry (MS). The GMD comprises mass spectra and retention time indices of pure reference substances and frequently observed mass spectral tags (MST: mass spectrum linked to chromatographic retention) of as yet unidentified metabolites.
HMDB / Human Metabolome DataBase
Combines quantitative chemical, physical, clinical and biological data about thousands of endogenous human metabolites. HMDB is a multi-purpose database with a strong focus on quantitative, analytic or molecular-scale information about metabolites, their associated enzymes or transporters and their disease-related properties. This online resource is designed to contain or link three kinds of data: (i) chemical data, (ii) clinical data and (iii) molecular biology/biochemistry data.
BiGG Models
Provides a library of manually-curated genome-scale metabolic models. BiGG Models compiles standardized biochemical, genetic and genomic data and supplies more than 75 models linked to genome annotations and external databases. Information can be browsed by metabolite, model or reaction, and users can compare different models. The database can also be installed locally from the source code or through Docker.
Stores raw experimental and associated metadata from metabolomics studies. MetaboLights is a general purpose, cross-species and cross-application database in metabolomics. Studies are created by researchers in ISA-Tab format, by either automatically creating datasets from in-house laboratory information management systems, or by manually creating ISA-Tab archives with the help of the ISA-tools suite. The species coverage of studies follows the preferences for model species around the globe.
Illumina Data Sets
View sequencing data generated on Illumina sequencers and analyzed in BaseSpace, the Illumina genomics computing environment. See how BaseSpace makes it easy to analyze your sequencing data and generate meaningful reports. View sample data sets and reports for a variety of applications, or test BaseSpace Apps on the sample data, and evaluate the results interactively.
A comprehensive, curated oncogenomic database that provides copy number aberration data to the human cancer research community. Over the past years, the database has undergone an extensive expansion and significant qualitative enhancements. In particular, the database has made the transition from a ‘cytogenetic’ resource based on cancer cytogenetic data to an integrated resource incorporating cancer genome data from an increasing variety of genome analysis techniques. Likewise, many user interface improvements and data analysis tools have been implemented based on suggestions from users.
SKY/M-FISH & CGH Database
Provides a public platform for investigators to share and compare their molecular cytogenetic data. SKY/M-FISH & CGH Database is open to everyone and all users can view an individual investigator's public data or compare public cases from different investigators. Those wishing to contribute their own data must register and can choose to keep their data private for a period not to exceed two years.
TCGA Data Portal / The Cancer Genome Atlas Data Portal
Generates, analyzes, and makes available genomic sequence, expression, methylation, and copy number variation (CNV) data on over 11,000 individuals who represent over 30 different types of cancer. The information generated by TCGA is centrally managed and entered into databases as it becomes available, making the data rapidly accessible to the entire research community. TCGA is a collaborative effort led by the National Cancer Institute and the National Human Genome Research Institute to map the genomic and epigenomic changes that occur in these types of human cancer, including nine rare tumors. Its goal is to support new discoveries through the generation of a catalog of somatic aberrations occurring in the different neoplasms, and to accelerate the pace of research aimed at improving the diagnosis, treatment, and prevention of cancer.
CanGEM / Cancer GEnome Mine
A public, web-based database for storing quantitative microarray data and relevant metadata about the measurements and samples. CanGEM supports the MIAME standard and in addition, stores clinical information using standardized controlled vocabularies whenever possible. Microarray probes are re-annotated with their physical coordinates in the human genome and aCGH data is analyzed to yield gene-specific copy numbers. Users can build custom datasets by querying for specific clinical sample characteristics or copy number changes of individual genes. Aberration frequencies can be calculated for these datasets, and the data can be visualized on the human genome map with gene annotations.
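The aberration frequency calculation mentioned above can be sketched for a user-built dataset as follows. This is an illustrative reconstruction, not CanGEM's code: the call encoding (+1 gain, 0 normal, -1 loss), the choice to count only samples in which a gene was measured, and all sample and gene names are assumptions.

```python
def aberration_frequencies(calls):
    """Per-gene gain and loss frequencies across samples.

    `calls` maps sample -> {gene: call}, with call > 0 for a gain,
    0 for normal copy number and < 0 for a loss (assumed encoding).
    Frequencies are computed over the samples that have a call for the gene.
    """
    genes = sorted({gene for sample in calls.values() for gene in sample})
    frequencies = {}
    for gene in genes:
        observed = [sample[gene] for sample in calls.values() if gene in sample]
        n = len(observed)
        frequencies[gene] = {
            "gain": sum(c > 0 for c in observed) / n,
            "loss": sum(c < 0 for c in observed) / n,
        }
    return frequencies


# Invented toy dataset: four samples, two genes.
example_calls = {
    "sample_1": {"GENE_A": 1, "GENE_B": 0},
    "sample_2": {"GENE_A": 1, "GENE_B": -1},
    "sample_3": {"GENE_A": 0, "GENE_B": -1},
    "sample_4": {"GENE_A": -1},
}
example_freqs = aberration_frequencies(example_calls)
```

Here GENE_A is gained in half of the samples, while GENE_B is lost in two of the three samples in which it was measured.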
A database for identifying and visualizing copy number alterations (CNAs) in cancers at any specific region within the human genome. CaSNP stores pre-computed raw copy numbers, and dynamically generates viewable and downloadable summaries of CNA status in response to user queries. A schema for uniformly processing, storing, annotating and presenting data across different data sets or platforms was successfully implemented, making CaSNP a useful tool for cancer genomic meta-studies. The query results contain numerical values of cancer copy numbers and the frequencies of CNA events, which are well suited for more detailed analysis by other software or methods. Besides the tabular display, the heatmap view displays SNP copy numbers in colors, enabling users to visualize the results intuitively and comprehensively and making it easier to find novel CNA regions in subsets of samples.
A curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides a platform for meta-analysis and systems level data integration of high-resolution oncogenomic CNA data. The 2014 release of arrayMap contains more than 64 000 genomic array data sets, representing about 250 tumor diagnoses. The large amount of tumor CNA data in arrayMap can be freely downloaded by users to promote data mining projects, and to explore special events such as chromothripsis-like genome patterns.
TAMEE / Tissue Array Management and Evaluation Environment
A web-based database application for the management and analysis of data resulting from the production and application of TMAs.
Provides a data and analytics portal focused on taxonomy, ecology, genomics and metagenomics. EzBioCloud is an integrated database with a complete taxonomic hierarchy of the Bacteria and Archaea represented by 16S rRNA gene and genome sequences. All genomes were identified taxonomically at the genus, species or subspecies levels using a combination of gene-based searches.
Provides access to metabolite information and tandem mass spectrometry data. METLIN is a metabolite database for metabolomics containing over 64,000 structures. It also contains a data management system designed to assist in metabolite research by providing public access to its repository of current and comprehensive MS/MS metabolite data. An annotated list of known metabolites and their mass, chemical formula, and structure is available on the METLIN website.
A collection of literature and in-house MSn spectra data for research on plant metabolomics. As a main web application of ReSpect, a fragment search was established based on only the m/z values of query data and records. The confidence levels of the annotations were managed using the MS/MS fragmentation association rule, which is an algorithm for discovering common fragmentations in MS/MS data.
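The idea of a fragment search that relies only on m/z values can be sketched as follows. This is a simplified illustration, not the ReSpect implementation: the function name, the record layout, the tolerance of 0.01 m/z units and the peak values are assumptions chosen for the example.

```python
def fragment_search(query_mz, records, tol=0.01):
    """Rank records by how many query m/z values they match within tol.

    records maps a record name to a list of fragment m/z values
    (hypothetical layout). Records with no matching fragment are dropped.
    """
    hits = []
    for name, peaks in records.items():
        matched = sum(
            1 for q in query_mz if any(abs(q - p) <= tol for p in peaks)
        )
        if matched:
            hits.append((name, matched))
    # Most matched fragments first; ties broken by record name.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Invented spectra, used only to exercise the function.
records = {
    "record_A": [85.03, 127.04, 163.06],
    "record_B": [85.03, 145.05],
    "record_C": [201.10],
}
result = fragment_search([85.03, 163.06], records)
```

Here record_A matches both query fragments, record_B matches one, and record_C is not returned.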
A mass spectral and retention index library for comprehensive metabolic profiling. The current libraries comprise over 1,000 identified metabolites that are currently screened by the Fiehn laboratory.
AHD / Arabidopsis Hormone Database
Provides a large collection of Arabidopsis hormone related genes (AHRGs). AHD integrates detailed gene information and a phenotype ontology that is developed to precisely describe myriad hormone-regulated morphological processes with standardized vocabularies in the model organism Arabidopsis. It offers a systematic and comprehensive view of Arabidopsis hormone related genes. This database contains data on major phytohormones: abscisic acid, auxin, brassinosteroid, cytokinin, ethylene, gibberellin, jasmonic acid and salicylic acid.
Arabidopsis Reactome
A knowledgebase of biological processes in Arabidopsis. It covers biological pathways ranging from the basic processes of metabolism to high-level processes such as cell cycle regulation. While Arabidopsis Reactome is targeted at Arabidopsis pathways, it also includes many biological events from other plant species. This makes the database relevant to the large number of researchers who work on other plants. Arabidopsis Reactome currently contains both in-house curated pathways and pathways imported from the AraCyc and KEGG databases. All the curated information in Arabidopsis Reactome is backed up by its provenance: either a literature citation or an electronic inference based on sequence similarity.
Contains biochemical pathways of Arabidopsis, developed at The Arabidopsis Information Resource. The aim of AraCyc is to represent Arabidopsis metabolism as completely as possible with a user-friendly Web-based interface. It features pathways that include information on compounds, intermediates, cofactors, reactions, genes, proteins, and protein subcellular locations.
A knowledgebase for pathway analysis in Arabidopsis. To create a knowledgebase for plant pathway analysis, 1683 lists of differentially expressed genes were collected from 397 gene-expression studies, which constitute a molecular signature database of various genetic and environmental perturbations of Arabidopsis.
ASD / AlloSteric Database
Provides a central resource for the display, search and analysis of structure, function and related annotation for allosteric molecules. A significant expansion to the context and new features such as allosteric sites and allosteric pathways has been released in the current version of ASD. Additionally, the enhanced front-end and back-end of ASD now enable users to efficiently explore the available information about allostery.
AtIPD / Arabidopsis thaliana Isoprenoid Pathway Database
Provides access to a manually curated list of Arabidopsis isoprenoid pathways and genes, and allows users to visualize pathway topology. The database was compiled using information on pathways and pathway genes from BioPathAt, KEGG, AraCyc, SUBA, and from the literature. AtIPD can be searched or browsed to extract data and external links related to isoprenoid pathway models, enzyme activities, or subcellular enzyme localizations.
BESC Knowledgebase
Serves as a centralized repository for experimentally generated data and provides an integrated, interactive and user-friendly analysis framework. The Portal makes available tools for visualization, integration and analysis of data either produced by BESC or obtained from external resources.
Allows users to search information about pathways and genomes. BioCyc is a database that combines thousands of genomes with additional information curated from the biomedical literature by biologist curators, imported from other databases, and deduced by computer programs. It offers a list of other tools such as: (1) a search tool; (2) a sequence-alignment tool; (3) a tool for visualizing groups of related metabolic pathways; and (4) a tool named SmartTables enabling biologists to perform analyses.
A repository of computational models of biological processes. BioModels Database hosts models described in peer-reviewed scientific literature and models generated automatically from pathway resources (Path2Models). A large number of models collected from the literature are manually curated and semantically enriched with cross-references from external data resources (such as publications, databases of compounds and pathways, ontologies, etc.). The resource allows the scientific community to store, search and retrieve mathematical models of interest. In addition, features such as generation of sub-models, online simulation, conversion of models into different representational formats, and programmatic access via web services are provided.
Stores manually curated information about proteins and genes directly implicated in biodegradation metabolism. When possible, Bionemo includes information on sequence, domains and structures for proteins; and sequence, regulatory elements and transcription units for genes. Bionemo has been built by manually associating sequence database entries with biodegradation reactions, using information extracted from published articles. Information on transcription units and their regulation was also extracted from the literature for biodegradation genes, and linked to the underlying biochemical network.
A database of biochemical pathways that provides access to metabolic transformations and cellular regulations derived from the Roche Applied Science "Biochemical Pathways" wall chart. In the current version 3, BioPath also provides access to biological transformations reported in the primary literature. The BioPath database is available in Symyx MOL/RDF format for integration into existing retrieval systems or, optionally, fully integrated into the web-based retrieval system BioPath.Explore.
A unified biological database that integrates heterogeneous data types such as proteins, structures, domain families, protein-protein interactions and cellular pathways, and establishes the relationships between them. All data are integrated onto a single graph schema centered around the non-redundant set of biological objects that are shared by each source. This integration results in a highly connected graph structure that provides a more complete picture of the known context of a given object than can be determined from any one source.
BRENDA / BRaunschweig ENzyme DAtabase
Contains manually annotated literature-based data on a wide range of aspects of enzyme function, their metabolic role, involvement in disease processes, genomic and protein sequences, and enzyme structures. BRENDA includes information retrieved by text mining of literature abstracts. It is part of ELIXIR’s list of databases considered critically important for life science research.
ENZYME / Enzyme nomenclature database
Contains information related to the nomenclature of enzymes. ENZYME is structured so that it is usable by human readers as well as by computer programs. It describes each type of characterized enzyme for which an Enzyme Commission number has been provided. The database contains 5946 enzymes and can be accessed through the web portal or downloaded. It is searchable by different parameters such as enzyme class, chemical compound, alternative name and cofactor.
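Because Enzyme Commission numbers are four-level hierarchical identifiers, they are straightforward to handle programmatically. The sketch below is an illustrative parser, not part of ENZYME: the function name and return layout are assumptions, and it only handles plain numeric levels and the "-" placeholder for an unspecified level (preliminary serial numbers are not supported).

```python
# The seven top-level EC classes defined by the IUBMB nomenclature.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec(ec):
    """Split an EC number such as 'EC 3.4.21.4' into its four levels.

    A '-' level (unspecified) is returned as None. Raises ValueError on
    malformed input.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {len(parts)}: {ec!r}")
    levels = tuple(None if p == "-" else int(p) for p in parts)
    if levels[0] not in EC_CLASSES:
        raise ValueError(f"unknown EC class in {ec!r}")
    return {"levels": levels, "class_name": EC_CLASSES[levels[0]]}


trypsin = parse_ec("EC 3.4.21.4")
partial = parse_ec("1.1.1.-")
```

For example, parse_ec("EC 3.4.21.4") reports the class Hydrolases, and parse_ec("1.1.1.-") keeps the unspecified fourth level as None.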
Provides access to the International Union of Biochemistry and Molecular Biology (IUBMB) Enzyme List. ExplorEnz is a database containing enzyme data with associated literature references. It is the primary source of new Enzyme Commission (EC) numbers, from which all other databases containing the Enzyme Nomenclature data can be updated. A curatorial interface permits modification of existing entries as well as the addition of new candidate enzymes.
IntEnz / Integrated relational Enzyme database
A freely available resource focused on enzyme nomenclature.
An integrated source of information about peptidases, their substrates and inhibitors, which are of great relevance to biology, medicine and biotechnology. The hierarchical classification of the database is as follows: homologous sets of sequences are grouped into a protein species; protein species are grouped into a family; families are grouped into clans. There is a type example for each protein species (known as a 'holotype'), family and clan, and each protein species, family and clan has its own unique identifier. Pages to show the involvement of peptidases and peptidase inhibitors in biological pathways have been created. Each page shows the peptidases and peptidase inhibitors involved in the pathway, along with the known substrate cleavages and peptidase-inhibitor interactions, and a link to the KEGG database of biological pathways. Links have also been established with the IUPHAR Guide to Pharmacology.
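The hierarchy described above (sequences grouped into protein species, protein species into families, families into clans, each level with its own identifier) can be modeled with a few nested record types. This is only a sketch of the data structure: the class names, field names and identifiers below are hypothetical placeholders, not the database's actual schema or accessions.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinSpecies:
    identifier: str          # unique identifier of the protein species
    holotype: str            # type example for this protein species
    sequences: list = field(default_factory=list)  # homologous sequences


@dataclass
class Family:
    identifier: str
    species: list = field(default_factory=list)    # ProteinSpecies objects


@dataclass
class Clan:
    identifier: str
    families: list = field(default_factory=list)   # Family objects


def lineage(clans, species_id):
    """Return (clan, family, protein species) identifiers, or None."""
    for clan in clans:
        for family in clan.families:
            for sp in family.species:
                if sp.identifier == species_id:
                    return (clan.identifier, family.identifier, sp.identifier)
    return None


# Placeholder identifiers invented for this example.
sp1 = ProteinSpecies("SP-001", holotype="seq_a", sequences=["seq_a", "seq_b"])
sp2 = ProteinSpecies("SP-002", holotype="seq_c", sequences=["seq_c"])
clans = [Clan("CLAN-X", families=[Family("FAM-1", species=[sp1, sp2])])]
path = lineage(clans, "SP-002")
```

Looking up a protein species then yields its full path through the classification, here ("CLAN-X", "FAM-1", "SP-002").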
TECRDB / Thermodynamics of Enzyme-Catalyzed Reactions DataBase
A comprehensive collection of thermodynamic data on enzyme-catalyzed reactions. The data, which consist of apparent equilibrium constants and calorimetrically determined molar enthalpies of reaction, are the primary experimental results obtained from thermodynamic studies of biochemical reactions.
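An apparent equilibrium constant K' of the kind collected here is commonly converted to a standard transformed Gibbs energy of reaction with the textbook relation ΔrG'° = -RT ln K'. The short Python sketch below applies that relation; the function name and the example value K' = 10 are illustrative assumptions and are not taken from TECRDB.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def standard_transformed_gibbs_energy(k_prime, temperature_k=298.15):
    """Return -R*T*ln(K') in kJ/mol for an apparent equilibrium constant."""
    if k_prime <= 0:
        raise ValueError("an equilibrium constant must be positive")
    return -R * temperature_k * math.log(k_prime) / 1000.0


# Illustrative values only.
dg_unity = standard_transformed_gibbs_energy(1.0)   # K' = 1 gives zero
dg_ten = standard_transformed_gibbs_energy(10.0)    # roughly -5.7 kJ/mol
```

At 298.15 K, each tenfold increase in K' lowers the result by roughly 5.7 kJ/mol.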
MACiE / Mechanism, Annotation and Classification in Enzymes
A database of enzyme reaction mechanisms.
CADgene / Coronary Artery Disease Gene Database
Collects coronary artery disease (CAD) candidate genes and their detailed evidence associated with CAD from publications. CADgene is a comprehensive database that provides three approaches for searching the data, including text search and sequence search. For each candidate gene, CADgene displays the KEGG pathways it is involved in and all the CAD-related genes in these pathways. The database aims to provide a complete and up-to-date gene resource to the research community.
A detailed metabolic pathway database built from C. roseus RNA-Seq data sets. CathaCyc contains 390 pathways with 1,347 assigned enzymes and spans primary and secondary metabolism. Curation of the pathways linked with the synthesis of TIAs and triterpenoids, their primary metabolic precursors, and their elicitors, the jasmonate hormones, demonstrated that RNA-Seq resources are suitable for the construction of pathway databases. CathaCyc offers a range of tools for the visualization and analysis of metabolic networks and 'omics' data.
ECMDB / EColi Metabolome DataBase
An expertly curated database containing extensive metabolomic data and metabolic pathway diagrams about Escherichia coli (strain K12, MG1655). This database includes significant quantities of “original” data compiled by members of the Wishart laboratory as well as additional material derived from hundreds of textbooks, scientific journals, metabolic reconstructions and other electronic databases. ECMDB currently contains 3755 small molecules with 1402 associated enzymes and 387 associated transporters. It also has 1542 metabolic pathways that are linked to 3011 metabolites. A total of 19,294 NMR and MS spectra (experimental and predicted) for 3098 different E. coli metabolites are also contained in the database. Each metabolite is linked to more than 100 data fields describing the compound, its ontology, physical properties, reactions, pathways, references, external links and associated proteins or enzymes.
A curated repository for Drosophila melanogaster pathways and reactions. The information in this database is authored by biological researchers with expertise in their fields, maintained by the FlyReactome staff, and cross-referenced with the following external databases: FlyBase, UniProt, NCBI (GeneID and RefSeq), Ensembl, BioGPS, CTD, KEGG (Genes and Compound), ChEBI, PubMed and GO.
GOLD.db / Genomics of Lipid-Associated Disorders database
Developed to address the need for integrating disparate information on the function and properties of genes and their products that are particularly relevant to the biology, treatment, and prevention of lipid-associated disorders.
Allows users to query, visualize, analyze, and compare plant genome and pathway data across crops and model species. Gramene is a resource that uses information generated from projects supported by public funds to improve the study of cross-species comparisons. The database provides a search interface, and views and functionalities for Plant Reactome. It also shares infrastructure, specialized software components and pre-computed data with Ensembl Plants.