A manually curated database built from 38 423 articles published before 1 April 2014 that integrates protein interactomes and several transcriptome datasets. FR database provides detailed information for 904 genes derived from 53 organisms reported to participate in fleshy fruit development and ripening. Genes from climacteric and non-climacteric fruits are also annotated, with several interesting Gene Ontology (GO) terms being enriched for these two gene sets and seven ethylene-related GO terms found only in the climacteric fruit group. Furthermore, protein-protein interaction analysis integrating information from FR database presents a possible functional network that affects fleshy fruit size formation. Collectively, FR database will be a valuable platform for comprehensive understanding and future experiments in fruit biology.
PreDREM / Predicted DNA REgulatory Motifs
A database of DNA regulatory motifs and motif modules predicted from DNase I hypersensitive sites in 349 human cell and tissue samples. It contains 845-1325 predicted motifs in each sample, for a total of 2684 non-redundant motifs. In comparison with seven large collections of known motifs, more than 84% of the 2684 predicted motifs are similar to the known motifs, and 54-76% of the known motifs are similar to the predicted motifs. PreDREM also stores 43 663-2 013 288 motif modules in each sample, which provide the cofactor motifs of each predicted motif.
A database that uncovers the molecular basis of TF binding in the human genome based on regulatory motif analysis of all Transcription Factors (TFs) grouped by family. This allows browsing of all known motifs for each factor, curated from TRANSFAC, Jaspar, and Protein Binding Microarray (PBM) experiments, and their enrichment and instances within corresponding TF binding experiments. It also provides a list of novel regulatory motifs discovered by systematic application of several motif discovery tools (including MEME, MDscan, Weeder, AlignACE) and evaluated based on their enrichment relative to control motifs within TF-bound regions. ENCODE-motifs also provides a genome-wide map of regulatory motif instances in the human genome for both known and novel motifs.
Stores human biological pathways and their annotations. PathCards is an online compendium based on the GeneCards database, presenting SuperPath-related data in each page, enabling quick in-depth analysis of each human SuperPath. The database allows a view of the pathway network connectivity within a SuperPath, as well as the gene lists of the SuperPath and of each of its constituent pathways. Links to the original pathways are available from the pathway database symbols.
Collects information related to metal sites in biological macromolecules. MetalPDB starts from the structural information contained in the Protein Data Bank (PDB) with the aim of providing an overview of metal-containing biological structures. Searches can be made by keyword, metal, sequence, or PDB list. The database includes panels that give access to direct downloads, links to tools dedicated to metal analysis, and various statistics. The repository is updated monthly in an automated manner.
COMe / Co-Ordination of Metals
Represents the ontology for bioinorganic and other small molecule centres in complex proteins. COMe consists of three types of entities: 'bioinorganic motif' (BIM), 'molecule' (MOL), and 'complex proteins' (PRX), with each entity being assigned a unique identifier. The complex proteins in COMe are subdivided into three categories: (i) metalloproteins, (ii) organic prosthetic group proteins and (iii) modified amino acid proteins.
Lists metal protein interactions whose geometry has been experimentally determined and allows them to be visualized. This can contribute to the modeling process.
A publicly accessible web-based database on which the interactions between a variety of chelating groups and various central metal ions in the active site of metalloproteins can be explored in detail. Additional information can also be retrieved including protein and inhibitor names, the amino acid residues coordinated to the central metal ion, and the binding affinity of the inhibitor for the target metalloprotein.
MINAS / Metal Ions in Nucleic AcidS
Compiles the detailed information on innersphere, outersphere and larger coordination environment of >70,000 metal ions of 36 elements found in >2000 structures of nucleic acids contained today in the PDB and NDB. MINAS is updated monthly with new structures and offers a multitude of search functions, e.g. the kind of metal ion, metal-ligand distance, innersphere and outersphere ligands defined by element or functional group, residue, experimental method, as well as PDB entry-related information. The results of each search can be saved individually for later use with so-called miniPDB files containing the respective metal ion together with the coordination environment within a 15 Å radius. MINAS thus offers a unique way to explore the coordination geometries and ligands of metal ions together with the respective binding pockets in nucleic acids.
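The radius-based selection behind a miniPDB-style extract can be illustrated with a minimal Python sketch. It assumes atoms are available as plain coordinate tuples; the function name `atoms_within_radius` and the toy coordinates are hypothetical and do not come from MINAS or any real PDB entry.

```python
import math

def atoms_within_radius(center, atoms, radius=15.0):
    """Return the atoms whose distance to `center` is at most `radius` (in angstroms)."""
    return [a for a in atoms if math.dist(center, a["xyz"]) <= radius]

# Hypothetical toy coordinates, not taken from any real structure.
metal = (0.0, 0.0, 0.0)
atoms = [
    {"name": "OP1", "xyz": (2.1, 0.0, 0.0)},
    {"name": "N7", "xyz": (0.0, 14.9, 0.0)},
    {"name": "C5'", "xyz": (10.0, 10.0, 10.0)},  # about 17.3 angstroms away
]

pocket = atoms_within_radius(metal, atoms)
print([a["name"] for a in pocket])
```

Only the first two atoms fall inside the 15 Å sphere; a real extract would read coordinates from a structure file rather than a hand-written list.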
BioMe / Biologically relevant Metals
A web-based platform for calculation of various statistical properties of metal-binding sites. Users can obtain the following statistical properties: presence of selected ligands in metal coordination sphere, distribution of coordination numbers, percentage of metal ions coordinated by the combination of selected ligands, distribution of monodentate and bidentate metal-carboxyl bindings for ASP and GLU, percentage of particular binuclear metal centers, distribution of coordination geometry, descriptive statistics for a metal ion-donor distance and percentage of the selected metal ions coordinated by each of the selected ligands.
MIPS / Metal Interactions in Protein Structures
A database of metals in the three-dimensional macromolecular structures available in the Protein Data Bank. Bound metal ions in proteins have both catalytic and structural functions. The proposed database serves as an open resource for the analysis and visualization of all metals and their interactions with macromolecular (protein and nucleic acid) structures. MIPS can be searched via a user-friendly interface, and the interactions between metals and protein molecules, and the geometric parameters, can be viewed in both textual and graphical format using the freely available graphics plug-in Jmol.
GWAS Catalog / genome-wide association studies Catalog
Gathers a manually curated resource of all published genome-wide association studies (GWAS) and association results. GWAS Catalog provides a dedicated mapping spreadsheet between all reported GWAS search interfaces and ontology terms. It allows users to identify the child terms included under each higher-level trait category on the GWAS. Moreover, the database can assist in identifying causal variants, understanding disease mechanisms, and establishing targets for novel therapies.
BmTEdb / Bombyx mori Transposable Elements Database
A collective database of transposable elements (TEs) in the silkworm (Bombyx mori) genome. Users can browse, search and download the sequences in the database. Sequence analysis tools such as BLAST, HMMER and EMBOSS GetORF are also provided in BmTEdb. This database will facilitate studies of silkworm genomics, of TE functions in the silkworm and of the comparative analysis of insect TEs.
Nowadays, many protein-protein interaction (PPI) databases are available, but the isoform level PPI prediction database has not been seen yet. IIIDB is a database for isoform-isoform interactions and isoform network modules. The interactions in IIIDB were calculated by a logistic regression approach that integrates information from RNA-seq datasets and domain-domain interaction.
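As a rough illustration of how a logistic regression can combine two evidence sources into one interaction score, the Python sketch below maps a co-expression value and a domain-domain interaction score to a probability. The feature names, the weights and the function `interaction_probability` are assumptions made for this example; they are not the features or fitted coefficients used by IIIDB.

```python
import math

def interaction_probability(coexpression, ddi_score, weights=(-2.0, 3.0, 1.5)):
    """Logistic model: sigmoid(b0 + b1 * coexpression + b2 * ddi_score).

    The default weights are illustrative placeholders, not fitted values.
    """
    b0, b1, b2 = weights
    z = b0 + b1 * coexpression + b2 * ddi_score
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical isoform pairs: (RNA-seq co-expression, domain-domain score).
weak_pair = interaction_probability(0.1, 0.0)
strong_pair = interaction_probability(0.9, 1.0)
print(round(weak_pair, 3), round(strong_pair, 3))
```

In practice the weights would be estimated from a set of known interacting and non-interacting pairs, and pairs scoring above a chosen cutoff would be reported as predicted interactions.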
A database of siRNAs experimentally tested by researchers with consistent efficacy ratings. This database will help siRNA researchers develop more reliable siRNA design rules; in the meantime, siRecords will benefit experimental researchers directly by providing them with information about the siRNAs that have been experimentally tested against the genes of their interest. The current release of siRecords contains the records of 17,192 RNAi experiments targeting 5,086 genes. Data contributors could submit their own records of RNAi experiments.
A database of known and novel AS events in the brain. Known AS events refers to the AS events detected in existing gene models, cDNAs and ESTs. Novel AS events refers to the AS events for which at least one isoform is detected only from RNA-Seq.
STPD / Salinity Tolerant Poplar Database
An integrative database for salt-tolerant poplar genome biology. Currently the STPD contains Populus euphratica genome and its related genetic resources. P. euphratica, with a preference of the salty habitats, has become a valuable genetic resource for the exploitation of tolerance characteristics in trees. This database contains curated data including genomic sequence, genes and gene functional information, non-coding RNA sequences, transposable elements, simple sequence repeats and single nucleotide polymorphisms information of P. euphratica, gene expression data between P. euphratica and Populus tomentosa, and whole-genome alignments between Populus trichocarpa, P. euphratica and Salix suchowensis. The STPD provides useful searching and data mining tools, including GBrowse genome browser, BLAST servers and genome alignments viewer, which can be used to browse genome regions, identify similar sequences and visualize genome alignments. Datasets within the STPD can also be downloaded to perform local searches.
LMPID / Linear Motif mediated Protein Interaction Database
A manually curated database which provides comprehensive experimentally validated information about the linear motifs (LMs) mediating PPIs from all organisms on a single platform. About 2200 entries have been compiled by detailed manual curation of PubMed abstracts, of which about 1000 LM entries were annotated for the first time, as compared with the Eukaryotic LM resource. Users can submit their query through a user-friendly search page and browse the data in the alphabetical order of the bait gene names and according to the domains interacting with the LM.
Consists of a database that curates experimentally validated motif-based molecular switches and a prediction tool to identify possible switching mechanisms that might regulate a user-submitted motif of interest. switches.ELM helps to extend knowledge and direct research on how motifs mediate cooperative decision-making in a context-dependent manner and direct reliable and robust cell regulation.
A collection of different modular protein domains (SH2, SH3, PDZ, WW, etc.). ADAN contains 3505 entries with extensive structural and functional information available, manually integrated, curated and annotated with cross-references to other databases, biochemical and thermodynamical data, simplified coordinate files, sequence files and alignments.
A collection of toxic compounds from literature and web sources. The current version of this database compiles about 60,000 compounds and their structures. These molecules are classified according to their toxicity, based on more than 2 million measurements. The SuperToxic database provides a variety of search options like name, CASRN, molecular weight and measured values of toxicity. With the aid of implemented similarity searches, information about possible biological interactions can be gained.
A resource for the comparison of fragments found in metabolites, drugs or toxic compounds. Starting from 13,000 metabolites, 16,000 drugs and 2200 toxic compounds we generated 35,000 different building blocks (fragments), which are not only relevant to their biosynthesis and degradation but also provide important information regarding side-effects and toxicity.
DSSTox / Distributed Structure-Searchable Toxicity
Provides a public forum for publishing downloadable, structure-searchable, standardized chemical structure files associated with chemical inventories or toxicity data sets of environmental relevance. The standardized files allow one to assess, compare and search the chemical content in each resource, in the context of the larger DSSTox toxicology data network, as well as across large public cheminformatics resources such as PubChem.
A public database to collect plant genes generated by tandem duplication mechanism in the process of plant evolution. PTGBase delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available.
STRBase / Short Tandem Repeat DNA Internet DataBase
An information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing.
Contains 31,396 RepeatMasker-identified non-redundant variant repeat sequences derived from 16,527 mouse cDNAs with protein-coding potential. The repeats were computationally associated with potential effects on transcriptional variation, translation, protein function or involvement in disease to identify Functional REPeats (FREPs). FREP is a unique resource for illuminating the role of transposons and repetitive sequences in shaping the coding part of the mouse transcriptome and for selecting the appropriate experimental model to study diseases with suspected repeat etiology contributions.
Contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of 'evolutive tandem repeats'.
TassDB / TAndem Splice Site DataBase
Stores extensive data about alternative splice events at donors and acceptors, both confirmed and unconfirmed cases. TassDB offers a user-friendly interface to search for specific genes or for genes containing tandem splice sites with specific features as well as the possibility to download result datasets. TassDB provides comprehensive resources for researchers interested in both targeted experimental studies and large-scale bioinformatics analyses of short distance tandem splice sites.
A freely accessible variable number tandem repeat database (VNTRDB) that is intended to be a resource for helping in the discovery of putatively polymorphic tandem repeat loci and to aid with assay design by providing the flanking sequences that can be used in subsequent PCR primer design. In order to reveal possible polymorphism, each TR locus was obtained by comparing the sequences between different sets of bacterial genera, species or strains.
The Microorganisms Tandem Repeats Database
An internet-based resource to help develop and perform tandem repeats based bacterial strain typing. The tools comprise four parts. The Tandem Repeats Database enables the identification of tandem repeats across entire genomes. The Strain Comparison Page identifies tandem repeats differing between different genome sequences from the same species. The "Blast in the Tandem Repeats Database" facilitates the search for a known tandem repeat and the prediction of amplification product sizes. The "Bacterial Genotyping Page" is a service for strain identification at the subspecies level.
RNABP COGEST / RNA Base Pair Count, Geometry and Stability
Brings together information, extracted from literature data, regarding occurrence frequency, experimental and quantum chemically optimized geometries, and computed interaction energies, for non-canonical base pairs observed in a non-redundant dataset of functional RNA structures. RNABP COGEST is designed to enable the quantum mechanical (QM) community, on the one hand, to identify appropriate biologically relevant model systems and also enable the biology community to easily sift through diverse computational results to gain theoretical insights which could promote hypothesis driven biological research.
A searchable database of fungal and bacterial genes encoding lignocellulose-active proteins that have been biochemically characterized. All the biochemical properties and functional annotations described in mycoCLAP are manually curated and are based on experimental evidence reported in published literature. The aim of mycoCLAP is to provide data on solely characterized proteins to facilitate the functional annotation of novel lignocellulose-active proteins.
An interactive database of small molecule ligands of epigenetic protein families that brings together experimental, structural and chemoinformatic data in one place. Currently, EpiDBase encompasses 5784 unique ligands (11 422 entries) of various epigenetic markers such as writers, erasers and readers. EpiDBase includes experimental IC50 values, ligand molecular weight, hydrogen bond donor and acceptor count, XlogP, number of rotatable bonds, number of aromatic rings, InChIKey, and two-dimensional and three-dimensional (3D) chemical structures.
HEMD / Human Epigenetic Enzyme and Modulator Database
Provides a central resource for the display, search, and analysis of the structure, function, and related annotation for human epigenetic enzymes and chemical modulators focused on epigenetic therapeutics. HEMD could be a platform and a starting point for biologists and medicinal chemists for furthering research on epigenetic therapeutics.
A free knowledgebase of chemical modulators with documented modulatory activity for epigenome reader domains. ChEpiMod organizes information about chemical modulators and their associated binding-affinity data, as well as available structures of epigenome readers from the Protein Data Bank. The data are gathered from the literature and patents. Entries are supplemented by annotation. The current version of ChEpiMod covers six epigenome reader domain families (Bromodomain, PHD finger, Chromodomain, MBT, PWWP and Tudor). The database can be used to browse existing chemical modulators and bioactivity data, as well as all available structures of readers and their molecular interactions.
TeloPIN / Telomeric Proteins Interaction Network
A database that aims to provide comprehensive information on protein-protein, protein-DNA and protein-RNA interactions of telomeres. TeloPIN database contains four types of interaction data, including (i) protein-protein interaction (PPI) data, (ii) telomeric proteins ChIP-seq data, (iii) telomere-associated proteins data and (iv) telomeric repeat-containing RNAs (TERRA)-interacting proteins data.
A detailed, searchable repository of coiled-coil assignment. Coiled coils were identified using the program SOCKET, which locates coiled coils based on knobs-into-holes packing of side chains between alpha-helices.
A genome-scale functional network server constructed by integrating diverse genomics data. The network has been used in the genetic dissection of rice biotic stress responses and has been shown to be useful for other grass species. This enhanced network and gene prioritization method will facilitate effective hypothesis generation about the function of the estimated 37K rice genes.
CCG / Catalogue of Cancer Genes
An integrated interactome of cancer genes. CCG has broad biomedical implications for both basic cancer biology and the development of personalized cancer therapy.
A systems biology-based framework to catalogue the human kinome, including 538 kinase genes, in the broader context of the human interactome. This comprehensive human kinome interactome map sheds light on anticancer drug resistance mechanisms and provides an innovative resource for rational kinase inhibitor design.
A gene expression and cancer association database in which the expression levels are mapped to genes using RNA-seq data obtained from The Cancer Genome Atlas, International Cancer Genome Consortium, Expression Atlas and publications. The BioXpress database includes expression data from 64 cancer types, 6361 patients and 17 469 genes with 9513 of the genes displaying differential expression between tumor and normal samples. In addition to data directly retrieved from RNA-seq data repositories, manual biocuration of publications supplements the available cancer association annotations in the database. All cancer types are mapped to Disease Ontology terms to facilitate a uniform pan-cancer analysis.
GENT / Gene Expression database of Normal and Tumor tissues
A web-accessible database which provides gene expression patterns across diverse human cancer and normal tissues. More than 34000 samples, profiled by Affymetrix U133A or U133plus2 platforms, are consistently processed and combined into two large-size data sets, facilitating the identification of cancer outliers over-expressed only in a subset of patients. Gene expression patterns in nearly 1000 human cancer cell lines are also provided. In each tissue, users can retrieve gene expression patterns classified by more detailed phenotypic information.
Provides comprehensive information on the annotation, gene function and expression for the sacred lotus. Users can efficiently query and browse genes, graphically visualize the genome and download a variety of data on genomic DNA, coding sequences (CDS), transcripts or peptide sequences, promoters and markers. It will accelerate research on gene cloning and functional identification in sacred lotus, and hence promote studies on this species and on plant genomics in general.
The L. monocytogenes 10403S BioCyc database
A resource for researchers studying Listeria and related organisms. The L. monocytogenes 10403S BioCyc database allows users to (i) have a comprehensive view of all reactions and pathways predicted to take place within the cell in the cellular overview, as well as to (ii) upload their own data, such as differential expression data, to visualize the data in the scope of predicted pathways and regulatory networks and to carry out enrichment analyses using several different annotations available within the database.
BioSurfDB / Biosurfactants and Biodegradation Database
A curated relational information system integrating data from: (i) metagenomes; (ii) organisms; (iii) biodegradation-relevant genes, proteins and their metabolic pathways; (iv) results of bioremediation experiments, with the treatment efficiencies for specific pollutants achieved by surfactant-producing organisms; and (v) a curated list of biosurfactants, grouped by producing organism, surfactant name, class and reference. The main goal of this repository is to gather information on the characterization of biological compounds and mechanisms involved in biosurfactant production and/or biodegradation and make it available in a curated way and associated with a number of computational tools to support studies of genomic and metagenomic data.
A computational framework to integrate complex relationship among different types of data and infer the potential drug targets by using the semantic web technology, and to improve performance through network neighborhood effect modeling. As a preliminary research, an OWL ontology including drugs, diseases, genes, pathways, SNPs, and their relations from the PharmGKB were constructed.
A search service for abbreviations and long forms used in the life sciences. It addresses the problem that many abbreviations are used in the literature, and that polysemous or synonymous abbreviations appear frequently, making it difficult to read and understand scientific papers outside the reader's expertise. Allie searches for abbreviations and their corresponding long forms from titles and abstracts in the entire MEDLINE, a database of the U.S. National Library of Medicine.
BioABACUS / Biotechnology ABbreviation and Acronym Uncovering Service
A searchable, cross-referenced, database of abbreviations and acronyms in biotechnology and computer science. To researchers in biotechnology in general, BioABACUS should help them avoid christening new terms with old acronyms or abbreviations.
ADAM / Another Database of Abbreviations in MEDLINE
Covers commonly used abbreviations and their definitions (or long-forms) within MEDLINE titles and abstracts, including both acronym and non-acronym abbreviations. A model of recognizing abbreviations and their long-forms from titles and abstracts of MEDLINE (2006 baseline) was employed. After grouping morphological variants, 59 405 abbreviation/long-form pairs were identified. ADAM shows high precision (97.4%) and includes most of the frequently used abbreviations contained in the Unified Medical Language System (UMLS) Lexicon and the Stanford Abbreviation Database.
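The general idea of pairing an abbreviation with its long form can be shown with a deliberately simple Python heuristic: look for an upper-case token in parentheses and check whether the initials of the preceding words spell it out. This is an assumption-laden toy, not the recognition model used by ADAM, and it only handles plain acronyms; the function name `find_acronym_pairs` is hypothetical.

```python
import re

def find_acronym_pairs(text):
    """Return (abbreviation, long form) pairs for 'long form (ABBR)' patterns.

    A pair is kept only when the initials of the words directly before the
    parenthesis match the abbreviation letter by letter.
    """
    pairs = []
    for match in re.finditer(r"\(([A-Z]{2,10})\)", text):
        abbr = match.group(1)
        # Words that precede the opening parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        candidate = words[-len(abbr):]
        if len(candidate) == len(abbr) and all(
            word[0].upper() == letter for word, letter in zip(candidate, abbr)
        ):
            pairs.append((abbr, " ".join(candidate)))
    return pairs

sample = ("The Unified Medical Language System (UMLS) and "
          "short tandem repeat (STR) markers.")
print(find_acronym_pairs(sample))
```

Non-acronym abbreviations, which ADAM also covers, would need a looser matching rule than first-letter agreement.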
SaRAD / Simple and Robust Abbreviation Dictionary
Provides an easy to implement, high performance tool for the construction of a biomedical symbol dictionary. The algorithms, applied to the MEDLINE document set, result in a high quality dictionary and toolset to disambiguate abbreviation symbols automatically.
An inventory of abbreviations and acronyms from clinical texts. Sense inventories created using clinical notes and medical dictionary resources demonstrate challenges with term coverage and resource integration.
GNS / Gene Name Service
Genomics researchers suffer from getting lost in the forest of gene aliases. Gene Name Service (GNS) solves this problem by providing a comprehensive alias resolution service of all widely used gene id nomenclatures. Most importantly, in addition to web-based interface, GNS also provides services through Web Service, which can be integrated into applications such as bioinformatics value-added databases, analysis pipelines or workflows.
A database of models of the genome of Mycobacterium tuberculosis (Mtb). The CHOPIN database assigns structural domains and generates homology models for 2911 sequences, corresponding to approximately 73% of the proteome. A sophisticated pipeline allows multiple models to be created using conformational states characteristic of different oligomeric states and ligand binding, such that the models reflect various functional states of the proteins. Additionally, CHOPIN includes structural analyses of mutations potentially associated with drug resistance.
A curated database of phosphorylation sites in prokaryotes for 96 prokaryotic organisms, which belong to 11 phyla in two domains including bacteria and archaea. All the phosphorylation sites were annotated with original references and other descriptions in the database, which could be easily accessed through user-friendly website interface including various search and browse options. The dbPSP database provides a comprehensive data resource for further studies of protein phosphorylation in prokaryotes.
3DGD / 3D Genome Database
A database that currently collects Hi-C data on four species, for easy access to and visualization of chromatin 3D structure data. With the integration of other omics data such as genome-wide protein-DNA-binding data, this data source would be useful for researchers interested in chromatin structure and its biological functions.
A general repository for chromatin interaction data. Records in 4DGenome are compiled through comprehensive literature curation of experimentally-derived and computationally-predicted interactions. The current release contains 4,433,071 experimentally-derived and 3,605,176 computationally-predicted interactions in 5 organisms. Experimental data cover both high throughput datasets and individual focused studies. All interaction data are freely available in a standardized file format. Records can be queried by genomic regions, gene names, organism, and detection technology.
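A query by genomic region amounts to an interval-overlap test against the two anchors of each interaction. The Python sketch below shows that test on toy data; the record layout (keys `id`, `a`, `b`, each anchor a `(chromosome, start, end)` tuple) is a hypothetical stand-in and not the actual 4DGenome file format.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end

def query_region(records, chrom, start, end):
    """Return interactions with at least one anchor overlapping the region."""
    hits = []
    for rec in records:
        for a_chrom, a_start, a_end in (rec["a"], rec["b"]):
            if a_chrom == chrom and overlaps(a_start, a_end, start, end):
                hits.append(rec)
                break  # report each interaction once
    return hits

# Hypothetical toy interactions.
records = [
    {"id": "r1", "a": ("chr1", 100, 200), "b": ("chr1", 5000, 5100)},
    {"id": "r2", "a": ("chr2", 100, 200), "b": ("chr2", 900, 950)},
    {"id": "r3", "a": ("chr1", 8000, 8100), "b": ("chr1", 150, 160)},
]

hits = query_region(records, "chr1", 120, 180)
print([rec["id"] for rec in hits])
```

A linear scan is adequate for a small download; for millions of records an interval index would be the usual choice.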
Enables simultaneous comparisons between a wide range of data by combining major resources from human and vertebrate model organisms. Manteia performs several types of analyses as well as data retrieval, gene or probe set annotation, information content analysis or candidate gene prediction and prioritization. It aims to help in investigating the genetic origin of human diseases or identifying significant correlations in lists of genes and proteins generated by modern high-throughput techniques.
A curated database of human, mouse and rat miRNA/mRNA targets. miRGate is designed to analyze lists of miRNAs and gene isoforms under a common and consistent space of annotations, including all existing 3' UTRs and all known miRNAs. All Havana biotypes and ENCODE principal isoforms for the three organisms are also included.
A freely accessible web application and database that enables human mitochondrial genome researchers to study genetic variation in the mitochondrial genome through textual and graphical views, and that assigns haplogroups to data submitted by users. MitoVariome contains many kinds of variation features in the human mitochondrial genome and will be useful for understanding the mitochondrial variations of each individual, haplogroup, or geographical location to elucidate the history of human evolution.
STEPdb / STEP database
Contains a comprehensive characterization of subcellular localization and topology of the complete proteome of Escherichia coli. Two widely used E. coli proteomes (K-12 and BL21) are presented organized into thirteen subcellular classes. STEPdb exploits the wealth of genetic, proteomic, biochemical, and functional information on protein localization, secretion, and targeting in E. coli, one of the best understood model organisms. Subcellular annotations were derived from a combination of bioinformatics prediction, proteomic, biochemical, functional, topological data and extensive literature re-examination that were refined through manual curation.
FLAVIdB / Flavivirus Database
A database that combines antigenic data of flaviviruses, specialized analysis tools, and workflows for automated complex analyses focusing on applications in immunology and vaccinology. FLAVIdB represents a new generation of databases in which data and tools are integrated into a data mining infrastructure specifically designed to aid rational vaccine design by discovery of vaccine targets.
Contains over 590 complete flavivirus genome/protein sequences and information on known mutations and literature references. Each sequence has been manually annotated according to its date and place of isolation, phenotype and lethality. Internal tools are provided to rapidly determine relationships between viruses in Flavitrack and sequences provided by the user.
A portal to accessing the lineage and genotype information of influenza A viruses and a Web tool for determining lineages and genotypes of influenza A viruses. These features make FluGenome unique in its ability to automatically detect genotype differences attributable to reassortment events in influenza A virus evolution.
GISAID EpiFlu database
Provides a collection of influenza sequences with associated clinical and epidemiological metadata. GISAID EpiFlu database is a resource that stores information about influenza viruses. It includes a data-sharing platform through which sequence data are recommended for inclusion in seasonal and pre-pandemic vaccines. These data are available to research scientists, public and animal health officials and the pharmaceutical industry.
MPIC / Mitochondrial Protein Import Components
Provides searchable information on the protein import apparatus of plant and non-plant mitochondria. An in silico analysis was carried out, comparing the mitochondrial protein import apparatus from 24 species representing various lineages from Saccharomyces cerevisiae (yeast) and algae to Homo sapiens (human) and higher plants, including Arabidopsis thaliana (Arabidopsis), Oryza sativa (rice) and other more recently sequenced plant species. Each of these species was extensively searched and manually assembled for analysis in the MPIC DB. The database presents an interactive diagram in a user-friendly manner, allowing users to select their import component of interest.
An inventory of genes encoding mitochondrial-localized proteins and their expression across 14 mouse tissues. Using the same strategy we have now reconstructed this inventory separately for human and for mouse based on (i) improved gene transcript models, (ii) updated literature curation, including results from proteomic analyses of mitochondrial sub-compartments, (iii) improved homology mapping and (iv) updated versions of all seven original data sets. The updated human MitoCarta2.0 consists of 1158 human genes, including 918 genes in the original inventory as well as 240 additional genes. The updated mouse MitoCarta2.0 consists of 1158 genes, including 967 genes in the original inventory plus 191 additional genes. The improved MitoCarta2.0 inventory provides a molecular framework for system-level analysis of mammalian mitochondria.
Provides a comprehensive knowledgebase for mitochondrial proteome, interactome and human diseases. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases.
A large-scale relational database that is automatically updated to keep pace with advances in mitochondrial proteomics and is curated to assure that the designation of proteins as mitochondrial reflects gene ontology (GO) annotations supported by high-quality evidence codes. A set of postulates is proposed to help define which proteins are authentic components of mitochondria. A web interface is provided to permit members of the mitochondrial research community to suggest modifications in protein annotations or mitochondrial status.
Allows the complete Ensembl gene database to be queried using phylogenetic patterns. PhyloPat offers the possibility of querying with binary phylogenetic patterns or regular expressions, or through a phylogenetic tree of the 39 included species. Users can also input a list of Ensembl, EMBL, EntrezGene or HGNC IDs to check which phylogenetic lineage any gene belongs to.
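To illustrate what querying binary phylogenetic patterns with regular expressions can look like, here is a minimal Python sketch. It assumes each gene is represented by a string of 0s and 1s (absent or present) over a fixed species order. The species list, gene IDs, patterns, and the `query_patterns` helper are hypothetical and are not taken from PhyloPat itself.

```python
import re

# Hypothetical species order and presence/absence patterns, for illustration only.
SPECIES = ["human", "mouse", "chicken", "zebrafish", "fly"]
PATTERNS = {
    "geneA": "11111",
    "geneB": "11000",
    "geneC": "00011",
    "geneD": "11100",
}

def query_patterns(patterns, regex):
    """Return sorted gene IDs whose whole pattern string matches the regex."""
    compiled = re.compile(regex)
    return sorted(gene for gene, bits in patterns.items() if compiled.fullmatch(bits))

# Present in human and mouse, absent in fly, any state in between.
hits = query_patterns(PATTERNS, r"11..0")
print(hits)
```

Using `fullmatch` rather than `search` keeps each position of the expression aligned with one species in `SPECIES`.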
A resource platform providing all basic features of a sequence database, with the addition of unique analysis tools that could be valuable for the Vibrio research community. VibrioBase currently houses a total of 252 Vibrio genomes in a user-friendly platform that enables analysis of these genomic data, particularly in the field of comparative genomics. Besides general data browsing features, VibrioBase offers analysis tools such as BLAST interfaces and the JBrowse genome browser.
SpPress / Drosophila Spermatogenesis Expression Database
A public database containing genome-wide expression analysis of wild-type males using three cell populations isolated from mitotic, meiotic and post-meiotic phases of spermatogenesis in D. melanogaster.
A database providing candidate genes for reproductive research in pig by mining and processing the existing biological literature on human and pig. Based on text-mining and comparative genomics, ReCGiP presents diverse information on reproduction-relevant genes in human and pig. The genes were sorted by their degree of relevance to reproduction topics and visualized in a gene co-occurrence network in which two genes are connected if they were co-cited in a PubMed abstract.
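The co-citation rule described above can be sketched in a few lines of Python: every pair of genes mentioned in the same abstract gets an edge, weighted by the number of abstracts they share. The abstract IDs and gene lists below are invented for illustration, and `cooccurrence_edges` is a hypothetical helper, not ReCGiP's actual pipeline.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(abstract_genes):
    """Count, for each unordered gene pair, the abstracts that mention both genes."""
    edges = Counter()
    for genes in abstract_genes.values():
        # Deduplicate and sort so each pair is counted once per abstract
        # and always stored in the same order.
        for a, b in combinations(sorted(set(genes)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical abstracts and gene mentions, for illustration only.
ABSTRACTS = {
    "abs1": ["FSHB", "ESR1", "LHB"],
    "abs2": ["ESR1", "FSHB"],
    "abs3": ["PRLR"],
}

edges = cooccurrence_edges(ABSTRACTS)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

A gene that is never co-cited, such as `PRLR` here, contributes no edges and would appear as an isolated node.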
OKdb / Ovarian Kaleidoscope Database
Provides information regarding the biological function, expression pattern and regulation of genes expressed in the ovary. OKdb also contains information on gene sequences, chromosomal localization, human and murine mutation phenotypes and biomedical publication links.
A publicly available web-based SAGE database on male gonad development that covers six male mouse embryonic gonad stages, including E10.5, E11.5, E12.5, E13.5, E15.5 and E17.5. The sequence coverage of each SAGE library is beyond 150K, 'which is the most extensive sequence-based male gonadal transcriptome to date'. An interactive web interface with customizable parameters is provided for analyzing male gonad transcriptome information.
Provides a comprehensive platform to gather detailed information of experimentally verified and Greed AUC Stepwise (GAS)-predicted genes in spermatogenesis. SpermatogenesisOnline integrates the detailed information for 1666 genes that have been reported to be involved in spermatogenesis and 762 genes predicted by our GAS model (GAS probability >0.5) to participate in spermatogenesis. SpermatogenesisOnline 1.0 will help researchers to obtain a comprehensive understanding of complex biological mechanisms of spermatogenesis.
A resource that maps small molecule bioactivities to protein domains from the Pfam-A collection of protein families. Small molecule bioactivities mapped to protein domains add important precision to approaches that use protein sequence searches and alignments to assist applications in computational drug discovery and systems and chemical biology.
An open database which allows participants of Direct-To-Consumer genetic testing to publish their genetic data at no cost along with phenotypic information. Through this crowdsourced effort of collecting genetic and phenotypic information, openSNP has become a resource for a wide area of studies, including Genome-Wide Association Studies.
A wiki resource of the functional consequences of human genetic variation as published in peer-reviewed studies. Online since 2006 and freely available for personal use, SNPedia has focused on the medical, phenotypic and genealogical associations of single nucleotide polymorphisms. Entries are formatted to allow associations to be assigned to single genotypes as well as sets of genotypes (genosets).
PGP / Personal Genome Project
A charitable organization working to generate, aggregate and interpret human biological and trait data on an unprecedented scale. Open data is a critical component of the scientific method, but genomes are both identifiable and predictive. As a result, many studies choose to withhold data from participants and restrict access to researchers. The PGP's public data is a common ground to collaborate and improve our understanding of genomes.
Compiles data on experimentally validated, naturally occurring transcription factor binding sites (TFBS) across the Bacteria domain, placing a strong emphasis on the transparency of the curation process, the quality and availability of the stored data and fully customizable access to its records. CollecTF integrates multiple sources of data automatically and openly, allowing users to dynamically redefine binding motifs and their experimental support base.
A metazoan transcription factor and maternal factor resource specially designed for developmental biology studies. Using this web interface, users can browse, search, and download detailed information on species of interest, genes, transcription factor families, or developmental ontology terms.
A database and resource of protein families in Arthropod genomes. ProtoBug platform presents the relatedness of complete proteomes from 17 insects as well as a proteome of the crustacean, Daphnia pulex. The represented proteomes from insects include louse, bee, beetle, ants, flies and mosquitoes.
CPD / Cellular Phenotype Database
A repository for data derived from high-throughput systems microscopy studies. The aims of this resource are: (i) to provide easy access to cellular phenotype and molecular localization data for the broader research community; (ii) to facilitate integration of independent phenotypic studies by means of data aggregation techniques, including use of an ontology; and (iii) to facilitate development of analytical methods in this field.
Provides centralized local storage and access to completed archaeal and bacterial genomes. MicrobeDB creates a simple to use, easy to maintain, centralized local resource for various large-scale comparative genomic analyses and a back-end for future microbial application design.
A web-based database system for storing and retrieving folliculogenesis-related experimental data. It provides detailed information for 580 genes/proteins (from 23 model organisms, including Homo sapiens, Mus musculus, Rattus norvegicus, Mesocricetus auratus, Bos taurus, Drosophila and Xenopus laevis) that have been reported to be involved in folliculogenesis, POF (premature ovarian failure) and PCOS (polycystic ovary syndrome).
A comprehensive resource of neuropeptides, which holds 5949 non-redundant neuropeptide entries originating from 493 organisms belonging to 65 neuropeptide families. In NeuroPep, the number of neuropeptides in invertebrates and vertebrates is 3455 and 2406, respectively. It is currently the most complete neuropeptide database. In addition, user-friendly web tools like browsing, sequence alignment and mapping are also integrated into the NeuroPep database.
An endogenous peptide database to aid mass spectrometric identifications. In the identification process the experimental peptide masses are compared with the peptide masses stored in SwePep both with and without possible post-translational modifications. This intermediate identification step is fast and singles out peptides that are potential endogenous peptides and can later be confirmed with tandem mass spectrometry data.
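The mass-comparison step described above can be sketched as follows: an observed mass is compared with each stored peptide mass, both unmodified and with a candidate modification shift added. This is a minimal Python illustration under stated assumptions. The stored peptides, the 0.01 Da tolerance, and the `match_mass` helper are hypothetical and do not reflect SwePep's actual implementation. The two shifts are the standard monoisotopic values for phosphorylation and acetylation.

```python
# Monoisotopic mass shifts (Da) for two common modifications.
PTM_SHIFTS = {"phospho": 79.96633, "acetyl": 42.01057}

# Hypothetical stored peptide masses (Da), for illustration only.
STORED = {"pepA": 1000.5000, "pepB": 1500.7500}

def match_mass(observed, stored, shifts, tol=0.01):
    """Return (peptide, modification) pairs within tol Da of the observed mass.

    A modification of None means the unmodified peptide matched.
    """
    hits = []
    for name, mass in stored.items():
        candidates = [(None, mass)]
        candidates += [(ptm, mass + delta) for ptm, delta in shifts.items()]
        for ptm, expected in candidates:
            if abs(observed - expected) <= tol:
                hits.append((name, ptm))
    return hits

print(match_mass(1000.5003, STORED, PTM_SHIFTS))
print(match_mass(1580.7163, STORED, PTM_SHIFTS))
```

Hits from a filter like this are only candidates; as the description notes, they would still need confirmation with tandem mass spectrometry data.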
Catalogs information on the sequence, structure, active site and genomic neighborhood of experimentally characterized enzymes involved in five novel PTMs, namely AMPylation, Eliminylation, Sulfation, Hydroxylation and Deamidation. The novPTMenzy database is a unique resource that can aid in discovery of unusual PTM catalyzing enzymes in newly sequenced genomes.
Consists of a library of biological parts from the database of plasmid features. GenoLIB was designed using the synthetic biology open language (SBOL), an emerging standard developed to organize libraries of genetic parts to facilitate synthetic biology workflows. This database supports unambiguous annotation of plasmid sequences. It is indexed with a combination of automatic and manual curation methods for the determination of feature sequences, borders and functional descriptions.
A multi-species database to disentangle the SNP chip jungle. Features of SNPchiMp include, but are not limited to, the following functions: 1) referencing the SNP mapping information to the latest genome assembly, 2) extraction of information contained in dbSNP for SNPs present in all commercially available bovine chips, and 3) identification of SNPs in common between two or more bovine chips (e.g. for SNP imputation from lower to higher density). This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need of additional bioinformatics tools or pipelines.
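Finding the SNPs shared by two or more chips, as in function 3 above, amounts to a set intersection over the chips' SNP identifiers. The Python sketch below assumes each chip is available as a plain list of SNP IDs. The chip names, the IDs, and the `shared_snps` helper are hypothetical and do not come from SNPchiMp.

```python
def shared_snps(chips, names):
    """Return the sorted SNP IDs present on every chip listed in names.

    Expects at least one chip name.
    """
    id_sets = [set(chips[name]) for name in names]
    return sorted(set.intersection(*id_sets))

# Hypothetical chip contents, for illustration only.
CHIPS = {
    "chip_low": ["snp_001", "snp_002", "snp_003"],
    "chip_mid": ["snp_002", "snp_003", "snp_004"],
    "chip_high": ["snp_002", "snp_003", "snp_004", "snp_005"],
}

print(shared_snps(CHIPS, ["chip_low", "chip_high"]))
print(shared_snps(CHIPS, ["chip_low", "chip_mid", "chip_high"]))
```

For imputation from lower to higher density, the shared set would typically serve as the anchor markers common to both panels.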
Wheat microRNA Portal
Provides a broad repertoire of hexaploid wheat miRNAs associated with abiotic stress responses, tolerance and development. These valuable resources of expressed wheat miRNAs will help in elucidating the regulatory mechanisms involved in freezing and aluminum responses and tolerance mechanisms as well as for development and flowering.
LigASite / LIGand Attachment SITE
A gold-standard dataset of binding sites in 550 proteins of known structures. LigASite consists exclusively of biologically relevant binding sites in proteins for which at least one apo- and one holo-structure are available. The website interface allows users to search the dataset by PDB identifiers, ligand identifiers, protein names or sequence, and to look for structural matches as defined by the CATH homologous superfamilies. The datasets can be downloaded from the website as Schema-validated XML files or comma-separated flat files.
HTT-DB / Horizontally Transferred Transposable Elements Database
Allows easy access to all known cases of horizontal transfer of transposable elements (HTT) reported along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using TEs and/or host species classification and export them in several formats.
DBAASP / Database of Antimicrobial Activity and Structure of Peptides
A manually curated database for those peptides for which antimicrobial activity against particular targets has been evaluated experimentally. The database is a depository of complete information on: the chemical structure of peptides; target species; target object of cell; peptide antimicrobial/haemolytic/cytotoxic activities; and experimental conditions at which activities were estimated. The DBAASP search page allows the user to search peptides according to their structural characteristics, complexity type (monomer, dimer and two-peptide), source, synthesis type (ribosomal, nonribosomal and synthetic) and target species. The database prediction algorithm provides a tool for rational design of new antimicrobial peptides.
The Functional lncRNA Database
A repository of mammalian long non-protein-coding transcripts that have been experimentally shown to be both non-coding and functional. To search for a specific lncRNA, enter its name and choose the appropriate species. Alternatively, you can browse all the lncRNAs at once. Currently the database contains lncRNAs from Human, Mouse and Rat.
A genome-wide transcription atlas of miRNAs in grapevine, analyzing the spatio-temporal distribution of known and newly discovered miRNAs, in the widest range of grapevine samples considered thus far. miRVine aims at becoming the reference for the future development of targeted functional studies, a first indispensable step towards the definition of miRNA involvement in grapevine development.
An integrative database of Arabidopsis thaliana miRNAs and their target genes, expression profiles, function annotations and pathways. A friendly web interface is provided to browse and analyze the data. We believe that miRFANs is a useful platform for exploring the regulatory functions of Arabidopsis thaliana miRNAs and can provide considerable value for many researchers.
IUPHAR / IUPHAR/BPS Guide to PHARMACOLOGY
Provides expert-curated molecular interactions between successful and potential drugs and their targets in the human genome. The information in the database is presented at two levels: the initial view or landing pages for each target family provide expert-curated overviews of the key properties and selective ligands and tool compounds available. For selected targets more detailed introductory chapters for each family are available along with curated information on the pharmacological, physiological, structural, genetic and pathophysiological properties of each target. The database is enhanced with hyperlinks to additional information in other databases including Ensembl, UniProt, PubChem, ChEMBL and DrugBank, as well as curated chemical information and literature citations in PubMed.
PDSP Ki database
A unique resource in the public domain which provides information on the abilities of drugs to interact with an expanding number of molecular targets. The Ki database serves as a data warehouse for published and internally-derived Ki, or affinity, values for a large number of drugs and drug candidates at an expanding number of G-protein coupled receptors, ion channels, transporters and enzymes.