EBI / EMBL-EBI - The European Bioinformatics Institute
Provides access to several biological data resources and bioinformatics services. EBI is a platform that covers the entire range of biological sciences, from raw DNA sequences to curated proteins, chemicals, structures, systems, pathways, ontologies and literature. Databases, tools and web services are provided for sharing data, performing queries and analyzing results. Users can also deposit their data through a data submission page. All resources are freely available without restriction, with few exceptions.
PRIDE / PRoteomics IDEntifications database
A centralized, standards-compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. PRIDE is a core member of the ProteomeXchange (PX) consortium, which provides a single point for submitting mass spectrometry-based proteomics data to public-domain repositories. Datasets are submitted to PRIDE via ProteomeXchange and are handled by expert biocurators.
neXtProt
Offers seamless integration of and navigation through protein-related data. neXtProt contains proteomics data for over 85% of human proteins. It also includes over 8000 phenotypic observations for over 4000 variations in a number of genes involved in hereditary cancers and channelopathies. All of the data are available via a user interface and an FTP site. API access and a SPARQL endpoint are also provided for more technical applications.
The Medicago Proteome Atlas
An atlas of protein expression of Medicago truncatula in association with Sinorhizobium meliloti. The Medicago Proteome Atlas provides evidence for more than 23013 protein groups (19679 from the eukaryotic host plant, M. truncatula; 3334 from S. meliloti) along with 20120 phosphorylation sites and 734 lysine acetylation sites. Further mining of this proteomic resource may enable engineering of crops and their microbial partners to increase agricultural productivity and sustainability.
Phosphomouse
A web-based resource to aid analysis of existing biological data and inspire future biological investigations. Phosphomouse presents experimental data about tissue-specific protein abundance and phosphorylation, including 12000 proteins and 36000 phosphorylation sites from 9 mouse tissues. These data revealed distinctive and complementary protein and phosphoprotein expression profiles that support each tissue’s unique physiology. Moreover, combining protein abundance measurements with phosphorylation observations made it possible to distinguish tissue-specific phosphorylation of ubiquitous proteins from phosphorylation of tissue-specific proteins.
ProteomeTools
Translates the human proteome into molecular and digital tools for drug discovery, personalized medicine and life science research. The ProteomeTools project is a joint effort of the Technical University of Munich (TUM), JPT Peptide Technologies, SAP SE and Thermo Fisher Scientific. It aims to use synthetic reference peptides to create reference mass spectra, and it covers all human proteins, their important post-translational modifications and other interesting biology such as disease-associated mutations, HLA neoantigens, small open reading frames (ORFs) or translated lincRNAs.
Plant-PrAS / Plant-Protein Annotation Suite
A database of physicochemical and structural properties, and novel functional regions, in plant proteomes. The plant species covered by Plant-PrAS are Arabidopsis, soybean, poplar, rice, moss and algae. The database provides calculated and predicted physicochemical parameters (Length, Charged, Nonpolar, Acidic, Basic, Low complexity, GRAVY and pI), secondary structural properties (Solvent accessibility, β sheet, Intrinsically disordered regions, Signal peptide cleavage sites, Transmembrane helices, S-S bond and Domain linker), functional annotation (Pfam, Uniprot-plant, Uniprot-sprot, EC number, PDB and KOG), functional regions (PASS and Rosetta stone proteins) and other features (Ubiquitylation site, N-glycosylation site, O-glycosylation site, Subcellular location and Protein solubility).
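As an illustration of one of the physicochemical parameters listed above, GRAVY (grand average of hydropathy) is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below follows that general definition; it is not taken from Plant-PrAS, and the function name `gravy` is a hypothetical choice for this example.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean hydropathy of a protein sequence (hypothetical helper).

    Raises ValueError for an empty sequence and KeyError for any
    character outside the 20 standard amino acids.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)
```

A positive value suggests a predominantly hydrophobic protein and a negative value a predominantly hydrophilic one; how Plant-PrAS handles non-standard residues is not stated in the description, so this sketch simply rejects them.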
Membranome / Membrane Proteome
Gathers information about bitopic proteins from six complete genomes (Homo sapiens, Arabidopsis thaliana, Dictyostelium discoideum, Saccharomyces cerevisiae, Escherichia coli and Methanocaldococcus jannaschii), corresponding to each kingdom. Membranome is a database which compiles 3D models of transmembrane (TM) domains, organized following a customized classification, for over 6000 bitopic proteins accompanied by their related structural and functional information.
HipSci / human induced pluripotent Stem cells initiative
Generates human induced pluripotent stem cells (iPSCs) from hundreds of healthy individuals as well as patients diagnosed with selected diseases. HipSci is a powerful resource to evaluate and quantify cell responses to chemical, physical and biological stimuli using novel assays and artificial microenvironments. Within this framework, phenotypic data are being collated with genomics, epigenomics and proteomics data to discover the impact of their variation on the cellular phenotype.
RabDB
Displays Rab annotation for all genomes available as a part of Superfamily 1.75. RabDB is a database created to explore the universe of the Rab family of small GTPases, key regulators of the eukaryotic endomembrane system, as predicted by the Rabifier classification pipeline in the sequenced eukaryotic genomes. It is designed to enable the cell biology community to keep pace with the increasing number of fully sequenced genomes and to change the scale at which comparative analysis is performed in cell biology.
Medicago PhosphoProtein Database
A repository built to house phosphoprotein, phosphopeptide, and phosphosite data specific to Medicago. Medicago PhosphoProtein Database holds 3457 unique phosphopeptides that contain 3404 non-redundant sites of phosphorylation on 829 proteins. Through the web-based interface, users are allowed to browse identified proteins or search for proteins of interest. Furthermore, it allows users to conduct BLAST searches of the database using both peptide sequences and phosphorylation motifs as queries. The data contained within the database are available for download to be investigated at the user’s discretion.
Dynamic Proteomics
Consists of a compendium of endogenously tagged human proteins and their time-lapse microscopy movies. Dynamic Proteomics provides the annotation of the tagged proteins, alignment of protein dynamics for proteins of interest, sequence search and comparison of up to 50 input sequences to all the complementary DNAs (cDNAs) in the library. It offers a search for gene names, DNA sequences, protein description, image or published localization and exon-tag insertion point.
Colorectal cancer atlas
An integrated web-based resource that catalogues the genomic and proteomic annotations identified in colorectal cancer (CRC) tissues and cell lines. The data catalogued to date include sequence variations as well as quantitative and non-quantitative protein expression data. The database enables the analysis of these data in the context of signaling pathways, protein–protein interactions, Gene Ontology terms, protein domains and post-translational modifications. Currently, Colorectal Cancer Atlas contains data for >13 711 CRC tissues, >165 CRC cell lines, 62 251 protein identifications, >8.3 million MS/MS spectra, >18 410 genes with sequence variations (404 278 entries) and 351 pathways with sequence variants. Overall, Colorectal Cancer Atlas has been designed to serve as a central resource to facilitate research in CRC.
jMorp / Japanese Multi Omics Reference Panel
Contains metabolome and proteome data in plasma obtained from 5,093 healthy volunteers in a Japanese population. jMorp minimizes biases through the use of a single protocol in a single institute, the Tohoku Medical Megabank Cohort Study. It offers a graphical viewer that displays correlations between metabolites. This database is built using large-scale cohort data for healthy volunteers with various health records and genome data, and provides significant genome-wide association study (GWAS) results.
ROUGE / Rodent Unidentified Gene-Encoded Large Proteins
Provides access to the results of computer-assisted sequence analysis of mouse homologues of KIAA cDNA (mKIAA cDNA) that were isolated. ROUGE is a subsidiary database of the Human Unidentified Gene-Encoded (HUGE) protein database that contains about 1000 mKIAA cDNA entries. The two databases have the same basic organization, with a gene/protein characteristic table for each cDNA entry that summarizes the results from computer-assisted analysis of the cDNA sequence and the deduced amino acid sequence.
CharProtDB / Characterized Protein Database
Holds data on biochemically characterized proteins. CharProtDB provides a source of transitive assignments of function that can be used in annotation pipelines. Each annotation contains (1) gene name, (2) symbol and various controlled vocabulary terms, (3) Enzyme Commission number and (4) TransportDB accession. A BLAST sequence similarity search is provided through the CharProtDB web interface, which accepts user input and searches the user-submitted query sequence against the entire CharProtDB data set.
PlaMoM / Plant Mobile Macromolecules
Provides convenient and interactive search tools that allow users to retrieve, analyze and predict mobile RNAs/proteins. Each entry in the PlaMoM database contains detailed information such as nucleotide/amino acid sequences, ortholog partners, related experiments, gene functions and literature. The resource provides a built-in tool to identify potential RNA mobility signals such as tRNA-like structures. The current version of PlaMoM compiles a total of 17 991 mobile macromolecules from 14 plant species/ecotypes from published data and literature.
IPI / International Protein Index
Offers complete non-redundant data sets representing the human, mouse and rat proteomes, built from the Swiss-Prot, TrEMBL, Ensembl and RefSeq databases. The IPI human proteome set was used in the primary analysis of the human genome sequence. IPI provides a species-specific, complete and non-redundant dataset particularly suited to supporting protein identification in proteomics experiments. Its sequence- and identifier-based construction eliminates the need for manual filtering of redundant results in protein identification, while maintaining cross-references to the source data.
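The idea of a non-redundant set that still keeps cross-references to its source databases can be sketched as follows. This is a deliberately simplified illustration that merges entries only when their sequences are identical; the actual IPI construction rules are not given here and also rely on identifier relationships. The function name, the `NR` identifier format and the accessions in the usage lines are hypothetical.

```python
from collections import defaultdict


def build_nonredundant(entries):
    """Collapse identical sequences into one record with cross-references.

    `entries` is an iterable of (source_db, accession, sequence) tuples.
    Returns a list of dicts, one per distinct sequence, in first-seen order.
    """
    clusters = defaultdict(list)
    for source_db, accession, sequence in entries:
        # Exact-identity grouping: an assumption made for this sketch only.
        clusters[sequence.strip().upper()].append((source_db, accession))

    records = []
    for index, (sequence, xrefs) in enumerate(clusters.items(), start=1):
        records.append({
            "id": f"NR{index:05d}",   # hypothetical identifier format
            "sequence": sequence,
            "xrefs": sorted(xrefs),   # links back to every source entry
        })
    return records


# Hypothetical accessions, for illustration only.
example = build_nonredundant([
    ("Swiss-Prot", "ACC_A", "MKTAYIAKQR"),
    ("Ensembl", "ACC_B", "mktayiakqr"),
    ("RefSeq", "ACC_C", "MSSHEGGKKK"),
])
```

Here the first two entries collapse into a single record whose `xrefs` list names both source databases, so a search hit reports one protein while the original entries remain traceable.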
AMPAD Knowledge Portal / Accelerating Medicines Partnership-Alzheimer’s Disease Knowledge Portal
Contains data related to Alzheimer's disease and consists of genomics, proteomics, metabolomics and other data types from a variety of human studies, animal and cellular model systems. AMPAD Knowledge Portal hosts data generated through the NIA programs Accelerating Medicines Partnership-Alzheimer’s Disease - Target Discovery and Preclinical Validation Project (AMPAD) and Molecular Mechanisms of the Vascular Etiology of Alzheimer’s Disease (M²OVEAD) Consortium.
HPSF / Human Proteome Structure and Function
Provides a repository of structure and function annotations on the 'missing proteins' of the human proteome. HPSF hosts missing proteins, that is, proteins that have not been validated at the protein level, which are first extracted from the neXtProt database. The structure folding simulations are then generated by I-TASSER with all homologous templates excluded from the threading libraries. Finally, the functional insights for each protein, including enzyme commission, gene ontology, ligand-binding and subcellular location, are provided by the structure-based function annotation tool, COFACTOR. One goal of the HPSF database is to construct a comprehensive repository of annotations on the folding and function of all missing proteins in the human proteome using cutting-edge bioinformatics methods. This should help to recognize possible protein-coding genes among the 'missing proteins' and to guide further protein characterization experiments.
ProteinCarta / Protein Catalog indexed by Recorded Terminal Tags
Provides terminal tags of proteomes. ProteinCarta is a database storing over 50 residues from both termini of all amino acid sequences in the UniProt reference proteome data of the nine organisms analyzed. It requires only the amino acid sequence and the organism name as the input information. The search mode is useful, especially for identifying isoforms where the alternative sequences are located in the terminal regions.
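A terminal-tag lookup of the kind described above can be sketched as follows. This is an illustration only, not ProteinCarta's implementation: the function names, the default tag length of 50 residues and the protein identifiers in the usage lines are assumptions made for the example.

```python
def terminal_tags(sequence, tag_length=50):
    """Return the (N-terminal, C-terminal) tags of a protein sequence."""
    if tag_length <= 0:
        raise ValueError("tag_length must be positive")
    seq = sequence.strip().upper()
    # Sequences shorter than tag_length yield the whole sequence for both tags.
    return seq[:tag_length], seq[-tag_length:]


def find_by_terminal_tag(query, proteome, tag_length=50):
    """Return sorted IDs of proteins whose terminal tags contain `query`.

    `proteome` maps a protein identifier to its amino acid sequence.
    """
    needle = query.strip().upper()
    hits = []
    for protein_id, sequence in proteome.items():
        n_tag, c_tag = terminal_tags(sequence, tag_length)
        if needle in n_tag or needle in c_tag:
            hits.append(protein_id)
    return sorted(hits)


# Hypothetical isoforms that share an N-terminus but differ at the C-terminus.
isoforms = {"ISO_1": "MKTAYIAKQR", "ISO_2": "MKTAYIGGGG"}
shared = find_by_terminal_tag("MKT", isoforms, tag_length=3)
unique = find_by_terminal_tag("KQR", isoforms, tag_length=3)
```

In this toy case the N-terminal query matches both isoforms while the C-terminal query matches only one, which is the situation where searching terminal regions helps tell isoforms apart.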
