Contains chemical structures and biological properties of molecules including small molecules and siRNA reagents. PubChem consists of three interconnected databases: Substance, BioAssay and Compound. The database also provides a suite of web-based bioactivity analysis tools that let users download and search individual test results, compare biological activity data from multiple screenings, examine target selectivity or explore structure–activity relationships for compounds of interest.
A well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services, which exposes significantly more data from the underlying database and introduces new functionality.
Provides services and resources to the academic and private-sector research communities worldwide to facilitate the discovery and development of new cancer therapeutic agents. Services available directly from DTP: (i) NCI-60 human cancer cell line screen, (ii) Molecular Target Program, (iii) Materials for research: tumor cells, chemicals, natural products and biological samples. DTP's Anti-cancer Agent Mechanism Database is a set of 122 compounds with anti-cancer activity and a reasonably well-known mechanism of action. The list of compounds was assembled as a training set for neural network analysis of drug mechanism of action.
A database and ontology containing information about chemical entities of biological interest. ChEBI currently includes over 46 000 entries, each of which is classified within the ontology and assigned multiple annotations including (where relevant) a chemical structure, database cross-references, synonyms and literature citations. Programmatic access has been improved by the introduction of a library, libChEBI, in Java, Python and MATLAB. Two new tools have also been added: an analysis tool, BiNChE, and a query tool for the ontology, OntoQuery.
Gathers detailed drug, drug-target, drug-action and drug-interaction information. DrugBank is a web resource that contains information about FDA-approved drugs as well as experimental drugs going through the FDA approval process. The database also includes pharmaco-omic data covering the influence of drugs on metabolite levels, gene expression levels and protein expression levels, as well as data on investigational drug clinical trials and drug repurposing trials, and thousands of up-to-date images of approved drugs.
Links genetic, lineage, and other cellular features of cancer cell lines to small-molecule sensitivity with the goal of accelerating discovery of patient-matched cancer therapeutics. CTRP hosts an 'Informer Set' of 481 small-molecule probes and drugs that selectively target distinct nodes in cell circuitry and that collectively modulate a broad array of cell processes. The CTRP is a living resource for the biomedical research community that can be mined to develop insights into small-molecule mechanisms of action and novel therapeutic hypotheses, and to support future discovery of drugs matched to patients based on predictive biomarkers.
A publicly available compilation of chemical-protein-disease annotation resources that enables the study of systems pharmacology for a small molecule across multiple layers of complexity from molecular to clinical levels. In this third version, ChemProt has been updated to more than 1.7 million compounds with 7.8 million bioactivity measurements for 19,504 proteins. Within ChemProt, it is possible to navigate the chemogenomics space and to link chemically induced target perturbations to diseases and other biological outcomes. Such tools might be of interest for drug discovery, drug safety and also chemical risk assessment. ChemProt 3.0 supports predicting bioactivities on targets and off-targets for new compounds and can help associate those compounds with phenotypes and side effects.
A comprehensive, publicly accessible collection of approved and investigational drugs for high-throughput screening. NPC provides a valuable resource for both validating new models of disease and better understanding the molecular basis of disease pathology and intervention. It has already generated several useful probes for studying a diverse cross section of biology, including novel targets and pathways. NCGC provides access to its set of approved drugs and bioactives through the Therapeutics for Rare and Neglected Diseases (TRND) program and as part of the compound collection for the Tox21 initiative, a collaborative effort for toxicity screening among several government agencies.
Contains a comprehensive collection of approved drugs in Japan, USA and Europe unified based on chemical structures and/or chemical components. KEGG DRUG is a database that contains information about molecular networks, such as targets, metabolizing enzymes and drug–drug interactions. All drugs marketed in Japan, both prescription and over-the-counter (OTC), are represented in the database, including crude drugs and Traditional Chinese Medicine (TCM) drugs.
Allows users to explore the medicinal value of diet and elucidate the synergistic effects of natural bioactive compounds on disease phenotypes. NutriChem is a database that contains food-compound pairs between some plant-based foods and phytochemicals, as well as the food-disease associations between some plant-based foods and diseases. It was generated by text mining of 21 million MEDLINE abstracts. Confidence scores based on the availability of support from literature or patient records may be incorporated in a future update of NutriChem.
Summarizes the volatilomes of bacteria and fungi. mVOC was constructed by an automatic text mining procedure and then manually curated by experts. It provides a web interface that allows users to query for mass spectra peaks or for the emitters and receivers of specific microbial volatile organic compounds (mVOCs). This database is searchable by compound name, PubChem ID, chemical formula or properties.
A public, Web-based informatics environment. ChemBank stores and makes freely available data derived from small molecules and small-molecule screens and has resources for relating and studying these data. Currently, ChemBank stores information on hundreds of thousands of small molecules and hundreds of biomedically relevant assays performed at the Broad Institute screening center. Web-based analysis tools are available within ChemBank to study the relationships between small molecules, cell measurements, and cell states.
Provides several online databases and tools relevant to the biology of ageing. HAGR is a web portal that hosts high-quality curated gene-centric information relevant to human ageing. All data sets are linked to each other, and each gene entry contains direct links to all other relevant entries in HAGR’s datasets. The database aims to organize the increasing amount of this information and make it accessible to the research community.
An interactive, visual database containing more than 618 small molecule pathways found in humans. More than 70% of these pathways (>433) are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in metabolomics, transcriptomics, proteomics and systems biology. It is able to do so, in part, by providing exquisitely detailed, fully searchable, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways.
Offers information on pharmaceutical ingredients. SuperDRUG provides annotated drugs with regulatory details, chemical structures (2D and 3D), dosage, biological targets, physicochemical properties, external identifiers, side effects and pharmacokinetic data. It enables a comparison of 2D and 3D similarity between drugs of different indication classes, elucidating structural reasons for adverse effects that might be neglected by exclusive consideration of their 2D resemblance.
Provides access to about 5 million commercially available small molecules. ChemDB is a chemical database that contains data coming from the electronic catalogs of over 150 chemical vendors as well as a limited number of publicly available datasets. The database includes search tools in support of systems biology and drug discovery projects. Data can be searched by structural similarity, names and annotations, and users can also search virtual chemical space.
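Searching by structural similarity, as offered by ChemDB and several other entries in this list, is commonly done by comparing molecular fingerprints. The sketch below illustrates the general idea with the Tanimoto coefficient on toy fingerprints. It is an assumption for illustration only, not ChemDB's actual implementation: the function names, the fingerprints (sets of "on" bit positions) and the 0.5 threshold are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold=0.5):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy, made-up fingerprints (hypothetical molecules, not real database records).
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {3, 4, 5, 6},
    "mol_c": {7, 8},
}
query = {1, 2, 3, 5}

hits = similarity_search(query, library)
print(hits)
```

With these toy inputs only `mol_a` shares enough bits with the query (3 of 5 distinct bits) to pass the threshold. Real systems compute fingerprints from chemical structures with a cheminformatics toolkit and may use other similarity measures.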
Provides unique chemical structures that come from the Substance database and more than 60 million Compound IDs (CIDs). PubChem Compound aggregates substance records from different data sources about the same molecule through a common ‘compound’ record. It provides features that allow users to see how their structures would be handled during the standardization process when they submit a structure.
A chemical structure database designed to make patent metadata accessible. SCRIPDB provides the full original patent text, reactions and relationships described within any individual patent, in addition to the molecular files common to structural databases. SCRIPDB may be searched by exact chemical structure, substructure or molecular similarity and the results may be restricted to patents describing synthetic routes.
A public database and suite of tools developed to provide access to bioassay data produced by the NIH Molecular Libraries Program (MLP). Data from 631 MLP projects were migrated to a new structured vocabulary designed to capture bioassay data in a formalized manner, with particular emphasis placed on the description of assay protocols. New data can be submitted to BARD with a user-friendly set of tools that assist in the creation of appropriately formatted datasets and assay definitions.
A publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature daily by an automated text and image-mining pipeline. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus. Given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. The SureChEMBL database contains more than 17 million distinct compounds extracted from more than 14 million patent documents, spanning a time range from 1970 to present.
Provides biomarkers of exposure to environmental risk factors for diseases. Exposome-Explorer contains detailed information on the nature of biomarkers, populations and subjects in which biomarkers have been measured, samples analysed, methods used for biomarker analyses, concentrations in biospecimens, correlations with external exposure measurements, and biological reproducibility over time. This information can be used by epidemiologists and clinicians to compare the performance and field of application of various biomarkers and to identify the specific biomarkers or panels of biomarkers that are most useful for biomonitoring or disease etiology studies.
A database for pain stimulating and pain relieving compounds, which bind or potentially bind to ion channels, such as TRPV1, TRPM8, TRPA1, hERG, TREK1, P2X, ASIC or voltage-gated sodium channels. The database consists of about 8,700 ligands, which are characterized by experimentally measured binding affinities. Additionally, 100,000 putative ligands are included.
Stores depositor-contributed information on chemical structures. PubChem Substance provides more than 157 million Substance IDs (SIDs). It can contain information such as biological functions of glucose, or characteristics of a research-grade sample of glucose. This tool stores different descriptions about the same molecule as separate records that are independent of each other. It allows users to submit descriptions following a standardization process.
SwissSidechain is a structural and molecular mechanics database and curated platform of hundreds of non-natural amino-acid sidechains that can be used to study in silico their insertion into natural peptides or proteins. Structural files (pdb, mol2, SMILES) for 230 sidechains can be found together with PyMOL and UCSF Chimera plugins to insert them into existing peptide or protein structures. Predicted rotamers as well as topologies and parameters to run molecular mechanics analysis are provided. This tool is provided and maintained by the Molecular modeling group at the Swiss Institute of Bioinformatics. This database is free for academic use.
A resource that was specifically designed to capture information about the toxic exposome. The focus of the T3DB is on providing mechanisms of toxicity and target proteins for each toxin. This dual nature of the T3DB, in which toxin and toxin target records are interactively linked in both directions, distinguishes it from existing databases. It is also fully searchable and supports extensive text, sequence, chemical structure, and relational query searches.
Contains information on molecular replacements and their performance in biochemical assays. The SwissBioisostere database is meant to provide researchers in drug discovery projects with ideas for bioisosteric modifications of their current lead molecule, as well as to give interested scientists access to the details on particular molecular replacements.
Dictionary of chemical components (ligands, small molecules and monomers) referred to in PDB entries and maintained by wwPDB. It provides comprehensive search facilities for finding a particular component, or determining components in structure entries. This database currently contains over 23,657 ligands.
A resource for withdrawn and discontinued drugs. WITHDRAWN not only contains information related to drug withdrawals and associated adverse drug reactions but also drug-target interactions and genetic variations of the protein targets. The drug-target interaction information is mapped to biological context by enriching the relevant pathways. The illustrated case study shows that connecting drugs, targets and SNPs may explain the underlying mechanisms of toxicity. The knowledge presented in the database can improve insight into drug-target interactions in a toxicological context and provide the rationale for further off-target profiling and enhanced pharmacogenetics studies in different populations.
Compiles information on natural and artificial sweetening agents. SuperSweet includes sweetening agents’ properties such as 3D structure, origin, sweetness, approval and calories, and provides hypotheses on their binding to the receptor. It contains more than 8000 carbohydrates, proteins, D-amino acids and artificial (synthesized) sweeteners, which were retrieved from the literature and different pre-existing data sources like PubChem and the Protein Data Bank (PDB).
Provides a dataset of volatile compounds collected from a variety of sources. SuperScent contains information about the compounds' chemical properties and commercial availability. The database is composed of more than 2000 compounds, 9000 synonyms and references to over 20 different suppliers. It permits users to retrieve information about scents or to get an overview of the known volatile organic compounds.
Integrates structure, bioactivity, regulatory, pharmacologic actions and indications for active pharmaceutical ingredients approved by FDA and other regulatory agencies. DrugCentral includes content for active ingredients with pharmaceutical formulations, indexing drugs and drug label annotations, complementing similar resources available online. At the molecular level, DrugCentral bridges drug-target interactions with pharmacological action and indications. The integration with FDA drug labels enables text mining applications for drug adverse events and clinical trial information.
Focuses on providing chemical and structural information for small molecules found as part of the structures deposited in the Protein Data Bank. Ligand Expo is an integrated data resource for finding information about small molecules bound to proteins and nucleic acids. Ligand Expo accepts keyword-based queries and also provides a graphical interface for performing chemical substructure searches.
Allows exploration of scaffolds or chemical probes for pharmaceutical innovations and chemical biology studies. The scaffolds in ASDB were derived from public databases including ChEMBL, DrugBank, and TCMSP, with a scaffold-based classification approach. Each scaffold was assigned an InChIKey as its unique identifier, energy-minimized 3D conformations, and other calculated properties. A scaffold is also associated with drugs, natural products, drug targets, and medical indications. The database can be queried through text or structure query tools.
Gathers information about non-redundant protein-ligand complexes. PSMDB is an online repository allowing users to handle structural redundancy at both the protein and ligand levels. It contains highly similar ligands interacting with different proteins and different ligands. It preserves maximal information from structures containing bound ligands, and information contained in this database is regularly updated.
A resource that maps small molecule bioactivities to protein domains from the Pfam-A collection of protein families. Small molecule bioactivities mapped to protein domains add important precision to approaches that use protein sequence searches and alignments to assist applications in computational drug discovery and systems and chemical biology.
Provides force-field parameters of small and drug-like molecules for all major all-atom force fields. Ligandbook aims to enable parameter re-use and simulation reproducibility by (1) facilitating the publication of force field parameters as open data; (2) acting as an archive for parameter sets that are supplied and maintained by the community; (3) making large, richly annotated parameter datasets easily available through human and machine accessible interfaces.
A chemical reference data resource that describes all residue and small molecule components found in Protein Data Bank (PDB) entries. The CCD contains detailed chemical descriptions for standard and modified amino acids/nucleotides, small molecule ligands and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, chemical descriptors, systematic chemical names and idealized coordinates. The content, preparation, validation and distribution of this CCD chemical reference dataset are described.
A free knowledgebase of chemical modulators with documented modulatory activity for epigenome reader domains. ChEpiMod organizes information about chemical modulators and their associated binding-affinity data, as well as available structures of epigenome readers from the Protein Data Bank. The data are gathered from the literature and patents. Entries are supplemented by annotation. The current version of ChEpiMod covers six epigenome reader domain families (Bromodomain, PHD finger, Chromodomain, MBT, PWWP and Tudor). The database can be used to browse existing chemical modulators and bioactivity data, as well as all available structures of readers and their molecular interactions.
Uses a statistical approach to reliably and automatically annotate compounds with concepts defined in Medical Subject Headings (MeSH), the National Library of Medicine's controlled vocabulary for biomedical concepts. These annotations provide links from compounds to biomedical literature and complement existing resources such as PubChem and the Human Metabolome Database.
Gathers cellular reprogramming information. RPdb is an online repository that provides detailed information about experimentally verified genes and microRNAs, including gene descriptions and functional annotations. This database allows users to find genes, microRNAs (miRNAs) and novel reprogramming factors in a broad spectrum of cell reprogramming research.
Provides commercially available compounds for virtual screening. ZINC is composed of 3D molecules which have been assigned biologically relevant protonation states and are annotated with properties such as molecular weight, calculated LogP, and number of rotatable bonds. It is searchable through a query tool which incorporates a molecular drawing interface. The database enables investigators to attempt computational ligand discovery.
Aims to provide high-quality summary data on lifespan-extending drugs and compounds in model organisms. DrugAge contains 1316 entries featuring 418 different compounds from studies across 27 model organisms, including worms, flies, yeast and mice. The data contained in the database focus solely on assays performed in standard conditions. It integrates drug–gene interactions and cross-links to aging-related genes, allowing a deeper examination of lifespan extending drugs.
Provides data concerning polyphenols in foods, a major class of food bioactives. Phenol-Explorer is a resource that includes retention factors for this class of dietary bioactives. This resource allows the filtering of information according to various criteria. The data collated allow the most detailed analysis so far of how different domestic and industrial processes affect the polyphenol contents of foods.
Provides a platform allowing access to information related to chemical substances. CompTox Chemistry Dashboard is a repository applying both manual and algorithmic curation techniques for compiling data from various public datasets, such as the U.S. Environmental Protection Agency (EPA) Substance Registry Service or the PubChem database. It aims to assist users in evaluating the available data about a specific substance or in exploring the data.
A database of triazoles generated using existing alkynes and azides, synthesizable in no more than three synthetic steps from commercially available products. ZINClick is freely available. This database can be downloaded and subsets of it are updated yearly.
Stores more than 900 000 ring systems. GDB4c is a database containing a high percentage of ring systems with stereocenters (more than 800 000). It reflects the fact that a connection between two adjacent rings involves most often zero (aromatic rings, spiro centers) or two stereocenters (other bicyclic systems).
Since its public introduction in 2005, the IUPAC InChI chemical structure identifier standard has become the worldwide standard for defined chemical structures. InChI is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources, thus enabling easier linking of diverse data compilations.
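An InChI string is made of slash-separated layers, which is part of what makes it practical as a key for linking records across data compilations. The sketch below shows that layered layout on the standard InChI of ethanol. The helper `split_inchi` is a hypothetical name introduced here, and the parsing is deliberately simplified: it separates the version, the formula and the prefixed layers, and does no chemical validation.

```python
def split_inchi(inchi):
    """Split an InChI string into its version, formula and prefixed layers.

    Simplified illustration only: no validation of layer contents.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, *layers = inchi[len(prefix):].split("/")
    formula = None
    # The formula layer has no letter prefix; it starts with an element
    # symbol (uppercase) or a count (digit). It can be absent.
    if layers and (layers[0][:1].isupper() or layers[0][:1].isdigit()):
        formula = layers.pop(0)
    # Remaining layers carry a one-letter prefix, e.g. 'c' (connections)
    # or 'h' (hydrogens). A list keeps repeated prefixes in order.
    prefixed = [(layer[0], layer[1:]) for layer in layers if layer]
    return {"version": version, "formula": formula, "layers": prefixed}


ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
parsed = split_inchi(ethanol)
print(parsed)
```

For ethanol this yields version `1S`, formula `C2H6O`, a connectivity layer `1-2-3` and a hydrogen layer `3H,2H2,1H3`. Generating or validating InChI strings from structures requires the official InChI software or a cheminformatics toolkit rather than string handling.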
Gathers data about associations between Alzheimer’s disease (AD) related miRNAs and bioactive small molecules. SmiRN-AD allows users to find information by miRNA or by small molecule. The database lets users visualize the stem-loop structure of primary miRNAs, the corresponding p-values, the associated small molecules and the consistently differentially expressed target genes. Each search query can return an illustration of the miRNA and small molecule associations.