dbGaP / database of Genotypes and Phenotypes

Enables investigators to have rapid access to a collection of genotype and phenotype data, providing a rich resource of both individual-level and summary-level information for exploration. The browser uses the standard NCBI graphical interface to combine sequence viewer track views with genotype tables and a novel sample-subject data selector that displays core phenotype data about the samples. The dbGaP data browser serves as a third solution, providing researchers with view-only access to a compilation of individual-level data from general research use studies through a simplified controlled-access process.

AtPID / Arabidopsis thaliana Protein Interactome Database

Depicts and integrates information on protein-protein interaction networks, domain architecture, ortholog information and GO annotation in the Arabidopsis thaliana proteome. AtPID predicts protein-protein interaction pairs by integrating several methods with a naive Bayesian classifier. All other related information in AtPID is manually extracted from the published literature and other resources by expert biologists. AtPID collects 5564 mutants with significant morphological alterations, which were manually curated into 167 plant ontology (PO) morphology categories, and predicts 4457 high-confidence gene-PO pairs with 1369 genes as the complement. These single/multiple-gene mutants are indexed and linked to 3919 genes.

GAD / Genetic Association Database

Collects, standardizes and archives genetic association study data. GAD is a public repository of published genetic association studies that contains molecular, clinical and study parameters for more than 5,000 human genetic association studies. It aims to facilitate the study of complex common human genetic diseases in the context of modern high-throughput assay systems and current annotated molecular nomenclature. All datasets can be downloaded from the website.


VarySysDB

Offers information on published genetic polymorphisms. VarySysDB is a database that provides separately annotated genetic polymorphisms for each H-Inv transcript (HIT), even when multiple transcripts form a single HIX. It (i) supports a better understanding of various biological processes, (ii) permits a detailed evaluation of how polymorphisms affect different phenotypes, and (iii) fosters research into the causes of genetic variation through genome-wide association studies.

Monarch Initiative

Provides tools for genotype-phenotype analysis, genomic diagnostics, and precision medicine across broad areas of disease. Monarch tools leverage a shared conceptual framework to help users understand and diagnose disease. Statistical similarity calculations enable comparison across species, biological scales, and community-specific vocabularies. Monarch supports researchers and clinicians using these data with visualization tools, application programming interfaces, and a rich web site. The initiative has developed four species-agnostic ontologies designed to unify their species-specific counterparts: GENO for genotypes, Uberpheno for phenotypes, UBERON for anatomy and MONDO for diseases. Monarch also contributes to the Gene Ontology, which unifies gene function and subcellular anatomy across species.

dbGaP Data Browser / database of Genotypes and Phenotypes Data Browser

A resource that enables view-only access to the database of Genotypes and Phenotypes (dbGaP). dbGaP Data Browser was developed in response to requests from the scientific community for a resource that would more easily enable view-only access to genotypes, aggregate variant data, and individual-level genomic sequence data that are not available through unrestricted access, without having to download individual dbGaP data sets. This tool allows approved users to find and view specific regions of the human genome, including all allele frequencies and subsets of individual-level genotype and sequence data stored in dbGaP within that region, without having to download the data sets of interest and perform additional analyses.

GWAS Catalog / genome-wide association studies Catalog

A manually curated resource of all published Genome-Wide Association Studies (GWAS) and association results. GWAS Catalog provides a dedicated mapping spreadsheet between all reported GWAS search interfaces and ontology terms. It allows users to identify the child terms included under each higher-level trait category on the GWAS. Moreover, the database can help identify causal variants, understand disease mechanisms, and establish targets for novel therapies.


PhenoScanner

Extracts and aligns associations for user-specified variants and proxies across a large curated database. PhenoScanner extends current catalogues of genetic data by including all available results as opposed to filtering on strength of association. This database aligns genotype-phenotype associations across traits and proxies, providing the user with an easily interpretable formatted output file. PhenoScanner will make cross-referencing genetic variants with many phenotypes faster and more efficient.

DANCE / Disease-ANCEstry Networks

A graph-based web tool that integrates and visualizes information on human complex phenotypes and their GWAS hits, as well as their risk allele frequencies in different populations. DANCE integrates information from two existing sources: (i) GWAS-hit SNPs reported in the NHGRI-EBI GWAS Catalog and (ii) risk-allele frequencies in Europeans, Africans and Asians from the 1000 Genomes Project (1KGP). DANCE provides an interactive way to explore the human SNP-Disease Network and its projection, a Disease-Disease Network. With these functionalities, DANCE fills a gap in our ability to handle and understand the knowledge generated by GWAS and the 1000 Genomes Project.


LincSNP

An integrated database for identifying and annotating disease-associated SNPs in human lincRNAs. The current release of LincSNP contains approximately 140,000 disease-associated SNPs (or linkage disequilibrium SNPs), which can be mapped to around 5,000 human lincRNAs, together with their comprehensive functional annotations. The database also contains annotated, experimentally supported SNP-lincRNA-disease associations and disease-associated lincRNAs. It provides flexible search options for data extraction, and searches can be performed by disease/phenotype name, SNP ID, lincRNA name and chromosome region.

LD Hub / Linkage Disequilibrium Hub

Automates the linkage disequilibrium score regression analysis pipeline. LD Hub calculates the single nucleotide polymorphism (SNP) heritability for the uploaded phenotype(s), and a genetic correlation matrix across traits. It can be envisaged as a useful hypothesis-generating tool, providing an easy method of screening hundreds or thousands of traits for interesting genetic correlations that could subsequently be followed up in further detail by other approaches such as pathway analysis.


DisGeNET

A comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud.

VaDE / VarySysDB Disease Edition

A literature-based genetic trait and genomic information database. VaDE provides genomic polymorphisms associated with diseases, traits, and pharmacogenomics. SNP-trait association data were obtained from the National Human Genome Research Institute GWAS (NHGRI GWAS) catalog and RAvariome, and detailed information on sample populations was added by curating the original papers. VaDE will contribute to the future establishment of personalized medicine and increase our understanding of genetic factors underlying diseases.

PsyGeNET / Psychiatric disorders and Genes association NETwork

Constitutes a resource on psychiatric diseases and their associated genes. PsyGeNET consists of a database and analysis tools. It contains information on depression, bipolar disorder, alcohol use disorders and cocaine use disorders. The database was developed by applying text mining tools to extract information from the scientific literature. The document describing the curation guidelines, which also serves as a resource for the development and evaluation of text mining systems, is available in the web portal.

MEGA-V / Mutation Enrichment Gene set Analysis of Variants

Provides a statistical framework to test associations between any type of perturbation of biological processes and the occurrence of disease. MEGA-V identifies gene sets with a significantly higher number of variants in a cohort of interest (cohort A) as compared to a control cohort (cohort B) or a random distribution generated using Monte Carlo simulation. Gene sets are predefined by the user and can be groups of genes involved in the same biological processes, Gene Ontology groups, or genes associated with the same disease.
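
The comparison against a Monte Carlo random distribution can be illustrated with a minimal sketch. This is not MEGA-V's actual procedure: the function name, the input format (a dictionary of per-gene variant counts for cohort A) and the choice of random same-size gene sets as the null model are all assumptions made for illustration.

```python
import random


def monte_carlo_enrichment(variant_counts, gene_set, n_draws=10000, seed=0):
    """Empirical test of whether a gene set carries more variants than
    random gene sets of the same size (illustrative null model only).

    variant_counts: dict mapping gene name -> number of variants in the cohort
    gene_set: iterable of gene names defined by the user
    Returns (observed_total, empirical_p_value).
    """
    rng = random.Random(seed)
    background = list(variant_counts)
    # Only genes with known counts contribute to the observed total.
    members = [g for g in gene_set if g in variant_counts]
    observed = sum(variant_counts[g] for g in members)

    at_least_as_extreme = 0
    for _ in range(n_draws):
        random_set = rng.sample(background, len(members))
        if sum(variant_counts[g] for g in random_set) >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the empirical p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_draws + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical counts: three heavily mutated genes among 100.
    counts = {f"gene{i}": 0 for i in range(100)}
    counts.update({"gene0": 10, "gene1": 10, "gene2": 10})
    print(monte_carlo_enrichment(counts, ["gene0", "gene1", "gene2"], n_draws=1000))
```

A small empirical p-value suggests the gene set carries more variants than expected by chance under this simplified null model; a comparison against a control cohort (cohort B) would need a different test.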


Provides a comprehensive, regularly updated, collection of data from genetic association studies in cutaneous melanoma (CM), including random-effects meta-analysis results of all eligible polymorphisms. The updated database version includes data from 192 publications with information on 1114 significantly associated polymorphisms across 280 genes, along with new front-end and back-end capabilities. Various types of relationships between data are calculated and visualized as networks.


E-GRASP

Enables the query and exploration of the information contained in the GRASP2 database alongside evolutionary information added for each single nucleotide polymorphism (SNP). This evolutionary information can be used to prioritize SNPs with a greater likelihood of bona fide and reproducible genetic disease associations. The E-GRASP resource includes phenotype association measures of significance (P-values) and population allele frequency data for SNPs present in the GRASP2 database, which provides information on ~8.87 million SNP-phenotype associations with a P-value threshold of ≤ 0.05, aggregated from 2082 Genome-Wide Association Studies that span 177 broad phenotypic categories.


AraGWAS

Provides genome-wide association study (GWAS) results. AraGWAS is a central resource for all genetic associations found through GWAS in A. thaliana. It reports the respective results in a comprehensible way and makes the data accessible to the community. It provides a selection of different methods to facilitate the exploration of these high-dimensional data. This catalog also offers a sophisticated and fast search API to query the database and to extract information for specific associations, genes or traits. Interactive visualizations let the user navigate the data easily and uncover interesting patterns.

GWAS Central

Provides access to a collection of summary-level genetic association data. GWAS Central collates association data and study metadata from many disparate sources whose data are available in different formats and to differing degrees of detail. The database also provides a toolkit for the storage, mining and display of summary-level association data. It enables experimental biologists to explore and compare data in the genome-wide association study (GWAS) domain, from either a genotype or phenotype starting point.

GuavaH / Genomic Utility for Association and Viral Analyses in HIV

Provides results from genome-wide association studies (GWAS) of human immunodeficiency virus (HIV) disease phenotypes including more than 4,000 individuals. GuavaH includes association results on HIV control (set point plasma viral load and elite control) and on susceptibility to infection in a cohort of highly exposed seronegative individuals. The GuavaH resource also includes functional transcriptome analyses from in vivo and in vitro studies.

DistiLD Database / Diseases and Traits in Linkage Disequilibrium Database

Increases the usage of existing genome-wide association study (GWAS) results. DistiLD database performs three important tasks: (i) published GWAS are collected from several sources and linked to standardized, international disease codes; (ii) data from the International HapMap program are analyzed to define linkage disequilibrium (LD) blocks onto which single nucleotide polymorphisms (SNPs) and genes are mapped; (iii) a web interface makes it easy to query and visualize disease-associated SNPs and genes within LD blocks. Users can query DistiLD in three different ways, starting from either a disease, a list of SNPs or a list of genes.
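
Task (ii), mapping SNPs onto LD blocks, amounts to an interval lookup. The sketch below assumes a hypothetical input format (block ID, chromosome, inclusive start and end coordinates, with non-overlapping blocks per chromosome); it does not reproduce DistiLD's own data model or code.

```python
from bisect import bisect_right
from collections import defaultdict


def build_block_index(blocks):
    """Index LD blocks by chromosome for fast position lookup.

    blocks: iterable of (block_id, chrom, start, end) with inclusive
    coordinates; blocks on one chromosome are assumed not to overlap.
    """
    per_chrom = defaultdict(list)
    for block_id, chrom, start, end in blocks:
        per_chrom[chrom].append((start, end, block_id))
    index = {}
    for chrom, intervals in per_chrom.items():
        intervals.sort()
        starts = [start for start, _, _ in intervals]
        index[chrom] = (starts, intervals)
    return index


def find_block(index, chrom, position):
    """Return the ID of the LD block containing the position, or None."""
    if chrom not in index:
        return None
    starts, intervals = index[chrom]
    # Rightmost block whose start is <= position.
    i = bisect_right(starts, position) - 1
    if i < 0:
        return None
    start, end, block_id = intervals[i]
    return block_id if start <= position <= end else None


if __name__ == "__main__":
    # Hypothetical blocks and SNP positions, for illustration only.
    idx = build_block_index([("block1", "chr1", 100, 200),
                             ("block2", "chr1", 300, 400)])
    for pos in (150, 250, 300):
        print(pos, find_block(idx, "chr1", pos))
```

Genes could be assigned to blocks the same way, which would let disease-associated SNPs and genes be grouped by the block they share.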

MR-Base / Mendelian Randomization base

Provides a complete summary database from 1094 genome-wide association studies (GWAS) on diseases and other complex traits. MR-Base is a platform that uses these data to perform Mendelian randomization (MR) tests and sensitivity analyses. MR-Base exists conceptually as a two-part framework: (i) it is a repository of harmonized published GWAS summary data which has been aggregated from disparate and heterogeneous sources on traits from across the phenome; (ii) it plays host to a range of causal estimation methods and automatically applied sensitivity analyses that can be used to improve the reliability of causal inferences.
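
As an example of what a causal estimation method on GWAS summary data can look like, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate, a widely used MR estimator. The description above does not name the methods MR-Base hosts, so the choice of IVW, the function name and the input layout are assumptions for illustration, and this is not MR-Base code.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each list holds one value per genetic variant (instrument):
    beta_exposure: SNP effect on the exposure
    beta_outcome:  SNP effect on the outcome
    se_outcome:    standard error of the SNP-outcome effect
    Returns (causal_estimate, standard_error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    # Weighted regression of outcome effects on exposure effects through
    # the origin, with weights 1 / se_outcome**2.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Hypothetical summary statistics for two variants.
    print(ivw_estimate([0.1, 0.2], [0.05, 0.10], [0.01, 0.01]))
```

The sketch assumes the exposure and outcome effects are already harmonized to the same effect allele, which is the role of part (i) of the framework described above.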


GEPdb

Integrates both genome-wide association studies and expression quantitative trait loci information, the two primary sources of genome-wide mapping for genotype-phenotype and genotype-expression associations, together with phenotype-associated gene lists. The GEPdb provides simultaneous interpretation of both genetic risks and potential gene regulatory pathways toward phenotypic outcome by establishing the ternary relationship of genotype-expression-phenotype (GEP). The analytic scope is further extended by linkage disequilibrium from five different populations of the International HapMap Project.

NCBI PheGenI / Phenotype-Genotype Integrator

Merges NHGRI genome-wide association study (GWAS) catalog data with several databases housed at the National Center for Biotechnology Information (NCBI), including Gene, dbGaP, OMIM, GTEx and dbSNP. This phenotype-oriented resource, intended for clinicians and epidemiologists interested in following up results from GWAS, can facilitate prioritization of variants to follow up, study design considerations, and generation of biological hypotheses.


FishBase

An information system with key data on the biology of all fishes. FishBase is a global biodiversity information system on finfishes. Its initial goal of providing key facts on population dynamics for 200 major commercial species has grown into coverage of a wide range of information on all species currently known in the world: taxonomy, biology, trophic ecology, life history, and uses, as well as historical data reaching back 250 years. At present, FishBase covers >33,000 fish species compiled from >52,000 references in partnership with >2,000 collaborators, and includes >300,000 common names and >55,000 pictures.

SIDD / Semantically Integrated Disease-associated Database

Integrates 18 disease-associated databases so that researchers can browse multiple types of disease-related molecular, phenotypic and environmental features (DR-MPEs) in a single view. A web interface allows easy navigation for querying information by browsing a disease ontology tree or searching for a disease term. Furthermore, a network visualization tool using the Cytoscape Web plugin has been implemented. It makes SIDD easier to use when viewing the relationships between diseases and DR-MPEs.


PhenoPPIOrth

A tool based on a computational algorithm that uses orthology and protein-protein interaction information to infer gene-phenotype associations for multiple species. PhenoPPIOrth is a web server that provides genome-wide phenotype inference for six species: fly, human, mouse, worm, yeast, and zebrafish. The inference method was evaluated by comparing the inferred results with known gene-phenotype associations, and the high Area Under the Curve values suggest that it performs well. Applied to two representative human diseases, Type 2 Diabetes and Breast Cancer, the method was able to identify related Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways. The web server can be used to infer functions and putative phenotypes of a gene along with the candidate genes of a phenotype, and thus aids in disease candidate gene discovery.

GeMInA / Genomic Metadata for Infectious Agents

Identifies, standardizes and integrates outbreak metadata across a broad range of pathogens. Gemina is an open-source, web-based, pathogen-centric tool designed to offer an integrated investigative and geospatial surveillance system connecting pathogens, pathogen products and disease metadata anchored on the taxonomic ID of the pathogen and host. It provides a metadata selection query interface to guide identification of the National Institute of Allergy and Infectious Diseases (NIAID) category A–C viral and bacterial pathogens.


Genopedia

Provides a gene-centered summary view of genetic association studies. The system of Genopedia translates a gene name, gene symbol, gene alias or protein name entered by users. It displays information about diseases that have been studied in association with a given gene. A gene-disease network is also generated by defining two genes as “connected” if they have been studied for association with the same disease. Each search result page provides links to the foremost gene-centered databases.
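
The "connected" rule above can be sketched in a few lines. The input format (a list of gene-disease pairs) and the function name are assumptions for illustration, and the gene and disease labels are placeholders, not Genopedia data.

```python
from collections import defaultdict
from itertools import combinations


def build_gene_network(gene_disease_pairs):
    """Connect two genes if they have been studied for the same disease.

    gene_disease_pairs: iterable of (gene, disease) tuples
    Returns a dict mapping (gene_a, gene_b), with gene_a < gene_b,
    to the set of diseases the two genes share.
    """
    genes_by_disease = defaultdict(set)
    for gene, disease in gene_disease_pairs:
        genes_by_disease[disease].add(gene)

    edges = defaultdict(set)
    for disease, genes in genes_by_disease.items():
        # Every pair of genes studied for this disease becomes an edge.
        for gene_a, gene_b in combinations(sorted(genes), 2):
            edges[(gene_a, gene_b)].add(disease)
    return dict(edges)


if __name__ == "__main__":
    # Placeholder labels, for illustration only.
    pairs = [("GENE_A", "disease_1"), ("GENE_B", "disease_1"),
             ("GENE_B", "disease_2"), ("GENE_C", "disease_2")]
    for edge, shared in sorted(build_gene_network(pairs).items()):
        print(edge, sorted(shared))
```

Keeping the set of shared diseases on each edge makes it possible to weight connections by how many diseases two genes have in common.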


Phenopedia

Provides researchers with access to updated information on human genetic association studies in order to facilitate knowledge synthesis. The information about genes studied in relation to a particular disease (e.g. stroke) or phenotype (e.g. hypertension) is summarized on the web page of Phenopedia in a tabular format. The results of the search include: (i) the number of published genetic association studies; (ii) the number of genes studied; (iii) the number of investigators (published authors); and (iv) temporal and geographic publication trends.

Type 2 Diabetes Knowledge Portal

Offers analytic tools for analyzing medical data. T2D Knowledge Portal is a database of DNA sequence, functional and epigenomic information, and clinical data from studies on type 2 diabetes and its macro- and microvascular complications. The data and analytical tools are accessible to academic and industry researchers, and all interested users, to identify and validate changes in DNA that influence onset of type 2 diabetes, disease severity, or disease progression.


AmphibiaWeb

Enables users to search and retrieve information relating to amphibian biology and conservation. AmphibiaWeb provides a large-scale estimate of amphibian phylogeny, containing over 2800 species. The database provides support for groups recognized in previous studies and suggests non-monophyly for several currently recognized families, particularly in hyloid frogs. It also includes several families that are not recognized in current classifications but are important for avoiding non-monophyly of current families.