FlyBase is the leading website and database of Drosophila genes and genomes. FlyBase curates a variety of data from published biological literature, including phenotype, gene expression, interactions (genetic and physical), gene ontology (GO) information and many others. These data are organized in ∼31 different data-type reports such as the Gene Report or the Allele Report. The range of data provided increases and changes as new types of data become available. Whether you are using the fruit fly Drosophila melanogaster as an experimental system or wish to understand Drosophila biological knowledge in relation to human disease or to other model systems, FlyBase can help you find the information you are looking for.


Provides a resource for data analysis and visualization on a gene-by-gene or genome-wide scale. PlasmoDB is a functional genomic database for Plasmodium spp. It belongs to a family of genomic resources that are housed under the EuPathDB Bioinformatics Resource Center (BRC) umbrella. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop-down menus. Various results can then be combined with each other on the query history page.


Helps users organize and centralize all variant data and annotations from their lab. Highlander provides researchers with several tools for filtering information. This tool, coupled to a local MySQL database, aims to classify all variant data coming from exome and whole-genome sequencing experiments. It also supplies annotation and visualization functions that help detect changes of interest among the complete list of variants detected in a sample.

EBI / EMBL-EBI - The European Bioinformatics Institute

Provides access to several biological data resources and bioinformatics services. EBI is a platform that covers the entire range of biological sciences, from raw DNA sequences to curated proteins, chemicals, structures, systems, pathways, ontologies and literature. Databases, tools, and web services are provided for sharing data, performing queries and analyzing results. Users can also deposit their data through a data submission page. All the resources are freely available without restriction, with few exceptions.


A scientific database for the bacterium Escherichia coli K-12 MG1655. The EcoCyc project performs literature-based curation of the entire genome, and of transcriptional regulation, transporters, and metabolic pathways. New experimental discoveries about gene products, their function and regulation, new metabolic pathways, enzymes and cofactors are regularly added to EcoCyc. SmartTable tools allow users to browse collections of related EcoCyc content. SmartTables can also serve as repositories for user- or curator-generated lists. EcoCyc supports running and modifying E. coli metabolic models directly on the EcoCyc website.


A lightweight, comprehensive genome resource and sequence analysis platform for oomycete organisms. EuMicrobedbLite is a successor of the VBI Microbial Database (VMD) that was built using the Genome Unified Schema (GUS). This database has 26 publicly available genomes and 10 EST datasets of oomycete organisms. The browser page has dynamic tracks presenting comparative genomics analyses, coding and non-coding data, tRNA genes, repeats and EST alignments. In addition, 44,777 core conserved proteins, forming 2,974 clusters, were defined from twelve oomycete organisms. The user interface has undergone major changes for ease of browsing. Queryable comparative genomics information, conserved orthologous genes and pathways are among the new key features of this database. Annotations for the organisms are updated once every six months to ensure quality.

GCGene / Gastric Cancer Gene database

A literature-based database with comprehensive annotations supported by a user-friendly website. In the current release, we have collected 1,815 unique human genes including 1,678 protein-coding and 137 non-coding genes curated from extensive examination of 3,142 PubMed abstracts. The resulting database has a convenient web-based interface to facilitate both textual and sequence-based searches. All curated genes in GCGene are downloadable for advanced bioinformatics data mining. Gene prioritization was performed to rank the relative relevance of these genes in gastric cancer development.

CGD / Candida Genome Database

Provides gene, protein and sequence information for multiple Candida species. CGD contains web-based tools for accessing, analyzing and exploring these data, to facilitate and accelerate research into Candida pathogenesis and biology. Locus pages comprise a summary view along with several additional tabs that display more detailed information, including phenotype details, Gene Ontology term curation, protein product details for coding genes, notes on changes to the sequence or structure of the gene, a comprehensive reference list and the Homology Information tab, a place where phylogeny- and similarity-related data may be examined and evaluated.


Provides the biological research community with a comprehensive encyclopedia of genomic functional elements in the model organisms C. elegans and D. melanogaster. modENCODE is run as a Research Network and the consortium is formed by 11 primary projects, divided between worm and fly, spanning the domains of gene structure, mRNA and ncRNA expression profiling, transcription factor binding sites (TFBS), histone modifications and replacement, chromatin structure, DNA replication initiation and timing, and copy number variation (CNV).

Ascaris suum

Offers assembly and gene annotation of Ascaris suum, also known as the large roundworm of pigs, which is in the Ascarididae family. The database reports the 273 megabase draft genome of Ascaris suum and compares it with other nematode genomes. This genome has low repeat content (4.4%) and encodes about 18,500 protein-coding genes. The A. suum secretome (about 750 molecules) is rich in peptidases linked to the penetration and degradation of host tissues, and in an assemblage of molecules likely to modulate or evade host immune responses. This genome provides a comprehensive resource to the scientific community and underpins the development of urgently needed interventions (drugs, vaccines and diagnostic tests) against ascariasis and other nematodiases.


Allows gene ontology (GO) analysis. agriGO is an analysis toolkit and database for the agricultural community. It mainly contains analysis tools that work with the background provided by agriGO v2.0 as well as custom analyses, including search, singular enrichment analysis (SEA), parametric analysis of gene set enrichment (PAGE), SEACOMPARE, Batch SEA, DAG drawer and Scatter Plots. It also covers a large number of species and datatypes, which have been classified into several groups.
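Singular enrichment analysis asks whether a GO term is over-represented in a gene list relative to a background set. One common way to score this is a one-sided hypergeometric test; the sketch below illustrates that idea only. It is not agriGO's code, and the function name `enrichment_p_value` and its parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, list_size, term_size, background_size):
    """One-sided hypergeometric p-value for over-representation.

    hits            -- genes in the query list annotated with the GO term
    list_size       -- total genes in the query list
    term_size       -- genes in the background annotated with the GO term
    background_size -- total genes in the background
    Returns P(X >= hits) when list_size genes are drawn without replacement.
    """
    most_possible = min(list_size, term_size)
    all_draws = comb(background_size, list_size)
    # Sum the probability of every outcome at least as extreme as the one seen.
    tail = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(hits, most_possible + 1)
    )
    return tail / all_draws
```

A small p-value suggests the term appears in the list more often than random sampling from the background would explain. When many terms are tested at once, a multiple-testing correction is normally applied on top of this.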

CoReCG / Colon Rectal Cancer Gene Database

Contains information on 2056 colon-rectal cancer genes involved in distinct colorectal cancer stages, sourced from published literature and served through a knowledge-based information retrieval system. An interactive web interface offers several browsing sections, an advanced search facility for querying the database, and online tools for sequence similarity searches. CoReCG is expected to reduce the time and effort scientists and clinicians spend surveying the literature on genes involved in colon-rectal malignancy, and so to support further advances towards therapeutic solutions.

MSGene / Metastasis Suppressor Gene Database

The first literature-based gene resource for exploring human metastasis suppressor genes (MS genes) to unveil the cellular complexity of MS genes. The MSGene database stores 194 human MS genes (161 protein-coding and 33 microRNA genes), and 1488 homologous genes from 17 model species collected by manual curation of the literature. Follow-up functional analyses associated the 194 human MS genes with epithelium/tissue morphogenesis and epithelial cell proliferation. In addition, pathway analysis highlights the prominent role of MS genes in activation of platelets and the coagulation system in the tumor metastatic cascade. Moreover, the global mutation pattern of MS genes across multiple cancers may reveal common cancer metastasis mechanisms. All these results illustrate the importance of MSGene to our understanding of cell development and cancer metastasis.

SGD / Saccharomyces Genome Database

Compiles comprehensive integrated biological information about the budding yeast Saccharomyces cerevisiae. SGD is a manually-curated database which aims to improve the discovery of functional relationships between sequence and gene products in fungi and higher organisms. The database records information about the yeast genome and its genes, proteins, and other encoded features. Moreover, it contains several bioinformatic tools to facilitate experimental design and analysis.

Pseudomonas Genome Database

Collaborates with an international panel of expert Pseudomonas researchers to provide high-quality updates to the PAO1 genome annotation and make cutting-edge genome analysis data available. The Pseudomonas Genome Database integrates completely sequenced Pseudomonas genome sequences and their annotations with genome-scale, high-precision computational predictions and manually curated annotation updates. The wide range of tools for comparing Pseudomonas annotations and sequences includes a strain-specific access point for viewing high-precision computational predictions, including updated, more accurate protein subcellular localization and genomic island predictions.


A centralized gene-annotation portal that enables researchers to access distributed gene annotation resources. The unique features of BioGPS, compared to those of other gene portals, are its community extensibility and user customizability. Users contribute the gene-specific resources accessible from BioGPS (‘plugins’), which helps ensure that the resource collection is always up-to-date and that it will continue expanding over time. BioGPS users can create their own collections of relevant plugins and save them as customized gene-report pages or ‘layouts’. In addition, we recently updated the most popular plugin, the ‘Gene expression/activity chart’, to include ∼6000 datasets (from ∼2000 datasets) and we enhanced user interactivity.


Provides an easy way of accessing the sequences and all-inclusive annotation data on the structures of the cyanobacterial genomes. It contains cyanobacterial genomic sequences from 376 species, which consist of 86 complete and 290 draft genomes. The user interface was optimized for large genomic data to include the use of semantic web technologies and JBrowse. CyanoBase focuses on the representation and reusability of reference genome annotations, which are continuously updated by manual curation. Advanced users can also retrieve this information through the representational state transfer-based web application programming interface in an automated manner.

Branchiostoma floridae

Offers gene annotation of Branchiostoma floridae, a lancelet of the genus Branchiostoma. The genome of this species reveals that, among the chordates, the morphologically simpler tunicates are actually more closely related to vertebrates than lancelets are. The genome of Branchiostoma floridae is estimated to be approximately 575 Mb contained in 19 pairs of chromosomes, and is being sequenced to approximately 8.1X depth. Branchiostoma floridae belongs to the Branchiostomidae family.

Ricinus communis

A database which offers gene annotation of Ricinus communis, also known as castor bean. The genome sequence assembly was searched for repetitive DNA using a combination of sequence alignment to databases of repetitive sequences and RepeatScout to identify repeats de novo. Overall, over 50% of the genome was identified as repetitive DNA (excluding low-complexity sequences), most of which could not be associated with known element families. Ricinus communis belongs to the Euphorbiaceae family.

MaizeGDB / Maize Genetics and Genomics Database

Provides several types of information about corn. MaizeGDB is an online repository offering several functions, such as a genome browser and a bin viewer. It also offers tools for working on Zea mays, including: (1) SNPversity, which lets researchers compare single nucleotide polymorphisms (SNPs); (2) a BLAST tool that helps users BLAST datasets at several sites. A “data centers” page supplies many filters to simplify users’ searches.

PATRIC / Pathosystems Resource Integration Center

Aims to assist scientists in infectious-disease research. PATRIC is a National Institutes of Health (NIH) supported bioinformatics resource center that has been built to enable comparative genomic analysis of bacterial pathogens. The database provides researchers with an online resource that stores and integrates a variety of data types (e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data) and associated metadata. Tools and services for bacterial infectious disease research are also available.

Pleurobrachia bachei

Offers assembly and gene annotation of Pleurobrachia bachei, which is in the Pleurobrachiidae family. The database reports the Pleurobrachia bachei genome sequence and identifies ~19,600 gene models, 96% of which are supported by transcriptome data. The Pleurobrachia bachei draft genome was assembled using a custom approach designed to leverage the individual strengths of three popular de novo assembly packages and strategies: Velvet, SOAPdenovo, and pseudo-454 hybrid assembly with ABySS.

Thalassiosira pseudonana

A database offering assembly and gene annotation of Thalassiosira pseudonana, a species of marine centric diatom. It is a model for diatom physiology studies, belongs to a genus widely distributed throughout the world's oceans, and has a relatively small genome at 34 megabase pairs. This genome sequence is composed of "finished chromosomes" and "unmapped sequence", which were annotated separately. Thalassiosira pseudonana belongs to the Thalassiosiraceae family.

IMG / Integrated Microbial Genomes

Offers a collection of genomes from all three domains of life, as well as viruses, plasmids and genome fragments. IMG contains biosynthetic clusters of genes associated with pathways involved in the generation of secondary metabolites in isolate prokaryotic genomes. It provides about 11 500 bacterial, archaeal and eukaryotic genomes; more than 2 800 viral genomes and 1 190 plasmids that did not come from a specific microbial genome sequencing project and about 600 genome fragments.

Nematostella vectensis

Offers gene annotation of Nematostella vectensis, also known as the starlet sea anemone. This genome includes approximately 7.8X whole genome sequencing (WGS) in small insert end-sequence coverage. After trimming for vector and quality, and excluding short/redundant scaffolds, there were 10,942 assembled scaffolds, containing 2,817,779 reads, and 357 Mbp of sequence. Further exclusion of contaminant and mis-assembled scaffolds reduced this to 10,804 scaffolds, with a total length of 356 Mbp. Roughly half of the genome is contained in 181 scaffolds, all at least 473 Kb in length. Nematostella vectensis belongs to the Edwardsiidae family.
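The statement that roughly half of the genome sits in 181 scaffolds of at least 473 Kb is the usual way of reporting an assembly's L50 (the scaffold count) and N50 (the length threshold). As a minimal sketch of how such figures are derived from a list of scaffold lengths, assuming the standard definition and using a hypothetical helper name `n50_l50`:

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold lengths.

    Scaffolds are taken from longest to shortest until their combined
    length reaches half of the assembly; N50 is the length of the last
    scaffold taken and L50 is how many scaffolds were taken.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no scaffold lengths given")
```

Applied to the real scaffold lengths of the assembly described above, a function like this would be expected to return values close to 473 Kb and 181; the scaffold lengths themselves are not part of this entry.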

Triticum urartu

A database which offers gene annotation of Triticum urartu, the diploid progenitor of the bread wheat A-genome. Also known as red wild einkorn, it is a diploid species whose genome is the A genome of the allopolyploid hexaploid bread wheat Triticum aestivum, which has genomes AABBDD. The genome of Triticum urartu accession G1812 was sequenced by the BGI using a whole-genome shotgun strategy, and assembled using SOAPdenovo software. Triticum urartu belongs to the Poaceae family.

Anolis carolinensis

A database which offers gene annotation of Anolis carolinensis, also known as the Carolina anole, an arboreal lizard. The anole lizard genome is composed of 13 chromosomes, assembled from 41.9861 contigs and 2.143 scaffolds. The total number of bases in the genome is 1.78 Gb. The gene set for anole lizard was built using the Ensembl genebuild pipeline. In addition to the main set, gene models have been predicted for each tissue type using the RNA-Seq pipeline. Anolis carolinensis belongs to the Dactyloidae family.

Aegilops tauschii

A database which offers assembly and gene annotation of Aegilops tauschii, also known as Tausch's goatgrass. The diploid progenitor of the bread wheat D-genome provides important evolutionary information for wheat. The bread wheat genome is a hexaploid, resulting from the hybridization of the wild A. tauschii with a cultivated tetraploid wheat, Triticum turgidum. This spontaneous event occurred about 8,000 years ago in the Fertile Crescent. Aegilops tauschii belongs to the Poaceae family.