Expressed sequence tag databases | Transcription analysis
Expressed sequence tags (ESTs) are short DNA sequences (200–500 nucleotides) generated by sequencing the 5′ and/or 3′ ends of cDNAs that are subsequently clustered and counted (Adams et al., 1991). Source text: Feichtinger et al., 2014.
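The "clustered and counted" step can be illustrated with a toy sketch. It assumes, purely for illustration, that two ESTs belong to the same cluster when they share at least one exact k-mer. Production pipelines use alignment-based overlap detection and handle both strands. The sequences, the names `est_a` to `est_c`, and the functions `cluster_ests` and `cluster_sizes` are hypothetical.

```python
from collections import defaultdict


def cluster_ests(ests, k=8):
    """Group ESTs that share at least one exact k-mer (toy single-linkage clustering).

    Assumption: same-strand, exact k-mer sharing stands in for real overlap detection.
    """
    parent = {name: name for name in ests}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> name of the first EST that contained it
    for name, seq in ests.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in first_seen:
                # Merge this EST's cluster with the cluster that owns the k-mer.
                parent[find(name)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = name

    clusters = defaultdict(list)
    for name in ests:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


def cluster_sizes(clusters):
    """Count the ESTs in each cluster; the count is a rough proxy for transcript abundance."""
    return [len(members) for members in clusters]


# Hypothetical reads: est_a and est_b overlap, est_c is unrelated.
ests = {
    "est_a": "ACGTACGTTAGCCGATAGGCT",
    "est_b": "TAGCCGATAGGCTTTACGGA",
    "est_c": "GGGGCCCCAAAATTTTGGCCAA",
}
clusters = cluster_ests(ests, k=8)
sizes = cluster_sizes(clusters)
```

Here the two overlapping reads fall into one cluster and the unrelated read forms a singleton, so the per-cluster counts give a crude expression estimate.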
Provides a resource for data analysis and visualization on a gene-by-gene or genome-wide scale. PlasmoDB is a functional genomic database for Plasmodium spp. It belongs to a family of genomic resources that are housed under the EuPathDB Bioinformatics Resource Center (BRC) umbrella. Data in PlasmoDB can be queried by selecting the data of interest from a query grid or drop-down menus. The results of different queries can then be combined with each other on the query history page.
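Combining query results can be thought of as set operations on lists of gene identifiers. The sketch below assumes that model. It does not use or reproduce any PlasmoDB interface, and the gene IDs, the variable names, and the `combine` function are hypothetical.

```python
# Hypothetical gene IDs standing in for the results of two saved queries.
expressed_in_blood_stage = {"gene_001", "gene_002", "gene_003"}
has_signal_peptide = {"gene_002", "gene_003", "gene_004"}


def combine(a, b, how):
    """Combine two result sets; `how` is one of 'intersect', 'union', 'minus'."""
    operations = {
        "intersect": a & b,  # genes returned by both queries
        "union": a | b,      # genes returned by either query
        "minus": a - b,      # genes returned by the first query only
    }
    return sorted(operations[how])


both = combine(expressed_in_blood_stage, has_signal_peptide, "intersect")
either = combine(expressed_in_blood_stage, has_signal_peptide, "union")
only_first = combine(expressed_in_blood_stage, has_signal_peptide, "minus")
```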
A database that offers gene annotation for cucurbits. It hosts the genomes of melon (Cucumis melo), cucumber (Cucumis sativus), watermelon (Citrullus lanatus) and pumpkin (Cucurbita maxima). The Cucurbitaceae consist of 98 proposed genera with 975 species, mainly in tropical and subtropical regions. All species are sensitive to frost. Most of the plants in this family are annual vines, but some are woody lianas, thorny shrubs, or trees (Dendrosicyos). Cucurbits belong to the Cucurbitaceae family.
Provides a collection of short single-read transcript sequences. dbEST is a division of GenBank available as an online resource. This database includes sequence data and other information on single-pass cDNA sequences, or expressed sequence tags (ESTs), from a number of organisms. The sequences provide a resource to evaluate gene expression, find potential variation, and annotate genes.
Provides information about Trypanosomatidae. TriTrypDB is a collaborative database that brings together annotation, curation and tools enabling sophisticated queries against genome-scale datasets. Users can choose from more than 80 different searches against the TriTryp genomes and datasets and combine them in an integrated and graphical manner. All searches can be customized, summarized by species and displayed as an interactive gene list.
Gathers maternal gene expression information for Halocynthia roretzi. MAGEST is a database that assists in the prediction of amino acid fragment sequences. This resource is dedicated to fragments of amino acid sequences that are predicted from the expressed sequence tag (EST) data sets.
Compiles information about Cryptosporidium. CryptoDB collects whole genome sequence, annotation, sequence analysis and related data about this parasite. The database integrates a set of tools such as BLAST, a tool for annotating user-supplied sequences and an interface for saving search strategies. Searches can be made among nine data types, including popset isolate sequences, single nucleotide polymorphisms (SNPs), open reading frames (ORFs), compounds and genes.
Provides an online resource of genomic data for key blood flukes (genus Schistosoma). SchistoDB integrates whole-genome sequence (WGS) and annotation of three species of the genus and provides enhanced bioinformatics analyses and data-mining tools. This database supplies access to and visualization of the Schistosoma mansoni genome and its features, integrated with other data types such as expressed sequence tags (ESTs), proteins and metabolic pathways.
Provides information about cDNA sequences. PEDE contains more than 68 000 high-quality expressed sequence tags (ESTs) collected from additional libraries. PEDE is accessed through Internet-based search interfaces. This database covers some porcine gene expression data. It enables scientists to prepare a catalog of genes likely to be of interest when pigs are used as animal models in research applications.
Compiles a set of protein expression and protein synthesis data added to human Gateway entry clones. HGPD compiles more than 40000 entry clones divided into 10 protein function groups. The database allows users to search the content, which presents the open reading frame (ORF) region of each cDNA. It also includes in vivo cellular localization data of proteins for about 30000 human Gateway entry clones.
Helps biologists to find the flanking insertion sites (FSTs) that interrupt the genes in which they are interested. The FLAGdb information system was developed with the aim of using whole plant genomes as physical references in order to gather and merge available genomic data from in silico or experimental approaches. Combining original data with the output of experts and graphical displays that differ from classical plant genome browsers, FLAGdb presents a powerful complementary tool for exploring plant genomes and exploiting structural and functional resources, without the need for computer programming knowledge.
A genomic database for Giardia lamblia. GiardiaDB is based on the genome of the WBC6 clinical isolate of G. lamblia. It is accessed via the standard EuPathDB web interface, providing a wide variety of tools for genomic database mining. In addition to BLAST and pattern/motif similarity searches, users can identify genes based on genomic position; common name or keyword; gene attributes (such as gene type, or number of exons); evidence of transcript expression including ESTs, SAGE tags, microarray and proteomics; gene product annotation (such as GO function, or EC enzyme number); and predicted cellular location (based on signal peptide and transmembrane predictions).
Stores raw and cleansed expressed sequence tags (ESTs) from libraries in the EST division of GenBank (dbEST). CleanEST is a web-based database server that provides two different cleansed sequences for each dbEST library: “pre-cleansed” and “user-cleansed”. The database uses an automatic user-cleansing pipeline, in which sequences in a user-selected library are cleansed on-the-fly according to user-input options. Four types of search menus are available: organism, sequencing center, eVOC ontologies (for human libraries) and user sequences.
Furnishes pig gene annotations in all sequenced genomic regions. PigGIS gathers 3.84 million whole genome shotgun (WGS) records generated by the Sino–Danish Pig Genome Project, 870 084 expressed sequence tags (ESTs) from 100 differentiated pig tissues/developmental stages, and 589 996 genomic reads together with 570 773 mRNA sequences extracted from GenBank.
Provides Physcomitrella patens DNA sequences. PHYSCObase is an online resource that gathers full-length enriched cDNA libraries of P. patens from auxin- and cytokinin-treated gametophytes, as well as from gametophytes that were grown without exogenous plant hormones. These data are deposited in public databases together with the expressed sequence tag (EST) project.
Provides an expressed sequence tag (EST) database for Bombyx mori. Silkbase helps users find gene sequences and gene functions in the model species and in non-model lepidopterans. The database allows users to run keyword, clone name and BLAST searches to find Bombyx cDNAs homologous to known amino acid sequences of other species. In addition, it allows direct comparisons with FlyBase and WormBase.
Hosts a comprehensive collection of Orchidaceae floral transcriptomes. OrchidBase is a collection of 37,979,342 sequence reads collected from 11 in-house Phalaenopsis orchid cDNA libraries. Among them, 41,310 expressed sequence tags (ESTs) were obtained by using Sanger sequencing, whereas 37,908,032 reads were obtained by using next-generation sequencing (NGS) including both Roche 454 and Solexa Illumina sequencers. These reads were assembled into 8,501 contigs and 76,116 singletons, resulting in 84,617 non-redundant transcribed sequences with an average length of 459 bp. OrchidBase provides a detailed annotation including general information, relative expression level, gene ontology (GO), KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway mapping and gene network prediction. The online resources for putative annotation can be searched either by text or by using BLAST, and the results can be explored on the website and downloaded.
Supports investigations on expressed sequence data from multiple tomato species. tomatEST is a database of expressed sequence tag (EST)/complementary DNA (cDNA) sequences from libraries in dbEST, the GenBank division for EST sequence data. This resource was developed to provide a workbench for mining the complexity of EST sequence information content from multiple tomato species (i) for expression pattern analysis and (ii) for gene discovery in the framework of the Solanum lycopersicum genome project.
Provides access to sequence, classification, clustering and annotation data of crop EST projects. CR-EST currently holds more than 200,000 sequences derived from 48 cDNA libraries of six species: barley, wheat, pea, petunia, tobacco and potato. The barley section comprises approximately one-third of all publicly available ESTs. CR-EST deploys an automatic EST preparation pipeline that includes the identification of chimeric clones in order to transparently display the data quality. Sequences are clustered in species-specific projects to currently generate a non-redundant set of approximately 45,000 consensus sequences.
A database that offers gene annotation for Hevea brasiliensis. The rubber tree is the major commercial source of natural rubber. The transcriptome was sequenced from the vegetative shoot apex, yielding 2,311,497 reads. Clustering and assembly of the reads produced a total of 113,313 unique sequences, comprising 28,387 isotigs and 84,926 singletons. In addition, 17,819 EST-SSRs were identified from the data set. Hevea brasiliensis belongs to the Euphorbiaceae family.
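The assembly figures quoted for Hevea brasiliensis can be checked with a few lines of arithmetic. This is only a consistency check on the numbers stated in the entry, and the variable names are illustrative.

```python
# Figures as stated in the entry above.
isotigs = 28_387
singletons = 84_926
unique_sequences = 113_313

# The unique sequences should be the sum of isotigs and singletons.
total = isotigs + singletons
is_consistent = total == unique_sequences

# Share of the unique sequences that remained unassembled singletons.
singleton_fraction = singletons / unique_sequences
print(f"consistent: {is_consistent}, singleton share: {singleton_fraction:.1%}")
```

The totals agree, and roughly three quarters of the unique sequences are singletons.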
Provides a repository of silkworm genome information for functional and applied genomics, including proteome data from KAIKO2DDB and transgene and reporter data from the Bombyx trap database. KAIKOcDNA compiles genomic sequences, map information and expressed sequence tag (EST) data. The database offers four map viewers, a gene viewer, and sequence, keyword and position search systems. Results can be visualized at the level of nucleotide sequence, gene, scaffold and chromosome.
Stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of Pleurochrysis haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationships with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, Pleurochrysome also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes.
Gathers more than 50 000 Capra hircus and Ovis aries expressed sequence tags (ESTs). GoSH DB provides a web interface allowing users to query the database and retrieve sequence subsets. This repository can be useful for selecting sequences for a specific purpose, such as single nucleotide polymorphism (SNP) mapping. Moreover, the ability to run subset-specific text searches on the dataset makes the database useful for data mining and data retrieval.
Provides access to information about potential intron polymorphism (PIP) markers and homologous relationships among PIP markers from different plant species. PIP contains more than 55,000 PIP markers for about 60 plant species. Users can search the database by (i) species name, (ii) marker ID, (iii) gene name in subject species, (iv) PlantGDB PUT ID, (v) intron length range in subject species, (vi) Gene Ontology (GO) number in subject species and (vii) gene description in subject species.
Facilitates genome research in Littorina saxatilis and related species. LSD was built from a hybrid assembly of four different expressed sequence tag (EST) data sets: two based on Sanger sequencing and two based on 454 sequencing. It can be used to find markers such as microsatellites and single nucleotide polymorphisms (SNPs), which are useful for studies of population structure and phylogeny, and for genetic mapping and population genomic approaches.
Provides a high-quality sequence of the euchromatic genome of Drosophila melanogaster. BDGP is a resource developed to (i) produce gene disruptions using P element-mediated mutagenesis on a scale unprecedented in metazoans, (ii) characterize the sequence and expression of cDNAs and (iii) develop informatics tools that support the experimental process, identify features of DNA sequence, and present up-to-date information about annotated sequence to the research community.
Links Arabidopsis information to resources dealing with crops. SABRE compiles full-length cDNAs from more than 10 different plant species linked to the Arabidopsis information resource (TAIR) annotations. This application allows users to browse TAIR gene models and annotations as well as homologous gene clones. Searches can be made by Resource/Clone ID, DDBJ/EMBL/GenBank accession number or by keywords.
Offers a set of monocot homologs of the Arabidopsis genes that are responsible for DNA replication and repair. bEST-DRRD contains the expressed sequence tags (ESTs) and genomic sequences derived from four large barley source databases: HarvEST, TIGR, The IPK Crop EST (CR-EST) and the Computational Biology and Functional Genomics Laboratory. It allows comparisons between genomes of O. sativa and B. distachyon.
Compiles information about Aphanomyces sequences and their annotations. AphanoDB aims to simplify gene prediction and annotation for the whole genome sequencing of Saprolegniales species. The database provides over 18000 assembled and annotated expressed sequence tags (ESTs) that can be used for functional and comparative genomic studies. Searches can be made by text or BLAST analysis or, for a given sequence, by accession number, EST name or gene ID.
Allows individual and batch queries using Xenopus accession, GI, and XenDB, UniGene and TIGR cluster IDs. The XenDB database is designed to address a critical issue facing many researchers: the comparison of genomic studies in one organism and their application to studies in another model organism. Using the XenDB system, the biologist can identify sequences of interest using simple gene name queries, accessions, or gene ontologies.
A genomics and genetics database of radish. RadishBase contains radish mitochondrial genome sequences, expressed sequence tag (EST) and unigene sequences and annotations, biochemical pathways, EST-derived single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers, and genetic maps.
Stores oomycete transcriptomics data and interfaces with VBI microbial database (VMD) and Phytophthora transcriptomics database (PTD). OTD is a resource for the oomycete community to browse and retrieve transcriptomics information. The database features multiple data types such as raw next-generation sequencing (NGS) reads, assembled reads, raw expressed sequence tags (ESTs), assembled ESTs and their annotation and mapping to reference genomes.
Provides comprehensive transcriptomic and draft genomic sequences of the diamondback moth (DBM), together with useful annotation information, through easy-to-use web interfaces that help researchers to efficiently search for target sequences such as insect resistance-related genes. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading of sequences and annotation data.
Allows users to browse ontologies from 70 human expressed sequence tag (EST) libraries from different tissues. HEOE is an online resource that offers the possibility to simultaneously visualize data from different libraries, allowing direct comparison of the distribution of ontologies in user-selected tissues. Pathway-oriented, ontology-oriented and statistics search pages are available.
Handles both microarray expression data and sequence assembly and annotation data. ANEXdb is an open-source web application that supports integrated access of two databases that house microarray expression (ExpressDB) and EST annotation (AnnotDB) data. Although ANEXdb currently houses porcine-specific data, it has been designed to be species independent. ANEXdb can be easily customized for other species by populating the databases with the relevant annotation and expression data from a variety of platforms such as Affymetrix GeneChips in other species or custom arrays.
Provides expressed sequence tags (ESTs) currently available from NCBI (GenBank/EMBL/DDBJ) and cassava mRNA sequences and their annotations. Cassava Online Archive simplifies cassava genomics research and contributes to molecular breeding. It supports searches by sequence similarity (BLAST), accession number and gene function. The annotations collected in this database come from several protein databases.
A repository of transcriptome data including the sequences and the expression profiles of barley genes resulting from microarray analysis. The bex-db should provide a useful resource for further genomics studies and development of genome-based tools to enhance the progress of the genetic improvement of cereal crops.
Collects genomic and complementary DNA (cDNA) data. Ghost Database is an online resource for researchers seeking information on comparative and evolutionary genomics. The platform is organized into several tabs: genome browser, BLAST, search, statistics and download. Users can find information by specifying characteristics such as gene expression patterns or cDNA cluster.
Contains genetic information on miniature tomato cultivar Micro-Tom. MiBASE is a database that provides information on expressed sequence tag (EST) sequences, EST annotations, full-length cDNA clones, UNIGENEs, single nucleotide polymorphisms (SNPs) between other tomato inbred lines, simple sequence repeats (SSRs), gene ontology (GO) terms, metabolic pathway names, gene expressions, and sequence similarities with other plant genes.
Represents a collection of Clonorchis sinensis ESTs that is intended as a resource for parasite functional genomics. ClonorESTdb enables the researcher to identify and compare expression signatures under different biological stages and promotes ongoing parasite drug and vaccine development and biological research.
Provides a unified view of the human transcriptome. WikiCell is an open and public platform dedicated to the annotation of the human transcriptome. Researchers can contribute transcriptome data, including ESTs and annotations. The wiki format allows authors to create and edit any number of interlinked webpages. Based on the anatomy of the human body, the logical structure traces out an image of refined classification from nine major systems to cell level, and includes both physiological and pathological transcriptome data. WikiCell can be searched by organ, tissue, and cell type, as well as GenBank accession number.