An innovative online platform to help you interpret your results, prioritize targets and biomarkers, identify correlated genes, discover novel disease-specific genes, or simply to explore the world's gene expression data. The search engine processes measurement data from 17 organisms.
Allows users to compare two or more groups of Samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions. GEO2R performs comparisons on original submitter-supplied processed data tables using the GEOquery and limma R packages from the Bioconductor project. This tool provides a simple interface that allows users to perform R statistical analysis without command line expertise. Results are presented as a table of genes ordered by significance.
Recognizes reliable gene expression markers. MERGE employs a principled way of integrating multi-omic prior information relevant to disease processes. It integrates prior information on genes’ relevance in order to prioritize gene-drug associations. This method increases the chance that the identified gene-drug associations are replicated in validation data. It gives the potential to make novel discoveries about molecular markers.
Identifies genes showing similar expression or response profiles from selected databases. Expression Angler employs the Pearson correlation coefficient to identify co-regulated genes. It provides powerful means to query the data. This tool can recognize marker genes for genotoxic stress and different kinds of pathogen response. It uses the pattern to search for genes with similar expression profiles.
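As a rough illustration of correlation-based co-expression search, the sketch below ranks genes by Pearson correlation with a query gene's profile. It is not Expression Angler's code; the function name `rank_by_pearson`, the gene labels and the toy matrix are all hypothetical.

```python
import numpy as np

def rank_by_pearson(expr, gene_names, query):
    """Rank all other genes by Pearson correlation with the query gene.

    expr: 2-D array with one row per gene and one column per sample.
    """
    q = gene_names.index(query)
    # np.corrcoef treats each row as a variable, giving a gene-by-gene matrix.
    corr = np.corrcoef(expr)[q]
    order = np.argsort(-corr)  # highest correlation first
    return [(gene_names[i], float(corr[i])) for i in order if i != q]

# Toy data (hypothetical): geneB rises with geneA, geneC falls as geneA rises.
genes = ["geneA", "geneB", "geneC"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
])
ranked = rank_by_pearson(expr, genes, "geneA")
```

The most strongly co-expressed gene appears first in `ranked`, and anti-correlated genes sink to the end of the list.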
An interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100,000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology.
A web-based application that allows researchers to compare a query set of genes, e.g. a set of over- and under-expressed genes, against a signature database built from GEO datasets for different organisms and platforms.
A web-based tool that can query a database comprising ∼4300 microarrays, representing human gene expression in normal tissues, cancer cell lines and primary tumors. MERAV has been designed as a powerful tool for whole genome analysis which offers multiple advantages: one can search many genes in parallel; compare gene expression among different tissue types as well as between normal and cancer cells; download raw data; generate heatmaps; and use its internal statistical tool. Most importantly, MERAV has been designed as a unique tool for analyzing metabolic processes, as it includes matrices specifically focused on metabolic genes and is linked to the Kyoto Encyclopedia of Genes and Genomes pathway search.
Enables the linked, all-in-one visualization of genes and samples across the whole brain and genome, and across developmental stages. BrainScope is a web portal for fast, interactive visual exploration of the Allen Atlases of the adult and developing human brain transcriptome. It helps neurologists gain a deeper understanding of the interactions between brain anatomy and molecular function.
Facilitates the retrieval of lung cell-specific gene expression information from extensive data sets derived from RNA sequencing of single cells. LungGENS is a web-based bioinformatics resource that lets users query single-cell gene expression databases by entering a gene symbol or a list of genes, or by selecting a cell type of interest. It also integrates the data with previous RNA expression studies from mouse lung at various developmental times.
A package that allows access to the wealth of information within GEO directly from Bioconductor, eliminating many formatting and parsing problems that have made such analyses labor-intensive in the past. The primary goal of GEOquery is to download and parse the SOFT format files from GEO, maintaining all of the information contained in the GEO records. The design of GEOquery makes accessing data from GEO very simple: only one command, getGEO, is needed. GEOquery provides a bridge between the Bioconductor analysis tools and the vast public data resources contained in the NCBI GEO repositories. By maintaining the full richness of the GEO data rather than focusing on getting only the ‘numbers’, it is possible to integrate GEO data into Bioconductor data structures and to perform analyses on that data quite quickly and easily or to export the data into any number of formats for use by other tools or for local storage and data mining.
Provides an alternative, yet much more flexible and efficient, set of tools for both online and programmatic access to Gene Expression Omnibus (GEO) metadata. GEOmetadb was developed in an attempt to make querying the GEO metadata both easier and more effective. It includes a web-based query engine with several convenient utilities and a Bioconductor package, also called GEOmetadb, which queries a locally installed GEOmetadb SQLite database that is updated regularly and supplied for download; each can be used independently of the other.
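To give a feel for querying series metadata held in a local SQLite file, here is a minimal sketch using Python's standard sqlite3 module. The table layout, column names and accession strings are toy assumptions made for illustration and should not be read as the actual GEOmetadb schema; the helper `search_titles` is likewise hypothetical.

```python
import sqlite3

# In-memory stand-in for a locally installed metadata database.
# Schema and rows are hypothetical, not the real GEOmetadb layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gse (gse TEXT PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO gse VALUES (?, ?)",
    [
        ("GSE_TOY_1", "Toy breast cancer expression series"),
        ("GSE_TOY_2", "Toy lung development time course"),
    ],
)

def search_titles(conn, keyword):
    """Return (accession, title) pairs whose title contains the keyword."""
    cursor = conn.execute(
        "SELECT gse, title FROM gse WHERE title LIKE ? ORDER BY gse",
        (f"%{keyword}%",),  # parameterized to avoid SQL injection
    )
    return cursor.fetchall()

hits = search_titles(conn, "lung")
```

Keeping the metadata in a local file means such keyword searches run without any network round trip once the database has been downloaded.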
A deep learning method to infer the expression of target genes from the expression of landmark genes. We used the microarray-based GEO dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms linear regression with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than linear regression in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2,921 expression profiles. Deep learning still outperforms linear regression with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes.
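The two comparison metrics quoted above can be sketched as follows. This assumes relative improvement is defined as the reduction in gene-averaged mean absolute error divided by the baseline error, which the description does not spell out; the function name `compare_models` and the toy arrays are hypothetical.

```python
import numpy as np

def compare_models(y_true, pred_baseline, pred_new):
    """Compare two predictors on a samples-by-genes expression matrix.

    Returns (relative improvement in gene-averaged MAE,
             fraction of genes where the new model has strictly lower MAE).
    """
    mae_base = np.abs(pred_baseline - y_true).mean(axis=0)  # one MAE per gene
    mae_new = np.abs(pred_new - y_true).mean(axis=0)
    # Assumed definition: (baseline error - new error) / baseline error.
    rel_improvement = (mae_base.mean() - mae_new.mean()) / mae_base.mean()
    frac_better = (mae_new < mae_base).mean()
    return float(rel_improvement), float(frac_better)

# Toy example: 2 samples (rows) by 2 target genes (columns).
y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
pred_baseline = np.array([[2.0, 3.0], [4.0, 5.0]])
pred_new = np.array([[1.5, 2.0], [3.5, 6.0]])
rel, frac = compare_models(y_true, pred_baseline, pred_new)
```

Reporting both numbers is useful because a model can improve the average error while still being worse on a subset of genes.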
Retrieves the genes associated with psychiatric diseases. psygenet2r lets users study the association between a disease of interest and PsyGeNET diseases based on shared genes. This tool retrieves several kinds of content stored in the PsyGeNET database. It also performs comorbidity studies with PsyGeNET's and the user's data. It provides a gene-disease association (GDA) network that identifies diseases and genes.
Provides an environment for downloading functional genomics data from Gene Expression Omnibus (GEO), parsing the information into a local or remote database, and interacting with the database using dedicated R functions, thus enabling seamless integration with other tools available in R/Bioconductor. The compendiumdb package consists of a number of R functions to access this database either locally or remotely. The database schema has been designed to be rich enough to store information provided by MIAME-compliant expression databases such as GEO. The package provides R functions to (i) download data from GEO given the identifier of the experiment, (ii) load the expression data, sample and probe annotation to the relational database, and (iii) convert experimental data from the database to an R/Bioconductor ExpressionSet.
Maps spatiotemporal brain data to vector graphic diagrams of the human brain. cerebroViz allows rapid generation of publication-quality figures that highlight spatiotemporal trends in the input data, while striking a balance between usability and customization. cerebroViz is generalizable to any data quantifiable at a brain-regional resolution and currently supports visualization of up to thirty regions of the brain found in databases such as BrainSpan, GTEx and Roadmap Epigenomics.
Mines high-throughput gene expression data to identify genes and pathways of interest to answer biological questions relevant to diverse fields of study. ScanGEO uses a database query to identify relevant Gene Expression Omnibus (GEO) data sets, then reanalyzes each qualifying data set for differential expression of a custom list of genes or all genes in a Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway.
Converts the expression changes of the landmark genes into a perturbation barcode that reveals important features of the underlying data. Siamese uses deep learning techniques. The barcode captures compound structure and target information, and predicts a compound’s high throughput screening promiscuity, to a higher degree than the original data measurements. The software uncovers underlying factors of the expression data that are otherwise entangled or masked by noise.
Enables interactive query and navigation of transcriptome datasets relevant to human immunology research. GXB can be used to browse context-rich systems-scale data within and across systems immunology studies. It can handle the large collections of datasets generated through systems-scale profiling approaches. The tool can be used to display other data types, such as protein or cellular measurements, regardless of whether they are high dimensional.
A web tool for searching gene expression data. It allows users to search GEO using gene-expression signatures or gene expression ratio data as a query, and retrieves gene expression data by comparing the gene-expression pattern of the query with the patterns in GEO.
Stores and facilitates search of RNA-Seq based expression profiles available from the modENCODE consortium and other public data sets. DGET provides a flexible tool for expression data retrieval and analysis with short or long lists of Drosophila genes, which can help scientists to design stage- or tissue-specific in vivo studies and do other subsequent analyses. Using DGET, researchers are able to look up gene expression profiles, filter results based on threshold expression values, and compare expression data across different developmental stages, tissues and treatments.
A fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. GEMINI enables users to identify similar profiles independent of sample label, data origin or other meta-data information. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an O(log n) expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec.
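The vantage-point tree idea can be sketched in a few lines. This is a generic textbook-style version using Euclidean distance, not GEMINI's implementation; the names `euclid`, `VPNode`, `build` and `nearest` and the random test points are hypothetical.

```python
import math
import random

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point, self.radius = point, radius
        self.inside, self.outside = inside, outside

def build(points, rng):
    """Recursively split points by distance to a randomly chosen vantage point."""
    if not points:
        return None
    points = list(points)
    vp = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vp, 0.0, None, None)
    dists = [euclid(vp, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]  # median distance to the vantage point
    inside = [p for p, d in zip(points, dists) if d < radius]
    outside = [p for p, d in zip(points, dists) if d >= radius]
    return VPNode(vp, radius, build(inside, rng), build(outside, rng))

def nearest(node, query, best=None):
    """Return (distance, point) of the nearest stored point to query."""
    if node is None:
        return best
    d = euclid(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    if d < node.radius:
        best = nearest(node.inside, query, best)
        # Visit the far side only if the current search ball crosses the boundary.
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, best)
    else:
        best = nearest(node.outside, query, best)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, best)
    return best

rng = random.Random(0)
points = [tuple(rng.random() for _ in range(5)) for _ in range(200)]
tree = build(points, rng)
query = tuple(rng.random() for _ in range(5))
dist, match = nearest(tree, query)
```

The triangle inequality is what allows whole subtrees to be skipped, which is where the sub-linear query time comes from when the data are favorable.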
Allows users to process and analyze RNA-seq data. Serves for transcriptomic profiling as a clinically oriented application. TED toolkit is an application divided into several modules: (1) the first module provides quality control of the RNA-seq data as a preprocessing step; (2) the second module analyzes differential expression of coding genes, non-coding genes and novel isoforms; and (3) the third module transforms the analysis results produced by the second module into detailed, biologically interpreted annotated reports.
Provides multiple channels of processed RNA-seq data from GEO/SRA. ARCHS4 supports retrospective data analyses and reuse. It can be used to predict gene function and protein-protein interactions. This tool facilitates rapid progress of retrospective post-hoc focal and global analyses. The platform offers a three-dimensional data viewer that lets users gain intuition about the global space of gene expression data.
Enables re-processing and re-analysis of Gene Expression Omnibus (GEO) RNA-seq data. GREIN is a web application composed of a back-end computational pipeline (GREP2) for uniform processing of RNA-seq data and several already processed data sets. The software offers features such as sub-setting and downloading of processed data, interactive visualization, statistical power analyses, construction of differential gene expression signatures and their comprehensive functional characterization, or connectivity analysis with LINCS L1000 data.
Annotates many perturbation experiments from NCBI Gene Expression Omnibus (GEO) in a semi-automated fashion with full user control. GEOracle follows the same steps a bioinformatician would employ when analysing perturbation data on GEO. After annotation, it performs differential expression analysis to identify gene targets of the perturbation agent. An interface guides the user through the entire process and allows the user to manually adjust and verify all details of the predicted GSM labels and pairings.
Identifies corresponding cell types across species. C3 is based on a cross-species gene set analysis method that relieves the false positive bias while still maintaining good statistical power for gene sets affected by highly complex homology structures. It is able to prioritize identification of the correct corresponding cell type as the most significant hit. This tool can be used to find an unknown cell type from a potentially poorly characterized organism.
Aims to reduce potential errors made by curators during the data collection process. GEOMetaCuration removes mechanical steps from the curation process to improve curation productivity for a large amount of metadata. It supports Gene Expression Omnibus (GEO) metadata. This tool can assist users in the computational creation of biological insights using large-scale biological data.
Enables visualization, exploration of genes and cancers, and pinpointing of transcription factors (TFs) and miRNAs from the Cancer Genome Atlas (TCGA). EDGE in TCGA aims to investigate possible novel insights about 17 different cancers. It proposes features for identifying relevant miRNAs, methylation sites, somatic mutations as well as copy-number alterations involved in gene expression and can also be used for comparing relative effects of genetic and epigenetic drivers of gene expression.
Performs processing of RNA-seq data of human, mouse, and rat from the Gene Expression Omnibus (GEO). GREP2 is a pipeline that consists of several steps: (1) retrieval of metadata from GEO and the MetaSRA project, (2) downloading and processing of corresponding experiment run files from the Sequence Read Archive (SRA), (3) quantification of transcript abundances, and (4) compilation and organization of quality control (QC) reports. GREP2 is also the backend pipeline of the web application GREIN.
A web application that allows a user to download gene expression data sets directly from GEO in order to perform differential expression and survival analysis for a gene of interest. In addition, shinyGEO supports customized graphics, sample selection, data export, and R code generation so that all analyses are reproducible. The availability of shinyGEO makes GEO datasets more accessible to non-bioinformaticians, promising to lead to better understanding of biological processes and genetic diseases such as cancer.
A web server that allows for querying the Gene Expression Omnibus based on genome-wide patterns of differential expression. Using a novel, content-based approach, ProfileChaser retrieves expression profiles that match the differentially regulated transcriptional programs in a user-supplied experiment. This analysis identifies statistical links to similar expression experiments from the vast array of publicly available data on diseases, drugs, phenotypes and other experimental conditions.