
Prognostic gene expression biomarker validation software tools | Transcription data analysis

Survival analysis that uses gene expression to derive predictive gene signatures is common in research studies employing high-throughput genomic data. Gene signatures predictive of overall, relapse-free, or metastasis-free survival are popular; new signatures are published regularly, and the underlying data are submitted to public repositories. Publicly available data from such studies can be leveraged to identify prognostic markers in different cancer types.

Source text:
(Goswami and Nakshatri, 2014) PROGgeneV2: enhancements on the existing database. BMC Cancer.

Provides a web resource for exploring, visualizing, and analyzing multidimensional cancer genomics data. cBioPortal reduces molecular profiling data from cancer tissues and cell lines into readily understandable genetic, epigenetic, gene expression, and proteomic events. The query interface, combined with customized data storage, enables researchers to interactively explore genetic alterations across samples, genes, and pathways and, when available in the underlying data, to link these to clinical outcomes.
A multi-tiered compendium of bioinformatics algorithms and gene signatures for molecular subtyping and prognostication in breast cancer. Genefu provides bioinformatics implementations of classification algorithms to identify molecular subtypes, as well as prognostic predictors along with their published gene signatures. It also includes other functions to facilitate quick manipulation of gene expression datasets, including gene selection and probe-gene mapping across microarray platforms. The genefu package provides a unified framework for integration of molecular subtype and survival analysis of breast cancer. We have demonstrated how the package can be utilized to perform both meta-analyses across datasets and across algorithms, to facilitate integrated analysis of breast cancer gene expression profiles.
RSF / Random Survival Forests
RSF provides a unified treatment of Breiman's random forests for survival, regression, and classification problems across a variety of data settings. Regression forests are grown when the response is numeric and classification forests when it is categorical. Multivariate regression and classification responses, mixed outcomes, and unsupervised forests are also handled. Different splitting rules, invoked under deterministic or random splitting, are available for all families. Variable predictiveness can be assessed using variable importance measures for single as well as grouped variables. Missing data can be imputed on both training and test data.
Performs survival analyses and draws Kaplan–Meier (KM) plots for a submitted microRNA (miR) across several available data sets, which cover more than 800 patients. A robust statistical procedure is implemented to account for multiple testing. MIRUMIR is incorporated into BioProfiling.de, an analytical portal for high-throughput cell biology. MIRUMIR helps biomedical researchers estimate the power of a miR to serve as a potential biomarker for predicting the survival of cancer patients. It provides such analyses based on several publicly available clinical miR data sets annotated with patient survival information.
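The description above does not say which multiple-testing procedure MIRUMIR uses. As an illustration only, the following sketch shows the widely used Benjamini-Hochberg adjustment applied to p-values from several data sets; the function name and the toy p-values are assumptions, not part of the tool.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Illustrative only: the tool's actual correction procedure is not
    specified in the description.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone (each is capped by the one ranked above it).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical log-rank p-values for one miR across four data sets.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A miR would then be reported as significant in a data set only if its adjusted p-value falls below the chosen false discovery rate.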
An online tool for statistical validation of hypotheses regarding the effect of p53 mutational status on gene regulation in cancer. p53MutaGene is based on several large-scale clinical gene expression data sets and currently covers breast, colon and lung cancers. For user-specified genes, the tool detects differential co-expression patterns between p53-mutated and p53-normal samples. Statistically significant differential co-expression of a gene pair indicates that the regulation of the two genes is sensitive to the presence of p53 mutations. p53MutaGene can be used in 'single mode', where the user tests a specific pair of genes, or in 'discovery mode', which is designed for the analysis of several genes.
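One common way to test differential co-expression of a gene pair between two sample groups is to compare the two Pearson correlations with Fisher's z-transformation. The sketch below illustrates that general idea; it is not taken from p53MutaGene, whose exact statistic is not given here, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists of expression values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_coexpression(r_mut, n_mut, r_wt, n_wt):
    """Compare two correlations with Fisher's z-transformation.

    r_mut, r_wt : gene-pair correlation in mutated / normal samples
    n_mut, n_wt : number of samples in each group (each must exceed 3)
    Returns (z, two-sided p-value under a normal approximation).
    """
    z_mut, z_wt = math.atanh(r_mut), math.atanh(r_wt)
    se = math.sqrt(1.0 / (n_mut - 3) + 1.0 / (n_wt - 3))
    z = (z_mut - z_wt) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Hypothetical correlations: strong in mutated samples, absent in normal.
    print(differential_coexpression(0.8, 103, 0.0, 103))
```

In a discovery-style analysis over several genes, the resulting p-values for all gene pairs would additionally need a multiple-testing correction.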
Builds a risk prediction signature for a specific stratum by down-weighting the observations from the other strata using a range of weights. CoxBoost actively controls the extent to which each stratum contributes to variable selection and to the estimation of regression coefficients. It was designed to identify clusters of variables that either are important only in the stratum of interest or are also important to some extent in the other strata.
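To make the down-weighting idea concrete, here is a minimal sketch of a weighted Cox partial log-likelihood in which each observation carries a weight (for example 1 in the stratum of interest and a smaller value elsewhere). This is a generic textbook form, assumed for illustration; it is not the CoxBoost implementation, and the function name and toy data are hypothetical.

```python
import math


def weighted_cox_loglik(beta, X, times, events, weights):
    """Weighted Cox partial log-likelihood (Breslow-style handling of ties).

    beta    : list of regression coefficients
    X       : list of covariate rows, one per patient
    times   : follow-up times
    events  : 1 if the event was observed, 0 if censored
    weights : per-patient weights; 0 removes a patient entirely
    """
    # Linear predictor for every patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i or weights[i] == 0:
            continue  # censored or fully down-weighted: no contribution
        # Weighted sum over the risk set: patients still under follow-up at t_i.
        risk = sum(w * math.exp(e)
                   for w, e, t in zip(weights, eta, times) if t >= t_i)
        loglik += weights[i] * (eta[i] - math.log(risk))
    return loglik


if __name__ == "__main__":
    X = [[1.0], [2.0], [3.0]]
    times, events = [1, 2, 3], [1, 1, 1]
    # Third patient belongs to another stratum and is down-weighted.
    print(weighted_cox_loglik([0.1], X, times, events, [1.0, 1.0, 0.3]))
```

With all weights equal to 1 this reduces to the ordinary partial log-likelihood; scanning a range of weights between 0 and 1 moves gradually from a stratum-specific fit to a pooled fit.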
GOBO / Gene expression-based Outcome for Breast cancer Online
A user-friendly online tool that allows rapid assessment of gene expression levels, identification of co-expressed genes and association with outcome for single genes, gene sets or gene signatures in an 1881-sample breast cancer data set. Moreover, GOBO offers the possibility of investigation of gene expression levels in breast cancer subgroups and breast cancer cell lines for gene sets, as well as creation of potential metagenes based on iterative correlation analysis to a prototype gene. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform in the form of precompiled R-data sets.
Allows researchers to use publicly available data to study the prognostic implications of genes of interest in multiple cancers. PROGgene is a survival analysis tool for biomarker identification backed by a large repository of public datasets. Users can also upload their own gene expression datasets. The software conducts on-the-fly survival analysis and creates Kaplan–Meier (KM) survival plots based on the expression of user-supplied genes in user-selected datasets from multiple cancers.
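The Kaplan–Meier analysis described above can be sketched in a few lines. The example splits patients at the median expression of a gene and computes a survival curve per group; the median split, the function names, and the toy data are assumptions made for illustration and do not reproduce PROGgene's own code or options.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def split_by_median(expression, times, events):
    """Split patients into low (<= median) and high (> median) expressers."""
    cut = median(expression)
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cut]
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cut]
    return low, high


if __name__ == "__main__":
    # Hypothetical expression values and follow-up (months) for six patients.
    expression = [2.1, 3.5, 7.8, 8.2, 1.9, 9.0]
    times = [30, 28, 6, 9, 25, 4]
    events = [0, 1, 1, 1, 0, 1]
    low, high = split_by_median(expression, times, events)
    for name, group in (("low", low), ("high", high)):
        t, e = zip(*group)
        print(name, kaplan_meier(t, e))
```

Plotting each returned curve as a step function gives the familiar KM plot; a log-rank test between the two groups would typically accompany it.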
BioPlat / Biomarkers Platform
A user-friendly open-source bioinformatic resource that provides a set of analytic tools for the discovery and in silico evaluation of novel prognostic and predictive cancer biomarkers, based on the integration and re-use of gene expression signatures in the context of follow-up data. The desktop client app is now supported by a dedicated web server for the statistical and computational analysis of very large databases. Furthermore, we have refurbished its graphical interface, added new visualization tools and upgraded the BioPlat databases. BioPlat facilitates the integration, analysis, validation and feature selection of gene signatures derived from different databases in the context of follow-up data obtained from publicly available gene expression profiling repositories.
Automatically derives the currently known interactome for a gene of interest and correlates the expression levels of that interactome with survival outcome in more than 40 publicly available clinical expression data sets, covering various tumours and about 8000 patients in total. To derive the query gene interactome, PPISURV employs several public databases, including protein-protein interactions, regulatory and signalling pathways, and protein post-translational modifications.
A Bayesian ensemble method for survival prediction in high-dimensional gene expression data. This non-parametric method incorporates both additive and interaction effects between genes, which results in high predictive accuracy compared with other methods. In addition, SurvBART provides model-free variable selection of important prognostic markers based on controlling the false discovery rates; thus providing a unified procedure to select relevant genes and predict survivor functions.
A package that predicts true survival times for the individual patient based on microarray measurements. RCASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with high dimensionality (large number of genes) and low sample size settings, that are typical for microarray measurements. This enables RCASPAR to automatically select small, most informative subsets of genes for prediction.
A simple algorithm for classification and discrimination. One reason why más-o-menos is comparable with more sophisticated methods such as penalized regression may be that we often use a prediction model trained on one set of patients to discriminate between subgroups in an independent sample, usually collected from a slightly different population and processed in a different laboratory. más-o-menos should be useful for developing prediction models from high-dimensional data in any situation where the covariates are sufficiently correlated and the true effect is roughly linear.
APPEX / Analysis Platform for identification of Prognostic gene EXpression signature in cancer
A web-based software platform to help researchers in the efforts to identify prognostic signatures from genomics data. APPEX is designed to be easy to use and flexible, and it is freely available for advanced statistical survival analyses. A user-friendly graphical interface similar to a desktop application is provided so that users can easily handle their own data on APPEX even if they are not familiar with statistical analysis packages. In addition, APPEX contains >200 publicly available datasets directly applicable on the system so that users can easily validate newly identified signatures in independent patient cohorts.
QPATH / Quantitative methods for pathology
Predicts the molecular subtypes of colorectal cancer from routine histology images. QPATH uses neural networks to extract local descriptors, which are then used to construct a dictionary-based representation of each tumor sample. Classification relies on support vector machine (SVM) models with radial basis function kernels. The tool can identify at least four of the five subtypes with high confidence; subtype E is the most difficult to recognize.
Allows users to develop and validate a prediction model, estimate the expected survival of patients, and visualize the results graphically. biospear is an R package that implements approaches to develop and evaluate a prediction model within a high-dimensional Cox regression setting, and to estimate expected survival at a given time point. The software consists of two core functions: BMsel, which identifies a prediction model, and expSurv, which estimates expected survival. biospear also provides a function for generating survival data.
ePCR / Ensemble-based Penalized Cox Regression
Supports survival and time-to-event prediction. The ePCR model makes use of inter-variable interactions and advanced multi-variable machine learning to identify marker combinations. The tool helps users apply the prognostic model to real-world prostate cancer patient cohorts. It is designed for clinical trial data from docetaxel-treated mCRPC patients and for heterogeneous hospital registry cohorts of advanced prostate cancer patients.
Provides functions for facilitating survival analysis and visualization. survminer is an R package that provides functions to (i) draw survival curves with the 'number at risk' table, the cumulative number of events table and the cumulative number of censored subjects table, (ii) arrange multiple ggsurvplots on the same page, (iii) plot the distribution of event times, and (iv) determine the optimal cut-point for one or multiple continuous variables at once, among others.
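The cut-point search mentioned in (iv) can be illustrated with a plain scan: try each observed value as a threshold and keep the one that maximizes the two-group log-rank statistic. survminer itself is an R package and relies on maximally selected rank statistics; the Python sketch below substitutes a simple log-rank scan to show the idea, and all function names and data are hypothetical.

```python
def logrank_chi2(times, events, in_group1):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    o_minus_e = 0.0
    var = 0.0
    for t in event_times:
        # Patients at risk just before t, overall and in group 1.
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, in_group1) if x >= t and g)
        # Events occurring exactly at t, overall and in group 1.
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, in_group1)
                 if x == t and e and g)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0


def best_cutpoint(values, times, events, min_group=2):
    """Scan observed values; return (cut, statistic) with the largest statistic.

    Patients with value > cut form the 'high' group. Cuts leaving fewer
    than min_group patients on either side are skipped.
    """
    best = (None, -1.0)
    for cut in sorted(set(values)):
        high = [v > cut for v in values]
        n_high = sum(high)
        if n_high < min_group or len(values) - n_high < min_group:
            continue
        stat = logrank_chi2(times, events, high)
        if stat > best[1]:
            best = (cut, stat)
    return best


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical expression
    times = [30, 28, 26, 24, 6, 5, 4, 3]       # follow-up in months
    events = [1] * 8
    print(best_cutpoint(values, times, events))
```

Because the threshold is chosen to maximize the statistic, the unadjusted p-value at the selected cut-point is optimistic; this is the reason dedicated maximally selected statistics apply a correction.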
A simple user-friendly tool for examining putative gene/miRNA prognostic markers in breast cancer. BreastMark combines gene expression data from multiple microarray experiments which frequently also contain miRNA expression information, and detailed clinical data to correlate outcome with gene/miRNA expression levels. This algorithm integrates gene expression and survival data from 26 datasets on 12 different microarray platforms corresponding to ~17,000 genes in up to 4,738 samples. It also allows us to examine the prognostic potential of 341 microRNAs. The value of BreastMark is both in the simplicity of its design and the robustness of its approach. It is designed with non-bioinformatic research groups in mind and is of great value in the preliminary assessment of putative biomarkers in breast cancer as a whole and within its molecular subtypes.
An artificial neural network (ANN) framework to predict patient prognosis from high-throughput transcriptomics data. Cox-nnet computes feature importance scores based on the partial derivatives of gene features selected by the model, so that the relative importance of the genes to prognosis outcome can be directly assessed. The hidden layer node structure of the ANN can be harnessed to reveal much richer information on the featured genes and biological pathways than other methods. Cox-nnet is a survival analysis method that combines excellent predictive power with the ability to uncover biological functions related to prognosis.
Enables researchers to study the expression level of genes and to compare primary tumor with normal tissue samples. UALCAN is an interactive web resource that provides information and graphics on expression patterns specific to stage, grade, race, and other sub-status features, based on transcriptome sequencing data. This portal can aid in the identification of candidate biomarkers of specific cancer subclasses, with diagnostic, prognostic or therapeutic implications. It can also be used as a platform for in silico validation of target genes.
sPAGM / subPathway Activity by integrating Gene and MiRNA
Deduces subpathway functional activity for single samples. sPAGM provides a method that determines subPathway Activity (sPA) scores from the expression levels of genes and miRNAs in the corresponding subpathway graphs. It can be used for characterizing biological mechanisms at the subpathway level and for detecting and characterizing functional signatures in the prognosis of cancer patients. The activity scores generated by the method can also be applied in other research areas, such as the study of drug action mechanisms in various types of tumors.
Low rank approximation based multi-omics data clustering (LRAcluster) is a method to discover molecular subtypes by detecting the low-dimensional intrinsic space of high-dimensional cancer multi-omics data. The low-rank constraint is the core that generates the low-dimensional representation of the original data, and the convexity of the regularized likelihood function allows an efficient gradient-descent algorithm for optimization. Extensive experiments show that LRAcluster is computationally efficient with high accuracy and thus suitable for large-scale cancer multi-omics studies.
Predicts prognostic outcome and identifies biomarkers for different human cancers. ENCAPP is an elastic-net-based approach that combines the reference human protein interactome network with gene expression data. The software identifies functional modules that are differentially expressed between patients with good and bad prognosis and uses these to fit a regression model that can be used to predict prognosis for breast, colon, rectal, and ovarian cancers. It can also accurately identify genes that can serve as prognostic biomarkers for different cancers.
Provides a powerful platform for evaluating potential tumor markers and therapeutic targets. PrognoScan is 1) a large collection of publicly available cancer microarray datasets with clinical annotation, as well as 2) a tool for assessing the biological relationship between gene expression and prognosis. PrognoScan searches the relation between gene expression and patient prognosis such as overall survival (OS) and disease free survival (DFS) across a large collection of publicly available cancer microarray datasets.
A versatile free tool to perform validation of multi-gene biomarkers for gene expression in human cancers. We generated a cancer database collecting more than 20,000 samples and 130 datasets with censored clinical information, covering tumors from over 20 tissues. We implemented a web interface to perform biomarker validation and comparisons in this database, where a multivariate survival analysis can be accomplished in about one minute. SurvExpress is a valuable and comprehensive web tool and cancer database with clinical outcomes, tailored to rapidly evaluate gene expression biomarkers.
Detects network biomarkers using network-constrained support vector machines (NetSVM). CyNetSVM predicts the clinical outcome of patients and identifies biologically meaningful networks. It includes a graphical user interface (GUI) that lets users analyze large-scale biomedical data efficiently, and it generates a network view of the identified biomarkers in Cytoscape. This tool is useful for studying breast cancer data for clinical outcome prediction and network biomarker identification.