
limma / Linear Models for Microarray Data

Provides an integrated solution for analysing data from gene expression experiments. limma contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. It also contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions: (i) it can perform both differential expression and differential splicing analyses of RNA-seq data; (ii) the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences.


Serves for the functional analysis of gene expression and genomic data. Babelomics offers the possibility to explore the effects of alteration in gene expression levels or changes in genes sequences within a functional context. It provides user-friendly access to a full range of methods that cover: (1) primary data analysis; (2) a variety of tests for different experimental designs; and (3) different enrichment and network analysis algorithms for the interpretation of the results of such tests in the proper functional context.


Allows the analysis of multiple time course transcriptomics data. maSigPro is a regression based approach to find genes for which there are significant gene expression profile differences between experimental groups in time course microarray and RNA-Seq experiments. The software incorporates a clustering function to visualize genes with similar profiles. maSigPro was initially developed for microarrays and later updated to model count data. It includes Iso-maSigPro, a functionality to study differential isoform usage in time course RNA-seq experiments.


Infers cell type-specific expression based on co-expression similarity with known cell type marker genes. CellMapper is an R package that makes accurate predictions from publicly available expression data. It was developed to obtain the gene expression profiles unique to individual cell types, and it remains effective for cell types that have never been isolated, providing an opportunity to fill gaps in available expression data.


Analyzes genome-wide expression patterns in one experiment at a time. T-profiler is a web application that uses the t-test to score the difference between the mean expression level of predefined groups of genes and that of all other genes on the microarray. The consensus motifs used to define gene groups derive from three sources: (i) motifs extracted from the Saccharomyces cerevisiae Promoter Database (SCPD), (ii) motifs found by comparing the genome sequences of highly related yeast species, and (iii) motifs discovered from various microarray experiments using the REDUCE algorithm.
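The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration, not T-profiler's own code: the function name `group_t_score`, the dictionary input, the example values, and the choice of Welch's unequal-variance form of the t-statistic are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def group_t_score(expression, group):
    """t-statistic comparing genes in `group` against all other genes.

    `expression` maps gene name -> expression value (e.g. a log-ratio)
    from a single experiment. Using Welch's form of the statistic is an
    assumption of this sketch.
    """
    inside = [value for gene, value in expression.items() if gene in group]
    outside = [value for gene, value in expression.items() if gene not in group]
    if len(inside) < 2 or len(outside) < 2:
        raise ValueError("need at least two genes inside and outside the group")
    standard_error = sqrt(
        variance(inside) / len(inside) + variance(outside) / len(outside)
    )
    return (mean(inside) - mean(outside)) / standard_error


# Hypothetical log-ratios for eight genes; g1-g3 form the predefined group.
example = {"g1": 2.0, "g2": 2.2, "g3": 1.8, "g4": 0.1,
           "g5": -0.1, "g6": 0.0, "g7": 0.2, "g8": -0.2}
score = group_t_score(example, {"g1", "g2", "g3"})
```

A large positive score indicates that the group's mean expression lies well above that of the remaining genes; a large negative score indicates the opposite.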

betr / Bayesian Estimation of Temporal Regulation

A package to identify differentially expressed genes in microarray time-course data. BETR explicitly uses the time-dependent structure of the data, employing an empirical Bayes procedure to stabilize estimates derived from the small sample sizes typical in microarray experiments. It is applicable to one- or two-color replicated microarray data, and can be used to detect differences between two conditions or changes from baseline in a single condition. BETR outperforms three commonly used techniques in the analysis of time-course data. This advantage is particularly noticeable for genes with a small but sustained differential expression signal. When the magnitude of differential expression is similar to that of the background noise, the signal is difficult to identify by examining each time point in isolation. Such patterns become easier to identify when the time series structure of the data is taken into account: a small, noisy signal becomes identifiable if it is sustained across several adjoining time points.

GNEA / Gene Network Enrichment Analysis

Aims to identify biological processes that are consistently deregulated across a broad set of microarray experiments associated with different disease models in both animal and human tissues. GNEA consists of five steps: (1) assemble a collection of gene sets associated with biological processes or signalling pathways of interest; (2) assume an underlying model of cellular processes using a global protein–protein interaction network; (3) evaluate the hypothesis that genes in a given gene set are observed in a higher proportion (i.e., enriched) than expected by chance in the high-scoring subnetwork (HSN), and repeat for each gene set in the assembly; (4) order the gene sets of interest based on the number of different HSNs where they appear enriched; (5) for each gene set, assign a p-value to the number of conditions where it is enriched.
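Steps (3) and (4) can be illustrated with a short Python sketch. The description does not say which enrichment test GNEA applies, so the one-sided hypergeometric test used here is an assumption, as are the function names, the input layout, and the 0.05 threshold. Multiple-testing correction and step (5) are left out.

```python
from math import comb


def enrichment_p_value(n_universe, n_gene_set, n_hsn, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_universe: genes in the interaction network
    n_gene_set: genes in the gene set of interest
    n_hsn:      genes in the high-scoring subnetwork (HSN)
    n_overlap:  genes shared by the gene set and the HSN
    """
    total = comb(n_universe, n_hsn)
    max_overlap = min(n_gene_set, n_hsn)
    tail = sum(
        comb(n_gene_set, k) * comb(n_universe - n_gene_set, n_hsn - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def rank_gene_sets(overlaps, n_universe, set_sizes, hsn_sizes, alpha=0.05):
    """Order gene sets by the number of HSNs in which they are enriched.

    overlaps[set_name][condition] is the number of genes shared by the
    gene set and that condition's HSN (hypothetical input layout).
    """
    counts = {}
    for name, per_condition in overlaps.items():
        counts[name] = sum(
            enrichment_p_value(n_universe, set_sizes[name], hsn_sizes[cond], k) <= alpha
            for cond, k in per_condition.items()
        )
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```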


Uses the Earth mover’s distance to measure the overall difference between the distributions of a gene’s expression in two classes of samples and uses permutations to obtain q-values for each gene. The EMDomics algorithm is used to perform a supervised multi-class analysis to measure the magnitude and statistical significance of observed continuous genomics data between groups. Usually the data will be gene expression values from array-based or sequence-based experiments, but data from other types of experiments can also be analyzed (e.g. copy number variation). This package also incorporates the Kolmogorov-Smirnov (K-S) test and the Cramér-von Mises (CVM) test, which are both common distribution comparison tests.
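As a rough illustration of the idea for a single gene, the Python sketch below computes the one-dimensional Earth mover's distance between two sets of expression values and a permutation p-value for it. It is not the EMDomics implementation: the function names and defaults are assumptions, the distance is taken directly between the empirical distributions, and converting per-gene p-values into q-values across all genes is not shown.

```python
import random


def emd_1d(a, b):
    """Earth mover's distance between two empirical 1-D samples.

    In one dimension this equals the area between the two empirical
    cumulative distribution functions.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    distance = 0.0
    ia = ib = 0
    for left, right in zip(points, points[1:]):
        # Advance the counts of values <= left in each sample.
        while ia < len(a_sorted) and a_sorted[ia] <= left:
            ia += 1
        while ib < len(b_sorted) and b_sorted[ib] <= left:
            ib += 1
        distance += abs(ia / len(a_sorted) - ib / len(b_sorted)) * (right - left)
    return distance


def permutation_p_value(a, b, n_perm=1000, seed=0):
    """Share of label permutations whose EMD is at least the observed one."""
    rng = random.Random(seed)
    observed = emd_1d(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if emd_1d(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```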


A package for ranking differentially expressed gene expression time courses through Gaussian process regression. gprege fits two GPs with an RBF (+ noise diagonal) kernel on each profile. One GP kernel is initialised with a short lengthscale hyperparameter, signal variance as the observed variance and a zero noise variance. It is optimised via scaled conjugate gradients (netlab). A second GP has fixed hyperparameters: zero inverse-width, zero signal variance and noise variance as the observed variance. The log-ratio of marginal likelihoods of the two hypotheses acts as a score of differential expression for the profile. Comparison via ROC curves is performed against BATS.
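The scoring idea, a log-ratio of marginal likelihoods between a "signal" GP and a "noise only" GP, can be sketched in plain Python. This is a simplified illustration and not the gprege code: the signal model's hyperparameters are fixed rather than optimised by scaled conjugate gradients, and the lengthscale, the split of the observed variance between signal and noise, and all function names are assumptions.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def log_marginal_likelihood(kernel, y):
    """log N(y | 0, kernel) for a zero-mean Gaussian process."""
    n = len(y)
    lower = cholesky(kernel)
    z = []  # forward substitution: lower @ z = y
    for i in range(n):
        z.append((y[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i])
    quadratic = sum(v * v for v in z)  # y^T kernel^{-1} y
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * quadratic - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


def rbf_kernel(times, lengthscale, signal_var, noise_var):
    """RBF covariance plus a noise term on the diagonal."""
    return [
        [
            signal_var * math.exp(-0.5 * ((a - b) / lengthscale) ** 2)
            + (noise_var if i == j else 0.0)
            for j, b in enumerate(times)
        ]
        for i, a in enumerate(times)
    ]


def de_score(times, profile, lengthscale=2.0, noise_fraction=0.1):
    """Log-ratio of marginal likelihoods: signal model vs. noise-only model."""
    centre = sum(profile) / len(profile)
    y = [v - centre for v in profile]
    observed_var = sum(v * v for v in y) / (len(y) - 1)
    if observed_var == 0.0:
        raise ValueError("profile is constant")
    signal = rbf_kernel(times, lengthscale,
                        (1.0 - noise_fraction) * observed_var,
                        noise_fraction * observed_var)
    # Noise-only model: zero signal variance, noise variance = observed variance.
    noise_only = rbf_kernel(times, lengthscale, 0.0, observed_var)
    return log_marginal_likelihood(signal, y) - log_marginal_likelihood(noise_only, y)
```

Profiles that vary smoothly over time receive higher scores than profiles that jump erratically between neighbouring time points, which is what allows the score to be used for ranking.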

BGX / Bayesian Gene eXpression

Provides posterior distributions of gene expression indices and other quantities of interest rather than point estimates. BGX is a Bayesian hierarchical model for the analysis of Affymetrix GeneChip data. The multiple array BGX model allows the posterior distributions of any function of the gene expression measures to be obtained. It also allows credibility intervals for the ranks of the genes with respect to the degree of differential expression to be obtained. This additional information is easy to interpret and can be presented visually.

EEGC / Engineering Evaluation by Gene Categorization

Evaluates cellular engineering processes in a systemic rather than marker-based fashion. EEGC integrates transcriptome profiling and functional analysis. It clusters genes into categories representing different states of (trans)differentiation. The tool performs functional and gene regulatory network analyses for each of the categories of the engineered cells, thus offering practical indications of potential shortcomings in the reprogramming protocol.

PREDA / Position RElated Data Analysis

Detects regional variations in genomics data. PREDA implements a procedure to analyze the relationships between data and physical genomic coordinates along chromosomes with the final aim of identifying chromosomal regions with likely relevant functional role. The software integrates high-throughput signals and structural information using a non-linear kernel regression with adaptive bandwidth. The integrative analysis is performed through a modular and flexible framework accommodating different types of functions and statistics.

TTCA / Transcript Time Course Analysis

Analyses sparse and heterogeneous time course data with high detection sensitivity and transparency. TTCA is specifically designed for the analysis of perturbation responses. It combines different scores to capture fast and transient dynamics as well as slow expression changes, and performs well in the presence of low replicate numbers and irregular sampling times. The results are given in the form of tables including links to figures showing the expression dynamics of the respective transcript. These allow users to quickly assess the relevance of a detection, to identify possible false positives and to discriminate between early and late changes in gene expression. An extension of the method allows the analysis of the expression dynamics of functional groups of genes, providing a quick overview of the cellular response.

DGCA / Differential Gene Correlation Analysis

An R package for systematically assessing the difference in gene-gene regulatory relationships under different conditions. DGCA contains functions to filter, process, save, visualize, and interpret differential correlations of identifier-pairs across the entire identifier space, or with respect to a particular set of identifiers (e.g., a single identifier). It also contains several functions to perform differential correlation analysis on clusters (i.e., modules) of genes. Finally, it provides functions to generate empirical p-values for the hypothesis tests and adjust them for multiple comparisons. This user-friendly, effective, and comprehensive software tool will facilitate the application of differential correlation analysis in many biological studies and thus will help identify novel signalling pathways, biomarkers, and targets in complex biological systems and diseases.
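For one pair of identifiers, a differential correlation test can be sketched in Python as below. This is an illustration rather than DGCA's R code: it compares the two conditions through Fisher z-transformed Pearson correlations and uses a normal approximation for the p-value instead of the empirical, permutation-based p-values mentioned above. All function names are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher z-transform; r is clipped so that |r| = 1 stays finite."""
    return math.atanh(max(min(r, 0.999999), -0.999999))


def differential_correlation_z(x_a, y_a, x_b, y_b):
    """z-score for the difference in correlation between conditions A and B."""
    if len(x_a) < 4 or len(x_b) < 4:
        raise ValueError("each condition needs at least four samples")
    z_a = fisher_z(pearson(x_a, y_a))
    z_b = fisher_z(pearson(x_b, y_b))
    standard_error = math.sqrt(1.0 / (len(x_a) - 3) + 1.0 / (len(x_b) - 3))
    return (z_a - z_b) / standard_error


def two_sided_p_value(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A positive z-score means the pair is more positively correlated in condition A than in condition B.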


Predicts cellular composition of heterogeneous samples. PERT is a deconvolution model based on the non-negative maximum likelihood (NNML) framework that additionally accounts for transcriptional variations between reference and constituent profiles. It is readily applicable when the available reference profiles were collected under micro-environmental or developmental conditions that differ from those of the heterogeneous samples.

EXPANDER / EXpression Analyzer and DisplayER

An integrated software platform for the analysis of microarray gene expression data. EXPANDER is designed to support all the stages of microarray data analysis, from raw data normalization to inference of transcriptional regulatory networks. The analysis starts with importing the data, followed by normalization and filtering. Then, clustering and network-based analyses are performed. The gene groups identified are tested for enrichment in function, co-regulation (using transcription factor and microRNA target predictions) or co-location.

FEM / Functional Epigenetic Modules

Identifies gene modules of coordinated differential methylation and differential expression in the context of a human interactome. FEM is a functional supervised network algorithm that integrates multi-dimensional DNA methylation (DNAm) and gene expression data in the context of a human protein-protein interaction (PPI) network. It could be applied to cellular differentiation data to identify cell type-specific gene expression modules under the regulation of DNA methylation.


Implements Partial Least Squares regression to extract the hidden signals of sample-specific heterogeneity in the data and uses them to find the genes that are actually correlated with the phenotype of interest. svapls can be used to identify several types of unknown sample-specific sources of heterogeneity in a gene expression study and to adjust for them, providing a more accurate inference on the original expression pattern of the genes across different types of samples.

INDEED / Integrated DiffErential Expression and Differential network analysis

Builds a sparse differential network based on partial correlation for better visualization, and integrates differential expression (DE) and differential network (DN) analyses for biomarker discovery. INDEED includes four steps: (i) performing DE analysis to obtain a p-value for each biomolecule, (ii) building a differential network, (iii) computing the activity score for each biomolecule, and (iv) prioritizing the biomolecules by activity score. Future work includes developing an R package and extending it to integrate multiple omic data of various types for biomarker discovery.

WoPPER / Webserver fOr Position Related data analysis of gene Expression in Prokaryotes

Integrates transcriptional expression data and genomic annotations to identify groups of physically contiguous genes characterized by regional differential expression in bacterial genomes. WoPPER is a web app that can analyze any RNA-seq or microarray-based gene expression dataset from any microorganism with a sequenced and annotated genome. This resource provides researchers with novel and informative insights regarding the correlation between gene expression and chromosomal organization in bacterial genomes.


A process for the creation of library tags that increases accuracy in identification of the source tissue with little processing time overhead. UITagCreator generates synthetic oligonucleotide tags, each of which uniquely identifies the source tissue from which a clone was derived. The algorithm utilizes the LDA to determine the edit distance between pairs of tags, and implements a linear-feedback shift register (LFSR) to quickly perform an exhaustive pseudo-random search of each candidate tag.

CellCODE / cell-type COmputational Differential Estimation

A multi-step statistical framework that uses latent variable analysis to analyze differential expression from mixture samples. The approach is computationally transparent and requires no additional experimental data, yet outperforms existing methods that use independent proportion measurements. CellCODE has few parameters, and these are robust and easy to interpret. The method can be used to track changes in proportion, improve power to detect differential expression and assign the differentially expressed genes to the correct cell type.


Explores the linear decision boundary family. DiscriminantCut is a machine learning methodology for robust differential expression analysis, which can be an avenue to significantly advance research on large-scale differential expression analysis. The corresponding mathematical model was formulated as a constrained optimization problem aiming to maximize discoveries satisfying a user-defined False Discovery Rate (FDR) constraint. An effective algorithm, Discriminant-Cut, was developed to solve an instantiation of this problem. Extensive comparisons of Discriminant-Cut with a couple of cutting edge methods were carried out to demonstrate its robustness and effectiveness.

Plsgenomics / Partial least squares Analyses for Genomics

A statistical approach based on partial least squares regression to infer the true transcription factor activities (TFAs) from a combination of mRNA expression and DNA-protein binding measurements. Plsgenomics is also statistically sound for small samples and allows the detection of functional interactions among the transcription factors via the notion of "meta"-transcription factors. Plsgenomics performs very well both for simulated data and for real expression and ChIP data from yeast and E. coli experiments. It overcomes the limitations of previously used approaches to estimating TFAs.

MFSelector / Monotonic Feature Selector

A system, easy to use even with no pre-existing knowledge, to identify gene sets with monotonic expression patterns in multi-stage as well as in time-series genomics matrices. The case studies on embryonic stem cell neurogenesis and embryonic stem cell vasculogenesis have helped to get a better understanding of stemness and differentiation. The novel monotonic marker genes discovered from a data set are found to exhibit consistent behavior in another independent data set, demonstrating the utility of the proposed method.

Worm tissue

Allows exploration of tissue-gene expression models. Worm tissue is a machine learning-based prediction tool enabling users to explore the predicted expression patterns of their gene(s) of interest. The application permits visualization of hierarchically clustered expression patterns and allows users to sort by any gene or tissue model of interest. The software also provides suggestions of genes with similar tissue expression profiles, which users can immediately visualize alongside their original query.

NUDGE / Normal Uniform Differential Gene Expression

Uses a simple univariate normal-uniform mixture model, in combination with new normalization methods for spread as well as mean that extend the lowess normalization. NUDGE is a simple method to find differentially expressed genes in cDNA microarrays. This method gives a high probability of differential expression to genes known/suspected a priori to be differentially expressed and a low probability to the others. In terms of known false positives and false negatives, NUDGE outperforms all multiple-replicate methods except for the Gamma-Gamma EBarrays method to which it offers comparable results with the added advantages of greater simplicity, speed, fewer assumptions and applicability to the single replicate case.
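The mixture idea can be sketched in Python as follows: non-differentially expressed genes are modelled by a normal component, differentially expressed genes by a uniform component over the observed range, and each gene receives a posterior probability of belonging to the uniform component. This is an illustrative EM fit under those assumptions, not the NUDGE implementation, and it omits the normalization steps. The function names, starting values, and iteration count are hypothetical.

```python
import math


def posterior_de(value, mu, sigma, prior_de, uniform_density):
    """Posterior probability that `value` comes from the uniform (DE) component."""
    normal = (1.0 - prior_de) * math.exp(-0.5 * ((value - mu) / sigma) ** 2) / (
        sigma * math.sqrt(2.0 * math.pi)
    )
    uniform = prior_de * uniform_density
    return uniform / (uniform + normal)


def fit_normal_uniform(values, n_iter=100, prior_de=0.1):
    """EM fit of a normal-uniform mixture to log-ratios.

    Returns (posteriors, mu, sigma, prior_de), where posteriors[i] is the
    probability that values[i] is differentially expressed.
    """
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("values must not all be identical")
    uniform_density = 1.0 / (high - low)  # uniform over the observed range
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    for _ in range(n_iter):
        # E-step: responsibility of the uniform (DE) component for each gene.
        post = [posterior_de(v, mu, sigma, prior_de, uniform_density) for v in values]
        # M-step: update the normal component using the non-DE weights.
        weights = [1.0 - p for p in post]
        total = sum(weights)
        if total == 0.0:
            break
        mu = sum(w * v for w, v in zip(weights, values)) / total
        spread = sum(w * (v - mu) ** 2 for w, v in zip(weights, values)) / total
        sigma = max(math.sqrt(spread), 1e-6)  # floor avoids a collapsed component
        prior_de = sum(post) / n
    posteriors = [posterior_de(v, mu, sigma, prior_de, uniform_density) for v in values]
    return posteriors, mu, sigma, prior_de
```

Genes whose log-ratios sit far from the bulk of the data end up with posterior probabilities close to one, while genes inside the bulk stay close to zero.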


A differentially expressed (DE) gene selection algorithm, which controls the FDR based on predictive Bayesian estimates. The simulation studies empirically showed that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. In the analysis of the real data, Method II successfully identified more clinically important DE genes than the other methods. In comparison to Method I, Method II provides a much better sensitivity rate but a slightly lower specificity rate, based on the simulation studies.


Implements a wavelet-based model for analyzing transcriptome data and extends it towards more complex experimental designs. With waveTiling, the user is able to discover group-wise expressed regions and differentially expressed regions between any two groups, in single-factor studies and in multifactorial designs. Moreover, for time-course experiments, it is also possible to detect linear time effects and a circadian rhythm of transcripts. By considering the expression values of the individual tiling probes as a function of genomic position, waveTiling allows the detection of effect regions regardless of existing annotation.