Performs blind compressed sensing in the context of gene expression. CS-SMAF finds a non-negative, sparse module dictionary and sparse module activity levels. It employs fixed and variable measurements to investigate gene expression. This tool clusters samples based on the subset of composite observations, searches for a relatively small dictionary to explain the samples in each cluster, and then concatenates the small dictionaries into a large dictionary.
Provides class infrastructure and associated methods to construct an Illumina analysis workflow pipeline starting with raw data through functional analysis. Besides supporting the existing algorithms for microarray data, the lumi package includes several unique parts: (i) a variance-stabilizing transformation that utilizes the technical replicates available on the Illumina microarray; (ii) normalization algorithms designed for Illumina microarray data; and (iii) the nucleotide universal identifier annotation packages.
Analyzes the reliability of individual probes directly from gene expression data. A major advantage of the proposed approach is its capability to detect unreliable probes independently of physical models or external, constantly updated information such as genomic sequence data. RPA can be useful in many applications, including evaluation of the end results of gene expression analysis and recognition of potentially unknown probe-level error sources. It can also be used to quantify the uncertainty in the measurements and to inform probe design, and is utilized by our model to provide robust estimates of differential gene expression.
Provides several unique features in a modular and flexible system for the analysis of microarray data. The design and modular conception of CARMAweb allows the use of the different analysis modules either individually or combined into an analytical pipeline. CARMAweb performs (i) data preprocessing (background correction, quality control and normalization), (ii) detection of differentially expressed genes, (iii) cluster analysis, (iv) dimension reduction and (v) visualization, classification, and Gene Ontology-term analysis.
A web-based program for processing microarray data. In completely automated fashion, ExpressYourself will correct the background array signal, normalize the Cy5 and Cy3 signals, score levels of differential hybridization, combine the results of replicate experiments, filter problematic regions of the array and assess the quality of individual and replicate experiments. ExpressYourself is designed with a highly modular architecture so various types of microarray analysis algorithms can readily be incorporated as they are developed; for example, the system currently implements several normalization methods, including those that simultaneously consider signal intensity and slide location. The processed data are presented using a web-based graphical interface to facilitate comparison with the original images of the array slides. In particular, ExpressYourself is able to regenerate images of the original microarray after applying various steps of processing, which greatly facilitates identification of position-specific artifacts.
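The background-correction and Cy5/Cy3 normalization steps described above can be illustrated with a minimal sketch. It is not ExpressYourself's code: the function name, the `floor` parameter and the choice of global median centering are assumptions made for illustration, and median centering is a simpler stand-in for the intensity- and location-dependent methods the tool is said to implement.

```python
import math
import statistics

def normalized_log_ratios(cy5, cy3, cy5_background, cy3_background, floor=1.0):
    """Background-correct two channels and return median-centered log2 ratios.

    Hypothetical helper for illustration only; `floor` (an assumption) keeps
    corrected intensities positive so the logarithm is defined.
    """
    log_ratios = []
    for red, green, red_bg, green_bg in zip(cy5, cy3, cy5_background, cy3_background):
        red_corrected = max(red - red_bg, floor)        # background correction
        green_corrected = max(green - green_bg, floor)
        log_ratios.append(math.log2(red_corrected / green_corrected))
    # Global normalization: shift so that the median log-ratio is zero.
    center = statistics.median(log_ratios)
    return [value - center for value in log_ratios]

# Made-up spot intensities for four array features.
ratios = normalized_log_ratios(
    cy5=[250.0, 450.0, 900.0, 120.0],
    cy3=[150.0, 250.0, 200.0, 130.0],
    cy5_background=[50.0, 50.0, 100.0, 20.0],
    cy3_background=[50.0, 50.0, 100.0, 30.0],
)
```

After centering, positive values mark features with relatively more Cy5 signal and negative values those with relatively more Cy3 signal.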
Implements Partial Least Squares regression to extract the hidden signals of sample-specific heterogeneity in the data and uses them to find the genes that are actually correlated with the phenotype of interest. svapls can be used to identify several types of unknown sample-specific sources of heterogeneity in a gene expression study and adjust for them in order to provide a more accurate inference on the original expression pattern of the genes over different varieties of samples.
Implements a unified framework for preprocessing microarray data and interfaces with other Bioconductor tools for downstream analysis. The oligo package provides array coordinates, feature types, sequences, feature names and other relevant information for preprocessing. Developers can use oligo solutions to facilitate the integration of their tools with Bioconductor. They also benefit from the unified model that the package makes available, as the consistency in data delivery and handling improves efficiency.
Identifies features correlating with a phenotype of interest in the presence of potential confounding factors. Using simulated data, we show that ISVA performs well in identifying confounders as well as outperforming methods which do not adjust for confounding. Using four large-scale Illumina Infinium DNA methylation datasets subject to low signal to noise ratios and substantial confounding by beadchip effects and variable bisulfite conversion efficiency, we show that ISVA improves the identifiability of confounders and that this enables a framework for feature selection that is more robust to model misspecification and heterogeneous phenotypes. Finally, we demonstrate similar improvements of ISVA across four mRNA expression datasets. Thus, ISVA should be useful as a feature selection tool in studies that are subject to confounding.
Inspects a large number of p-values in an effort to detect additional positive cases. EBS offers an automatic screening of the p-values a user may obtain from his or her favorite gene-by-gene analysis software. In addition, the current procedure utilizes the p-values and not the test statistics; therefore, it has broader applicability to other types of tests such as the F-tests or rank tests. It screens each p-value not only on its own magnitude but also on the basis of the totality of the p-values (or its empirical distribution).
Provides the implementation of distance weighted discrimination (DWD) using an interior point method for the solution of second order cone programming problems. DWD is related to, and has been shown to be superior to, the support vector machine in situations that are fundamental to bioinformatics, such as very high dimensional data. DWD has proven to be very useful for several fundamental bioinformatics tasks, including classification, data visualization and removal of biases, such as batch effects.
Performs high-throughput expression analysis, with accurate and consistent results. Codelink is a single-channel microarray platform that uses 30-bp oligonucleotide probes designed for three different organisms: human, mouse and rat. It facilitates reading, preprocessing and manipulating Codelink microarray data. The raw data must be exported as a text file using the software. The tool provides users with an easy-to-use interface for the analysis of data on the R platform.
Consists of a module-based prediction (MBP) strategy. MBP is a method hypothesized to yield predictions completely independent of information from the test data. This algorithm takes advantage of information from genes sharing similar expression patterns. It focuses on two factors to increase model reproducibility: gene missingness and experimental noise. The method can be extended to analyze deep sequencing data, where the feature dimensionality is even higher than in microarray data.
Features multinomial probit regression with Gaussian Process priors and estimates class posterior probabilities employing fast variational approximations to the full posterior. VBMP is an R package for Gaussian Process classification of data over multiple classes. It incorporates feature weighting by means of Automatic Relevance Determination. The vbmp package implements a VB approach to classification of multi-class datasets. This non-parametric approach is developed within a probabilistic framework for Bayesian inference, which leads to efficient sparse approximations by optimizing a strict lower bound of the marginal likelihood of a multinomial probit regression model.
Automates FASTA file inspection, rendering files compatible with a variety of downstream bioinformatics tools. Fasta-O-Matic reports any issues detected to the user in optionally color-coded logs that can be quiet or verbose. It can serve as a general pre-processing tool in bioinformatics workflows and as a sanity check for bioinformatics core facilities. This tool is useful for repeating common analysis steps on FASTA files received from disparate sources.
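To make the idea of automated FASTA inspection concrete, here is a minimal sketch of a sanity check. It does not reproduce Fasta-O-Matic: the function name `inspect_fasta`, the default nucleotide alphabet and the particular checks (sequence before a header, empty or duplicate identifiers, empty records, unexpected characters) are all assumptions chosen for illustration.

```python
def inspect_fasta(text, alphabet="ACGTNacgtn"):
    """Return a list of human-readable issues found in FASTA-formatted text.

    Hypothetical checker for illustration; the checks and the default
    alphabet are assumptions, not a description of any existing tool.
    """
    issues = []
    allowed = set(alphabet)
    seen_ids = set()
    current_id = None          # identifier of the record being read
    current_length = 0         # residues seen so far in that record
    orphan_reported = False    # report sequence-before-header only once

    def close_record():
        if current_id is not None and current_length == 0:
            issues.append(f"record '{current_id}' has no sequence")

    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line:
            continue  # blank lines are ignored in this sketch
        if line.startswith(">"):
            close_record()
            fields = line[1:].split()
            current_id = fields[0] if fields else ""
            current_length = 0
            if not current_id:
                issues.append(f"line {line_number}: header has no identifier")
            elif current_id in seen_ids:
                issues.append(f"line {line_number}: duplicate identifier '{current_id}'")
            seen_ids.add(current_id)
        else:
            if current_id is None and not orphan_reported:
                issues.append(f"line {line_number}: sequence data before first header")
                orphan_reported = True
            unexpected = sorted(set(line) - allowed)
            if unexpected:
                issues.append(
                    f"line {line_number}: unexpected characters {''.join(unexpected)}"
                )
            current_length += len(line)
    close_record()
    return issues

clean_report = inspect_fasta(">seq1 sample\nACGTAC\nGTNN\n>seq2\nACGT\n")
problem_report = inspect_fasta(">seq1\nACGT\n>seq1\nAC-GT\n>seq3\n")
```

An empty list means no issue was detected by these particular checks; a real pre-processing step would likely add checks for line wrapping and line endings as well.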
A package for the automatic detection and masking of blemishes in HDONA microarray chips. Harshlight’s algorithm combines image analysis techniques with statistical approaches to recognize three types of defects frequent in Affymetrix microarray chips: extended, compact, and diffuse defects. It provides a way to safely identify blemishes of different nature and correct the intensity values of the batch of chips provided by the user. The corrections made by Harshlight improve the reliability of the expression values when the chips are further analyzed with other programs, such as GCRMA and MAS5.
A package for ranking differentially expressed gene expression time courses through Gaussian process regression. gprege fits two GPs with an RBF (+ noise diagonal) kernel on each profile. One GP kernel is initialised with a short lengthscale hyperparameter, signal variance as the observed variance and a zero noise variance. It is optimised via scaled conjugate gradients (netlab). A second GP has fixed hyperparameters: zero inverse-width, zero signal variance and noise variance as the observed variance. The log-ratio of marginal likelihoods of the two hypotheses acts as a score of differential expression for the profile. Comparison via ROC curves is performed against BATS.
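The scoring idea described above, a log-ratio of GP marginal likelihoods under a "signal" and a "noise-only" hypothesis, can be sketched in plain Python. This is not gprege's implementation: all function names are hypothetical, and the scaled-conjugate-gradient optimisation is replaced by a small grid search over lengthscales and noise fractions (an assumption made to keep the example dependency-free). The noise-only model uses zero signal variance and the observed variance as noise, as in the description; distinct time points are assumed.

```python
import math

def rbf_covariance(times, lengthscale, signal_var, noise_var):
    """RBF kernel matrix with the noise variance added on the diagonal."""
    n = len(times)
    return [[signal_var * math.exp(-0.5 * ((times[i] - times[j]) / lengthscale) ** 2)
             + (noise_var if i == j else 0.0)
             for j in range(n)]
            for i in range(n)]

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(s) if i == j else s / lower[j][j]
    return lower

def log_marginal_likelihood(y, covariance):
    """log N(y | 0, covariance), computed through the Cholesky factor."""
    n = len(y)
    lower = cholesky(covariance)
    z = []  # forward substitution: solve lower @ z = y
    for i in range(n):
        z.append((y[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i])
    quadratic = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * (quadratic + log_det + n * math.log(2.0 * math.pi))

def differential_expression_score(times, values):
    """Log-ratio of marginal likelihoods: best signal model vs. noise-only model."""
    n = len(values)
    mean = sum(values) / n
    y = [v - mean for v in values]           # centre the profile
    observed_var = sum(v * v for v in y) / n
    if observed_var == 0.0:
        return 0.0                           # flat profile: no evidence either way
    # Noise-only hypothesis: zero signal variance, noise = observed variance.
    null_ll = log_marginal_likelihood(y, rbf_covariance(times, 1.0, 0.0, observed_var))
    # Signal hypothesis: grid search stands in for gradient-based optimisation.
    span = max(times) - min(times)
    best_ll = -math.inf
    for length_fraction in (0.1, 0.25, 0.5, 1.0):
        for noise_fraction in (0.05, 0.2, 0.5):
            covariance = rbf_covariance(
                times,
                length_fraction * span,
                (1.0 - noise_fraction) * observed_var,
                noise_fraction * observed_var,
            )
            best_ll = max(best_ll, log_marginal_likelihood(y, covariance))
    return best_ll - null_ll

times = list(range(10))
trend_score = differential_expression_score(times, [float(t) for t in times])
jitter_score = differential_expression_score(
    times, [1.0 if t % 2 == 0 else -1.0 for t in times]
)
```

Under these assumptions a smooth trend receives a higher score than a profile that merely alternates around its mean, which is the ordering a ranking of time courses relies on.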
Combines raw data of different microarray platforms into one virtual array. virtualArray consists of several functions that act in sequence in a semi-automatic way, doing as much of the data combination as possible and letting the user concentrate on analysing the resulting virtual array. Using this software package, researchers can easily integrate their own microarray data with data from public repositories or other sources that are based on different microarray chip types.
Reduces probe hybridization bias from experiments performed on the Affymetrix microarray platform, allowing accurate assessment of germline influence on gene expression. equalizer uses genome variant data to modify annotation files for the commonly used Affymetrix IVT and Gene/Exon platforms. These files can be used by any microarray normalization method for subsequent analysis.