Batch effect correction software tools | RNA sequencing data analysis
Unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed in order to measure biological variability accurately and to obtain correct statistical inference in high-throughput genomic analyses.
Adjusts for batch effects in microarray expression data using empirical Bayes methods. The modified ComBat (M-ComBat) is designed specifically for meta-analysis and for batch effect adjustment with predictive models that are validated and fixed on historical data from a ‘gold-standard’ batch.
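The idea of adjusting new data toward a ‘gold-standard’ batch can be illustrated with a minimal Python sketch. It assumes that adjustment means matching each gene's mean and standard deviation to those of the reference batch; it deliberately omits the empirical Bayes shrinkage and the biological covariates that the real method handles, and the function name `adjust_to_reference` is hypothetical rather than part of any package.

```python
import numpy as np

def adjust_to_reference(expr, batches, reference):
    """Shift and scale every non-reference batch, gene by gene, so that its
    mean and standard deviation match those of the reference batch.

    expr    : array of shape (genes, samples)
    batches : batch label for each sample
    Simplified illustration: no empirical Bayes shrinkage, no covariates.
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    ref = expr[:, batches == reference]
    ref_mean = ref.mean(axis=1, keepdims=True)
    ref_sd = ref.std(axis=1, ddof=1, keepdims=True)

    adjusted = expr.copy()  # reference samples are left untouched
    for b in np.unique(batches):
        if b == reference:
            continue
        cols = batches == b
        block = expr[:, cols]
        mean = block.mean(axis=1, keepdims=True)
        sd = block.std(axis=1, ddof=1, keepdims=True)
        sd = np.where(sd > 0, sd, 1.0)  # guard against constant genes
        adjusted[:, cols] = (block - mean) / sd * ref_sd + ref_mean
    return adjusted

# Toy data: 5 genes, 4 reference samples and 4 samples from a shifted batch.
rng = np.random.default_rng(0)
expr = rng.normal(8.0, 1.0, size=(5, 8))
expr[:, 4:] += 2.0
batches = np.array(["ref"] * 4 + ["new"] * 4)
adjusted = adjust_to_reference(expr, batches, reference="ref")
```

Because the reference samples are never modified, a predictive model fixed on the reference batch can be applied to the adjusted samples without retraining, which is the use case described above.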
Allows differential expression analysis of digital gene expression data. edgeR implements a range of statistical methods based on the negative binomial distribution, including empirical Bayes estimation, exact tests, generalized linear models and quasi-likelihood tests. The package and its methods are general and can be applied to other sources of count data, such as barcoding experiments and peptide counts.
Identifies differentially expressed genes from count data or previously normalized count data. NOISeq empirically models the noise distribution of count changes by contrasting fold-change differences (M) and absolute expression differences (D) for all the features in samples within the same condition. This reference distribution is then used to assess whether the M-D values computed between two conditions for a given gene are likely to be part of the noise or represent a true differential expression.
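The M-D comparison described above can be sketched in a few lines of Python. This is a simplified illustration and not the NOISeq implementation: it assumes M is a log2 fold change, D is an absolute difference, the noise distribution is built from all replicate pairs within each condition, and a pseudocount of 0.5 avoids division by zero. The function names and the pseudocount are assumptions made for this example.

```python
import numpy as np

def md_values(a, b, pseudo=0.5):
    """Return M (log2 fold change) and D (absolute difference) per gene."""
    a = np.asarray(a, dtype=float) + pseudo
    b = np.asarray(b, dtype=float) + pseudo
    return np.log2(a / b), np.abs(a - b)

def noise_probability(cond1, cond2):
    """Fraction of within-condition (noise) M-D pairs that are smaller than
    each gene's between-condition M-D pair. Inputs are (genes, replicates)."""
    cond1 = np.asarray(cond1, dtype=float)
    cond2 = np.asarray(cond2, dtype=float)

    # Signal: compare the mean expression of the two conditions.
    m, d = md_values(cond1.mean(axis=1), cond2.mean(axis=1))

    # Noise: compare every pair of replicates within the same condition.
    noise_m, noise_d = [], []
    for cond in (cond1, cond2):
        n = cond.shape[1]
        for i in range(n):
            for j in range(i + 1, n):
                nm, nd = md_values(cond[:, i], cond[:, j])
                noise_m.append(np.abs(nm))
                noise_d.append(nd)
    noise_m = np.concatenate(noise_m)
    noise_d = np.concatenate(noise_d)

    # A high value means the gene's change exceeds most of the noise.
    return np.array([
        np.mean((noise_m < abs(mi)) & (noise_d < di))
        for mi, di in zip(m, d)
    ])

cond1 = [[100, 110, 105], [10, 12, 11], [50, 48, 52]]
cond2 = [[10, 12, 11], [11, 10, 12], [51, 49, 50]]
prob = noise_probability(cond1, cond2)
```

In this toy input only the first gene changes between conditions, so it receives a probability close to 1 while the unchanged genes stay near 0.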
Investigates data from gene expression experiments. limma contains features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. It can perform both differential expression and differential splicing analyses of RNA-seq data. This tool is useful for studying expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures.
Removes batch effects and other unwanted variation in high-throughput experiments. SVA is a package containing several functions for identifying and building surrogate variables for large data sets. Artifacts can be removed in three ways: (i) identification and estimation of surrogate variables, (ii) direct removal of known batch effects with ComBat and (iii) removal of batch effects with known probes.
An easy-to-use application for microarray, RNA-Seq and metabolomics analysis. For splicing sensitive platforms (RNA-Seq or Affymetrix Exon, Gene and Junction arrays), AltAnalyze will assess alternative exon (known and novel) expression along protein isoforms, domain composition and microRNA targeting. In addition to splicing-sensitive platforms, AltAnalyze provides comprehensive methods for the analysis of other data (RMA summarization, batch-effect removal, QC, statistics, annotation, clustering, network creation, lineage characterization, alternative exon visualization, gene-set enrichment and more).
Uses surrogate variable analysis (sva) to estimate unwanted noise and unmodeled artifacts by (i) identifying the part of the genomic data affected only by artifacts and (ii) estimating the artifacts with principal components or singular vectors of that subset of the data matrix. svaseq contains functions for removing batch effects and other unwanted variation in high-throughput experiments. It was created specifically for count data or fragments per kilobase of exon per million fragments mapped (FPKM) from sequencing experiments, and relies on an appropriate data transformation.
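The second step, estimating artifacts from singular vectors, can be illustrated with a short Python sketch. It is a deliberate simplification and not the svaseq algorithm: it assumes a log transform with a pseudocount of 1, regresses out the known design, and takes the leading singular vectors of the residuals, skipping the step that identifies which genes are affected only by artifacts. The function name `surrogate_variables` and its parameters are hypothetical.

```python
import numpy as np

def surrogate_variables(counts, design, n_sv=1, pseudo=1.0):
    """Estimate surrogate variables as leading singular vectors of the
    residuals left after fitting the known design.

    counts : array of shape (genes, samples)
    design : array of shape (samples, covariates), including an intercept
    Returns an array of shape (samples, n_sv).
    """
    y = np.log(np.asarray(counts, dtype=float) + pseudo).T  # samples x genes
    x = np.asarray(design, dtype=float)

    # Remove the variation explained by the known covariates.
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta

    # Leading left singular vectors capture the dominant unmodeled variation.
    u, _, _ = np.linalg.svd(resid, full_matrices=False)
    return u[:, :n_sv]

# Toy data: 200 genes, 8 samples, with a hidden batch doubling half the genes.
rng = np.random.default_rng(1)
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1])
batch = np.array([0, 1, 0, 1, 0, 1, 0, 1])
mu = np.full((200, 8), 100.0)
mu[:100, batch == 1] *= 2.0
counts = rng.poisson(mu)
design = np.column_stack([np.ones(8), condition])
sv = surrogate_variables(counts, design, n_sv=1)
```

In this simulated example the estimated surrogate variable tracks the hidden batch label, so it could be added as a covariate in a downstream differential expression model.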