Data normalization software tools | Mass spectrometry-based untargeted metabolomics
In this review, we describe the importance of sample normalization in the analytical workflow with a focus on mass spectrometry (MS)-based platforms, discuss a number of methods recently reported in the literature and comment on their applicability in real-world metabolomics applications. Sample normalization has sometimes been ignored in metabolomics, partly because a convenient means of performing it was lacking. We show that several methods are now available and that sample normalization should be performed in quantitative metabolomics whenever the analyzed samples vary significantly in total sample amount.
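As a rough illustration of what correcting for differences in total sample amount can look like, the sketch below scales each sample to a common total intensity (sum normalization). It is only one simple option among those discussed in the literature, not the method recommended by the review; the function name `sum_normalize` and the example data are hypothetical.

```python
def sum_normalize(samples, target=None):
    """Scale each sample so that its intensities sum to a common total.

    samples: dict mapping sample name -> list of feature intensities.
    target:  common total; defaults to the mean of the per-sample totals.
    """
    totals = {name: sum(values) for name, values in samples.items()}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("every sample needs a positive total intensity")
    if target is None:
        target = sum(totals.values()) / len(totals)
    return {
        name: [v * target / totals[name] for v in values]
        for name, values in samples.items()
    }


# Hypothetical data: sample "b" is a twice as concentrated aliquot of "a".
raw = {"a": [10.0, 30.0, 60.0], "b": [20.0, 60.0, 120.0]}
normalized = sum_normalize(raw)
```

After scaling, the two aliquots have identical profiles, which is the intended effect when the only difference between samples is the amount loaded. Sum normalization assumes that most features do not change between samples; it can distort results when a few abundant metabolites dominate the total.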
Provides a web-based analytical pipeline for high-throughput metabolomics studies. MetaboAnalyst aims to offer a variety of commonly used procedures for metabolomic data processing, normalization, multivariate statistical analysis, as well as data annotation. The current implementation focuses on exploratory statistical analysis, functional interpretation, and advanced statistics for translational metabolomics studies. This tool is also available as a desktop version.
An LC/MS-based data analysis approach which incorporates novel nonlinear retention time alignment, feature detection, and feature matching. The XCMS software reads and processes LC/MS data stored in NetCDF, mzXML, mzData and mzML files. It provides methods for feature detection, nonlinear retention time alignment, visualization, relative quantitation and statistics. XCMS is capable of simultaneously preprocessing, analyzing, and visualizing the raw data from hundreds of samples. XCMS is freely available under an open-source license.
An open-source software tool for mass-spectrometry data processing, with the main focus on LC-MS data. It is based on the original MZmine toolbox described in the 2006 Bioinformatics publication, but has been completely redesigned and rewritten since then. Our main goal is to provide a user-friendly, flexible and easily extendable software with a complete set of modules covering the entire LC-MS data analysis workflow.
Helps users detect metabolites by annotating pathways from cross-omics data. MarVis-Suite is designed especially for the extraction, clustering, and visualization of metabolic markers from non-targeted experiments. It provides desktop user interfaces for interactive inspection of data clusters, and supplies specialized functions for the analysis of data from non-targeted mass spectrometry (MS) experiments.
Permits comprehensive metabolomics data pre-processing, statistical analysis and interpretation. W4M includes computational modules for data normalization, multivariate analysis and annotation. It can create interactive web-based documents showing the results of the analyses, and users can share them with collaborators directly on the platform. This tool enables multi-omics analyses in a global systems-biology approach.
Aims to integrate and analyze metabolomics experiment data. MeltDB is a program that can be applied for the description and analysis of metabolomic experiments. This program hosts over 30 experiments predominantly from gas chromatography-mass spectrometry (GC/MS) measurements. Moreover, this tool includes an API allowing users to evaluate novel methods and algorithms for the preprocessing of metabolomic datasets.
Assesses the performance of various normalization methods from multiple perspectives. NOREVA can be used to normalize mass spectrometry (MS)-based metabolomics data. It assists users in selecting a suitable algorithm for metabolomics data analysis, using multiple evaluation criteria. This tool is useful for processing large-scale metabolomics datasets and for pathological investigation, drug discovery, and biomarker identification.
Facilitates improved compound identification using mass spectrometry (MS). RANSY/RAMSY is an application that reduces spectral interference and facilitates the identification of individual molecules in overlapped MS spectra. This method is designed to work using datasets that contain multiple MS spectra for the same metabolite. It can be applied for compound identification using different analytical platforms.
Normalizes mass spectrometry (MS)-based metabolomics data. NOREVA can conduct normalization using 24 different methods, and provides an evaluation report that collectively considers 5 different criteria for assessing normalization performance. It is able to distinguish the best-performing method from the others based on multiple evaluation criteria, which provides valuable guidance for selecting a suitable algorithm in metabolomics data analysis.
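To make the idea of ranking normalization methods by an evaluation criterion concrete, the sketch below scores three simple methods by a single criterion: the median coefficient of variation (CV) across features in replicate quality-control (QC) injections, where lower is better. This is an illustrative assumption, not NOREVA's own code, method set, or criteria; all function names and the QC data are hypothetical.

```python
from statistics import mean, median, stdev


def median_cv(matrix):
    """Median coefficient of variation across features.

    matrix: list of samples, each a list of feature intensities
            (at least two samples are required for a standard deviation).
    """
    cvs = []
    for feature in zip(*matrix):  # iterate over columns (features)
        centre = mean(feature)
        if centre > 0:
            cvs.append(stdev(feature) / centre)
    return median(cvs)


def no_normalization(matrix):
    return [list(row) for row in matrix]


def sum_normalization(matrix):
    return [[v / sum(row) for v in row] for row in matrix]


def median_normalization(matrix):
    return [[v / median(row) for v in row] for row in matrix]


def rank_methods(qc_matrix, methods):
    """Return (method name, median CV) pairs, best (lowest CV) first."""
    scores = {name: median_cv(fn(qc_matrix)) for name, fn in methods.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical QC injections of the same pooled sample at varying loadings.
qc = [[100.0, 200.0, 300.0], [110.0, 220.0, 330.0], [90.0, 180.0, 270.0]]
ranking = rank_methods(qc, {
    "none": no_normalization,
    "sum": sum_normalization,
    "median": median_normalization,
})
```

In this toy case both scaling methods remove the loading differences almost entirely, so the unnormalized data rank last. A single criterion such as QC variability can favour methods that also erase biological signal, which is one reason to combine several criteria.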
An R package for post-processing of metabolomic data. The primary functions of the MSPrep package are summarization of replicates, filtering, imputation of missing data, normalization and/or batch effect adjustment and dataset diagnostics.
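The filtering and imputation steps named above can be sketched as follows. The example keeps features detected in at least a given fraction of samples and fills the remaining gaps with half of the feature's minimum observed value. It is a minimal Python illustration of these generic steps, not the MSPrep R API; the function name, the 80% threshold, and the data are assumptions.

```python
def filter_and_impute(table, min_present=0.8):
    """Drop sparsely detected features, then impute the remaining gaps.

    table:       dict mapping feature name -> list of intensities,
                 with None marking a missing value.
    min_present: minimum fraction of samples in which a feature must be
                 observed to be kept (assumed default, 80%).
    """
    result = {}
    for feature, values in table.items():
        observed = [v for v in values if v is not None]
        if not observed or len(observed) / len(values) < min_present:
            continue  # too few detections: filter the feature out
        fill = min(observed) / 2  # half-minimum imputation
        result[feature] = [fill if v is None else v for v in values]
    return result


# Hypothetical abundance table with missing values.
table = {
    "feature_1": [10.0, None, 12.0, 14.0, 16.0],  # 80% present: kept
    "feature_2": [5.0, None, None, 7.0, 6.0],     # 60% present: dropped
}
cleaned = filter_and_impute(table)
```

Half-minimum imputation assumes that values are missing because they fell below the detection limit; when values are missing at random, a model-based imputation is usually more appropriate.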
Provides a program for the quantitative analysis of high throughput Gas Chromatography-Mass Spectrometry (GC-MS)-based metabolomics data. MetaQuant is intended to automatically determine the accurate intracellular amount of hundreds of metabolites. It provides access to various functions: (i) metabolite definition, (ii) calibration, (iii) quantification, (iv) import and export of data and (v) batch analysis.
A software package which implements two post-extraction processing steps including a method for block-wise quantitative summary and a novel normalization procedure. There are a number of experimental factors that are unique to MS platforms and the two proposed methods are different from the existing alternatives that had been developed for other omic platforms such as gene expression microarrays.
Permits autonomous, real-time analysis of metabolomic data. SimExTargId is an open-source R package that provides an autonomous workflow that can also perform data preprocessing in real time, thereby alerting the user to signal degradation or loss. This method also facilitates real-time monitoring of liquid chromatography-mass spectrometry (LC-MS) data acquisition.
Allows analysis of direct infusion and liquid chromatography mass spectrometry-based metabolomics data. Galaxy-M is a metabolomics toolkit for Galaxy, developed for both direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) metabolomics. It builds on Galaxy, a platform that enables biologists without programming skills to construct and execute data analyses and that has been widely used for next generation sequencing (NGS) data.
A software tool for the efficient and automatic analysis of GC/MS-based metabolomics data. Starting with raw MS data, MetaboliteDetector detects and subsequently identifies potential metabolites. Moreover, a comparative analysis of a large number of chromatograms can be performed in either a targeted or nontargeted approach. It automatically determines appropriate quantification ions and performs an integration of single ion peaks. The analysis results can directly be visualized with a principal component analysis. Since the manual input is limited to absolutely necessary parameters, the program is also usable for the analysis of high-throughput data. However, the intuitive graphical user interface of MetaboliteDetector additionally allows for a detailed examination of a single GC/MS chromatogram including single ion chromatograms, recorded mass spectra, and identified metabolite spectra in combination with the corresponding reference spectra obtained from a reference library. MetaboliteDetector is able to import GC/MS data in NetCDF and FastFlight format.
Provides an implementation of the cross-contribution compensating multiple standard normalization (CCMN) method, as well as other normalization algorithms. crmn is an R package including the CCMN algorithm, which is applicable to data sets from randomized experiments where the nature of the biological effect is known and the systematic error is monitored by internal standards (ISs).
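For orientation, the simplest form of internal-standard normalization divides every analyte in a sample by the intensity of one internal standard measured in that same sample. The sketch below shows only this single-standard baseline. It is deliberately not the CCMN algorithm, which uses multiple standards and compensates for cross-contribution from the analytes; the function name and the example data are hypothetical.

```python
def normalize_to_internal_standard(samples, standard):
    """Express each analyte as a ratio to one internal standard.

    samples:  dict mapping sample name -> dict of feature -> intensity.
    standard: name of the internal standard feature in every sample.
    """
    normalized = {}
    for name, features in samples.items():
        reference = features[standard]
        if reference <= 0:
            raise ValueError(f"internal standard not detected in {name!r}")
        normalized[name] = {
            feature: value / reference
            for feature, value in features.items()
            if feature != standard  # the standard itself is not reported
        }
    return normalized


# Hypothetical data: sample "s2" was injected at twice the amount of "s1".
samples = {
    "s1": {"IS": 100.0, "alanine": 50.0, "glucose": 400.0},
    "s2": {"IS": 200.0, "alanine": 100.0, "glucose": 800.0},
}
ratios = normalize_to_internal_standard(samples, standard="IS")
```

A single standard corrects only for errors that affect all analytes in the same way as the standard, which is the limitation that multi-standard approaches aim to address.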
Integrates metabolomic data from multiple batches and supports visualization and statistical analysis of quantitative mass spectrometry-based omics data from both non-targeted and targeted approaches. StatTarget uses the QC-RFSC algorithm to remove unwanted intra- and inter-batch variation. This tool aims to improve the quality of quantitative mass spectrometry-based omics data. It can be extended to other biological data such as protein or peptide expression data.
Normalizes the uploaded data using twelve different well-known normalization methods and compares the resulting data based on quantitative and qualitative parameters. Normalyzer is a completely automated online tool for data normalization. There are no parameters to configure and no scripts to install.
Permits within- and between-batch correction of liquid chromatography-mass spectrometry (LC-MS) metabolomics data. batchCorr is an R package implementing an approach that includes multiple algorithms, developed to overcome some of the measurement errors in LC-MS metabolomics. It introduces two methods: between-batch feature alignment and within-batch cluster-based drift correction. The provided algorithms can be used either alone or in combination to suit any particular analytical situation.
Serves for large-scale metabolomics data normalization and integration. MetNormalizer is based on a machine learning method, support vector regression (SVR). It can remove intra-batch and inter-batch variation arising during liquid chromatography-mass spectrometry (LC-MS) analysis and enhance the power of statistical analysis for biomarker discovery. This package is designed to perform data processing using SVR normalization.
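QC-based tools of this kind model how a feature's signal drifts with injection order in repeated QC samples and divide that trend out of every injection. The sketch below illustrates the principle for a single feature, using plain linear interpolation between neighbouring QC injections in place of SVR or any other machine learning model. It is a simplified stand-in, not the MetNormalizer implementation; the function name and the run data are hypothetical, and QC intensities are assumed to be positive.

```python
from bisect import bisect_left


def qc_drift_correct(order, intensity, is_qc):
    """Correct one feature for signal drift using QC injections.

    order:     injection order of each run (numbers).
    intensity: measured intensity of the feature in each run.
    is_qc:     True where the run is a QC injection.
    The drift trend is interpolated linearly between QC runs (an assumed
    simplification) and rescaled to the mean QC intensity.
    """
    qc_points = sorted((o, y) for o, y, q in zip(order, intensity, is_qc) if q)
    if len(qc_points) < 2:
        raise ValueError("at least two QC injections are required")
    qc_orders = [o for o, _ in qc_points]
    qc_values = [y for _, y in qc_points]
    qc_mean = sum(qc_values) / len(qc_values)

    def expected(o):
        # Outside the QC range, hold the nearest QC value constant.
        if o <= qc_orders[0]:
            return qc_values[0]
        if o >= qc_orders[-1]:
            return qc_values[-1]
        i = bisect_left(qc_orders, o)  # qc_orders[i - 1] < o <= qc_orders[i]
        x0, x1 = qc_orders[i - 1], qc_orders[i]
        y0, y1 = qc_values[i - 1], qc_values[i]
        return y0 + (y1 - y0) * (o - x0) / (x1 - x0)

    return [y * qc_mean / expected(o) for o, y in zip(order, intensity)]


# Hypothetical run: the signal decays linearly over five injections.
order = [1, 2, 3, 4, 5]
intensity = [100.0, 90.0, 80.0, 70.0, 60.0]
is_qc = [True, False, True, False, True]
corrected = qc_drift_correct(order, intensity, is_qc)
```

In this toy run all five injections are restored to the same level once the trend is divided out. Real drift is rarely linear between QC injections, which is why published tools fit smoother or more flexible models.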