Computational protocol: α-Hydroxybutyrate Is an Early Biomarker of Insulin Resistance and Glucose Intolerance in a Nondiabetic Population

Protocol publication

[…] Extraction of the raw mass spectral data files yielded information that was loaded into a relational database and manipulated without resorting to BLOB manipulation. Once in the database, the information was examined and appropriate QC limits were imposed. Peaks were identified using Metabolon's proprietary peak integration software, and component parts were stored in a separate, specifically designed complex data structure.

The median relative standard deviation (MRSD), a quality assurance metric of quantification and a measure of instrument variability, was determined to be 8% for a panel of 30 internal standards. Overall process variability (i.e., extraction, recovery, resuspension, and instrument performance) for endogenous biochemicals within technical replicate plasma samples was calculated to be 15% MRSD. These MRSD values reflected acceptable levels of variability for the overall process and instrumentation of the analytical platform.

A variety of data curation procedures were carried out to ensure that a high-quality data set was made available for statistical analysis and data interpretation. The QC and curation processes were designed to ensure accurate and consistent identification of true chemical entities, and to remove those representing system artifacts, mis-assignments, and background noise. Metabolon data analysts used proprietary visualization and interpretation software to confirm the consistency of peak identification among the various samples. Library matches for each compound were checked for each sample and corrected if necessary. In addition to rigorous identification, the quality of the automated Metabolyzer integration (the basis of quantitation) was verified for each biochemical.

For QA/QC purposes, a number of additional samples were included with each day's analysis. Briefly, a selection of internal standards was added to every sample immediately prior to injection into the instrument.
These compounds were carefully chosen so as not to interfere with measurement of endogenous compounds. These QC samples were primarily used to evaluate process control for each study. Additionally, a small aliquot of each experimental sample was pooled to serve as a technical replicate for the duration of the run. This technical replicate sample was injected throughout the platform run day and across all run days, allowing variability in quantitation of all consistently detected biochemicals in the experimental samples to be monitored. With this monitoring, a metric of overall process variability was assigned for the platform's performance based on quantitation of metabolites in actual experimental samples (see section). [...]

Data are given as median and [interquartile range]. Classification and Regression Trees (CART), Random Forest (RF), multiple linear regression, correlation, and logistic regression analyses were carried out on untransformed data, whereas log-transformed data were used for t-testing. When data from the NGT, IGT, or IFG categories were used in comparisons for classification by RF, the number of in-bag samples was set to 50% of the smallest subgroup to account for unbalanced sample sizes. For platform screening data and targeted analytical data, we used 50,000 and 1,000 trees, respectively. Random forest analysis was performed using the R package “randomForest”. Partition analysis (JMP) was employed to find the metabolite value that best separated the MFFM value into two groups. Multiple logistic regression tested the independent association of metabolites with the lower tertile of insulin resistance; results are given as the odds ratio and 95% confidence interval (C.I.). Statistical analyses were performed using JMP (JMP, Version 8. SAS Institute Inc., Cary, NC, 1989–2009) and “R” (http://cran.r-project.org/). […]
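The MRSD figures quoted in the protocol can be illustrated with a short sketch. The excerpt does not give the formula, so the version below assumes the usual definition: each compound's relative standard deviation is its sample standard deviation divided by its mean across replicate injections, expressed as a percentage, and the MRSD is the median of those values across the panel. The function names and the intensity values are hypothetical and serve only as an illustration.

```python
from statistics import mean, median, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


def median_rsd(replicates_by_compound):
    """Median of the per-compound RSDs across a panel.

    replicates_by_compound maps a compound name to its measured
    intensities across replicate injections (at least two per compound).
    """
    return median(rsd_percent(v) for v in replicates_by_compound.values())


# Hypothetical intensities: three internal standards, four injections each.
panel = {
    "standard_A": [100.0, 104.0, 96.0, 100.0],
    "standard_B": [50.0, 55.0, 45.0, 50.0],
    "standard_C": [200.0, 202.0, 198.0, 200.0],
}

if __name__ == "__main__":
    for name, values in panel.items():
        print(f"{name}: RSD = {rsd_percent(values):.1f}%")
    print(f"MRSD = {median_rsd(panel):.1f}%")
```

The same two functions would apply to the pooled technical replicate: there, each key would be an endogenous biochemical and each list its intensities across the repeated injections of the pooled sample.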
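The in-bag sample size rule for the unbalanced NGT, IGT, and IFG groups can be sketched as follows. This is one possible reading, not the authors' code: it assumes that every tree draws the same number of samples from each class, equal to 50% of the smallest class, and that sampling is with replacement within each class. In the R package this kind of constraint is usually expressed through the `strata` and `sampsize` arguments of `randomForest()`, although the excerpt does not say which arguments were used. The group sizes and function names below are hypothetical.

```python
import random
from collections import defaultdict


def inbag_size(labels, fraction=0.5):
    """Per-class in-bag size: a fraction of the smallest class, at least 1."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    return max(1, int(fraction * min(counts.values())))


def balanced_inbag_indices(labels, fraction=0.5, rng=None):
    """Draw one tree's in-bag indices, the same number from every class.

    Sampling is with replacement within each class (a stratified
    bootstrap); that choice is an assumption, as the source is silent.
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    n = inbag_size(labels, fraction)
    drawn = []
    for label in sorted(by_class):
        drawn.extend(rng.choices(by_class[label], k=n))
    return drawn


# Hypothetical cohort: the group sizes are made up for illustration.
labels = ["NGT"] * 60 + ["IGT"] * 25 + ["IFG"] * 15

if __name__ == "__main__":
    sample = balanced_inbag_indices(labels, rng=random.Random(0))
    print("per-class in-bag size:", inbag_size(labels))
    print("total in-bag size:", len(sample))
```

Drawing equal numbers from each class keeps the largest group from dominating every tree, which is the stated purpose of the 50% rule.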

Pipeline specifications

Software tools: MetaboLyzer, randomForest
Applications: Miscellaneous, MS-based untargeted metabolomics
Diseases: Cardiovascular Diseases, Diabetes Mellitus, Glucose Intolerance
Chemicals: Glucose