Computational protocol: 13C NMR Metabolomics: Applications at Natural Abundance

Protocol publication

[…] Synthetic mixtures with known metabolite compositions were constructed to test the limits of detection of the probe and our ability to correctly identify known compounds. Two groups of mixtures (groups A and B, five replicates each) were each made using 20 common synthetic metabolites ranging in concentration from 1 to 5 mM (Table ). The first 10 metabolites were at equal concentrations in all mixtures in both groups, and the remaining 10 metabolites differed between groups, with half higher in group A and half higher in group B (Table ). Rather than prepare a uniform batch to eliminate concentration differences between samples, each metabolite was pipetted individually into each mixture. This was done to increase the random mixture-to-mixture variation of individual metabolites, both within and between groups, due to pipetting errors. This random variation between samples allowed us to develop and validate the statistical methods (i.e., STOCSY) used in this study. Variation was not controlled in this experiment and is assumed to be random. Each sample was brought to 50 μL using 99% D2O (Cambridge Isotope Laboratories), of which 40 μL was pipetted into a 1.5 mm (OD) NMR tube (Norell). [...]

One-dimensional 1H and 13C spectra were collected on an Agilent VNMRS-600 spectrometer using a custom 1.5 mm 13C high-temperature superconducting (HTS) probe. Synthetic mixture and fly extract 1H 1D data were collected in ∼2.5 min using a simple pulse sequence with presaturation of residual water, a spectral width of 12 ppm (7183.9 Hz), an observe frequency of 599.68 MHz, a 2.0 s relaxation delay, a 45° pulse, and a 2.3 s acquisition time. Mouse serum 1H data were collected in 18 min using a Carr–Purcell–Meiboom–Gill (CPMG) sequence to remove the protein signal contribution, resulting in a flat baseline.
Mouse data were recorded with a spectral width of 16 ppm (9615.4 Hz), a 2.0 s relaxation delay, a 90° initial pulse, a train of 124 180° pulses with a 1 ms interpulse delay, and a 2.0 s acquisition time. All 13C spectra were collected under conditions that favor nuclei with short T1 relaxation times to maximize overall sensitivity and minimize measurement time: a 60° pulse with a 0.1 s relaxation delay and a 0.8 s acquisition time. The total time for each 13C spectrum was ∼2 h, with a 212 ppm (32051.3 Hz) spectral window, a carrier frequency set at 98.0 ppm, and an observe frequency of 150.79 MHz. The mouse serum 13C CPMG sequence used a 90° excitation pulse followed by a train of 18 180° pulses with a 1 ms interpulse delay. All 13C spectra were recorded using continuous 1H decoupling at 599.68 MHz with a power of 37 dB using WALTZ-16.

All NMR spectra were processed in NMRPipe. All 1H spectra were processed using a cosine-squared window function, zero-filled 2×, Fourier-transformed, and baseline-corrected. 13C spectra were processed using an exponential window function with a 2 Hz line broadening, zero-filled 2×, Fourier-transformed, and baseline-corrected. The synthetic mixture and fly extract 1H spectra were referenced to TSP (0.0 ppm); mouse serum 1H spectra were referenced to the lactate peak at 4.1 ppm, because TSP binds to proteins. 13C spectra were referenced to the glucose anomeric carbon peak at 98.64 ppm.

Fourier-transformed and referenced spectra were imported into an in-house MATLAB Metabolomics Toolbox developed in our laboratory. This Toolbox is a collection of MATLAB scripts that includes various analytical tools for the alignment, normalization, scaling, and multivariate analysis of complex data. The Metabolomics Toolbox has several options for alignment, but we found that the star alignment algorithm, in which an individual spectrum is chosen as a “star” to which all other spectra are aligned, gave the best results in this study.
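
The star alignment idea described above can be illustrated with a minimal Python sketch. This is not the Metabolomics Toolbox code (which is written in MATLAB and whose alignment details are not given here). It assumes the simplest possible variant: each spectrum receives one global integer shift, chosen to maximize its overlap with the star spectrum. The function names `best_shift` and `star_align` and the `max_shift` parameter are hypothetical.

```python
import numpy as np

def best_shift(reference, spectrum, max_shift):
    """Integer shift (in points) that maximizes overlap with the reference."""
    shifts = range(-max_shift, max_shift + 1)
    # np.roll wraps around the edges; this is acceptable in a sketch as long
    # as the spectrum edges contain only baseline.
    scores = [np.dot(reference, np.roll(spectrum, s)) for s in shifts]
    return shifts[int(np.argmax(scores))]

def star_align(spectra, star_index=0, max_shift=20):
    """Align every spectrum (one per row) to the chosen 'star' spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    star = spectra[star_index]
    aligned = np.empty_like(spectra)
    shifts = []
    for i, spec in enumerate(spectra):
        s = best_shift(star, spec, max_shift)
        aligned[i] = np.roll(spec, s)
        shifts.append(s)
    return aligned, shifts

# Synthetic demonstration: two Gaussian peaks, displaced by known offsets.
x = np.arange(500)
base = (np.exp(-0.5 * ((x - 150) / 3.0) ** 2)
        + 0.6 * np.exp(-0.5 * ((x - 320) / 4.0) ** 2))
true_offsets = [0, 5, -7, 3]
spectra = np.array([np.roll(base, k) for k in true_offsets])

aligned, shifts = star_align(spectra, star_index=0, max_shift=20)
```

In this toy case each recovered shift is the negative of the offset that was applied, so every row of `aligned` coincides with the star spectrum. Real spectra usually need segment-wise rather than global shifts, which this sketch does not attempt.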
For the statistical correlations, we binned the 1H spectra using an open source optimized bucketing algorithm described by Sousa et al. 13C spectra were peak picked by setting all resonances below 2× the standard deviation of all spectra to zero, finding local maxima, and then merging peaks within 0.2 ppm together.

Optimal normalization and scaling methods were specific for each data set. 13C and 1H spectra of the synthetic mixtures and mouse serum were normalized using probabilistic quotient normalization (PQN) and scaled using Pareto scaling. PQN can be used with NMR spectra in the absence of a reliable internal standard. Pareto scaling is similar to autoscaling except that it stays closer to the original data set and is less susceptible to noise. This prevents the high-intensity peaks from dominating our models. 1H fly spectra were normalized to TSP (an internal standard) and scaled using range scaling. Range scaling is also similar to autoscaling in that it allows all metabolites to become equally important. Unlike autoscaling, it uses the biological range as a scaling factor. This is useful for exploratory analysis, as in our experiments, where multivariate statistical analyses were performed with no previous knowledge of the contents of our fly mixtures. Range scaling has been shown to give biologically sensible results.

Principal component analysis (PCA) was conducted on the 1D 1H and 13C data sets of the synthetic mixtures and fly samples. Partial least-squares-discriminant analysis (PLS-DA) was conducted on the 1D 1H and 13C data sets of both the fly samples and mouse serum. All NMR raw data, processing scripts, and MATLAB code were deposited in the Metabolomics Workbench database (http://www.metabolomicsworkbench.org/) supported by the NIH Common Fund. […]
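
The normalization and scaling steps named above can be sketched in Python as follows. This is an illustration of the standard textbook definitions, not the authors' MATLAB code. It assumes a samples × variables intensity matrix and a median spectrum as the PQN reference. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def pqn_normalize(X, reference=None):
    """Probabilistic quotient normalization of a samples x variables matrix."""
    X = np.asarray(X, dtype=float)
    # Step 1: total-area normalization puts all spectra on a comparable scale.
    X = X / X.sum(axis=1, keepdims=True)
    # Step 2: reference spectrum (assumed here: median across samples).
    if reference is None:
        reference = np.median(X, axis=0)
    # Step 3: quotients against the reference, skipping empty reference points.
    mask = reference > 0
    quotients = X[:, mask] / reference[mask]
    # Step 4: the median quotient is the most probable dilution factor.
    factors = np.median(quotients, axis=1, keepdims=True)
    return X / factors

def pareto_scale(X):
    """Mean-center, then divide by the square root of the standard deviation."""
    X = np.asarray(X, dtype=float)
    sd = X.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant variables unscaled
    return (X - X.mean(axis=0)) / np.sqrt(sd)

def range_scale(X):
    """Mean-center, then divide by the observed range (max minus min)."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # leave constant variables unscaled
    return (X - X.mean(axis=0)) / span

# Synthetic demonstration: one 20-variable profile at four dilutions.
rng = np.random.default_rng(0)
profile = rng.uniform(1.0, 5.0, size=20)
dilutions = np.array([0.5, 1.0, 2.0, 4.0])
X_pqn = pqn_normalize(dilutions[:, None] * profile)

# Six noisy replicates (5% multiplicative noise, an arbitrary choice).
noisy = profile * (1 + 0.05 * rng.standard_normal((6, 20)))
pareto = pareto_scale(noisy)
ranged = range_scale(noisy)
```

After PQN the four diluted copies of the profile become identical rows. Pareto scaling divides by the square root of the standard deviation, so large peaks are down-weighted less aggressively than with autoscaling, whereas range scaling gives every variable a unit range.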

Pipeline specifications

Software tools: STOCSY, NMRPipe
Application: NMR-based metabolomics
Organisms: Drosophila melanogaster