Computational protocol: Analytical challenges of untargeted GC MS based metabolomics and the critical issues in selecting the data processing strategy

[…] The full experimental design, procedures, and statistical methods are described in the . The clinical characteristics of the participants have been described previously. In brief, the longitudinal cohort of this study comprised 61 Chinese pregnant women who completed their antenatal care at the First Affiliated Hospital of Chongqing Medical University. Of the 61 participants, 34 had normal glucose tolerance (controls), and 27 met the diagnostic criteria for gestational diabetes mellitus (GDM) based on the International Association of Diabetes and Pregnancy Study Groups recommendations. Blood samples were collected at the scheduled antenatal visits, one in each trimester. Samples were stored at −80°C until analysis.
An enhanced GC-MS method was employed to investigate the longitudinal change of non-esterified fatty acids (NEFAs) and other aromatic metabolites in the maternal plasma of women who developed GDM and women with healthy pregnancies (controls). To enhance the separation of cis- and trans-isomers of mono- and polyunsaturated fatty acid methyl esters, a 100 m long biscyanopropyl/phenylcyanopropyl polysiloxane column was used. EDTA-treated plasma samples were thawed on ice and extracted with methanol/toluene pre-mixed with internal standards. The extracts were derivatised with acetyl chloride solution in sealed round-bottom glass tubes with screw caps. The tubes were then heated and stirred at 100°C for 1 h, converting the NEFAs to their fatty acid methyl esters (FAMEs). After neutralisation with aqueous potassium carbonate solution, the organic layer was recovered and analysed directly by GC-MS. GC-MS data were acquired with an Agilent GC-MS system in splitless mode. A Restek Rtx®-2330 column (90% biscyanopropyl/10% phenylcyanopropyl polysiloxane) was installed in the system. The column temperature was computer controlled and was ramped from 45°C to 215°C over 65 min.
Data pre-processing was performed in the Agilent MassHunter suite (version 8 of Qualitative Workflows and Profinder), Metabolite Detector (version 2.5), and AMDIS (Automated Mass Spectral Deconvolution and Identification System, version 2.72), and the accuracy of data extraction was compared across these software tools. The data were further processed and analysed with five different normalisation methods (CRMN, EigenMS, PQN, SVR and LOWESS). The performance of the normalisation methods and the marker candidates identified with each were investigated. Principal component analysis (PCA) was performed with EZinfo (version 3.0.3). Multilevel PCA (mPCA) was performed using mixOmics (version 6.1.3). Pareto scaling was used in PCA and mPCA modelling. Relative log abundance (RLA) plots were drawn with the RlaPlots function of the package metabolomics (version 0.1.4). Receiver operating characteristic (ROC) analysis was performed with the colAUC function of caTools (version 1.17.1). Binomial logistic regression was performed with the glm function of R (version 3.3.3). […]
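To make two of the steps named above concrete, the following is a minimal Python/NumPy sketch of probabilistic quotient normalisation (PQN) and Pareto scaling. It is an illustration only, not the implementation used in the study, which relied on the tools and R packages listed above. The function names `pqn_normalise` and `pareto_scale` are hypothetical. The sketch assumes a complete matrix of strictly positive intensities with samples in rows and metabolite features in columns, uses the feature-wise median across all samples as the reference profile, and omits the total-intensity normalisation that some descriptions of PQN apply first.

```python
import numpy as np


def pqn_normalise(X, reference=None):
    """Probabilistic quotient normalisation (sketch).

    X: 2-D array, samples in rows, features in columns.
    Assumes strictly positive values and no missing data.
    reference: optional 1-D reference profile; defaults to the
    feature-wise median across all samples (an assumption here).
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.median(X, axis=0)
    # Quotient of every feature against the reference profile.
    quotients = X / reference
    # The median quotient per sample estimates its dilution factor.
    dilution = np.median(quotients, axis=1)
    return X / dilution[:, None]


def pareto_scale(X):
    """Pareto scaling (sketch): mean-centre each feature and divide
    by the square root of its sample standard deviation.

    Assumes no feature is constant (standard deviation > 0).
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)
    return centred / np.sqrt(sd)


if __name__ == "__main__":
    # Toy data: the second sample is the first at twice the concentration.
    intensities = np.array([
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],
        [1.0, 2.0, 3.0, 4.0],
    ])
    print(pqn_normalise(intensities))
    print(pareto_scale(intensities))
```

In the toy data the second sample differs from the others only by a constant factor, which is the kind of sample-wide dilution effect PQN is meant to remove. Pareto scaling reduces the dominance of high-abundance features less aggressively than unit-variance scaling, which is one common reason for choosing it ahead of PCA.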

