Computational protocol: Temporal Proteome and Lipidome Profiles Reveal Hepatitis C Virus-Associated Reprogramming of Hepatocellular Metabolism and Bioenergetics

Protocol publication

[…] Both peptide and lipid LC-MS datasets, each defined as the data obtained from a single LC-MS analysis, were processed using the PRISM Data Analysis system, a series of software tools developed in-house (e.g. Decon2LS, VIPER; freely available at […]). The first step involved deisotoping of the raw MS data to give the monoisotopic mass, charge state, and intensity of the major peaks in each mass spectrum. The data were next examined in a 2-D fashion to identify groups of mass spectral peaks that were observed in sequential spectra, using an algorithm that computes a Euclidean distance in n-dimensional space for combinations of peaks. Each group, generally ascribed to one detected species and referred to as a “feature”, is characterized by a median monoisotopic mass, a central normalized elution time (NET), and an abundance estimate computed by summing the intensities of the MS peaks that comprise the entire LC-MS feature.

The identities of detected features in both peptide and lipid LC-MS datasets were initially determined by comparing their measured monoisotopic masses and NETs to the calculated monoisotopic masses and observed NETs of each peptide or lipid in an accurate mass and time (AMT) tag database, within search tolerances of ±5 ppm for monoisotopic mass and ±0.02 NET for elution time. The AMT tag database used for peptide matching was a composite of all previously published Huh-7.5 and human liver tissue MS/MS analyses. In contrast, the lipid AMT tag database was constructed from human plasma, erythrocyte, and lymphocyte lipids. Non-linear chromatographic alignment of LC-MS datasets was performed with the LCMSWARP algorithm during database matching, using the NETs of either peptide or lipid AMT tags as retention time locks.
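The tolerance-based AMT tag lookup described above can be illustrated with a minimal sketch. This is not the VIPER implementation; the `AmtTag` class, the `match_feature` function, and the example tags are hypothetical names and values introduced only for illustration. Only the ±5 ppm and ±0.02 NET tolerances are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    """Hypothetical AMT tag record (not the actual database schema)."""
    name: str
    mass: float  # calculated monoisotopic mass (Da)
    net: float   # observed normalized elution time


def ppm_error(measured: float, reference: float) -> float:
    """Signed mass error of `measured` relative to `reference`, in ppm."""
    return (measured - reference) / reference * 1e6


def match_feature(mass, net, tags, ppm_tol=5.0, net_tol=0.02):
    """Return every tag within both the mass and the NET tolerance."""
    return [
        tag for tag in tags
        if abs(ppm_error(mass, tag.mass)) <= ppm_tol
        and abs(net - tag.net) <= net_tol
    ]


# Made-up example values, chosen only to exercise both tolerances.
tags = [
    AmtTag("A", 1000.00, 0.50),  # about 4 ppm away, NET within 0.02
    AmtTag("B", 1000.01, 0.50),  # about 6 ppm away, rejected on mass
    AmtTag("C", 1000.00, 0.60),  # mass fits, rejected on NET
]
hits = match_feature(1000.004, 0.51, tags)
```

In this sketch a feature may match more than one tag; how such ambiguous matches are resolved is not specified in the text and is left open here.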
The identities of some features that did not match entries in the lipid AMT tag database were determined manually based on accurate mass, isotopic distribution (using the in-house software IsotopicDistributionModeler), and MS/MS information, as previously described.

In regard to peptide data analysis, the abundance ratios (18O/16O) for labeled peptide pairs were computed using an equation as previously reported. All ratios corresponding to peptide sequences that overlapped between multiple protein groups, based upon ProteinProphet results, were removed because the exact protein source of these peptide sequences is ambiguous. After rolling up all remaining quantified peptides into non-redundant protein groups using the ProteinProphet results, the corresponding 18O/16O intensity data were loaded into Rosetta Elucidator (Rosetta Biosoftware, Seattle, WA) and an error model for 18O-labeled FTICR data was applied as previously described. Ratios from multiple observations of the same protein across the 5 SCX fractions were then rolled up to compute a final protein abundance ratio for all proteins identified in a given sample and to identify those proteins exhibiting statistically significant (p≤0.05) changes in abundance compared to the control sample.

For lipid analysis, after chromatographic alignment and database matching, intensity normalization was applied using the expectation maximization algorithm. Briefly, this algorithm analyzes the histogram of log ratios of intensities of features common to two or more datasets and finds the peak apex of this distribution by assuming that the histogram is a mixture of a normal density corresponding to unchanged features and a uniform background density corresponding to changed features. The expectation maximization algorithm estimates the normal and uniform components of the histogram, and the resulting intensity shift is then applied to all features in the aligned dataset. It is important to note that all lipid features (i.e. both identified and unidentified) were considered during intensity normalization.

The set of normalized lipid features (both identified and unidentified) was then transformed to log2 scale, and comparative data analysis was performed on two levels. The first level considered only complete data, i.e. those lipid features detected in every LC-MS dataset. The second level allowed for some missing data; a feature was required to be observed in both LC-MS replicates of two out of three conditions (mock, HCVcc, and UV-HCVcc). It is important to note that more observations than the required minimum were present for most lipid features within a culture condition. The data matrices corresponding to these two levels were analyzed separately using Matlab, and changes in the lipid profiles as functions of time and condition were determined using analysis of variance (ANOVA, p<0.05). Lipid features that were significantly different by ANOVA were further analyzed using principal component analysis (PCA). To aid visualization in PCA only, abundance values for missing lipid features were estimated as the average abundance of the same feature in the remaining LC-MS datasets. Finally, a table showing the associated p values, average lipid abundances, lipid abundance standard deviations, mass-to-charge (m/z) ratios, NETs, and lipid identities was generated and is provided as supplementary data. […]
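The second-level missing-data rule and the mean-based fill-in used for PCA visualization can be sketched as follows. The original analysis was carried out in Matlab; this Python version is only an illustration, the function names are hypothetical, and it assumes a simplified layout with one value per LC-MS dataset and one condition label per dataset (the time dimension is ignored). Missing observations are represented as NaN or `None`.

```python
import math
from statistics import mean


def is_missing(value):
    """Treat None and NaN as 'feature not observed in this dataset'."""
    return value is None or math.isnan(value)


def passes_missing_data_filter(values, labels, min_complete_conditions=2):
    """Keep a feature if it was observed in every replicate of at least
    `min_complete_conditions` conditions (two of three in the text)."""
    by_condition = {}
    for value, label in zip(values, labels):
        by_condition.setdefault(label, []).append(value)
    complete = sum(
        1 for replicates in by_condition.values()
        if not any(is_missing(v) for v in replicates)
    )
    return complete >= min_complete_conditions


def impute_for_pca(values):
    """Fill missing entries with the mean of the observed entries of the
    same feature. Intended for visualization only, as in the text."""
    observed = [v for v in values if not is_missing(v)]
    fill = mean(observed)  # raises StatisticsError if nothing was observed
    return [fill if is_missing(v) else v for v in values]


# Made-up log2 abundances: two replicates for each of the three conditions.
labels = ["mock", "mock", "HCVcc", "HCVcc", "UV-HCVcc", "UV-HCVcc"]
feature = [1.0, 2.0, 3.0, 4.0, math.nan, 5.0]

keep = passes_missing_data_filter(feature, labels)
filled = impute_for_pca(feature)
```

Here the example feature is complete in the mock and HCVcc replicates, so it passes the filter, and its single missing value is replaced by the mean of the five observed values.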

Pipeline specifications

Software tools DeconTools, VIPER, ProteinProphet
Application MS-based untargeted proteomics
Organisms Classical swine fever virus, Homo sapiens
Diseases Hepatitis C
Chemicals Citric Acid