Computational protocol: Quantitative Proteomic Profiling Reveals Differentially Regulated Proteins in Cystic Fibrosis Cells

Protocol publication

[…] Tandem mass spectra were extracted from the Xcalibur data system format (.raw) into MS2 format using RawXtract 1.9.9.2. The MS/MS spectra were searched with the ProLuCID algorithm against the human SwissProt database (downloaded March 2014), which was concatenated to a decoy database in which the sequence of each entry in the original database was reversed. The search parameters included a 10 ppm peptide precursor mass tolerance and a 0.6 Da fragment mass tolerance for spectra acquired in the ion trap; carbamidomethylation of cysteine was defined as a fixed modification. The search space included all fully tryptic and semitryptic peptide candidates with a length of at least six amino acids. The maximum number of internal missed cleavages was left unlimited, so that all cleavage points were considered. ProLuCID outputs were assembled and filtered using the DTASelect 2.0 program, which groups related spectra by protein and removes those that do not pass basic data-quality criteria. DTASelect 2.0 combines XCorr and ΔCn measurements using a quadratic discriminant function to compute a confidence score and achieve a user-specified false discovery rate (1% for the current study). We accepted only those proteins that were supported by two or more lines of evidence.

For label-free quantification, normalized spectral abundance factor (NSAF) values were calculated for the proteins in each sample to account for protein size and variability between runs. Briefly, the NSAF for a protein k is the number of spectral counts (SpC, the total number of MS/MS spectra) identifying protein k, divided by the protein length (L), divided by the sum of SpC/L for all N proteins in the experimental design (eq 1):

$$\mathrm{NSAF}_k = \frac{(\mathrm{SpC}/L)_k}{\sum_{i=1}^{N} (\mathrm{SpC}/L)_i} \qquad (1)$$

A critical assumption of the statistical approaches used here is that the data set being analyzed follows a normal (Gaussian) distribution.
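The NSAF definition given above can be sketched in a few lines of Python. This is only an illustration, not the authors' workflow (they report using Microsoft Excel). The function name `nsaf`, the protein identifiers, and the toy counts and lengths are hypothetical, and the sketch assumes the normalization sum is taken over the proteins of a single sample.

```python
from typing import Dict


def nsaf(spectral_counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Return the NSAF of every protein in one sample.

    spectral_counts: protein id -> number of MS/MS spectra (SpC)
    lengths:         protein id -> protein length in amino acids (L)
    """
    # Spectral abundance factor: SpC divided by protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("sample contains no spectral counts")
    # Normalize so that the values of one sample sum to 1.
    return {p: value / total for p, value in saf.items()}


# Hypothetical toy sample, for illustration only.
counts = {"P1": 10, "P2": 20, "P3": 30}
lengths = {"P1": 100, "P2": 400, "P3": 300}
values = nsaf(counts, lengths)
```

Because each SpC is divided by the protein length before normalization, a long protein with many spectra (P2 here) can end up with a lower NSAF than a short protein with fewer spectra (P1).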
After the NSAF values were obtained, their natural logarithm, ln(NSAF), was calculated, and a density plot of the distribution of ln(NSAF) values from the replicates of each condition was generated to assess the normality of the distribution (Figure S1). After establishing that both the CFBE and HBE data sets fit a normal distribution, the data sets were compared statistically to determine the significance of the change between the two groups using Student's t test (two-tailed, unpaired). To determine the abundance of expressed proteins in CFBE cells relative to HBE cells, the data set was first filtered to include only those proteins that were detected in all three replicates of each condition; the ratio of the mean NSAF value from three biological replicates of CFBE cells to the mean NSAF value from three biological replicates of HBE cells was then computed. Proteins were considered to exhibit significant expression changes when $\log_2(\mathrm{NSAF}_{\mathrm{CFBE}}/\mathrm{NSAF}_{\mathrm{HBE}}) \geq 0.58$ with p < 0.05 (overexpressed in CFBE cells) or $\leq -0.58$ with p < 0.05 (underexpressed in CFBE cells); a log2 ratio of 0.58 corresponds to approximately a 1.5-fold change. Because of poor reproducibility, proteins that were identified in both conditions but in fewer than three replicates in each group were excluded from further quantitative analyses. The NSAF, t test, and ratio calculations were performed in Microsoft Excel. The graphs were drawn either in Excel or in the R statistical package (http://www.r-project.org/). […]
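The filtering and significance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Excel procedure: the t test is assumed to be applied to ln(NSAF) values, "detected" is taken to mean an NSAF greater than zero, all function names are hypothetical, and the p-value is obtained by numerically integrating the t density (Simpson's rule) only to keep the example dependency-free; a statistics library routine would normally be used instead.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    s = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t(a, b):
    """Unpaired, equal-variance (Student's) t test; returns (t, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    if se == 0:
        raise ValueError("zero variance in both groups")
    t = (mean(a) - mean(b)) / se
    return t, two_tailed_p(t, df)


def classify(cfbe, hbe, cutoff=0.58, alpha=0.05):
    """Label one protein from its replicate NSAF values in each condition."""
    # Keep only proteins detected in all three replicates of both conditions.
    if len(cfbe) < 3 or len(hbe) < 3 or min(min(cfbe), min(hbe)) <= 0:
        return None
    log2_ratio = math.log2(mean(cfbe) / mean(hbe))
    # Assumption: the t test is run on ln(NSAF), the normalized scale.
    _, p = student_t([math.log(v) for v in cfbe], [math.log(v) for v in hbe])
    if p < alpha and log2_ratio >= cutoff:
        return "over"
    if p < alpha and log2_ratio <= -cutoff:
        return "under"
    return "unchanged"


# Hypothetical replicate NSAF values, for illustration only.
label = classify([0.0040, 0.0042, 0.0038], [0.0020, 0.0021, 0.0019])
```

With the toy values shown, the CFBE mean is twice the HBE mean (a log2 ratio of 1) and the replicates are tightly grouped, so the protein is labeled as overexpressed.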

Pipeline specifications

Software tools: RawConverter, ProLuCID, DTASelect
Application: MS-based untargeted proteomics
Diseases: Cystic Fibrosis