Computational protocol: The Rab-binding Profiles of Bacterial Virulence Factors during Infection

Protocol publication

[…] The data were processed using MaxQuant (version […]), and peptides were identified by matching MS/MS spectra against reference human (UniProt, downloaded on 19/01/2015) and L. pneumophila strain 130b (ORF extraction of the draft genome, Schroeder et al.) proteomes using the Andromeda search engine. N-terminal acetylation and methionine oxidation were selected as variable modifications. No fixed modifications were set. Reference proteomes were digested in silico using the trypsin/P setting, whereby cleavage was allowed after arginine or lysine residues, but only if the residue was not followed by a proline. Light (+28 Da) and heavy (+32 Da) dimethyl-labeled lysines and N termini were used for quantification by a built-in algorithm in MaxQuant. Up to two missed cleavages were allowed. The false discovery rate was set to 0.01 for peptides, proteins, and sites. All other parameters were as pre-set for the software.

The data were further processed using Perseus (version […]). Samples from the same cell line were processed together. Reverse hits and hits identified only by site were removed. Proteins identified with at least 1 unique and 1 razor peptide were included for further analysis. MS/MS spectra of proteins identified by a single unique peptide are shown in Supplementary MS Spectra. Light (Bio) and heavy (K/A) intensities were log2-transformed. Replicates were grouped together, and at least two valid values across three (SAP/TAP, THP-1, lysis buffers, formaldehyde concentrations and LidA experiments) or four (crosslinker reactivity experiment) replicates were required in at least one group as the threshold for a positive protein identification. No per-sample unique peptide threshold was applied, so as to identify as many potential interactors as possible and not exclude proteins prematurely. Missing log2 intensity values were imputed for each sample individually using a downshifted normal distribution (1.8 downshift, 0.3 width) as an estimate of that sample's intensity detection limit.
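The valid-value filter and the per-sample imputation described above can be sketched in a few lines of Python. This is a minimal illustration and not the Perseus implementation: the function names are hypothetical, missing values are represented as None, and the downshift and width are assumed to be expressed in units of the standard deviation of each sample's observed log2 intensities.

```python
import math
import random
import statistics


def log2_or_missing(intensity):
    """Return log2(intensity), or None when the raw intensity is absent or zero."""
    if intensity is None or intensity <= 0:
        return None
    return math.log2(intensity)


def passes_valid_value_filter(groups, min_valid=2):
    """Keep a protein if at least one replicate group has >= min_valid observed values.

    `groups` is a list of replicate lists, one list per group (assumed layout).
    """
    return any(
        sum(value is not None for value in group) >= min_valid
        for group in groups
    )


def impute_sample(log2_values, downshift=1.8, width=0.3, seed=0):
    """Fill None entries of ONE sample with draws from a downshifted normal.

    Assumption: the imputation distribution is centred at
    mean - downshift * sd and has standard deviation width * sd, where mean
    and sd come from the observed values of this sample only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = [v for v in log2_values if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)  # requires at least two observed values
    centre = mean - downshift * sd
    spread = width * sd
    return [
        v if v is not None else rng.gauss(centre, spread)
        for v in log2_values
    ]
```

For example, a sample with observed log2 intensities 10, 12 and 14 has mean 12 and standard deviation 2, so under these assumptions its missing values would be drawn from a normal distribution centred at 8.4 with standard deviation 0.6. Observed values are returned unchanged.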
Enrichment factors (the difference in average log2 intensity between Bio and K/A samples) were calculated, using imputed values where required. Proteins were ranked by enrichment factor within each experiment, yielding the ten top-ranked (Top10) enriched proteins. Heat maps were generated from log2 light and heavy intensities with imputed values removed. Proteins were classified into five categories: Bio-specific (identified only in the Bio sample), Bio-enriched (enrichment factor ≥2), nonspecific (-2≤enrichment factor≤2), K/A-enriched (enrichment factor ≤-2), and K/A-specific (identified only in the K/A sample). These five categories were combined into two broader groups: interactors (Bio-specific and Bio-enriched) and non-interacting proteins (nonspecific, K/A-enriched, and K/A-specific). MS tables are found in Supplementary MS Tables. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD003573. […]
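The enrichment factor, the five-way classification and the ranking can be illustrated with a short Python sketch. The function names and the boolean flags are hypothetical. Two points are assumptions rather than statements from the protocol: a protein counts as "identified" in a sample type when it has at least one non-imputed value there, and a factor of exactly 2 or -2, which the stated ranges assign to two categories at once, is placed in the enriched category.

```python
import statistics

INTERACTOR_CATEGORIES = {"Bio-specific", "Bio-enriched"}


def enrichment_factor(bio_log2, ka_log2):
    """Difference in average log2 intensity between Bio and K/A replicates."""
    return statistics.mean(bio_log2) - statistics.mean(ka_log2)


def classify(factor, seen_in_bio, seen_in_ka, threshold=2.0):
    """Assign one of the five categories described in the text.

    seen_in_bio / seen_in_ka: whether the protein has any non-imputed value
    in that sample type (assumed definition of "identified").
    """
    if seen_in_bio and not seen_in_ka:
        return "Bio-specific"
    if seen_in_ka and not seen_in_bio:
        return "K/A-specific"
    if factor >= threshold:  # boundary value treated as enriched (assumption)
        return "Bio-enriched"
    if factor <= -threshold:
        return "K/A-enriched"
    return "nonspecific"


def is_interactor(category):
    """Collapse the five categories into interactor / non-interacting."""
    return category in INTERACTOR_CATEGORIES


def top_ranked(factors, n=10):
    """Return the n protein identifiers with the highest enrichment factor.

    `factors` maps a protein identifier to its enrichment factor.
    """
    return sorted(factors, key=factors.get, reverse=True)[:n]
```

With replicate log2 intensities of 12, 13, 14 for Bio and 9, 10, 11 for K/A, the enrichment factor is 3, which corresponds to an 8-fold intensity difference, and the protein would fall into the Bio-enriched category and count as an interactor.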

Pipeline specifications

Software tools: MaxQuant, Andromeda, Perseus
Application: MS-based untargeted proteomics
Organisms: Legionella pneumophila
Diseases: Infection, Legionnaires' Disease