Computational protocol: Proteome remodelling by the stress sigma factor RpoS/σS in Salmonella: identification of small proteins and evidence for post-transcriptional regulation

Protocol publication

[…] Raw data were analysed with MaxQuant, using the Andromeda search engine. The MS/MS spectra were searched against the Salmonella Typhimurium strain 14028s UniProt database containing 5,369 proteins, and against the contaminant file included in MaxQuant. The digestion mode was set to trypsin, and a maximum of two missed cleavages was allowed. N-terminal acetylation and methionine oxidation were set as variable modifications and cysteine carbamidomethylation as a fixed modification. Protein identification required at least one unique peptide per protein group, and each peptide was used only once in the protein identification process, as enforced by the Razor protein FDR parameter. The minimum peptide length was set to 7 amino acids, and the required false discovery rate was set to 1% at the peptide and protein levels. The main search peptide tolerance was set to 4.5 ppm and the MS/MS match tolerance to 20 ppm. The second peptides option was enabled to identify co-fragmentation events, and match between runs was enabled with a match time window of 0.7 min and an alignment time window of 20 min. Quantification was performed using the XIC-based LFQ algorithm, with the Fast LFQ mode as described in ref. . Unique and razor peptides, including modified peptides, with at least 2 ratio counts were accepted for quantification. [...]
The output protein group file was imported into Perseus, the companion software of MaxQuant, for data filtering and statistical tests. First, contaminants, reverse identifications, and proteins only identified by site were excluded from further analysis, and a categorical annotation was applied to create two sample groups, one per bacterial strain, each in triplicate. Second, LFQ intensities were log2 transformed. A validation filter was then applied: a protein was included in the final list only if it was identified in at least two replicates of at least one sample group.
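The filtering steps described above can be illustrated with a minimal sketch in plain Python. It is not the Perseus implementation: the row layout, the `flagged` field (standing in for the contaminant, reverse, and only-identified-by-site markers), the group names `wt` and `mutant`, and the toy intensities are all hypothetical, and an LFQ intensity of 0 is assumed to mean "not quantified".

```python
import math

def log2_lfq(values):
    """Log2-transform LFQ intensities; 0 (assumed 'not quantified') becomes None."""
    return [math.log2(v) if v > 0 else None for v in values]

def valid_count(values):
    """Number of replicates with a quantified (non-missing) value."""
    return sum(v is not None for v in values)

def keep_protein(row, min_valid=2):
    """Keep unflagged proteins quantified in >= min_valid replicates of at least one group."""
    if row["flagged"]:
        return False
    return any(valid_count(row[group]) >= min_valid for group in ("wt", "mutant"))

# Hypothetical toy rows: three replicates per strain, raw LFQ intensities.
rows = {
    "protA": {"flagged": False, "wt": [1.2e7, 1.1e7, 1.3e7], "mutant": [0, 0, 9.0e6]},
    "protB": {"flagged": False, "wt": [0, 5.0e6, 0], "mutant": [0, 4.0e6, 0]},
    "protC": {"flagged": True, "wt": [2.0e7, 2.1e7, 1.9e7], "mutant": [2.0e7, 2.2e7, 2.1e7]},
}

# Transform every group, then apply the validation filter.
logged = {
    name: {"flagged": r["flagged"], "wt": log2_lfq(r["wt"]), "mutant": log2_lfq(r["mutant"])}
    for name, r in rows.items()
}
kept = [name for name, r in logged.items() if keep_protein(r)]
print(kept)  # ['protA']
```

Here protB is dropped because neither strain has two quantified replicates, and protC is dropped because it carries an exclusion flag.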
Third, statistical analysis of the proteome adaptation between the two bacterial strains was performed on the 2444 filtered proteins. To this end, we analysed and compared our dataset with (SI approach) and without (AI approach) missing values. Missing values for LFQ intensities were imputed, that is, replaced by random LFQ intensities drawn from a normal distribution at the low detection level (Supplementary Fig. ). Imputed values are highlighted in yellow in Supplementary Datasets  and . In both cases, two-sided t-tests of the log2-transformed LFQ intensities, with permutation-based FDR calculation at 5%, 1% and 0.1% and S0 = 1, were used to determine different degrees of statistical significance. This statistical process is the basis of the proteomic comparison between the two bacterial strains, which is represented by the two volcano plots of protein difference values against negative log10-transformed p-values from the two-sided t-test (Supplementary Figs  and ). Proteins detected in fewer than two replicates of one strain and in at least two replicates of the other strain were designated as “exclusive” to the latter strain. With the SI approach, “exclusive” proteins were considered significant (Supplementary Dataset ). The SI approach yielded a list of 299 exclusive proteins (116 in the wild type strain and 183 in the ΔrpoS mutant) and three statistically significant sets of differentially abundant proteins. RpoS was found exclusively in the wild type strain, consistent with the deletion of the rpoS gene in the mutant, and was thus excluded from the final list of σS-regulated proteins (Fig. ). For further analyses, changes in protein abundance were considered significant only when they met the threshold of p-value < 0.05 (-log10(p-value) > 1.3, Supplementary Dataset ).
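The "exclusive" designation and the low-intensity imputation described above can be sketched as follows. This is an illustration under stated assumptions, not the Perseus code: the text does not give the imputation parameters, so the down-shift of 1.8 and width of 0.3 (both in units of the observed standard deviation) and the per-sample-column imputation are assumptions, and the function names and toy values are hypothetical.

```python
import random
import statistics

def classify(wt, mutant, min_valid=2):
    """Label a protein from its replicate coverage; None marks a missing value."""
    wt_ok = sum(v is not None for v in wt) >= min_valid
    mutant_ok = sum(v is not None for v in mutant) >= min_valid
    if wt_ok and mutant_ok:
        return "quantified in both"
    if wt_ok:
        return "exclusive to wild type"
    if mutant_ok:
        return "exclusive to mutant"
    return "filtered out"

def impute_low(column, rng, shift=1.8, width=0.3):
    """Fill missing log2 intensities of one sample with draws from a narrowed,
    down-shifted normal distribution (shift and width are assumed values)."""
    observed = [v for v in column if v is not None]
    mean = statistics.mean(observed)
    sd = statistics.stdev(observed)
    return [
        v if v is not None else rng.gauss(mean - shift * sd, width * sd)
        for v in column
    ]

# Hypothetical log2 LFQ values: three replicates per strain for one protein.
label = classify([20.1, None, 20.3], [None, None, 19.0])
print(label)  # exclusive to wild type

# Hypothetical sample column holding one value per protein.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
filled = impute_low([20.0, 22.0, 24.0, None], rng)
print(filled[:3])  # [20.0, 22.0, 24.0]
```

Observed values pass through unchanged; only the missing entry is replaced, by a draw centred well below the mean of the observed values in that sample.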
This yielded a final list of 806 significant proteins showing differential abundance between the wild type and ΔrpoS strains, of which 400 were selected only with a 0.5% FDR, 223 with 0.5 and 0.1% FDR, and 183 with 0.5, 0.1 and 0.01% FDR (Fig. , Supplementary Dataset ). The AI approach, used to evaluate the significance of the 299 exclusive proteins, yielded a final list of 134 statistically significant (p-value < 0.05) exclusive proteins, including RpoS itself (Fig. , Supplementary Dataset ). In total, the abundance of 939 proteins (133 exclusive and 806 differentially abundant) was regulated by σS (Fig. , Supplementary Dataset ). […]
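The protein counts reported above can be cross-checked with a few lines of arithmetic. The variable names are hypothetical, and reading the 133 exclusive proteins as the 134 significant ones minus RpoS is an inference from the text, not something it states outright.

```python
import math

# Differentially abundant proteins, split by how many FDR thresholds they passed.
only_loosest = 400
two_thresholds = 223
all_three = 183
differential = only_loosest + two_thresholds + all_three

# Exclusive proteins from the SI approach: wild type plus mutant.
exclusive_si = 116 + 183

# Significant exclusive proteins from the AI approach, with RpoS removed
# (assumption: this is why the text reports 133 rather than 134).
exclusive_ai = 134 - 1

total = exclusive_ai + differential

# The p-value cut-off of 0.05 expressed on the volcano-plot axis.
threshold = -math.log10(0.05)

print(differential, exclusive_si, total, round(threshold, 2))  # 806 299 939 1.3
```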

Pipeline specifications

Software tools: MaxQuant, Andromeda, Perseus
Application: MS-based untargeted proteomics
Organisms: Salmonella enterica subsp. enterica serovar Typhimurium str. SL1344, Bacteria, Homo sapiens, Salmonella enterica subsp. enterica serovar Typhimurium
Chemicals: Inositol, Ethanolamine