Computational protocol: Global Analysis and Comparison of the Transcriptomes and Proteomes of Group A Streptococcus Biofilms

Protocol publication

[…] RNA sequencing datasets in FastQ format were assessed for quality using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were trimmed and Illumina adapters were clipped using Trimmomatic v 0.32 with a leading and trailing minimum score of 3 and a 4-base sliding window minimum score of 15, which resulted in an average of 99.98% of reads surviving (range, 99.93% to 99.99%). Reads were mapped to the GAS MGAS5005 genome (NC_007297.1; NCBI) using Bowtie2 v 2.2.4 run in end-to-end mode with default settings, giving an average overall alignment rate of 98.80% (range, 95.72% to 99.38%). Transcript abundances were calculated in fragments per kilobase per million mapped reads (FPKM) using Cufflinks v 2.2.1 with a ribosomal masking file for all 5S, 16S, 23S, and tRNA loci (NC_007297.1.gff; NCBI). Cuffdiff, a program within the Cufflinks package, was used to calculate differential expression values; genes with an FDR-adjusted P value (q value) of less than 0.01 were considered differentially expressed. Operon structure was predicted from the resulting Bowtie2 alignment files using Rockhopper v 2.03. [...]

The MS datasets were searched against an S. pyogenes serotype M1 database (UniProt) using the Andromeda search engine from the MaxQuant software package. A bottom-up approach was employed, and peptides were quantified by label-free quantification (LFQ) in MaxQuant, whose LFQ values are derived from MS1 peak intensity (extracted ion current). Protein abundance profiles were assembled using the maximum possible information from MS signals, given that the presence of quantifiable peptides varies from sample to sample. Permutation-based methods for calculating q values and global FDRs were applied. Search results were filtered with a false-discovery-rate cutoff of 0.01.
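The read-processing steps of the RNA sequencing workflow described above (trimming, alignment, abundance estimation) can be sketched as command lines assembled in Python. This is only an illustrative sketch, not the authors' actual invocation: the executable names, the single-end mode, the adapter file and its `2:30:10` clipping settings, and all file paths are assumptions, since the source gives only the quality thresholds, the end-to-end alignment mode, and the use of a masking file.

```python
import shlex
from typing import List


def trimmomatic_cmd(reads_in: str, reads_out: str, adapters: str) -> List[str]:
    """Trimmomatic call using the thresholds quoted in the protocol."""
    return [
        "trimmomatic", "SE",                 # assumption: wrapper name, single-end reads
        reads_in, reads_out,
        f"ILLUMINACLIP:{adapters}:2:30:10",  # assumption: adapter file and clip settings
        "LEADING:3",                         # leading minimum score of 3
        "TRAILING:3",                        # trailing minimum score of 3
        "SLIDINGWINDOW:4:15",                # 4-base window, minimum score of 15
    ]


def bowtie2_cmd(index_prefix: str, reads: str, sam_out: str) -> List[str]:
    """Bowtie2 in end-to-end mode, all other settings left at their defaults."""
    return ["bowtie2", "--end-to-end", "-x", index_prefix, "-U", reads, "-S", sam_out]


def cufflinks_cmd(sorted_alignments: str, mask_gff: str, out_dir: str) -> List[str]:
    """Cufflinks with a mask file so rRNA/tRNA loci are ignored for abundance."""
    # Cufflinks expects coordinate-sorted alignments; sorting is not shown here.
    return ["cufflinks", "-M", mask_gff, "-o", out_dir, sorted_alignments]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in (
        trimmomatic_cmd("sample.fastq", "sample.trimmed.fastq", "adapters.fa"),
        bowtie2_cmd("MGAS5005_index", "sample.trimmed.fastq", "sample.sam"),
        cufflinks_cmd("sample.sorted.bam", "rRNA_tRNA_mask.gff", "cufflinks_out"),
    ):
        print(shlex.join(cmd))
```

Building each command as a list keeps the quoted thresholds visible in one place and lets the same lists be passed to `subprocess.run` once the tools are installed.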
Because LC-MS/MS was performed on the early stationary proteomic samples using a different mass spectrometer, we were unable to include this time point in the LFQ analysis with the rest of the samples. Data from the early stationary time point were instead analyzed in a second, separate MaxQuant LFQ analysis and therefore could not be compared directly with the proteomic data from the other time points. Perseus v 1.5.1.6, a software package for shotgun proteomics data analysis (http://www.perseus-framework.org/), was used to calculate differential expression from the resulting LFQ intensity values. Differential expression values with a false-discovery-rate-adjusted P value (q value) of less than 0.01 were considered significant. […]
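To make the q value threshold concrete, the following sketch computes a log2 fold change from LFQ intensities and converts a list of P values to q values before applying the 0.01 cutoff. It is a simplified illustration rather than a reproduction of Perseus: the source does not state which correction Perseus applied for this step, so the Benjamini-Hochberg procedure is used here as a stand-in, and the function names and example numbers are hypothetical.

```python
import math
from typing import List, Sequence


def log2_fold_change(lfq_a: Sequence[float], lfq_b: Sequence[float]) -> float:
    """Difference of mean log2 LFQ intensities (condition B minus condition A)."""
    mean_a = sum(math.log2(x) for x in lfq_a) / len(lfq_a)
    mean_b = sum(math.log2(x) for x in lfq_b) / len(lfq_b)
    return mean_b - mean_a


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg q values (stand-in for the unspecified correction)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P value
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def significant(names: Sequence[str], pvalues: Sequence[float],
                cutoff: float = 0.01) -> List[str]:
    """Names whose q value is below the cutoff (0.01 in the protocol)."""
    qvalues = benjamini_hochberg(pvalues)
    return [n for n, q in zip(names, qvalues) if q < cutoff]


if __name__ == "__main__":
    # Hypothetical proteins and P values, for illustration only.
    print(log2_fold_change([100.0, 100.0], [400.0, 400.0]))
    print(significant(["protA", "protB", "protC", "protD"],
                      [0.001, 0.01, 0.02, 0.5]))
```

In practice the P values would come from a statistical test on replicate LFQ intensities; they are passed in directly here to keep the sketch free of external dependencies.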

Pipeline specifications

Software tools: Andromeda, MaxQuant, Perseus
Application: MS-based untargeted proteomics
Organisms: Streptococcus pyogenes