Computational protocol: Comparative 'omics analyses differentiate Mycobacterium tuberculosis and Mycobacterium bovis and reveal distinct macrophage responses to infection with the human and bovine tubercle bacilli

Similar protocols

Protocol publication

[…] Computational analyses were performed on a 32-node Compute Server with Linux Ubuntu (version 12.04.2). Briefly, adapter sequence contamination and paired-end reads of poor quality were removed from the raw data. At each step, read quality was assessed with FastQC (version 0.10.1) []. Single-end reads were aligned to the M. bovis AF2122/97 or M. tuberculosis H37Rv reference genomes with the aligner Stampy in hybrid mode with the BWA algorithm []. Read counts for each gene were calculated using featureCounts, set to unambiguously assign uniquely aligned single-end reads in a stranded manner to gene exon annotation [].Prior to cross-species differential expression analysis, an Identical/Variable gene dataset was constructed for M. tuberculosis H37Rv and M. bovis AF2122/97 where orthologous genes were separated into those whose protein products are of equal length and 100 % conserved at the amino acid level (Identical, n=2775) from those that are not (Variable, n=1224) ( and Table S1, available in the online version of this article). Among the Variable genes are examples of truncated genes, genes that have been split into two or more as a result of in-frame sequence variance (leading to some genes being represented in more than one Variable gene pair), or genes that differ by a non-synonymous SNP resulting in an amino acid change at the protein level. Negative binomial modelling tools such as DESeq2, which was used in this instance, assume equal feature lengths when calculating differential expression (DE) of a gene, or in this case between orthologous genes of two species in a given condition []. For those annotations whose gene lengths are not equal, such as in the truncated/elongated/frameshift instances found in the M. bovis AF2122/97 genome with respect to M. tuberculosis H37Rv, analysis with DESeq2 would result in erroneous differential expression results; thus, a separate differential expression analysis was carried out for Variable genes using transcript per million (TPM) values that are normalized for feature length []. Differential gene expression analysis for those genes of equal lengths was performed using the DESeq2 pipeline, correcting for multiple testing using the Benjamini–Hochberg method []. All further reference to DE genes between the two mycobacterial species will be with regard to a gene being expressed at a higher level in one species with respect to the other, and hence if a gene is upregulated in M. bovis AF2122/97 it is downregulated or expressed at a lower level in M. tuberculosis H37Rv and vice versa [log2 fold change |log2FC>1|, false discovery rate (FDR) threshold of significance <0.05]. [...] Mycobacterial cells were harvested by centrifugation at 2500 g for 10 min and the pellet was re-suspended in 1 ml LB buffer (0.1 M ammonium bicarbonate buffer, 8 M urea, 0.1 % RapiGEST SFl;Waters). The suspension was transferred to a 2 ml screw cap tube and the cells were lysed by bead-beating for 30 s at maximum setting using 3 mm glass beads (Sigma) and a MagNaLyser instrument (Roche). The lysate supernatant was harvested by centrifugation at 12 000 g for 10 min and transferred to a clean 1 ml tube. The remaining pellet was re-suspended in LB buffer and the bead-beating cycle was repeated twice more. Protein lysate samples were stored at −80 °C. Protein samples were removed from −80 °C storage and thawed on ice. Total protein content was measured using the Qubit Protein Assay kit according to the manufacturer’s guidelines and protein concentrations were adjusted to 0.5 mg ml−1. Protein disulphide bonds were reduced by addition of 0.2 M Tris(2-carboxyethyl)phosphine (TCEP) and the resulting free cysteine residues were alkylated by addition of 0.4 M iodoacetamide (IAA). Extracted protein samples were diluted with 0.1 M ammonium bicarbonate buffer to reach a urea concentration of <2 M and then digested with 1 : 50 enzyme/substrate ratio of sequencing-grade modified trypsin (Promega). Then, 50 % trifluoroacetic acid (TFA) was added to lower the pH to 2 in order to stop the tryptic digest and to precipitate the RapiGEST. Water-immiscible degradation products of RapiGEST were pelleted by centrifugation at 12 000 g for 10 min. The cleared peptide solution was desalted with C18 reversed-phase columns (SepWPak Vac C18; Waters). The columns were pre-conditioned 2–3 times with acetonitrile and equilibrated three times with Buffer A (2 % acetonitrile, 0.1 % TFA in H2O) prior to sample loading. The flow-through was re-loaded onto the column and the column was then washed three times with Buffer A. The peptides were eluted from the column using Buffer B (50 % acetonitrile, 0.1 % TFA in H2O) and the elution step was repeated. The eluate was dried under vacuum using a rotary evaporator at 45 °C. Dried peptide pellets were re-suspended in MS buffer (2 % acetonitrile, 0.1 % TFA in ultrapure H2O) to a concentration of 1 µg µl−1, sonicated in a water bath for 3 min and supernatant was harvested by centrifugation at 12 000 g for 10 min.SWATH MS measurements were conducted at the Institute for Molecular Systems Biology at ETH Zurich. One microgram of each peptide sample was measured in SWATH mode on a TripleTOF 5600 mass spectrometer using data-independent acquisition settings as described earlier []. Resulting raw SWATH data were analysed using an automated pipeline and the software OpenSWATH with the M. tuberculosis H37Rv SWATH assay library []. Differential expression analysis of proteins identified in M. tuberculosis H37Rv and M. bovis AF2122/97 samples was dependent on the detection of the protein in both species. The differences in protein fold changes and the corresponding FDR corrections between M. tuberculosis H37Rv and M. bovis AF2122/97 were calculated using MSstats []. A |log2FC|>0.56 and an FDR<0.05 was required for a protein to be defined as differentially expressed between M. tuberculosis H37Rv and M. bovis AF2122/97. [...] Computational analyses was performed on a 32-node Compute Server with Linux Ubuntu (version 12.04.2). Briefly, pooled libraries were deconvoluted, and adapter sequence contamination and paired-end reads of poor quality were removed. At each step, read quality was assessed with FastQC (version 0.10.1) []. Paired-end reads were aligned to the Bos taurus reference genome (B. taurus UMD3.1.1) with STAR aligner []. Read counts for each gene were calculated using featureCounts, set to unambiguously assign uniquely aligned paired-end reads in a stranded manner to gene exon annotation []. Differential gene expression analysis was performed using the edgeR Bioconductor package that was customized to filter out all bovine rRNA genes, genes displaying expression levels below 1 count per million (CPM) in at least ten individual libraries and identify DE genes between all pairs of infection groups within each time point, correcting for multiple testing using the Benjamini–Hochberg method with log2FC>1 and <−1 and an FDR threshold of significance <0.05 []. Cellular functions and pathways over-represented in DE gene lists were assessed using the SIGORA R package []. […]

Pipeline specifications

Software tools OpenSWATH, MSstats
Application MS-based targeted proteomics
Organisms Homo sapiens, Bos taurus, Mycobacterium bovis AF2122/97, Mycobacterium tuberculosis H37Rv, Mycoplasma bovis, Corynebacteriales