Computational protocol: Metaproteomic analysis of human gut microbiota: where are we heading?

Protocol publication

[…] A metaproteomics workflow typically includes sample collection, protein extraction, fractionation, mass spectrometry (MS) analysis and database searching. For studies of the human gut microbiota, fecal and mucosal lavage samples are commonly used to characterize the global proteome of the entire gut and of the mucosal interface, respectively. This mini review focuses on fecal samples, as they are more widely used for metaproteomics. Sample storage is a crucial yet sometimes overlooked step in metaproteomics. Several independent studies have shown that different storage temperatures can introduce considerable alterations to microbial profiles, and have highlighted that proper storage is critical for maintaining sample stability. Moreover, frozen intact fecal material was found to be more stable than frozen extracted proteins and is therefore recommended for long-term storage.
Apart from storage, sample processing is another key step in metaproteomics. The sample preparation protocol depends primarily on the research question, that is, whether host proteins, microbial proteins or both are to be isolated. Most previous studies have focused on proteins of microbial origin and employed centrifugation to remove other interfering substances. However, although centrifugation yielded more microbial protein identifications, it also caused considerable protein loss through non-specific removal of microbial cells, which biased the analysis. Conversely, stool without pretreatment provides a better representation of the microbial proteins and allows concurrent analysis of human proteins. This highlights the need to choose the sample processing approach carefully.
Alternatively, a double filtering separation step has been shown to be useful for depleting human proteins and selectively enriching microbial proteins, which enhanced proteome coverage by facilitating the identification of low-abundance proteins.
Next, efficient protein extraction from complex microbial samples is critical for an accurate representation of the intracellular protein content. In the metaproteomic analysis of environmental samples, different protein extraction methods have been shown to isolate different subsets of proteins with only minimal overlap, which underlines the importance of selecting an appropriate protocol to obtain an optimal protein sample. For gut microbiota studies, several reports have indicated that mechanical disruption by bead beating is an efficient protein extraction method, particularly for lysing Gram-positive bacteria. Thus far, there is a major gap in the characterization of extracellular proteins, which may serve as major mediators of host-microbiota interactions. Capturing secreted proteins from a complex ecosystem is a considerable challenge, because the removal of intracellular proteins from the host, the microbiota or both must be taken into account. Fecal samples may provide sufficient protein yield for this kind of secretome study, but protein loss is inevitable because the nature of the sample necessitates an extensive clean-up procedure. Fecal proteins may also undergo some alteration along the intestine. Lichtman et al. described the enrichment of secreted gut luminal proteins from feces, which can be applied to facilitate analysis of the secreted host proteome. In addition, targeted analysis of specific subcellular fractions, such as membrane proteins, and of post-translational modifications is likely to provide additional functional insights.
To date, MS remains the analytical platform of choice for metaproteomics.
Prior to MS analysis, extensive fractionation using multidimensional LC separations (GeLC-MS/MS or 2D-LC-MS/MS) is particularly useful to reduce sample complexity and improve protein identification. The final, and fairly demanding, stage of metaproteomics is data analysis. Several software tools, such as Pipasic, MetaProteomeAnalyzer and Unipept, have been developed to facilitate metaproteomic data analysis. One of the key elements of a successful metaproteomic study is the availability of a relevant database for mass spectra searching. Strategies using either matched or unmatched metagenomes have been successfully employed for metaproteomic protein identification. Furthermore, an iterative workflow using a synthetic metagenome generated from known gut microbiota has been shown to enhance protein identification.
The choice of database is a critical factor in data analysis. Parallel use of multiple databases to improve protein yields may be the way forward, as demonstrated by Tanca et al., who showed that using different databases in gut microbial metaproteome data analysis led to complementary identification of unique peptides. More recently, Zhang et al. introduced MetaPro-IQ, a data analysis pipeline that couples publicly accessible gene catalog databases with iterative database searching. The pipeline enabled efficient identification and quantification of over 120,000 peptides corresponding to >30,000 protein groups from human and mouse gut microbial metaproteomes. To date, it represents the most extensive metaproteome coverage and appears to be a promising approach for future metaproteomic studies. […]
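
The two database strategies mentioned in the excerpt, parallel use of several databases and a second-pass search against a reduced database, can be illustrated with a toy sketch. This is not the method of Tanca et al. or the MetaPro-IQ pipeline: real search engines score spectra against theoretical fragment spectra and control the false discovery rate, whereas the sketch below stands in for that with exact matching of peptide sequences. All protein IDs, sequences and function names are hypothetical, and the digestion rule is a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages).

```python
import re


def tryptic_peptides(sequence, min_len=6):
    """In-silico digest: cleave after K or R unless the next residue is P.

    Simplified on purpose: no missed cleavages, no modifications.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_len}


def search(observed, database):
    """Return {peptide: set of protein IDs} for observed peptides found in the database."""
    index = {}
    for protein_id, sequence in database.items():
        for peptide in tryptic_peptides(sequence):
            index.setdefault(peptide, set()).add(protein_id)
    return {pep: index[pep] for pep in observed if pep in index}


def reduce_database(database, hits):
    """Keep only proteins with at least one matched peptide (input for a second pass)."""
    matched = set().union(*hits.values())
    return {pid: seq for pid, seq in database.items() if pid in matched}


# Hypothetical toy data; IDs and sequences are invented for illustration.
db_a = {"a1": "MSTLEKGAVNDRWYFQHK", "a2": "LLTGEKPAVSIR"}
db_b = {"b1": "GAVNDRTTEYLLK", "b2": "QQWEVNPK"}
observed = {"MSTLEK", "GAVNDR", "TTEYLLK", "LLTGEKPAVSIR", "HHHHHHK"}

# Parallel search against both databases.
hits_a = search(observed, db_a)
hits_b = search(observed, db_b)

# Complementarity: peptides identified by only one database, by both, or by neither.
only_a = set(hits_a) - set(hits_b)
only_b = set(hits_b) - set(hits_a)
shared = set(hits_a) & set(hits_b)
unmatched = observed - set(hits_a) - set(hits_b)

# Reduced database that a second-pass search would use.
reduced_b = reduce_database(db_b, hits_b)

print("only in A:", sorted(only_a))
print("only in B:", sorted(only_b))
print("shared:", sorted(shared))
print("unmatched:", sorted(unmatched))
print("reduced B:", sorted(reduced_b))
```

In this toy setting the union of both searches covers more observed peptides than either database alone, which is the kind of complementarity the excerpt describes. The reduced database only shows the bookkeeping of an iterative search; any gain in sensitivity in practice comes from re-scoring spectra against the smaller search space, which is not modeled here.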

Pipeline specifications

Software tools: Pipasic, MPA, Unipept
Application: MS-based metaproteomics
Organisms: Homo sapiens
Diseases: Neoplasms