Computational protocol: Taxonomic and Functional Diversity of Soil and Hypolithic Microbial Communities in Miers Valley, McMurdo Dry Valleys, Antarctica

Protocol publication

[…] The two most representative samples for each statistically supported grouping delineated in the t-RFLP analysis were further interrogated via barcoded pyrosequencing using the Roche GS Junior System (454 Life Sciences Corp., Branford, CT, USA). Since the objective was to estimate diversity in samples, we selected the most representative samples on the basis of the greatest number of shared operational taxonomic units (OTUs) as indicated by t-RFLP analysis. Amplification of 16S rRNA genes was achieved using primer pair 341F and 907R with PCR conditions as described above. Each amplicon library (n = 2) was purified with Agencourt AMPure XP beads (Beckman Coulter, CA, USA) according to the manufacturer's instructions. The library was quantified with the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen Life Technologies, NY, USA) using a FLUOstar OPTIMA F fluorometer (BMG Labtech GmbH, Offenburg, Germany), and library quality was assessed with the FlashGel System (Lonza Group Ltd., Basel, Switzerland). Emulsion PCR was carried out with the GS Junior Titanium emPCR Kit (Lib-L, 454 Life Sciences Corp., CT, USA) according to the emPCR Amplification Method Manual – Lib-L, Single-Prep. The sequencing reaction was carried out with the GS Junior Titanium Sequencing Kit and GS Junior Titanium PicoTiterPlate Kit (454 Life Sciences Corp.) according to the manufacturer's instructions. The sequencing run was conducted in 200 cycles.

Pyrosequencing reads were sorted by barcode prior to analysis and processed using the software package MOTHUR. During de-noising, sequences were removed from analysis if they met any of the following criteria: the length was shorter than 300 bp; the average quality score was less than 25; they contained ambiguous characters or more than six homopolymers; or they did not contain the primer sequence or barcode.
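The read-removal criteria above can be illustrated with a minimal Python sketch. This is not the MOTHUR implementation used in the study; it only restates the four criteria as code. Several points are assumptions made for illustration: the names `passes_filters` and `longest_homopolymer` are hypothetical, "more than six homopolymers" is interpreted as a homopolymer run longer than six bases, the barcode and primer are required to match exactly at the start of the read, and length is measured on the untrimmed read.

```python
from typing import List

MIN_LENGTH = 300        # reads shorter than this are removed
MIN_MEAN_QUALITY = 25   # reads with a lower average quality are removed
MAX_HOMOPOLYMER = 6     # assumed to mean the longest allowed run of one base


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of a single repeated base."""
    longest = 0
    run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        prev = base
        longest = max(longest, run)
    return longest


def passes_filters(seq: str, quals: List[int], barcode: str, primer: str) -> bool:
    """Return True if a read survives all four removal criteria.

    Assumption: barcode and primer must match exactly at the 5' end, and
    length is measured on the full read including barcode and primer.
    """
    seq = seq.upper()
    if not seq.startswith(barcode + primer):
        return False                      # missing barcode or primer
    if len(seq) < MIN_LENGTH:
        return False                      # too short
    if not quals or sum(quals) / len(quals) < MIN_MEAN_QUALITY:
        return False                      # low average quality
    if any(base not in "ACGT" for base in seq):
        return False                      # ambiguous characters such as N
    if longest_homopolymer(seq) > MAX_HOMOPOLYMER:
        return False                      # homopolymer run too long
    return True
```

A read would be kept with, for example, `passes_filters(seq, quals, barcode, primer)` returning `True`; the barcode and primer strings passed in are whatever the sequencing design specifies and are not given in the text above.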
To remove sequences that were probably due to pyrosequencing errors, sequences were pre-clustered using a pseudo-single linkage algorithm as implemented in MOTHUR. Chimera checking was performed using UCHIME in de novo mode. The remaining sequences were hierarchically clustered using USEARCH to form clumps that were small enough to align. A master set was created using the longest sequence from each clump. Sequences in the clumps and in the master set were aligned using MUSCLE. The aligned sequences were merged into a final alignment with the master set as a guide. Alignment columns containing more than 90% gaps were trimmed using trimAl. To correct for differences in sequencing depth among individual samples, the datasets were rarefied to 2,600 sequences. Rarefaction was carried out using MOTHUR, and phylogenetic trees were constructed with FastTree to compare phylogenetic similarity between samples as calculated by the weighted UniFrac metric. Distances in the UniFrac matrix were calculated based on the fraction of branch length shared between two communities within a phylogenetic tree. Alpha diversity was assessed by constructing rarefaction curves for OTUs defined at a 97% sequence similarity cutoff. Taxonomic classification of 16S rRNA gene sequences was made using the Ribosomal Database Project (RDP) Classifier. Sequence data have been deposited in NCBI’s Sequence Read Archive under accession number SRA052054.1. […]
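Two of the steps above, trimming gap-rich alignment columns and rarefying each sample to a common depth, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the trimAl or MOTHUR code used in the study: the function names `trim_gappy_columns` and `rarefy` are hypothetical, gaps are assumed to be written as "-", the alignment is assumed to be a list of equal-length strings, and rarefaction is assumed to mean random subsampling without replacement.

```python
import random
from collections import Counter
from typing import Dict, List


def trim_gappy_columns(alignment: List[str], max_gap_fraction: float = 0.9) -> List[str]:
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction."""
    if not alignment:
        return []
    n_rows = len(alignment)
    width = len(alignment[0])
    keep = [
        col for col in range(width)
        if sum(row[col] == "-" for row in alignment) / n_rows <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


def rarefy(otu_counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample one sample's OTU counts to `depth` reads without replacement."""
    total = sum(otu_counts.values())
    if total < depth:
        raise ValueError(f"sample has {total} reads, fewer than depth {depth}")
    # Expand counts into one label per read, then draw `depth` of them.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return dict(Counter(rng.sample(pool, depth)))
```

Calling `rarefy(sample_counts, 2600)` on every sample would give each one the same total of 2,600 reads, so that diversity comparisons are not driven by unequal sequencing depth; samples with fewer reads raise an error instead of being silently kept.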

Pipeline specifications