Computational protocol: Pinus flexilis and Picea engelmannii share a simple and consistent needle endophyte microbiota with a potential role in nitrogen fixation

Protocol publication

[…] Sequences were analyzed and processed using the QIIME package. Briefly, sequences were quality filtered (minimum quality score of 25, minimum length of 200 bp, and no ambiguity in the primer sequence) and assigned to their corresponding sample by their barcode sequences. Samples with fewer than 200 sequences were removed. These included three samples from one P. engelmannii WE1 individual and one sample from one P. engelmannii WE2 individual. The remaining sequences were clustered into phylotypes using UCLUST, with a minimum coverage of 99% and a minimum identity of 97%. A representative sequence was chosen for each phylotype by selecting the longest sequence that had the highest number of hits to other sequences of that phylotype. Chimeric sequences were detected with ChimeraSlayer and removed before taxonomic analysis. Representative sequences were aligned using PyNAST against the Greengenes core set. Taxonomic assignments were made using the Ribosomal Database Project (RDP) classifier. Sequences classified as “Chloroplast” (0.2%) or “Mitochondria” (8%) were removed from the alignment.

To compare diversity levels between WE and treeline samples while controlling for differences in sequencing depth between samples from the two environments, we conducted rarefaction analyses with 800 randomly selected sequences per sample. The rarefaction curves are displayed in Figure . The relative abundance of bacterial classes in each sample, displayed in Figure , was calculated as the percentage of all 16S rRNA gene sequences recovered from that sample that belonged to a particular phylum, with the Proteobacteria split into classes. The Alphaproteobacterial phylogenetic tree displayed in Figure was created by first searching Alphaproteobacterial sequences that occurred at least 100 times in our data against the GenBank 16S rRNA database using BLAST. The top hit for each sequence was then downloaded, already aligned, from RDP.
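The sample filtering, rarefaction, and relative-abundance steps described above can be illustrated with a minimal Python sketch. It is not the QIIME implementation used in the protocol: the function names, the toy table, and the fixed random seed are assumptions made for illustration, and only the thresholds (200 sequences per sample, 800 sequences per rarefied sample) come from the text. It draws a single subsample at one depth, whereas a rarefaction curve repeats the draw over a range of depths.

```python
import random
from collections import Counter

MIN_SEQS_PER_SAMPLE = 200   # samples with fewer sequences were removed (from the text)
RAREFACTION_DEPTH = 800     # sequences drawn per sample (from the text)


def drop_small_samples(otu_counts, min_seqs=MIN_SEQS_PER_SAMPLE):
    """Keep only samples whose total sequence count is at least `min_seqs`."""
    return {
        sample: counts
        for sample, counts in otu_counts.items()
        if sum(counts.values()) >= min_seqs
    }


def rarefy(counts, depth=RAREFACTION_DEPTH, seed=0):
    """Draw `depth` sequences without replacement from one sample."""
    # Expand the counts into one entry per sequence, then subsample.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer sequences than the rarefaction depth")
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return Counter(rng.sample(pool, depth))


def relative_abundance(counts):
    """Percentage of a sample's sequences assigned to each taxon."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Hypothetical toy table: sample -> {OTU id -> sequence count}
toy_table = {
    "sample_A": {"otu1": 700, "otu2": 250, "otu3": 50},
    "sample_B": {"otu1": 120, "otu2": 30},  # 150 sequences in total: dropped
}

kept = drop_small_samples(toy_table)
rarefied = {sample: rarefy(counts) for sample, counts in kept.items()}
percentages = relative_abundance(kept["sample_A"])
```

Subsampling without replacement keeps every rarefied count at or below the observed count, which is why the sketch expands the table into a pool of individual sequences rather than sampling from the proportions.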
Our sequences, along with an outgroup sequence (Burkholderia arboris), were added to the alignment using ClustalW before a maximum likelihood tree was inferred using RAxML (1000 bootstrap replicates). To create the heatmap displayed in Figure , the heatmap function in QIIME was used. The function visualizes the operational taxonomic unit (OTU) table generated by QIIME, which tabulates the number of times each OTU is found in each sample. In Figure , only the 10 most common OTUs (phylotypes) were included. To create Figure , a heatmap of all OTUs was generated to identify phylotypes unique to each species and phylotypes shared across all samples. For each phylotype in Figures and , the similarity to known isolates was determined through a BLAST search against the NCBI 16S rRNA database. To create Figure , an approximately maximum-likelihood tree was constructed from the alignment using FastTree. An unweighted UniFrac distance matrix was constructed from the phylogenetic tree. The unweighted UniFrac distances were visualized using principal coordinate analysis (PCoA), and a UPGMA tree was created from the UniFrac distance matrix. Confidence ellipses (95%) were drawn around groups on the PCoA plots using the ordiellipse function of the vegan package in R. […]
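To make the unweighted UniFrac step concrete, the following is a minimal Python sketch of the distance between two samples: the branch length leading to tips found in only one sample, divided by the branch length leading to tips found in either sample. The protocol computed these distances in QIIME from the FastTree tree; the tree encoding (a `parent` map and a `length` map), the function name, and the toy tree below are hypothetical and serve only to show the calculation.

```python
def unweighted_unifrac(parent, length, tips_a, tips_b):
    """Unweighted UniFrac distance between two sets of tree tips.

    parent: node -> parent node (the root maps to None)
    length: node -> length of the branch directly above that node
    """
    def covered_branches(tips):
        # Collect every branch on the path from each tip up to the root.
        branches = set()
        for tip in tips:
            node = tip
            while parent[node] is not None:
                branches.add(node)
                node = parent[node]
        return branches

    a = covered_branches(tips_a)
    b = covered_branches(tips_b)
    total = sum(length[n] for n in a | b)    # branches seen in either sample
    unique = sum(length[n] for n in a ^ b)   # branches seen in exactly one
    return unique / total if total else 0.0


# Hypothetical toy tree: root has children n1 and t3; n1 has children t1 and t2.
parent = {"root": None, "n1": "root", "t1": "n1", "t2": "n1", "t3": "root"}
length = {"n1": 1.0, "t1": 1.0, "t2": 1.0, "t3": 2.0}

d_ab = unweighted_unifrac(parent, length, {"t1", "t2"}, {"t2", "t3"})
d_same = unweighted_unifrac(parent, length, {"t1", "t2"}, {"t1", "t2"})
d_disjoint = unweighted_unifrac(parent, length, {"t1"}, {"t3"})
```

In the toy tree, the two samples share the branches above `t2` and `n1` (length 2.0) and differ in the branches above `t1` and `t3` (length 3.0), so the first distance is 3.0 / 5.0. Because only presence or absence of each tip matters, OTU abundances do not enter the calculation.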

Pipeline specifications

Software tools: QIIME, UCLUST, ChimeraSlayer, PyNAST, RDP Classifier, Clustal W, RAxML, FastTree, vegan
Databases: Greengenes
Applications: Phylogenetics, 16S rRNA-seq analysis
Organisms: Gluconacetobacter diazotrophicus
Chemicals: Nitrogen, Acetic Acid