Computational protocol: Fungal Diversity Associated with Hawaiian Drosophila Host Plants

Protocol publication

[…] Sequences were assembled and edited in Geneious (Biomatters Ltd), trimming terminal regions with >5% probability of error per base call. We tested for chimeric sequences using the ChimeraSlayer module in mothur, with the dataset serving as its own reference for identifying potential pairs of chimeric “parents.” Each rDNA sequence was queried against GenBank using BLAST and assigned to the BLAST hit with the highest bit score. Where multiple GenBank sequences had equal bit scores, one hit was arbitrarily chosen to represent the query sequence and all subsequent sequences that gave the same BLAST results. Results were binned based on the following approximations: if the sequence similarity between the BLAST hit and the query was 95% or greater, the rDNA sequence was considered to belong to the same genus as the top hit; if less than 95% similar, the query was considered to be “near” the genus of the top hit. We did not attempt to identify sequences to the level of species.
Assignment of sequences to operational taxonomic units (OTUs) and subsequent genetic diversity calculations were carried out in mothur. We aligned the sequences with and without the GenBank sequences using the E-INS-i method in MAFFT on the CIPRES gateway. Duplicate sequences were removed, a genetic distance matrix was constructed in PHYLIP v3.69, and the matrix was run through the hcluster module of mothur to determine each sequence’s OTU affinity using the furthest neighbor calculation. Values of over 1% genetic divergence at the ascomycetous D1/D2 domain of 26S rDNA have been proposed as representing differentiation at the species level, so we defined OTUs rather conservatively as containing individuals with ≥97% genetic similarity. We used the Venn module in mothur to calculate the number of shared OTUs among the different fungal communities. For each collection (e.g., CheiFrLf, CheiRtLf, ClerFrLf), we calculated rarefaction curves and the Chao-1 index to evaluate our sampling effort.
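The OTU assignment and the Chao-1 estimate described above can be illustrated with a minimal Python sketch. It is not the mothur implementation. The function names, the toy sequence labels, and the toy distances are hypothetical. The 0.03 distance cutoff is taken as the counterpart of the ≥97% similarity threshold, and the bias-corrected form of Chao-1 is an assumption, since the text does not say which variant was used.

```python
from itertools import combinations


def furthest_neighbor_otus(names, dist, cutoff=0.03):
    """Group sequences into OTUs by furthest-neighbor (complete-linkage) clustering.

    dist maps frozenset({a, b}) to the genetic distance between a and b.
    Two clusters are merged only while their largest pairwise distance
    stays at or below the cutoff.
    """
    clusters = [{n} for n in names]

    def linkage(a, b):
        # Furthest neighbor: cluster distance is the largest member-to-member distance.
        return max(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > cutoff:
            break  # the closest remaining pair is already too divergent
        clusters[i] |= clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def chao1(otu_sizes):
    """Bias-corrected Chao-1 richness estimate from per-OTU sequence counts."""
    s_obs = len(otu_sizes)
    f1 = sum(1 for n in otu_sizes if n == 1)  # singleton OTUs
    f2 = sum(1 for n in otu_sizes if n == 2)  # doubleton OTUs
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Toy data (hypothetical): five sequences, unlisted pairs default to 0.20.
names = ["a", "b", "c", "d", "e"]
known = {("a", "b"): 0.01, ("a", "c"): 0.02, ("b", "c"): 0.05, ("d", "e"): 0.10}
dist = {frozenset(p): known.get(p, 0.20) for p in combinations(names, 2)}

otus = furthest_neighbor_otus(names, dist)
otu_sizes = sorted((len(o) for o in otus), reverse=True)
richness = chao1(otu_sizes)
```

In this toy set, `c` is within 0.03 of `a` but 0.05 from `b`, so furthest-neighbor clustering keeps it out of the `{a, b}` OTU. A nearest-neighbor rule would have merged all three.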
The parametric (H) and nonparametric (HNP) Shannon-Wiener diversity indices were used to compare diversity. HNP is more appropriate for comparisons within our study because of the likelihood of a non-negligible number of undetected species. However, because most other published studies have relied on the parametric index H to calculate diversity, we also calculated it for the sake of comparison, with the understanding that it will be biased downward.
We tested for differences in community structure across the various substrates using homogeneity of molecular variance (HOMOVA), analysis of molecular variance (AMOVA), and unweighted UniFrac, correcting for multiple comparisons using Q-Value. For each of the three types of analysis, an initial test for significance over all data was performed. If this test was significant, all possible pairwise comparisons (45 possible) between plant-substrate combinations were performed (e.g., CheiFrLf vs. CheiRtLf, CheiFrLf vs. PisoFrLf, etc.). AMOVA tests for differences between the diversity present in each community and the diversity of all the communities pooled. Significant results indicate that genetic composition differs among communities, but that they may or may not differ in amount of genetic diversity. HOMOVA tests for differences in the amount of genetic diversity in each community. UniFrac tests for shifts or pivots in the genetic structure of populations wherein the amount of diversity may be the same but the composition differs among communities. If both the AMOVA and HOMOVA tests are significant, UniFrac offers little additional information. [...]
An unrooted neighbor-joining (NJ) tree derived from a second MAFFT alignment, including our sequences and the best hit GenBank sequences, was constructed to help visualize taxonomic delimitations. The tree was built in PAUP* under the GTR model of evolution, with 1000 bootstrap replications summarized on the 50% majority-rule consensus tree.
We then mapped the host plant and substrate type of each sample and the class and species identity of the top BLAST hits onto the NJ tree using MacClade. […]
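As an illustration of the diversity indices and the pairwise design described above, the following Python sketch computes the parametric Shannon index and one nonparametric alternative, then enumerates the pairwise comparisons. It makes several assumptions. The nonparametric estimator shown is the coverage-adjusted (Chao-Shen) form, which the text does not name explicitly. The function names and the `community_N` labels are hypothetical. The figure of ten plant-substrate combinations is inferred only from the stated 45 pairwise comparisons, since 10 × 9 / 2 = 45.

```python
import math
from itertools import combinations


def shannon(otu_sizes):
    """Parametric (plug-in) Shannon index H from per-OTU sequence counts."""
    counts = [c for c in otu_sizes if c > 0]
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts)


def shannon_chao_shen(otu_sizes):
    """Coverage-adjusted nonparametric Shannon estimate (Chao-Shen form).

    Assumed here as the nonparametric index; it corrects for OTUs that
    were present in the community but missed by sampling.
    """
    counts = [c for c in otu_sizes if c > 0]
    n = sum(counts)
    f1 = sum(1 for c in counts if c == 1)
    if f1 == n:
        f1 = n - 1  # all singletons: keep the estimated coverage above zero
    coverage = 1 - f1 / n
    total = 0.0
    for c in counts:
        p = coverage * c / n  # coverage-adjusted relative abundance
        total -= p * math.log(p) / (1 - (1 - p) ** n)
    return total


# Toy OTU counts (hypothetical): one doubleton and three singletons.
sizes = [2, 1, 1, 1]
h = shannon(sizes)
h_np = shannon_chao_shen(sizes)

# Ten placeholder community labels give the 45 unordered pairs mentioned in the text.
communities = [f"community_{i}" for i in range(1, 11)]
pairs = list(combinations(communities, 2))
```

On singleton-rich toy counts like these, the coverage-adjusted estimate exceeds the plug-in value, which is consistent with the statement that H is biased downward when some species go undetected.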

Pipeline specifications

Software tools: Geneious, ChimeraSlayer, mothur, MAFFT, PHYLIP, UniFrac, MacClade
Applications: Phylogenetics, 16S rRNA-seq analysis
Organisms: Drosophila melanogaster, Fungi