Computational protocol: Metabarcoding Analysis of Fungal Diversity in the Phyllosphere and Carposphere of Olive (Olea europaea)

Protocol publication

[…] Data analysis was conducted using the bioinformatics pipeline QIIME v. 1.8 []. De-multiplexing and quality filtering were performed with a minimum quality score of 25, minimum and maximum sequence lengths of 200 and 1000 bp, respectively, and a maximum homopolymer length of 6 bases. Additionally, the sliding window test of quality scores (-w) was enabled with a window size of 50, and sequences containing low-quality windows were discarded using the "-g" option. Sequences were denoised using the denoiser wrapper [] and the ITS2 region was extracted using ITSx []. Chimeric sequences were identified and removed using USEARCH 6.1 []. The most abundant sequences were picked as representative sequences for operational taxonomic unit (OTU) picking and taxonomy assignment. OTUs were picked using the BLAST method [] against the UNITE dynamic database released on February 2, 2014 (http://unite.ut.ee/). The same database was also used for taxonomy assignment []. A sequence similarity threshold of 0.97 was applied, with maximum e-values of 0.001 for OTU picking and 1e-10 for taxonomy assignment. The taxonomic assignments and the OTU map were used to create the OTU table needed to construct the heat-map and the taxa summaries. Because rarefaction plots of the entire OTU table, computed as a function of sequencing effort up to a maximum of 6000 sequences per sample, revealed heterogeneity in sampling, the OTU table was rarefied to an even sequencing depth of 2800 sequences per sample. Weighted and unweighted UniFrac metrics were used to evaluate beta diversity []. Alpha diversity was determined by Shannon's diversity index and the Chao1 estimate. Beta diversity was used to construct UPGMA trees and PCoA plots. The uncertainty in the UPGMA tree was estimated by jackknifing at a depth of 2000 sequences.
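The rarefaction and alpha-diversity steps described above can be illustrated with a minimal Python sketch. It is not the QIIME implementation: the function names and the toy counts are hypothetical, the Shannon index is assumed to use log base 2, and Chao1 is assumed to be the bias-corrected form, since the text does not state which variants were used.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample's OTU counts.

    Returns None when the sample has fewer than `depth` reads, i.e. the
    sample would be dropped from the rarefied table.
    """
    if sum(counts.values()) < depth:
        return None
    # Expand counts into one entry per read, then draw without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


def shannon(counts, base=2):
    """Shannon diversity index H = -sum(p_i * log(p_i)); base 2 is an assumption."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total, base)
                for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts.values() if n > 0)
    singletons = sum(1 for n in counts.values() if n == 1)
    doubletons = sum(1 for n in counts.values() if n == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Toy sample with 3000 reads (made-up numbers), rarefied to the depth of 2800
# reported in the text.
sample = {"OTU_1": 2000, "OTU_2": 900, "OTU_3": 98, "OTU_4": 1, "OTU_5": 1}
rarefied = rarefy(sample, depth=2800)
print(sum(rarefied.values()), shannon(rarefied), chao1(rarefied))
```

Rarefying every sample to the same depth before computing the indices keeps richness estimates from simply tracking how many reads each sample happened to receive.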
Trees were visualized and edited in MEGA6 []. To highlight shared phylotypes, Venn diagrams were created from the OTU table generated in QIIME and visualized with the web tool Venny (http://bioinfogp.cnb.csic.es/tools/venny/index.html) []. [...] To confirm the accuracy of taxonomic assignments, the sequences associated with each OTU within each identified fungal genus were extracted and submitted to ElimDupes (http://hcv.lanl.gov/content/sequence/ELIMDUPES/elimdupes.html) to detect multiple identical sequences and determine their frequency. Unique representative sequences, defined as sequence types (STs), i.e. distinct and reproducible ITS2 sequences recovered in this study, were then manually queried with BLAST to identify the closest available reference sequences in the complete NCBI nucleotide collection (http://blast.ncbi.nlm.nih.gov/Blast). Furthermore, ITS2 sequences of the most abundant fungal genera according to the QIIME taxonomic assignments (Aureobasidium spp., Colletotrichum spp., Cladosporium spp., Pseudocercospora spp., and Devriesia spp.) were phylogenetically analyzed. STs were analyzed along with closely related reference sequences of the same genus to determine their phylogenetic placement and enable their identification with the highest possible accuracy. Before analysis, validated panels of reference ITS2 sequences of Colletotrichum acutatum s.l. [–], C. boninense s.l. [], Pseudocercospora spp. [], Devriesia spp. [], Cladosporium spp. [] and Aureobasidium spp. [] were processed with ElimDupes to remove multiple identical sequences. Some identical reference sequences were nevertheless retained in the panel because they represented different species. When none of the validated reference sequences above was identical to sequences identified in the present study, more closely related sequences, where available, were identified by BLAST analyses.
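The collapsing of identical sequences into sequence types with their frequencies, as described above, can be sketched in a few lines of Python. This is only a stand-in for what ElimDupes does: the function name, the "ST" labels, and the case-insensitive exact-match rule are assumptions made for illustration, and the reads are made up.

```python
from collections import defaultdict


def collapse_to_sequence_types(records):
    """Group identical sequences and report each unique sequence with its frequency.

    `records` is an iterable of (sequence_id, sequence) pairs. Sequences are
    compared after upper-casing and stripping whitespace (an assumption).
    Returns a list of (label, sequence, count, ids), most frequent first.
    """
    groups = defaultdict(list)
    for seq_id, seq in records:
        groups[seq.strip().upper()].append(seq_id)
    # Sort by descending frequency, then by sequence for a stable order.
    ranked = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(f"ST{i}", seq, len(ids), ids)
            for i, (seq, ids) in enumerate(ranked, start=1)]


# Made-up reads assigned to one genus.
reads = [("r1", "ACGT"), ("r2", "acgt"), ("r3", "ACGA"), ("r4", "ACGT")]
for label, seq, count, ids in collapse_to_sequence_types(reads):
    print(label, seq, count, ids)
```

Each resulting sequence type would then be queried once against the reference collection instead of once per read.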
Despite its low abundance, the genus Spilocaea was analyzed in the same way in light of its relevance as a fungal pathogen of olive []. In this case, reference sequences were downloaded from GenBank because no validated panel of reference sequences was available. For each genus, STs identified in the present study and reference sequences were aligned using MUSCLE and imported into MEGA for phylogenetic analysis with the Maximum Likelihood method using the Tamura-Nei model []. Analyses were performed with 1000 bootstrap replications. […]
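The bootstrap step mentioned above resamples alignment columns with replacement and re-estimates the tree on each pseudo-replicate. MEGA does this internally. The Python sketch below shows only the column-resampling part, not the Maximum Likelihood inference, and its function name and toy alignment are hypothetical.

```python
import random


def bootstrap_columns(alignment, replicates=1000, seed=0):
    """Yield pseudo-replicate alignments built by resampling columns with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    Each replicate has the same names and the same length as the input.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    rng = random.Random(seed)
    for _ in range(replicates):
        # Draw `length` column indices with replacement.
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(alignment[name][c] for c in columns)
               for name in names}


# Made-up aligned ITS2 fragments.
toy_alignment = {"ST1": "ACGTAC", "ref_1": "ACGTTC", "ref_2": "AGGTTC"}
first = next(bootstrap_columns(toy_alignment, replicates=1000))
print(first)
```

The support value for a branch is the fraction of replicate trees in which that branch appears, which is why a tree must be inferred from every resampled alignment.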

Pipeline specifications

Software tools: QIIME, ITSx, USEARCH, UniFrac, MEGA, VENNY, MUSCLE
Applications: Phylogenetics, 16S rRNA-seq analysis, Nucleotide sequence alignment
Organisms: Fungi, Olea europaea, Colletotrichum acutatum