Computational protocol: Testing the potential of a ribosomal 16S marker for DNA metabarcoding of insects

Protocol publication

[…] includes a flow chart of the data processing steps. All custom R scripts used are available in […]. First, reads were demultiplexed (R script splitreads_ins_v11.R) and paired-end reads were merged using USEARCH v8.0.1623 (-fastq_mergepairs with -fastq_merge_maxee 1.0). Primers were removed with cutadapt version 1.8.1. Sequences from all ten replicates were pooled and dereplicated, and singletons were removed, before operational taxonomic units (OTUs) were clustered using the UPARSE pipeline (cluster_otus, 97% identity). Chimeras were removed from the OTUs using uchime_denovo. The remaining OTUs were identified by querying the nucleotide non-redundant database (NR) on NCBI using the BLAST API (Entrez Programming Utilities) and our local 16S database using BLAST 2.2.31+. Taxonomy was assigned and checked manually; in rare cases, matches of ∼90% identity were accepted if they matched the patterns previously reported for COI.

The ten samples were dereplicated using derep_fulllength, this time with singletons included in the data set. Sequences of each sample were matched against the OTUs with a minimum identity of 97% using usearch_global. The hit tables were imported, and the sequence numbers were normalised to the total sequence abundance and tissue weight for the various taxa. Only OTUs with a read abundance above 0.003% in at least one replicate were considered in downstream analysis.

Due to the exponential nature of PCR, statistical tests on weight-adjusted relative read abundances were carried out on decadic (base-10) logarithms. Expected relative abundance was calculated by dividing 100% by the number of morphospecies detected with each marker. […]
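The abundance steps described above (normalisation to total reads per replicate, the 0.003% threshold, the base-10 log transform, and the expected relative abundance) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the dictionary layout of the hit table, and the example counts are all hypothetical. The tissue-weight adjustment is left out because the text does not give its formula.

```python
import math

# Threshold stated in the text: an OTU is kept if it exceeds this
# relative abundance (in percent) in at least one replicate.
MIN_ABUNDANCE_PCT = 0.003


def relative_abundance(counts):
    """Convert read counts {otu: [reads per replicate]} to percentages
    of the total reads in each replicate (assumed layout, hypothetical)."""
    n_reps = len(next(iter(counts.values())))
    totals = [sum(reads[i] for reads in counts.values()) for i in range(n_reps)]
    return {
        otu: [100.0 * r / t if t else 0.0 for r, t in zip(reads, totals)]
        for otu, reads in counts.items()
    }


def filter_otus(rel, threshold=MIN_ABUNDANCE_PCT):
    """Keep OTUs above the threshold in at least one replicate."""
    return {otu: v for otu, v in rel.items() if any(x > threshold for x in v)}


def log10_abundance(rel):
    """Base-10 log of each relative abundance; zero values become None."""
    return {
        otu: [math.log10(x) if x > 0 else None for x in v]
        for otu, v in rel.items()
    }


def expected_abundance(n_morphospecies):
    """Expected relative abundance (%) if all detected taxa were equal."""
    return 100.0 / n_morphospecies


# Made-up counts for three OTUs in two replicates (illustration only).
counts = {
    "OTU_1": [60000, 50000],
    "OTU_2": [39999, 49999],
    "OTU_3": [1, 1],
}
rel = relative_abundance(counts)
kept = filter_otus(rel)
logged = log10_abundance(kept)
```

In this made-up example, OTU_3 reaches only 0.001% in both replicates and is therefore dropped, while the other two OTUs pass on to the log transform.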

Pipeline specifications

Software tools: USEARCH, cutadapt, UPARSE, UCHIME
Application: 16S rRNA-seq analysis