Computational protocol: The Potential of Metatranscriptomics for Identifying Screening Targets for Bacterial Vaginosis

[…] Subsequent to sequencing, all raw sequencing data were de-multiplexed according to their MID tags. The data obtained from each sample were then imported into Genomics Workbench version 4.5.1 (CLC Bio, Aarhus, Denmark) for removal of WTA primers and random primer sequences from each applicable read, and were filtered (low quality score limit of 0.05; maximum of 2 ambiguous nucleotides allowed; minimum sequence length of 100 nt). Each dataset was then screened for chimeric sequences using UCHIME, and the trimmed and filtered data were submitted to the MG-RAST bioinformatics pipeline for analysis (cDNA library MG-RAST ID = 4461586.3; DNA amplicon MG-RAST ID = 4461792.3). For database searches within MG-RAST against the RDP database, an arbitrary 90% cut-off was used for genus-level identification and a 98% cut-off for species-level identification. MG-RAST generates abundance counts based on the number of unique hits a particular sequence has against a particular database. A single read may therefore be assigned multiple abundance counts if it is equally related to more than one database entry. The identities for each read were sorted using Excel 2007 (Microsoft Corporation, Redmond, USA) to eliminate multiple identical hits for individual reads, with discrepant samples analysed manually using the BLAST algorithm. Bacterial abundances were represented graphically using Krona charts. Functional genes from the cDNA library were characterised by SEED analysis within MG-RAST.

Reads derived from the cDNA library that were assigned by MG-RAST to each major genus (comprising >10% of the total population) were also imported into Lasergene 8 (DNASTAR Inc., Madison, USA) for manual sequence alignment using a 98% sequence match cut-off. The consensus sequences of the alignments were assigned an identity using the BLAST algorithm, with ≥98% identity required to assign a species name. […]
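The thresholds in the protocol above can be illustrated with a minimal Python sketch. It is not the authors' workflow, which relied on Genomics Workbench, MG-RAST and manual sorting in Excel. It only restates the stated numbers (100 nt minimum length, at most 2 ambiguous nucleotides, 90% and 98% cut-offs, >10% for a major genus) as code. The function names, the hit tuple layout, the tie-breaking rule and the choice of assigned reads as the denominator for the 10% threshold are all assumptions made for illustration, and quality-score trimming is not modelled.

```python
from collections import Counter

MIN_LENGTH = 100             # minimum read length in nt (from the protocol)
MAX_AMBIGUOUS = 2            # maximum ambiguous nucleotides (from the protocol)
GENUS_CUTOFF = 90.0          # % cut-off for genus-level calls (from the protocol)
SPECIES_CUTOFF = 98.0        # % cut-off for species-level calls (from the protocol)
MAJOR_GENUS_FRACTION = 0.10  # "major" genus: >10% of reads (denominator assumed)


def passes_filter(seq):
    """Length and ambiguity filter only; quality trimming is not modelled."""
    seq = seq.upper()
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return len(seq) >= MIN_LENGTH and ambiguous <= MAX_AMBIGUOUS


def best_hit_per_read(hits):
    """Collapse multiple hits per read so each read is counted once.

    `hits` is an iterable of (read_id, genus, species, percent_identity),
    a hypothetical layout. Ties keep the first hit seen, an arbitrary
    choice here; the protocol resolved discrepancies manually with BLAST.
    """
    best = {}
    for read_id, genus, species, identity in hits:
        if read_id not in best or identity > best[read_id][2]:
            best[read_id] = (genus, species, identity)
    return best


def assign_taxon(genus, species, identity):
    """Apply the 98% (species) and 90% (genus) cut-offs to one hit."""
    if identity >= SPECIES_CUTOFF:
        return genus, species
    if identity >= GENUS_CUTOFF:
        return genus, None
    return None, None


def major_genera(best):
    """Return genera making up more than 10% of genus-assigned reads."""
    counts = Counter()
    for genus, species, identity in best.values():
        assigned_genus, _ = assign_taxon(genus, species, identity)
        if assigned_genus is not None:
            counts[assigned_genus] += 1
    total = sum(counts.values())
    return sorted(g for g, n in counts.items()
                  if total and n / total > MAJOR_GENUS_FRACTION)
```

Keeping one hit per read mirrors the stated goal of eliminating multiple identical hits for individual reads, so that a read equally related to several database entries does not inflate the abundance counts.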

Pipeline specifications

Software tools CLC Genomics Workbench, UCHIME, Krona
Databases MG-RAST
Applications Phylogenetics, Metagenomic sequencing analysis, 16S rRNA-seq analysis
Organisms Homo sapiens
Diseases Vaginosis, Bacterial
Chemicals Titanium