Computational protocol: Metabolic capability and in situ activity of microorganisms in an oil reservoir

[…] PhyloPhlAn v0.99 [] was used to reconstruct the phylogenetic tree of all GBs from the proteins predicted by Prodigal v2.6 []. Additionally, we used BLAST [] to compare the 16S rRNA genes annotated in each bin. Amino acid synthesis pathways, vitamin synthesis pathways, and vitamin transport systems in near-complete GBs were curated manually using primary literature and the KEGG database [] as previously described []. The metatranscriptomic dataset of W15 was also used to help curate the amino acid and vitamin synthesis pathways. For biosynthetic pathways in which only one gene was missing, we used the metatranscriptomic data to evaluate the expression level of the pathway. If its overall transcription level was similar to or higher than that in GBs harboring a complete pathway, the pathway was considered complete. [...] Metatranscriptomic raw reads were quality-trimmed using PRINSEQ (with the same parameters as in the metagenome analysis), and the trimmed reads were mapped to the coding sequences (CDS) of the GBs using Bowtie2 [] with default settings. The transcription level of each recovered GB was evaluated by computing its cDNA/DNA abundance ratio. The transcription level of individual genes in each GB was determined by mapping metatranscriptomic reads to the co-assembled contigs using Bowtie2 [] with default parameters. eXpress v1.5.1 [] was used to calculate FPKM (Fragments Per Kilobase of transcript per Million mapped fragments) values, and genes whose FPKM values ranked in the top 25th percentile were defined as actively expressed. […]
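
The two expression metrics described above can be illustrated with a minimal Python sketch. It is not the authors' code, and several points are assumptions: the function names are hypothetical; the cDNA/DNA ratio is taken here as the bin's share of mapped metatranscriptomic reads divided by its share of mapped metagenomic reads (the protocol does not state the normalization); the FPKM formula is the textbook definition, whereas eXpress works with estimated counts and effective lengths; and "top 25th percentile" is read as FPKM at or above the 75th percentile of all genes.

```python
from statistics import quantiles


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def cdna_dna_ratio(cdna_reads, cdna_total, dna_reads, dna_total):
    """Relative abundance of a bin in the metatranscriptome over that in the metagenome.

    Assumption: each abundance is the fraction of all mapped reads
    that were assigned to the bin.
    """
    return (cdna_reads / cdna_total) / (dna_reads / dna_total)


def actively_expressed(fpkm_by_gene):
    """Return genes whose FPKM is at or above the 75th percentile.

    Assumption: this is one reading of "top 25th percentile";
    the cut-off is the third quartile of all FPKM values.
    """
    threshold = quantiles(fpkm_by_gene.values(), n=4, method="inclusive")[2]
    return {gene for gene, value in fpkm_by_gene.items() if value >= threshold}


if __name__ == "__main__":
    # Toy values only; they are not taken from the study.
    toy_fpkm = {f"g{i}": float(i) for i in range(1, 9)}
    print(sorted(actively_expressed(toy_fpkm)))
    print(cdna_dna_ratio(200, 1000, 100, 1000))
```

A ratio above 1 under this normalization would indicate that a bin contributes proportionally more to the transcript pool than to the DNA pool. Genes with tied FPKM values at the threshold are all retained, so the selected set can slightly exceed a quarter of the genes.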

Pipeline specifications

Software tools: PhyloPhlAn, Prodigal, PRINSEQ, Bowtie2
Databases: KEGG
Applications: Phylogenetics, Metatranscriptomic sequencing analysis
Chemicals: Alkanes, Amino Acids, Carbon Dioxide, Hydrocarbons, Oxygen