[…] E. lenta and cgr2 prevalence were determined from copy number abundance (gene copies/cell) as derived from MetaQuery2, taking the median abundance for individuals with repeated sampling. E. lenta abundance was determined from a single-copy E. lenta marker gene described elsewhere (elnmrk1). Matches were required to have at least 90% nucleotide identity and query/target coverage. To reconstruct metagenomic cgr2 sequences, reads from 96 metagenomes with >0.001 proportional abundance of E. lenta or >1-fold coverage were quality trimmed with Trimmomatic using default sliding window settings, and reads that mapped with Bowtie 2 to the cgr cluster and associated intergenic space (2957889..2968387) in the reference DSM 2243 assembly were extracted. These were assembled and annotated as above. Alignments to Cgr2 in metagenomic coding sequences were filtered by a global alignment identity of >80% to position 333 ± 60 residues. For assembly-free variant calling, reads were filtered for a minimum mapping quality of 10 and a pileup was created (SAMtools). Forty-nine metagenomes had at least one read mapping to the variant position (2959294). Variants were called when >50% of reads at a site supported a sequence differing from the reference. Conservation of nucleotide sequence in isolates was independently confirmed via Sanger sequencing (GENEWIZ, San Francisco, CA, USA) using the following primers: cgr2_fwd (TGCAATCAAGACAACCACGA), cgr2_internal (TCGGTGTACAACCACAATGC), and cgr2_rev (GTTGCGCTGTGATTAGACTG).
PCR was carried out with high-fidelity Q5 enzyme (New England Biolabs, Ipswich, MA). To validate the metagenomic queries, qPCR analysis with double-dye probes was carried out in a duplexed fashion using the following primers and probes: ElentaUni_F (GTACAACATGCTCCTTGCGG), ElentaUni_R (CGAACAGAGGATCGGGATGG), ElentaUni_Probe ([6FAM]TTCTGGCTGCACCGTTCGCGGTCCA[BHQ1]), cgr2_F (GAGGCCGTCGATTGGATGAT), cgr2_R (ACCGTAGGCATTGTGGTTGT), and cgr2_probe ([HEX]CGACACGGAGGCCGATGTCG[BHQ1]). Reactions were carried out in triplicate in 10 µL volumes with 200 nM primers and probes using Bio-Rad Universal Probes Supermix on a Bio-Rad CFX 384 thermocycler according to the manufacturer’s suggested settings for fast cycles with a 60 °C annealing temperature. The estimated assay detection limit based on spike-in experiments is 1.4 × 10³ GE/g after accounting for DNA extraction. Human samples were collected for the purpose of microbiome analysis as part of the following registered studies: NCT03022682, NCT01967563, and NCT01105143, each approved by its respective institutional review board. DNA was extracted using one of the following kits: MoBio Power Soil (QIAGEN, Germantown, MD), Qiagen Fast Stool (QIAGEN, Germantown, MD), or Promega Wizard SV 96 (Promega, Madison, WI). [...] All statistical analysis was carried out using either Student’s t-test as implemented in GraphPad Prism version 7 (La Jolla, CA, USA) or R version 3.4.0 using appropriate base functions for Welch’s t-test, Pearson and Spearman correlations, and ANOVA, with multcomp version 1.4-6 for Dunnett’s multiple comparison test. Graphing was carried out with GraphPad Prism and R using ggplot2 version 2.2.1. Skewness was calculated using the R package moments version 0.14. […]
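
The assembly-free variant-calling rule described above (keep reads with mapping quality of at least 10, then call a variant when more than 50% of the remaining reads at a site disagree with the reference) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used a SAMtools pileup. The function name call_variant, the (base, mapq) input format, and the choice to require a single most common non-reference base to exceed the 50% threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def call_variant(
    ref_base: str,
    reads: Iterable[Tuple[str, int]],
    min_mapq: int = 10,
    min_alt_fraction: float = 0.5,
) -> Optional[str]:
    """Return the alternative base at one site, or None if no variant is called.

    `reads` is a hypothetical, simplified pileup column: one
    (observed_base, mapping_quality) pair per read covering the site.
    """
    ref = ref_base.upper()
    # Step 1: discard reads below the minimum mapping quality.
    kept = [base.upper() for base, mapq in reads if mapq >= min_mapq]
    if not kept:
        return None  # no usable coverage at this site

    # Step 2: count the non-reference bases among the retained reads.
    alt_counts = Counter(base for base in kept if base != ref)
    if not alt_counts:
        return None  # every retained read matches the reference

    # Step 3 (assumption): the most common alternative base must be
    # supported by strictly more than half of the retained reads.
    alt_base, alt_n = alt_counts.most_common(1)[0]
    if alt_n / len(kept) > min_alt_fraction:
        return alt_base
    return None


if __name__ == "__main__":
    # Two of the five reads fall below MAPQ 10 and are ignored;
    # 2 of the 3 remaining reads support G over the reference A.
    column = [("A", 30), ("G", 40), ("G", 12), ("G", 5), ("A", 3)]
    print(call_variant("A", column))
```

Because the threshold is strict, a site where exactly half of the retained reads support the alternative base is not called.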

Software tools MetaQuery, Trimmomatic, Bowtie, SAMtools, multcomp, ggplot2
Applications Miscellaneous, Metagenomic sequencing analysis, GWAS
Organisms Homo sapiens, Bacteria, Eggerthella lenta
Diseases Plant Poisoning
Chemicals Digoxin