Computational protocol: The methylome of the gut microbiome: disparate Dam methylation patterns in intestinal Bacteroides dorei

Protocol publication

[…] The stool samples used here were collected by the Finnish Type 1 Diabetes Prediction and Prevention Study (DIPP) (Kukko et al., ). Newborns were screened for high-risk HLA-DR and HLA-DQ genotypes using a previously described method (Kukko et al., ). Stool samples were collected by the subjects' parents at home and mailed to the DIPP Virus Laboratory for virology in Tampere, Finland, where they were stored at −80°C. Detection of beta-cell autoimmunity was done as described in Parikka et al. (). Sample 105 was collected from a subject at 13.5 months of age; this subject became autoimmune for type 1 diabetes at 15.1 months of age. Sample 439 was collected at 3.3 months of age from a subject who remained healthy. Both subjects were genetically at high risk for type 1 diabetes given their HLA genotype.
In this study, DNA extraction and 16S rRNA amplification, sequencing, and analysis were done as described previously (Fagen et al., ), except that the Qiagen AllPrep DNA/RNA/Protein Mini Kit (QIAGEN) was used to extract DNA, RNA, and protein from stool. Based on the 16S rRNA results, two samples were chosen for long-read Pacific Biosciences sequencing because of their high relative abundance of Bacteroides dorei: 63.7% in sample 105 and 47.9% in sample 439. These samples, 105 and 439, were collected from children who became autoimmune or remained healthy, respectively.
Pacific Biosciences (PacBio RS II system) library construction and sequencing were done by the University of Florida's Interdisciplinary Center for Biotechnology Research. Prior to sequencing, a PacBio library was made with SMRTbell Adaptors. The Bacteroides dorei genome was assembled to closure from sample 105 after obtaining eight SMRT cells of sequence data. A total of 1,502,920 reads and 1,860,712,096 bases were obtained, with a mean read length of 2706 bp. Average read quality was 0.848.
The initial PacBio reads were error corrected using the PacBio RS_PreAssembler.1 module (Koren et al., ) with a minimum subread length of 400 bp, a minimum read quality of 0.60, and a minimum seed read length of 3800 bp. The error correction process yielded 47,654 reads of 2378 bp average length. Reads were binned according to the coverage reported by the PacBio RS_PreAssembler.1 protocol, and reads with lower than 200× coverage were filtered out. A set of 27 contigs was assembled directly from the binned reads using SPAdes assembler v3.0 (Bankevich et al., ). A single scaffold was obtained by detecting overlaps with Mauve 2.3.1 (Darling et al., ) and manually assembling the remaining contigs. The initial genome assembly was further refined using the PacBio RS_Resequencing.1 module with Quiver consensus calling. The final, circular genome consists of 5,726,633 bp with an overall GC content of 42.0%.
The complete B. dorei genome from sample 439 metagenomic DNA was closed in the same manner as described above for sample 105. The closed 439 genome, at 5,243,219 bp, was about 483 kb smaller than the 105 genome. Both genomes were annotated using the NCBI Prokaryotic Genome Annotation pipeline (Angiuoli et al., ), which relies on GeneMarkS+ for gene prediction (Besemer et al., ). The NCBI accession numbers for the closed 105 and 439 B. dorei genomes are CP007619 and CP008741, respectively. Two-way average nucleotide identities between pairs of genomes were determined as described by Goris et al. ().
Methylation patterns were recovered from the PacBio data for both genomes, using the assembled genomes as references. Methylation data were extracted from the full set of sequence data using Kinetics Tools, which relies on the P_ModificationDetection module in SMRT Portal; this module is used by both the RS Modification Detection and the RS Modifications and Motif Detection protocols.
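The read-filtering thresholds quoted for the pre-assembly step (minimum subread length 400 bp, minimum read quality 0.60, minimum seed read length 3800 bp) can be illustrated with a minimal Python sketch. It is not the RS_PreAssembler.1 implementation: the `Subread` record, the function names, the toy reads, and the choice of inclusive (>=) comparisons are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds taken from the protocol text.
MIN_SUBREAD_LENGTH = 400   # bp
MIN_READ_QUALITY = 0.60
MIN_SEED_LENGTH = 3800     # bp


@dataclass(frozen=True)
class Subread:
    """Hypothetical record holding only the fields needed for filtering."""
    name: str
    length: int
    read_quality: float


def passes_filter(subread):
    # Assumption: thresholds are inclusive (>=); the text does not say.
    return (subread.length >= MIN_SUBREAD_LENGTH
            and subread.read_quality >= MIN_READ_QUALITY)


def split_seeds(subreads):
    """Drop subreads failing the filter, then split the rest into
    seed reads (long enough to be corrected) and shorter reads."""
    kept = [s for s in subreads if passes_filter(s)]
    seeds = [s for s in kept if s.length >= MIN_SEED_LENGTH]
    others = [s for s in kept if s.length < MIN_SEED_LENGTH]
    return seeds, others


# Toy data, invented for this example.
example = [
    Subread("r1", 350, 0.85),    # too short
    Subread("r2", 4200, 0.82),   # passes, long enough to be a seed
    Subread("r3", 1500, 0.55),   # quality too low
    Subread("r4", 2700, 0.85),   # passes, but shorter than a seed
]
seeds, others = split_seeds(example)
print([s.name for s in seeds], [s.name for s in others])
```

In this toy set only `r2` qualifies as a seed read, while `r4` passes the filter but falls below the seed length.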
The Motif Detection protocol generated motifs by comparing methylation patterns to genomic context. […]
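As a rough illustration of relating detected modifications to sequence context, the sketch below locates GATC sites (the motif whose adenine is methylated by Dam) in a reference sequence and reports what fraction of them carry a modification call. This is not the SMRT Portal motif-finding algorithm: the function names, the 0-based coordinate convention, the forward-strand-only scan, and the input format (a plain set of modified positions) are assumptions for this example.

```python
def find_motif_sites(sequence, motif="GATC"):
    """Return 0-based start positions of every (possibly overlapping)
    occurrence of `motif` on the forward strand of a linear sequence."""
    seq = sequence.upper()
    sites = []
    start = seq.find(motif)
    while start != -1:
        sites.append(start)
        start = seq.find(motif, start + 1)
    return sites


def methylated_fraction(sequence, modified_positions, motif="GATC", base_offset=1):
    """Fraction of motif sites whose target base is flagged as modified.

    `modified_positions` is a hypothetical input: 0-based forward-strand
    coordinates of bases called as modified. `base_offset` is the index
    of the target base within the motif (1 = the A of GATC).
    """
    sites = find_motif_sites(sequence, motif)
    if not sites:
        return 0.0
    modified = set(modified_positions)
    hits = sum(1 for site in sites if site + base_offset in modified)
    return hits / len(sites)


# Toy sequence with GATC at positions 2, 8 and 14 (adenines at 3, 9, 15).
toy = "TTGATCAAGATCGGGATCC"
calls = {3, 15, 6}  # invented modification calls; 6 is not in a GATC site
print(find_motif_sites(toy))
print(methylated_fraction(toy, calls))
```

A real analysis would also need to handle the reverse strand and sites spanning the origin of a circular genome, both of which this sketch ignores.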

Pipeline specifications

Software tools: Subread, SPAdes, Mauve, PGAP
Application: Nucleotide sequence alignment
Organisms: Homo sapiens
Chemicals: Adenine