[…] A DNA library was prepared using a Nextera™ DNA Sample Prep Kit (Illumina-compatible; EPICENTRE Biotechnologies, Madison, WI, USA), and DNA clusters were generated on a slide using a Cluster Generation Kit (version 2) with an Illumina cluster station (Illumina, San Diego, CA, USA) according to the manufacturer's instructions. The standard Illumina protocol was followed to obtain approximately 1.0 × 10⁷ short reads per lane. All sequencing runs, which generated 126-mers, were performed on a Genome Analyzer IIx using an Illumina Sequencing Kit […]. Fluorescence images were analyzed using the Illumina base-calling pipeline (version 1.4.0) to obtain FASTQ-formatted sequence data. The short-read sequences have been deposited in the DNA Data Bank of Japan (DDBJ; accession numbers DRA000895 and DRA001171). All of the obtained sequencing reads were aligned to a reference human genomic sequence using the BWA-SW read-mapping software (Li and Durbin, ), with quality trimming to remove low-quality reads. The remaining sequence reads were subjected to a megaBLAST search against a nucleotide database. The results of this search were analyzed and visualized using MEGAN version 4.62.3 (Huson et al., ), with a minimum support of 1 hit and a minimum score of 150. [...] A metagenomic biomarker discovery approach, LEfSe, was used to identify the microbial components whose sequences were more abundant in the fecal samples of the KD patients during the acute phase than in those of the KD patients during the non-acute phase and the controls. LEfSe first applies Kruskal–Wallis and pairwise Wilcoxon tests and then uses linear discriminant analysis (LDA) to estimate the effect size of each differentially abundant taxon (Segata et al., ). In this study, a p-value of <0.05 was considered significant for both statistical tests. Bacteria with markedly increased numbers were defined as those with an LDA score (log10) greater than 2. 
Taxa accounting for less than 0.01% of the total bacterial reads, corresponding to ≤10⁷ CFU/g feces, were omitted from further analysis because their read counts were low and unreliable, even when LEfSe reported significant LDA scores for them. [...] A draft genome sequence was obtained by whole-genome sequencing on a MiSeq with a Nextera XT library preparation kit (Illumina), followed by de novo assembly with the A5-MiSeq pipeline (Tritt et al., ). The resulting scaffolds were annotated using the RAST server (Aziz et al., ). Maximum likelihood phylogenetic analysis of Streptococcus 16S rDNA was performed using MEGA 6.0 with 1000 bootstrap iterations (Tamura et al., ). […]
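
The selection criteria stated for the LEfSe results (p-value <0.05, LDA score (log10) over 2, and exclusion of taxa below 0.01% of the total bacterial reads) can be illustrated with a minimal Python sketch. This is not the authors' code and does not call LEfSe itself: the `TaxonResult` record, the `select_biomarkers` function, the taxon names, and all numbers in the example are hypothetical, and the sketch assumes that the p-values and LDA scores have already been exported from a LEfSe run and that the abundance cutoff is applied to a single pooled read total.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the text; everything else here is illustrative.
P_VALUE_CUTOFF = 0.05        # significance level for the statistical tests
LDA_CUTOFF = 2.0             # LDA score (log10) must exceed this value
MIN_READ_PERCENT = 0.01      # taxa below 0.01% of bacterial reads are dropped


@dataclass
class TaxonResult:
    """Hypothetical record for one taxon, assumed exported from a LEfSe run."""
    name: str
    read_count: int
    p_value: float
    lda_score: float  # already on the log10 scale


def select_biomarkers(results: List[TaxonResult],
                      total_bacterial_reads: int) -> List[str]:
    """Return names of taxa that pass all three criteria."""
    selected = []
    for r in results:
        read_percent = 100.0 * r.read_count / total_bacterial_reads
        if (r.p_value < P_VALUE_CUTOFF
                and r.lda_score > LDA_CUTOFF
                and read_percent >= MIN_READ_PERCENT):
            selected.append(r.name)
    return selected


# Made-up example values, for illustration only.
example = [
    TaxonResult("taxon_A", 50_000, 0.01, 3.5),  # passes every criterion
    TaxonResult("taxon_B", 50, 0.01, 2.5),      # 0.005% of reads: too rare
    TaxonResult("taxon_C", 20_000, 0.20, 3.0),  # p-value not significant
    TaxonResult("taxon_D", 30_000, 0.01, 1.5),  # LDA score too low
]
print(select_biomarkers(example, total_bacterial_reads=1_000_000))
```

In this sketch the abundance filter is applied last, so a taxon such as the hypothetical `taxon_B` is dropped despite a significant LDA score, mirroring the exclusion rule described in the text.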

Pipeline specifications

Software tools: BWA, BLASTN, MEGAN, LEfSe, A5, RAST, MEGA
Databases: DDBJ
Applications: Phylogenetics, Metagenomic sequencing analysis
Organisms: Homo sapiens
Diseases: Pneumonia, Genetic Diseases, Inborn