Computational protocol: Draft Genome Sequence of Cellulosilyticum sp. I15G10I2, a Novel Bacterium Isolated from a Coal Seam Gas Water Treatment Pond

Protocol publication

[…] Cellulosilyticum sp. strain I15G10I2 was isolated from a coal seam gas (CSG) water treatment pond sample collected from the Spring Gully water treatment facility, Roma, Queensland, Australia. Strain I15G10I2 was isolated anaerobically at 25°C, following the method of Ogg et al., in a medium containing 0.1% NaCl, 0.1% NH4Cl, 0.03% KH2PO4, 0.01% MgCl2·6H2O, 0.001% CaCl2·2H2O, 50 mM HEPES, 0.01% yeast extract, 0.02% sodium thioglycolate, 0.2% ammonium ferric iron citrate, and 4 mM 4-methoxybenzoate (p-anisic acid). Strain I15G10I2, a strict anaerobe, grew optimally at 37°C and pH 8 and utilized glucose, fructose, xylose, arabinose, galactose, sucrose, starch, cellobiose, xylan, and dextrin as carbon sources. The 16S rRNA gene sequence (1,489 bp) revealed that the closest phylogenetic neighbors were C. lentocellum DSM 5427T (93.71%) and C. ruminicola strain H1T (92.85%), the only two taxonomically validated members of the genus Cellulosilyticum, family Lachnospiraceae, commonly associated with ruminant and human digestive tracts. Here, we present the draft genome sequence of isolate I15G10I2, a novel anaerobe from a CSG water treatment pond.
High-molecular-weight DNA of strain I15G10I2 was extracted using a modification of Marmur’s method and submitted to the Australian Genome Research Facility (AGRF) for TruSeq library preparation and paired-end (PE) sequencing on the Illumina MiSeq platform (2 × 250-bp read lengths). A total of 718,395 PE reads (359,197,500 bp) were quality trimmed and filtered to 524,113 PE reads (258,522,023 bp) using Trimmomatic, and reads sharing an overlap of 30 bp with a maximum overlap difference of 10% were joined using fastq-join. The joined PE reads (179,454) and unjoined PE reads (344,659) were assembled using SPAdes version 3.5.0 to produce a 4,489,861-bp draft genome consisting of 30 contigs (N50 of 419,226 bp, coverage of 56.7×) with a G+C content of 35.23%.
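The read counts and assembly statistics quoted above follow simple definitions, which the minimal Python sketch below illustrates. It checks the read arithmetic using only the figures in the text and defines N50 and G+C content in the usual way. The helper names `n50` and `gc_percent` and the toy contigs are hypothetical and are not taken from the I15G10I2 assembly or from any tool used in the protocol.

```python
from typing import Iterable, List


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths: List[int] = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def gc_percent(sequences: Iterable[str]) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Read accounting, using only the figures quoted in the text.
raw_pairs = 718_395
read_length = 250
raw_bases = raw_pairs * 2 * read_length  # both mates of every pair
joined, unjoined = 179_454, 344_659
filtered_pairs = joined + unjoined  # pairs remaining after trimming

# Toy contigs (hypothetical, NOT from the I15G10I2 assembly).
toy_contigs = ["ATGCATGCAT", "GGCC", "AT"]
toy_n50 = n50(len(c) for c in toy_contigs)
toy_gc = gc_percent(toy_contigs)

print(raw_bases, filtered_pairs, toy_n50, toy_gc)
```

Applied to the contig lengths and sequences of a real SPAdes output, the same two functions would give values comparable to the reported N50 and G+C content, although assemblers and annotation tools may treat ambiguous bases differently.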
Prokka version 1.11 annotation identified 4,170 protein-coding genes and 84 RNA genes. Phyla-AMPHORA analysis of 56 universal genes extracted from 186 representative genomes of the family Lachnospiraceae confirmed that strain I15G10I2 was most closely related to members of the genus Cellulosilyticum. However, BLAST-based average nucleotide identity (ANIb) and the Genome-to-Genome Distance Calculator showed a weak relationship (averages of 72.93% and 19.6%, respectively), suggesting that it was a new species of the genus. RAST analysis of strain I15G10I2 showed that 2,507 genes shared an identity of less than 50% (BLAST) to genes in the two Cellulosilyticum genomes combined, providing further evidence of its novelty. […]
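To show why the reported ANIb and Genome-to-Genome Distance Calculator values point to a distinct species, the sketch below compares them with commonly cited species-level cutoffs. The cutoffs (about 95% ANI and 70% digital DNA-DNA hybridization) are assumptions drawn from general practice and are not stated in the text; the function name `same_species` is hypothetical.

```python
# Commonly cited species-level cutoffs (assumptions, not stated in the text).
ANI_SPECIES_CUTOFF = 95.0   # percent; often quoted as a 95-96% range
DDDH_SPECIES_CUTOFF = 70.0  # percent digital DNA-DNA hybridization


def same_species(ani: float, dddh: float) -> bool:
    """Return True only if both measures meet their species-level cutoff."""
    return ani >= ANI_SPECIES_CUTOFF and dddh >= DDDH_SPECIES_CUTOFF


# Average values reported in the text for strain I15G10I2.
reported_ani = 72.93
reported_dddh = 19.6

is_same = same_species(reported_ani, reported_dddh)
print(f"ANIb {reported_ani}%, dDDH {reported_dddh}%: same species = {is_same}")
```

Both reported values fall well below these cutoffs, which is consistent with the conclusion that the isolate is not a member of either described Cellulosilyticum species.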

Pipeline specifications

Software tools: Trimmomatic, ea-utils, SPAdes, Prokka, AMPHORA, GGDC, RAST
Application: Phylogenetics
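As a rough illustration of how the command-line tools listed above could be chained, the Python sketch below builds argument lists for the trimming, joining, assembly, and annotation steps without executing anything. Several details are assumptions rather than the authors' settings: the `trimmomatic` wrapper name, the trimming steps (`SLIDINGWINDOW:4:20`, `MINLEN:50`), all file names, the mapping of the stated 30-bp overlap and 10% difference to fastq-join's `-m` and `-p` options, and the choice to pass joined reads to SPAdes as single-end reads with `-s`.

```python
import shlex
from typing import Dict, List


def build_commands(r1: str, r2: str, outdir: str = "out") -> Dict[str, List[str]]:
    """Return argument lists for each step; nothing is executed here."""
    # Trimmomatic PE writes paired (P) and unpaired (U) outputs for each mate.
    trim = {tag: f"{outdir}/trim_{tag}.fastq" for tag in ("1P", "1U", "2P", "2U")}
    # fastq-join replaces '%' in the output template with join, un1 and un2.
    template = f"{outdir}/fj.%.fastq"
    joined, un1, un2 = (template.replace("%", t) for t in ("join", "un1", "un2"))
    return {
        # Trimming parameters are hypothetical; the text does not state them.
        "trimmomatic": ["trimmomatic", "PE", r1, r2,
                        trim["1P"], trim["1U"], trim["2P"], trim["2U"],
                        "SLIDINGWINDOW:4:20", "MINLEN:50"],
        # -m: minimum overlap (bp); -p: maximum percent difference.
        "fastq-join": ["fastq-join", "-m", "30", "-p", "10",
                       trim["1P"], trim["2P"], "-o", template],
        # Unjoined pairs as paired-end input, joined reads as single reads.
        "spades": ["spades.py", "-1", un1, "-2", un2, "-s", joined,
                   "-o", f"{outdir}/spades"],
        "prokka": ["prokka", "--outdir", f"{outdir}/prokka",
                   "--prefix", "I15G10I2", f"{outdir}/spades/contigs.fasta"],
    }


# Hypothetical input file names.
commands = build_commands("I15G10I2_R1.fastq", "I15G10I2_R2.fastq")
for step, argv in commands.items():
    print(f"{step}: " + " ".join(shlex.quote(a) for a in argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` in order once the tools are installed. AMPHORA, GGDC, and RAST are left out because the text gives no command-level detail for them.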