Computational protocol: Draft Genome Sequences of Type Strains Bacillus drentensis DSM 15600T and Bacillus novalis DSM 15603T

Protocol publication

[…] Type strains Bacillus drentensis DSM 15600T and Bacillus novalis DSM 15603T are Gram-positive, spore-forming, aerobic bacteria isolated from the soil of several disused hay fields in the Drentse A agricultural research area (the Netherlands). Because the cost of genome sequencing has recently fallen, it has been proposed that whole-genome sequence information be combined with the main phenotypic characteristics in a polyphasic strategy (taxono-genomics) for describing new bacterial taxa. In this study, a high-quality genome sequence of B. drentensis DSM 15600T was obtained, which may promote research on the genomic taxonomy of the Bacillus-like bacteria.

The genomes of B. drentensis DSM 15600T and B. novalis DSM 15603T were sequenced with massively parallel sequencing (MPS) Illumina technology. Two DNA libraries were constructed: a paired-end library with an insert size of 500 bp and a mate-pair library with an insert size of 5 kb. Both libraries were sequenced on an Illumina HiSeq 2500 using a PE125 (paired-end, 125-bp reads) strategy. Library construction and sequencing were performed at Beijing Novogene Bioinformatics Technology Co., Ltd. Quality control of both paired-end and mate-pair reads was performed using an in-house program, which filtered out reads containing Illumina PCR adapters and low-quality reads. The filtered reads were assembled with SOAPdenovo to generate scaffolds, and all reads were then used for gap closure. The assemblies comprised 5,305,306 bp in three scaffolds for B. drentensis DSM 15600T and 5,668,192 bp in two scaffolds for B. novalis DSM 15603T, with scaffold N50 values of 5,303,701 bp and 5,667,584 bp, respectively. The longest and shortest scaffolds were 5,303,701 bp and 689 bp for B. drentensis DSM 15600T and 5,667,584 bp and 608 bp for B. novalis DSM 15603T.

For the genome assemblies of these two species, gene prediction was performed with GeneMarkS.
Transfer RNA (tRNA) genes were predicted with tRNAscan-SE, rRNA genes were predicted with RNAmmer, and short RNAs (sRNAs) were predicted by BLAST against the Rfam database. PHAST was used for prophage prediction and CRISPRFinder was used for identification of clustered regularly interspaced short palindromic repeats (CRISPRs). In total, 5,516 genes were predicted for B. drentensis DSM 15600T, including 5,337 coding sequences (CDS), four sRNAs, 125 tRNAs, and 50 rRNAs (17 5S, 16 16S, and 17 23S). For B. novalis DSM 15603T, 5,986 genes were predicted, including 5,827 CDS, five sRNAs, 118 tRNAs, and 36 rRNAs (13 5S, 11 16S, and 12 23S). The average DNA G+C contents were 38.91% and 40.01%, respectively. […]
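
The assembly and annotation figures reported above can be cross-checked with a few lines of Python. The sketch below is illustrative only and is not part of the published pipeline. The function names are hypothetical, and the middle B. drentensis scaffold length (916 bp) is not stated in the text; it is inferred here by subtracting the longest and shortest scaffolds from the reported assembly total.

```python
def scaffold_n50(lengths):
    """Return the N50: the length of the scaffold at which the running sum
    of scaffold lengths, taken from longest to shortest, first reaches at
    least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


def feature_total(cds, srna, trna, rrna_5s, rrna_16s, rrna_23s):
    """Sum the per-class feature counts to compare with the reported gene total."""
    return cds + srna + trna + rrna_5s + rrna_16s + rrna_23s


# Scaffold lengths in bp. Longest and shortest values come from the text;
# 916 bp is an inferred value (5,305,306 - 5,303,701 - 689), not a reported one.
drentensis = [5_303_701, 916, 689]
novalis = [5_667_584, 608]

print("B. drentensis total bp:", sum(drentensis), "N50:", scaffold_n50(drentensis))
print("B. novalis total bp:", sum(novalis), "N50:", scaffold_n50(novalis))

# Feature counts as reported: CDS, sRNA, tRNA, then 5S, 16S, and 23S rRNA.
print("B. drentensis genes:", feature_total(5337, 4, 125, 17, 16, 17))
print("B. novalis genes:", feature_total(5827, 5, 118, 13, 11, 12))
```

Because one scaffold in each assembly holds nearly the whole genome, the N50 equals the longest scaffold in both cases, and the per-class counts add up to the reported gene totals of 5,516 and 5,986.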

Pipeline specifications