[…] The quality of all raw sequencing data was analysed using FastQC 0.11.1, and an in-house script was used to retain reads with a PHRED quality score of at least 20 (i.e., the -q 20 parameter) and to exclude adaptor sequences (i.e., the -l 17 parameter). Genomes were then assembled de novo using Newbler 2.0 (Roche, USA) with the parameters -urt -noace -m -a 50, and scaffolds were generated using CONTIGuator with default parameters and SA20 as the reference strain. Gaps were filled manually in CLC Genomics Workbench 7.0 (CLC-gw) (Qiagen, USA) by mapping the reads to the genomes and gradually extending the flanks of the gaps. SA20 was annotated manually using the UniProt database, and this annotation was then used to annotate the other strains with Prokka version 1.11, modified to query nested databases in this order: manually curated CDSs from SA20, the RefSeq database restricted to S. agalactiae proteins, and finally all proteins from RefSeq. The annotated genomes were visualized in Artemis, and frameshifts were corrected manually in CLC-gw. […]
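
The read-filtering step can be illustrated with a minimal sketch. The in-house script itself is not described here, so everything below is an assumption made for illustration: the function name trim_and_filter, Phred+33 quality encoding, and the reading of the two thresholds as "trim 3'-end bases below Q20, then discard reads shorter than 17 bases". It is not the authors' implementation.

```python
from typing import Iterable, Iterator, Tuple

# A read is modelled as (header, sequence, quality string); hypothetical type.
Read = Tuple[str, str, str]

PHRED_OFFSET = 33  # assumption: Sanger / Illumina 1.8+ (Phred+33) encoding


def trim_and_filter(
    reads: Iterable[Read], min_q: int = 20, min_len: int = 17
) -> Iterator[Read]:
    """Trim low-quality bases from the 3' end, then drop short reads.

    min_q and min_len mirror the values 20 and 17 quoted in the text;
    how the original script applies them is an assumption.
    """
    for header, seq, qual in reads:
        end = len(qual)
        # Walk back from the 3' end while the base quality is below min_q.
        while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < min_q:
            end -= 1
        # Keep the read only if enough bases survive trimming.
        if end >= min_len:
            yield header, seq[:end], qual[:end]
```

Under these assumptions, a 20-base read whose last two bases fall below Q20 would be kept as an 18-base read, while a read trimmed to fewer than 17 bases would be discarded.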

Pipeline specifications

Software tools: FastQC, Newbler, CONTIGuator, CLC Genomics Workbench, Prokka
Application: De novo sequencing analysis
Organisms: Streptococcus agalactiae