[…] Whole Genome Shotgun project including the genome sequences of 14 Curtobacterium, 1 Frigoribacterium, and 1 Plantibacter isolates deposited at GenBank under BioProject PRJNA342146 with accessions MJGI00000000-MJGX00000000. Paired-end 100 bp × 100 bp whole genome sequencing libraries with a mean gap size of 400 bp were prepared from genomic DNA using the Nextera DNA Library Preparation Kit (Illumina Inc., San Diego, CA, USA). Genomes were sequenced on an Illumina HiSeq 2500 instrument (Illumina Inc., San Diego, CA, USA) at the Whitehead Institute Genome Technology Core (Cambridge, MA). After quality trimming and removal of short (<30 bp) reads, an initial de novo assembly was performed in CLC Genomics Workbench (CLC Bio, Cambridge, MA, USA) using default parameters.
Genomes (both fully assembled genomes and whole genome shotgun assemblies) belonging to the Microbacteriaceae were retrieved from the Pathosystems Resource Integration Center (PATRIC) database (Wattam et al., ). To annotate these downloaded genomes together with our isolate genomes, we first called open reading frames (ORFs) with Prodigal v2.6 (Hyatt et al., ). The predicted ORFs were then searched against the Pfam database (Finn et al., ) for the presence of protein families using HMMer (Johnson et al., ). We identified the GH families as in Berlemont and Martiny () and compiled the number of occurrences of each GH family in each genome. To create a phylogeny of the genomes, the 16S rRNA region of each genome was predicted using Barrnap. The resulting sequences were used for phylogenetic reconstruction as described above.
We isolated 17 Curtobacterium strains from two invasive grassland sites. Although similar in their vegetation, the LRGCE and BACE sites are 4130 km apart across the North American continent. Even so, Curtobacterium strains comprised 10 and 15% of culturable isolates at LRGCE and BACE, respectively.
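The per-genome GH tally described in the annotation step above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `count_gh_families`, the three-entry `PFAM_TO_GH` mapping, and the toy hit records (including genome names, ORF names, and Pfam version suffixes) are all assumptions made for illustration. The actual Pfam-to-GH assignments follow Berlemont and Martiny (), and the hits would come from the HMMer search against Pfam.

```python
from collections import Counter, defaultdict

# Illustrative subset of Pfam accessions for glycoside hydrolase (GH) families.
# Assumption: the mapping used in the study is larger and should be taken from
# the cited reference, not from this sketch.
PFAM_TO_GH = {
    "PF00150": "GH5",
    "PF00232": "GH1",
    "PF00331": "GH10",
}


def count_gh_families(hits, pfam_to_gh=PFAM_TO_GH):
    """Count GH family occurrences per genome.

    hits: iterable of (genome_id, orf_id, pfam_accession) tuples, e.g. parsed
    from a Pfam search of the predicted ORFs.
    Returns {genome_id: {gh_family: count}}.
    """
    counts = defaultdict(Counter)
    for genome_id, _orf_id, pfam_acc in hits:
        # Drop a version suffix such as ".18" before the lookup.
        family = pfam_to_gh.get(pfam_acc.split(".")[0])
        if family is not None:
            counts[genome_id][family] += 1
    return {genome: dict(c) for genome, c in counts.items()}


# Hypothetical toy hits, not real results.
toy_hits = [
    ("genome_A", "orf_1", "PF00150.18"),
    ("genome_A", "orf_2", "PF00150.18"),
    ("genome_A", "orf_3", "PF00232.18"),
    ("genome_A", "orf_4", "PF00072.24"),  # not in the GH mapping, so ignored
    ("genome_B", "orf_1", "PF00331.20"),
]

gh_counts = count_gh_families(toy_hits)
print(gh_counts)
```

In this sketch every matching domain hit is counted once, so an ORF carrying two GH domains contributes two occurrences. Whether that matches the counting rule of the study is an assumption.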
Beyond these two terrestrial sites, data collected from a wide array of studies and isolation sources reveal that Curtobacterium is an abundant and globally distributed taxon. In total, we obtained 3360 16S rRNA sequences with corresponding metadata from GenBank and the EMP databases. The genu […]

