Protocol publication

[…] and eight strains selected to represent highly diverse AFLP profiles were chosen for sequencing (Table). The identity of isolate DNA was tested by sequencing approximately 1000 bp of the 16S rRNA gene and by comparing the results with A. butzleri sequences within the National Center for Biotechnology Information (NCBI) genetic database. The DNA for isolates to be sequenced was quantified by spectrophotometry (A600) (Ultrospec 3100 pro, GE Healthcare Life Sciences, Baie d’Urfe, QC). Isolates were sequenced as paired-end, 100 bp reads on a HiSeq platform (Illumina Inc., San Diego, CA) with Phred30 (99.9%) base-calling accuracy, and reads were de novo assembled into contigs using ABySS with specifications for short paired-end reads. Sequencing data for the A. butzleri isolates were accessioned in the NCBI genetic sequence database as a single bioproject (PRJNA233527).

Table 3

Rapid Annotation Using Subsystem Technology (RAST) was used to identify open reading frames (ORFs) for the eight sequenced A. butzleri genomes, as well as three previously available genome assemblies (RM4018 - PRJNA58557, ED1 - PRJNA158699, JV22 - PRJNA61483). The genome assembly for a fourth strain, 7h1h (PRJNA200766), was not available at the time that the comparative genomic analysis was performed; however, we were able to use the four published WGS strains for all subsequent in silico CGF analyses.

To identify core and accessory genes, the ORFs from each genome were searched against the eleven genome assemblies using the program BLASTP from the Basic Local Alignment Search Tool, with filtering to remove redundant results from likely orthologous genes. ORFs present in all assemblies were identified as core, and all non-redundant ORFs absent from one or more strains were designated as accessory.

To simplify CGF assay design, accessory genes with limited genotypic potential due to a highly biased population distribution (i.e. present in greater than 80% of strains or present in fewer than 20% of strains) were eliminated from further consideration as candidate markers. Moreover, for groups of accessory genes that presented redundant patterns of presence and absence in the dataset (i.e. genes that are typically linked and provide l […]
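The core/accessory split and the marker filtering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a gene presence/absence table has already been derived from the BLASTP results, and the function name `select_markers`, the gene and strain names, and the choice of the alphabetically first gene as the representative of each redundant group are all hypothetical.

```python
from collections import defaultdict


def select_markers(presence, strains, low_pct=20, high_pct=80):
    """Classify genes and pick candidate markers (illustrative sketch only).

    presence: dict mapping gene id -> set of strains in which the gene was found
              (assumed to come from filtered BLASTP hits).
    strains:  list of all strain names in the comparison.
    """
    n = len(strains)

    # Core genes occur in every assembly; everything else is accessory.
    core = [g for g, carriers in presence.items() if len(carriers) == n]
    accessory = [g for g, carriers in presence.items() if len(carriers) < n]

    # Drop accessory genes found in more than 80% or fewer than 20% of strains.
    # Integer arithmetic avoids floating-point edge cases at the cut-offs.
    candidates = [
        g for g in accessory
        if low_pct * n <= 100 * len(presence[g]) <= high_pct * n
    ]

    # Group candidates that share an identical presence/absence pattern.
    by_pattern = defaultdict(list)
    for g in candidates:
        pattern = tuple(s in presence[g] for s in strains)
        by_pattern[pattern].append(g)

    # Assumption: keep one arbitrary representative (alphabetically first) per group.
    representatives = sorted(min(genes) for genes in by_pattern.values())
    return core, accessory, candidates, representatives


# Invented toy data: ten strains and six genes.
demo_strains = [f"s{i}" for i in range(1, 11)]
demo_presence = {
    "geneA": set(demo_strains),               # in all strains -> core
    "geneB": {"s1", "s2", "s3"},              # 30% of strains
    "geneC": {"s1", "s2", "s3"},              # same pattern as geneB
    "geneD": {"s1"},                          # 10% of strains, too rare
    "geneE": set(demo_strains) - {"s10"},     # 90% of strains, too common
    "geneF": {"s4", "s5", "s6", "s7"},        # 40% of strains
}
core, accessory, candidates, representatives = select_markers(
    demo_presence, demo_strains
)
print(core, candidates, representatives)
```

In this toy data set, geneB and geneC carry the same information, so only one of them is retained as a marker. How a representative is chosen from each redundant group in practice is not stated in the excerpt.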

Pipeline specifications

Software tools: ABySS, RAST, BLASTP, BLASTN