Protocol publication

[…] g(N−1))≈1 bit, corresponding to the likelihood of Stj being twice as large as that of Stk. For example, if a strain has 1000 genes, we would need to observe 693 genes without observing g to conclude that the observed data were twice as likely to have been generated from the species with a single gene deletion. For comparison, we would need to sequence only 100 genes from Stk to get an expected log-likelihood difference of 1 bit versus Stj, demonstrating the extra information in gene 'presence' versus 'absence' typing.

We downloaded the resistance gene database from ResFinder [] (accessed July 2015). We aligned each gene to the collection of bacterial genomes in RefSeq using blastn [], and used the best alignment of each gene to extract the 100 bp sequences flanking the antibiotic resistance gene. We found that including these flanking sequences improved the sensitivity of mapping MinION reads to the gene database.

We then grouped these genes at 90% sequence identity into 609 groups. We manually checked the groups and found that genes within a group were variants of the same gene. We selected the longest gene in each group to make up a reduced resistance gene database. To create a benchmark of resistance genes for a sample, we used blastn to compare the Illumina assembly of the sample against this reduced gene database, and reported genes with greater than 85% coverage and identity.

Our analysis pipeline aligned MinION sequencing data to this reduced resistance gene database using BWA-MEM [] in a streamlined fashion, and examined genes with reads mapping to the whole gene (not including flanking sequences). Because of the high error rate of MinION sequence data, we observed a high rate of false-positive genes. To reduce false positives, we used kalign2 [] to perform a multiple alignment of the reads that were aligned to the same gene. The consensus sequence resulting from the multiple alignment was then compared with the gene sequence using a probabilistic Finite State Machine (see below). The pipeline then reported gene classes based on the genes detected.

Our methods for MLST strain typing and antibiotic resistance gene identification require the alignment of a consensus sequence to a gene or a gene allele. Such an alignment generally assumes a model, and a set of parameters, describing the differences between the sequences. It is widely recognised that the accuracy of the alignment is sensitive to these parameters [–]. However, in the contex […]
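
The 693-gene figure quoted above can be checked with a short calculation. The sketch below is not the authors' code, and the formula it uses is an assumption because the original derivation is truncated in this excerpt: it supposes each observed gene is drawn uniformly at random from the strain's gene set, so a strain carrying g yields a non-g observation with probability (N−1)/N, while the strain with g deleted does so with probability 1. The function name `absent_observations_for_bits` is hypothetical.

```python
import math


def absent_observations_for_bits(n_genes: int, bits: float = 1.0) -> int:
    """Smallest number of observed genes, none of them g, needed for the
    log-likelihood ratio to favour the deletion strain by `bits` bits."""
    if n_genes < 2:
        raise ValueError("n_genes must be at least 2")
    # Assumed model: uniform random draws from the strain's gene set.
    # Each draw that misses g contributes -log2((N-1)/N) bits in favour
    # of the strain that lacks g.
    bits_per_observation = -math.log2((n_genes - 1) / n_genes)
    return math.ceil(bits / bits_per_observation)


print(absent_observations_for_bits(1000))  # 693
```

Under this assumed model the required count is roughly N·ln 2, which for N = 1000 matches the 693 genes stated in the text.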

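The construction of the reduced resistance gene database (group genes at 90% identity, keep the longest gene per group) might be sketched as follows. The excerpt does not say which clustering method was used, so single-linkage grouping with a union-find structure is an assumption here, as are the function names, the input format (precomputed pairwise identities) and the gene names in the usage example.

```python
from typing import Dict, Iterable, List, Tuple


def group_genes(
    lengths: Dict[str, int],
    identities: Iterable[Tuple[str, str, float]],
    threshold: float = 90.0,
) -> List[List[str]]:
    """Single-linkage grouping: genes joined by any pair at or above
    `threshold` percent identity end up in the same group."""
    parent = {gene: gene for gene in lengths}

    def find(gene: str) -> str:
        # Follow parent links to the group root, compressing the path.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for gene_a, gene_b, identity in identities:
        if identity >= threshold:
            parent[find(gene_a)] = find(gene_b)

    groups: Dict[str, List[str]] = {}
    for gene in lengths:
        groups.setdefault(find(gene), []).append(gene)
    return list(groups.values())


def reduced_database(
    lengths: Dict[str, int],
    identities: Iterable[Tuple[str, str, float]],
    threshold: float = 90.0,
) -> List[str]:
    """Keep the longest gene of each group as its representative."""
    groups = group_genes(lengths, identities, threshold)
    return sorted(max(group, key=lambda g: lengths[g]) for group in groups)


# Hypothetical gene names, lengths (bp) and pairwise identities (%).
lengths = {"geneA_1": 861, "geneA_2": 876, "geneB_1": 1200}
identities = [("geneA_1", "geneA_2", 98.5), ("geneA_1", "geneB_1", 41.0)]
print(reduced_database(lengths, identities))  # ['geneA_2', 'geneB_1']
```

Single linkage can chain distantly related genes into one group through intermediates, which is one reason a manual check of the groups, as described in the text, is worthwhile.
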
Pipeline specifications

Software tools: Metrichor, npReader, Trimmomatic
Diseases: Bacterial Infections