Computational protocol: An antimicrobial peptide-resistant minor subpopulation of Photorhabdus luminescens is responsible for virulence

Protocol publication

[…] Genomic DNA was extracted with the QIAamp DNA Mini Kit (Qiagen) from bacteria grown in LB or LB plus polymyxin B and harvested at an OD540 of 0.3–0.45. An additional purification was performed with the DNA Clean-up kit (MoBio Laboratories). The DNA libraries were prepared according to PacBio guidelines and sequenced on two PacBio SMRT cells on a Pacific Biosciences RSII instrument (Genome Québec, Montréal, Canada). The raw data were processed with the PacBio SMRT Analysis Suite (v2.3 p3). The reads were assembled de novo with the Hierarchical Genome Assembly Process (HGAP.3) algorithm. The assembled genomes were deposited in the Microscope platform database. No major rearrangements were detected with progressiveMauve 2.1.0.a1. Conserved Synteny LinePlot revealed 100% conservation of synteny groups between the two genomes studied, with a synton size ≥3 and the P. luminescens TT01 NC_005126 genome as the reference (data not shown). We used two programs for the detection of SNPs, insertions and deletions: GATK v3.3.0 (the Genome Analysis Toolkit), which was used to map PacBio reads onto the NC_005126 reference genome, and LAST v712, which aligns the NC_005126 reference genome with the two HGAP.3-assembled genomes. Poor-quality regions containing homopolymers of ≥4 nucleotides or discordant multiple alignments (i.e. regions mapping several times onto the reference genome, with one correct mapping and one or more mappings displaying <100% identity) were considered irrelevant and discarded. With repeats making up 20% of its sequence, the P. luminescens TT01 genome (NC_005126) has been identified as having a particularly high level of repeat coverage.

[...] The RNA-seq libraries were prepared with the TruSeq® Stranded mRNA sample prep kit (Illumina). Samples depleted of rRNA were fragmented and reverse-transcribed with random hexamers, Superscript II (Life Technologies) and actinomycin D.
During the generation of the second strand, dTTP was replaced with dUTP. Double-stranded cDNAs were adenylated at their 3′ ends before ligation with Illumina indexed adapters. Ligated cDNAs were amplified by 15 cycles of PCR and purified with AMPure XP Beads (Beckman Coulter Genomics). Libraries were validated with a DNA1000 chip (Agilent) and quantified with the KAPA Library quantification kit (Clinisciences). Twelve libraries were pooled in equimolar amounts and sequenced in a single lane of a HiSeq2000 machine with the single-read protocol (50 nt). Image analysis and base-calling were performed with Illumina HiSeq Control Software and the Real-Time Analysis component. Demultiplexing was performed with Illumina sequencing analysis software (CASAVA 1.8.2). Data quality was assessed with FastQC (Babraham Institute) and the Sequencing Analysis Viewer (SAV; Illumina).

[...] High-throughput transcriptomic sequencing data were processed with a bioinformatic pipeline implemented at the Microscope platform. The reads were mapped onto the P. luminescens subsp. laumondii TT01 genome sequence (EMBL accession number: BX470251) with BWA software (v. 0.7.4). We then used SAMtools (v. 0.1.12) to lower the false-positive discovery rate and to extract reliable alignments from BAM-formatted files. The number of reads matching each genomic object harbored by the reference genome was then calculated with the Bioconductor-GenomicFeatures package. For reads matching several genomic objects, the count number was weighted so as to keep the total number of reads constant. Finally, we used the Bioconductor-DESeq package with default parameters to analyze raw count data and to evaluate differential expression between conditions. In more detail, the variance was estimated for each gene with the negative binomial distribution, and a per-gene Wald test then generated p-values that were adjusted by the false discovery rate (FDR) method.
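The FDR adjustment of per-gene p-values mentioned above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the protocol text does not name explicitly; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the protocol's "FDR method" is Benjamini-Hochberg;
    the source text does not state this explicitly.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, so that
    # adjusted values never decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values, for illustration only.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the backward pass keeps the adjusted values monotone. Genes would then be called differentially expressed when the adjusted value falls below a chosen threshold, which the text does not specify.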
Between 11 and 20 million 50-base Illumina reads were obtained for each sample, and between 15% and 40% of the high-quality reads mapped to at least one site in the reference genome. The complete dataset from this study has been deposited in the GEO database, under accession number GSE76559. […]
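The weighting of reads that match several genomic objects, described in the counting step above, can be sketched as follows. This is an illustration only: it assumes each such read is split equally among the objects it matches, which is one simple way to keep the total read count constant. The protocol does not state the exact weighting scheme, and the function and gene names are hypothetical.

```python
from collections import defaultdict


def weighted_counts(read_hits):
    """Count reads per genomic object with fractional weights.

    read_hits: iterable of lists, one per read, holding the IDs of the
    genomic objects that the read matches.
    Assumption: a read matching n objects adds 1/n to each of them.
    """
    counts = defaultdict(float)
    for features in read_hits:
        unique = set(features)
        if not unique:
            continue  # the read matches no annotated object
        weight = 1.0 / len(unique)
        for feature in unique:
            counts[feature] += weight
    return dict(counts)


# Hypothetical read-to-object assignments, for illustration only.
example_hits = [
    ["geneA"],
    ["geneA", "geneB"],
    ["geneB"],
    ["geneA", "geneB", "geneC", "geneD"],
]
example_counts = weighted_counts(example_hits)
```

With this scheme the per-object counts sum to the number of assigned reads, so the total is preserved, at the cost of producing non-integer counts.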

Pipeline specifications

Software tools: SMRT-Analysis, Mauve, GATK, HSC, BaseSpace, FastQC, BWA, SAMtools, GenomicFeatures, DESeq
Applications: RNA-seq analysis, Nucleotide sequence alignment
Organisms: Photorhabdus luminescens, Bacteria