Computational protocol: Whole genome sequencing of enriched chloroplast DNA using the Illumina GAII platform

Protocol publication

[…] Genomic DNA (gDNA), chloroplast-enriched DNA (cpDNA) and RCA-amplified chloroplast-enriched DNA (RCAcpDNA) from the Chatham Island sample were quantified fluorometrically using the Quant-iT™ dsDNA HS assay kit on a Qubit™ Quantitation Platform (Invitrogen). The concentrations of the gDNA, cpDNA and RCAcpDNA were 110 ng μL⁻¹, 20 ng μL⁻¹ and 104 ng μL⁻¹, respectively, in a total volume of 50 μL of AE buffer (Qiagen). The purity of the gDNA, cpDNA and RCAcpDNA samples was determined from A260/A280 and A260/A230 ratios on a NanoDrop spectrophotometer (NanoDrop Technologies). Enrichment for chloroplast DNA was assessed by quantitative real-time PCR (qPCR) with gDNA, cpDNA and RCAcpDNA templates; the quantity of the plastid gene psbB was determined relative to nuclear-encoded 18S rRNA by comparative quantification []. Gene-specific primers were designed for psbB (psbB F 5'-GGGGGTTGGAGTATCACAGG-3'; psbB R 5'-CCAAGAAGCACAAGCCAGAA-3'; 103 bp amplicon) using Primer3 []; primers for 18S are described by Zhu and Altmann []. qPCR was performed using LightCycler 480 SYBR Green I Master (Roche Diagnostics) reagents in a Rotor-Gene 3000 instrument (Corbett Research) with four technical replicates per sample. For qPCR, template DNA was diluted 20-fold for cpDNA and 100-fold for gDNA and RCAcpDNA. The qPCR cycling conditions were: 95°C for 10 min, then 40 cycles of 95°C for 10 s, 60°C for 15 s and 72°C for 20 s, with fluorescence detection at 72°C and during the final melt. Melt curve analysis confirmed the amplification of a single product. […]
Reads from each indexed sample were trimmed to remove poor-quality sequence at the 3' end. To determine the optimum trim length, initial de novo assemblies were made for read sets of different lengths (untrimmed reads, and reads trimmed to 70, 65, 55 and 50 bp). These assemblies were carried out using Velvet 0.7 [] with a range of hash lengths from 33 to 63 and a minimum k-mer coverage of 5×.
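The trimming and hash-length sweep described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' scripts: the function names are hypothetical, the untrimmed read length is not given in the protocol and is passed in as a parameter, and the step of 2 between hash lengths is assumed (Velvet works with odd hash lengths, but the exact values tried are not stated). The sketch only enumerates the combinations to assemble; it does not run Velvet.

```python
# Trim lengths named in the protocol; None stands for the untrimmed read set.
TRIM_LENGTHS = [None, 70, 65, 55, 50]


def trim_3prime(seq, qual, length):
    """Keep the first `length` bases of a read (5' end) and the matching qualities.

    Passing None leaves the read untrimmed.
    """
    if length is None:
        return seq, qual
    return seq[:length], qual[:length]


def hash_lengths(lo=33, hi=63, step=2):
    """Odd hash lengths from lo to hi inclusive (the step of 2 is an assumption)."""
    return list(range(lo, hi + 1, step))


def assembly_grid(untrimmed_length, trims=TRIM_LENGTHS):
    """List the (trim length, hash length) pairs worth assembling.

    A read of length L contains L - k + 1 k-mers, so hash lengths longer
    than the (trimmed) read length are skipped.
    `untrimmed_length` is a hypothetical input; the protocol does not state it.
    """
    grid = []
    for trim in trims:
        effective = untrimmed_length if trim is None else min(trim, untrimmed_length)
        for k in hash_lengths():
            if k <= effective:
                grid.append((trim, k))
    return grid


if __name__ == "__main__":
    # 75 bp is a placeholder read length used only for this demonstration.
    for trim, k in assembly_grid(75):
        label = "untrimmed" if trim is None else f"{trim} bp"
        print(f"reads: {label}, hash length: {k}")
```

Skipping hash lengths that exceed the trimmed read length keeps the grid to combinations that can yield k-mers at all; for reads trimmed to 50 bp, for example, only hash lengths up to 49 remain.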
For these initial assemblies, the data were treated as single reads; that is, the paired-end information was not used. Maximum contig lengths and N50 values were tabulated, and the hash length that gave the highest N50 for each trimmed set of reads was selected for further optimisation. A second round of assembly was carried out on each trimmed set of reads using the hash length determined above and varying the coverage cut-off parameter from 1 to 100. Finally, paired-end assembly was carried out for each of these read-length/hash-length combinations using the coverage cut-off value that gave the highest N50. For these paired-end assemblies, expected coverage was set to the length-weighted median of the coverage values obtained in the initial single-read assemblies, and the insert length was estimated as 240 bp. Assembled contigs were aligned to the Cucumis sativus chloroplast genome [GenBank: NC_007144; GenBank: DQ119058] using Geneious 4.7 [].
Four short regions of ambiguous sequence were checked by PCR amplification using the following primers, custom designed using Primer3 [] unless referenced: Corlaerps2-rpoc2F (TATAGGGTGCCATTCGAGGA), Corlaerps2-rpoc2R (GTATCAACAACGGCCAATCC); CorlaendhAF (GGAATAGGATGGAGATAAGAAAGAC), CorlaendhAR (CACGATTCCGATCCAGAGTA); psbJ (ATAGGTACTGTARCYGGTAT) [], petA (AACARTTYGARAAGGTTCAATT) []; psbAR (CGCGTCTCTCTAAAATTGCAGTCAT) [], CorlaepsbA-R (ATCCGACTAGTTCCGGGTTC). Figure shows the relative position of the priming sites on the karaka chloroplast genome. The PCR cycling conditions were modified slightly from an existing published protocol [] as follows: template denaturation at 80°C for 5 min; 32 cycles of denaturation at 95°C for 1 min, primer annealing at 50°C for 1 min, a ramp of 0.3°C/s to 65°C and primer extension at 65°C for 4 min; and a final extension step of 5 min at 65°C.
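The two statistics that drive the assembly optimisation earlier in this section, N50 and the length-weighted median coverage, can be made concrete with a short Python sketch. It assumes contig lengths and per-contig coverage values have already been extracted from the assembler output; the function names and input format are hypothetical, and the weighted median uses one common convention (the coverage at which cumulative contig length first reaches half the total), which may differ in detail from what the authors computed.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")


def length_weighted_median(contigs):
    """Return the length-weighted median coverage.

    `contigs` is a list of (length, coverage) pairs. Contigs are sorted by
    coverage, and the result is the coverage at which the cumulative contig
    length first reaches half of the total length.
    """
    ordered = sorted(contigs, key=lambda c: c[1])
    half = sum(length for length, _ in ordered) / 2
    running = 0
    for length, coverage in ordered:
        running += length
        if running >= half:
            return coverage
    raise ValueError("no contigs supplied")


if __name__ == "__main__":
    # Made-up values for illustration only; these are not data from the study.
    example = [(1000, 30.0), (200, 5.0), (300, 120.0)]
    print(n50([length for length, _ in example]))
    print(length_weighted_median(example))
```

Weighting by contig length keeps many short, low-coverage or very high-coverage contigs from skewing the expected-coverage estimate, which is why a plain median over contigs would be a poorer choice here.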
Amplified PCR products were sequenced using the BigDye Terminator Cycle Sequencing Kit (Applied Biosystems) and an ABI 3730 automated capillary sequencer at Massey Genome Service (Massey University, Palmerston North, New Zealand). The resulting sequences were visualised and edited using Sequencher 4.9 software for Mac (Gene Codes Corporation, Ann Arbor, MI). Using Geneious [], the four ambiguous regions of the assembled genome were edited, where necessary, to match the Sanger sequences. […]

Pipeline specifications

Software tools: Primer3, Velvet, Geneious, Sequencher
Application: qPCR
Organisms: Corynocarpus laevigatus