Computational protocol: Along for the ride or missing it altogether: exploring the host specificity and diversity of haemogregarines in the Canary Islands

Protocol publication

[…] Molecular characterization was performed on a subset of the infected samples. For Gallotia, up to five random representatives from each infected population were extracted, and some additional individuals were sequenced to confirm parasite identification. All infected Tarentola and Chalcides samples were also analyzed, although all PCR amplification attempts failed for one Chalcides individual. DNA from host blood or tissue was extracted using standard high-salt methods []. PCR reactions were performed using the primers HepF300 and HepR900 [], which are specific for a 600 bp region of the 18S rRNA gene. For details on PCR conditions, see Harris et al. []. The 18S rRNA gene remains the most widely used genetic marker for the majority of haemogregarine clades [, ]. This primer pair was chosen because it has higher amplification success across the haemogregarine lineages infecting reptiles than other primers (e.g. HEMO1 and HEMO2 [], and EF and ER []), and because it gives results in phylogenetic analysis comparable to those from longer sequences []. Efforts to amplify other genetic markers available for haemogregarines [], namely other fragments of the 18S rRNA gene [, ] and the ITS1 region [], were unsuccessful. The amplified products were purified and sequenced by an external company (Beckman Coulter Genomics, UK).

Sequences were compared to the GenBank database using NCBI nucleotide BLAST to confirm the identity of the amplified products. A total of 137 haemogregarine sequences were obtained from Gallotia hosts, 12 from Tarentola and four from Chalcides. Sequences were corrected and aligned in Geneious v5.6.7 [], using the MAFFT algorithm []. Clean, high-quality sequences were first used to identify the distinct haplotypes present; lower-quality sequences and those with ambiguities were then compared to these haplotypes to ascertain their identity.
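The two-step haplotype assignment described above can be illustrated with a minimal sketch. It is not the authors' procedure, which was carried out in Geneious. It assumes the sequences are already aligned to equal length without gaps, and the function names (`is_clean`, `compatible`, `assign_haplotypes`) are hypothetical. Ambiguous positions are interpreted with the standard IUPAC nucleotide codes.

```python
# Standard IUPAC nucleotide codes mapped to the bases they may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def is_clean(seq):
    """Return True if the sequence contains only unambiguous bases."""
    return set(seq) <= set("ACGT")


def compatible(seq, haplotype):
    """Return True if every (possibly ambiguous) base in seq can match haplotype."""
    return len(seq) == len(haplotype) and all(
        h in IUPAC[s] for s, h in zip(seq, haplotype)
    )


def assign_haplotypes(aligned):
    """Assign sequences to haplotypes.

    aligned: dict mapping sample name -> aligned sequence
    (assumption: equal lengths, upper case, no gap characters).
    Returns the sorted list of haplotypes and, for each sample, the
    indices of all haplotypes it is compatible with.
    """
    # Step 1: distinct haplotypes are defined by the clean sequences only.
    haplotypes = sorted({s for s in aligned.values() if is_clean(s)})
    # Step 2: every sequence, including ambiguous ones, is compared to them.
    matches = {
        name: [i for i, h in enumerate(haplotypes) if compatible(seq, h)]
        for name, seq in aligned.items()
    }
    return haplotypes, matches


# Toy data (invented for illustration, not from the study).
toy = {"s1": "ACGT", "s2": "ACGT", "s3": "ACGA", "s4": "ACGW", "s5": "RCGT"}
haplotypes, matches = assign_haplotypes(toy)
```

In this toy input, `s4` carries the code `W` (A or T) at the only site that distinguishes the two haplotypes, so it is compatible with both. A pattern of that kind is what one would expect from a double peak produced by a mixed infection.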
Sequences with double peaks were consistent with the previously identified haplotypes and are regarded as mixed infections. Uncorrected pairwise distances (p-distances) between haplotypes were calculated in MEGA7 [], using a 569 bp alignment of the nine new haplotypes (Additional file : Table S2). New haplotype sequences are deposited in the GenBank database under the accession numbers MG787243-MG787253.

For the phylogenetic analyses, 137 GenBank sequences of other haemogregarines were added (Additional file : Table S3). As putative outgroups of the Canarian haemogregarines, sequences of specimens infecting Chalcides, Tarentola and other lizard species from the African and Iberian mainland were included (previously assessed in [, , , ]), as well as the two new sequences from the host genus Psammodromus, the closest relative to Gallotia. In accordance with Barta et al. [], Haemogregarina balli and Dactylosoma ranarum were used as outgroups for the overall phylogeny. In total, the final alignment matrix included 148 sequences and 583 nucleotide positions. The nucleotide substitution model (TIM1+I+G) was selected with jModelTest 2 [] under the Bayesian information criterion (BIC). Phylogenetic relationships were estimated using maximum likelihood (ML) and Bayesian inference (BI) methods. ML analysis was performed in PhyML 3.1 [], with nodal support estimated using the bootstrap technique [] with 1,000 replicates. For the BI analysis, MrBayes v.3.2.6 [] was used, with parameters estimated as part of the analysis. The analysis was run for 1 × 10⁷ generations, saving one tree every 1000 generations. Log-likelihood values of the sampled points were plotted against generation number, and all trees sampled before stationarity was reached were discarded as burn-in (25%). The remaining trees were combined in a 50% majority-rule consensus tree, in which the frequency of a clade represents its posterior probability.
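The burn-in and consensus step can be sketched in a few lines. This is an illustration of the principle only, not a reimplementation of MrBayes. Each sampled tree is assumed to be given already as a set of clades, where a clade is a frozenset of taxon names; the function name `clade_support` and the toy trees are hypothetical. With 1 × 10⁷ generations sampled every 1000 generations, simple division gives about 10,000 sampled trees per run, of which the first 25% would be discarded.

```python
from collections import Counter


def clade_support(sampled_trees, burnin_fraction=0.25, threshold=0.5):
    """Return clades whose post-burn-in frequency exceeds the threshold.

    sampled_trees: list of trees in sampling order. Each tree is a set of
    clades and each clade is a frozenset of taxon names (assumed input
    format; parsing of real tree files is outside this sketch).
    """
    # Discard the first fraction of samples as burn-in.
    n_burnin = int(len(sampled_trees) * burnin_fraction)
    kept = sampled_trees[n_burnin:]
    # Count in how many retained trees each clade occurs.
    counts = Counter(clade for tree in kept for clade in tree)
    # The clade frequency is taken as its posterior probability.
    support = {clade: n / len(kept) for clade, n in counts.items()}
    # Majority rule: keep clades present in more than half of the trees.
    return {clade: p for clade, p in support.items() if p > threshold}


# Number of samples implied by the run length and sampling interval.
n_samples = 10**7 // 1000

# Toy sample of eight trees over taxa A-D (invented for illustration).
ab, cd, ac = frozenset("AB"), frozenset("CD"), frozenset("AC")
toy_trees = [{ac}, {ac}, {ab, cd}, {ab, cd}, {ab, cd}, {ab}, {ab}, {ac}]
toy_support = clade_support(toy_trees)
```

In the toy sample the first two trees are dropped as burn-in. Clade {A, B} occurs in five of the six retained trees and is kept, while {C, D} occurs in exactly half of them and therefore does not pass the majority-rule threshold.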
Additionally, phylogenetic networks for the clades containing Canarian haemogregarines were constructed using the statistical parsimony approach implemented in TCS [] and displayed graphically using tcsBU [].

A hierarchical analysis of molecular variance (AMOVA) was performed to test hypotheses about genetic structure among haemogregarine haplotypes, using islands and host species as grouping variables, with sampling locations nested within them. The analysis was run both in Arlequin version 3.5.2 [] and in the poppr package [] in R [], which implements the AMOVA tests from the packages ade4 [] and pegas []. In all cases, 1000 Monte Carlo permutations were used to assess statistical significance. […]
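The logic of an AMOVA with a permutation test can be sketched as follows. This is a deliberately simplified, single-level version (one grouping factor, for example island), whereas the study used a hierarchical design in Arlequin and poppr. The function names, the toy data and the way the p-value is computed (including the observed value in the count) are assumptions made for illustration. The input `dist2` is assumed to be a symmetric matrix of squared distances between individuals.

```python
import random


def phi_st(dist2, groups):
    """One-level AMOVA: return Phi_ST from squared distances and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    members = {g: [i for i, x in enumerate(groups) if x == g] for g in labels}
    # Total and within-group sums of squared deviations.
    ssd_total = sum(dist2[i][j] for i in range(n) for j in range(n)) / (2 * n)
    ssd_within = sum(
        sum(dist2[i][j] for i in idx for j in idx) / (2 * len(idx))
        for idx in members.values()
    )
    p = len(labels)
    msd_among = (ssd_total - ssd_within) / (p - 1)
    msd_within = ssd_within / (n - p)
    # Average group size corrected for unequal sampling.
    n0 = (n - sum(len(idx) ** 2 for idx in members.values()) / n) / (p - 1)
    sigma_among = (msd_among - msd_within) / n0
    total = sigma_among + msd_within
    return sigma_among / total if total != 0 else 0.0


def permutation_test(dist2, groups, n_perm=1000, seed=0):
    """Permute group labels to assess the significance of Phi_ST."""
    rng = random.Random(seed)
    observed = phi_st(dist2, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(dist2, shuffled) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: two groups of three individuals, each fixed for its own haplotype,
# with a squared distance of 1 between the two haplotypes.
toy_groups = ["a", "a", "a", "b", "b", "b"]
toy_dist2 = [[0 if gi == gj else 1 for gj in toy_groups] for gi in toy_groups]
observed, p_value = permutation_test(toy_dist2, toy_groups)
```

In the toy data all variation lies between the two groups, so the observed Phi_ST is 1. With only six individuals the p-value cannot become very small, since one in ten label arrangements reproduces the original partition.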

Pipeline specifications

Software tools: Geneious, MAFFT, MEGA, jModelTest, PhyML, MrBayes, tcsBU, Arlequin, pegas
Applications: Phylogenetics, Population genetic analysis
Organisms: Serinus canaria, Toxoplasma gondii