Computational protocol: Microsatellite abundance across the Anthozoa and Hydrozoa in the phylum Cnidaria

Protocol publication

[…] DNA was extracted using QIAGEN DNeasy kits with two elution steps, each with 5 min of elution time. High-quality DNA (~3 mg) was extracted from eight species: Leiopathes glaberrima, Tanacetipathes sp., Corynactis californica, Amplexidiscus fenestrafer, Eunicea flexuosa, Plumarella sp., Metridium senile and Millepora alcicornis (Table ). These samples encompass two classes (Anthozoa, Hydrozoa), three subclasses (Hexacorallia, Octocorallia, Hydroidolina) and five orders (Antipatharia, Corallimorpharia, Alcyonacea, Actiniaria and Anthoathecata) within the Cnidaria. Genomic libraries were prepared from the double-stranded DNA using the Nextera DNA Sample Prep Kit (Epicentre Biotechnologies, Madison, WI) and shotgun sequenced on a 454 GS-FLX sequencer using the Titanium Sequencing Kit (Roche Diagnostics Corporation, Indianapolis, IN).

Sequences were trimmed with PipeMeta [] and assembled with the GS De Novo Assembler (Roche Diagnostics Corporation, Indianapolis, IN) using the default settings and a minimum sequence length of 45 base pairs. Sequences are available from the NCBI Sequence Read Archive: Leiopathes glaberrima [Genbank: SRX323262], Tanacetipathes sp. [Genbank: SRX327567], Plumarella sp. [Genbank: SRX326898], Eunicea flexuosa [Genbank: SRX326897], Corynactis californica [Genbank: SRX326758], Amplexidiscus fenestrafer [Genbank: SRX326761], Metridium senile [Genbank: SRX327565], Millepora alcicornis [Genbank: SRX323169].

In addition, the whole genome sequence scaffolds from Nematostella vectensis [], Acropora digitifera [] and Hydra magnipapillata strain 105 [] were obtained from GenBank. Whole genome sequences (WGS) were generated from symbiont-free tissues (larvae for N. vectensis and sperm for A. digitifera) [, ], except for Hydra, for which contaminant sequences were removed manually after assembly [, , ].

Several steps were taken to avoid or minimize sequence contamination with symbiotic dinoflagellate algae in the zooxanthellate corals (E. flexuosa, A. fenestrafer and M. alcicornis).
When available, DNA was extracted from Symbiodinium-free larvae (E. flexuosa). Amplexidiscus DNA was extracted from the base of the anemone’s foot, which contains lower concentrations of symbionts []. Millepora DNA was extracted from bleached colonies, which also feature a significantly reduced symbiont density []. In addition, the Partial Genome Sequences (PGS; those containing both flanking regions) were aligned to a custom database containing sequences from three Symbiodinium species: 454 sequences of clade C (Wham et al. unpublished) and assembled EST sequences of clades A and B [], using the BLASTn [] and BLASTx [] programs to check for the presence of Symbiodinium sequences. Sequences with more than 75 percent identity, alignment lengths larger than 50 bp and e-values lower than 1e-05 were treated as putative Symbiodinium DNA: they were filtered out of the cnidarian sequences and aligned against the NCBI database (Additional file : Table S3).

Cnidarian sequences were imported to the Tandem Repeats Finder (TRF) database [] and processed using the default alignment parameters: Match: 2; Mismatch: 7; Indels: 7. Sequences were categorized as having at least one flanking region or having two flanking regions (of at least 6 nucleotides) and run in the program SciRoKo [] to extract all perfect tandem repeats with motif sizes of two to six nucleotides and at least three consecutive repeats. Microsatellite search parameters in SciRoKo were as follows: Search mode: Mismatched, Fixed Penalty; Mismatched Search Setting: Required score: 15; Mismatch penalty: 5; SSR seed minimum length: 8; SSR seed minimum repeat: 3; Maximum mismatches at once: 3. High error rates in homopolymer regions have been observed for Roche 454 sequencing []; for this reason, mononucleotide repeats were excluded from the analyses.

Microsatellite coverage and GC content were calculated for each species based on the full data set, using SciRoKo [].
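
The repeat criteria described above (motifs of two to six nucleotides, at least three consecutive copies, mononucleotide runs excluded, flanking regions of at least 6 nucleotides) can be illustrated with a minimal Python sketch. This is not SciRoKo and does not reproduce its scoring or mismatch handling; the function name `find_perfect_ssrs`, the regular expression and the returned fields are assumptions made for illustration only.

```python
import re

# Perfect tandem repeat: a motif of 2-6 bases followed by at least two more
# identical copies (three or more consecutive copies in total). The lazy
# quantifier makes the shortest motif win, so ACACACAC is reported as (AC)4.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{2,}")
MIN_FLANK = 6  # minimum flank length in nucleotides, taken from the protocol


def find_perfect_ssrs(seq):
    """Return perfect microsatellites in seq (illustrative helper, not SciRoKo)."""
    seq = seq.upper()
    hits = []
    pos = 0
    while True:
        match = SSR_PATTERN.search(seq, pos)
        if match is None:
            break
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Homopolymer run disguised as a longer motif (e.g. AA repeated):
            # skip it, since mononucleotide repeats are excluded.
            pos = match.start() + 1
            continue
        start, end = match.start(), match.end()
        hits.append({
            "motif": motif,
            "repeats": (end - start) // len(motif),
            "start": start,
            "length": end - start,
            # True only when both flanks are at least MIN_FLANK nucleotides.
            "both_flanks": start >= MIN_FLANK and len(seq) - end >= MIN_FLANK,
        })
        pos = end
    return hits
```

For example, `find_perfect_ssrs("GATTACAGG" + "AC" * 5 + "TTGACCTGA")` reports a single (AC)5 repeat with both flanks present, which is the kind of record that would enter the length and repeat-number calculations.
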
Because only one representative of each species was sequenced, the coverage of microsatellite types for each species was bootstrapped using the boot function in R [] to assign a measure of confidence to the coverage value. The subset of sequences with both flanking regions was used to calculate microsatellite length and repeat number. Analysis of variance (ANOVA) was performed to compare microsatellite lengths among species using SPSS version 19.0 (IBM). Sequencing methodologies varied between the species for which whole genomes are available (N. vectensis and H. magnipapillata: Sanger; A. digitifera: Roche 454 GS-FLX and Illumina Genome Analyzer IIx) and those sequenced in this study, likely resulting in different sequencing biases between these two data sets []. Thus, WGS and PGS data sets were tested for differences due to sequencing methodology and were only combined when sequencing methodology did not influence the patterns.

For the phylogenetic analysis, COI sequences for each species were downloaded from GenBank (Additional file : Table S4), translated to proteins and aligned in Geneious version 5.5.4 []. Bayesian phylogenies were generated in Geneious with MrBayes [] using the mixed amino acid model with gamma-distributed rate variation, a uniform branch length clock, and MCMC settings of 4 heated chains for 1,000,000 generations. A maximum clade credibility tree was constructed in TreeAnnotator v1.6.2 from the BEAST package []. Regressions of the phylogeny and the microsatellite relative abundance and length were performed with BayesTraits [] using Models A and B, followed by a log likelihood test, to test for a relationship between phylogeny and microsatellite traits. Species were grouped based on microsatellite abundances and microsatellite lengths using hierarchical clustering in R, with the function hclust from the pvclust package []. […]
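
The bootstrap of per-species microsatellite coverage mentioned above was run with the boot function in R. As a rough Python stand-in, the sketch below resamples sequences with replacement and reports a percentile interval. The protocol does not define coverage precisely, so it is assumed here to be microsatellite bases divided by total sequenced bases; the function names, the number of replicates and the interval type are likewise assumptions, not the authors' settings.

```python
import random


def coverage(records):
    """Assumed definition: microsatellite bases / total sequenced bases.

    records is a list of (sequence_length, microsatellite_bases) tuples.
    """
    total = sum(length for length, _ in records)
    ssr_bases = sum(bases for _, bases in records)
    return ssr_bases / total if total else 0.0


def bootstrap_coverage(records, n_boot=1000, alpha=0.05, seed=0):
    """Return (estimate, lower, upper) using a percentile bootstrap interval."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(records)
    # Resample whole sequences with replacement and recompute coverage.
    stats = sorted(coverage(rng.choices(records, k=n)) for _ in range(n_boot))
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return coverage(records), lower, upper
```

Calling `bootstrap_coverage([(400, 12), (350, 0), (500, 20), (420, 0), (380, 8)])` returns the point estimate together with lower and upper bounds; with real data, each tuple would correspond to one sequence of a given species and microsatellite type.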

Pipeline specifications

Software tools: Newbler, BLASTN, BLASTX, SciRoKo, Geneious, BEAST, BayesTraits, Hclust, Pvclust
Databases: SRA
Applications: Phylogenetics, WGS analysis
Organisms: Nematostella vectensis, Hydra vulgaris, Acropora digitifera