Computational protocol: Spatiotemporal dynamics of Puumala hantavirus associated with its rodent host, Myodes glareolus

Protocol publication

[…] PUUV sequences were aligned manually using BioEdit 7.1.9 (Hall ). DnaSP version 5 (Librado and Rozas ) was used to determine the number of synonymous and nonsynonymous substitutions and to estimate nucleotide diversity in the PUUV sequences. Median-joining networks were produced with Network 4.6 (Bandelt et al. ), and genetic population structure was inferred with an analysis of molecular variance (AMOVA) implemented in Arlequin 3.5 (Excoffier and Lischer ). Pairwise FST values were calculated for each PUUV segment separately and also for the three segments concatenated, with 10 000 permutations to assess the level of statistical significance. BaTS (Parker et al. ) was used to determine the association between the phylogeny and the geographical location of the samples by estimating the association index (AI), the parsimony score (PS), and the maximum clade (MC) size statistics. We tested for recombination and reassortment between the PUUV segments with RDP4 (Martin et al. ), analogous to the analyses in Fink et al. ().

Phylogenetic analyses were performed with the concatenated segments, including four published PUUV sequences from the sampling sites (S segment: schle_05_001, varus_09_024, astrup_07_003; M segment: schle_05_015) and the prototype strain Sotkamo as outgroup (accession numbers NC_005224, NC_005223, and NC_005225 for the S, M, and L segments, respectively). MEGA 5.1 (Tamura et al. ) was used to reconstruct phylogenetic trees based on neighbor-joining (NJ) algorithms. The HKY+G substitution model showed the best fit to our data based on the Bayesian Information Criterion tested in jModelTest 2.13 (Darriba et al. ). Bayesian phylogenetic analyses were performed with BEAST 1.7.5 (Drummond et al. ) on the Cipres portal (Miller et al. ). After initial tests, we used a strict molecular clock, a coalescent Bayesian skyline tree prior with 10 groups, and otherwise default priors for two runs of 100 million generations each, with sampling every 20 000 generations.
A burn-in of 10% was discarded, and convergence of model parameters was checked with Tracer 1.5 (Rambaut and Drummond ). The runs were combined using LogCombiner 1.7.5 (Drummond et al. ). A maximum clade credibility tree was produced with TreeAnnotator 1.7.5 and visualized in FigTree 1.4.0 (http://tree.bio.ed.ac.uk/software/figtree/). […]

BEAST2 (Bouckaert et al. ) was used to shed light on the viral population dynamics. Bayesian skyline plot analysis (Drummond et al. ) was performed to estimate the effective population size and the substitution rate of PUUV based on the Schledehausen and Astrup datasets without the outgroup sequence. The two datasets were analyzed jointly, enabling the estimation of a common substitution rate. All other parameters, including the phylogenies, were estimated separately. The BEAST specifications remained as described above. The Bayesian skyline plots were drawn with Tracer. Path-O-Gen (http://tree.bio.ed.ac.uk/software/pathogen/) was used to regress the root-to-tip distance against the sampling date in order to confirm the presence of temporal signal in the dataset. […]

Each vole sampling locality was checked for the presence of null alleles with MicroChecker 2.2 (Van Oosterhout et al. ). Deviations from Hardy-Weinberg equilibrium (HWE) were tested per population with Arlequin 3.5 (Excoffier and Lischer ). Pairwise FST between populations was computed as for the PUUV populations. We also tested for significant genetic changes in the bank vole populations over time by computing pairwise FST between samples from different years for the localities Schledehausen and Astrup, for which the largest sample sizes were available. Population structure in the voles was analyzed further with the clustering algorithm in Structure 2.3 (Pritchard et al. ), assuming an admixture model with correlated allele frequencies (Falush et al. ) and without information about the sampling population.
We performed ten runs for each value of K from one to ten, with 400 000 Markov chain Monte Carlo (MCMC) iterations and a burn-in of 40 000 iterations. The estimation of K followed the method suggested by Evanno et al. (), and the figures were displayed with Distruct 1.1 (Rosenberg ). […]
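
The Evanno et al. approach to choosing K can be illustrated with a short sketch. The version below is a minimal illustration, not the authors' script: it uses one common formulation in which ΔK is the absolute second difference of the mean ln P(X|K) across replicate runs, divided by the standard deviation of ln P(X|K) at that K. The function names `evanno_delta_k` and `best_k` and the input layout (a dict mapping K to the list of per-run log likelihoods) are assumptions made for this example.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for each interior K.

    lnp_by_k: dict mapping K (int) to a list of ln P(X|K) values,
    one per replicate Structure run (hypothetical input layout).
    Returns a dict mapping K to Delta K. Delta K is undefined for the
    smallest and largest K tested, so those are omitted.
    """
    ks = sorted(lnp_by_k)
    mean_lnp = {k: mean(lnp_by_k[k]) for k in ks}
    sd_lnp = {k: stdev(lnp_by_k[k]) for k in ks}  # needs >= 2 runs per K

    delta = {}
    for k in ks:
        has_neighbours = (k - 1) in mean_lnp and (k + 1) in mean_lnp
        if has_neighbours and sd_lnp[k] > 0:
            # Absolute second-order difference of the mean log likelihood
            second_diff = abs(
                mean_lnp[k + 1] - 2 * mean_lnp[k] + mean_lnp[k - 1]
            )
            delta[k] = second_diff / sd_lnp[k]
    return delta


def best_k(lnp_by_k):
    """Return the K with the largest Delta K."""
    delta = evanno_delta_k(lnp_by_k)
    return max(delta, key=delta.get)
```

With ten runs for each K from one to ten, as in the protocol, ΔK would be available for K = 2 to 9 only; a single-cluster solution (K = 1) cannot be selected by this statistic and has to be judged from the raw likelihoods.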

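The root-to-tip regression mentioned in the protocol can be sketched as an ordinary least-squares fit of genetic distance on sampling date. This is a minimal illustration under stated assumptions, not a reimplementation of Path-O-Gen: the function name `root_to_tip_regression` and its inputs (parallel lists of sampling dates in decimal years and root-to-tip distances in substitutions per site, already extracted from a rooted tree) are hypothetical. A positive slope with a reasonable R² is usually read as evidence of temporal signal; the slope approximates the substitution rate, and the x-intercept approximates the age of the root.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares regression of root-to-tip distance on sampling date.

    dates: sampling dates as decimal years (hypothetical input).
    distances: root-to-tip distances for the same tips, same order.
    Returns a dict with slope, intercept, r_squared and x_intercept.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more paired observations")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx                      # rough rate estimate
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    # Date at which the fitted distance is zero (rough root age)
    x_intercept = -intercept / slope if slope != 0 else None

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "x_intercept": x_intercept,
    }
```

Because tips on a tree share branches, the points are not independent, so the fit is best treated as an exploratory check rather than a formal test.
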
Pipeline specifications

Software tools: BioEdit, DnaSP, Arlequin, RDP4, MEGA, jModelTest, BEAST, CIPRES Science Gateway, FigTree, TempEst
Applications: Phylogenetics, Population genetic analysis
Organisms: Homo sapiens
Diseases: Animal Diseases, Hemorrhagic Fever, American, Kidney Diseases, HIV Infections