Computational protocol: Inferring epidemiologic dynamics from viral evolution: 2014–2015 Eurasian/North American highly pathogenic avian influenza viruses exceed transmission threshold, R0 = 1, in wild birds and poultry in North America

Similar protocols

Protocol publication

[…] All data were obtained from the National Institute for Allergy and Infectious Diseases Influenza Research Database (IRD) through the web site at (Squires et al., ; accessed 26 October, 2016). We obtained nucleotide sequences of complete segments of the HA segment isolated from avian species in Canada and the USA that were part of clade HPAIVs (Smith & Donis, ). Next, we obtained full sequences of the M segment and viral polymerase complex PB2 segments isolated from the same host, identified via the strain name.Nucleotide sequences of each segment were aligned using the multiple sequence alignment algorithm implemented in the R package DECIPHER (r function AlignSeqs; Wright, , ). Following initial alignment, areas of the multiple alignments with information content <0.5 bits and greater than 20% of sequences containing gaps were masked using moving averages of 10 nucleotides (r function MaskAlignment; Wright, ). Final sequence alignments were near full length for each segment with masked regions occurring only at the 3′ and 5′ ends (HA 1744 nucleotides, PB2 2280 nucleotides, M 987 nucleotides).We modeled the phylodynamics of the clade HA, PB2, and M segments using a multitype birth–death process on a time‐rooted phylogenetic tree implemented in a Bayesian framework to make inference on the basic reproduction number (R 0) of each segment during the sampling of the outbreak and viral migration between host types using two nested analyses (Kuhnert et al., ). The isolates (tips on the phylogenetic tree) were annotated with strain name defined in the IRD, sample collection date, host type (wild bird or poultry/domestic bird), host species, state or province of collection, viral subtype (defined by HA and Neuraminidase [N] combination), and sequence accession identifier (Table ). This family of birth–death phylodynamic models integrated uncertainty of the phylogeny of isolated sequences with an epidemiological model analogous to a compartmental model (Kuhnert, Stadler, Vaughan, & Drummond, ). We used the general time‐reversible (GTR) +Γ4 + I model with a relaxed log‐normal molecular clock to estimate the phylogeny (Chen & Holmes, ). We used vague informative priors for the mean clock rate based on previously estimated rates for each segment (Chen & Holmes, ; Tables ). Distributions for specified priors along with their posterior estimates and convergence diagnostic statistics are presented in Tables .The first analysis used all isolated sequences from the HA, PB2, and M segments, separately, to analyze the transmission dynamics among host types defined as wild bird or poultry/domestic bird. Owing to the short time span of the outbreak (earliest isolate collection 2 December 2014; last isolate: 1 June 2015), we estimated constant transmission model rates for the entire time period with no sampling before or after the earlier and latest isolates, respectively. We assumed that once sampled, that virus was removed from the population because wild bird sampling was from a mortality event or hunter harvest (Bevins et al., ) and infected poultry operations were subject to high biosecurity and depopulation once infection was detected (the point in time where the isolate was obtained; USDA APHIS ). For epidemiological parameters, we used the multitype birth–death model and parameter notation of Kuhnert et al. (). The primary epidemiological processes of interest were the basic reproduction number within each host type (R 0_poultry and R 0_wild), the rate of becoming noninfectious (δ; assumed to be the same per host type), the migration rate of viral lineages among subtypes (m poultry_to_wild and m wild_to_poultry), and the probability of sampling a viral lineage per subtype (ψpoultry and ψwild). Using Bayesian methods, we jointly estimated these fundamental epidemiological parameters with the evolutionary model governing the nucleotide changes and were able to infer a time‐rooted phylogeny, estimate ancestral host types of common ancestors, and estimate the relative contribution of viral transmission within host types to viral migration between host types (Kuhnert et al., ). We used this analysis to address the first two hypotheses by testing the prediction that R 0_wild < 1 and R 0_poultry < 1 during the 2014–2015 outbreak.We performed a second analysis to evaluate the hypothesis that recombination of the ancestral Asian virus (EA H5N8) with a North American virus to produce the EA/NA H5N2 increased the transmission among wild bird hosts. Again, we used the multitype birth–death model and included only HA, M, and PB2 sequences isolated from wild bird hosts where there was evidence of transmission among wild birds (Lee et al., ). We defined this set of viral isolates as all publically available wild bird sequences collected prior to 1 February 2015. We defined the host type based on the HA and N segment subtype that was infecting the wild bird hosts (Table ): H5N8 subtypes had a full set of Eurasian gene segments (Ip et al., ; Lee et al., ) and H5N2 was the Eurasian–North American reassortment where the PB1, NS, NP, and N segments were of North American origin and were consistent through the outbreak (Lee, Torchetti, Killian, Deliberto, & Swayne, ; Pasick et al., ). We excluded the H5N1 reassortant isolates because there were too few to make confident inferences. We used the same molecular clock model as the analysis of the full set of sequences (priors listed in Tables ) to estimate the epidemiological parameters: the basic reproduction number within each host type (R 0_H5N2 and R 0_H5N8), the rate of becoming noninfectious (δ; assumed to be the same per subtype), the migration rate of viral lineages among subtypes (m H5N2_to_H5N8 and m H5N8_to_H5N2), and the probability of sampling a viral lineage per subtype (ψH5N2 and ψH5N8). We used this analysis to address the third hypothesis by testing the prediction that R 0_H5N2 > R 0_H5N8 in wild birds. We used vague informative priors for the R 0, δ, and m parameters. The priors for R 0 and m were chosen to limit unrealistically high parameter values. The prior for δ was based on the infectious period of avian influenzas (Aldous et al., ) with additional uncertainty from the interpretation of host‐specific infectious periods relative to the phylodynamic model analog: lineage infectious period that may span multiple host infections. We used uninformative priors for the ψ parameters and distributions for specified priors, along with their posterior estimates and convergence diagnostic statistics are presented in Tables .We made inference to the phylogenetic and epidemiological parameters in a Bayesian framework using the bdmm model (Kuhnert et al., ; available at, accessed 19 September 2016) implemented using Markov chain Monte Carlo (MCMC) methods in BEAST v2.4.3 (Bouckaert et al., ). We ran four separate chains of 8–10 million MCMC iterations with the same priors and randomly selected starting values (Tables ). We ensured convergence of each chain by visually assessing MCMC traces for each parameter and whether effective sample sizes were sufficiently large (>200). We then performed a Gelman–Rubin diagnostic implemented in the coda package for R (Plummer, Best, Cowles, & Vines, ) for the MCMC chains excluding a 10% burn‐in on each parameter to ensure that scale reduction factor estimates were all <1.1 (Gelman et al., ). Once we confirmed convergence, we discarded 10% burn‐in from each chain, combined chains using the program Log Combiner v2.4.2 (, and sampled from the posterior distribution of parameters every 1,000 MCMC iterations. We reported the full set of model parameter priors, posterior estimates (mean and 95% highest posterior density interval, HPD), and convergence diagnostics for all three segments for the full wild‐poultry and nested wild bird analyses (Tables ). We focused on reporting estimated divergence times along with the epidemiological parameters for the wild‐poultry type analysis (hypotheses 1 & 2) with all the sequences and only the epidemiological parameters for the wild H5N2‐H5N8 subtype analysis on the isolate subset (hypothesis 3). In addition to the epidemiological rate parameters for the poultry and wild host type analysis (R 0_poultry, R 0_wild, m poultry_to_wild, m wild_to_poultry, ψpoultry and ψwild , δ), we also calculated estimated numbers of realized transmission events from poultry to wild birds and vice versa, based on the viral migration rates (m poultry_to_wild and m wild_to_poultry) and the relative frequencies of the ancestral host states estimated from the phylogenetic model (Kuhnert et al., ). […]

Pipeline specifications

Software tools BDMM, BEAST
Application Phylogenetics