Computational protocol: The Role of Human Transportation Networks in Mediating the Genetic Structure of Seasonal Influenza in the United States

Protocol publication

[…] Sequences were aligned using MUSCLE in Geneious, and the HA1 domain was extracted for use in all analyses (H3N2: 987 nt, H1N1: 1701 nt). Seasonal influenza is introduced into the US multiple times over the course of a season. To account for these multiple introductions, phylogenetic trees were inferred separately for each season using a Bayesian framework in the program BEAST. To construct phylogenies, we used the SRD06 codon position model to accommodate different substitution rates for the first and second versus the third codon position, with the HKY85 substitution model applied to each of these two codon-position partitions. For two seasons for which an extremely large number of sequences were available, H3N2 2007–2008 and H3N2 2012–2013, we down-sampled from states that contributed exceptionally large numbers of sequences. For the H3N2 2007–2008 season, the GTR+I+G model was used instead, as convergence could not be achieved using the codon position model. Trees were constructed using a strict molecular clock, with an exponential growth tree prior and relatively uninformative priors on all phylogenetic parameters except the substitution rate, for which we used a lognormal prior with mean = 0.0055 (sd = 0.7) substitutions/site/year for H3N2 sequences and mean = 0.0018 (sd = 0.4) substitutions/site/year for H1N1 sequences.
MCMC chains were run until convergence was reached, and a maximum clade credibility tree was annotated after removing the first 10% of the sampled trees as burn-in. We defined clades as groups of at least 20 sequences descending from a node with a posterior probability > 0.9. We corrected for independent introductions into the US by choosing clades for which the entire highest posterior density (HPD) interval for the divergence time of the most recent common ancestor (MRCA) did not fall more than three months before the beginning of the flu season. This time limit was chosen because it was generally the most recent time period for which high posterior support could be obtained for clades.
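The clade-selection criteria above can be sketched as a simple filter. This is an illustrative Python sketch, not the authors' code: the `Clade` record, its field names, the `select_clades` helper, the 1 October season start in the example, and the 92-day approximation of three months are all assumptions. It also reads the HPD criterion as requiring that the older bound of the MRCA's HPD interval lie no more than three months before the season start.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Clade:
    """Hypothetical summary of one candidate clade from an annotated MCC tree."""
    name: str
    n_sequences: int           # number of tips descending from the node
    posterior: float           # posterior probability of the node
    mrca_hpd_earliest: date    # older bound of the HPD interval for the MRCA date
    mrca_hpd_latest: date      # younger bound of the HPD interval for the MRCA date


def select_clades(
    clades: List[Clade],
    season_start: date,
    min_size: int = 20,
    min_posterior: float = 0.9,
    max_lead_days: int = 92,   # assumption: "three months" approximated as 92 days
) -> List[Clade]:
    """Keep clades with >= min_size tips, posterior > min_posterior, and an
    MRCA HPD interval that starts no earlier than the cutoff date."""
    cutoff = season_start - timedelta(days=max_lead_days)
    return [
        c for c in clades
        if c.n_sequences >= min_size
        and c.posterior > min_posterior
        and c.mrca_hpd_earliest >= cutoff
    ]


# Made-up example values, for illustration only.
example_clades = [
    Clade("A", 25, 0.95, date(2007, 8, 15), date(2007, 10, 20)),
    Clade("B", 19, 0.99, date(2007, 9, 1), date(2007, 11, 1)),   # too few sequences
    Clade("C", 30, 0.90, date(2007, 9, 1), date(2007, 11, 1)),   # posterior not > 0.9
    Clade("D", 40, 0.99, date(2007, 5, 1), date(2007, 9, 1)),    # MRCA may be too old
]
selected = select_clades(example_clades, season_start=date(2007, 10, 1))
print([c.name for c in selected])
```

In practice the node posteriors and HPD bounds would be read from the annotated maximum clade credibility tree; how they are extracted depends on the tree-handling library used and is left out of this sketch.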
Since several clades fitting these criteria were often identified within a single season, we applied a Bonferroni correction within seasons, based on the number of clades identified for a season, to account for these multiple comparisons.
For each clade analyzed, pairwise genetic distances were calculated as the proportion of sites that differed between each pair of sequences. To ensure that the choice of genetic distance metric did not affect our results, analyses were repeated using the evolutionary substitution models available in the R package APE. The results remained the same regardless of the distance metric chosen, so we present the results obtained using the raw pairwise distance measure. Pairwise spatial distances were calculated as the great circle distance between state population centers.
The 2008–2009 and 2009–2010 seasons presented a special case for H1N1, as a new pandemic lineage emerged in the spring of 2009 that differed markedly from the H1N1 lineages circulating at that time and earlier. As the epidemiological dynamics of influenza pandemics differ substantially from those of annual seasonal epidemics, sequences from the pandemic lineage in the 2008–2009 season, as well as the entire 2009–2010 season, were excluded from all analyses. To distinguish between antigenically distinct pandemic isolates and the previously circulating H1N1 viruses, a phylogenetic tree was inferred for the 2008–2009 season using a neighbor-joining algorithm. Two clades were immediately obvious, each encompassing a distinct time period during the influenza season that corresponded well with the circulation times of the epidemic and pandemic lineages. Using the A/California/07/2009 strain of pandemic H1N1 (GenBank accession: FJ981613) as a reference, sequences were classified and excluded accordingly. […]
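The raw pairwise genetic distance, the great circle distance, and the within-season Bonferroni threshold described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' pipeline: all function names are hypothetical, the haversine formula with a mean Earth radius of 6371 km is one common way to compute great circle distance (the source does not say which formula was used), the significance level of 0.05 is assumed, and every alignment column is counted, with no special handling of gaps or ambiguous bases.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float,
                    radius_km: float = 6371.0) -> float:
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


def bonferroni_alpha(alpha: float, n_clades: int) -> float:
    """Per-clade significance threshold for n_clades tests within one season."""
    if n_clades < 1:
        raise ValueError("n_clades must be at least 1")
    return alpha / n_clades


# Made-up inputs, for illustration only.
print(p_distance("ACGTACGT", "ACGTACGA"))      # one of eight sites differs
print(great_circle_km(0.0, 0.0, 0.0, 90.0))    # quarter of the equator
print(bonferroni_alpha(0.05, 4))               # threshold with four clades
```

For real data, the coordinates passed to `great_circle_km` would be the latitude and longitude of each state's population center, and `p_distance` would be applied to every pair of HA1 sequences within a clade.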

Pipeline specifications

Software tools: Geneious, BEAST, APE
Application: Phylogenetics
Organisms: Homo sapiens