Computational protocol: Evolution of Highly Pathogenic H5N1 Avian Influenza Viruses in Vietnam between 2001 and 2007

Protocol publication

[…] To identify the potential precursor viruses for each Vietnam isolate, we applied our newly developed Gene In Network (GIN) method. Briefly, GIN first measured the evolutionary distances between isolates using the Complete Composition Vector (CCV) approach. GIN then identified influenza virus modules (clusters of viruses separated by small evolutionary distances) using a local optimization program based on thresholds derived from Bayesian analysis. Unlike the conventional phylogenetic tree construction approach, GIN can efficiently analyze a large number of sequences. Including all 44,398 available influenza viral sequences in this analysis facilitated systematic identification of the progenitor genes of Vietnam H5N1 viruses. After GIN identified related groups of viruses, selected isolate sequences from the clusters of larger groups were analyzed by conventional phylogenetic methods as described below. Potential precursor viruses to Vietnam isolates were selected based on three main criteria: (1) the precursor viral gene must have the shortest phylogenetic distance to the homologous genes of its associated Vietnam isolate(s); (2) the precursor virus must contribute the majority of the donor gene segments to the Vietnam isolate(s) in comparison to other closely related viruses; (3) the specific sub-lineage containing a precursor virus and its associated Vietnam isolates must have a statistical confidence value (bootstrap value or posterior probability, described in the following section) over 60.

Phylogenetic inferences relied on Maximum Parsimony (MP) and Neighbor-Joining (NJ) methods using PAUP* 4.0 Beta, as described earlier. Maximum Likelihood (ML) tree estimation was evaluated using GARLI version 0.951. Bayesian trees were estimated by MrBayes version 3.1.2 in two runs of 1 million generations each, sampling every 100 generations and using the default heating parameter. The consensus trees were calculated using the allcompat option from the final 10,001 trees from each run. Tree topologies were confirmed to agree among these three methods. Bootstrap support for tree topologies was assessed using the NJ method implemented in PAUP* 4.0 Beta with 1,000 replicates. When Bayesian trees were estimated, the posterior probability for each split was generated using the MrBayes sumt option with a 25% burn-in. These posterior probabilities were used as an alternative measure of clade assignment support. The nucleotide substitution models for ML and NJ methods were selected using MODELTEST 3.7. The positive selection analyses for HK821-like isolates and other 2007 VN isolates were conducted using PAML. Control and log files for all stand-alone programs used in these analyses and other methodological materials are available upon request. […]
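
The three precursor selection criteria described above can be read as a filter followed by a vote. The sketch below is only one possible interpretation, not the authors' implementation: the `Candidate` record, the `select_precursor` function, and the virus names are hypothetical, and the threshold of 60 is applied uniformly to whatever support value is supplied. Candidates whose sub-lineage support does not exceed 60 are dropped (criterion 3), the nearest remaining candidate is taken per gene segment (criterion 1), and the virus that is nearest for the most segments is reported (criterion 2).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record: one candidate precursor for one gene segment."""
    virus: str       # candidate precursor isolate name
    segment: str     # gene segment, e.g. "HA"
    distance: float  # phylogenetic distance to the Vietnam isolate's gene
    support: float   # bootstrap value or posterior probability, in percent


def select_precursor(candidates, min_support=60.0):
    """Return (chosen virus, {segment: nearest supported virus})."""
    nearest = {}
    for cand in candidates:
        # Criterion 3: support must be strictly over the threshold.
        if cand.support <= min_support:
            continue
        # Criterion 1: keep the shortest-distance candidate per segment.
        current = nearest.get(cand.segment)
        if current is None or cand.distance < current.distance:
            nearest[cand.segment] = cand
    if not nearest:
        return None, {}
    # Criterion 2: the virus that donates the most segments wins.
    # Ties resolve to the virus encountered first (an arbitrary choice here).
    votes = Counter(cand.virus for cand in nearest.values())
    chosen, _ = votes.most_common(1)[0]
    return chosen, {seg: cand.virus for seg, cand in nearest.items()}


# Made-up illustrative values, not data from the study.
example = [
    Candidate("virus_A", "HA", 0.010, 95.0),
    Candidate("virus_B", "HA", 0.005, 55.0),   # closer, but support too low
    Candidate("virus_A", "NA", 0.020, 80.0),
    Candidate("virus_B", "PB2", 0.010, 70.0),
    Candidate("virus_A", "PB2", 0.030, 90.0),
]
chosen_virus, per_segment = select_precursor(example)
```

In this toy input, `virus_B` is nearest for HA but is excluded by the support threshold, so `virus_A` is selected as the donor of two of the three segments.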
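
The figure of 10,001 trees per run follows from the MrBayes settings stated above: 1 million generations sampled every 100 generations gives 10,000 samples, plus the starting tree at generation 0. The short sketch below reproduces that arithmetic and shows how many samples a 25% burn-in would discard. The function names are hypothetical, and truncating the burn-in count to a whole number is an assumption rather than documented MrBayes behavior.

```python
def samples_per_run(generations, sample_freq):
    """Number of sampled trees, counting the initial tree at generation 0."""
    return generations // sample_freq + 1


def split_burnin(n_samples, burnin_fraction):
    """Return (discarded, retained) sample counts for a fractional burn-in.

    Assumption: the discarded count is truncated to a whole number.
    """
    discarded = int(n_samples * burnin_fraction)
    return discarded, n_samples - discarded


n_trees = samples_per_run(1_000_000, 100)
discarded, retained = split_burnin(n_trees, 0.25)
print(n_trees, discarded, retained)  # prints "10001 2500 7501"
```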

Pipeline specifications

Software tools: GARLI, MrBayes, ModelTest-NG, PAML
Application: Phylogenetics
Organisms: Homo sapiens, Anas platyrhynchos
Diseases: Infection