Computational protocol: Evolutionary History of HIV-1 Subtype B and CRF01_AE Transmission Clusters among Men Who Have Sex with Men (MSM) in Kuala Lumpur, Malaysia

Protocol publication

[…] Genetic subtypes and potential transmission clusters in the time-stamped sequence dataset were first inferred by neighbour-joining tree reconstruction in MEGA version 5.05 under the Kimura 2-parameter model. The robustness of the transmission clusters was further tested by the more rigorous maximum likelihood inference implemented in PAUP version 4.0, using a gamma distribution with discrete gamma categories. The reliability of the branching orders was assessed by bootstrap analysis of 1000 replicates. The most appropriate nucleotide substitution model was determined using FindModel, a web implementation of Modeltest available at the HIV database (http://www.lanl.gov.com). In addition, Bayesian maximum clade credibility (MCC) trees were constructed separately using BEAST 1.7 to determine the posterior probability values for each cluster in subtype B and CRF01_AE. In this study, identification of transmission clusters was based on recently reported guidelines, where a transmission cluster is characterized by the following criteria: (a) a cluster consisting of at least 2 isolates from different individuals of the same geographical (i.e., country) origin, and (b) a phylogenetic clade supported by a high bootstrap value (>90%) and a Bayesian posterior probability value of 1 at the tree node.
To estimate the divergence times of the respective subtype B and CRF01_AE transmission clusters, a Bayesian coalescent-based relaxed molecular clock analysis was performed in BEAST 1.7. The uncorrelated lognormal clock model was combined with the general time-reversible (GTR) nucleotide substitution model, with a proportion of invariant sites and a four-category gamma distribution of among-site rate heterogeneity, to estimate viral phylogenies and nucleotide substitution rates and to date the time of the most recent common ancestor (tMRCA) of the respective subtype B and CRF01_AE transmission clusters.
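The two cluster criteria stated above, (a) and (b), can be written as a simple filter. The following Python sketch is illustrative only and is not the authors' code: the `Clade` record, its field names and the example values are hypothetical. It also assumes that criterion (a) means at least two distinct individuals, all from a single country, and that a posterior probability "of 1" means equal to 1.0 within a small numerical tolerance.

```python
import math
from dataclasses import dataclass


@dataclass
class Clade:
    """A candidate clade; all field names here are illustrative."""
    isolates: dict    # isolate name -> (individual ID, country of origin)
    bootstrap: float  # maximum likelihood bootstrap support, in percent
    posterior: float  # Bayesian posterior probability at the tree node


def is_transmission_cluster(clade, min_bootstrap=90.0):
    """Apply criteria (a) and (b) from the protocol text to one clade."""
    individuals = {individual for individual, _ in clade.isolates.values()}
    countries = {country for _, country in clade.isolates.values()}
    # (a) at least 2 isolates from different individuals, same country
    criterion_a = len(individuals) >= 2 and len(countries) == 1
    # (b) bootstrap support > 90% and posterior probability of 1
    # (assumption: "1" is tested with a small absolute tolerance)
    criterion_b = (clade.bootstrap > min_bootstrap
                   and math.isclose(clade.posterior, 1.0, abs_tol=1e-9))
    return criterion_a and criterion_b


# Made-up candidates: only the first satisfies both criteria.
candidates = [
    Clade({"iso1": ("p1", "MY"), "iso2": ("p2", "MY")}, 98.0, 1.0),
    Clade({"iso3": ("p3", "MY"), "iso4": ("p4", "MY")}, 85.0, 1.0),  # low bootstrap
    Clade({"iso5": ("p5", "MY"), "iso6": ("p5", "MY")}, 99.0, 1.0),  # one individual
]
accepted = [c for c in candidates if is_transmission_cluster(c)]
print(len(accepted))
```

In practice the support values would be read from the bootstrap and MCC trees rather than typed in by hand; the sketch only makes the acceptance rule explicit.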
As for the coalescent priors, different parametric demographic models (constant population size, exponential growth and logistic growth) as well as a nonparametric Bayesian skyline plot (BSP) were applied. The best-fitting coalescent model was chosen by means of a Bayes factor (BF) calculated from marginal likelihoods in Tracer version 1.5 (http://tree.bio.ed.ac.uk/software/tracer). The Markov chain Monte Carlo (MCMC) analysis was run for 50 million states, sampled every 10,000 states, and the output was assessed for convergence by means of the effective sample size (ESS) after a 10% burn-in using Tracer version 1.5. Since a higher ESS indicates lower standard errors, only traces with an ESS of more than 200 were accepted. To infer the tMRCA of each transmission cluster with greater confidence, the divergence time of subtype B′ of Thai origin (isolates 93CNRL42, 96M145, 96TH_NP1538, 96M081, 99TH_C1416, 99MMmSTD101, 00TH_C3198, 01CNHN24, 02CNHNsc11, 02CNHNsmx2 and 02CNHNsq4), which was thought to have descended from the ancestral subtype B lineages around 1980 to 1991, was co-estimated and checked for concordance with the current estimates. […]
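The MCMC bookkeeping and model-choice rules described above reduce to a few lines of arithmetic. The Python sketch below is a hedged illustration, not a reimplementation of Tracer: the function names are hypothetical, the marginal log-likelihoods and ESS values are invented for the example, and the sample count ignores the initial state 0 that the log file may also contain. The log Bayes factor is taken as the difference between two log marginal likelihoods.

```python
def retained_samples(chain_length, sample_every, burnin_fraction):
    """Approximate number of logged samples left after burn-in."""
    total = chain_length // sample_every
    return total - round(total * burnin_fraction)


def log_bayes_factor(ln_ml_a, ln_ml_b):
    """ln BF of model A over model B from log marginal likelihoods."""
    return ln_ml_a - ln_ml_b


def best_model(ln_marginal_likelihoods):
    """Name of the model with the highest log marginal likelihood."""
    return max(ln_marginal_likelihoods, key=ln_marginal_likelihoods.get)


def low_ess_parameters(ess_by_parameter, threshold=200.0):
    """Parameters whose ESS is not above the acceptance threshold."""
    return sorted(name for name, ess in ess_by_parameter.items()
                  if ess <= threshold)


# Settings from the protocol: 50 million states, sampled every 10,000,
# with a 10% burn-in.
n_kept = retained_samples(50_000_000, 10_000, 0.10)

# Hypothetical log marginal likelihoods for the four coalescent priors.
ln_ml = {
    "constant": -5120.4,
    "exponential": -5112.9,
    "logistic": -5113.6,
    "skyline": -5108.2,
}
chosen = best_model(ln_ml)
ln_bf = log_bayes_factor(ln_ml[chosen], ln_ml["exponential"])

# Hypothetical ESS values as they might be read off a trace summary.
ess = {"posterior": 1450.0, "treeModel.rootHeight": 310.0, "ucld.mean": 180.0}
rejected = low_ess_parameters(ess)

print(n_kept, chosen, round(ln_bf, 1), rejected)
```

With these settings roughly 4,500 samples remain after burn-in, which is the pool from which each ESS is estimated; any parameter returned by `low_ess_parameters` would call for a longer or repeated run.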

Pipeline specifications

Software tools: MEGA, PAUP*, ModelTest-NG, BEAST
Application: Phylogenetics
Organisms: Human immunodeficiency virus 1, Homo sapiens
Diseases: HIV Infections