Computational protocol: Fast Dissemination of New HIV-1 CRF02/A1 Recombinants in Pakistan

[…] PCR amplicons and TruSeq RNA libraries were quantified by qPCR with the KAPA Library Quantification Kit for Illumina platforms (Kapa Biosystems, Wilmington, MA). The PCR amplicon or TruSeq library from each sample was barcoded and then sequenced on a MiSeq instrument (Illumina, San Diego, CA) using the MiSeq Reagent Nano kit v2 (300 bp). The average per-base coverage was 500× for PCR amplicons and 8000× for TruSeq libraries. The final consensus sequence from each library was obtained by assembling raw sequence reads using Geneious software (Biomatters, Auckland, New Zealand) or the High-performance Integrated Virtual Environment (HIVE) []. The final sequences were aligned together with subtype reference sequences from the HIV database at Los Alamos using CLUSTAL W [], and the alignment was then adjusted manually in SEAVIEW. Subtypes of newly characterized HIV-1 genomes were determined by phylogenetic tree analysis using the neighbor-joining (NJ) method with the Kimura two-parameter model [, ], and the reliability of topologies was estimated by bootstrap analysis with 1000 replicates. Recombination patterns in newly characterized HIV-1 genomes were initially analyzed with the jumping profile Hidden Markov Model (jpHMM) []. The recombination breakpoints were confirmed by BootScan as implemented in Simplot version 3.5.1 []. The recombination pattern map for each virus was generated using RecDraw []. […] The neighbor-joining phylogenetic tree was first analyzed with TempEst v1.5 to confirm that the temporal signal was sufficient for reliable estimation of the most recent common ancestor (MRCA) before the sequences were analyzed in BEAST []. The divergence times for subtype A1a and CRF02 were estimated using the Bayesian Markov Chain Monte Carlo (MCMC) approach available in the BEAST v1.8.2 package.
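The temporal-signal check mentioned above is commonly done as a root-to-tip regression: the genetic distance from the root to each tip is regressed on the tip's sampling date, the slope approximates the evolutionary rate, and the x-intercept approximates the date of the MRCA. The following is a minimal sketch of that idea in Python, not the TempEst implementation. The function name and the input values are hypothetical, and the sketch assumes the tree is already rooted (TempEst can additionally search for the best-fitting root).

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares fit of root-to-tip distance against sampling date.

    dates     : sampling dates as decimal years, one per tip
    distances : root-to-tip genetic distances (substitutions/site), same order
    Returns (rate, tmrca, r_squared); tmrca is None when the slope is not positive.
    """
    n = len(dates)
    if n < 3 or n != len(distances):
        raise ValueError("need at least three tips with matching dates and distances")

    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in dates)
    syy = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all tips share one sampling date; no temporal signal can be measured")

    rate = sxy / sxx                      # slope: substitutions/site/year
    intercept = mean_d - rate * mean_t
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    # The x-intercept is the date at which the fitted distance reaches zero.
    tmrca = -intercept / rate if rate > 0 else None
    return rate, tmrca, r_squared


# Illustrative values only (not data from this study).
rate, tmrca, r2 = root_to_tip_regression(
    [1990.0, 2000.0, 2010.0], [0.04, 0.06, 0.08]
)
print(rate, tmrca, r2)
```

A positive slope with a reasonably high correlation suggests that the data set carries enough temporal signal for molecular clock dating; a flat or negative slope suggests it does not.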
Both strict and relaxed (uncorrelated lognormal) molecular clocks were applied under the GTR and HKY nucleotide substitution models [], respectively, with a gamma-distribution model of among-site rate heterogeneity (four rate categories) []. Each MCMC analysis was run for 50 million steps and sampled every 10,000 states. Posterior probabilities were calculated after a 10% burn-in, and convergence was checked using Tracer v1.6. The maximum clade credibility tree was generated using TreeAnnotator v1.8.2, available in the BEAST package, and FigTree v1.4.2 was used to visualize the annotated trees []. […]
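The chain settings above determine how many posterior samples remain for summarizing. The following Python sketch works through that arithmetic. It assumes that state 0 is written to the log and that the burn-in is counted in chain states (10% of the chain length), so the exact counts may differ by one under other conventions. The function name is hypothetical.

```python
def mcmc_sample_counts(chain_length, sample_every, burn_in_percent):
    """Count logged, discarded, and retained samples for one MCMC run.

    Assumes states 0, sample_every, 2*sample_every, ... up to chain_length
    are logged, and that every logged state below the burn-in threshold
    is discarded.
    """
    if chain_length <= 0 or sample_every <= 0:
        raise ValueError("chain_length and sample_every must be positive")
    if not 0 <= burn_in_percent < 100:
        raise ValueError("burn_in_percent must be in [0, 100)")

    logged_states = range(0, chain_length + 1, sample_every)
    burn_in_states = chain_length * burn_in_percent // 100
    retained = sum(1 for state in logged_states if state >= burn_in_states)
    logged = len(logged_states)
    return logged, logged - retained, retained


# Settings reported in the protocol: 50 million steps, sampled every
# 10,000 states, 10% burn-in.
print(mcmc_sample_counts(50_000_000, 10_000, 10))  # (5001, 500, 4501)
```

Under these assumptions each run yields about 4,500 retained samples. The retained count is only an upper bound on the information in the chain, which is why convergence is still checked separately in Tracer.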

Pipeline specifications

Software tools: Geneious, HIVE, Clustal W, SeaView, jpHMM, SimPlot, RecDraw, TempEst, BEAST, FigTree
Application: Phylogenetics
Organisms: Human immunodeficiency virus 1