Computational protocol: Further Evidence of Increasing Diversity of Plasmodium vivax in the Republic of Korea in Recent Years

Protocol publication

[…] Since asexual P. vivax stages are haploid, an infection was defined as polyclonal if more than one allele was observed at one or more loci. The Multiplicity of Infection (MOI) for a given sample was defined as the maximum number of alleles observed at any of the loci investigated. The mean MOI was calculated from the individual sample MOIs for each study site. With the exception of the MOI calculations, only the predominant allele at each locus in each isolate was used in all analyses [].

The expected heterozygosity (HE) was measured as an index of population diversity. HE was calculated for each locus using the formula $H_E = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where $n$ is the number of isolates analyzed and $p_i$ is the frequency of the ith allele in the population. The correction factor n/(n-1) was included to improve comparability between populations with differing sample sizes.

The pairwise FST metric was used to gauge the genetic distance between populations. Calculations were undertaken using Arlequin software (version 3.5) []. In addition to the classic FST metric, standardized measures of genetic distance (F’ST) were calculated to adjust for high marker diversity and enable greater comparability with other studies []. F’ST expresses FST as a fraction of the maximum possible value of this statistic: $F'_{ST} = F_{ST} / F_{ST\text{-max}}$. FST-max was calculated by recoding the data to obtain the maximum divergence among populations.

Population structure was further assessed using STRUCTURE software version 2.3.3 to determine the most likely number of populations (K) and the ancestry of each isolate to the K populations []. Twenty replicates, each with 10,000 iterations (10,000 burn-in), were run for each K from 1 to 10 using the admixture model with correlated allele frequencies. The most probable K was derived by calculating ΔK as described elsewhere [] for each of K = 2–8.
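The MOI, expected heterozygosity and standardized FST definitions given earlier are simple enough to sketch directly. The following Python is an illustrative sketch only, not the authors' code: the function names and the input layout (a dict mapping each locus to the list of alleles called in one sample) are assumptions made for this example, and FST and FST-max are taken as already computed elsewhere (the protocol used Arlequin for this).

```python
from collections import Counter


def sample_moi(alleles_per_locus):
    """MOI of one sample: the maximum number of alleles seen at any locus.

    `alleles_per_locus` is a hypothetical layout: {locus_name: [allele, ...]}.
    """
    return max(len(alleles) for alleles in alleles_per_locus.values())


def is_polyclonal(alleles_per_locus):
    """Polyclonal if more than one allele is observed at one or more loci."""
    return sample_moi(alleles_per_locus) > 1


def mean_moi(samples):
    """Mean MOI over the samples of one study site."""
    return sum(sample_moi(s) for s in samples) / len(samples)


def expected_heterozygosity(predominant_alleles):
    """H_E = [n / (n - 1)] * (1 - sum_i p_i^2) for a single locus.

    `predominant_alleles` holds one (predominant) allele per isolate.
    """
    n = len(predominant_alleles)
    if n < 2:
        raise ValueError("at least two isolates are needed for n / (n - 1)")
    counts = Counter(predominant_alleles)
    sum_p_squared = sum((count / n) ** 2 for count in counts.values())
    return (n / (n - 1)) * (1 - sum_p_squared)


def standardized_fst(fst, fst_max):
    """F'ST = FST / FST-max, with both inputs computed beforehand."""
    if fst_max <= 0:
        raise ValueError("FST-max must be positive")
    return fst / fst_max
```

For example, a locus at which four isolates carry two alleles at equal frequency gives a sum of squared frequencies of 0.5, so HE = (4/3) × 0.5 ≈ 0.67.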
Barplots illustrating the ancestry of each isolate to each of the K populations were prepared using distruct software version 1.1 [].

Multi-locus genotypes (or infection haplotypes) were reconstructed from the predominant allele at each locus in isolates with no missing data at any of the loci investigated. Using these multi-locus genotypes, linkage disequilibrium (LD) was measured by the standardised index of association (IAS) using the web-based LIAN 3.5 software []. Under the null hypothesis of linkage equilibrium, the significance of the IAS estimates was assessed using 10,000 random permutations of the data. LD was assessed in 1) the full sample set and 2) a reduced set in which each unique haplotype was represented just once, to assess epidemic transmission.

Using the multi-locus genotypes described above, the genetic relatedness between sample pairs was assessed by measuring the proportion of alleles shared between haplotype pairs (ps). Using (1-ps) as a measure of genetic distance [], an unrooted neighbour-joining tree [] was generated with the APE (Analyses of Phylogenetics and Evolution) package in R []. Suspected imported cases were included in the neighbour-joining analysis only. […]
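The haplotype filtering and the (1-ps) distance described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, missing calls are assumed to be encoded as None, and tree building itself is left to a dedicated tool (the protocol used the APE package in R).

```python
from itertools import combinations


def complete_haplotypes(genotypes):
    """Keep only isolates with no missing data (None) at any locus.

    `genotypes` is a hypothetical layout: {isolate_id: sequence of the
    predominant allele at each locus, in a fixed locus order}.
    """
    return {
        isolate: tuple(alleles)
        for isolate, alleles in genotypes.items()
        if None not in alleles
    }


def unique_haplotypes(haplotypes):
    """Reduced set with each distinct haplotype represented just once."""
    return sorted(set(haplotypes.values()))


def shared_allele_distance(hap_a, hap_b):
    """Genetic distance 1 - ps, where ps is the proportion of loci at
    which the two haplotypes carry the same allele."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same loci")
    shared = sum(a == b for a, b in zip(hap_a, hap_b))
    return 1 - shared / len(hap_a)


def distance_matrix(haplotypes):
    """Symmetric pairwise (1 - ps) distances as a nested dict."""
    names = sorted(haplotypes)
    dist = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = shared_allele_distance(haplotypes[a], haplotypes[b])
        dist[a][b] = d
        dist[b][a] = d
    return names, dist
```

The nested dict returned by `distance_matrix` can be written out as a square matrix and passed to a neighbour-joining implementation of choice.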

Pipeline specifications

Software tools: Arlequin, DISTRUCT, LIAN, APE
Applications: Phylogenetics, Population genetic analysis
Organisms: Plasmodium vivax, Homo sapiens
Diseases: Infection, Malaria