Computational protocol: Natural Selection Mediated Association of the Duffy (FY) Gene Polymorphisms with Plasmodium vivax Malaria in India

Protocol publication

[…] To capture most of the polymorphisms of the FY gene and to conduct a meaningful population genetic survey, we divided India into six distinct zones: North India (NI), Central India (CI), West India (WI), South India (SI), East India (EI) and North-East India (NEI). Further, since many tribes live in different Indian states, we included samples from three Indian tribes (Juango, Bonda and Kutia Kandha) inhabiting Odisha state. The details of the Indian states included under each zone and the number of individuals sampled from each state are provided in the . In total, 250 healthy and unrelated Indians (including 28 tribal individuals) representing almost all the political states of India were sampled. As a positive control, blood samples from two Africans (Cameroonian colleagues of the authors working in India) were also analyzed. From each individual, two milliliters of venous blood were collected in heparinized vacutainers and stored at −20°C until DNA extraction.

We selected a 1096 bp region of the FY gene, located on human chromosome 1, comprising the two main polymorphic sites: T-33C in the promoter region, which governs the FY*O allele, and G125A in exon 2, which governs the FY*A/FY*B alleles. For ease of PCR amplification and automated DNA sequencing, we divided the 1096 bp region into two overlapping fragments amplified with two primer pairs. Genomic DNA was extracted from the blood sample of each individual, and PCR amplification reactions were carried out following standard protocols in a final volume of 20 µl. Five microliters of the amplified PCR product of each fragment were run on a 2% agarose gel to check the quality of amplification. Amplicons showing a single band without any primer-dimer were taken forward for DNA sequencing. For this, the PCR amplicons were purified with Exonuclease-I and Shrimp Alkaline Phosphatase (Exo-Sap, Fermentas, Life Sciences) following the standard protocol.
Sequencing reactions were carried out with the Big Dye Terminator (BDT) ready reaction mix as per the Applied Biosystems (ABI) protocol, and DNA sequencing was performed on an ABI 3730XL DNA Analyzer (in-house facility of NIMR). Each DNA fragment was sequenced in both the forward and reverse directions (2× coverage), and the resulting DNA sequences were assembled and edited for each individual using the SeqMan and EditSeq computer programs (DNASTAR, Madison, WI, USA). The sequence chromatograms were visually inspected for single or double peaks at both the T-33C and G125A nucleotide positions. Altogether, 252 DNA sequences (250 Indian, two African) were aligned using the MegAlign computer program (DNASTAR) following the Clustal-W algorithm to identify Single Nucleotide Polymorphisms (SNPs) at these two nucleotide positions. At position −33, a single C peak indicates an FY*O homozygote, a double peak of C and T indicates a heterozygote, and a single T peak indicates the absence of the FY*O allele. Similarly, at position 125, single G and A peaks indicate homozygosity for FY*A and FY*B, respectively, whereas a double peak of G and A identifies the individual as a heterozygote (FY*A/FY*B). Following these approaches, each of the 250 Indians and the two Africans was independently genotyped, and the frequency of each allele in the seven Indian population samples was calculated from the genotype data for population genetic analyses (see below).

The genotype data (FY*A/FY*A, FY*A/FY*B and FY*B/FY*B) from each of the seven zonal population samples were analyzed by the Chi-square test to determine whether the population samples deviate from the expectations under Hardy-Weinberg (H-W) equilibrium.
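The genotype-calling rules and the Hardy-Weinberg test described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' workflow: the function names and the example counts are hypothetical, and the 5% significance threshold (critical value 3.841 at one degree of freedom) is an assumption, since the text does not state the threshold used.

```python
# Illustrative sketch; function names and example counts are hypothetical.

# Assumed threshold: chi-square critical value for 1 degree of freedom at
# alpha = 0.05 (3 genotype classes - 1 - 1 estimated allele frequency).
CHI2_CRITICAL_DF1 = 3.841


def call_fy_promoter(bases):
    """Genotype at position -33: "C"/"T" for a single peak, "CT" for a double peak."""
    observed = frozenset(bases.upper())
    if observed == {"C"}:
        return "FY*O homozygote"
    if observed == {"C", "T"}:
        return "FY*O heterozygote"
    if observed == {"T"}:
        return "FY*O absent"
    raise ValueError(f"unexpected base(s) at -33: {bases!r}")


def call_fy_exon2(bases):
    """Genotype at position 125: "G"/"A" for a single peak, "GA" for a double peak."""
    observed = frozenset(bases.upper())
    if observed == {"G"}:
        return "FY*A/FY*A"
    if observed == {"A"}:
        return "FY*B/FY*B"
    if observed == {"A", "G"}:
        return "FY*A/FY*B"
    raise ValueError(f"unexpected base(s) at 125: {bases!r}")


def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return (frequency of FY*A, chi-square statistic) for three genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # FY*A allele frequency
    q = 1.0 - p                      # FY*B allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Hypothetical counts for one population sample (not data from the study).
freq_a, chi2 = hardy_weinberg_chi2(n_aa=18, n_ab=20, n_bb=7)
print(f"FY*A frequency = {freq_a:.3f}, chi-square = {chi2:.3f}, "
      f"deviates from H-W at 5%: {chi2 > CHI2_CRITICAL_DF1}")
```

Allele frequencies are counted directly from genotypes here (each homozygote contributes two copies, each heterozygote one), which matches the statement that allele frequencies were calculated from the genotype data.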
Genetic differentiation between pairs of population samples was determined by two test statistics: (i) Fisher's Exact Test of differentiation using the Arlequin computer program, and (ii) Nei's genetic distance (D) between pairs of population samples using the POPGENE computer program. To visualize genetic interrelationships among the Indian population samples, the D matrix was used to construct a Neighbour-Joining (NJ) population phylogenetic tree using the Phylip computer program, which was visualized with the online computer program Drawgram, following an approach similar to that described by Das and co-workers and Gupta and co-workers. To discern whether ecological parameters (which change with population coordinates) influence the distribution patterns of the two FY alleles, the frequencies of the two FY alleles were correlated independently with the latitude and the longitude of the population samples (determined from the centrally placed city of each zone; see the Supplementary material) by calculating Pearson's correlation coefficient (r). Furthermore, since the FY*A allele has been shown to provide protection against vivax malaria in comparison to the FY*B allele, and the epidemiology of P. vivax malaria varies across India, we correlated the extent of vivax malaria in each Indian state with the frequencies of both the FY*A and FY*B alleles and with the three FY genotypes by calculating the r values independently. Data in percent frequencies were converted through arcsine transformation before being used in any of the above statistical analyses. […]
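The three numerical steps named above (arcsine transformation, Nei's genetic distance and Pearson's r) can be sketched as follows. Several points are assumptions rather than details given in the text: the transformation is taken to be the common angular form arcsin(sqrt(p/100)), the distance is Nei's standard (1972) single-locus form without small-sample correction, and all function names and input values are hypothetical. The study itself used POPGENE and Arlequin, not this code.

```python
import math


def arcsine_transform(percent):
    """Angular transformation of a percentage (assumed form: asin(sqrt(p/100)))."""
    return math.asin(math.sqrt(percent / 100.0))


def nei_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance D = -ln(I) for one locus.

    freqs_x and freqs_y list the allele frequencies (as proportions, in the
    same allele order) of the two population samples being compared.
    """
    j_xy = sum(a * b for a, b in zip(freqs_x, freqs_y))
    j_x = sum(a * a for a in freqs_x)
    j_y = sum(b * b for b in freqs_y)
    if j_xy == 0:
        return math.inf  # no shared alleles: identity is zero
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical inputs (not data from the study): FY*A frequency in percent
# and the latitude of the central city for four population samples.
fy_a_percent = [55.0, 60.0, 48.0, 70.0]
latitude_deg = [28.6, 23.3, 19.1, 13.1]

transformed = [arcsine_transform(p) for p in fy_a_percent]
print("r (FY*A vs latitude):", round(pearson_r(transformed, latitude_deg), 3))
print("Nei's D:", round(nei_distance([0.55, 0.45], [0.70, 0.30]), 4))
```

In this sketch the transformation is applied only to the percentages entering the correlation. Whether the original analysis also transformed the frequencies before computing D is not clear from the text.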

Pipeline specifications

Software tools: Clustal W, POPGENE, PHYLIP
Applications: Phylogenetics, Population genetic analysis
Organisms: Plasmodium vivax, Homo sapiens
Diseases: Malaria, Vivax