Computational protocol: Arylamine N-Acetyltransferase 2 (NAT2) Genetic Diversity and Traditional Subsistence: A Worldwide Population Survey

[…] To adequately characterize the worldwide patterns of NAT2 gene variation, only the 74 samples genotyped for the seven common SNPs of NAT2 were used, so as to avoid possible haplotype and phenotype misclassifications due to incomplete genotype data. As the SNP 191G>A has been shown to be rare in non-African populations, we also included in the diversity survey the 25 non-African samples genotyped for all SNPs except this one, leading to a total of 99 population samples (11,286 individuals) belonging to four continental regions (Africa, Europe, Asia and America) available for analysis.

In each sample, NAT2 haplotypes were either directly resolved using molecular-haplotyping techniques (allele-specific PCR and restriction mapping) or computationally inferred from the unphased multilocus genotypes using statistical algorithms (based on a parsimony, maximum-likelihood, or Bayesian approach). For some samples, a combination of the two approaches was used. The specific haplotyping method used in each sample is specified in […]. NAT2 haplotypes were named in accordance with the consensus gene nomenclature of human NAT2 alleles. The fast NAT2*4 haplotype was considered the ancestral human haplotype, as inferred from primate sequences (unpublished data).

Thanks to the well-established genotype-phenotype correlation, the individual acetylation phenotype could be predicted from the pair of multilocus haplotypes carried by each subject at NAT2, following the acknowledged classification of NAT2* alleles into fast and slow haplotypes.
The acetylation phenotype of each individual was inferred by assuming that carrying two haplotypes of the series NAT2*4, NAT2*11, NAT2*12 or NAT2*13 (as a homozygous or compound heterozygous genotype) results in the rapid (fast) acetylator phenotype; that one of these haplotypes in combination with a haplotype of the series NAT2*5, NAT2*6, NAT2*7 or NAT2*14 results in the intermediate acetylator phenotype; and that two haplotypes of the series NAT2*5, NAT2*6, NAT2*7 or NAT2*14 result in the slow acetylator phenotype. The proportions of slow, intermediate and fast acetylators in the 99 samples studied are provided in […].

Analysis of molecular variance (AMOVA), the FST statistic, and measures of haplotype diversity based on estimated haplotype frequencies were computed using Arlequin v.3.11. The molecular distance matrix (number of pairwise differences) between NAT2 haplotypes was included in the AMOVA and FST computations. […]
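
The phenotype assignment rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, and it assumes that a subtype letter appended to a haplotype name (for example the "B" in NAT2*5B) does not change the series the haplotype belongs to.

```python
from collections import Counter

# Haplotype series as listed in the protocol text.
FAST_SERIES = ("NAT2*4", "NAT2*11", "NAT2*12", "NAT2*13")
SLOW_SERIES = ("NAT2*5", "NAT2*6", "NAT2*7", "NAT2*14")


def haplotype_class(haplotype):
    """Return 'fast' or 'slow' for a haplotype name such as 'NAT2*5B'."""
    # Assumption: trailing subtype letters do not affect the fast/slow class.
    series = haplotype.rstrip("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
    if series in FAST_SERIES:
        return "fast"
    if series in SLOW_SERIES:
        return "slow"
    raise ValueError(f"haplotype outside the series covered here: {haplotype}")


def acetylation_phenotype(hap1, hap2):
    """Predict the phenotype from the pair of haplotypes carried by one subject."""
    n_fast = [haplotype_class(h) for h in (hap1, hap2)].count("fast")
    return {2: "fast", 1: "intermediate", 0: "slow"}[n_fast]


def phenotype_proportions(genotypes):
    """Proportions of slow, intermediate and fast acetylators in one sample."""
    counts = Counter(acetylation_phenotype(a, b) for a, b in genotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty sample")
    return {p: counts[p] / total for p in ("slow", "intermediate", "fast")}


# Hypothetical four-individual sample, for illustration only.
sample = [
    ("NAT2*4", "NAT2*4"),
    ("NAT2*4", "NAT2*5B"),
    ("NAT2*6A", "NAT2*7B"),
    ("NAT2*5B", "NAT2*6A"),
]
print(phenotype_proportions(sample))
# {'slow': 0.5, 'intermediate': 0.25, 'fast': 0.25}
```

Raising an error for haplotypes outside the two listed series, instead of guessing a class, keeps unclassified alleles from being silently counted in the proportions.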
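
As a rough illustration of two of the quantities mentioned above, the sketch below computes haplotype diversity from haplotype counts and builds a distance matrix of pairwise differences between haplotypes. It is not a substitute for Arlequin and does not reproduce AMOVA or FST. It assumes the usual unbiased estimator of haplotype (gene) diversity, H = n/(n-1) * (1 - sum of squared haplotype frequencies), with n the number of gene copies sampled. The haplotype labels and allele strings in the example are hypothetical and do not correspond to real NAT2 alleles.

```python
def haplotype_diversity(counts):
    """Unbiased haplotype diversity from a mapping of haplotype -> copy count."""
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two gene copies are needed")
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


def pairwise_differences(hap_a, hap_b):
    """Number of sites at which two equally long haplotypes differ."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must cover the same sites")
    return sum(a != b for a, b in zip(hap_a, hap_b))


def distance_matrix(haplotypes):
    """Pairwise-difference matrix as a dict keyed by (name_a, name_b)."""
    names = list(haplotypes)
    return {
        (a, b): pairwise_differences(haplotypes[a], haplotypes[b])
        for a in names
        for b in names
    }


# Hypothetical haplotypes over seven SNP sites and hypothetical copy counts.
haplotypes = {"h1": "GCTCGAG", "h2": "GCCCGAG", "h3": "GCCTGGG"}
counts = {"h1": 5, "h2": 3, "h3": 2}

print(haplotype_diversity(counts))
print(distance_matrix(haplotypes)[("h1", "h3")])  # 3 differing sites
```

Keying the matrix by name pairs keeps the example short; a square array would be the more natural structure for larger haplotype sets.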

Pipeline specifications

Software tools: UNPHASED, Arlequin
Application: Population genetic analysis
Organisms: Homo sapiens