Protocol publication

[…] Because chi-square statistics are affected by sample size under the alternative hypothesis, analyses were also performed using a random selection of 42 individuals from each HapMap comparison group to facilitate more direct comparison of the results across the HapMap populations. The value of 42 was used because it was the size of the smallest group (ASW, N = 42).

Population substructure was evaluated using principal component (PC) analyses for: (i) the Malawi population, (ii) the Malawi population combined with the HapMap populations of African (ASW, LWK, MKK, YRI) ancestry, and (iii) the Malawi population combined with the HapMap populations of African and European (CEU, TSI) ancestry. The PC analyses were conducted using EIGENSOFT version 2.0. SNP inclusion in the PC analysis was restricted to autosomal SNPs that had MAF > 0.05 and observed genotype frequencies consistent with Hardy-Weinberg equilibrium expected proportions (p > 0.001) in each individual participating population. Strict SNP pruning based on pair-wise SNP-SNP LD was conducted to identify a subset of independent SNPs for inclusion in the PC analysis. Specifically, we calculated pair-wise SNP-SNP LD, measured by r², between all SNP pairs within 500 kb in the BMW sample using PLINK. A custom computer program was used to select the largest number of SNPs from each chromosome such that each selected SNP had no other selected SNPs within 500 kb that were in LD with it (defined by r² > 0.01). Based on these selection rules, we identified (i) 23,612, (ii) 18,481 and (iii) 16,912 SNPs for use in the three PC analyses, respectively.

Finally, we performed global ancestry estimation using the software ADMIXTURE on our combined sample of 7 African and European populations using the same 16,912 SNPs included in the PC analyses. ADMIXTURE uses a maximum likelihood approach to model the probabilities of the observed genotype data using ancestry proportions and population allele frequencies.
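For orientation, the likelihood that ADMIXTURE maximizes can be written as below. The notation follows the published description of the ADMIXTURE model and is not taken from this protocol: g_ij is the number of copies of a reference allele carried by individual i at SNP j, q_ik is the fraction of individual i's ancestry attributed to ancestral population k, and f_kj is the frequency of the reference allele at SNP j in ancestral population k.

```latex
\ln L(Q, F) = \sum_{i} \sum_{j} \left\{
    g_{ij} \ln\!\left[ \sum_{k=1}^{K} q_{ik} f_{kj} \right]
  + (2 - g_{ij}) \ln\!\left[ \sum_{k=1}^{K} q_{ik} (1 - f_{kj}) \right]
\right\},
\qquad q_{ik} \ge 0, \quad \sum_{k=1}^{K} q_{ik} = 1, \quad 0 \le f_{kj} \le 1 .
```

The estimated q_ik values are the global ancestry proportions reported for each individual.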
Similar to the program STRUCTURE, ADMIXTURE requires the user to specify the number of postulated ancestral populations (K) that preceded the observed populations included in the study sample. For this study, we considered K = 2, 3 and 5 ancestral populations.

The HapMap data available at the time of this study contained approximately 1.5 million SNPs for 1,115 individuals of 11 unique ethnic groups. The Malawi data included 112 males and 114 females, with a genotyping call rate of 99.975%. Following quality control, the combined HapMap and Malawi dataset included 1,150 individuals, 602 of whom were of African ancestry, and 633,763 SNPs, 617,715 of which were autosomal and were included in the analyses.

Th […]
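The LD-based SNP selection described above (no two selected SNPs within 500 kb of each other with r² > 0.01) can be sketched as follows. The authors' custom program is not available here, so this is only an illustrative greedy heuristic, not their method: it does not guarantee the largest possible SNP set. The function name `greedy_ld_prune` and the input layout (a list of SNP records plus a dictionary of pair-wise r² values, for example parsed from a PLINK LD report) are assumptions made for this example.

```python
from collections import defaultdict


def greedy_ld_prune(snps, pair_r2, max_dist=500_000, r2_threshold=0.01):
    """Select a set of SNPs with no in-LD pair within max_dist (greedy heuristic).

    snps    : list of (snp_id, chromosome, position_bp) tuples
    pair_r2 : dict mapping (snp_id_a, snp_id_b) -> r^2 (hypothetical input layout)
    Returns the selected SNP ids, ordered by chromosome and position.
    """
    info = {snp_id: (chrom, pos) for snp_id, chrom, pos in snps}

    # Build the conflict graph: two SNPs conflict when they lie on the same
    # chromosome, within max_dist base pairs, and r^2 exceeds the threshold.
    conflicts = defaultdict(set)
    for (a, b), r2 in pair_r2.items():
        if a not in info or b not in info:
            continue
        (chrom_a, pos_a), (chrom_b, pos_b) = info[a], info[b]
        if chrom_a == chrom_b and abs(pos_a - pos_b) <= max_dist and r2 > r2_threshold:
            conflicts[a].add(b)
            conflicts[b].add(a)

    # Visit SNPs with the fewest conflicts first, so that each accepted SNP
    # excludes as few other candidates as possible.
    order = sorted(info, key=lambda s: (len(conflicts[s]), info[s]))

    selected = set()
    for snp_id in order:
        # Accept the SNP only if none of its conflicting SNPs was already chosen.
        if conflicts[snp_id].isdisjoint(selected):
            selected.add(snp_id)

    return sorted(selected, key=lambda s: info[s])
```

Because conflicts can only arise between SNPs on the same chromosome, running the function on the whole genome is equivalent to running it chromosome by chromosome. An exact "largest set" solution is a maximum independent set problem, which is why a heuristic is used in this sketch.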

Pipeline specifications

Software tools: EIGENSOFT, PLINK, ADMIXTURE
Organisms: Homo sapiens
Chemicals: Nucleotides