Computational protocol: Genetic evidence supports linguistic affinity of Mlabri - a hunter-gatherer group in Thailand

[…] Haplotypes for the 22 autosomes were inferred for each individual from its genotypes with fastPHASE [] version 1.2. "Population labels" were applied during the model-fitting procedure to enhance accuracy. The number of haplotype clusters was set to 20, the number of random starts of the EM algorithm (-T) was set to 20, and the number of iterations of the EM algorithm (-C) was set to 50. This analysis was used to generate a "best guess" estimate of the true underlying patterns of haplotype structure []. We ran fastPHASE on the 55,561 SNPs shared by 17 populations, and only unrelated individuals were included. [...] Principal component analysis (PCA) was performed at the individual level using EIGENSOFT version 2.0 []. [...] The tree of individuals was reconstructed from pairwise ASD distances using the Neighbor-Joining algorithm [] with the Molecular Evolutionary Genetics Analysis software package (MEGA version 4.0) []. Trees of populations as well as components were reconstructed using the maximum likelihood method [] with the CONTML program in the PHYLIP package []. [...] The program frappe [] implements a maximum likelihood method to infer the genetic ancestry of each individual. As in STRUCTURE analysis, this analysis considers each person's genome as having originated from K ancestral, but unobserved, populations whose contributions are described by K coefficients that sum to 1 for each individual []. The program was run for 10,000 iterations for each K from 2 to 18, with 10 replicate runs per value of K. […]
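The excerpt does not state how the ASD matrix was computed, so the following is only a minimal sketch of one common definition of allele sharing distance, not the authors' implementation. It assumes genotypes are coded as 0, 1, or 2 copies of a chosen allele, that missing calls are represented by `None`, and that the distance between two individuals is the mean absolute genotype difference divided by 2, so values fall between 0 and 1. The function name `asd_matrix` and the toy data are hypothetical.

```python
from itertools import combinations


def asd_matrix(genotypes):
    """Return a symmetric pairwise distance matrix (list of lists).

    genotypes: one list per individual, each entry 0, 1, or 2
    (allele count at that SNP) or None for a missing call.
    Assumed definition: mean(|g_i - g_j|) / 2 over SNPs typed in both.
    """
    n = len(genotypes)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        total = 0      # summed allele-count differences
        compared = 0   # SNPs with calls in both individuals
        for a, b in zip(genotypes[i], genotypes[j]):
            if a is None or b is None:
                continue  # skip SNPs missing in either individual
            total += abs(a - b)
            compared += 1
        # Divide by 2 so the distance is scaled to the range [0, 1].
        d = total / (2 * compared) if compared else float("nan")
        dist[i][j] = dist[j][i] = d
    return dist


# Toy data: three individuals typed at four SNPs (hypothetical values).
toy = [
    [0, 1, 2, 2],
    [0, 1, 2, 0],
    [2, 1, 0, None],
]
matrix = asd_matrix(toy)
```

A matrix of this form is the kind of input a Neighbor-Joining implementation expects. In the study itself the tree was built with MEGA 4.0; the sketch only illustrates the distance step.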

