Computational protocol: Y-chromosome diversity suggests southern origin and Paleolithic backwave migration of Austro-Asiatic speakers from eastern Asia to the Indian subcontinent

Protocol publication

[…] We estimated the time to the most recent common ancestor (TMRCA) of the O2a1-M95 lineage from Y-STR variation in each population as described previously, with a 25-year generation time and a mutation rate of 6.9 × 10⁻⁴. For comparison, we calculated the ages with three sets of loci for each population: a) all loci reported in the corresponding references, b) a 7-locus set (DYS19, DYS389 I, DYS389 II, DYS390, DYS391, DYS392 and DYS393) and c) a 6-locus set (DYS19, DYS389 I, DYS390, DYS391, DYS392 and DYS393). The results of the different calculations are very similar for most populations. The mean TMRCA of a geographic region is the average over its populations. We also estimated the unbiased haplotype diversity of every population using GenAlEx 6.3. When estimating age and diversity, O2a1-M95 populations with fewer than 10 samples were either excluded or merged with closely related populations. In total, the coalescence ages and diversity of the O2a1-M95 lineages were calculated for 105 Asian populations.

A median-joining network, resolved with the maximum parsimony (MP) algorithm, was constructed using the Network package. The O2a1-M95 isofrequency maps, based on frequency and unbiased haplotype diversity, were generated using Surfer 10 (Golden Software Inc., Golden, USA), following the Kriging procedure. The average number of pairwise differences of Y-STRs for the studied populations was calculated using Arlequin 3.5, and a neighbor-joining (NJ) tree was constructed with MEGA 6.0.

We performed principal component analysis (PCA) based on the frequencies of mtDNA haplogroups according to the method developed by Richards et al. with MVSP 3.13. To compare the paternal and maternal gene pools between populations from East Asia and South Asia, we analyzed ~21,470 previously published mtDNA sequences from these populations. […]
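The age and diversity calculations can be illustrated with a minimal Python sketch. The protocol only says the TMRCA was estimated "as described previously", so the estimator used here (average squared difference of repeat counts from a per-locus median founder allele, divided by the mutation rate) is an assumption, not necessarily the authors' method. The diversity function uses the standard unbiased formula h = n/(n − 1) × (1 − Σ pᵢ²); GenAlEx itself is not called. All function names and the sample haplotypes are hypothetical.

```python
from collections import Counter
from statistics import mean, median

MUTATION_RATE = 6.9e-4   # mutation rate quoted in the protocol
GENERATION_YEARS = 25    # generation time quoted in the protocol


def tmrca_years(haplotypes, mu=MUTATION_RATE, gen_years=GENERATION_YEARS):
    """Rough TMRCA in years from Y-STR repeat counts.

    ASSUMPTION: uses the average squared difference (ASD) from a founder
    haplotype, with the founder allele at each locus approximated by the
    median. `haplotypes` is a list of equal-length tuples of repeat counts.
    """
    n_loci = len(haplotypes[0])
    asd_per_locus = []
    for locus in range(n_loci):
        alleles = [h[locus] for h in haplotypes]
        founder = median(alleles)
        asd_per_locus.append(mean([(a - founder) ** 2 for a in alleles]))
    generations = mean(asd_per_locus) / mu
    return generations * gen_years


def unbiased_haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def regional_mean_tmrca(population_tmrcas):
    """Mean TMRCA of a region: the plain average over its populations."""
    return mean(population_tmrcas)


# Hypothetical two-locus haplotypes for one small population.
sample = [(14, 10), (14, 10), (15, 10), (13, 11)]
age = tmrca_years(sample)
diversity = unbiased_haplotype_diversity(sample)
```

The toy sample above is far smaller than the 10-sample minimum the protocol applies, and is only meant to show the arithmetic. To mirror the three-way comparison in the text, the same function could be run on each population with its loci restricted to the 7-locus or 6-locus set.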

Pipeline specifications

Software tools: GenAlEx, Arlequin, MEGA
Application: Population genetic analysis