Computational protocol: Sorghum Landrace Collections from Cooler Regions of the World Exhibit Magnificent Genetic Differentiation and Early Season Cold Tolerance

Protocol publication

[…] The markers were first evaluated independently for polymorphism. Of the 67 markers screened, 50 produced a sufficient level of polymorphism; the remaining 17 did not sufficiently distinguish the genotypes and were excluded from the analysis. The diversity analysis was performed using Nei's 1983 method in PowerMarker version 3.25 and confirmed in POPGENE 1.32. Diversity indices including total number of alleles, allele number per locus, gene diversity (expected heterozygosity), and polymorphism information content (PIC) were determined. The neighbor-joining (NJ) tree analysis was conducted with 100 bootstrap replications as implemented in PowerMarker, and the tree was constructed using Molecular Evolutionary Genetics Analysis (MEGA) 5.10.
The software STRUCTURE version 2.3.4 was used to analyze the population structure based on the admixture model, in which each individual draws some fraction of its genome from each of the k populations. It identifies gene flow events between subpopulations, and individuals whose genotypes indicate admixture are assigned to two or more subpopulations. STRUCTURE was run 20 times for each number of subpopulations (k) from 1 to 10, with 50,000 replicates for burn-in and 50,000 replicates during analysis. The consensus number of subpopulations was determined based on the results of the NJ tree analysis and the point where the posterior probability (LnP(D)) began to plateau. Based on these, k = 4 was chosen as the optimal number of subpopulations. A graphical bar plot was then generated using the posterior membership coefficients.
The NJ tree and STRUCTURE analyses were further confirmed by principal component analysis (PCA) conducted in R. The analysis of molecular variance (AMOVA) for subpopulations and pairwise population differentiation (FST) comparisons were performed in GenAlEx 6.501.
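As a rough illustration of two of the diversity indices named above, the sketch below computes gene diversity (expected heterozygosity) and PIC for a single locus using the common textbook formulas. The function names and the toy allele list are hypothetical, and PowerMarker may apply sample-size or inbreeding corrections that this sketch omits.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(alleles):
    """Return {allele: frequency} for a list of observed allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), with no sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    freqs = list(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical SSR locus with two equally frequent alleles (sizes in bp).
toy_alleles = [152, 152, 158, 158]
toy_freqs = allele_frequencies(toy_alleles)
toy_he = gene_diversity(toy_freqs.values())
toy_pic = pic(toy_freqs.values())
```

For this two-allele toy locus with equal frequencies, gene diversity is 0.5 and PIC is 0.375; PIC is never larger than gene diversity because it subtracts an extra term for uninformative matings.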
The genetic distances between the four identified subpopulations were calculated using POPGENE v 1.32 based on unbiased genetic distance. [...]
Linkage disequilibrium (LD) was determined as the squared correlation coefficient (r²) between all pairs of markers in TASSEL software version 3.0 using a sliding window. LD calculations were carried out across all 136 accessions and within each subpopulation identified using STRUCTURE. The extent of LD was estimated separately for unlinked loci (on different chromosomes) and linked loci (on the same chromosome). All marker pairs with LD p-values of less than 0.05 were considered to be in significant LD. The significance (p-value) of r² for each marker pair was determined using 10,000 permutations in TASSEL 3.0. LD decay was determined at a squared correlation coefficient of r² = 0.02, and the scatterplots of estimated r² values vs. distance (cM) between markers for the whole genome and for chromosomes 1 and 2 were fitted using curvilinear regression in SPSS software version 22.0. Genetic distances for the 50 SSR markers used in the LD decay analysis were obtained from a genetic linkage map of sorghum constructed by . The LD decay curves were fitted for the whole genome and for chromosomes 1 and 2 using 50, 9 and 11 SSR markers, respectively. [...]
The marker-trait associations were tested using the general linear model (GLM) and the mixed linear model (MLM) in TASSEL 3.0. In the GLM, the Q-matrix was included as a covariate to correct for the effects of population structure, while both the Q and K matrices were used in the MLM to correct for population structure and familial relatedness, respectively. The Q and K matrices were calculated using STRUCTURE 2.3.4 and SPAGeDi, respectively. All negative kinship values between individuals were set to zero.
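For intuition about the r² statistic used in the LD analysis above, the following minimal sketch computes r² for one pair of loci in the simplest case: biallelic loci with known haplotypes coded 0/1. This is an assumption made for illustration only. SSR markers are multiallelic, and the way TASSEL handles them is not reproduced here; the function name and toy data are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Squared correlation r^2 between two biallelic loci.

    haplotypes: list of (a, b) tuples, where a and b are 0 or 1 and give
    the allele carried at locus A and locus B on the same haplotype.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy data: alleles always travel together (complete LD) ...
complete_ld = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
# ... versus every allele combination equally common (no LD).
no_ld = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

In the toy data, the first call gives r² = 1 and the second gives r² = 0, the two extremes between which the decay curves described above fall.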
Significance of marker-trait associations was based on a threshold of p ≤ 1 × 10⁻³, a stringent Bonferroni correction determined by dividing 0.05 by 50 (the number of markers used in this study) as described by . […]
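The threshold arithmetic above can be checked with a short sketch. The function names and the example p-values are hypothetical; only the values 0.05 and 50 come from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def is_significant(p_value, threshold):
    """True when a marker-trait association passes the corrected threshold."""
    return p_value <= threshold


# 0.05 divided by the 50 markers used in the study gives 1e-3.
threshold = bonferroni_threshold(0.05, 50)

# Hypothetical p-values for two marker-trait tests.
passes = is_significant(4e-4, threshold)
fails = is_significant(2e-2, threshold)
```

A p-value of 2 × 10⁻² would count as significant at the uncorrected 0.05 level but not after the correction, which is why the threshold is described as stringent.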

Pipeline specifications

Software tools: PowerMarker, POPGENE, MEGA, TASSEL, SPAGeDi
Applications: Phylogenetics, Population genetic analysis
Organisms: Sorghum bicolor