Computational protocol: Clonal diversity and conservation genetics of the medicinal plant Carapichea ipecacuanha (Rubiaceae)

Protocol publication

[…] ISSR bands were treated as dominant genetic markers and scored as 1 (present) or 0 (absent). Only polymorphic bands that could be unambiguously scored across all the surveyed clusters were considered for further analysis. Six aerial stems from the Atlantic range and three aerial stems from the Amazonian range amplified poorly during PCR; these samples were excluded from all subsequent analyses.

An analysis of molecular variance (AMOVA) was performed using Arlequin 3.01. AMOVA examined how genetic diversity was partitioned within and among clusters, at range and species levels. The significance of the genetic differentiation was tested with 1000 permutations, where P denotes the probability of obtaining, by chance alone, a variance component more extreme than the observed value.

Cluster analyses based on the unweighted pair group method with arithmetic mean (UPGMA) were done on the genetic distances using NTSYSpc 2.2 to determine the relationships among the 291 aerial stems. Neighbor-joining (NJ) analysis, as implemented in MEGA 3.1, was used to construct a phenogram representing the genetic distances among the 50 clusters. This latter analysis was based on pairwise FST values taken from Arlequin 3.01. The overall fit of the NJ tree to the original distance matrix was evaluated with a cophenetic correlation coefficient computed in NTSYSpc 2.2.

Inference of genetic structure within a given cluster of ipecac was done with a Bayesian MCMC approach, as implemented in STRUCTURE version 2.2. In STRUCTURE, individuals (ipecac aerial stems in this case) may be members of several Bayesian groups, with the sum of membership coefficients across all groups being 1. For each population, this analysis organized the ipecac aerial stems into K groups that exhibited distinct ISSR marker frequencies (where K is chosen in advance or can be varied across different runs), with no prior information on origin.
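
To make the membership-coefficient idea concrete, here is a minimal Python sketch of how one stem's coefficients across K groups might be summarized. It is not part of STRUCTURE or of the original protocol: the function name `summarize_membership`, the 0.8 cutoff for calling a stem admixed, and the example coefficients are all hypothetical and chosen only for illustration.

```python
def summarize_membership(q_row, threshold=0.8, tol=1e-6):
    """Return (index of the best-supported group, is_admixed) for one stem.

    q_row holds the membership coefficients of a single aerial stem across
    the K groups. The 0.8 threshold is an illustrative assumption, not a
    value taken from the protocol.
    """
    # Membership coefficients across all groups must sum to 1.
    if abs(sum(q_row) - 1.0) > tol:
        raise ValueError("membership coefficients must sum to 1")
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    return best, q_row[best] < threshold


# Hypothetical coefficients for three aerial stems at K = 5
# (made-up numbers, not data from the study).
q_matrix = [
    [0.96, 0.01, 0.01, 0.01, 0.01],
    [0.05, 0.45, 0.40, 0.05, 0.05],
    [0.02, 0.02, 0.02, 0.02, 0.92],
]
assignments = [summarize_membership(row) for row in q_matrix]
```

In this toy input the second stem splits its membership mainly between two groups, so it is flagged as admixed under the assumed cutoff, while the other two stems fall almost entirely into a single group.
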
For our analysis, each class of genotypes was treated as haploid alleles, as recommended in the software's documentation. The program was run on the dataset from each of the ten ipecac populations as a separate input, so that the genetic structure of each population was inferred independently. In all cases, we used the admixture ancestry model and the correlated allele frequencies model. Runs used a burn-in period of 20,000 iterations followed by 20,000 Markov chain Monte Carlo (MCMC) iterations, with 10 repetitions for each K from 2 to 8. STRUCTURE produced nearly identical membership coefficients at each K (data not shown) and indicated a convergence to K = 5, so 5 (the number of clusters per population) was chosen as the best K value. […]
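
The AMOVA step described earlier (partitioning variance within and among groups, then testing significance by permutation) can be illustrated with a small pure-Python sketch. It follows the standard one-level AMOVA formulas on squared distances between 0/1 band profiles. It does not reproduce Arlequin's implementation or the hierarchical (range and species level) design used in the study. The function names, the toy band profiles, and the `(hits + 1) / (n_perm + 1)` p-value convention are assumptions made for illustration.

```python
import random
from itertools import combinations


def squared_distance(a, b):
    """Number of mismatched bands between two 0/1 profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def amova_one_level(profiles, groups):
    """Return (sigma2_among, sigma2_within, phi_st) for a one-level AMOVA."""
    n_total = len(profiles)
    labels = sorted(set(groups))
    n_groups = len(labels)
    sizes = {g: groups.count(g) for g in labels}
    ssd_total = 0.0
    ssd_within = 0.0
    for i, j in combinations(range(n_total), 2):
        d2 = squared_distance(profiles[i], profiles[j])
        ssd_total += d2 / n_total
        if groups[i] == groups[j]:
            ssd_within += d2 / sizes[groups[i]]
    ssd_among = ssd_total - ssd_within
    ms_among = ssd_among / (n_groups - 1)
    ms_within = ssd_within / (n_total - n_groups)
    # Average group size corrected for unequal sampling.
    n0 = (n_total - sum(s * s for s in sizes.values()) / n_total) / (n_groups - 1)
    sigma2_among = (ms_among - ms_within) / n0
    sigma2_within = ms_within
    total = sigma2_among + sigma2_within
    phi_st = sigma2_among / total if total != 0 else float("nan")
    return sigma2_among, sigma2_within, phi_st


def permutation_p_value(profiles, groups, n_perm=1000, seed=0):
    """Share of label shufflings whose among-group component is at least
    as large as the observed one (with a +1 correction, an assumption)."""
    rng = random.Random(seed)
    observed = amova_one_level(profiles, groups)[0]
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if amova_one_level(profiles, shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy ISSR band profiles (made-up data): five stems in each of two groups.
profiles = [
    [1, 1, 0, 0, 0, 1], [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 1], [0, 0, 1, 1, 0, 0], [0, 1, 1, 1, 0, 1],
    [0, 0, 1, 1, 0, 1], [0, 0, 1, 1, 1, 1],
]
groups = ["A"] * 5 + ["B"] * 5

sigma2_among, sigma2_within, phi_st = amova_one_level(profiles, groups)
p_value = permutation_p_value(profiles, groups, n_perm=1000)
```

Shuffling the group labels while keeping the band profiles fixed builds the null distribution of the among-group variance component, which is the quantity the P value in the text refers to.
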

Pipeline specifications

Software tools: NTSYSpc, MEGA, Arlequin
Application: Population genetic analysis