Computational protocol: Genetic Diversity and Variability in Endangered Pantesco and Two Other Sicilian Donkey Breeds Assessed by Microsatellite Markers

Protocol publication

[…] Individual multilocus genotypes were processed with GenAlEx v.6.4 to perform file conversions and to calculate the main parameters of genetic variability. For each locus and breed, and for the whole sample, the allele frequencies, private alleles (Ap), and observed (Ho) and unbiased expected (He) heterozygosities were calculated. The polymorphism information content (PIC) was also calculated for each locus and breed.

Hardy-Weinberg equilibrium was tested with Genepop v.4.0, which was used to perform the score test per locus and breed as well as global tests across loci and across samples; the tests used the Markov chain algorithm (10,000 dememorizations, 5,000 batches, and 5,000 iterations per batch). The presence of null alleles was tested with MICRO-CHECKER v.2.2.3, using the methods of Chakraborty et al. and Brookfield.

FSTAT v.2.9.3 was used to estimate the F-statistics and their significance, as well as the rarefied number of alleles (allelic richness, Ar) based on the minimum sample size. The significance levels obtained from the multiple tests carried out for Hardy-Weinberg equilibrium and F-statistics were corrected by the sequential Bonferroni method to reduce the occurrence of type I errors.

To measure the short-term divergence of the donkey breeds, Reynolds' pairwise genetic distances (DR) were calculated with the PHYLIP v.3.69 package. The Neighbour-Joining algorithm was then applied to DR, and node support was assessed with 1,000 bootstrap resamplings of the allele frequencies.

The model-based approach implemented in STRUCTURE 2.3 was used to assess the genomic clustering of the sample. As suggested by the authors for populations with possible mixed ancestry, the admixture model combined with the correlated allele frequencies option was used to infer population structure without prior information.
Each run consisted of a burn-in of 500,000 steps followed by 500,000 iterations. The number of possible clusters (K) tested ranged from 1 to 10, and 10 independent runs were carried out for each K. The number of clusters that best fit the data was determined by plotting the mean ln Pr(X | K) across the independent runs for each K, as suggested by the authors.

A correspondence analysis, in which chi-square distances measure the proximity of the taxa, was performed with GENETIX v.4.05, and breeds and individuals were plotted according to their allele frequencies. […]
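The per-locus statistics named at the start of this protocol (observed heterozygosity, unbiased expected heterozygosity, and PIC) and the sequential Bonferroni correction can be illustrated with a minimal Python sketch. It is not the code of GenAlEx, Genepop, or FSTAT, and the function names `locus_stats` and `sequential_bonferroni` are hypothetical. Several points are assumptions, because the citations are missing from this excerpt: the unbiased estimator is taken to be the small-sample correction 2N/(2N-1), PIC is taken to follow the formula of Botstein et al. (1980), and the sequential Bonferroni method is implemented as Holm's step-down procedure. Genotypes are assumed to be diploid with missing data already removed.

```python
from collections import Counter


def locus_stats(genotypes):
    """Diversity statistics for one locus in one breed.

    genotypes: list of (allele_a, allele_b) tuples, one per individual,
    with missing genotypes already excluded (assumption).
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: proportion of heterozygous individuals.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # Expected homozygosity under random mating.
    hom = sum(p * p for p in freqs)

    # Unbiased expected heterozygosity (assumed 2N/(2N-1) correction).
    he = (2 * n / (2 * n - 1)) * (1 - hom)

    # PIC (assumed Botstein et al. 1980):
    # 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = 1 - hom - cross

    return {"n_alleles": len(freqs), "Ho": ho, "He": he, "PIC": pic}


def sequential_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure; returns one True/False flag per test."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with
        # alpha/(m-1), and so on; stop at the first non-significant test.
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break
    return significant


if __name__ == "__main__":
    # Toy genotypes (allele sizes in bp), invented purely for illustration.
    toy = [(150, 150), (150, 154), (154, 154), (150, 154)]
    print(locus_stats(toy))
    print(sequential_bonferroni([0.01, 0.04, 0.03]))
```

In practice `locus_stats` would be called once per locus and breed, and the resulting table of p-values from the Hardy-Weinberg or F-statistic tests would be passed to `sequential_bonferroni` as a single family of tests.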

Pipeline specifications

Software tools: GenAlEx, Genepop, PHYLIP
Application: Population genetic analysis
Organisms: Equus asinus