Computational protocol: Genetic diversity analysis of two commercial breeds of pigs using genomic and pedigree data

[…] Prior to analysis of the genotyping data, 116 animals were excluded from the dataset using PLINK [] based on the following criteria: call rate lower than 0.90, a level of heterozygosity higher than 3 standard deviations from the mean, and duplicated samples (match level >99 %). PLINK [] was also used to check for pedigree errors based on IBD levels (sire or dam to offspring and full-sibs IBD ~0.5, half-sibs IBD ~0.25, and first cousins IBD ~0.125) and for sex mis-assignments based on X chromosome inbreeding estimates (F), using the standard thresholds of F < 0.2 for females and F > 0.8 for males.

For the analysis of genomic inbreeding, SNPs with an unknown position based on the Pig60K_SNP_pos_build 10.2 (see […]), SNPs with a call rate lower than 0.90, and SNPs located on sex chromosomes were removed. A total of 45,766 SNPs were used to estimate genomic inbreeding in the LA and LW pig breeds.

Additional data pruning was performed with R snpStats (v 1.14.0) to prepare data for analyses of LD, correlation of linkage phase, and Ne []. SNPs were removed if they had a call rate lower than 0.98, a minor allele frequency (MAF) lower than 0.03, or deviated significantly from Hardy–Weinberg equilibrium (HWE; p < 10⁻⁶). The final dataset contained 41,041 SNPs for LA and 36,452 SNPs for LW, and a total of 2262 samples, i.e. 1168 for LA (91 males and 1077 females) and 1094 for LW (114 males and 980 females). Sporadically missing genotypes were imputed during the phasing procedure using FImpute software [].

Quality control was performed on both breeds together for genomic inbreeding estimates, while for estimates of Ne, LD and correlation of linkage phase, it was performed within breed so that SNPs were not penalized by the HWE criterion, since some SNPs can be fixed within one breed only. […]
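
The SNP-level thresholds used for the LD, linkage phase and Ne analyses (call rate of at least 0.98, MAF of at least 0.03, HWE p-value of at least 10⁻⁶) can be illustrated with a minimal Python sketch. This is not the snpStats code used in the study. The function names, the 0/1/2 genotype coding with `None` for missing calls, and the one-degree-of-freedom chi-square approximation for the HWE test are all assumptions made for illustration, and the test actually applied by snpStats may differ.

```python
import math
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the counted allele; None = missing call.
Genotype = Optional[int]

# Thresholds taken from the text; a SNP below any of them is removed.
CALL_RATE_MIN = 0.98
MAF_MIN = 0.03
HWE_P_MIN = 1e-6


def call_rate(genos: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at this SNP."""
    return sum(g is not None for g in genos) / len(genos)


def minor_allele_frequency(genos: Sequence[Genotype]) -> float:
    """Frequency of the rarer allele among called genotypes."""
    called = [g for g in genos if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def hwe_p_value(genos: Sequence[Genotype]) -> float:
    """Chi-square (1 df) p-value for departure from Hardy-Weinberg
    proportions. Illustrative approximation only (hypothetical helper)."""
    called = [g for g in genos if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    p = sum(called) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic SNP: no departure can be tested
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(genos: Sequence[Genotype]) -> bool:
    """True if the SNP is retained under all three thresholds."""
    return (
        call_rate(genos) >= CALL_RATE_MIN
        and minor_allele_frequency(genos) >= MAF_MIN
        and hwe_p_value(genos) >= HWE_P_MIN
    )
```

Applying `passes_snp_qc` to the genotypes of one breed at a time mirrors the within-breed QC described above: a SNP that is fixed in one breed fails the MAF filter only in that breed and does not distort the HWE test in the other.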

