Computational protocol: Prediction of Genes Related to Positive Selection Using Whole-Genome Resequencing in Three Commercial Pig Breeds

Protocol publication

[…] We used genomic DNA samples gathered from 12 males and 12 females of three pig breeds: Yorkshire, 2 males and 3 females; Landrace, 7 males and 6 females; and Duroc, 3 males and 3 females. Blood samples were collected for DNA extraction. Sample collection and DNA quality checks were performed according to the manufacturer's instructions. Next, we constructed genomic DNA libraries for each sample using TruSeq DNA Library kits (Illumina). The paired-end libraries were sequenced on an Illumina HiSeq 2000 platform. The paired-end sequence reads were aligned to the reference pig genome sequence from the University of California, Santa Cruz (UCSC; susScr3) using Bowtie2 with default settings. We used the following open-source software for resequencing data processing and SNP calling: Bowtie2, Picard tools 1.94, Samtools 0.1.19, Genome Analysis Toolkit (GATK) 2.6.4, and VCFtools 4.0. Substitution calling was performed using the GATK UnifiedGenotyper. We phased the haplotypes for the entire pig populations using BEAGLE. Picard tools was used to remove duplicate reads and to confirm all mate-pair information. Samtools was used to index the BAM files and to count mapped reads using the flagstat option. GATK was used for realignment and SNP calling from the resequencing data, and VCFtools was used to handle VCF files. After filtering, non-biallelic SNPs were excluded. [...] Based on the findings of positive selection, enrichment analysis was performed to examine the biological functions of genes in the detected regions. In this study, regions with SNPs under positive selection were extended approximately 10 kb upstream and downstream. We assembled genes located within the extended regions using RefGene from the UCSC Genome Browser (ver. hg19).
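The region-extension step can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a 10,000 bp flank for "approximately 10 kb", half-open gene coordinates as in a UCSC-style refGene table (`txStart`, `txEnd`), and that a gene counts as "located within" the extended region when it overlaps it at all. The names `Gene`, `extend_region`, `genes_in_region`, and the toy gene records are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Tuple

FLANK_BP = 10_000  # assumption: "approximately 10 kb" taken as exactly 10,000 bp


class Gene(NamedTuple):
    """One annotation record (hypothetical layout, modelled on refGene columns)."""
    name: str
    chrom: str
    start: int  # 0-based start, like txStart
    end: int    # exclusive end, like txEnd


def extend_region(start: int, end: int, flank: int = FLANK_BP) -> Tuple[int, int]:
    """Extend a region by `flank` bp on both sides, clamping the start at 0."""
    return max(0, start - flank), end + flank


def genes_in_region(genes: Iterable[Gene], chrom: str, start: int, end: int,
                    flank: int = FLANK_BP) -> List[str]:
    """Return sorted names of genes overlapping the extended region.

    Assumption: any overlap counts; requiring full containment would be
    a stricter reading of "located within".
    """
    lo, hi = extend_region(start, end, flank)
    return sorted({
        g.name for g in genes
        if g.chrom == chrom and g.start < hi and g.end > lo
    })


# Toy data, invented for illustration only.
genes = [
    Gene("GENE_A", "chr1", 95_000, 99_000),
    Gene("GENE_B", "chr1", 125_000, 130_000),
    Gene("GENE_C", "chr2", 100_000, 105_000),
]

# A selected region at chr1:105,000-110,000 becomes chr1:95,000-120,000.
hits = genes_in_region(genes, "chr1", 105_000, 110_000)
print(hits)
```

The resulting gene list is the kind of input that would then be passed to the enrichment step.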
Gene enrichment analysis was performed for Gene Ontology (GO) terms (biological process, molecular function, and cellular component) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the DAVID tool. […]
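To make the idea of term enrichment concrete, the sketch below computes a plain one-sided hypergeometric (Fisher-type) p-value for a single term. This is an illustration only. DAVID reports a modified Fisher exact p-value (the EASE score), which this code does not reproduce, and the protocol does not state which background gene set was used. The function name `enrichment_p_value` and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(pop_size: int, pop_hits: int,
                       list_size: int, list_hits: int) -> float:
    """Probability of seeing at least `list_hits` annotated genes by chance.

    pop_size  : genes in the background set
    pop_hits  : background genes annotated with the term
    list_size : genes in the candidate list (e.g. genes in selected regions)
    list_hits : candidate genes annotated with the term

    Plain hypergeometric upper tail; NOT DAVID's EASE score.
    """
    upper = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    tail = sum(
        comb(pop_hits, i) * comb(pop_size - pop_hits, list_size - i)
        for i in range(list_hits, upper + 1)
    )
    return tail / total


# Toy counts, invented for illustration: 20 background genes, 5 carry the
# term; the candidate list has 4 genes, 2 of which carry the term.
p = enrichment_p_value(pop_size=20, pop_hits=5, list_size=4, list_hits=2)
print(f"p = {p:.4f}")
```

In practice one such test is run per GO term or KEGG pathway, so the resulting p-values would normally be corrected for multiple testing.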

Pipeline specifications