Computational protocol: The Genetic Architecture of Climatic Adaptation of Tropical Cattle

Protocol publication

[…] For the present study, we used 2,112 Brahman and 2,550 Tropical Composite cattle from the resource population, genotyped using either the BovineSNP50 or the BovineHD BeadChip (Illumina Inc., San Diego, CA), the latter of which includes more than 770,000 SNP. All SNP had been mapped to the UMD build assembled by the University of Maryland, using the updated version 3.1 of the genome (available from Genbank accession DAAA00000000.2 and at http://www.cbcb.umd.edu/research/bos_taurus_assembly.shtml). Animals genotyped using the lower-density array had their genotypes imputed to the higher density based on the genotypes of relatives, consisting of 589 Tropical Composite and 304 Brahman animals, including all available sires, that had been genotyped using the BovineHD BeadChip. The imputation was performed within breed, using as reference 519 Brahman and 351 Tropical Composite animals genotyped with the BovineHD BeadChip (Illumina) and 30 iterations of BEAGLE, which resulted in a final dataset of 729,068 SNP genotypes per individual, as reported by Bolormaa et al. For the estimation of indicine content, the full genotype dataset was filtered by linkage disequilibrium (LD) to reduce redundant information and computation time. The LD filter was applied using PLINK v1.07 in a sliding window of 50 adjacent SNP: if r² > 0.5 was detected between a pair of SNP, one SNP of the pair was removed and LD for the window was recalculated. Once no pair of SNP in the window had r² > 0.5, the window moved 10 SNP along the chromosome. This procedure yielded a dataset containing 227,085 SNP. For SNP association analyses, the SNP were consistently encoded for all traits as AA, AB and BB using the TOP/BOT encoding of Illumina (http://res.illumina.com/documents/products/technotes/technote_topbot.pdf, checked 30 July 2014) and then converted into the numerical values 0, 1 and 2, counting the number of B alleles. [...]
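The sliding-window LD filter described above can be sketched in plain Python as follows. This is a minimal illustration, not the PLINK implementation: the function names `r_squared` and `ld_prune` are hypothetical, r² is computed as the squared Pearson correlation of 0/1/2 genotype counts, missing genotypes are not handled, and the choice of which SNP in a correlated pair to drop (here, the later one) is an assumption, because the protocol only states that one SNP of the pair was removed. In PLINK the window, step and threshold correspond to the arguments of `--indep-pairwise 50 10 0.5`, although the exact command line is not given in the source.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Monomorphic SNP: correlation is undefined, so treat it as "no LD".
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def ld_prune(genotypes, window=50, step=10, threshold=0.5):
    """Return the indices of SNP kept after sliding-window pairwise pruning.

    genotypes: list of per-SNP genotype lists, in chromosome order.
    """
    removed = set()
    n_snps = len(genotypes)
    start = 0
    while start < n_snps:
        stop = min(start + window, n_snps)
        changed = True
        while changed:
            # Rescan the window after every removal, mirroring the
            # "LD for the window was recalculated" step in the text.
            changed = False
            live = [i for i in range(start, stop) if i not in removed]
            for pos, i in enumerate(live):
                for j in live[pos + 1:]:
                    if r_squared(genotypes[i], genotypes[j]) > threshold:
                        removed.add(j)  # assumption: drop the later SNP
                        changed = True
                        break
                if changed:
                    break
        start += step  # slide the window along the chromosome
    return [i for i in range(n_snps) if i not in removed]


# Toy example with hypothetical genotypes for four SNP in six animals.
toy = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to the first SNP, so r2 = 1
    [2, 2, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],  # monomorphic
]
kept = ld_prune(toy)
```

On the toy data the second SNP is dropped because it is in complete LD with the first; the others are retained.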
We used PLINK v1.07 to compute multi-dimensional scaling (MDS) coordinates and genetic relationship matrices from the genotypes, to quantify the sample substructure in both the full dataset and the LD-filtered dataset. The Angus and Nelore data were used as representatives of pure European taurine and pure indicine animals, respectively. We note that, when the full diversity of cattle is included in the same analysis, indicine percent lines up with the first principal component of a principal component analysis, whereas African ancestry lines up with the second principal component. The Tropical Composite sample consisted of beef industry lineages formed using indicine, Sanga and taurine cattle, and included animals with African taurine and no reported indicine ancestry, such as some sectors of the Belmont Red breed. The Brahman breed in Australia started in the 19th century from various Indian cattle, including animals from the Melbourne Zoo, and was upgraded by American Brahman and the Indu-Brazilian breeds, so it also includes a small proportion of taurine ancestors. Its ancestry therefore includes breeds such as the Kankrej (Guzerat), the Ongole (Nelore), and the Gir (Gyr). To estimate the amount of taurine and indicine content in the Tropical Composite and Brahman animals, HD genotypes for 81 Angus (Beef CRC) and 91 Nelore cattle were used as a reference. Estimates of indicine ancestry were also obtained using 55 Hereford, 54 Shorthorn, 44 Angus and 50 Gir. Some Angus samples, together with the Hereford and Shorthorn samples, were obtained from the Beef CRC database; other Angus samples, together with the Nelore and Gir samples, were obtained from the Bovine HapMap, with some Nelore and Gir animals sampled from Brazil. Ancestry was estimated using ADMIXTURE software on the LD-filtered dataset (∼228 K SNP) under supervised mode. Most of these Tropical Composite cattle were descended from the Hereford or Shorthorn breeds as taurine ancestors.
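To make the idea of a supervised two-source ancestry estimate concrete, the sketch below maximises a simple binomial likelihood for one animal over a grid of candidate indicine fractions. It is a deliberately simplified stand-in and not the ADMIXTURE algorithm: it assumes that the B-allele frequencies of the indicine and taurine reference panels are already known and fixed, it handles one animal at a time, and the function name `estimate_indicine_fraction`, its parameters and the toy frequencies are all hypothetical.

```python
from math import log


def estimate_indicine_fraction(dosages, freq_indicine, freq_taurine, grid_size=1001):
    """Grid-search maximum-likelihood estimate of the indicine fraction q.

    dosages:        B-allele counts (0, 1 or 2) for one animal, one per SNP
    freq_indicine:  B-allele frequencies in the indicine reference panel
    freq_taurine:   B-allele frequencies in the taurine reference panel
    Under the model, the animal's expected B-allele frequency at a SNP is
    q * freq_indicine + (1 - q) * freq_taurine.
    """
    eps = 1e-9  # keeps log() finite when a reference frequency is 0 or 1
    best_q, best_loglik = 0.0, float("-inf")
    for k in range(grid_size):
        q = k / (grid_size - 1)
        loglik = 0.0
        for g, p_ind, p_tau in zip(dosages, freq_indicine, freq_taurine):
            f = q * p_ind + (1.0 - q) * p_tau
            f = min(max(f, eps), 1.0 - eps)
            # Binomial log-likelihood of g B alleles out of 2 (constant dropped).
            loglik += g * log(f) + (2.0 - g) * log(1.0 - f)
        if loglik > best_loglik:
            best_q, best_loglik = q, loglik
    return best_q


# Hypothetical reference frequencies at five SNP.
p_indicine = [0.9, 0.9, 0.9, 0.9, 0.9]
p_taurine = [0.1, 0.1, 0.1, 0.1, 0.1]

# An animal homozygous BB everywhere looks fully indicine under these
# frequencies; an animal homozygous AA everywhere looks fully taurine.
q_bb = estimate_indicine_fraction([2, 2, 2, 2, 2], p_indicine, p_taurine)
q_aa = estimate_indicine_fraction([0, 0, 0, 0, 0], p_indicine, p_taurine)
```

The grid search is chosen only for transparency. With hundreds of thousands of SNP and thousands of animals, a purpose-built tool such as ADMIXTURE, which also estimates the allele frequencies jointly, is the practical choice.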
Thereafter, to evaluate the potential impact of the reference breed on the indicine estimates, these were also obtained using different combinations of Hereford, Shorthorn and Gir. The correlations between the different estimates of indicine content were > 0.91 for these comparisons. To estimate the effect of indicine percent on adaptability-related traits, a linear model was fitted using SAS (SAS Inst., Cary, NC). The covariates used for the estimation of genetic parameters were used as fixed effects. Given the number of comparisons, the type I significance threshold was Bonferroni adjusted to α = 0.005. [...]
Genome-wide association studies (GWAS) were performed separately within each breed and for each of the ten traits using the final dataset of 729,068 SNP. The GWAS were performed one SNP at a time using the same univariate linear mixed models stated above. These models included the fixed effects of contemporary group (combination of year and location) and age of dam, the estimated indicine percent of the individual as a covariate, animal as a random additive effect, and the random residual component; the SNP genotype (recoded as 0, 1 or 2) was added as a further linear covariate. Solutions for the SNP effects and associated P-values were obtained using Qxpak5. As the coefficient for each SNP is provided as a signed number, the signs of significant coefficients of the same SNP for different traits can be compared to determine whether the SNP has effects in the same or opposite directions on two traits. To determine whether fitting indicine percent substantially changed the results, we compared GWAS output with and without indicine percent in the Brahman sample and found that the allele effects across traits had an average correlation of 0.97. […]
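The comparison of effect directions between two traits can be sketched as follows. This is an illustration only: the function name `compare_effect_directions`, the result layout (a mapping from SNP identifier to an effect and a P-value) and the example values are hypothetical, and the significance threshold is left as a parameter because the GWAS threshold is not stated in this excerpt.

```python
def compare_effect_directions(results_a, results_b, alpha):
    """Classify SNP that are significant for both traits by effect direction.

    results_a, results_b: dict mapping SNP id -> (effect, p_value) for each trait
    alpha:                significance threshold applied to both traits
    Returns two sorted lists: SNP with effects in the same direction and
    SNP with effects in opposite directions.
    """
    same, opposite = [], []
    for snp in sorted(set(results_a) & set(results_b)):
        effect_a, p_a = results_a[snp]
        effect_b, p_b = results_b[snp]
        if p_a >= alpha or p_b >= alpha:
            continue  # not significant for both traits
        product = effect_a * effect_b
        if product > 0:
            same.append(snp)
        elif product < 0:
            opposite.append(snp)
    return same, opposite


# Hypothetical single-SNP results for two traits.
trait_1 = {"snp1": (0.30, 1e-6), "snp2": (-0.20, 1e-7), "snp3": (0.10, 0.20)}
trait_2 = {"snp1": (0.25, 1e-5), "snp2": (0.15, 1e-6), "snp3": (0.40, 1e-8)}

same_dir, opposite_dir = compare_effect_directions(trait_1, trait_2, alpha=1e-4)
```

Here `snp1` is reported as acting in the same direction on both traits and `snp2` in opposite directions, while `snp3` is skipped because it is not significant for the first trait. The comparison is only meaningful because the genotypes are encoded consistently (0, 1 or 2 B alleles) for all traits.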

Pipeline specifications

Software tools: PLINK, ADMIXTURE, QxPak
Applications: Population genetic analysis, GWAS
Organisms: Bos taurus
Chemicals: Taurine