Computational protocol: Genome-wide association study for backfat thickness in Canchim beef cattle using Random Forest approach

[…] The animals used in this study came from seven herds of the Canchim Breeding Association located in two Brazilian states (São Paulo and Goiás). This research complies with the ethical principles of animal experimentation of the Embrapa Southeast Livestock Ethical Committee of Animal Use (CEUA-CPPSE) and was performed with the approval of CEUA-CPPSE under protocol number 02/2009. An initial sample of 987 animals (males and females) was evaluated for backfat thickness by in vivo ultrasound over the 12th rib at around 18 months of age. All evaluated animals were born between 2003 and 2005 and raised on natural pastures.
Estimated breeding values (EBVs) for these 987 animals were predicted by restricted maximum likelihood using the MTDFREML software. The animal model included the fixed effects of contemporary group (sex, year, herd, and genetic group) and age at measurement as a linear covariate; the additive genetic effect and the error were included as random effects. From these animals, a sample of 400 was selected considering EBV, accuracy, family size, and the proportion of males (196) to females (204). These 400 animals were the offspring of 50 different sires (1 to 30 offspring per sire). [...]
A pathway analysis was conducted to characterize the genomic regions identified by the set of SNPs previously selected and to identify candidate genes influencing biological functions and pathways related to backfat thickness and fat-related traits.
The software fastPHASE version 1.4.0 was used to reconstruct the haplotypes for each chromosome. The reconstructed haplotypes were then analyzed with the software Haploview (using default parameters) to estimate haplotype blocks and linkage disequilibrium (LD), which was calculated as the squared correlation coefficient between SNP pairs (r²).
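
To make the LD measure concrete, the following is a minimal sketch of how the squared correlation coefficient between two biallelic SNPs can be computed from phased haplotypes, and how it can be averaged over SNP pairs within a given distance. It is an illustration only and does not reproduce the Haploview implementation; the function names `r2_pair` and `mean_r2_within`, the 0/1 allele coding, and the toy data are assumptions made for this example.

```python
from itertools import combinations


def r2_pair(a, b):
    """Squared correlation between two biallelic SNPs.

    a, b: sequences of 0/1 alleles, one entry per phased haplotype.
    Uses r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x * y for x, y in zip(a, b)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # At least one SNP is monomorphic, so r^2 is undefined.
        return float("nan")
    return d * d / denom


def mean_r2_within(haplotypes, positions, max_dist):
    """Average r^2 over all SNP pairs at most max_dist base pairs apart.

    haplotypes: list of rows, one row per haplotype, one 0/1 allele per SNP.
    positions: base-pair position of each SNP on the same chromosome.
    """
    columns = list(zip(*haplotypes))
    values = []
    for i, j in combinations(range(len(positions)), 2):
        if abs(positions[i] - positions[j]) <= max_dist:
            r2 = r2_pair(columns[i], columns[j])
            if r2 == r2:  # skip undefined (NaN) pairs
                values.append(r2)
    return sum(values) / len(values) if values else float("nan")


# Toy data (hypothetical): SNP 0 and SNP 1 are in complete LD,
# SNP 2 is independent of both.
toy_haplotypes = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
toy_positions = [100_000, 200_000, 600_000]
```

In this sketch, calling `mean_r2_within(toy_haplotypes, toy_positions, 250_000)` averages only the pairs separated by at most 250 kb, which is the kind of distance-specific summary used to judge the extent of LD.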
Considering the extent of LD based on the overall average r² (average r² = 0.12 at a distance of 250 kb, data not shown), a window of 500 kb (SNP position ± 250 kb) surrounding each SNP previously selected by the stepwise regression was used to define the region for candidate gene discovery and pathway annotation.
The Cattle Genome Browser, with the UMD 3.1 cattle genome assembly, was used to visualize the selected SNPs and their surrounding regions and to locate and identify QTLs, genes, and other genomic landmarks of interest. Other databases, such as the NCBI BioSystems database and the Kyoto Encyclopedia of Genes and Genomes (KEGG), were also used for pathway annotation to gain insight into the biological processes involved in backfat deposition. […]
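
The 500 kb window described above can be sketched as a simple interval-overlap query. This is a minimal illustration, assuming gene annotations are available as records with chromosome, start, and end coordinates; the function names `snp_window` and `genes_in_window`, the record layout, and the gene names are hypothetical and are not taken from the original analysis, which relied on the genome browser.

```python
def snp_window(position, flank=250_000):
    """Return the (start, end) window of position +/- flank, clipped at base 1."""
    return max(1, position - flank), position + flank


def genes_in_window(chrom, position, genes, flank=250_000):
    """List the names of genes overlapping the window around a SNP.

    genes: iterable of dicts with keys "name", "chrom", "start", "end"
    (hypothetical layout; coordinates are inclusive base-pair positions).
    """
    start, end = snp_window(position, flank)
    return [
        g["name"]
        for g in genes
        # Two intervals overlap when each one starts before the other ends.
        if g["chrom"] == chrom and g["start"] <= end and g["end"] >= start
    ]


# Hypothetical annotation records, for illustration only.
toy_genes = [
    {"name": "GENE_A", "chrom": "1", "start": 700_000, "end": 760_000},
    {"name": "GENE_B", "chrom": "1", "start": 1_300_000, "end": 1_350_000},
    {"name": "GENE_C", "chrom": "2", "start": 900_000, "end": 1_100_000},
]
```

For a SNP at position 1,000,000 on chromosome 1, the window spans 750,000 to 1,250,000, so `genes_in_window("1", 1_000_000, toy_genes)` would keep a gene that only partly overlaps the window edge and exclude genes on other chromosomes.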

