Computational protocol: Phenotypic and molecular characterization of sweet sorghum accessions for bioenergy production

Protocol publication

[…] Leaf samples were collected from five plants per accession, and DNA was extracted using the DNeasy® Plant Mini Kit (QIAGEN, Germantown, Maryland, USA). The quality and quantity of the extracted DNA were checked on an agarose gel and with a NanoDrop® ND-1000 Spectrophotometer. Genotyping-by-sequencing (GBS) was performed by the Institute for Genomic Diversity at Cornell University. Genomic DNA from each sample was digested with ApeKI, and the bar-coded DNA samples were pooled and sequenced on a HiSeq 2000 platform (Illumina Inc., San Diego, California, USA). Sequencing reads were separated by accession and aligned to the BTx623 Sorghum bicolor reference genome, version 2.1, using the Burrows-Wheeler Aligner (BWA). SNPs were called using the GBS pipeline available in TASSEL. Subsequently, SNP markers were filtered to retain loci with a minor allele frequency (MAF) of at least 5% and at most 5% missing genotypes per locus.
[...] Genetic diversity analyses of the sweet sorghum accessions were conducted separately for the phenotypic and the molecular data. For the phenotypic data, all morphological and agro-industrial variables were first standardized, and the dissimilarity matrix between lines was then calculated using the average Euclidean distance. The relative contribution of each morphological and agro-industrial trait to the diversity was evaluated based on the Mahalanobis distance (D²), according to the method proposed by Singh, using the software Genes. Genetic distances between the sweet sorghum accessions were then calculated from the SNP data using the identity-by-state (IBS) coefficient in TASSEL. This similarity measure counts the alleles that are identical between two accessions, whether or not they were inherited from a common ancestor.
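The filtering and distance steps described above can be illustrated with a minimal Python sketch. It is not the TASSEL or Genes implementation: the genotype coding (0, 1, 2 copies of the alternate allele, None for missing), the function names, the use of the sample standard deviation for standardization, and the toy data are all assumptions made for illustration, and the real software may handle heterozygotes and missing calls differently.

```python
import math


def filter_snps(genotypes, min_maf=0.05, max_missing=0.05):
    """Return indices of SNP columns that pass both thresholds.

    genotypes: one list per accession; each call is 0, 1 or 2 (copies of
    the alternate allele) or None for a missing genotype (assumed coding).
    """
    n_acc = len(genotypes)
    kept = []
    for j in range(len(genotypes[0])):
        calls = [row[j] for row in genotypes if row[j] is not None]
        missing_rate = 1 - len(calls) / n_acc
        if not calls or missing_rate > max_missing:
            continue
        alt_freq = sum(calls) / (2 * len(calls))
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            kept.append(j)
    return kept


def ibs_distance(a, b):
    """1 - IBS similarity, averaged over loci called in both accessions."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")
    # Each locus contributes 2, 1 or 0 alleles identical by state.
    ibs = sum(2 - abs(x - y) for x, y in shared) / (2 * len(shared))
    return 1 - ibs


def average_euclidean(traits):
    """Pairwise average Euclidean distance on standardized trait columns.

    Assumes every trait varies across accessions (non-zero standard deviation).
    """
    n, p = len(traits), len(traits[0])
    cols = list(zip(*traits))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    z = [[(row[k] - means[k]) / sds[k] for k in range(p)] for row in traits]
    return [[math.sqrt(sum((z[i][k] - z[j][k]) ** 2 for k in range(p)) / p)
             for j in range(n)] for i in range(n)]


# Toy genotype matrix (4 accessions x 4 SNPs), invented for illustration.
genotypes = [
    [0, 2, 0, None],
    [0, 0, 1, 2],
    [0, 2, 2, 2],
    [0, 1, 0, 0],
]
kept = filter_snps(genotypes)  # monomorphic and high-missing loci are dropped
filtered = [[row[j] for j in kept] for row in genotypes]
print(kept, ibs_distance(filtered[0], filtered[1]))
```

In the toy matrix, the first SNP is monomorphic and the last has one missing call out of four, so only the two middle SNPs survive the filter.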
Based on the phenotypic and the molecular dissimilarity matrices, two separate cluster analyses were performed with the Neighbor-Joining method using the software DARwin. Clusters of sweet sorghum accessions were identified according to the nodes present in the Neighbor-Joining trees. The Mantel test was performed in the software Genes to test the significance of the correlation between the phenotypic and the molecular dissimilarity matrices, using 10,000 random permutations and a 5% significance level. Averages of the agro-industrial traits were estimated for each cluster obtained from the phenotypic and the molecular diversity analyses, and were compared using Duncan's test at a 5% significance level. In addition, a principal component analysis (PCA) was performed on the molecular similarity matrix to infer the population structure of the sweet sorghum accessions, using the R package pcaMethods, available from Bioconductor. […]
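As a rough illustration of the Mantel test mentioned above, the following Python sketch correlates the off-diagonal entries of two distance matrices and builds a null distribution by permuting the accession labels of one matrix. It is a generic one-sided permutation version, not the procedure implemented in Genes; the function names, the seed, the reduced permutation count in the example, and the toy matrices are assumptions made for illustration.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after relabeling rows/columns."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=10000, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    order = list(range(len(d2)))
    hits = 0
    for _ in range(permutations):
        # Shuffle accession labels of d2, permuting rows and columns together.
        rng.shuffle(order)
        if pearson(x, upper_triangle(d2, order)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Toy symmetric distance matrices for 4 accessions, invented for illustration.
d_pheno = [
    [0, 1, 3, 7],
    [1, 0, 2, 6],
    [3, 2, 0, 4],
    [7, 6, 4, 0],
]
d_mol = [[2 * v for v in row] for row in d_pheno]
r, p = mantel(d_pheno, d_mol, permutations=999)
print(r, p)
```

Because the toy molecular matrix is an exact multiple of the phenotypic one, the observed correlation is 1; with real data the correlation would be declared significant when the p-value falls below the chosen 5% level.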

Pipeline specifications

Software tools: BWA, TASSEL, DARwin, pcaMethods
Applications: Phylogenetics, GBS analysis
Organisms: Sorghum bicolor, Zea mays
Chemicals: Carbohydrates