Computational protocol: Evolutionary Analyses of GRAS Transcription Factors in Angiosperms

Protocol publication

[…] The manually curated inference of OGs was based on reciprocal BLASTp analyses, as described in Cenci et al. (). Briefly, a BLASTp analysis was performed for each GRAS sequence against the protein databases of the studied species; for each species, the best-scoring proteins were grouped, and their membership in the OG was confirmed by reciprocal BLASTp. For species lacking representatives in one or more OGs, BLASTp analyses were performed against the “nr” protein database (all non-redundant GenBank CDS translations + PDB + SwissProt + PIR + PRF, excluding environmental samples from WGS projects) to look for OG members in wider taxonomic groups.

To remain consistent with the GRAS classification used in previous studies and to obtain a more detailed picture of the GRAS family in angiosperms, subfamilies were established. A subfamily is usually defined as a group of sequences displaying a significant degree of sequence similarity, regardless of phylogenetic events. Here, a subfamily is instead defined by the orthogroups (OGs) it contains. Consequently, as a classification rank above the OG, a subfamily can include one to several OGs. Subfamily names were based on the name assigned to the first described gene. The names of OGs belonging to the same subfamily were differentiated by a hyphen and a number (Table ). Hyphens and numbers were not added to non-assembled OGs (in these cases a subfamily included only one OG).

[...] A phylogenetic reconstruction was performed for each OG, using all the GRAS sequences identified for that OG in the eight studied species. The phylogenetic analysis of the complete GRAS family was performed using all member sequences from the eight studied species, as well as those of A. trichopoda, V. vinifera, and P. dactylifera. In both cases, protein sequences were aligned with MAFFT (Katoh and Standley, ) via the EMBL-EBI bioinformatics interface (Li et al., ) using default parameters.
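
The reciprocal BLASTp confirmation described at the start of this section can be illustrated with a minimal sketch. It assumes that both searches were saved in BLAST tabular format (12 tab-separated columns, with the bit score in the last one) and it applies a strict reciprocal-best-hit rule, which is a simplification of the manually curated grouping used in the protocol. The function names and the sequence identifiers in the demo are hypothetical.

```python
def best_hits(tabular_lines):
    """Map each query ID to its highest-scoring subject ID.

    Assumes 12-column BLAST tabular rows: column 1 is the query,
    column 2 the subject, column 12 the bit score.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_lines, reverse_lines):
    """Return sorted (query, subject) pairs that are each other's best hit."""
    forward = best_hits(forward_lines)   # species A proteins vs species B
    reverse = best_hits(reverse_lines)   # species B proteins vs species A
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


if __name__ == "__main__":
    # Hypothetical identifiers and scores, for illustration only.
    filler = ["90.0", "100", "5", "0", "1", "100", "1", "100", "1e-50"]
    fwd = ["\t".join(["geneA1", "geneB1"] + filler + ["410"])]
    rev = ["\t".join(["geneB1", "geneA1"] + filler + ["405"])]
    print(reciprocal_best_hits(fwd, rev))
```

Pairs that fail the reciprocal check would be the candidates for manual inspection, for example against a wider database such as “nr”.
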
Conserved blocks were extracted from the alignments with Gblocks (http://molevol.cmima.csic.es/castresana/Gblocks_server.html) (Castresana, ), with the following options allowed: (i) smaller final blocks, (ii) gap positions within the final blocks, and (iii) less strict flanking positions. Phylogenetic trees were built with PhyML (Guindon et al., ), available at http://phylogeny.lirmm.fr/ (Dereeper et al., ), using the LG substitution model and the approximate likelihood-ratio test (aLRT) to assess branch support. Phylogenetic trees were visualized with MEGA6 (Tamura et al., ) and iTOL (http://itol.embl.de/) (Letunic and Bork, ).

[...] The location of genes in syntenic blocks was retrieved with the SynMap program (http://genomevolution.org/CoGe/SynMap.pl) using default parameters (Lyons et al., ), except for C. canephora, which was analyzed with the Coffee Genome Hub (http://coffee-genome.org/syntenic_dotplot) (Dereeper et al., ). […]
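
To give a feel for what the conserved-block extraction step does to an alignment, here is a deliberately simplified column filter. It is not the Gblocks algorithm, which also reasons about contiguous blocks and flanking positions; it only keeps columns that are not too gappy and in which one residue is sufficiently frequent. The thresholds, function names, and toy sequences are assumptions made for illustration.

```python
from collections import Counter


def conserved_columns(alignment, max_gap_fraction=0.5, min_identity=0.5):
    """Return indices of alignment columns that pass both thresholds.

    alignment: list of equal-length aligned sequences, with '-' for gaps.
    A column is kept when its gap fraction does not exceed
    max_gap_fraction and its most frequent residue occurs in at least
    min_identity of all sequences.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    n_seqs = len(alignment)
    kept = []
    for i in range(length):
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        gap_fraction = 1 - len(residues) / n_seqs
        if not residues or gap_fraction > max_gap_fraction:
            continue  # column is all gaps or too gappy
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / n_seqs >= min_identity:
            kept.append(i)
    return kept


def trim(alignment, columns):
    """Rebuild each sequence from the selected column indices only."""
    return ["".join(seq[i] for i in columns) for seq in alignment]


if __name__ == "__main__":
    toy = ["MKV-A", "MKI-A", "MRV-G", "M-VLA"]  # toy sequences, not real data
    cols = conserved_columns(toy)
    print(cols, trim(toy, cols))
```

Allowing gap positions inside retained blocks, as in the protocol, corresponds loosely to a non-zero `max_gap_fraction` here; setting it to 0 would discard every column containing a gap.
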

Pipeline specifications