Computational protocol: Genome-wide analysis of Aux/IAA and ARF gene families in Populus trichocarpa

Protocol publication

[…] Genes were initially identified using Pfam domain IDs assigned to predicted Populus gene models by the JGI (DOE Joint Genome Institute, CA) annotation pipeline, and by using Arabidopsis ARF and Aux/IAA proteins as queries in BLASTP searches of the predicted Populus proteins (Populus Genome v 1.1, January 2007). Populus proteins identified in this initial search were then used as queries in additional BLASTP searches of the predicted Populus protein set to identify divergent family members exhaustively. Gene models were checked for redundancy and validity based on gene structure, intactness of conserved motifs, EST support and synteny analysis. We also included an incomplete gene model (fgenesh4_pg.C_scaffold_1006000001) representing PoptrARF6.3, because this gene model is flanked by a sequence gap followed by the conserved ARF amino-terminal domains and could potentially be corrected into a complete gene model in the upcoming version of the genome. Furthermore, this gene showed very strong evidence of expression in microarray analyses. Sequence conservation and microsynteny between Populus gene models, Populus homeologous (duplicated) genomic regions and the Arabidopsis genome were analyzed using the VISTA Browser tool with default curve calculation parameters: a nucleotide sequence 'conservation identity' of 70% and a 'minimum conservation width' of 100 bp. One gene model per locus was used in this study; these included some JGI-annotated models representing the promoted or reference set. Note that in the current version of the genome (v 1.1), some scaffolds could represent haplotypes rather than unique unassembled genomic regions. Gene nomenclature follows the consensus standard established by the International Populus Genome Consortium (IPGC) to distinguish Populus trichocarpa from other Populus species. [...]
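The iterative BLASTP strategy described above (search with known proteins, then search again with every new hit until nothing new turns up) can be sketched as a simple closure computation. This is a minimal illustration, not the authors' pipeline: the function name `exhaustive_family_search`, the `search` callback standing in for a BLASTP run, and the toy identifiers are all hypothetical.

```python
from typing import Callable, Iterable, Set


def exhaustive_family_search(
    seeds: Iterable[str],
    search: Callable[[str], Set[str]],
) -> Set[str]:
    """Re-query with every newly found member until no new members appear.

    `search` is a hypothetical stand-in for one BLASTP run: it takes a query
    identifier and returns the identifiers of its significant hits.
    """
    found: Set[str] = set()
    queue = list(seeds)
    while queue:
        query = queue.pop()
        for hit in search(query):
            if hit not in found:
                found.add(hit)
                queue.append(hit)  # each new hit becomes a query in turn
    return found


# Toy hit table with made-up identifiers (not real gene models).
toy_hits = {
    "At_query_1": {"Pt_model_A"},
    "Pt_model_A": {"Pt_model_A", "Pt_model_B"},
    "Pt_model_B": {"Pt_model_A", "Pt_model_B", "Pt_model_C"},
    "Pt_model_C": {"Pt_model_C"},
}

family = exhaustive_family_search(
    ["At_query_1"], lambda query: toy_hits.get(query, set())
)
print(sorted(family))
```

In the toy table, the divergent model `Pt_model_C` is not hit by the original query and is only recovered through `Pt_model_B`, which is the case the second round of searches is meant to catch. Candidates found this way would still need the manual checks on gene structure, motifs, EST support and synteny.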
Conserved protein motifs were determined from CD-searches and using the MEME and MAST programs. Sequence identity between two genes was determined using the bl2seq tool. Multiple sequence alignments were performed using the MUSCLE program. Phylogenetic trees were constructed in two ways, using amino acid sequence alignments of conserved regions or of full-length sequences of all predicted proteins in the Aux/IAA and ARF gene families of Populus, Arabidopsis and rice. Arabidopsis and rice sequences were obtained from the TAIR and TIGR databases. Unrooted PHYLIP trees with 1,000 bootstrap replicates were generated by the neighbor-joining method using ClustalX 1.83. Phylograms were visualized in TreeView v1.6.6. Bayesian phylogenetic analysis of conserved, collated and aligned Aux/IAA (See Additional file ) and ARF amino acid sequences (See Additional file ) was performed using the MrBayes (version 3.1.2) package. We used the WAG amino acid substitution matrix, with among-site rate variation modeled by a discrete γ distribution with four equally probable categories. Two independent runs of 1–2 million Markov chain Monte Carlo generations, with four chains each, were performed. Trees were sampled every 100 generations, and the stationary phase and burn-in value were determined by plotting the likelihood scores against the number of generations. Posterior probabilities calculated from the consensus tree are shown on branches. […]
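The burn-in step above was done by inspecting a plot of likelihood scores against generations. As a rough numerical stand-in for that visual check, the sketch below reports the first sampled generation whose log-likelihood reaches the plateau defined by the tail of the run. The function `estimate_burnin`, its `tail_fraction` and `n_sd` parameters, and the synthetic trace are assumptions made for illustration; they are not part of MrBayes or of the published protocol.

```python
from statistics import mean, stdev
from typing import List, Tuple


def estimate_burnin(
    trace: List[Tuple[int, float]],
    tail_fraction: float = 0.5,
    n_sd: float = 2.0,
) -> int:
    """Return the first generation whose log-likelihood reaches the plateau.

    The plateau is taken (by assumption) as the mean of the last
    `tail_fraction` of samples minus `n_sd` standard deviations.
    """
    tail_start = int(len(trace) * (1 - tail_fraction))
    tail = [lnl for _, lnl in trace[tail_start:]]
    threshold = mean(tail) - n_sd * stdev(tail)
    for generation, lnl in trace:
        if lnl >= threshold:
            return generation
    return trace[-1][0]


# Synthetic trace sampled every 100 generations: the log-likelihood climbs
# for ten samples, then fluctuates by +/-1 around -4500.
trace = [
    (g * 100, -5000 + min(g, 10) * 50 + (1 if g % 2 else -1))
    for g in range(40)
]

burnin_generation = estimate_burnin(trace)
burnin_samples = burnin_generation // 100  # samples to discard at this sampling rate
print(burnin_generation, burnin_samples)
```

A heuristic like this is only a starting point. The plotted trace should still be examined, and the two independent runs compared, before fixing the burn-in value used to build the consensus tree.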

Pipeline specifications

Software tools: BLASTP, FGENESH, VISTA Browser, MEME, MUSCLE, PHYLIP, Clustal W, TreeViewX, MrBayes
Databases: TAIR, Pfam
Applications: Phylogenetics, Nucleotide sequence alignment, Genome data visualization
Organisms: Arabidopsis thaliana, Populus trichocarpa, Oryza sativa
Diseases: Hypoprothrombinemias