Computational protocol: Bioinformatic mapping of AlkB homology domains in viruses

Protocol publication

[…] The NCBI nr protein sequence database was searched with PSI-Blast [], with the output limited to viral sequences. Multiple alignments were made with ClustalX version 1.8 []. The phylogenetic tree in Figure was made from ClustalX alignments with MEGA2 [], using the neighbour-joining (NJ) approach with complete deletion of gap positions, Poisson correction of distances and 500 bootstrap replicates. Phylogenetic trees for sequence regions from sequences with AlkB domains were made with the NJ approach as described above, but with 10,000 bootstrap replicates. Corresponding trees were also made with the maximum likelihood (ML) approach in Tree-Puzzle version 5.2 [], using an exact likelihood function, the VT matrix [] and 10,000 puzzling steps. The trees from Tree-Puzzle were visualised with TreeView version 1.6.6 [], and the NJ and ML trees were compared with Component version 2.0 []. The significance of pairwise tree distances was estimated using the data of Day []. Pairwise distances between sequences, for comparing the evolution of AlkB domains with that of other viral domains, were computed directly from ClustalX alignments with local tools, using the Blosum50 mutation matrix [] but without any correction for multiple substitutions. Motifs in protein sequences were identified using HMMER version 2.3.2 [] with the Pfam library version 11.0 []. A Pfam-type profile for the methyltransferase domains of Flexiviridae and Tymoviridae was generated from a ClustalX alignment, using hmmbuild and hmmcalibrate from the HMMER package. Visualisations of motif positions in viral sequences were generated directly from the HMMER output files, using a local tool as an interface to the GNU [] groff software. Systematic large-scale searches with polyprotein subsequences were done locally with PSI-Blast and the NCBI reference sequence database []. Dot plots for comparison of viral protein sequences were generated with Dotter version 3.0 []. […]
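
The distance settings named above for the NJ trees (complete deletion of gap positions, Poisson correction) can be illustrated with a short Python sketch. This is not the MEGA2 implementation. The function names and the toy alignment are hypothetical, and the correction is assumed to be the standard Poisson formula d = -ln(1 - p), where p is the proportion of differing sites after gapped columns are removed.

```python
import math
from itertools import combinations


def complete_deletion(alignment):
    """Drop every alignment column in which any sequence has a gap ('-')."""
    seqs = list(alignment.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(len(seqs[0])) if all(s[i] != "-" for s in seqs)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p) between two gap-free sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)  # proportion of differing sites
    if p == 0.0:
        return 0.0
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)


def distance_matrix(alignment):
    """Pairwise Poisson distances after complete deletion of gapped columns."""
    trimmed = complete_deletion(alignment)
    return {
        (m, n): poisson_distance(trimmed[m], trimmed[n])
        for m, n in combinations(trimmed, 2)
    }


# Hypothetical toy alignment, not data from the protocol.
toy = {
    "seqA": "MKV-LHA",
    "seqB": "MKVALHA",
    "seqC": "MRV-LQA",
}
dist = distance_matrix(toy)
```

In the toy alignment the fourth column is removed for all three sequences because two of them carry a gap there, so seqA and seqB become identical, while seqA and seqC differ at 2 of the 6 remaining sites. A matrix of this kind is the input that a neighbour-joining implementation would cluster.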

Pipeline specifications

Software tools: BLASTP, Clustal W, MEGA, TREE-PUZZLE, TreeViewX, HMMER, Dotter
Databases: Pfam
Applications: Phylogenetics, Genome data visualization
Organisms: Escherichia coli