Computational protocol: Evolutionary mechanisms driving the evolution of a large polydnavirus gene family coding for protein tyrosine phosphatases

Protocol publication

[…] We also searched public databanks for homologous PTP genes in CcBV, CvBV (Cotesia vestalis bracovirus, previously known as Cotesia plutellae bracovirus), GiBV, GfBV and MdBV (see accession numbers in Additional file ). Twenty-eight PTP sequences from CvBV, 42 from GiBV, 32 from GfBV, 13 from MdBV and 27 from CcBV were retrieved from NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Both retrieved and newly isolated PTPs were used for phylogenetic and selection analyses. For the analysis of duplications, the maps of the PTP genes on the virus segments (Figure , Additional files , and ) are based on the gene positions indicated in GenBank.

The ClustalX alignment [] of translated sequences was corrected manually based on the PTP conserved motifs []. This alignment (Additional file ) was then submitted to Gblocks 0.91b [] to eliminate poorly aligned positions and divergent regions that could be misleading in phylogenetic analyses.

This sequence alignment was used to construct a tree giving an overview of PTP evolution in Microgastrinae. The GTR + I + G model of sequence evolution was selected for most clades using Modeltest version 3.7 [] according to the likelihood ratio test (LRT) and the Akaike information criterion (AIC). For the IZCα clade the HKY 85 + G model was selected. Bayesian MCMC analyses were performed for the entire data set using MrBayes version 3.12 []. Two independent analyses were run simultaneously for each data set, each consisting of 10^6 generations, sampled every 10^3 generations, using four chains and uniform priors. Maximum likelihood (ML) analysis was performed with the PHYML program [] using the same evolutionary model. Topology and branch-length estimation were repeated 1000 times for the bootstrap test. [...]
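To make the Bayesian run settings above concrete, the following Python sketch assembles a MrBayes command block from the stated parameters. It is an illustration only: the protocol does not give its command file, so the helper name `mrbayes_block`, the reading of "two independent analyses" as `nruns=2`, and the absence of any burn-in or prior statements are assumptions. The `lset` and `mcmc` options used are standard MrBayes settings for GTR + I + G (`nst=6 rates=invgamma`) and HKY + G (`nst=2 rates=gamma`).

```python
def mrbayes_block(model="GTR+I+G", ngen=10**6, samplefreq=10**3,
                  nchains=4, nruns=2):
    """Return a MrBayes command block as text.

    Hypothetical helper for illustration; not taken from the protocol.
    """
    # Map the two substitution models named in the protocol onto
    # the corresponding MrBayes `lset` options.
    lset_options = {
        "GTR+I+G": "nst=6 rates=invgamma",
        "HKY+G": "nst=2 rates=gamma",
    }
    if model not in lset_options:
        raise ValueError(f"unsupported model: {model}")

    lines = [
        "begin mrbayes;",
        f"  lset {lset_options[model]};",
        # Generations, sampling interval, chains and runs as described
        # in the text (nruns=2 is an assumed interpretation).
        f"  mcmc ngen={ngen} samplefreq={samplefreq} "
        f"nchains={nchains} nruns={nruns};",
        "end;",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(mrbayes_block())          # settings for most clades
    print(mrbayes_block("HKY+G"))   # settings for the IZCα clade
```

The resulting block would be appended to a NEXUS alignment file before running MrBayes; convergence checks and burn-in would have to be chosen separately, since the excerpt does not specify them.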
Consensus trees were used as the phylogenetic hypothesis for fitting models of the nonsynonymous to synonymous substitution rate ratio (ω = dN/dS) to each clade using PAML 4.2 []. Six different models of site- and/or branch-specific ω ratios [,] were optimised using Bayesian methods in PAML 4.2 []. The maximum likelihoods of nested models were compared by the likelihood ratio test (LRT).

Site-specific positive selection was tested by comparing the selective model M8 (ω > 1) with the non-selective model M8a (ω = 1) by LRT []. Branch-specific selection was tested by comparing model M0b (branch-specific selection and no variation among sites) with M0 (no branch- or site-specific selection) using an LRT. Finally, the branch + site models were developed to address positive selection at a subset of sites on branches specified a priori. Model MA defines four classes of sites, where the last two classes have ω > 1 on the lineage of interest and ω < 1 on the remaining branches. This model is compared with MAnull, which imposes ω = 1 for the latter two classes. […]
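The nested-model comparisons described above all rest on the same calculation: twice the difference in log-likelihood is compared with a chi-square distribution whose degrees of freedom equal the difference in free parameters. A minimal Python sketch follows, using only the standard library. The function names, the `boundary_mixture` option and the example log-likelihoods are assumptions made for illustration; the protocol does not report which null distribution was used. The option reflects a convention often suggested when the null fixes ω = 1 at the boundary of the parameter space (as in M8a or MAnull): a 50:50 mixture of a point mass at zero and a chi-square distribution.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail and add the recurrence terms.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def lrt(lnl_null, lnl_alt, df, boundary_mixture=False):
    """Likelihood ratio test for nested models; returns (statistic, p-value)."""
    # Numerical noise can make the statistic slightly negative; clamp at 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p = chi2_sf(stat, df)
    if boundary_mixture:
        # Assumed 50:50 mixture of a point mass at 0 and chi-square(df).
        p = 0.5 * p if stat > 0 else 1.0
    return stat, p


if __name__ == "__main__":
    # Made-up log-likelihoods, e.g. for an M8a (null) vs M8 comparison.
    stat, p = lrt(lnl_null=-2345.67, lnl_alt=-2341.12, df=1)
    print(f"2*delta lnL = {stat:.2f}, p = {p:.4g}")
```

For M8 versus M8a and MA versus MAnull the alternative has one extra free parameter, so `df=1` is the natural choice; for M0b versus M0 the degrees of freedom depend on how many branch-specific ω classes are defined.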

Pipeline specifications

Software tools: Clustal W, Gblocks, ModelTest-NG, MrBayes, PhyML, PAML
Application: Phylogenetics
Diseases: Parasitic Diseases