Computational protocol: Epitope Discovery with Phylogenetic Hidden Markov Models

Protocol publication

[…] Given an inferred topology (see below) and assuming F3 × 4 codon equilibrium frequencies, all other mutational parameters in equation and the branch lengths were estimated by maximum likelihood in HyPhy () under a standard MG94 × HKY85 Dual GDD 3 × 3 model of codon substitution (i.e., without the factor involving ζ in equation ). Site likelihoods were then computed under the Halpern–Bruno model defined by equation for discrete values of γ and γ0, given the previously inferred mutational parameters and branch lengths. The hyperparameters of the prior distributions on γ and γ0 and the transition probabilities between hidden states were optimized with the Baum–Welch algorithm, implemented in the R Language and Environment for Statistical Computing (R Development Core Team 2009). The initial state probabilities were constrained such that the Viterbi path always begins in the nonepitope state. The computer code is freely available from the corresponding author.

To determine the extent to which our results depend on the inferred phylogeny, we estimated several maximum likelihood topologies using PhyML () and GARLI (). Fifteen trees were estimated in PhyML from different starting topologies under a general time reversible model of nucleotide substitution, with rate variation among sites modeled by a four-category discrete gamma distribution with unit mean. Tree rearrangements were performed by subtree pruning and regrafting and by nearest neighbor interchange. A further 20 trees were inferred with GARLI under a GY94 × HKY85 codon substitution model with nonsynonymous rates drawn from a three-category general discrete distribution. We trained our phylo-HMMs on several of the highest scoring topologies and did not find any substantive differences in our results. The phylogeny ultimately used here was resolved in PhyML and had the highest likelihood score computed in HyPhy under the MG94 × HKY85 Dual GDD 3 × 3 model of codon substitution. […]
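The effect of constraining the initial state probabilities can be illustrated with a minimal two-state Viterbi decoder. This is only a sketch in Python, not the authors' R code. The state labels, the function names `safe_log` and `viterbi`, and all numbers are hypothetical toy values. In the protocol, the per-site likelihoods would come from the codon model and the transition probabilities from Baum–Welch training.

```python
import math

# Hypothetical state labels for this sketch.
NONEPITOPE, EPITOPE = 0, 1


def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(site_lik, trans, init):
    """Return the most probable hidden-state path.

    site_lik: list of per-site likelihoods, one value per state
    trans:    trans[i][j] = P(state j at next site | state i at this site)
    init:     initial state probabilities
    """
    n_states = len(init)
    # Log score of the best path ending in each state at the first site.
    score = [safe_log(init[s]) + safe_log(site_lik[0][s]) for s in range(n_states)]
    backptrs = []
    for lik in site_lik[1:]:
        new_score, ptr = [], []
        for j in range(n_states):
            # Best predecessor state for arriving in state j.
            best_i = max(range(n_states),
                         key=lambda i: score[i] + safe_log(trans[i][j]))
            ptr.append(best_i)
            new_score.append(score[best_i] + safe_log(trans[best_i][j])
                             + safe_log(lik[j]))
        score = new_score
        backptrs.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy inputs (assumed values, not taken from the study).
init = [1.0, 0.0]                      # all initial mass on the nonepitope state
trans = [[0.9, 0.1], [0.2, 0.8]]
site_lik = [(0.2, 0.8), (0.1, 0.9), (0.1, 0.9), (0.9, 0.1)]

path = viterbi(site_lik, trans, init)
print(path)  # [0, 1, 1, 0]
```

Although the first site's likelihood favors the epitope state, the zero initial probability forces the decoded path to start in the nonepitope state, which is the behavior the constraint is meant to guarantee.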

Pipeline specifications

Software tools: HyPhy, PhyML, GARLI
Application: Phylogenetics
Organisms: Human immunodeficiency virus 1