Computational protocol: Similarity in Shape Dictates Signature Intrinsic Dynamics Despite No Functional Conservation in TIM Barrel Enzymes

Protocol publication

[…] We used protein structures from five different superfamilies, referred to as Triose phosphate isomerase (TIM), Aldolase class I (ALD1), Enolase (ENOL), Chitinase (CHTN) subfamily of the Glycosidases (GLYC) and Phosphatidylinositol (PI) phospholipase C (PIPLC) in Nagano et al. [], summarised in . According to the phylogenetic analysis in this review, the protein families (nomenclature used is in parentheses) cluster as follows: TIM and ALD1 are closely linked, followed by ENOL, with CHTN and PIPLC being distant outliers. CHTN and PIPLC are also considered to be two of the four superfamilies that have little evolutionary link to the rest of the superfamilies. These superfamilies relate to each other at the Topology level in the CATH database [] as of January 2011. Since then, two of the superfamilies, TIM and ALD1, have been reclassified to be part of the same Homology level [].
For the purposes of comparison, we picked five representative structures, one from each of these superfamilies, which are further annotated in (). These structures all exist as part of dimers, and have varying lengths that include additional secondary structures, further illustrated in . The structures are also treated as monomers, even though most come as dimers. As the enzymes chosen are subject to the CATH domain classification, we found it appropriate to exploit the structural information that the classification provides as a starting point, as has been done previously by Zen et al. []. Moreover, we find that the conformation of a subunit isolated from an oligomer is able to capture the influence of the interactions of other subunits []. The structures were prepared according to the domain annotation found in CATH, which included the truncation of two structures: the first domain of 1KKO and a domain sitting on loop 7 of 1E15.
This resulted in a set of structures of varying length, with 1KKO as the smallest at 246 amino acids, followed closely by 1N55 (248), then 3CH0 (271), 3CWN (315) and 1E15 (355). Three of the five structures bind to a phosphate moiety in their substrate (1N55, 3CWN and 3CH0), while two structures, 1KKO and 3CH0, have Mg²⁺ and Ca²⁺ metal ions as co-factors, respectively (). There is no consensus on the positions of their catalytic or substrate binding sites on the fold. We also investigated the biological assemblies as provided by the Protein Data Bank (PDB) for comparison (Cf ).
For the first half of the analysis, we also included homologues from each of these superfamilies. The homologues were retrieved with Blastp searches against the PDB database. For each query, the hits with the lowest E-values were chosen, provided that they were not identical to the query sequence (i.e. below 99% identical) and did not have the same taxonomic rank. This led us to pick four additional structures for all the superfamilies except for the Enolase, where only two other structures were found to fit the criteria, resulting in a total of 23 structures analysed (). [...]
The sequence alignments performed to determine sequence identities between the various structures were generated using MUSCLE [] as implemented in JALVIEW []. The web service SIAS was then used to calculate the sequence identity, which is defined as
$$S = 100 \cdot \frac{I}{L} \tag{1}$$
where $I$ is the number of identical residues and $L$ is the length of the alignment, including the gaps.
We obtained the structural alignments from MUSTANG []. This alignment program was one of the top three performers in a benchmarking study [] and has been reported to align TIM Barrel proteins reliably []. It aligns the structures using the topological information from the Cα atoms in the backbone via an optimised progressive pairwise algorithm. The resulting alignment was used in the FASTA format for the comparative analysis of the intrinsic dynamics. [...]
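The sequence identity of Eq. 1 can be illustrated with a minimal sketch. This is not the SIAS implementation: the function name `sequence_identity` and the gap character `-` are assumptions made for illustration, and L is taken here as the number of columns in the two aligned rows as supplied.

```python
def sequence_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity S = 100 * I / L for two aligned rows of equal length.

    I counts columns where both rows carry the same residue.
    L is the full alignment length, gap columns included.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("alignment is empty")
    # A column of two gaps is not counted as an identical residue pair.
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)
    return 100.0 * identical / len(seq_a)


if __name__ == "__main__":
    # Toy aligned rows (hypothetical), 3 identical columns out of 6.
    print(sequence_identity("ACDE-G", "ACE-FG"))
```

Because gaps are kept in the denominator, insertions and deletions lower the identity score even when all aligned residues match.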
The correlation matrix as defined by Ichiye and Karplus [] is calculated from the normal modes. Each element in the matrix quantifies the coupling between two atoms $i$ and $j$ as:
$$C_{ij} = \frac{\sum_{m=1}^{3N-6} \frac{1}{\lambda_m} [v_m]_i \cdot [v_m]_j}{\left(\sum_{m=1}^{3N-6} [v_m]_i \cdot [v_m]_i\right)^{1/2} \cdot \left(\sum_{m=1}^{3N-6} [v_m]_j \cdot [v_m]_j\right)^{1/2}} \tag{9}$$
where $v_m$ and $\lambda_m$ are the eigenvector and eigenvalue of the $m$th normal mode, respectively, and the $i$ and $j$ subscripts denote the components of the mode corresponding to individual atoms, summed over all non-trivial modes. $C_{ij}$ is the expected inner product of the displacements of atoms $i$ and $j$, and ranges from -1 to 1, where -1 and 1 are maximal anti-correlation and correlation, respectively, and 0 represents a lack of any correlation.
For visual inspection, strong $C_{ij}$ correlation scores in the correlation matrix are collected as objects and represented as sticks in Figs and , as implemented in []. The correlation scores are selected at the 95th percentile rank of their absolute values, because positive and negative scores of equal magnitude describe correlations of the same strength. We chose a percentile threshold instead of a threshold on the value of the pairwise correlation because it identifies the most significant correlations within each protein structure. Without such a criterion, a hard cut-off on the correlation value would mean arbitrarily deciding which correlation values are relevant without a reliable reference. The correlated pairs of Cα atoms are then separated into those with positive correlations above the 95th percentile score and those with negative correlations below the negative of this score. Furthermore, only the correlations between atoms that are at least 8Å apart are considered, to filter out the pairs of Cα atoms whose correlations lie along the peptide backbone and are heavily influenced by adjacent bonding and interactions due to close proximity. Such nearby pairs of Cα atoms are also linked by the springs with the stronger force constants in the ENM.
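A mode-based correlation matrix of this kind can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' implementation. The function name `correlation_matrix` and the input layout (one mode per row, with x, y, z components per Cα atom) are assumptions. The sketch also assumes that the 1/λ weighting is applied to the normalisation terms as well as to the numerator, which guarantees a unit diagonal and values within [-1, 1]; the trivial modes are assumed to have been removed by the caller.

```python
import numpy as np


def correlation_matrix(eigvals, eigvecs):
    """Normalised cross-correlation between atoms from normal modes.

    eigvals: shape (M,), eigenvalues of the M non-trivial modes.
    eigvecs: shape (M, 3N), one mode per row (x, y, z per atom).
    Returns an (N, N) matrix with values in [-1, 1].
    """
    eigvals = np.asarray(eigvals, dtype=float)
    modes = np.asarray(eigvecs, dtype=float)
    if modes.ndim != 2 or modes.shape[1] % 3 != 0:
        raise ValueError("eigvecs must have shape (M, 3N)")
    if eigvals.shape != (modes.shape[0],):
        raise ValueError("need exactly one eigenvalue per mode")
    n_modes, n_atoms = modes.shape[0], modes.shape[1] // 3
    v = modes.reshape(n_modes, n_atoms, 3)
    # cov[i, j] = sum_m (1 / lambda_m) * [v_m]_i . [v_m]_j
    cov = np.einsum("m,mia,mja->ij", 1.0 / eigvals, v, v)
    # Assumption: normalise with the same 1/lambda weighting (unit diagonal).
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)


if __name__ == "__main__":
    # Two atoms moving in opposite directions along x in a single toy mode.
    print(correlation_matrix([2.0], [[1.0, 0.0, 0.0, -1.0, 0.0, 0.0]]))
```

Low-frequency modes dominate the sum because each mode is weighted by the inverse of its eigenvalue.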
When examining significant correlations that originate at the β-strands, the distance threshold is reduced to 4Å, which corresponds to the approximate distance between the Cα atoms in adjacent strands (), and the score threshold is raised to the 97.5th percentile rank. The objects resulting from the search are visualised using the molecular graphics program PyMOL [] as sticks between atom pairs, in red when positive and blue when negative. […]
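The selection of correlated Cα pairs by percentile and distance can be sketched as below. The function name `select_correlated_pairs` and its parameters are hypothetical. The sketch also assumes that the percentile is computed over the absolute values of all unique atom pairs before the distance filter is applied, a detail the text does not specify.

```python
import numpy as np


def select_correlated_pairs(corr, coords, percentile=95.0, min_dist=8.0):
    """Split strongly correlated pairs into positive and negative sets.

    corr:   (N, N) symmetric correlation matrix.
    coords: (N, 3) C-alpha coordinates in Angstrom.
    Returns (positive_pairs, negative_pairs, cutoff).
    """
    corr = np.asarray(corr, dtype=float)
    coords = np.asarray(coords, dtype=float)
    iu, ju = np.triu_indices(corr.shape[0], k=1)  # unique pairs with i < j
    values = corr[iu, ju]
    # Percentile of |C_ij|: equal magnitudes are treated as equal strength.
    cutoff = np.percentile(np.abs(values), percentile)
    # Keep only pairs at least min_dist apart to drop near-neighbour pairs.
    far = np.linalg.norm(coords[iu] - coords[ju], axis=1) >= min_dist
    pos = far & (values > cutoff)
    neg = far & (values < -cutoff)
    positive = list(zip(iu[pos].tolist(), ju[pos].tolist()))
    negative = list(zip(iu[neg].tolist(), ju[neg].tolist()))
    return positive, negative, cutoff


if __name__ == "__main__":
    # Four toy atoms on a line, 5 Angstrom apart (hypothetical data).
    xyz = np.array([[0.0, 0, 0], [5.0, 0, 0], [10.0, 0, 0], [15.0, 0, 0]])
    c = np.eye(4)
    for (i, j), val in {(0, 1): 0.9, (0, 2): 0.6, (0, 3): -0.95,
                        (1, 2): 0.2, (1, 3): 0.05, (2, 3): 0.3}.items():
        c[i, j] = c[j, i] = val
    print(select_correlated_pairs(c, xyz, percentile=50.0, min_dist=8.0))
```

Under these assumptions, the general search corresponds to `percentile=95.0, min_dist=8.0` and the β-strand search to `percentile=97.5, min_dist=4.0`. The returned pairs could then be drawn as red (positive) or blue (negative) sticks in a molecular viewer.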

Pipeline specifications