Computational protocol: Exploring the Mechanism Responsible for Cellulase Thermostability by Structure-Guided Recombination

Protocol publication

[…] SCHEMA is a structure-guided computational approach for creating chimeric proteins that retain proper folding and function while exploring other sequence-linked properties, such as stability. SCHEMA algorithms identify sites for recombining homologous proteins that minimize structural disruption by maximizing the retention of parental residue–residue contacts in the folded structures. Noncontiguous recombination identifies blocks of sequence that are contiguous in the 3-D structure but not necessarily contiguous in the primary sequence. Contacts (residues that are < 4.5 Å apart) are identified from one or more of the crystal structures, and the SCHEMA energy E of a given chimera is calculated by counting the number of residue–residue contacts that are disrupted by recombination. Partition sites of the aligned homologous proteins are chosen to minimize the average SCHEMA energy of all possible chimeras made by recombining those sequence fragments.

In this study, noncontiguous SCHEMA recombination was designed as previously described. The SCHEMA algorithm uses sequence alignment and structure data to create a SCHEMA contact map for chimera design. In the generated structures, the algorithm was set to consider any two amino acids to be in contact if any of their atoms, excluding hydrogens, are within 4.5 Å of each other. A SCHEMA contact map was first generated for each parent. During recombination, contacts that are not conserved among the parental proteins were considered broken, so a final ‘average’ contact map was built by weighting each contact by its retention among the parents (0.5 if present in a single parent, 1 if present in both). The SCHEMA contact map can be abstracted as a graph in which every node represents a non-conserved residue and every edge carries the average weighted SCHEMA contact between two residues.
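The contact-map and energy definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy coordinates, and the four-residue toy sequences are all hypothetical. The energy follows the commonly used SCHEMA convention, assumed here, in which a contact counts as disrupted when the residue pair found in the chimera does not occur at the same two positions in any parent.

```python
from itertools import combinations
from math import dist

CUTOFF = 4.5  # Å, heavy-atom contact threshold stated in the text


def contact_map(residues, cutoff=CUTOFF):
    """Return the set of residue pairs (i, j), i < j, that are in contact.

    `residues` is a list with one entry per residue; each entry is a list of
    (x, y, z) heavy-atom coordinates (hydrogens already excluded).
    """
    contacts = set()
    for i, j in combinations(range(len(residues)), 2):
        if any(dist(a, b) < cutoff for a in residues[i] for b in residues[j]):
            contacts.add((i, j))
    return contacts


def average_contact_map(parent_maps):
    """Weight each contact by the fraction of parents in which it occurs.

    With two parents this gives 0.5 (one parent) or 1.0 (both parents).
    """
    weights = {}
    for cmap in parent_maps:
        for pair in cmap:
            weights[pair] = weights.get(pair, 0.0) + 1.0 / len(parent_maps)
    return weights


def schema_energy(chimera, parents, contacts):
    """Count contacts whose residue pair is absent from every parent (assumed convention)."""
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return broken


# Hypothetical toy data: four residues with one heavy atom each.
toy_residues = [[(0.0, 0.0, 0.0)], [(3.8, 0.0, 0.0)], [(7.6, 0.0, 0.0)], [(3.8, 3.0, 0.0)]]
toy_contacts = contact_map(toy_residues)
toy_parents = ["KLMN", "RVQS"]  # aligned toy parents, not real sequences
toy_energy = schema_energy("KLQS", toy_parents, toy_contacts)
toy_average = average_contact_map([{(0, 1), (1, 2)}, {(0, 1), (1, 3)}])
```

The all-pairs loop is quadratic in the number of residues, which is acceptable for a sketch. A real contact map for a full-length enzyme would normally use a spatial index instead.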
The problem of finding crossover locations that minimize the number of disrupted SCHEMA contacts, and thus yield low-disruption chimeras, can therefore be reformulated as the problem of minimizing the edges cut during graph partitioning, which was solved with the hMETIS graph partitioning suite.

Here, the amino acid sequence alignment of the parental enzymes BsCel5A and GsCelA was created using PROMALS3D [24]. Crystal structures 3PZT and 4XZB were used to create the BsCel5A and GsCelA SCHEMA contact maps, respectively. As the catalytic cores of BsCel5A and GsCelA share 58% sequence identity, an eight-block chimera design was selected, with an average SCHEMA energy ⟨E⟩ of 16.25 and an average of 47 mutations ⟨m⟩ relative to the closest parent. [...]

Both crystal structures (P1 and C10) were determined by molecular replacement with the MOLREP program of the CCP4 program suite, using the crystal structure of endo-1,4-beta-glucanase from Bacillus subtilis (PDB: 3PZT) as the search model. P1 and C10 crystals belong to space groups P2₁2₁2₁ and C222₁, respectively. Throughout the refinement, 5% of randomly selected data were set aside for cross-validation with Rfree values. Manual modifications of the models were performed using the program Coot. Difference Fourier (Fo-Fc) maps were calculated to locate the solvent molecules. Both crystal structures were refined using Refmac5. Data collection and final model statistics are shown in . The atomic coordinates and structure factors of P1 and C10 have been deposited in the Protein Data Bank with accession codes 4XZB and 4XZW, respectively. The molecular figures were produced using PyMOL and UCSF Chimera. […]
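The graph-partitioning reformulation of crossover selection described above can be made concrete with a toy Python sketch. It does not call hMETIS and does not reproduce its multilevel algorithm. It uses an exhaustive two-block search as a stand-in, which is only feasible for very small graphs. The function names, the `min_size` parameter, and the six-node example graph are hypothetical.

```python
from itertools import combinations


def cut_weight(edges, block_of):
    """Sum the weights of edges whose two residues fall in different blocks."""
    return sum(w for (u, v), w in edges.items() if block_of[u] != block_of[v])


def best_bipartition(nodes, edges, min_size=1):
    """Exhaustively find the two-block partition with the smallest cut weight.

    Exponential in the number of nodes; a brute-force stand-in for a real
    partitioner, intended only to show the objective being minimized.
    """
    nodes = sorted(nodes)
    first, rest = nodes[0], nodes[1:]  # pin one node to skip mirror-image splits
    best = None
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            block_a = {first, *extra}
            block_b = set(nodes) - block_a
            if len(block_a) < min_size or len(block_b) < min_size:
                continue
            block_of = {n: (0 if n in block_a else 1) for n in nodes}
            weight = cut_weight(edges, block_of)
            if best is None or weight < best[0]:
                best = (weight, block_of)
    return best


# Hypothetical toy graph: two tightly connected residue clusters joined by one
# contact that occurs in a single parent (weight 0.5).
toy_edges = {
    (0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0,
    (3, 4): 1.0, (3, 5): 1.0, (4, 5): 1.0,
    (2, 3): 0.5,
}
toy_cut, toy_blocks = best_bipartition(range(6), toy_edges, min_size=2)
```

The `min_size` constraint stands in for the balance constraints a real partitioner applies. Without some size constraint, the search could return trivially small blocks whenever that gives the cheapest cut.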

Pipeline specifications