Computational protocol: The Landscape of the Prion Protein's Structural Response to Mutation Revealed by Principal Component Analysis of Multiple NMR Ensembles

Protocol publication

[…] For each dataset being studied, a multiple sequence alignment of all structures, based on ATOM residues, was generated using EBI MUSCLE. This alignment and the corresponding structures were used as input to the Bio3D package within the R statistical program. Iterated rounds of structural superposition of PrP structures on Cα atoms, ignoring gap/insertion regions and missing residues, were performed to identify invariant core residues of PrP with a 1 Å core cutoff. The structurally invariant core was used as a reference frame for structural alignment of the PrP NMR models, and Cartesian coordinates of the aligned Cα atoms were used as input for principal component analysis (PCA).

PCA maps high-dimensional data into fewer dimensions by a linear transformation, and has been employed in several studies to provide insight into the nature of conformational changes within proteins and protein families. In this study, PCA finds the axes along which the high-dimensional ensemble of PrP protein structures can be best separated. The input is a coordinate matrix, X, of dimensions N by P, where N is the number of structures and P is three times the number of residues; each row of the matrix holds the Cα coordinates of one structure. PCA is based on diagonalization of the covariance matrix, C, with elements C_ij built from X as follows:

$$C_{ij} = \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle$$

where i, j run over all pairs of the P Cartesian coordinates (the columns of X), and ⟨ ⟩ denotes the average over the N structures under consideration.

Principal components (orthogonal eigenvectors) describe axes of maximal variance of the distribution of structures, and eigenvalues provide the percentage of variance (total mean square displacement) of atom positional fluctuations captured along each PC. Projecting PrP structures onto the conformational subspace defined by the largest PCs produces a low-dimensional “conformer plot”, which allows for the identification of dominant conformational changes and the characterization of inter-conformer relationships.
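The covariance diagonalization and projection described above can be sketched in a few lines of NumPy. This is an illustration only, not the authors' Bio3D/R workflow: the function name `pca_from_coords`, the synthetic random input, and the choice to divide by N (rather than N - 1) are all assumptions, and the input is assumed to be already superposed on the invariant core.

```python
import numpy as np

def pca_from_coords(X):
    """PCA of an (N structures x P coordinates) matrix of superposed
    C-alpha coordinates, by diagonalizing the covariance matrix."""
    X = np.asarray(X, dtype=float)
    D = X - X.mean(axis=0)                 # x_i - <x_i>, averaged over structures
    C = D.T @ D / X.shape[0]               # C_ij = <(x_i - <x_i>)(x_j - <x_j>)>
    evals, evecs = np.linalg.eigh(C)       # symmetric matrix: eigh returns ascending order
    order = np.argsort(evals)[::-1]        # re-sort so PC1 has the largest variance
    evals = np.clip(evals[order], 0.0, None)  # remove tiny negative round-off values
    evecs = evecs[:, order]
    pct = 100.0 * evals / evals.sum()      # percentage of variance per PC
    scores = D @ evecs                     # projection of each structure onto the PCs
    return evals, evecs, pct, scores

# Synthetic stand-in for real data (assumption): 20 structures, 10 residues.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 30))
evals, evecs, pct, scores = pca_from_coords(X)

# The first two score columns are the coordinates of a PC1 vs PC2 conformer plot.
conformer_plot_xy = scores[:, :2]
```

With real data, each row of `X` would be the flattened (x, y, z) Cα coordinates of one aligned model, and `pct[:2]` would give the share of variance shown in the two-dimensional conformer plot.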
Additionally, the relative displacement of each residue described by a given PC can be represented in a “residue contribution” plot. Together, the two plots allow for the identification of “conformationally variable subdomains” that are responsible for conformational clustering of the PrP structures and that contribute to the structural variation observed in the datasets. These subdomains represent the largest segments of structural plasticity within the prion protein, making them candidate sites in the PrP conversion process.

Variation among the models of an NMR ensemble poses a challenge for PCA: how does the selection of a particular model influence the structural variation of a dataset? To test the extent to which inter-model variation within an NMR ensemble influences the identification of variable PrP subdomains, we conducted PCA on randomly selected NMR models within the hPrP and mPrP datasets. Using the total hPrP (11 PDBs) and mPrP (14 PDBs) datasets listed above, one NMR model was selected at random from each of the NMR ensembles within that set, creating a subset of ‘representative’ NMR models for all the structures. The process was repeated 50 times and PCA was performed on each of the selected subsets. These random PCA runs on NMR models succeed in identifying the same variable subdomains as those identified using full ensembles, for both hPrP and mPrP. [...] Molecular figures have been rendered using PyMOL and VMD. […]
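The random model selection test and the per-residue contribution described above can be sketched as follows. This is a hedged illustration in NumPy, not the authors' code: the helper names, the synthetic ensembles (20 models each, 10 residues, variation planted in residues 3 to 5), and the noise level are assumptions; only the 11 structures (as in the hPrP set) and the 50 repetitions come from the text.

```python
import numpy as np

def sample_representatives(ensembles, rng):
    """Pick one model at random from each NMR ensemble.
    ensembles: list of (n_models x P) arrays of superposed coordinates."""
    return np.vstack([ens[rng.integers(len(ens))] for ens in ensembles])

def first_pc(X):
    """PC1 of the rows of X. The SVD of the centered data gives the same
    axes as diagonalizing the covariance matrix."""
    D = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(D, full_matrices=False)
    return vt[0]

def residue_contribution(pc):
    """Displacement magnitude per residue for one PC (length 3 x residues)."""
    return np.linalg.norm(pc.reshape(-1, 3), axis=1)

# Synthetic ensembles (assumption): structures differ mainly in residues 3-5.
rng = np.random.default_rng(1)
n_structures, n_models, n_res = 11, 20, 10
direction = np.zeros((n_res, 3))
direction[3:6] = 1.0                         # planted "variable subdomain"
base = rng.normal(size=n_res * 3)
amplitudes = np.linspace(-5.0, 5.0, n_structures)
ensembles = [
    base + a * direction.ravel() + 0.05 * rng.normal(size=(n_models, n_res * 3))
    for a in amplitudes
]

# Repeat the random selection 50 times and run PCA on each subset.
contribs = np.array([
    residue_contribution(first_pc(sample_representatives(ensembles, rng)))
    for _ in range(50)
])
mean_contrib = contribs.mean(axis=0)

# Residues with the largest average PC1 contribution across the random runs.
variable_residues = sorted(int(i) for i in np.argsort(mean_contrib)[-3:])
```

If the same residues dominate `mean_contrib` regardless of which models are drawn, the variable subdomains are robust to inter-model variation, which is the question the random runs are designed to answer.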

Pipeline specifications

Software tools: MUSCLE, PyMOL, VMD
Applications: Protein structure analysis, Nucleotide sequence alignment
Organisms: Mus musculus, Homo sapiens