Article pubs.acs.org/JPCB
Hot Spot of Structural Ambivalence in Prion Protein Revealed by Secondary Structure Principal Component Analysis
Norifumi Yamamoto*
Department of Life and Environmental Sciences, Faculty of Engineering, Chiba Institute of Technology, 2-17-1 Tsudanuma, Narashino 275-0016, Japan
ABSTRACT: The conformational conversion of proteins into an aggregation-prone form is a common feature of various neurodegenerative disorders including Alzheimer’s, Huntington’s, Parkinson’s, and prion diseases. In the early stage of prion diseases, secondary structure conversion in prion protein (PrP) causing β-sheet expansion facilitates the formation of a pathogenic isoform with a high content of β-sheets and strong aggregation tendency to form amyloid fibrils. Herein, we propose a straightforward method to extract essential information regarding the secondary structure conversion of proteins from molecular simulations, named secondary structure principal component analysis (SSPCA). The definite existence of a PrP isoform with an increased β-sheet structure was confirmed in a free-energy landscape constructed by mapping protein structural data into a reduced space according to the principal components determined by the SSPCA. We suggest a “spot” of structural ambivalence in PrP, the C-terminal part of helix 2, that lacks a strong intrinsic secondary structure, thus promoting a partial α-helix-to-β-sheet conversion. This result is important to understand how the pathogenic conformational conversion of PrP is initiated in prion diseases. The SSPCA has great potential to solve various challenges in studying highly flexible molecular systems, such as intrinsically disordered proteins, structurally ambivalent peptides, and chameleon sequences.
■
INTRODUCTION Several studies on prion protein (PrP) have been conducted to understand the mechanism of protein misfolding and aggregation in various neurodegenerative disorders, such as Alzheimer’s, Huntington’s, and Parkinson’s diseases.1−6 The aggregation-prone property of PrP is responsible for transmissible spongiform encephalopathies (TSEs), also known as prion diseases, which are infectious and lethal neurodegenerative disorders in humans and animals.7−10 Prion diseases are believed to be caused by the conversion of the normal cellular form of prion protein (PrPC) into the pathogenic scrapie isoform (PrPSc).11−13 Although PrPC and PrPSc have identical chemical structures, it has been shown that the conversion from PrPC to PrPSc causes a substantial change in the secondary structure of the protein.14 PrPC mainly contains α-helices, which consist of three long helical motifs, H1, H2, and H3, and a short β-sheet, which consists of two strands, S1 and S2.15 Experimental studies showed that the content ratio of α-helix in the C-terminal domain of PrPC (residues 90−230) is ∼48%, while that of β-sheet is ∼8%.15−18 The structural details of PrPSc are still unknown, except that it has a significantly higher proportion of β-sheet content (∼50%) than PrPC.19,20 A significant decrease in the α-helix content and an increase in the β-sheet content, i.e., β-sheet expansion, might occur during the pathological conversion from PrPC to PrPSc;14 therefore, the helix-to-sheet conversion leading to the β-sheet expansion in PrP has been a major focus in prion biology. However, little structural information is currently available. Besides experimental studies, molecular dynamics (MD) simulations have been conducted to address the challenges of prion diseases.21−30 Some results showed that PrPC favors the formation of partially unfolded structures.21−23 Moreover, several isoforms of PrP with a higher β-sheet proportion than PrPC have been observed in MD simulations.24−30 These results support a possible model in which the intrinsic instability of PrPC facilitates the β-sheet expansion in PrP. This leads the molecular architecture to adopt an aggregation-prone conformation, triggering the pathological conversion of PrPC. Thus, the special PrP isoforms with an increased β-sheet structure might sometimes serve as the on-pathway intermediates for the formation of PrPSc; however, the precise probability of forming such characteristic metastable states has remained controversial. Free energy provides quantitative information on the probability of finding a given conformational state. A free-energy landscape (FEL) is defined to provide a probability distribution of all possible conformations of a molecular system as a function of a few selected degrees of freedom, named collective variables (CVs), which describe the essential physics of a biomolecular process such as protein folding and conformational conversion. The
Received: April 7, 2014 Revised: July 26, 2014
dx.doi.org/10.1021/jp5034245 | J. Phys. Chem. B XXXX, XXX, XXX−XXX
The Journal of Physical Chemistry B
ps. The final average exchange rate was 21.3%, affording an average exchange time of 18.8 ps. For each replica, the 72 ns MD simulation was performed in an isothermal−isobaric (NpT) ensemble, where the velocity rescaling scheme39 was used for temperature control with a coupling constant of 0.5 ps, and a constant pressure of 1 atm was maintained using the Berendsen scheme40 with a coupling constant of 1 ps. All the bonds were constrained using the linear constraint solver (LINCS) algorithm.41 The simulations were run with an integration time step of 4 fs, which was made possible by removing hydrogen vibrations through the virtual site approach42 combined with the LINCS constraint solver. The particle-mesh Ewald (PME) method43 was used with a cutoff length of 0.9 nm to treat long-range electrostatic effects. The van der Waals interactions were treated using a 1.025 nm cutoff. The nonbonded pair list was updated every 0.1 ps. All the MD simulations were performed using the Gromacs 4.6.5 package44,45 with the OPLS-AA force field.46,47 Finally, the first half of each 72 ns trajectory was discarded, and the resulting 70 parallel 36 ns production MD trajectories were used for analyses, providing a total of 2.52 μs of sampling data. For each production trajectory, the protein conformations were stored every 4 ps, giving 9001 coordinate frames. Thus, a total of 630 070 conformations were collected from the trajectories.
Secondary Structure Principal Component Analysis. Because the REMD simulation samples a considerable number of conformational states of proteins, including less probable, highly heterogeneous structures, there has been increasing interest in developing methods to extract the essential information from the MD trajectories. For example, to represent the FEL of proteins, it is necessary to characterize their conformational properties in terms of distributions projected onto subsets of representative coordinates.
In this context, secondary structure is useful for characterizing inherent conformations in a schematic manner by projecting the three-dimensional (3D) coordinates onto a set of secondary structure elements (SSEs) for each amino acid residue in a protein. First, let us denote by $s^{(i)} = (s_1^{(i)}, \ldots, s_n^{(i)}, \ldots, s_N^{(i)})^T$ the set of SSEs for the ith protein coordinates $q^{(i)}$, where $s_n^{(i)}$ is the SSE for the nth amino acid residue and N is the number of residues in the protein. In this study, an SSE was assigned to each residue using the STRIDE program.48 The STRIDE program provides seven secondary structure indexes; i.e., $s_n^{(i)}$ ∈ {α-helix, π-helix, 3₁₀-helix, β-bridge, β-sheet, turn, coil}. Herein, to simplify the problem by focusing on the fundamental motifs, we defined three sets of SSEs: a set of helix elements, H = {α-helix, π-helix, 3₁₀-helix}, a set of extended strand elements, E = {β-bridge, β-sheet}, and all the rest, C = {turn, coil}. The secondary structure sequence, $s^{(i)}$, is an essential input for our approach; however, because $s^{(i)}$ is non-numerical data, it cannot be handled directly by standard numerical methods. A commonly used way to treat such non-numerical data is to map them into a binary representation. Consider $v_\sigma^{(i)} = (I_\sigma(s_1^{(i)}), \ldots, I_\sigma(s_n^{(i)}), \ldots, I_\sigma(s_N^{(i)}))^T$, a vector of variables that represent the SSEs in binary, where $I_\sigma$ is the indicator function of a set
FEL provides essential information for elucidating molecular biophysical processes; however, its construction is limited by the nontrivial choice of pertinent CVs.31 There is no definitive method for selecting a set of CVs that correctly classifies structures according to the differences in atomic structure that cause significant differences in protein morphology. The determination of CVs is therefore an active field of biomolecular simulation. In particular, for studies on prion and other conformational diseases, it is essential to find a proper method to determine CVs that enable the extraction of information on secondary structure conversions related to β-sheet expansion in these proteins. In this paper, we introduce a straightforward method to determine a proper set of CVs that can extract the essential information regarding the secondary structure conversions of proteins from molecular simulations. The FEL, defined from the probability distribution in the reduced space spanned by the representative coordinates determined by our method, provides definite evidence for the β-sheet expansion in PrP.
■
METHODS Construction of a Model Structure. The model consisted of PrP and 8514 water molecules, with three Na+ ions added as counterions. The TIP4P water model32 was used to describe the solvent. The coordinates of human PrP obtained from the Protein Data Bank (ID: 1HJM)33 were used as the model structure of PrPC. The 1HJM coordinates include the C-terminal half domain of PrP (residues 125−228). According to the protein-only hypothesis,7,8 the fragment of PrP containing residues 90−230, generated by amino-terminal truncation through digestion with proteinase K, is considered to be the necessary infectious unit that retains prion infectivity and aggregation proneness. This fragment consists of two domains: the disordered N-terminal domain, residues 90−120, and the ordered C-terminal domain, residues 121−230.16,17 Some computational studies indicated that the disordered N-terminal domain plays an important role in the conversion of PrPC.24,26 On the other hand, experimental studies showed that the formation of an amyloid form with the physical properties of PrPSc can be achieved in the absence of the disordered N-terminal domain.34−36 Thus, instead of using the full-length PrP, we used the truncated coordinates of the ordered C-terminal domain, residues 125−228. The 1HJM structure consists of three α-helices, termed H1 (residues 144−156), H2 (residues 173−194), and H3 (residues 200−227), and two β-strands, termed S1 (residues 129−133) and S2 (residues 160−163).33
Replica-Exchange Molecular Dynamics Simulation. To investigate the conformational space extensively, we performed temperature replica-exchange molecular dynamics (T-REMD) calculations;37 70 parallel 72 ns simulations were conducted at different temperatures ranging from 300 to 455 K.
We determined the temperature series so as to obtain a homogeneous exchange rate of 20% between the replicas, using the protocol of Patriksson and van der Spoel.38 The following temperatures were selected for the 70 replicas: 300.0, 301.9, 303.8, 305.7, 307.6, 309.5, 311.4, 313.4, 315.3, 317.3, 319.2, 321.2, 323.2, 325.2, 327.2, 329.2, 331.2, 333.3, 335.3, 337.4, 339.4, 341.5, 343.6, 345.7, 347.8, 349.9, 352.0, 354.2, 356.3, 358.5, 360.7, 362.9, 365.1, 367.3, 369.5, 371.7, 373.9, 376.2, 378.5, 380.7, 383.0, 385.3, 387.6, 390.0, 392.3, 394.6, 397.0, 399.4, 401.7, 404.1, 406.6, 409.0, 411.4, 413.8, 416.3, 418.8, 421.2, 423.7, 426.2, 428.8, 431.3, 433.8, 436.4, 439.0, 441.5, 444.1, 446.8, 449.4, 452.0, and 454.7. The replica exchange was attempted every 4
$$I_\sigma(x) = \begin{cases} 1 & (x \in \sigma) \\ 0 & (x \notin \sigma) \end{cases} \tag{1}$$
for σ = H, E, and C. Standard mathematical manipulations can be applied to the binary representation $v_\sigma^{(i)}$. Next, we define a function $k(q^{(i)}, q^{(j)})$ that quantifies the similarity in secondary structure between the ith and jth protein coordinates as the weighted sum, over σ, of the inner products between $v_\sigma^{(i)}$ and $v_\sigma^{(j)}$.
The PCs are arranged by their contribution ratios, which are assessed by the value of $\lambda_p / \sum_{p'=1}^{m} \lambda_{p'}$. Among the PCs, a few important components show a high contribution and thus retain most of the variation present in the original variables. In the following text, we refer to the PCs determined by the SSPCA as SSPCs. The idea of the SSPCA is simple; however, it may suffer from computational bottlenecks when used with massive data sets. In this study, 630 070 conformations were sampled from 70 trajectories of the REMD simulations; therefore, the diagonalization of the kernel matrix of the whole set is impractical. To make this problem computationally tractable, we used landmark points. First, we selected the 9001 conformations from the REMD trajectory performed at 300 K. Next, we extracted the 3631 conformations that have unique secondary structure sequences. Then, we randomly selected half of these data, giving 1816 landmark points. With the problem size thus greatly reduced, the algorithm becomes tractable. For the selected 1816 landmark points, we applied the SSPCA to determine PCs according to secondary structures, which is a less demanding process than solving for the global manifold shape of the original huge data set.
Free-Energy Landscape. The 630 070 conformations were obtained from the equilibrated parts of REMD simulations performed at multiple temperatures in the range from 300 to 455 K. In this study, we restricted the analysis to the thermodynamic properties at 300 K, because we were interested in the behavior under physiologically relevant conditions. Therefore, we treated the entire REMD structural data based on the weighted histogram analysis method (WHAM)53 for this target temperature, by combining the histograms of the energy from the simulations at different temperatures. We projected all 630 070 data points of protein structures onto the two-dimensional (2D) subspace spanned by the first two SSPCs.
From the 2D joint canonical probability distribution function of the projected data, obtained using the WHAM method, P(Q1, Q2), where Q1 and Q2 are the coordinates in the 2D space of the first two SSPCs, we obtained the relative FEL as follows:
$$k(q^{(i)}, q^{(j)}) = \sum_{\sigma} \omega_\sigma \, (v_\sigma^{(i)})^T v_\sigma^{(j)} \tag{2}$$
where $\omega_\sigma$ is an arbitrary weighting factor that depends on the type of SSE. The weighting allows one to emphasize particular SSEs of the sequence. The selection of the weight parameters determines the topological structure of the data and hence the manner of structural data mapping. We applied the function $k(q^{(i)}, q^{(j)})$ in the kernel version of principal component analysis (kPCA).49 The PCA method is known as a powerful method for extracting the features of a set of data in the order of their importance.50 The conventional PCA method provides good results when the distinct data are linearly separable; however, for data that are not linearly separable, the kernel version is more efficient for extracting features.51 The kPCA approaches are popular for analyzing diverse data sets, e.g., in the field of computational biology.52 Our procedure corresponds exactly to the kPCA method and is a novel extension for extracting features according to the protein’s secondary structure. In the following text, we refer to our method as the secondary structure PCA (SSPCA). We briefly describe the general procedure of the kPCA method. First, we constructed the kernel matrix K from an input data set of $q^{(1)}, q^{(2)}, \ldots, q^{(m)}$,
$$(K)_{i,j} = k(q^{(i)}, q^{(j)}) \tag{3}$$
where $k(q^{(i)}, q^{(j)})$ is the kernel function, which is defined by eq 2 for the SSPCA. The dimension of K is m × m, where m is the number of input data. Next, we centralized the kernel matrix
$$\tilde{K} = \left(I_m - \frac{1}{m} 1_m (1_m)^T\right) K \left(I_m - \frac{1}{m} 1_m (1_m)^T\right) \tag{4}$$
where $I_m$ is the identity matrix of size m and $1_m$ is the column vector of size m with all entries equal to 1, i.e., $1_m = (1, \ldots, 1)^T$. Finally, we diagonalized the centralized kernel matrix using an eigenvalue decomposition,
$$\Delta G(Q_1, Q_2) = -k_B T \left[\ln P(Q_1, Q_2) - \ln P_{\max}\right] \tag{8}$$
where kB is the Boltzmann constant, T is temperature, and Pmax is the maximum value of the probability density function.
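The free-energy expression above can be illustrated with a short numerical sketch. This is a minimal example under stated assumptions, not the authors' implementation: the function name `free_energy_landscape`, the bin count, and the optional `weights` argument (standing in for per-frame statistical weights such as those a WHAM reweighting to 300 K would supply) are all hypothetical. Because the bins have equal area, the ratio of bin probabilities equals the ratio of probability densities.

```python
import numpy as np

# Molar gas constant in kJ mol^-1 K^-1, so that free energies come out in kJ mol^-1.
R_KJ = 8.314462618e-3


def free_energy_landscape(q1, q2, weights=None, bins=50, temperature=300.0):
    """Relative free energy on a 2D grid: dG = -kT [ln P - ln Pmax].

    q1, q2  : projections of each conformation on the first two SSPCs
    weights : optional per-frame weights (assumed input, e.g. from reweighting)
    """
    prob, q1_edges, q2_edges = np.histogram2d(q1, q2, bins=bins, weights=weights)
    prob = prob / prob.sum()
    with np.errstate(divide="ignore"):
        # Empty bins have P = 0 and are assigned an infinite free energy.
        dG = -R_KJ * temperature * (np.log(prob) - np.log(prob.max()))
    return dG, q1_edges, q2_edges


# Toy usage with synthetic projections (not data from this study).
rng = np.random.default_rng(0)
dG, _, _ = free_energy_landscape(rng.normal(size=5000), rng.normal(size=5000), bins=20)
print(dG.min())
```

By construction the most populated bin defines the zero of free energy, which mirrors the choice of the global minimum as the reference state.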
■
RESULTS AND DISCUSSION Structural Data Mapping. We applied the SSPCA to the 1816 landmark points of protein structures sampled from the REMD trajectories. First, we compared the following three cases of structural data mapping by the SSPCA using different sets of weight parameters: (a) case 1, $\{\omega_H, \omega_E, \omega_C\} = \{1, 1, 1\}$, where all the SSEs are treated equally; (b) case 2, $\{\omega_H, \omega_E, \omega_C\} = \{N_H^{-1}, N_E^{-1}, 0\}$, where $N_H$ = 63 and $N_E$ = 9 are the numbers of helix and strand elements in the native structure of PrP, defined by the 1HJM coordinates;33 i.e., the more abundant SSE type receives the lower weight; and (c) case 3, $\{\omega_H, \omega_E, \omega_C\} = \{0, 1, 0\}$, where only the strand element is treated. The contribution ratios of the first five SSPCs were in the following order: (a) case 1, 17%, 12%, 9%, 7%, and 6%; (b) case 2, 25%, 15%, 9%, 7%, and 6%; and (c) case 3, 41%, 21%, 11%, 10%, and 5%. For these three cases, the 1816 landmark points were projected onto the 2D subspace spanned by the first two SSPCs, namely, SSPC1 and SSPC2.
$$\tilde{K} = \sum_{p=1}^{m} \lambda_p \, u_p (u_p)^T \tag{5}$$
where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m \geq 0$ is the eigenvalue and $u_p$ is the pth eigenvector with the entries $u_p^{(1)}, u_p^{(2)}, \ldots, u_p^{(m)}$. Once the eigenvectors are determined, we can project structural data $q'$ onto the eigenvector $u_p$,
$$Q_p(q') = \sum_{i=1}^{m} u_p^{(i)} \, \tilde{k}(q', q^{(i)}) \tag{6}$$
which are the pth principal components (PCs). Herein, $\tilde{k}(q', q^{(i)})$ is the centralized kernel function,
$$\tilde{k}(q', q^{(i)}) = k(q', q^{(i)}) - \frac{1}{m} \sum_{r=1}^{m} k(q^{(r)}, q^{(i)}) - \frac{1}{m} \sum_{r=1}^{m} k(q', q^{(r)}) + \frac{1}{m^2} \sum_{r=1}^{m} \sum_{s=1}^{m} k(q^{(r)}, q^{(s)}) \tag{7}$$
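The kPCA procedure of eqs 1−7 can be sketched compactly in NumPy. This is an illustrative sketch only, assuming that each conformation has already been reduced to a string over the three classes H, E, and C (for example from STRIDE output) and that all strings have the same length; the function names `encode`, `kernel`, `sspca_fit`, and `sspca_project` and the toy sequences are hypothetical and do not come from the original work.

```python
import numpy as np

CLASSES = ("H", "E", "C")  # helix, extended strand, everything else


def encode(seqs):
    """Binary indicator arrays v_sigma (eq 1): one (m, N) array per class."""
    arr = np.array([list(s) for s in seqs])
    return {c: (arr == c).astype(float) for c in CLASSES}


def kernel(va, vb, weights):
    """Weighted inner-product kernel (eq 2) between two encoded sets."""
    return sum(weights[c] * (va[c] @ vb[c].T) for c in CLASSES)


def sspca_fit(seqs, weights, n_components=2):
    """Build, center, and diagonalize the kernel matrix (eqs 3-5)."""
    v = encode(seqs)
    K = kernel(v, v, weights)
    m = K.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m        # centering matrix of eq 4
    lam, U = np.linalg.eigh(J @ K @ J)         # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]              # re-sort: largest first
    lam, U = lam[order], U[:, order]
    return {
        "v": v, "K": K, "weights": weights,
        "lam": lam[:n_components], "U": U[:, :n_components],
        "ratio": lam[:n_components] / lam.sum(),   # contribution ratios
    }


def sspca_project(model, seqs):
    """Project new sequences with the centered kernel (eqs 6 and 7)."""
    Kn = kernel(encode(seqs), model["v"], model["weights"])
    K = model["K"]
    Kn_c = Kn - K.mean(axis=0)[None, :] - Kn.mean(axis=1)[:, None] + K.mean()
    return Kn_c @ model["U"]


# Toy usage: five made-up 8-residue sequences with equal weights (case 1).
toy = ["HHHHCCEE", "HHHCCCEE", "HHCCCEEE", "CCCCCCEE", "HHHHCCCC"]
model = sspca_fit(toy, {"H": 1.0, "E": 1.0, "C": 1.0})
print(sspca_project(model, toy))
```

The sketch follows eq 6 as written and uses unit-norm eigenvectors without further scaling; common kPCA implementations additionally divide each eigenvector by the square root of its eigenvalue, which rescales the axes but does not change the relative arrangement of the projected points.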
The results are shown in Figure 1, where each point was colored according to the secondary structure content of the
Figure 2. Two-dimensional FEL of PrP constructed in terms of the first two SSPCs. L0−L9 indicate the local minima, which were numbered according to their depth of free-energy. L0 is the global minimum. The lower-lying L0−L3 states represent the normal form of PrPC. The L5−L9 states involve helix-to-sheet conformational conversions, where L9 is the conformational state with the highest β-sheet proportion. Contour lines were drawn with an interval of 5 kJ mol−1.
local minima, L1−L9, whose relative free energies are at most 8.6 kJ mol−1. Table 1 lists the relative free energies of the selected 10 conformational states of the local minima, L0−L9, along with
Figure 1. Three cases of structural data mapping for the 1816 landmark points by the SSPCA with different sets of weight parameters: (a) case 1, (b) case 2, and (c) case 3. SSPC1 and SSPC2 are the first two principal components. Each projected point is colored according to the content ratio of residues that are a part of the helix or strand motif in PrP. (d) Key for the color scheme.
Table 1. Relative Free Energies of the Local Minima in the FEL of PrP, along with the Secondary Structure Content Ratios of the Residues That Are a Part of Helix, Strand, or Coil Motif in a Structure
residues that were identified as part of a helix or a strand motif in a protein. In case 1, as shown in Figure 1a, there is an apparent separation between a helix-rich part on the left side, in red, and a rather strand-rich part on the right side, in blue, clearly showing the potential of the SSPCA for characterizing conformational states according to secondary structure sequences. In case 2, as shown in Figure 1b, the projected landmark points are clustered together according to their similarity in secondary structure sequences, thus dividing the data set into several groups. This shows an improved separation of structural data compared to case 1. In case 3, as shown in Figure 1c, the 1816 landmark points were mapped onto only 34 points. This is because the topological structure defined in this case provides information on the differences in the arrangements of strand elements, but it provides no information on the arrangements of the other two SSEs. These results of the three cases provide an illustrative perspective of the manner in which given protein structural data are projected onto a reduced space based on the SSPCA. In the following analysis, we used the first two SSPCs determined in case 2. These two SSPCs account for 40% of the total contribution. Thus, their selection allowed us to reduce the dimensionality of the massive data set without losing a significant amount of information.
Free-Energy Landscape. After investigating the manner of structural data mapping by the SSPCA, it was possible to attempt a reliable description of the FEL. Figure 2 shows the 2D FEL of PrP as a function of the first two SSPCs. The FEL is rough and divided into several local minima, associated with the existence of multiple conformational states. We denote the state that is located at the global minimum on the landscape as L0 and consider it as the zero point of free energy. We identified nine additional metastable states of the
local state   free-energy [kJ mol−1]   helix [%]   strand [%]   coil [%]
L0            0.0                      50          6            44
L1            1.6                      46          8            46
L2            1.8                      54          8            38
L3            1.9                      46          6            48
L4            5.4                      48          4            48
L5            5.6                      48          10           42
L6            6.2                      48          10           42
L7            7.3                      37          8            55
L8            7.9                      40          10           50
L9            8.6                      48          12           40
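As a reading aid for Table 1, the following sketch compares the tabulated free energies with the thermal energy at 300 K. It is a hedged illustration only: the values are copied from Table 1, the variable names are hypothetical, and using the thermal energy as a hard threshold is a simplifying assumption made for this example.

```python
# Molar gas constant in kJ mol^-1 K^-1 and the target temperature in K.
R_KJ = 8.314462618e-3
T = 300.0

# Relative free energies of the local minima in kJ mol^-1 (from Table 1).
free_energy = {
    "L0": 0.0, "L1": 1.6, "L2": 1.8, "L3": 1.9, "L4": 5.4,
    "L5": 5.6, "L6": 6.2, "L7": 7.3, "L8": 7.9, "L9": 8.6,
}

thermal_energy = R_KJ * T  # about 2.5 kJ mol^-1 at 300 K

# States whose free energy does not exceed the thermal energy.
accessible = [s for s, g in free_energy.items() if g <= thermal_energy]

# Gap between the highest accessible state and the next state above it.
gap = (min(g for g in free_energy.values() if g > thermal_energy)
       - max(g for g in free_energy.values() if g <= thermal_energy))

print(accessible, round(thermal_energy, 1), round(gap, 1))
```

With these inputs the four lowest states fall below the thermal energy, and the next state lies 3.5 kJ mol−1 above them.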
the percent contents of the residues that are part of a helix, strand, or coil motif in a protein. Herein, the secondary structure contents were calculated from the conformations sampled in the neighborhood of each local minimum. The energy scale of the thermal fluctuation at 300 K is 2.5 kJ mol−1; thus, the lower-lying states L0−L3, which lie within 1.9 kJ mol−1 of one another and are more stable than the other states by at least 3.5 kJ mol−1, are thermally accessible. L0−L3 have a helix content of 46−54% and a strand content of 6−8%, which are nearly identical to those of PrPC observed in the experimental studies.15−18 Thus, these four lower-lying conformational states, L0−L3, should represent the normal form of PrPC, preserving native packing features and SSEs. Moreover, PrP can adopt non-native conformations with an enhanced strand content of 10−12%. In particular, L9 has the highest strand content (12%) among all the conformational states recognized in this study. These results clearly indicate
one of the normal PrPC states, has the secondary structure most similar to that of the native structure derived from the 1HJM coordinates. Evidently, the C-termini of the helices are less well preserved in most of the local minima. In particular, L0 and L1, which are thermally accessible as mentioned above, undergo partial unfolding at the C-terminus of H2, residues 190−194. Furthermore, L5−L9 show a distinct difference in the secondary structure at the C-terminus of H2, forming an enhanced β-sheet between two strands, one at the C-terminal region of H2, residues 190−193, and the other in its neighborhood, residues 196 and 197. Some computational studies reported an elongation of the native β-sheet between S1 and S2;24,27 however, no sign of the native β-sheet elongation was found among the metastable states recognized in this study. These results demonstrate that the C-terminus of H2 would be one of the “hot spots” of structural ambivalence in PrP, leading to the β-sheet expansion in PrP. The potential role of H2 and H3 in the β-sheet expansion of PrP was first suggested by Dima and Thirumalai.54 Some experimental evidence for a crucial role of H2 in the formation of PrPSc has been reported in more recent studies.55−58 Several MD simulations showed that parts of H2 and H3 in PrP can convert into β-sheet structures.27−30 The β-sheet nature of aggregates has been established through numerous studies; thus, intermolecular cross-β-sheet formation is the generally accepted mechanism of protein aggregation.4 Consequently, the previous studies and this study support the hypothesis that the H2 domain may participate in the conversion of PrPC to a monomeric aggregation-prone state. Because a region of the local sequence in PrP lacks an intrinsic secondary structure, interactions with the β-sheet regions in PrPSc may lead to a helix-to-sheet structural change, forming the cross-β-sheet assembly of the proteins.
The hot spot of structural ambivalence in PrP, i.e., the C-terminus of H2, has a threonine-rich region (residues 190−193) including the TTTT sequence. Interestingly, such a sequence is frequently observed in β-sheet conformations.59,60 Indeed, experimental studies revealed that the peptide derived from the H2 segment of PrP has a high propensity for β-strand and forms β-sheet-rich amyloid-like fibrils,61−63 where a free-energy difference of 5−8 kJ mol−1 was found to separate the α- and β-forms of the H2-derived peptide.61 The calculated free-energy differences between L0 and the states with the β-sheet expansion, i.e., L5−L9, are in the range 5.6−8.6 kJ mol−1, which shows an approximate agreement with the experimentally estimated free energy of 8.0 ± 1.7 kJ mol−1 between PrPC and a folding intermediate.64 On the basis of the calculated free-energy differences, the relative existence ratios of L0 and L5−L9 are governed by the Boltzmann distribution law, PL5−L9/PL0 = exp(−ΔG/kBT), and are estimated to be PL5−L9/PL0 = 0.11−0.03 at 300 K. Therefore, there can be a small but definite existence of helix-to-sheet conversions even under physiologically normal conditions. The fate of hot spots in PrP during the amyloid formation, leading to PrPSc, is unknown. Note that the elongation of the native β-sheet between S1 and S2 has been postulated as one of the possible pathways of PrP aggregation; however, this site was shown to be protected by a β-bulge motif, promoted by G131, that hinders the formation of an intermolecular β-sheet.65 Considering the increase in the strand content in the hot spot region, therefore, it is tempting to speculate that, besides the native β-sheet moiety, the transient β-sheet motif present in H2 may bind to PrPSc more readily than the native PrPC form does.
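The Boltzmann estimate quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes free energies expressed per mole, so the molar gas constant takes the place of kB; the function name `boltzmann_ratio` is hypothetical.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1 (per-mole equivalent of kB).
R_KJ = 8.314462618e-3


def boltzmann_ratio(delta_g, temperature=300.0):
    """Population ratio exp(-dG / kT) of a state lying delta_g (kJ/mol) above L0."""
    return math.exp(-delta_g / (R_KJ * temperature))


# Free-energy range of the beta-expanded states L5-L9 relative to L0.
upper = boltzmann_ratio(5.6)   # most stable beta-expanded state (L5)
lower = boltzmann_ratio(8.6)   # least stable beta-expanded state (L9)
print(f"{upper:.2f} {lower:.2f}")  # prints "0.11 0.03"
```

The two printed values bracket the ratio range stated in the text for the states with β-sheet expansion.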
that PrP has some regions with intrinsic structural ambiguity, promoting the β-sheet expansion in PrP. Hot Spot of Structural Ambivalence. A key issue in this study involves the regions in PrP that are most susceptible to the conversion of PrPC. The FEL indicates that PrP exhibits some regions of structural ambivalence. Figure 3 shows the
Figure 3. Helix and strand contents of each residue for the four conformational states of (a) L0, (b) L2, (c) L6, and (d) L9, along with the snapshot pictures of the corresponding 3D structures. In the bar graphs, red and blue show the formation ratio of the helix and strand motifs, respectively. Dashed lines indicate three α-helix regions, in red, and two β-sheet regions, in blue, defined from the 1HJM coordinates.33 In the 3D structures, residues colored in red and blue represent helix and sheet motifs, respectively. (a) L0 and (b) L2 are regarded as the normal form of PrPC; (c) L6 and (d) L9 involve the helix-to-sheet conversions of the protein.
helix and strand contents of each residue for the four conformational states L0, L2, L6, and L9, along with the corresponding 3D structures shown in cartoon view. Herein, the helix or strand content of the nth residue is defined as the percentage of conformations in which the nth residue is found in the helix or strand state. L0 and L2 are regarded as the PrPC states, and L6 and L9 involve the helix-to-sheet conformational conversions. These four states are easily distinguished in terms of their secondary structure sequences. Most of the residues have a high content ratio of the SSEs, where each strand content reached almost 100%; however, the helix content fluctuated partly. This is because we selected the weight parameters in eq 2 to ensure that a strand element is given much more weight than a helix element. These results support that the conformations are clearly classified according to their secondary structures by the SSPCA. Figure 4 shows the main SSEs of each residue for L0−L9 schematically, along with the secondary structure sequence of 1HJM.33 As illustrated in this diagram, L2, which is regarded as
Figure 4. Main SSEs of each residue for L0−L9, along with the secondary structure sequence of 1HJM33 and one-letter amino acid abbreviations. “H” in red, “E” in blue, and “−” show the residues found in the helix, strand, and coil states, respectively.
■
CONCLUSION Metastable states with a higher β-sheet proportion than PrPC are proposed to serve as the on-pathway intermediate precursors for the pathological conversion to PrPSc; however, the precise probability of causing the β-sheet expansion in PrP has remained unclear. In this study, we constructed the FEL of PrP by a straightforward method, named SSPCA, that maps the protein structural data into a reduced space according to secondary structure. We obtained evidence for the definite existence of PrP isoforms with an increased β-sheet content: for every 1 M PrPC, 0.03−0.11 M of the isoforms exist under normal circumstances. The “hot spot” of structural ambivalence is responsible for the β-sheet expansion in PrP, leading to a partial helix-to-sheet conversion at the C-terminal region of H2 in the protein. The hot spot in PrP may play a key role in the interface between PrPC and PrPSc during the formation of nascent amyloid fibrils. Generally, secondary structure contributes to the unique properties and functions of a protein; therefore, it is essential to describe the characteristics of protein folding processes and secondary structure changes. The SSPCA, which is a straightforward method to extract the necessary information on the secondary structure conversions of proteins, can thus be applied to solve various challenges. For example, intrinsically disordered proteins (IDPs) are highly flexible proteins;66 although these proteins lack a well-defined 3D conformation,
they are involved in diverse biological functions. For IDPs, the standard PCA would be problematic because of the fully coiled states that fill most of the conformational space when the Cartesian coordinates are used. On the other hand, the SSPCA maps these fully coiled states to a point on the reduced space of SSPC coordinates; i.e., we can separately handle structureless states from characteristic states. The SSPCA, used for PrP in this study, has great potential in studying the function of highly flexible molecular systems, such as IDPs, structurally ambivalent peptides,67 and chameleon sequences.68
■
AUTHOR INFORMATION
Corresponding Author
*(N.Y.) E-mail: [email protected].
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS This work was supported by a Grant-in-Aid for Young Scientists (B) from the Ministry of Education, Culture, Sports, Science, and Technology (No. 20750008), Japan.
■
REFERENCES
(1) Taylor, J. P.; Hardy, J.; Fischbeck, K. H. Toxic Proteins in Neurodegenerative Disease. Science 2002, 296, 1991−1995.
Conversion of the Prion Protein. Proc. Natl. Acad. Sci. U.S.A. 2001, 98, 2985−2989.
(25) Colacino, S.; Tiana, G.; Broglia, R. A.; Colombo, G. The Determinants of Stability in the Human Prion Protein: Insights Into Folding and Misfolding From the Analysis of the Change in the Stabilization Energy Distribution in Different Conditions. Proteins 2006, 62, 698−707.
(26) DeMarco, M. L.; Daggett, V. Molecular Mechanism for Low pH Triggered Misfolding of the Human Prion Protein. Biochemistry 2007, 46, 3045−3054.
(27) Langella, E.; Improta, R.; Barone, V. Checking the pH-Induced Conformational Transition of Prion Protein by Molecular Dynamics Simulations: Effect of Protonation of Histidine Residues. Biophys. J. 2004, 87, 3623−3632.
(28) Campos, S. R. R.; Machuqueiro, M.; Baptista, A. M. Constant-pH Molecular Dynamics Simulations Reveal a β-Rich Form of the Human Prion Protein. J. Phys. Chem. B 2010, 114, 12692−12700.
(29) Khorvash, M.; Lamour, G.; Gsponer, J. Long-Time Scale Fluctuations of Human Prion Protein Determined by Restrained MD Simulations. Biochemistry 2011, 50, 10192−10194.
(30) Chakroun, N.; Fornili, A.; Prigent, S.; Kleinjung, J.; Dreiss, C. A.; Rezaei, H.; Fraternali, F. Decrypting Prion Protein Conversion into a β-Rich Conformer by Molecular Dynamics. J. Chem. Theory Comput. 2013, 9, 2455−2465.
(31) Zuckerman, D. M. Statistical Physics of Biomolecules: An Introduction; CRC Press: Boca Raton, FL, 2010.
(32) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926−935.
(33) Calzolai, L.; Zahn, R. Influence of pH on NMR Structure and Stability of the Human Prion Protein Globular Domain. J. Biol. Chem. 2003, 278, 35592−35596.
(34) Hornemann, S.; Glockshuber, R. A Scrapie-Like Unfolding Intermediate of the Prion Protein Domain PrP(121−231) Induced by Acidic pH. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 6010−6014.
(35) Martins, S. M.; Chapeaurouge, A.; Ferreira, S. T. Folding Intermediates of the Prion Protein Stabilized by Hydrostatic Pressure and Low Temperature. J. Biol. Chem. 2003, 278, 50449−50455.
(36) Martins, S. M.; Frosoni, D. J.; Martinez, A. M. B.; De Felice, F. G.; Ferreira, S. T. Formation of Soluble Oligomers and Amyloid Fibrils with Physical Properties of the Scrapie Isoform of the Prion Protein From the C-terminal Domain of Recombinant Murine Prion Protein mPrP-(121−231). J. Biol. Chem. 2006, 281, 26121−26128.
(37) Sugita, Y.; Okamoto, Y. Replica-Exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999, 314, 141−151.
(38) Patriksson, A.; van der Spoel, D. A Temperature Predictor for Parallel Tempering Simulations. Phys. Chem. Chem. Phys. 2008, 10, 2073−2077.
(39) Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling Through Velocity Rescaling. J. Chem. Phys. 2007, 126, 014101−1−014101−7.
(40) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684−3690.
(41) Hess, B. P-LINCS: A Parallel Linear Constraint Solver for Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 116−122.
(42) Feenstra, K. A.; Hess, B.; Berendsen, H. J. C. Improving Efficiency of Large Time-Scale Molecular Dynamics Simulations of Hydrogen-Rich Systems. J. Comput. Chem. 1999, 20, 786−798.
(43) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A Smooth Particle Mesh Ewald Method. J. Chem. Phys. 1995, 103, 8577−8593.
(44) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 2008, 4, 435−447.
(45) Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M. R.; Smith, J. C.; Kasson, P. M.; van der Spoel, D.; et al. GROMACS 4.5: A High-Throughput and Highly Parallel
(2) Ross, C. A.; Poirier, M. A. Protein Aggregation and Neurodegenerative Disease. Nature Med. 2004, 10, S10−S17. (3) Chiti, F.; Dobson, C. M. Protein Misfolding, Functional Amyloid, and Human Disease. Annu. Rev. Biochem. 2006, 75, 333−366. (4) Harrison, R. S.; Sharpe, P. C.; Singh, Y.; Fairlie, D. P. Amyloid Peptides and Proteins in Review. Rev. Physiol. Biochem. Pharmacol. 2007, 159, 1−77. (5) Brundin, P.; Melki, R.; Kopito, R. Prion-like Transmission of Protein Aggregates in Neurodegenerative Diseases. Nature Rev. Mol. Cell Biol. 2010, 11, 301−307. (6) Prusiner, S. B. A Unifying Role for Prions in Neurodegenerative Diseases. Science 2012, 336, 1511−1513. (7) Prusiner, S. B. Prions. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 13363−13383. (8) Prusiner, S. B.; Scott, M. R.; DeArmond, S. J.; Cohen, F. E. Prion Protein Biology. Cell 1998, 93, 337−348. (9) Collins, S. J.; Lawson, V. A.; Masters, C. L. Transmissible Spongiform Encephalopathies. Lancet 2004, 363, 51−61. (10) Harris, D.; True, H. New Insights Into Prion Structure and Toxicity. Neuron 2006, 50, 353−357. (11) Bolton, D. C.; McKinley, M. P.; Prusiner, S. B. Identification of a Protein That Purifies with the Scrapie Prion. Science 1982, 218, 1309−1311. (12) Prusiner, S. B. Scrapie Prions. Annu. Rev. Microbiol. 1989, 43, 345−374. (13) Borchelt, D. R.; Scott, M.; Taraboulos, A.; Stahl, N.; Prusiner, S. B. Scrapie and Cellular Prion Proteins Differ in Their Kinetics of Synthesis and Topology in Cultured Cells. J. Cell Biol. 1990, 110, 743−752. (14) Pan, K.-M.; Baldwin, M.; Nguyen, J.; Gasset, M.; Serban, A.; Groth, D.; Mehlhorn, I.; Huang, Z.; Fletterick, R. J.; Cohen, F. E.; et al. Conversion of α-Helices Into β-Sheets Features in the Formation of the Scrapie Prion Proteins. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 10962−10966. (15) Riek, R.; Hornemann, S.; Wider, G.; Billeter, M.; Glockshuber, R.; Wüthrich, K. NMR Structure of the Mouse Prion Protein Domain PrP(121−231). Nature 1996, 382, 180−182. 
(16) Riek, R.; Hornemann, S.; Wider, G.; Glockshuber, R.; Wüthrich, K. NMR Characterization of the Full-Length Recombinant Murine Prion Protein, mPrP(23−231). FEBS Lett. 1997, 413, 282−288. (17) Donne, D. G.; Viles, J. H.; Groth, D.; Mehlhorn, I.; James, T. L.; Cohen, F. E.; Prusiner, S. B.; Wright, P. E.; Dyson, H. J. Structure of the Recombinant Full-Length Hamster Prion Protein PrP(29−231): The N Terminus Is Highly Flexible. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 13452−13457. (18) Zahn, R.; Liu, A.; Lührs, T.; Riek, R.; von Schroetter, C.; López García, F.; Billeter, M.; Calzolai, L.; Wider, G.; Wüthrich, K. NMR Solution Structure of the Human Prion Protein. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 145−150. (19) Caughey, B. W.; Dong, A.; Bhat, K. S.; Ernst, D.; Hayes, S. F.; Caughey, W. S. Secondary Structure Analysis of the Scrapie-Associated Protein PrP 27−30 in Water by Infrared Spectroscopy. Biochemistry 1991, 30, 7672−7680. (20) Gasset, M.; Baldwin, M. A.; Fletterick, R. J.; Prusiner, S. B. Perturbation of the Secondary Structure of the Scrapie Prion Protein Under Conditions That Alter Infectivity. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 1−5. (21) Dima, R. I.; Thirumalai, D. Probing the Instabilities in the Dynamics of Helical Fragments From Mouse PrPC. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 15335−15340. (22) De Simone, A.; Zagari, A.; Derreumaux, P. Structural and Hydration Properties of the Partially Unfolded States of the Prion Protein. Biophys. J. 2007, 93, 1284−1292. (23) Yamamoto, N.; Kuwata, K. Regulating the Conformation of Prion Protein Through Ligand Binding. J. Phys. Chem. B 2009, 113, 12853−12856. (24) Alonso, D. O. V.; DeArmond, S. J.; Cohen, F. E.; Daggett, V. Mapping the Early Steps in the pH-Induced Conformational G
dx.doi.org/10.1021/jp5034245 | J. Phys. Chem. B XXXX, XXX, XXX−XXX
The Journal of Physical Chemistry B
Article
Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29, 845−854. (46) Jorgensen, W. L.; Maxwell, D. S.; Tirado-Rives, J. Development and Testing of the OPLS All-Atom Force Field on Conformational Energetics and Properties of Organic Liquids. J. Am. Chem. Soc. 1996, 118, 11225−11236. (47) Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 2001, 105, 6474−6487. (48) Frishman, D.; Argos, P. Knowledge-Based Protein Secondary Structure Assignment. Proteins 1995, 23, 566−579. (49) Schölkopf, B.; Smola, A. J. Learning with Kernels; MIT Press: Cambridge, MA, 2002. (50) Jolliffe, I. T. Principal Component Analysis, 2nd ed.; Springer: New York, 2002. (51) Schölkopf, B.; Smola, A.; Müller, K.-R. Nonlinear Component Analysis as a Kernel Eigenvalue Problem. Neural Comput. 1998, 10, 1299−1319. (52) Kernel Methods in Computational Biology; Schölkopf, B.; Tsuda, K.; Vert, J.-P., Eds.; MIT Press: Cambridge, MA, 2004. (53) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.; Rosenberg, J. M. The Weighted Histogram Analysis Method for FreeEnergy Calculations on Biomolecules. I. The Method. J. Comput. Chem. 1992, 13, 1011−1021. (54) Dima, R. I.; Thirumalai, D. Exploring the Propensities of Helices in PrPC to Form β Sheet Using NMR Structures and Sequence Alignments. Biophys. J. 2002, 83, 1268−1280. (55) Kuwata, K.; Kamatari, Y.; Akasaka, K.; James, T. Slow Conformational Dynamics in the Hamster Prion Protein. Biochemistry 2004, 43, 4439−4446. (56) Kuwata, K.; Nishida, N.; Matsumoto, T.; Kamatari, Y. O.; Hosokawa-Muto, J.; Kodama, K.; Nakamura, H. K.; Kimura, K.; Kawasaki, M.; Takakura, Y.; et al. Hot Spots in Prion Protein for Pathogenic Conversion. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 11921−11926. (57) Cobb, N. J.; Apetri, A. C.; Surewicz, W. K. 
Prion Protein Amyloid Formation under Native-like Conditions Involves Refolding of the C-terminal α-Helical Domain. J. Biol. Chem. 2008, 283, 34704− 34711. (58) Palladino, P.; Ronga, L.; Benedetti, E.; Rossi, F.; Ragone, R. Peptide Fragment Approach to Prion Misfolding: The Alpha-2 Domain. Int. J. Pept. Res. Ther. 2009, 15, 165−176. (59) Minor, D. L., Jr.; Kim, P. S. Measurement of the β-SheetForming Propensities of Amino Acids. Nature 1994, 367, 660−663. (60) Smith, C. K.; Regan, L. Construction and Design of β-Sheets. Acc. Chem. Res. 1997, 30, 153−161. (61) Tizzano, B.; Palladino, P.; De Capua, A.; Marasco, D.; Rossi, F.; Benedetti, E.; Pedone, C.; Ragone, R.; Ruvo, M. The Human Prion Protein α2 Helix: A Thermodynamic Study of Its Conformational Preferences. Proteins 2005, 59, 72−79. (62) Ronga, L.; Palladino, P.; Saviano, G.; Tancredi, T.; Benedetti, E.; Ragone, R.; Rossi, F. Structural Characterization of a Neurotoxic Threonine-Rich Peptide Corresponding to the Human Prion Protein α2-Helical 180−195 Segment, and Comparison with Full-Length α2Helix-Derived Peptides. J. Pept. Sci. 2008, 14, 1096−1102. (63) Yamaguchi, K.; Matsumoto, T.; Kuwata, K. Critical Region for Amyloid Fibril Formation of Mouse Prion Protein: Unusual Amyloidogenic Properties of the Helix 2 Peptide. Biochemistry 2008, 47, 13242−13251. (64) Zhang, H.; Stöckel, J.; Mehlhorn, I.; Groth, D.; Baldwin, M. A.; Prusiner, S. B.; James, T. L.; Cohen, F. E. Physical Studies of Conformational Plasticity in a Recombinant Prion Protein. Biochemistry 1997, 36, 3543−3553. (65) De Simone, A.; Dodson, G. G.; Fraternali, F.; Zagari, A. Water Molecules as Structural Determinants Among Prions of Low Sequence Identity. FEBS Lett. 2006, 580, 2488−2494. (66) Tompa, P. Structure and Function of Intrinsically Disordered Proteins; CRC Press: Boca Raton, FL, 2009.
(67) Kuznetsov, I. B.; Rackovsky, S. On the Properties and Sequence Context of Structurally Ambivalent Fragments in Proteins. Protein Sci. 2003, 12, 2420−2433. (68) Ikeda, K.; Higo, J. Free-Energy Landscape of a Chameleon Sequence in Explicit Water and Its Inherent α/β Bifacial Property. Protein Sci. 2003, 12, 2542−2548.
dx.doi.org/10.1021/jp5034245 | J. Phys. Chem. B XXXX, XXX, XXX−XXX