Analysis of Structural Stability of Chignolin - The Journal of Physical

Mar 12, 2018 - We discuss the stability of an entire protein and the influence of main chains and side chains of individual amino acids to investigate...
0 downloads 3 Views 6MB Size
Subscriber access provided by Kaohsiung Medical University

B: Biophysical Chemistry and Biomolecules

Analysis of Structural Stability of Chignolin Yutaka Maruyama, and Ayori Mitsutake J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.8b00288 • Publication Date (Web): 12 Mar 2018 Downloaded from http://pubs.acs.org on March 12, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Analysis of Structural Stability of Chignolin Yutaka Maruyama† and Ayori Mitsutake∗,‡ †Co-Design Team, FLAGSHIP 2020 Project, RIKEN Advanced Institute for Computational Science, Kobe 650-0047, Japan ‡Department of Physics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan E-mail: [email protected]

Abstract We discuss the stability of an entire protein and the influence of main-chains and side-chains of the individual amino acids to investigate the protein folding mechanism. For the purpose, we calculated the solvation free energy contribution of individual atoms using the three-dimensional reference interaction site model with the atomic decomposition method. We generated structures of chignolin miniprotein by a molecular dynamics simulation and classified them into six types: native 1, native 2, misfolded 1, misfolded 2, intermediate, and unfolded states. The total energies of the native (−171.1 kcal/mol) and misfolded (−171.2 kcal/mol) states were almost the same and lower than those of the intermediate (−158.5 kcal/mol) and unfolded (−148.1 kcal/mol) states; however, their components were different. In the native state, the side-chain interaction between Thr6 and Thr8 is important for the formation of π-turn. On the other hand, the hydrogen bonds between the atoms of the main-chains in the misfolded state become stronger than that in the intermediate state.

1

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Introduction Proteins are polypeptides consisting of amino acids, and to function effectively, the protein chains must be properly folded. That is, the structural stability of protein is important for its biological function. Similarly, the mechanism of protein folding and its pathway is of interest in physical chemistry. When considering protein stability, the contribution of surrounding water needs to be considered because the competition between the conformational energy of protein and the solvation free energy (SFE) determines the structural stability of protein. 1–3 The presence of water affects protein stabilization and folding mechanism. Recently, we investigated the stability of protein by introducing solvent effect using the three-dimensional reference interaction site model (3D-RISM) theory 4–6 and the reference modified density functional theory (RMDFT). 3 The 3D-RISM, which is the statistical mechanics theory for molecular liquid, reproduces the distribution function of solvent around solute molecule and gives several physical quantities. Unfortunately, the Singer-Chandler formula, 7,8 which is generally used to calculate the SFE, shows a value higher than the actual value. 9 Then we proposed a new SFE functional, RMDFT, to obtain a more accurate estimation of the SFE. 3 For a large number of trajectories of four proteins obtained from folding simulations, 10 the total energy, which is given by the sum of the conformational energy of protein and the SFE, was calculated using 3D-RISM with RMDFT. 3 In the four proteins, the total energies of the native structures were the lowest. This indicates that the proteins fold to their native structures to attain stability. Since the conformational energy and SFE tend to be inversely correlated, a detailed analysis of solvation effect is necessary to show that the native structure is more stable than other structures. In the previous work, however, the stabilities of the entire structures of the proteins were investigated. 3 For protein folding, it is interesting to know the parts of the protein that stabilize upon folding and the differences between the native structures and the compact structures. Thus, estimating the stabilities of individual amino acids is important. A polypeptide can be divided into main-chains that determine the structure of protein and side-chains that determine 2

ACS Paragon Plus Environment

Page 2 of 44

Page 3 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

the nature of the amino acid. In a stable structure, the main-chain and side-chain contribute to structural stability by forming hydrogen bonds with other main-chains or side-chains, being exposed to the surrounding water, and packing with other atoms of the protein. The patterns of hydrogen bonds in the protein are as follows: (a) between main-chains (b) between main-chain and side-chain (c) between side-chains (d) between main-chain and water (e) between side-chain and water. The formation of these hydrogen bonds in the protein stabilizes the protein structure due to the decrease of conformational energy. On the other hand, intermolecular hydrogen bonds between protein and water stabilize the protein due to the decrease of SFE. Despite the importance of the energies, it is difficult to accurately estimate their contribution to protein stability. We would like to investigate the effects of main- and side-chain behaviors on the structural stability of protein. Therefore, it is necessary to obtain the contributions of main-chains and side-chains of individual amino acids to structural stability. Thus, the conformational energy and SFE of individual atoms need to be calculated. The conformational energy of protein consists of van der Waals, electrostatic, bond stretching, angle bending, and torsional energy terms. It is not difficult to split the contribution of each energy term into individual atoms. Because the van der Waals, electrostatic, and bond stretching energy terms are two-body potentials, the energy between two atoms can be divided equally between the two atoms; thus, the energy term is divided into contribution of individual atom. Although the process of dividing three- or four-body potential, such as angle bending and torsional energy terms, is not clear, they can be defined by dividing them equally between the atoms. Note that these are small compared to two-body potential terms, and thus, their effects are small as well. 1 On the other hand, two methods using the 3D-RISM theory 4–6 have been proposed to calculate the contribution of individual atoms’ or groups’ SFE. 11–15 One is a spatial decomposition analysis (SDA) method, which is proposed by Yamazaki and Kovalenko, to decompose the solvation thermodynamics quantities. 11,12 This method estimates the contribution of in-

3

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

dividual groups on the solute by using Voronoi tessellation. They applied the SDA method to the complexation of β-cyclodextrin and 1-adamantanecarboxylic acid in water to investigate the thermodynamic properties of the association. They also applied the SDA method to chignolin miniprotein 16 and its mutant (CLN025) 17 and examined the effects of mutation on structural change. 12 In addition, Aono et al. extended the method to a quantum chemical calculation with the 3D-RISM theory by adding a MP2 correction term. 18 The calculation of Voronoi tessellation is clear, and this method works well for relatively small molecules, where all the atoms or groups are in contact with the solvent. 11,12 However, the handling of atoms or groups buried inside a large protein is unclear in this method. The other is an atomic decomposition (AD) method based on Kirkwood charging formula proposed by Chong and Ham. 13–15 The formula of the method is exact, then the handling of atoms buried is clear. They applied this method to amyloid β proteins to investigate the molecular origin for the increase in hydrophobicity. They also identified the structural and thermodynamic characteristics of amyloidogenic intermediates. 19 We employed this method to calculate the SFE contribution of individual atoms in the present work. However, it is necessary to repeat 3D-RISM calculation to perform numerical integration, and the calculation cost of this method is high. Therefore, we reduce the calculation time by performing 3D-RISM calculation on graphic processing unit. 20 Chignolin 16 is an artificial protein consisting of 10 residues and is used as a test system of various new methods. 12,21–31 Chignolin is characterized by two stable states at room temperature at 1 atm in molecular dynamics (MD) simulations: a native state and a misfolded state. 27,29,32–35 The existence ratio of the misfolded and native states depends on the use of potential parameter. 36 One of authors performed a simulation of chignolin at a transition temperature and analyzed obtained conformations in details 37–39 by using relaxation mode analysis. 31 The sampled conformations were classified into the native, misfolded, intermediate, and unfolded states clearly. The native and misfolded states have a common β-turn structure from Asp3 to Thr6 but slightly different hydrogen bond patterns for the backbone

4

ACS Paragon Plus Environment

Page 4 of 44

Page 5 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

atoms. While the distributions of the dihedral angles from Tyr2 to Thr6 are similar in the native and misfolded states, those of Gly7 and Thr8 are different. Moreover, in the misfolded state, the side-chain of Trp9 is not stacked on Tyr2, unlike in the native state. Previous works 31,33 also have explained the folding process: chignolin folds to the native or misfolded structures through an intermediate structure, which has a common β-turn, from the extended structures. While the detailed folding process of chignolin has been investigated, the stabilities of the obtained states are not clear. In this study, we investigated the differences in stability mechanisms of the native, misfolded, intermediate, and unfolded states of chignolin by introducing the solvation effect. We generated the conformations in four states at room temperature by performing a MD simulation. We calculated the conformational energy and SFE of the generated conformations to investigate the stabilities of the states. We also applied the AD method to chignolin to investigate the contribution of main- and side-chains to its stability. From the detailed analysis of the structures (especially, the hydrogen bond formations) and energy decomposition of the main-chains and side-chains, we obtained knowledge about the important residues that stabilize the intermediate state (forming a β-turn), misfolded state, and native state.

Methods Free energy function of a protein in water Here, we define the total energy, G, as the sum of the conformational energy of the protein, E, and the SFE, ∆µ: 1–3,40 G = E + ∆µ.

(1)

The ∆µ term is divided into two terms: the solvation energy, Es , and the solvation entropy, ∆S: ∆µ = Es − T ∆S,

5

ACS Paragon Plus Environment

(2)

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 44

where T is the temperature. Thus, we define E + Es as the energy term and −T ∆S as the entropy term. Note that the conformational entropy of the protein is not treated directly by our definition of G. The effect on conformational entropy is implicitly included by averaging the energy terms over conformations obtained in each state by the MD simulation. 41 These components can be divided into the contributions of main- and side-chains.

E = EM + ES ,

(3)

∆µ = ∆µM + ∆µS .

(4)

and

Here, M and S indicate the main- and side-chains, respectively. Further, we can define a main-chain component of G, GM = E M + ∆µM ,

(5)

GS = E S + ∆µS .

(6)

and a side-chain component of G,

These are further divided into the contributions of individual amino acids, i:

E=

n ∑

(EiM + EiS ),

(7)

i

and

n ∑ S ∆µ = (∆µM i + ∆µi ),

(8)

i

where n is the number of amino acids in the protein. To investigate the thermodynamic M/S

properties of a protein, we compute E or Ei M/S

∆µ or ∆µi

directly from MD simulation and calculate

by applying the 3D-RISM theory to the simulated protein conformations.

Here, we consider a method for dividing the conformational energy (E), which consists 6

ACS Paragon Plus Environment

Page 7 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

of van der Waals, electrostatic, bond stretching, angle bending, and torsional energy terms. The van der Waals, electrostatic, and bond stretching energy terms are two-body potentials, and can be divided equally into two interacting atoms. On the other hand, since the angle bending or torsional energy terms are three- or four-body potentials, the method of dividing these energy terms is not clear. However, the angle bending and torsional energy terms have small standard deviation values compared to two-body potentials, and thus, their influences are small. 1 Hence, for three- or four-body potentials, we consider that the energy term is equally divided into the interacting atoms. The method of dividing SFE, ∆µ, is described in the next subsection.

Calculation of SFE using 3D-RISM theory In this study, we consider the SFE of not only the whole protein but also individual atoms. The AD of SFE can calculate such a property. 13–15 This method is based on the expression of the SFE in terms of solute-solvent interaction and the solvent distribution function around the solute. The interaction potential acting on the solvent site, γ, of position r is denoted by uγ (r). The distribution function of the solvent site, γ, around the solute is indicated by gγ (r)(= hγ (r) + 1), where hγ (r) is the total correlation function. The SFE is given by the Kirkwood charging formula: 42

∆µ =

∑ γ

∫ ργ



1



dr

0

∂uγ (r; λ) gγ (r; λ), ∂λ

(9)

where ργ denotes the average number density of solvent site, γ. The coupling parameter, λ, gradually switches on the interaction potential from λ = 0 (no interaction) to λ = 1 (full interaction). u(r; λ) varies according to λ, and g(r; λ) denotes the distribution function under u(r; λ). (We will show the calculation of g(r; λ) or h(r; λ) later.) ∆µ is the SFE of the whole solute, and thus we consider the decomposition of SFE into contributions from the individual atoms. uγ (r) is represented by a sum of the potentials between the solute atomic

7

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 44

site, α, and the solvent site, γ,

uγ (r) =

N ∑

uαγ (|r − rα |),

(10)

α=1

where rα is the position of the atomic site, α, and N is the number of atomic sites in the solute. Thus, we obtain the following basic expressions from eqs 9 and 10,

∆µ =

N ∑

(11)

∆µα ,

α=1

and ∆µα =

∑ γ

∫ ργ



1

dλ 0

dr

∂uαγ (|r − rα |; λ) gγ (r; λ). ∂λ

(12)

The most commonly used form of the solute-solvent interaction potential, uαγ (r), is represented by a sum of the Lennard-Jones (LJ) and Coulomb electrostatic terms,

elec uαγ (r) = uLJ αγ (r) + uαγ (r).

(13)

12 Here, uLJ − (σαγ /r)6 )] and uelec αγ (r) = 4ϵαγ [(σαγ /r) αγ (r) = qα qγ /r, where ϵαγ , σαγ , qα , and

qγ are the LJ parameters and atomic charges, respectively. When both LJ parameter and electrostatic potentials are present, it is necessary to treat them separately by introducing two coupling parameters, λ1 and λ2 , in the charging formula. The parameters can be chosen to scale the LJ parameter, σαγ , and the atomic charge in the solute, qα , respectively. Then the solute-solvent interaction potential is

elec uαγ (r; λ1 , λ2 ) = uLJ αγ (r; λ1 ) + uαγ (r; λ2 ),

where

[( uLJ αγ (r; λ1 ) = 4ϵαγ

σαγ λ1 r

8

)12

( −

σαγ λ1 r

ACS Paragon Plus Environment

(14)

)6 ] ,

(15)

Page 9 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

and uelec αγ (r; λ2 ) =

qα qγ λ2 . r

(16)

Thus, the integration path is as follows: first, LJ interaction is performed from 0 to 1 with λ2 = 0, and then, the electrostatic interaction is switched on by performing the integral of λ2 from 0 to 1 with λ1 = 1. Finally, ∆µα is expressed as: ∑

[∫



∂uαγ (|r − rα |; λ1 , λ2 = 0) gγ (r; λ1 , λ2 = 0) ∂λ1 0 γ ] ∫ 1 ∫ ∂uαγ (|r − rα |; λ1 = 1, λ2 ) + dλ2 dr gγ (r; λ1 = 1, λ2 ) . ∂λ2 0

∆µα =

ργ

1

dλ1

dr

(17)

After calculating ∆µα once, we could reproduce the contributions of amino acid residues or main-/side-chains. To calculate ∆µα , we require gγ (r; λ1 , λ2 ) or hγ (r; λ1 , λ2 ). We employed the 3D-RISM theory to obtain hγ (r; λ1 , λ2 ) under uγ (r; λ1 , λ2 ). For a solute-solvent system at infinite dilution, the 3D-RISM equation is written as

hγ (r; λ1 , λ2 ) =



[ ] cγ ′ (r; λ1 , λ2 ) ∗ wγvv′ γ (r) + ργ ′ hvv γ ′ γ (r) ,

(18)

γ′

where hγ (r; λ1 , λ2 ) and cγ (r; λ1 , λ2 ) are, respectively, the 3D total and direct correlation functions of the solvent site, γ, around the solute under λ1 and λ2 , the asterisk (*) denotes a convolution integral in real space, wγvv′ γ (r) is the site-site intramolecular correlation function of the solvent, and hvv γ ′ γ (r) is the site-site total correlation functions of pure solvent. The site-site correlation functions of the solvent were obtained in advance from the 1D-RISM theory for pure solvent. The 3D-RISM equation is complemented by the Kovalenko-Hirata

9

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 44

(KH) closure equation.    exp(χγ ) − 1 (χγ < 0) hγ (r; λ1 , λ2 ) =   χγ (χγ ≥ 0)

(19)

χγ = −βuγ (r; λ1 , λ2 ) + hγ (r; λ1 , λ2 ) − cγ (r; λ1 , λ2 ) Here, β = 1/(kB T ) is the inverse of the Boltzmann constant times temperature. Finally, the 3D distribution function, gγ (r; λ1 , λ2 ), is defined from hγ (r; λ1 , λ2 ) by

gγ (r; λ1 , λ2 ) = hγ (r; λ1 , λ2 ) + 1.

(20)

We calculated gγ (r; λ1 , λ2 ) at every integration step of eq 17.

Computational Details Various structures of chignolin were generated using a conventional MD simulation independently. The MD simulation was performed with the AMBER package (AMBER 11.0) 43 using ff99SB force field for the protein 44 and TIP3P model for water. 45 Chignolin consists of a 10-amino acid sequence: GYDPETGTWG. We generated an extended structure of chignolin using the leap command and solvated it with a 15-Å buffer of TIP3P water around the protein in each direction. The number of atoms of chignolin and water molecules were 138 and 10,941 (3,647 water molecules), respectively. Asp and Glu are negatively charged under neutral pH. Two sodium ions (Na+ ) are included in the system, resulting in a net neutral system. After solvating chignolin, we minimized the structure with Cα constrains by using steepest descent method (500 step) and conjugate gradient method (500 step). After minimizing the structure, we heated the system from 0.1 K to 298.15 K during 50 ps. After heating the system, we performed a 50-ps MD simulation at a constant pressure (1 atm) and 298.15 K with Cα constrains to adjust the density of the system. Finally, a 5600-ns MD simulation was performed following a 500-ps MD simulation at 1 atm and 298.15 K for 10

ACS Paragon Plus Environment

Page 11 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

equilibration. The time step for the simulation was set at 2 fs. The Langevin thermostat with a friction constant γ = 2.0 ps−1 was used for temperature control. The cutoff was 8 Å, which was used to limit the direct space sum for the particle mesh Ewald (PME) method of AMBER. For the equilibration and production run, pmemd command was used. For analysis, the coordinates were saved every 1 ns. The number of samples was 5,600. To identify the characteristic states, we applied principal component analysis (PCA) to the obtained structures by using heavy atoms as coordinates to distinguish the conformational changes of the side-chains. The number of degrees of freedom was 414. After removing the translational and rotational motions from the coordinates of the heavy atoms, PCA was carried out. In RISM calculations, we used the TIP3P model with an additional parameter (σ = 0.4 Å and ϵ = 0.046 kcal/mol) for water molecule. 46 The conventional 1D-RISM theory is used for the site-site correlation function, wγvv′ γ (r) and hvv γ ′ γ (r), of water. The number density of water molecules and the temperature were 0.033329 molecule/Å3 and 298.15 K, respectively. The 3D-RISM integral equations were solved for a grid of 1283 points in a cubic cell with a size of 64 Å3 using GPU. A grid space of 0.5 Å was sufficient to calculate the SFE without significant numerical errors. For each conformation generated by MD simulation, we can determine the interaction potential, uγ (r; λ1 , λ2 ), around the protein along the integration path for λ1 and λ2 . We used 0.05 as the increment width of λ1 and used the following numbers as λ2 ticks: [0.001, 0.005, 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.9, 1.0]. Herein, we refer to the amide nitrogen atom and carbonyl oxygen atom on the main-chain of the ith amino acid as XxxiN and XxxiO, respectively. Here, Xxx is the three-letter code of the ith amino acid. Moreover, we refer to the nitrogen atom and oxygen atom on the side-chain of the ith amino acid as XxxiNs and XxxiOs , respectively. In addition, we refer to the hydrogen bond between A atom and B atom as H(A-B).

11

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Results and Discussion Simulation results and classification of chignolin structures The time series of the root mean square distance (RMSD) of heavy atoms from a native structure is shown in Figure 1. Here, the native structure is the first coordinate of 1UAO.pdb. RMSD was calculated after fitting the obtained structures to the native structure. Previous simulations have shown that chignolin has two characteristic structures (native and misfolded structures) at room temperature under 1 atm. 27,29,31–36,47 In this trajectory, the first-time transition from the unfolded state to the native state (RMSD ≈ 2 Å) occurred around 100 ns, and the second-time transition from the native state to the misfolded state (RMSD ≈ 4 Å) occurred around 1,500 ns. At this time, once it changed to about RMSD ≈ 5 Å, it settled in the misfolded state. The simulation is suitable for calculating energy profiles of several structures because it contains both the native and misfolded structures. We defined the native, misfolded, intermediate, and unfolded states as follows. First, we calculated the free energy surface along the first and second PC axes shown in Figure 2a. In this analysis, we performed PCA of heavy atoms and then the differences in the structures of the side-chains were extracted. We observed four local minimum-energy states; two of them had similar hydrogen bonds in the native state and the other two had similar hydrogen bonds in the misfolded state. Therefore, we referred to the four states as native 1 (n1), native 2 (n2), misfolded 1 (m1), and misfolded 2 (m2), as shown in Figure 2a. In previous works, 27,32 the distance between Asp3N and Gly7O (d(Asp3N-Gly7O)) and that between Asp3N and Thr8O (d(Asp3N-Thr8O)) were used to classify the two states. We calculated d(Asp3N-Gly7O) and d(Asp3N-Thr8O) to remove the native and misfolded states and then classified the intermediate and unfolded states. In Figure 2b, the distributions of the two distances are shown. The samples of native and misfolded states are located at the lower left and inside the dashed line, as shown in Figure 2b. The intermediate and unfolded states are defined by a region above the dashed line. The classification between the intermediate and

12

ACS Paragon Plus Environment

Page 12 of 44

Page 13 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

unfolded states is based on the dihedral angle of Pro4 (see Figure S2 of the supplementary material for Ref. 31 ). The number of structures in n1, n2, m1, m2, intermediate, and unfolded states are 495, 412, 2721, 380, 112, and 13, respectively. The average RMSD of heavy atoms of the n1, n2, m1, m2, intermediate, and unfolded states from the native structure were 1.94, 2.25, 3.69, 3.96, 3.47, and 6.79 Å, respectively. The two native states, n1 and n2, have lower RMSD values. Herein, we explain the characteristic structures of the six states in detail to examine the relation between the structures and the energy terms. Figure 3 shows the characteristic structures of all the states. In Figure 3(a)–(d), the red region indicates a side-chain of Tyr2, the blue region indicates that of Trp9, and the orange region indicates those of Thr6 and Thr8. In the intermediate state (Figure 3(e)), the green portion indicates a β-turn. The native (n1 and n2) and misfolded (m1 and m2) structures appear as a common β-hairpin-like structure from Asp3 to Thr6. As seen from Figure 3a,b, the shapes of the main-chain in n1 and n2 are almost the same. The difference between n1 and n2 is only in the orientation of the Trp9 side-chain. The different arrangement of the side-chain was determined by performing PCA for the heavy atoms. As seen from Figure 3c,d, the difference between m1 and m2 is also due to the arrangement of the Trp9 side-chain. The native and misfolded states have a common β-turn structure from Asp3 to Thr6 but slightly different hydrogen bond patterns for the backbone atoms, as explained in the next paragraph. The differences between the native and misfolded states except for the different hydrogen bonds were owing to the arrangements of the side-chains of Thr8 and Trp9 because the value of ψ of the dihedral angle of Gly7 between the two states was different. 31 The side-chains of Tyr2 and Trp9 in n1 and n2 are in contact with each other, while these are located at opposite sides in m1 and m2. Similarly, the side-chains of Thr6 and Thr8 in n1 and n2 are located at the same side, while those of m1 and m2 are located at opposite sides. For the intermediate state, even with an expanded shape as shown in Figure 3e, it forms the characteristic β-turn in the part from Asp3 to Thr6, which is a common structure of the native and misfolded states.

13

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

This indicates that through the intermediated state, chignolin becomes compact native or misfolded states (see Ref. 31 ). The unfolded state (Figure 3f), however, is fully extended, and thus has no β-turn structure. The turns and hydrogen bond patterns of the native, misfolded, and intermediate states are schematically illustrated in Figure 4. In general, there are several turn structures: βturns, α-turns, and π-turns. 48–51 The β-turns, α-turns, or π-turns form hydrogen bonds between the backbone atoms of the ith residue and the i + 3th, i + 4th, or i + 5th residue, respectively. A part of an α-turn or a π-turn corresponds to a β-turn structure. From the dihedral angles (see Figure S2 of the supplementary material for Ref. 31 ), the native state seems to have a π-turn from Asp3 to Thr8, while the misfolded state seems to have an α-turn (I-αRS turn in Ref. 49 ), both of which includes a β-turn from Asp3 to Thr6. 50 Moreover, both the native state and misfolded state are mainly stabilized by hydrogen bonds, as listed in Table 1 and shown in Figure 4a and b, respectively. Further, both the native and misfolded states have similar side-chain hydrogen bonds of Asp3Os . However, the hydrogen bonds for the main-chains are different: H(Asp3N-Thr8O) and H(Gly1O-Gly10N) in the native state and H(Asp3N-Gly7O) and H(Glu1O-Thr9N) in the misfolded state. Thr6Os forms hydrogen bond with Thr8Os in n1 and n2, while it is not observed in m1 and m2. In the intermediate state, the turn from Asp3 to Thr6 is similar to that in the native and misfolded states because of the similar distributions of the dihedral angles from Asp3 to Glu5 (see Figure S2 of the supplementary material for Ref. 31 ). The intermediate state has characteristic hydrogen bonds, as listed in Table 1 and shown in Figure 4c. Asp3Os tends to form hydrogen bonds with Glu5N, Thr6N, and/or Thr6Os , while Asp3O sometimes forms hydrogen bond with Thr6N or Gly7N. These hydrogen bond patterns are similar to those of the misfolded state. We calculated the SFEs of these structures using the 3D-RISM theory. In Figure 5, we show the distributions of water atoms of the typical structures of the six states; the composition is the same as in Figure 3. Comparison of n1 and n2 reveals that the solvation structures are similar, except for a difference in the vicinity of the contacted Tyr2 and Trp9

14

ACS Paragon Plus Environment

Page 14 of 44

Page 15 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

in Figure 5a,b. On the other hand, the difference between the side-chain orientations of m1 and m2 greatly influences the solvation structures of the terminal part (Figure 5c and d, respectively). In the intermediate and unfolded states (Figure 5e and f, respectively), chignolin forms more hydrogen bonds with the surrounding water molecules.

Stability of chignolin We defined the six characteristic states, i.e., n1, n2, m1, m2, intermediate, and unfolded states, in the previous section. Here, we discuss the stability to investigate the average values of the total energy given by the sum of the conformational energy and SFE. We calculated the SFE by the 3D-RISM theory, and present the average values of the total energy, G, conformational energy, E, and SFE, ∆µ, of each state in Table 2. The G values of the native (−171.1 kcal/mol) and misfolded (−171.2 kcal/mol) states (i.e., n1, n2, m1, and m2) are lower than those of the intermediate (−158.5 kcal/mol) and unfolded (−148.1 kcal/mol) states. The native and misfolded states have similar stabilities and are more stable than the intermediate and unfolded states. The G value of the intermediate state is between that of the compact states (n1, n2, m1, and m2) and the unfolded state. While the G values of the misfolded states are slightly lower than those of the native states, the total energy differences between the native and misfolded states are within the error range. This is possibly caused by the force field dependence (amber99sb force field). In previous simulations, both the native and misfolded states were observed as stable conformations. (The results are in agreement with those of Ref. 36 ) The results support that G is an appropriate energy function for the estimation of protein stability. Note that there are two types of chignolin: the original chignolin and CLN025 that contains mutations of the two terminal amino acids (G1Y and G10Y). In a previous simulation of CLN025 at a transition temperature, 10 the structure such as the misfolded state of the original chignolin was not observed. In our previous results, 3 only the native structure of CLN025 had a lower total energy, which was calculated by a similar approach to that em15

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ployed in the present work. The differences in the stabilities of original chignolin and CLN025 can also be explained with the energy function. In Table 2, a clear inverse correlation between conformational energy, E, and SFE, ∆µ, is observed. The compact states (native and misfolded states) have lower conformational energy than extended states (intermediate and unfolded states). The extended states have lower the SFE than the compact states. The results are similar to those of Refs. 1–3,41 This is because the intermediate and unfolded states are extended and form intermolecular hydrogen bonds between the protein and water surrounding the protein, whereas the native and misfolded states are compact and form intramolecular hydrogen bonds in the protein. Note that the intermediate state seems to have more hydrogen bonds between the atoms of protein than the unfolded state does because its conformational energy is lower than that of the unfolded state. There seems to be a balance between the intramolecular hydrogen bonds in protein and the intermolecular hydrogen bonds between protein and water. Because the conformational energy difference (≈ 132 kcal/mol) is reduced by the inversely correlated SFE difference (≈ 109 kcal/mol), the total energy difference becomes to be small (≈ 23 kcal/mol). Therefore, the SFE renders the free energy surface smooth. 41 For investigating the stability between the compact structures and extended structures, the conformational energy can be one of the criteria for stability tendency. However, we need to consider both the conformational energy and SFE in order to compare the stability between the compact structures (i.e., n1, n2, m1, and m2). The mechanisms of structural stability of the native and misfolded states are different from each other: for the native states, the average values of conformational energy (−29.1 and −23.1 kcal/mol) are lower than those of the misfolded states (−12.8 and −8.0 kcal/mol). On the other hand, the SFE enhances the stability of misfolded states (−142.0 and −148.0 kcal/mol for native states and −158.4 and −163.2 kcal/mol for misfolded states). Consequently, the native and misfolded states have similar stabilization. For side-chain effects, we remark that n1 and n2, which have the same hydrogen bonds between the backbone atoms and different side-chain arrangements of

16

ACS Paragon Plus Environment

Page 16 of 44

Page 17 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Trp9, have the same total energy, and m1 and m2 demonstrate a similar tendency. Because the conformational energy and SFE of the exposed side-chains cancel out, the side-chain arrangement of Trp9 exposed to the solvent does not affect the stability difference between the compact states of chignolin. Further, we divided the contribution of SFE into solvation energy, Es , and solvation entropy, −T ∆S. n1 and the unfolded states have the highest and lowest Es (−382.1 and −512.3 kcal/mol), respectively listed in Table 3. This is because the unfolded states are extended and form intermolecular hydrogen bonds between the protein and water surrounding the protein, whereas the native states are compact and form intramolecular hydrogen bonds in the protein, as mentioned before. To investigate the contribution of the energy components of G, we calculated the sum of the average values of conformational energy, E, and solvation energy, Es . The E + Es values of the six states were similar to each other (−413.0 ∼ −409.8 kcal/mol), and E and Es cancelled out because of the competition of intramolecular and intermolecular hydrogen bonds. The driving force for the change from the unfolded state (extended conformation) to the compact states (i.e., n1, n2, m1, and m2) seems to be the contribution of solvation entropy, −T ∆S. The compact states have lower −T ∆S values (240.1, 240.5, 241.4, and 241.8 kcal/mol) compared to the intermediate and unfolded states (249.3 and 261.0 kcal/mol). The results are in agreement with those of previous studies. 1–3 Nevertheless, there are small differences between the solvation entropies of the compact structures. We need to consider the energy terms to compare the compact structures. We examined the relation between solvation entropy and partial molar volume (PMV). Usually, the value of −T ∆S decreases as the volume decreases. In the present study, for the native, intermediate, and unfolded states, we observed this tendency by comparing the PMV (740.2 and 739.4 for native states, 745.1 for intermediate state, and 748.2 cc/mol for unfolded state) and −T ∆S (240.1 and 240.5 for native states, 249.3 for intermediate state, and 261.0 kcal/mol for unfolded state). However, the native and misfolded states

17

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

have negative correlation. While the misfolded states have large −T ∆S (241.4 and 241.8 kcal/mol) compared to the native states (240.1 and 240.5 kcal/mol), the misfolded states have smaller PMVs (737.5 and 735.2 cc/mol) compared to the native states (740.2 and 739.4 cc/mol), which implied that the conformational transition from native to misfolded states may have occurred under high pressure. 47 Table 4 lists the total energy and its energy components for the main- and side-chains of the six states. From conformational energy, it was found that the native states had more hydrogen bonds in the main-chains as compared to other states. However, when the solvation effect was included, the total energies of the main-chain for the misfolded states (−116.1 and −115.2 kcal/mol) were the most stable as compared to those of the other states. For the side-chain, while m1 had the lowest E value (25.6 kcal/mol), G of m1 (−55.1 kcal/mol) was less stable than that of n1, n2, and m2 because of the SFE. The number of hydrogen bonds is not necessarily positively correlated with protein stability, and it is necessary to consider the solvent effect. Among the compact structures, the native states or the misfolded states were more stabilized by the side-chains (−57.7 and −57.8 kcal/mol for native states) or the main-chains (−116.1 and −115.2 kcal/mol for misfolded states), respectively.

Energy contribution of each residue In this subsection, we investigate the energy stability of each residue of chignolin in detail. The differences between the average total energy (∆G), average conformational energy (∆E), and average SFE (∆∆µ) of the main- and side-chains of each residue of the two states were calculated. The difference in average conformational energies may be affected by the difference in the structures of the two states. The intramolecular hydrogen bond in protein contributes to the conformational energy, while the intermolecular hydrogen bond between protein and water contributes to the SFE. Thus, competition between conformational energy and SFE may be observed. First, we compared the energy contribution between the native states (n1 and n2) and 18

ACS Paragon Plus Environment

Page 18 of 44

Page 19 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

between the misfolded states (m1 and m2), as shown in Figures 6a and 6b, respectively, to investigate the effect of stability on the side-chain of Trp9. n1 and n2 (and m1 and m2) have similar hydrogen bonds between the backbone atoms and slightly different side-chain arrangements, especially, Trp9 (see Figure 3a–d). Figure 6a shows the energy difference from n1 to n2, where the positive value indicates that n1 is stable. The conformational energy differences between the main-chains of n1 and n2 are small except for in Gly1. This indicates that the backbone structures are similar to each other except for Gly1 (see Figure 3a,b). The structural arrangement of the side-chain of Trp9 slightly affects the conformational energy of the side-chain. Even the structures of Gly1 and Trp9 are different; the differences between the total energies of both the residues are small due to the balance between the conformational energy and SFE, and both the states (n1 and n2) can be observed. Figure 6b shows the energy difference between the misfolded states (m1 and m2), where the positive value indicates that m1 is stable. It is observed that the energy changes are larger than those between the native states. This indicates that the structural variety of misfolded states is larger than that of the native states. The backbone structures of m1 and m2 are similar to each other except for Gly10 (see Figure 3c,d). In addition, conformational energy differences are observed for the side-chain of Trp9 because their structures for m1 and m2 are different. For m1, Trp9Ns tends to form a hydrogen bond with Tyr2O (H(Tyr2OTrp9Ns )) (see Figure 4b), while m2 does not show such interaction. The structural difference is the reason for the conformational energy difference between the side-chain of m1 and m2, as shown in Table 4. However, the large change in the conformational energy of the sidechain of Trp9 cancels out that of the SFE; thus, the stability does not change significantly. This implies that the side-chain of Trp9 can be located at the positions of the two states, which appear in Figure 2a, in the misfolded state. As seen in Figure 6a,b, in the native states, the conformational energy change in the C-terminal (Gly10) is slightly small because the structures of n1 and n2 are restricted due to H(Gly1O-Gly10H) and the hydrophobic side-chain, Trp9, which is in contact with Tyr2. The larger conformational energy changes

19

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

at C-terminal (Trp9 and Gly10) compared to the native states indicate that the fluctuation of Gly10 is large in the misfolded states. In the native states or in the misfolded states, the terminal residues and the side-chain of Trp9 can be fluctuated because the structural changes in these parts do not affect the stability of the protein due to the cancelation of conformational energy and SFE. Next, we investigated the energy differences between the structures such as the native, misfolded, intermediate, and unfolded states. Hereafter, we treat n1 as the native state and m1 as the misfolded state. Figure 7a shows the energy difference from n1 to m1, where the positive value indicates that n1 is stable. The energy difference is larger than that between the native states or the misfolded states. The differences between the conformational energy of the main-chain from Tyr2 to Thr6 are small because the dihedral angles from Tyr2 to Thr6 are similar in the native and misfolded states (see Figure S2 of the supplementary material for Ref. 31 ). The decrease in conformational energy of the main-chain for Thr8 is due to the hydrogen bond between Asp3 and Thr8 in the native state. On the other hand, the decrease in conformational energy of the main-chain for Trp9 is due to the hydrogen bond between Gly1 and Trp9 in the misfolded states. Both Gly1 and Gly10 of the native state decrease the conformational energy. This indicates that the native state forms hydrogen bond between the N- and C-terminuses; however, the misfolded state is hydrated. For the terminal structures, the total energy difference is small, and thus, the contribution of structural stability difference is small. For side-chains, the native state tends to form two hydrogen bonds, H(Asp3Oδ -Thr6Oγ ) and H(Thr6Oγ -Thr8Oγ ), while the misfolded state forms H(Asp3Oδ -Thr6Oγ ). Because of the hydrogen bonds, a decrease in conformational energy for the side-chains of Thr6 and Thr8 in the native state and an increase in that for Asp3 were observed. The decrease in conformational energy of the side-chain of Trp9 is due to H(Tyr2O-Trp9Ns ) in the misfolded state. These large conformational energy changes are reduced by the SFE, especially in both the terminals. Consequently, it seems that the difference in total stability is caused by the conformational energy of the side-chains. The

20

ACS Paragon Plus Environment

Page 20 of 44

Page 21 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

native state is more stabilized by the hydrogen bonds of the side-chains of Thr6 and Thr8 and destabilized by the hydrogen bonds of the side-chains of Asp3 and Trp9 as compared to the misfolded state. Figure 7b shows the energy differences between the intermediate and native states, where the positive value indicates that the native state is stable. From Figure 4, Table 1, and Figure 7b, the following can be explained. The differences between the conformational energies of the main-chains from Asp3 to Thr6 are small because both the structures form the turn from Asp3 to Thr6. Thr6 in the turn is more stable in the native state than in the intermediate state. The conformational energy gain for the main-chains of Gly7 and Thr8 are due to the hydrogen bonds between the atoms of the main-chains, H(Asp3O-Gly7N) or H(Asp3O-Thr8N), that for the main-chains of Gly1 and Gly10 are due to the hydrogen bond, H(Gly1O-Gly10N), and that for the side-chains of Thr6 and Thr8 are due to the hydrogen bonds between the atoms of the side-chains, H(Asp3Os-Thr6Os) and H(Thr6Os-Thr8Os). Despite lacking a characteristic hydrogen bond in the side-chain of Tyr2, the conformational energy is more stabilized. The reason seems to be the packing effect of Tyr2. While the conformational stability almost cancels out due to the solvation effect, it slightly contributes to the stability of the native state. As a result, Gly1, Tyr2, Thr6, Gly7, and Thr8 are more stable in the native state than in the intermediate state. Figure 7c shows the energy differences between the intermediate and misfolded states, where the positive value indicates that the misfolded state is stable. From Figure 4, Table 1, and Figure 7c, the following can be explained. The differences between the conformational energies of the main-chains from Pro4 to Thr6 are small because of turn formation from Asp 3 to Thr6. While Thr6 is more stable in the native state than in the intermediate state, Asp3 in the misfolded state is more stable. For Trp9, a hydrogen bond between Tyr2O and Trp9Ns can form because the terminuses contact each other. The conformational energy of the main-chain of Trp9 is stabilized through hydrogen bonding with Gly1. The solvation effects reduce the conformational stability. As a result, Gly1, Tyr2, Asp3, Gly7, and Trp9

21

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

are stable in the misfolded state rather than in the intermediate state. Figure 7d shows the energy differences between the unfolded and native states. First, it is characteristic that ∆G is positive except for Asp3. Over almost the entire amino acid sequence, the native state has energy gains over the unfolded one. The native state only loses energy at Asp3; however, this is similar as compared to the misfolded and intermediate states (see Figure 7a,b). Particularly, energy changes are observed for β-turn, which indicates that the β-turn structure vanishes in the unfolded state. Thr6 is mostly stabilized because of turn formation and H(Asp3Oδ -Thr6Oγ ) and H(Thr6Oγ -Thr8Oγ ). The energy difference between the unfolded and misfolded state is shown in Figure 7e. The profile is similar to that of Figure 7d, but the misfolded state is stable even at Asp3. There is a small energy difference at Thr8 because the side-chain of Thr8 is exposed to water. Trp9 is mostly stabilized because of H(Gly1O-Trp9N) and H(Tyr2O-Trp9Ns ), and the energy differences between the terminal parts (Gly1 and Gly10) are smaller than those seen in Figure 7d. Figure 7f shows the energy differences between the unfolded and intermediate states. The conformational energies of the main-chains of Asp3, Thr6, and Gly7 are stabilized due to the hydrogen bonds, H(Asp3O-Thr6N) and H(Asp3O-Thr7N), and the conformational energies of the side-chains of Asp3 and Thr6 are stabilized due to the hydrogen bond between Asp3Os and Thr6Os. Pro4 in the intermediate state is more stable than in the unfolded state. The structural difference between the intermediate and unfolded states is the dihedral angle of Pro4 (see Figure S2 of the supplementary material for Ref. 31 ). Beside of no characteristic hydrogen bond for Trp9, the conformational energy of the main-chain of Trp9 is stabilized in the intermediate state. This is because the intermediate state becomes more compact due to turn formation, and thus, the side-chain of Trp9 is packed with the other atoms of chignolin. In addition, the solvation effects reduce the stability of the conformational energy. Consequently, the turn from Asp3 to Thr6 and the side-chain of Trp9 stabilize the intermediate state.

22

ACS Paragon Plus Environment

Page 22 of 44

Page 23 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Energy stabilization during the folding process (turn formations) In order to investigate the stabilization mechanism of each amino acid during the process of folding, the differences between the total energies of the main-chains and side-chains of the unfolded and intermediate states, the intermediate and native states, and the intermediate and misfolded states are shown in Figure 8. Herein, we focus on the types of hydrogen bonds between the main-chains, between the main-chain and side-chain, and between the side-chains. In an early stage of chignolin folding, the turn from Asp3 to Thr6 is formed. Turn formation is important for the folding process. β-turns are defined by four residues at positions designated i to i + 3. There are characteristic dihedral angles, i + 1 and i + 2, in β-turns. Based on the dihedral angles i+1 and i+2, the obtained turn corresponds to type-I β-turn from Asp3 to Thr6. In type-I β-turn, Pro residue predominates at position i + 1 and Asp, Asn, Ser, and Cys residues frequently occur at position i, where their side-chains often form a hydrogen bond with the amide nitrogen atom of the residue i + 2. 52 In chignolin, Asp3 and Pro4 correspond to the i and i + 1 positions of type-I β-turn, respectively. It has been found that for turn formation from the unfolded state, the hydrogen bonds between the side-chain of Asp3 (Asp3Os ) and the main-chains of Glu5 and Thr6 (H(Asp3Os -Glu5N) and H(Asp3Os -Thr6N)) and the side-chain of Thr6Os (H(Asp3Os -Thr6Os )), as shown in Figure 4, appear important. The characteristic hydrogen bond of type-I β-turn is H(Asp3Os Glu5N). The hydrogen bonds of the main-chain of Asp3 (H(Asp3O-Thr6N) and H(Asp3OGly7N)) also seem to be important for turn formation. Asp3 is one of the important residues for the formation of β-turn that stabilizes Glu5, Thr6, and Gly7. Because of the competition between the conformational energy and the SFE, the stability due to a hydrogen bond with solvation effect is considered to be about 0.5–1.5 kcal/mol (see Figure 8a). Pro4 is more stabilized in the intermediate state than in the unfolded state. As mentioned earlier, the dihedral angles of Pro4 in the intermediate state and in the unfolded state are different. The angle can help distinguish the intermediate and misfolded states. Pro4 seems to be stabilized by taking the dihedral angle corresponding to the intermediate state. The dihedral angle 23

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

of Pro4 also corresponds to that of a β-turn. This seems to be one of the reasons for the preferential location of Pro4 at a turn. In addition, it can be seen that the side-chain of Trp9 contributes to the stability of the intermediate state of chignolin because the turn structure is more compact than in the unfolded state, and thus, the residue contacts the other atoms of chignolin. Pro4 and the charged residue, Asp3, which forms hydrogen bonds between the main-chain and side-chain, seem to stabilize turn formation. For Trp9, the SFE did not cancel out the conformational stability (see Figure 7f) and thus, the residue seems to be stable due to the packing effect. The native state forms a π-turn, which includes the turn from Asp3 to Thr6. π-turns consist of six residues from i to i + 5 and have a hydrogen bond between the oxygen atom of the first residue and the amide nitrogen atom of the sixth (i–i + 5). π-turns have four characteristic dihedral angles from i + 1 to i + 4. 50 As observed from the dihedral angles of Pro4 to Gly7, the native state seems to form a π-turn. Note that the distribution of the dihedral angle of Gly7 is located in a local area even though glycine is usually fluctuated and its dihedral angle distributes widely. 31,52 Thr6 and Thr8, which are located at the same side, form hydrogen bonds between the side-chains, H(Thr6Os -Thr8Os ). The hydrogen bond seems to fix the structure of Gly7. Thr8 seems to be important to the stabilization of the π-turn. The packing of Tyr2 and Trp9 also contributes to the stability. The gain in total energy for Gly1 is large (3 kcal/mol), which is possibly due to the hydrogen bond of the main-chains of Gly1 and Gly10, in which the effects of packing of Gly1 and Gly10 are also included. Overall, during the formation of the native state from the turn state, the characteristic stabilization seems to be due to the side-chains. The misfolded state forms α-turn, which also includes the turn from Asp3 to Thr6. αturns consist of five residues from i to i + 4 and have a hydrogen bond between the oxygen atom of the first residue and the amide nitrogen atom of the fifth (i–i + 4). α-turns have three characteristic dihedral angles from i + 1 to i + 3. From the dihedral angles of Pro4 to Thr6, the misfolded state seems to form an α-turn (I-αRS turn). 49 The hydrogen bonds of the

24

ACS Paragon Plus Environment

Page 24 of 44

Page 25 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

misfolded state are more stabilized than those of the intermediate state. Gly1 exhibit similar tendency for the native state. The hydrogen bond, H(Tyr2O-Trp9Ns ), also contributes to stabilization. Overall, during the formation of the misfolded state from the turn state, the main contribution to stabilization seems to arise from the formation of hydrogen bonds in the main-chains. Here, we discuss the structural difference between original chignolin and its mutant, CLN025, which has mutations of G1Y and G10Y. The backbone structure of CLN025 is similar to the native structure of original chignolin. 16,17 While the native and misfolded states of original chignolin has been observed in several MD simulations, the misfolded state of CLN025 is not observed. 10,53 This implies that original chignolin has some stable structures and CLN025 has a unique stable structure. The present results reveal that residues from Asp3 to Thr6 and Trp9 are important for turn formation, which is observed during protein folding to the native or misfolded states. Moreover, for the native state, Thr8 seems to be important for π-turn formation due to the hydrogen bond of the side-chain. In addition, Tyr2 is important due to the packing effect. The contribution of the stabilities of the terminal parts in the native and misfolded states are similar to each other. Gly1 and Gly10 may be changed without loss of stability of the π-turn. As can be seen from the native and misfolded structures shown in Figure 4d,e, the side-chain atoms of Gly1 are positioned toward the exterior and interior of the sheet for the native state and the misfolded state, respectively. This is because of the arrangements of the side-chains from Thr8 are different due to the variations in the dihedral angles from Gly7. Thus, the mutation of Gly1 to the residue with a large side-chain affects the relative stability of the native and misfolded states. It is difficult for the misfolded state to retain its structure because of the overlaps of the sidechain and Trp9 due to the mutation of Gly1. The native state is more stabilized compared to the misfolded state in the mutation. This seems to be one of the reasons that only the native structure of CLN025 is stable. In addition, the mutation of Thr8 may be effective for changing the relative stabilization of the misfolded and native states. The π-turn of the

25

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

native state seems to be stabilized by the hydrogen bond, H(Thr6Os -Thr8Os ). Thr8 is not related to the stabilization of the misfolded state. Thus, the mutation of Thr8 to a neutral amino acid may affect the relative stability between the misfolded state and the native state; the misfolded state becomes more stable than the native state.

Conclusion Structural stability of protein is of interest for understanding the protein folding and unfolding mechanisms. Not only the conformational energy of protein but also the influence of hydration around the protein is important for its stability. The conformational energy and SFE compete with each other because the gains in conformational energy and SFE are mainly due to the intramolecular hydrogen bonds in the protein and intermolecular hydrogen bonds between protein and water, respectively. In this study, we investigated the stability of chignolin miniprotein using the sum of the conformational energy and SFE as a total energy function. The SFE was calculated using 3D-RISM with AD method. The average values of the total energies of the native and misfolded states are almost the same and lower than those of the intermediate and unfolded states. Therefore, the native and misfolded states have similar stabilities and are more stable compared to the intermediate and unfolded states. The results reveal that the total energy is an appropriate energy function to estimate the stability of protein. Further, an inverse correlation between the conformational energy and SFE was evident. This is because the intermediate and unfolded states were extended and form intermolecular hydrogen bonds between the protein and the water surrounding the protein, whereas the native and misfolded states are compact and form intramolecular hydrogen bonds in the protein. The native and misfolded states were stabilized by the conformational energy, whereas the intermediate and unfolded states were stabilized by the SFE. The conformational energy slightly contributes to stabilize the structures of protein. From the decomposition of the total energy into energy and solvation

26

ACS Paragon Plus Environment

Page 26 of 44

Page 27 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

entropy effects, the driving force for the change from unfolded structures to compact structures seems to be the contribution of solvation entropy. However, detailed energy analysis is required to compare the stability between compact structures. Furthermore, while the total energies of the native and misfolded states were almost the same, the energy components were different: the native state had lower conformational energy and higher SFE compared to the misfolded state. The competition renders the total energy almost equal to that of the misfolded state. Furthermore, the energy profiles of the main- and side-chains of each residue were calculated using the AD method. The conformational energy difference was due to the conformational change and especially caused by hydrogen bonds and packing effects of the atoms of protein. Because of the competition between conformational energy and SFE, the stability of a hydrogen bond between the atoms of protein ranges from 0.5 kcal/mol to 1.5 kcal/mol. Examination of the characteristic structures of the states and energy decompositions showed that the hydrogen bonds of the main- and side-chain of Asp3 and the packing effect of Trp9 stabilize the turn from Asp3 to Thr6 during folding to the intermediate state from the unfolded state. For the formation of π-turn in the native state, side-chain interaction between Thr6 and Thr8 seems to be important. In the misfolded state, the hydrogen bonds between the atoms of the main-chains seem to become stronger than that in the intermediate state. The analysis reveals important residues in each state; especially, the key residues for turn formation were examined. Finally, it is important to include the effect of solvent to investigate the stability of protein. The total energy function is appropriate for investigating the stability of chignolin protein. It is useful to classify the structures of protein using PCA or relaxation mode analysis 31,37–39,54 and to investigate the difference in structures in the energy profile of each residue. In future works, we will apply this strategy to investigate the effect of hydration on α-helix or β-sheet formations on the folding processes of large proteins.

27

ACS Paragon Plus Environment

The Journal of Physical Chemistry

Figures 10

RMSD [Å]

8

6

4

2

0 0

1000

2000

3000

4000

5000

6000

Time [ns]

Figure 1: Time series of RMSD from NMR structure. 18

Intermediate Unfolded Native Misfolded

d(Asp3N-Gly8O) [Å]

16

PC2 (Å)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 28 of 44

14 12

Misfolded

10 8 6 4

Native

(a)

2

PC1 (Å)

(b)

2

4

6

8

10

12

14

d(Asp3N-Gly7O) [Å]

16

18

Figure 2: Free-energy surface for first and second PC modes (a) and distributions of classified structures along the distances between Asp3N and Thr8O (d(Asp3N-Thr8O)) and between Asp3N and Gly7O (d(Asp3N-Gly7O)) (b).

28

ACS Paragon Plus Environment

Page 29 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Figure 3: Various chignolin structures: native structures (n1 (a) and n2 (b)), misfolded structures (m1 (c) and m2 (d)), intermediate structure (e), and unfolded structure (f). Red and blue regions indicate the side-chains of Tyr2 and Trp9, respectively. Orange regions indicate the side-chains of Thr6 and Thr8. Green region is a specific turn of the intermediate structures in (e).

29

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(a)

(b) Glu5

Glu5

Pro4

Thr6 Pro4

Thr6 Gly7 Asp3

Asp3

Thr8

Tyr2 Gly1

Tyr2 Gly1

Trp9

Gly7 Thr8 Trp9 Gly10

Gly10

(c)

Page 30 of 44

(d)

Glu5

Gly1

Pro4 Thr6

(e)

Asp3

Gly1

Figure 4: Typical hydrogen bonds for native (a), misfolded (b), and intermediate (c) states and local structures near Gly1 for native (d) and misfolded (e) states. The residues from Asp3 to Thr8, Asp3 to Gly7, or Asp3 to Glu5 (turn parts) are colored in light green for the native, misfolded, or intermediate states, respectively. The green hydrogen bond corresponds to one between backbone’s atoms. The red corresponds to one between backbone’s atom and sidechain’s atom or between sidechain’s atoms. The strong and weak hydrogen bonds detailed in Table 1 correspond to solid and dashed lines, respectively.

30

ACS Paragon Plus Environment

Page 31 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Figure 5: Distributions of water oxygen (red) and hydrogen (cyan) of (a) and (b) native, (c) and (d) misfolded, (e) intermediate, and (f) unfolded structures. They are displayed in spacefill for easy viewing. The red region indicates water oxygen, and the cyan region indicates water hydrogen. The threshold of display of the distribution function was set at 3.0 or more, which implies that the probability of existence is three times that of bulk water.

31

ACS Paragon Plus Environment

(a)

10

ΔG ΔEM ΔΔµM ΔES ΔΔµS

Energy difference [kcal/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Energy difference [kcal/mol]

The Journal of Physical Chemistry

5

0

-5

-10

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

(b)

Page 32 of 44

G E µ E µ M

10

M

S

S

5

0

-5

-10

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

Figure 6: Differences in average total energy, average conformational energy, and average solvation free energy of main-chains and side-chains of each residue of n2 from n1 (a) and m2 from m1 (b). Black, light-gray, dark-gray, white, and gray indicate total energy, conformational energy of main-chain, solvation free energy of main-chain, conformational energy of side-chain, and solvation free energy of side-chain, respectively.

32

ACS Paragon Plus Environment

Page 33 of 44

M

Energy difference [kcal/mol]

Energy difference [kcal/mol]

14.5

M

S

S

5

0

-5

-10

(a)

-14.7

10

22.1

G E µ E µ

M

S

S

5

0

-5

-10

-19.3

-13.9

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

31.3

M

S

5

0

-5

-10

-28.2

-22.7

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

(f)

29.4

S

5

0

-5

-10

10

-25.6

-28.6

37.9

G E µ E µ M

38.8

M

S

S

5

0

-5

-10

(d)

S

M

M

-34.5

-37.3

G E µ E µ

24.4

M

G E µ E µ

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

Energy difference [kcal/mol]

10

G E µ E µ

28.7

S

(b)

14.9

M

10

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

Energy difference [kcal/mol]

Energy difference [kcal/mol]

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

(c)

(e)

G E µ E µ

10

Energy difference [kcal/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

M

10

M

S

S

5

0

-5

-10

Gly1 Tyr2 Asp3 Pro4 Glu5 Thr6 Gly7 Thr8 Trp9 Gly10

Figure 7: Differences in average total energy, average conformational energy, and average solvation free energy of main-chains and side-chains of each residue of m1 from n1 (a), intermediate state from n1 (b), intermediate state from m1 (c), unfolded state from n1 (d), unfolded state from m1 (e), and unfolded state from intermediate state (f).

33

ACS Paragon Plus Environment

The Journal of Physical Chemistry

4

GM GS

Energy difference [kcal/mol]

(a) 3

2

1

0

-1

-2 Gly1

Tyr2

Asp3

Pro4

Glu5

Thr6

Gly7

Thr8

4

Energy difference [kcal/mol]

Trp9

Gly10

Trp9

Gly10

Trp9

Gly10

GM GS

(b) 3

2

1

0

-1

-2 Gly1

Tyr2

Asp3

Pro4

Glu5

Thr6

Gly7

Thr8

4

GM GS

(c) Energy difference [kcal/mol]

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 34 of 44

3

2

1

0

-1

-2 Gly1

Tyr2

Asp3

Pro4

Glu5

Thr6

Gly7

Thr8

Figure 8: Main- and side-chain components of total energy difference of unfolded state from intermediate state (a), intermediate state from native state (b), and intermediate state from misfolded state (c). The arrows indicate hydrogen bonds.

34

ACS Paragon Plus Environment

Page 35 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Tables Table 1: Strong and Weak Hydrogen Bonds in Native, Misfolded, and Intermediate States. The definition of the strong hydrogen bond or the weak hydrogen bond is that the percentage of the structure, which has the corresponding hydrogen bond in samples of each state is more than 30 % or between 10 to 30 %, respectively. Here, the total numbers of samples in the native, misfolded, and intermediate states are 907, 3101, and 112, respectively. The hydrogen bond was defined by using Hydrogen Bonds command of VMD 55 with cutoff distance = 3.3 Å and angle cutoff = 35◦ . Type native

Hydrogen bond (strong) Gly1N-Gly10O Gly1O-Gly10N Asp3N-Thr8O Asp3Os -Thr6N, Thr6Os Thr6Os -Thr8Os misfolded Gly1N-Trp9O Gly1O-Trp9N Asp3N-Gly7N Asp3Os -Glu5N, Thr6Os Asp3O-Gly7N intermediate Asp3Os -Glu5N, Thr6N, Thr6Os

Hydrogen bond (weak) Asp3Os -Glu5N Asp3O-Gly7N, Thr8N

Tyr2O-Trp9Ns (for m1) Asp3Os -Thr6N Asp3O-Thr6N

Asp3O-Thr6N, Thr6Os , Gly7N

Table 2: Average Values of Total Energy, G, Conformational Energy, E, and Solvation Free Energy, ∆µ, of Each State. The value in parenthesis is the difference from the unfolded state. Energy unit is kcal/mol. Type n1 n2 m1 m2 intermediate unfolded

G = E + ∆µ −171.1 ± 9.4(−23.0) −171.1 ± 8.9(−23.0) −171.2 ± 9.0(−23.1) −171.2 ± 9.1(−23.1) −158.5 ± 8.5(−10.4) −148.1 ± 8.7

E −29.1 ± 17.3(−132.3) −23.1 ± 19.8(−126.3) −12.8 ± 16.6(−116.0) −8.0 ± 18.2(−111.2) 47.4 ± 29.9(−55.8) 103.2 ± 15.4

35

ACS Paragon Plus Environment

∆µ −142.0 ± 13.4(109.3) −148.0 ± 16.5(103.3) −158.4 ± 13.7(92.9) −163.2 ± 14.6(88.1) −205.9 ± 27.6(45.4) −251.3 ± 13.4

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 36 of 44

Table 3: Solvation Energy, Es , and Solvation Entropy, −T ∆S, of ∆µ and the Partial Molar Volume of Each State. The value in parenthesis is the difference from the unfolded state. Energy unit for Es , −T ∆S, and E + Es is kcal/mol, and unit of PMV is cc/mol. Type n1 n2 m1 m2 intermediate unfolded

Es −382.1(130.2) −388.5(123.8) −399.8(112.5) −405.0(107.3) −455.2(57.1) −512.3

−T ∆S 240.1(−20.9) 240.5(−20.5) 241.4(−19.6) 241.8(−19.2) 249.3(−11.7) 261.0

PMV 740.2 ± 6.7 739.4 ± 6.7 737.5 ± 6.6 735.2 ± 7.1 745.1 ± 6.7 748.2 ± 2.84

E + Es −411.2(−1.4) −411.6(−1.8) −412.6(−2.8) −413.0(−3.2) −407.8(2.0) −409.8

Table 4: Total Energy, G, Conformational Energy, E, and Solvation Free Energy, ∆µ, of Main-Chain and Side-Chain of Each State. The value in parenthesis is the difference from the unfolded state. Energy unit is kcal/mol. Type n1 n2 m1 m2 intermediate unfolded

G −113.4(−13.9) −113.2(−13.7) −116.1(−16.7) −115.2(−15.7) −104.9(−5.4) −99.5

Main-chain E −56.8(−110.2) −52.4(−105.8) −38.4(−91.8) −40.7(−94.1) 13.4(−40.0) 53.4

36

∆µ −56.6(96.3) −60.8(92.1) −77.7(75.2) −74.5(78.4) −118.3(34.6) −152.9

ACS Paragon Plus Environment

G −57.7(−9.0) −57.8(−9.1) −55.1(−6.4) −56.2(−7.5) −53.6(−4.9) −48.7

Side-chain E 27.7(−22.0) 29.3(−20.4) 25.6(−24.1) 32.6(−17.1) 33.9(−15.8) 49.7

∆µ −85.4(13.0) −87.1(13.3) −80.7(17.7) −88.8(9.6) −87.5(10.9) −98.4

Page 37 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

Acknowledgement This work was supported by JST PRESTO (JPMJPR13LB). This research is partially supported by the Initiative on Promotion of Supercomputing for Young or Women Researchers, Supercomputing Division, Information Technology Center, The University of Tokyo. Numerical calculations were performed in part using HA-PACS at the Center for Computational Sciences (CCS), University of Tsukuba. Molecular graphics for Figures 3 and 5 were performed with the UCSF Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIGMS P41-GM103311). 56 Figure4 was made with VMD software support. VMD is developed with NIH support by the Theoretical and Computational Biophysics group at the Beckman Institute, University of Illinois at Urbana-Champaign. 55

References (1) Imai, T.; Harano, Y.; Kinoshita, M.; Kovalenko, A.; Hirata, F. A theoretical analysis on hydration thermodynamics of proteins. J. Chem. Phys. 2006, 125, 024911. (2) Maruyama, Y.; Harano, Y. Does water drive protein folding? Chem. Phys. Lett. 2013, 581, 85–90. (3) Maruyama, Y.; Mitsutake, A. Stability of unfolded and folded protein structures using a 3D-RISM with the RMDFT. J. Phys. Chem. B 2017, 121, 9881–9885. (4) Beglov, D.; Roux, B. An integral equation to describe the solvation of polar molecules in liquid water. J. Phys. Chem. B 1997, 101, 7821–7826. (5) Kovalenko, A.; Hirata, F. Three-dimensional density profiles of water in contact with a solute of arbitrary shape: A RISM approach. Chem. Phys. Lett. 1998, 290, 237–244. (6) Kovalenko, A.; Hirata, F. Potential of mean force between two molecular ions in a polar 37

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

molecular solvent: A study by the three-dimensional reference interaction site model. J. Phys. Chem. B 1999, 103, 7942–7957. (7) Singer, S. J.; Chandler, D. Free-energy functions in the extended rism approximation. Mol. Phys. 1985, 55, 621–625. (8) Kovalenko, A.; Hirata, F. Self-consistent description of a metal-water interface by the Kohn-Sham density functional theory and the three-dimensional reference interaction site model. J. Chem. Phys. 1999, 110, 10095–10112. (9) Ten-no, S. Free energy of solvation for the reference interaction site model: Critical comparison of expressions. J. Chem. Phys. 2001, 115, 3724–3731. (10) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How fast-folding proteins fold. Science 2011, 334, 517–520. (11) Yamazaki, T.; Kovalenko, A. Spatial Decomposition analysis of the thermodynamics of cyclodextrin complexation. J. Chem. Theory Comput. 2009, 5, 1723–1730. (12) Yamazaki, T.; Kovalenko, A. Spatial decomposition of solvation free energy based on the 3D integral equation theory of molecular liquid: application to miniproteins. J. Phys. Chem. B 2011, 115, 310–318. (13) Chong, S.-H.; Ham, S. Atomic decomposition of the protein solvation free energy and its application to amyloid-beta protein in water. J. Chem. Phys. 2011, 135, 034506. (14) Chong, S.-H.; Ham, S. Component analysis of the protein hydration entropy. Chem. Phys. Lett. 2012, 535, 152–156. (15) Chong, S.-H.; Ham, S. Impact of chemical heterogeneity on protein self-assembly in water. Proc. Natl. Acad. Sci. USA 2012, 109, 7636–7641. (16) Honda, S.; Yamasaki, K.; Sawada, Y.; Morii, H. 10 residue folded peptide designed by segment statistics. Structure 2004, 12, 1507–1518. 38

ACS Paragon Plus Environment

Page 38 of 44

Page 39 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(17) Honda, S.; Akiba, T.; Kato, Y. S.; Sawada, Y.; Sekijima, M.; Ishimura, M.; Ooishi, A.; Watanabe, H.; Odahara, T.; Harata, K. Crystal structure of a ten-amino acid protein. J. Am. Chem. Soc. 2008, 130, 15327–15331. (18) Aono, S.; Hosoya, T.; Sakaki, S. A 3D-RISM-SCF method with dual solvent boxes for a highly polarized system: application to 1,6-anhydrosugar formation reaction of phenyl α- and β-D-glucosides under basic conditions. Phys. Chem. Chem. Phys. 2013, 15, 6368–6381. (19) Chong, S.-H.; Hong, J.; Lim, S.; Cho, S.; Lee, J.; Ham, S. Structural and thermodynamic characteristics of amyloidogenic intermediates of β-2-microglobulin. Sci. Rep. 2015, 1–9. (20) Maruyama, Y.; Hirata, F. Modified anderson method for accelerating 3D-RISM calculations using graphics processing unit. J Chem. Theory Comput. 2012, 8, 3015–3021. (21) van der Spoel, D.; Seibert, M. Protein folding kinetics and thermodynamics from atomistic simulations. Phys. Rev. Lett. 2006, 96, 238102. (22) Xu, W.; Lai, T.; Yang, Y.; Mu, Y. Reversible folding simulation by hybrid Hamiltonian replica exchange. J. Chem. Phys. 2008, 128, 175105. (23) Zacharias, M. Combining elastic network analysis and molecular dynamics simulations by hamiltonian replica exchange. J. Chem. Theory Comput. 2008, 4, 477–487. (24) Kier, B. L.; Andersen, N. H. Probing the lower size limit for protein-like fold stability: ten-residue microproteins with specific, rigid structures in water. J. Am. Chem. Soc. 2008, 130, 14675–14683. (25) Roy, S.; Goedecker, S.; Field, M. J.; Penev, E. A minima hopping study of all-atom protein folding and structure srediction. J. Phys. Chem. B 2009, 113, 7315–7321.

39

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(26) Moritsugu, K.; Terada, T.; Kidera, A. Scalable free energy calculation of proteins via multiscale essential sampling. Journal of Chemical Physics 2010, 133, 224105. (27) Harada, R.; Kitao, A. Exploring the folding free energy landscape of a β-hairpin miniprotein, chignolin, using multiscale free energy landscape calculation method. J. Phys. Chem. B 2011, 115, 8806–8812. (28) Okumura, H. Temperature and pressure denaturation of chignolin: Folding and unfolding simulation by multibaric-multithermal molecular dynamics method. Proteins 2012, 80, 2397–2416. (29) Harada, R.; Nakamura, T.; Takano, Y.; Shigeta, Y. Protein folding pathways extracted by OFLOOD: Outlier FLOODing method. J. Comput. Chem. 2015, 36, 97–102. (30) Harada, R.; Takano, Y.; Shigeta, Y. Enhanced conformational sampling method for proteins based on the TaBoo SeArch algorithm: application to the folding of a miniprotein, chignolin. J. Comput. Chem. 2015, 36, 763–772. (31) Mitsutake, A.; Takano, H. Relaxation mode analysis and Markov state relaxation mode analysis for chignolin in aqueous solution near a transition temperature. J. Chem. Phys. 2015, 143, 124111–16. (32) Satoh, D.; Shimizu, K.; Nakamura, S.; Terada, T. Folding free-energy landscape of a 10-residue mini-protein, chignolin. FEBS Lett. 2006, 580, 3422–3426. (33) Suenaga, A.; Narumi, T.; Futatsugi, N.; Yanai, R.; Ohno, Y.; Okimoto, N.; Taiji, M. Folding dynamics of 10-residue β-hairpin peptide chignolin. Chem. Asian J. 2007, 2, 591–598. (34) Kitao, A. Transform and relax sampling for highly anisotropic systems: application to protein domain motion and folding. J. Chem. Phys. 2011, 135, 045101.

40

ACS Paragon Plus Environment

Page 40 of 44

Page 41 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(35) Shao, Q. Folding or misfolding: the choice of β-hairpin. J. Phys. Chem. B 2015, 119, 3893–3900. (36) Kührová, P.; Simone, A. D.; Otyepka, M.; Best, R. B. Force-field dependence of chignolin folding and misfolding: comparison with experiment and redesign. Biophys. J. 2012, 102, 1897–1906. (37) Takano, H.; Miyashita, S. Relaxation modes in random spin systems. J. Phys. Soc. Jpn. 1995, 64, 3688–3698. (38) Koseki, S.; Hirao, H.; Takano, H. Monte Carlo Study of Relaxation Modes of a Single Polymer Chain. J. Phys. Soc. Jpn. 1997, 66, 1631–1637. (39) Hirao, H.; Koseki, S.; Takano, H. Molecular dynamics study of relaxation modes of a single polymer chain. J. Phys. Soc. Jpn. 1997, 66, 3399–3405. (40) Chong, S.-H.; Ham, S. Protein folding thermodynamics: A new computational approach. J. Phys. Chem. B 2014, 118, 5017–5025. (41) Mitsutake, A.; Kinoshita, M.; Okamoto, Y.; Hirata, F. Combination of the replicaexchange Monte Carlo method and the reference interaction site model theory for simulating a peptide molecule in aqueous solution. J. Phys. Chem. B 2004, 108, 19002– 19012. (42) Ben-Naim, A. Molecular Theory of Solutions; Oxford University Press: New York, 2006. (43) Case, D. A.; Darden, T. A.; Cheatham III, T. E.; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Walker, R. C.; Zhang, W.; Merz, K. M.; et al., AMBER11 ; University of California: San Francisco, 2010.

41

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(44) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712–725. (45) Jorgensen, W.; Chandrasekhar, J.; Madura, J.; Impey, R.; Klein, M. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926– 935. (46) Pettitt, B.; Rossky, P. Integral-equation predictions of liquid-state structure for waterlike intermolecular potentials. J. Chem. Phys. 1982, 77, 1451–1457. (47) Yamauchi, M.; Okumura, H. Development of isothermal-isobaric replica-permutation method for molecular dynamics and Monte Carlo simulations and its application to reveal temperature and pressure dependence of folded, misfolded, and unfolded states of chignolin. J. Chem. Phys. 2017, 147, 184107–15. (48) Sibanda, B. L.; Blundell, T. L.; Thornton, J. M. Conformation of beta-hairpins in protein structures. A systematic classification with applications to modelling by homology, electron density fitting and protein engineering. J. Mol. Biol. 1989, 206, 759–777. (49) Pavone, V.; Gaeta, G.; Lombardi, A.; Nastri, F.; Maglio, O. Discovering protein secondary structures: Classification and description of isolated α-turns. Biopolymers 1996, 38, 705–721. (50) Gunasekaran, K.; Ramakrishnan, C.; Balaram, P. Beta-hairpins in proteins revisited: lessons for de novo design. Protein Engng. 1997, 10, 1131–1141. (51) Rotondi, K. S.; Gierasch, L. M. Natural polypeptide scaffolds: beta-sheets, beta-turns, and beta-hairpins. Biopolymers 2006, 84, 13–22. (52) Creighton, T. E. Proteins: Structures and Molecular Properties 2nd Edition; W.H. Freeman and Company: New York, 1992. 42

ACS Paragon Plus Environment

Page 42 of 44

Page 43 of 44 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

(53) McKiernan, K. A.; Husic, B. E.; Pande, V. S. Modeling the mechanism of CLN025 beta-hairpin formation. J. Chem. Phys. 2017, 147, 104107. (54) Mitsutake, A.; Iijima, H.; Takano, H. Relaxation mode analysis of a peptide system: Comparison with principal component analysis. J. Chem. Phys. 2011, 135, 164102–15. (55) Humphrey, W.; Dalke, A.; Schulten, K. VMD: visual molecular dynamics. J. Mol. Graphics 1996, 14, 33–38. (56) Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. UCSF Chimera–A visualization system for exploratory research and analysis. J. Comput. Chem. 2004, 25, 1605–1612.

43

ACS Paragon Plus Environment

The Journal of Physical Chemistry 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Graphical TOC Entry Misfolded

Unfolded

Native Intermediate

44

ACS Paragon Plus Environment

Page 44 of 44