
Article

Dissecting Protein Configurational Entropy into Conformational and Vibrational Contributions Song-Ho Chong, and Sihyun Ham J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.5b07060 • Publication Date (Web): 08 Sep 2015 Downloaded from http://pubs.acs.org on September 11, 2015



The Journal of Physical Chemistry

Dissecting Protein Configurational Entropy into Conformational and Vibrational Contributions Song-Ho Chong and Sihyun Ham∗ Department of Chemistry, Sookmyung Women’s University, Cheongpa-ro 47-gil 100, Yongsan-Ku, Seoul 140-742, Korea E-mail: [email protected] Phone: +82 2 710 9410. Fax: +82 2 2077 7321


ACS Paragon Plus Environment


Abstract Quantifying how the rugged nature of the underlying free-energy landscape determines the entropic cost a protein must incur upon folding and ligand binding is a challenging problem. Here, we present a novel computational approach that dissects the protein configurational entropy based on the classification of protein dynamics on the landscape into two separate components: short-term vibrational dynamics related to individual free-energy wells, and long-term conformational dynamics associated with transitions between wells. We apply this method to separate the configurational entropy of the protein villin headpiece subdomain into the conformational and vibrational components. We find that the change in configurational entropy upon folding is dominated by the conformational entropy, despite the fact that the magnitude of the vibrational entropy is the significantly larger component in each of the folded and unfolded states, which is in accord with the previous empirical estimations. The straightforward applicability of our method to unfolded proteins promises a wide range of applications, including those related to intrinsically disordered proteins.

Keywords: thermodynamics; protein folding; molecular dynamics simulations; integral-equation theory; villin headpiece subdomain


1. INTRODUCTION

There is a growing interest in protein configurational entropy as a major factor controlling protein activities associated with signaling, regulation, and recognition. 1–5 Understanding its relationship to protein structure, as well as its variation upon conformational change and ligand binding, is therefore of fundamental importance. This entails a full characterization of the protein free-energy landscape since the configurational entropy is a measure of the extent of the configuration space accessible to the protein’s internal degrees of freedom. However, this is a daunting task since the protein free-energy landscape is rugged, i.e., it comprises numerous local minima separated by low free-energy barriers. In this regard, it was a keen insight to suppose that the protein configurational entropy ($S_{\mathrm{config}}$) may be characterized by just two terms, i.e., the conformational ($S_{\mathrm{conf}}$) and vibrational ($S_{\mathrm{vib}}$) components. 6 The conformational component is associated with the number of accessible free-energy wells, whereas the vibrational component reflects the average width of the individual wells. (Although they are often used interchangeably, we shall refer to “conformational” entropy as a subcategory of “configurational” entropy as just described.) Dissecting $S_{\mathrm{config}}$ into $S_{\mathrm{conf}}$ and $S_{\mathrm{vib}}$ enables one to characterize modulations of the free-energy landscape caused by intrinsic and extrinsic factors in simple terms; furthermore, it will facilitate the discovery of molecular mechanisms underlying protein activities, in particular, those of intrinsically disordered proteins. 7

Configurational entropy, however, is known as one of the most difficult thermodynamic quantities to estimate, and a significant amount of effort has been devoted to developing computational methods for dealing with this quantity. 8–25 The most detailed approach in our opinion is the mining-minima method, in which low free-energy conformations are first identified to take into account the multiple-minima structure of the free-energy landscape; next, vibrational properties in individual wells are incorporated through the harmonic approximation with anharmonic corrections. 13,14 By construction, this method is also able to provide separate estimates for the conformational and vibrational terms. However, enumerating all the minima is not computationally feasible for complex molecules such as
proteins, and the applicability of this method is limited to relatively simple molecular systems. More recently, methods combined with information-theoretic tools such as mutual information expansion that focus on dihedral angle distributions have been developed. 17–20 These methods are currently being applied to proteins, 21–25 which also aim at evaluating conformational and vibrational contributions. However, this type of approach suffers from the slow convergence with respect to the order of correlations between dihedral angles, in particular those between backbone and side-chain dihedral angles, since the computation of the high-order correlations can be rather expensive even for small peptide systems. 24 This problem becomes even more problematic in handling unfolded or disordered proteins that sample a wide range of high-dimensional dihedral angle space in an intricate manner. Here, we propose a computational approach from a different perspective, which is based on the classification of protein dynamics on the free-energy landscape into conformational and vibrational contributions (Fig. 1). This method is a dynamic extension of the energetic approach we recently developed for protein folding thermodynamics that focuses on the statistical range to which a protein explores the free-energy landscape. 26,27 The energetic approach has been validated through comparison of the folding free energy with experimental measurements, 27 and its applicability to intrinsically disordered proteins (one of the most significant advantages of this approach) has also been demonstrated. 28 From an inspection of time-dependent free-energy curves, we clearly identify the presence of slowly varying and quickly oscillating components, with the latter enslaved to the former, indicating that the protein dynamics can be described as a series of vibrational dynamics within individual wells and intervening conformational transitions between them. 
We dissect these two components of disparate timescales using the detrending technique known as Hodrick-Prescott filtering. 29 Such a dissection naturally leads to the separation of configurational entropy into the conformational and vibrational terms. We illustrate our method through its application to the protein villin headpiece subdomain. Thereby, we uncover characteristics of the rugged protein free-energy landscape in the folded- and unfolded-state regions, and demonstrate how they manifest themselves in the configurational entropy change that occurs upon folding.

2. THEORY

In this section, we present our theory for dissecting the protein configurational entropy in the following steps. We start from the statistical mechanics formulation of the protein dynamics on the free-energy landscape (Sec. 2.1). Next, we introduce the energetic approach that allows one to connect the protein dynamics with thermodynamics (Sec. 2.2). We then describe the method for separating the protein dynamics on the rugged free-energy landscape into fast and slow components (Sec. 2.3). Finally, we show that the separation of the protein dynamics naturally leads to that of the protein configurational entropy (Sec. 2.4). Technical details on the derivation of some results are presented in the Appendixes, which are provided in the Supporting Information.

2.1. Protein dynamics on the free energy landscape

Let us suppose that we have protein configurations generated by molecular dynamics simulations, from which we want to quantify the protein dynamics on the free-energy landscape. This can be done with an idea that stems from the following examination of the configuration integral $Z_X$ ($X$ refers to the state of interest such as the folded and unfolded states). The configuration integral $Z_X$, the potential-energy part of the partition function, for a protein dissolved in water is given by the integration of the Boltzmann factor $e^{-\beta E_{\mathrm{tot}}}$ over protein and water configurations. Here $\beta = 1/(k_B T)$ is the inverse temperature, and $E_{\mathrm{tot}} = E_u + E_{uv} + E_v$ is the total potential energy of the system consisting of intra-protein ($E_u$), protein-water ($E_{uv}$), and water-water ($E_v$) interactions. Since we are interested in protein configurations only (to be collectively denoted as $\mathbf{r}_u$), the integration over water configurations shall be performed for the parts in the Boltzmann factor, $e^{-\beta(E_{uv} + E_v)}$, associated with protein-water and water-water interactions. This results in, up to an irrelevant multiplicative constant: 26,30

$$Z_X = \int_X d\mathbf{r}_u \, \exp[-\beta f(\mathbf{r}_u)] \tag{1}$$

in terms of the solvent-averaged Boltzmann factor e−βf (ru ) . Here, the integration over configurations is restricted to the state-X region, and f (ru ) = Eu (ru ) + Gsolv (ru ), called the effective energy, is the sum of the intra-protein energy Eu (ru ) and the solvation free energy Gsolv (ru ). f (ru ) is the reversible work for a process in which all the constituent atoms of a protein are moved from infinite separation in a vacuum to a particular configuration ru in water: Eu (ru ) corresponds to the reversible work required to form a protein of configuration ru in a vacuum, and Gsolv (ru ) accounts for the solvation process. Hence, f (r′u ) − f (ru ) provides a free energy difference for a microscopic, fluctuating process in which the protein configuration is changed from ru to r′u , and f (ru ) is precisely the quantity that defines the free-energy landscape in the protein configuration space. 30 Protein dynamics on the free-energy landscape is thus quantified by ft ≡ f (ru (t)), where ru (t) is the simulated protein configuration at time t. Eu (ru (t)) can be computed directly from the force field employed in the simulations, whereas Gsolv (ru (t)) can be obtained, e.g., by applying the liquid integral-equation theory such as the 3D-RISM. 31

2.2. Energetic approach connecting protein dynamics and thermodynamics

The configuration integral $Z_X$ given in eq 1, from which thermodynamic functions can be derived, involves in principle the entire configuration space accessible to a protein in state $X$. In order to connect the protein dynamics on the free-energy landscape ($f_t$) with thermodynamics, we require a quantity associated with $Z_X$ that is solely based on the $f_t$ values of the protein configurations accessed by the simulations. As detailed in our previous work, 27 the “normalized” configuration integral

$$\hat{Z}_X = \frac{1}{T}\int_0^T dt \, e^{-\beta f_t} = \frac{1}{M}\sum_{t=1}^{M} e^{-\beta f_t} \tag{2}$$

plays such a role. (Here, in the first equality the simulation time varying from $t = 0$ to $T$ is considered to be continuous, whereas in the second equality it is discretized with the interval $\Delta t = T/M$.) Indeed, one can show that the ratio of the configuration integrals of two states, e.g., $X$ and $Y$, can be equated to that of the normalized ones, $Z_X/Z_Y = \hat{Z}_X/\hat{Z}_Y$, provided the same number ($M$) of protein configurations is used in computing $\hat{Z}_X$ and $\hat{Z}_Y$ (see Appendix A in Supporting Information). Since what is measured in experiments is the difference in thermodynamic quantities such as the folding free energy, it is sufficient for most practical purposes to have access to the ratio of configuration integrals, whose logarithm amounts to the free energy difference. A distinguishing feature of the normalized form $\hat{Z}_X$ is that it can be expressed as $\hat{Z}_X = \int df \, W(f) \, e^{-\beta f}$ with the normalized distribution function $W(f) = (1/T)\int_0^T dt \, \delta(f - f_t) = (1/M)\sum_{t=1}^{M} \delta(f - f_t)$ satisfying $\int df \, W(f) = 1$ instead of the unnormalized density of states necessary for $Z_X$, which is practically difficult to obtain.
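As an illustration of eq 2, the following minimal sketch evaluates $-k_B T \log \hat{Z}_X$ from a time series of effective energies, using a log-sum-exp shift for numerical stability, and then forms the free energy difference between two states from the ratio $\hat{Z}_X/\hat{Z}_Y$. The function names, the value of $k_B T$ (about 0.596 kcal/mol at 300 K), and the toy numbers are assumptions made for illustration; they are not taken from the paper.

```python
import math

KT_300K = 0.0019872041 * 300.0  # k_B*T in kcal/mol at 300 K (assumed units)

def neg_kT_log_zhat(f_series, kT=KT_300K):
    """Return -kT * log( (1/M) * sum_t exp(-f_t / kT) ) for a series f_t (eq 2)."""
    beta = 1.0 / kT
    f_min = min(f_series)  # shift so the largest exponent is zero
    total = sum(math.exp(-beta * (f - f_min)) for f in f_series)
    return f_min - kT * math.log(total / len(f_series))

def free_energy_difference(f_x, f_y, kT=KT_300K):
    """-kT * log(Zhat_X / Zhat_Y); both series must have the same length M."""
    if len(f_x) != len(f_y):
        raise ValueError("both states need the same number of configurations")
    return neg_kT_log_zhat(f_x, kT) - neg_kT_log_zhat(f_y, kT)

# Toy usage with made-up effective energies (kcal/mol), not simulation data.
f_folded = [-10.2, -9.8, -10.5, -10.0]
f_unfolded = [-8.1, -7.9, -8.4, -8.0]
delta_F = free_energy_difference(f_folded, f_unfolded)
```

The equal-length check mirrors the condition stated above that the same number $M$ of configurations be used for both states.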

2.3. Separating fast and slow components in protein dynamics

As we will see later in our numerical demonstration, the dynamics exhibited by $f_t$ reflects the rugged nature of the protein free-energy landscape, consisting of rapid vibrations in individual free-energy wells and slow conformational transitions between them. We separate such fast and slow components in the time series $f_t$ ($t = 1, \ldots, M$) using the detrending method known as Hodrick-Prescott filtering. 29 This procedure dissects $f_t$ into a slow conformational component $f_t^{\mathrm{conf}}$ and a fast vibrational component $f_t^{\mathrm{vib}}$ such that $f_t = f_t^{\mathrm{conf}} + f_t^{\mathrm{vib}}$. This is accomplished by solving the following minimization problem:

$$\min_{[f_t^{\mathrm{conf}}]_{t=1}^{M}} \left\{ \sum_{t=1}^{M} \left(f_t - f_t^{\mathrm{conf}}\right)^2 + \lambda \sum_{t=2}^{M-1} \left[ \left(f_{t+1}^{\mathrm{conf}} - f_t^{\mathrm{conf}}\right) - \left(f_t^{\mathrm{conf}} - f_{t-1}^{\mathrm{conf}}\right) \right]^2 \right\} \tag{3}$$


The basic idea here is to ensure that the slow component $f_t^{\mathrm{conf}}$ varies smoothly over time; this is embodied through the second term, a sum of the squares of the second difference (an analogue of the second time derivative), which is a measure of the smoothness. At the same time, the difference between the original time series $f_t$ and its slow component $f_t^{\mathrm{conf}}$ is minimized through the first sum. The smoothing parameter $\lambda > 0$ gives relative weights to these two opposing effects, and a larger value of $\lambda$ yields a smoother $f_t^{\mathrm{conf}}$. We overview some aspects of Hodrick-Prescott filtering in Appendix B. In particular, it is demonstrated there that, denoting the time average with a bar, e.g., $\bar{f} = (1/M)\sum_{t=1}^{M} f_t$, the slow and fast components fulfill the relations

$$\overline{f^{\mathrm{conf}}} = \bar{f}, \qquad \overline{f^{\mathrm{vib}}} = 0 \tag{4}$$
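The minimization in eq 3 is quadratic in the slow component, so setting its gradient to zero gives the linear system $(I + \lambda D^{\mathsf{T}} D)\, f^{\mathrm{conf}} = f$, where $D$ is the $(M-2)\times M$ second-difference matrix. The sketch below solves this system with dense NumPy arrays, which is only practical for short series; a long trajectory would call for a banded or sparse solver. The function name `hp_filter` and the synthetic series are illustrative assumptions, not the authors' code.

```python
import numpy as np

def hp_filter(f, lam):
    """Split a 1-D series into (slow, fast) parts by Hodrick-Prescott filtering.

    Minimizes sum (f - s)^2 + lam * sum (second difference of s)^2, whose
    minimizer solves (I + lam * D^T D) s = f.
    """
    f = np.asarray(f, dtype=float)
    m = f.size
    # Rows of D are [1, -2, 1] stencils: D @ s gives the second differences.
    d = np.diff(np.eye(m), n=2, axis=0)
    slow = np.linalg.solve(np.eye(m) + lam * d.T @ d, f)
    return slow, f - slow

# Synthetic series: a slow drift plus fast noise (made-up data).
rng = np.random.default_rng(0)
t = np.arange(200)
series = np.sin(2 * np.pi * t / 200.0) + 0.3 * rng.standard_normal(t.size)
f_conf, f_vib = hp_filter(series, lam=1600.0)
```

Because a constant series has zero second difference, summing the rows of the linear system shows that the slow part keeps the mean of the input and the fast part averages to zero, which is the content of eq 4.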

2.4. Dissecting protein configurational entropy

Here, we combine the results of the previous subsections and demonstrate that the dissection $f_t = f_t^{\mathrm{conf}} + f_t^{\mathrm{vib}}$ of the protein dynamics on the rugged free-energy landscape has the direct consequence of decomposing the protein configurational entropy ($S_{\mathrm{config}}$) into the conformational ($S_{\mathrm{conf}}$) and vibrational ($S_{\mathrm{vib}}$) components:

$$S_{\mathrm{config}} = S_{\mathrm{conf}} + S_{\mathrm{vib}} \tag{5}$$

Before presenting the derivation of this result, we briefly outline, as a reference, the conventional formulation of the protein configurational entropy and its dissection.


2.4.1. Derivation based on the conventional configuration integral

We introduce, in view of eq 1, the probability distribution $p(\mathbf{r}_u) = e^{-\beta f(\mathbf{r}_u)}/Z_X$ for observing the protein configuration $\mathbf{r}_u$. The protein configurational entropy is then defined by

$$S_{\mathrm{config}} = -k_B \int d\mathbf{r}_u \, p(\mathbf{r}_u) \log p(\mathbf{r}_u) \tag{6}$$

By substituting $p(\mathbf{r}_u) = e^{-\beta f(\mathbf{r}_u)}/Z_X$ into $\log p(\mathbf{r}_u)$ of this expression, one obtains

$$-k_B T \log Z_X = \langle f \rangle - T S_{\mathrm{config}} \tag{7}$$

in which $\langle f \rangle = \int d\mathbf{r}_u \, p(\mathbf{r}_u) f(\mathbf{r}_u)$ denotes the ensemble average taken with respect to $p(\mathbf{r}_u)$.
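For readers who want the intermediate step between eqs 6 and 7 spelled out, the substitution can be written as follows. This is a routine manipulation added for clarity; it uses only $\log p = -\beta f - \log Z_X$ and the normalization $\int d\mathbf{r}_u \, p(\mathbf{r}_u) = 1$.

```latex
\begin{aligned}
S_{\mathrm{config}}
  &= -k_B \int d\mathbf{r}_u \, p(\mathbf{r}_u)\,
     \bigl[-\beta f(\mathbf{r}_u) - \log Z_X\bigr] \\
  &= k_B \beta \langle f \rangle + k_B \log Z_X
   = \frac{\langle f \rangle}{T} + k_B \log Z_X .
\end{aligned}
```

Multiplying by $T$ and rearranging gives $-k_B T \log Z_X = \langle f \rangle - T S_{\mathrm{config}}$, i.e., eq 7.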

While the water (solvation) entropy, included through the solvation free energy in $f = E_u + G_{\mathrm{solv}}$, and the protein configurational entropy are separated in this expression, they are not totally independent. Indeed, $f$ (and hence, the water entropy) enters into the defining equation (eq 6) of the protein configurational entropy via $p(\mathbf{r}_u) = e^{-\beta f(\mathbf{r}_u)}/Z_X$. The dissection of the protein configurational entropy into the conformational and vibrational components can be done by approximating $Z_X$ by a sum of local configuration integrals $z_j$, each originating from a local free energy well $j$, $Z_X = \sum_j z_j$ with $-k_B T \log z_j = f_j - T S_{\mathrm{vib},j}$, 14 i.e.,

$$Z_X = \sum_j e^{-\beta (f_j - T S_{\mathrm{vib},j})} \tag{8}$$

Here, $f_j$ denotes the average effective energy and $S_{\mathrm{vib},j}$ is the vibrational entropy for the single well $j$. Introducing the probability $p_j = z_j/Z_X$ of being in the $j$th energy well, the conformational entropy associated with the presence of multiple wells is defined by $S_{\mathrm{conf}} = -k_B \sum_j p_j \log p_j$. One can then derive the following free energy expression: 14

$$-k_B T \log Z_X = \sum_j p_j \,(f_j - T S_{\mathrm{vib},j}) - T S_{\mathrm{conf}} \tag{9}$$

Since $\langle f \rangle = \sum_j p_j f_j$, the dissection (eq 5) follows from eqs 7 and 9 with the vibrational entropy $S_{\mathrm{vib}} = \sum_j p_j S_{\mathrm{vib},j}$ given by the average of the ones from individual wells.
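The bookkeeping in eqs 7 to 9 can be checked numerically on a toy landscape. The sketch below uses three hypothetical wells with made-up values of $f_j$ and $T S_{\mathrm{vib},j}$ (in kcal/mol) and verifies that $-k_B T \log Z_X$ equals $\sum_j p_j (f_j - T S_{\mathrm{vib},j}) - T S_{\mathrm{conf}}$; none of the numbers come from the paper.

```python
import math

kT = 0.596  # approximate k_B*T at 300 K in kcal/mol (assumed)

# Hypothetical wells: average effective energy f_j and T*S_vib,j (kcal/mol).
f_wells = [-5.0, -4.2, -3.9]
ts_vib_wells = [1.0, 1.6, 1.3]

# Local configuration integrals z_j = exp(-(f_j - T*S_vib,j)/kT) and Z_X (eq 8).
z = [math.exp(-(f - ts) / kT) for f, ts in zip(f_wells, ts_vib_wells)]
z_total = sum(z)
p = [zj / z_total for zj in z]  # probability of being in well j

# Conformational entropy, T*S_conf = -kT * sum_j p_j log p_j.
ts_conf = -kT * sum(pj * math.log(pj) for pj in p)
# Population-weighted averages over wells.
f_mean = sum(pj * f for pj, f in zip(p, f_wells))
ts_vib = sum(pj * ts for pj, ts in zip(p, ts_vib_wells))

lhs = -kT * math.log(z_total)    # left-hand side of eq 9
rhs = f_mean - ts_vib - ts_conf  # right-hand side of eq 9
ts_config = f_mean - lhs         # T*S_config from eq 7
```

With these definitions `lhs` and `rhs` agree to rounding error, and `ts_config` equals `ts_conf + ts_vib`, which is eq 5 for this toy case.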

Although the formulation presented here is clear and straightforward, its applicability to complex systems like proteins is limited since enumerating all the minima (labeled by $j$ above) is not practically feasible for such systems, in particular in their unfolded states.

2.4.2. Derivation based on the normalized configuration integral

Now, we derive the dissection (eq 5) starting from the normalized configuration integral $\hat{Z}_X$. We first recall the expression $\hat{Z}_X = \int df \, W(f) \, e^{-\beta f}$ in which $W(f) = (1/M)\sum_{t=1}^{M} \delta(f - f_t)$ (Sec. 2.2), and recognize that $W(f)$ plays the role of $p(\mathbf{r}_u)$ here: averages, to be denoted with a bar, are taken with respect to $W(f)$, e.g., $\bar{f} = \int df \, W(f) \, f = (1/M)\sum_{t=1}^{M} f_t$. The logarithm of $\hat{Z}_X$ can then be expressed as $\log \hat{Z}_X = \log \overline{e^{-\beta f}}$, which can be handled with the cumulant-expansion method. 32 We obtain in a form analogous to eq 7: 26

$$-k_B T \log \hat{Z}_X = \bar{f} - T S_{\mathrm{config}} \tag{10}$$

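with

$$T S_{\mathrm{config}} = \frac{\beta}{2}\,\overline{\Delta f^2} - \sum_{n=3}^{\infty} \frac{(-\beta)^{n-1}}{n!}\, \overline{(\Delta f)^n}_c \tag{11}$$

Here, $\Delta f = f - \bar{f}$ and $\overline{(\Delta f)^n}_c$ denotes the $n$th cumulant average. When the distribution function $W(f)$ is Gaussian, cumulant averages higher than the second order vanish. In this case, the configurational entropy is simply given by

$$T S_{\mathrm{config}} = \frac{\beta}{2}\,\overline{\Delta f^2} \quad (\text{if } W(f) \text{ is Gaussian}) \tag{12}$$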

in terms of the range to which the system explores the free-energy landscape.

To separate $S_{\mathrm{config}}$ into the conformational and vibrational components, we use the dissection $f_t = f_t^{\mathrm{conf}} + f_t^{\mathrm{vib}}$ (Sec. 2.3) to write $\hat{Z}_X$ as

$$\hat{Z}_X = \frac{1}{T}\int_0^T dt \, e^{-\beta f_t^{\mathrm{conf}}} \, e^{-\beta f_t^{\mathrm{vib}}} \tag{13}$$

We assume here the timescale separation between the conformational and vibrational components, i.e., that $f_t^{\mathrm{conf}}$ does not change much before $f_t^{\mathrm{vib}}$ reaches local equilibrium on the fast timescale. (This assumption amounts to the aforementioned approximation $Z_X = \sum_j z_j$ used in eq 8, and will be justified through our numerical demonstration.) In this case, the behavior of the fast component $f_t^{\mathrm{vib}}$ can be approximately described by holding the slow component $f_t^{\mathrm{conf}}$ fixed at its instantaneous value and using the conditional probability distribution $W_{\mathrm{vib}}(f_{\mathrm{vib}}; t)$ for the vibrational part:

$$\hat{Z}_X = \frac{1}{T}\int_0^T dt \, e^{-\beta f_t^{\mathrm{conf}}} \int df_{\mathrm{vib}} \, W_{\mathrm{vib}}(f_{\mathrm{vib}}; t) \, e^{-\beta f_{\mathrm{vib}}} \tag{14}$$

This is a procedure known as the averaging method in multiscale modeling. 33 We rewrite this expression using the probability distribution for the conformational component characterized by a value $f_j$ of the local free energy well $j$, $W_{\mathrm{conf}}(f_j) = (1/T)\int_0^T dt \, \delta(f_j - f_t^{\mathrm{conf}})$:

$$\hat{Z}_X = \int df_j \, W_{\mathrm{conf}}(f_j) \, e^{-\beta f_j} \left[ \int df_{\mathrm{vib}} \, W_{\mathrm{vib}}(f_{\mathrm{vib}}; j) \, e^{-\beta f_{\mathrm{vib}}} \right] \tag{15}$$

Here, the quantity in square brackets is the vibrational integral in the local free energy well $j$, $\hat{Z}_{\mathrm{vib},j} = \int df_{\mathrm{vib}} \, W_{\mathrm{vib}}(f_{\mathrm{vib}}; j) \, e^{-\beta f_{\mathrm{vib}}}$. Since $\hat{Z}_{\mathrm{vib},j}$ takes the same normalized form as $\hat{Z}_X$, we introduce, in view of eq 10 and the relation $\overline{f^{\mathrm{vib}}} = 0$ (eq 4), the vibrational entropy in the local free energy well $j$ via $-k_B T \log \hat{Z}_{\mathrm{vib},j} = -T S_{\mathrm{vib},j}$. Then, $\hat{Z}_X$ can be written as

$$\hat{Z}_X = \int df_j \, W_{\mathrm{conf}}(f_j) \, e^{-\beta (f_j - T S_{\mathrm{vib},j})} \tag{16}$$

which is the expression corresponding to eq 8.


By referring to eq 9 and recognizing that $W_{\mathrm{conf}}(f_j)$ plays the role of $p_j$ here, we introduce the conformational entropy through

$$-k_B T \log \hat{Z}_X = \int df_j \, W_{\mathrm{conf}}(f_j) \,(f_j - T S_{\mathrm{vib},j}) - T S_{\mathrm{conf}} \tag{17}$$

Since $\int df_j \, W_{\mathrm{conf}}(f_j) \, f_j = \overline{f^{\mathrm{conf}}} = \bar{f}$ (eq 4), we obtain the desired relation $S_{\mathrm{config}} = S_{\mathrm{conf}} + S_{\mathrm{vib}}$ from the comparison of eqs 10 and 17, in which the vibrational entropy is given by the average of the individual-well contributions, $S_{\mathrm{vib}} = \int df_j \, W_{\mathrm{conf}}(f_j) \, S_{\mathrm{vib},j}$.

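A practically useful expression for the vibrational entropy $S_{\mathrm{vib}}$ is derived in Appendix C: introducing the probability distribution $W_{\mathrm{vib}}(f_{\mathrm{vib}}) = (1/T)\int_0^T dt \, \delta(f_{\mathrm{vib}} - f_t^{\mathrm{vib}})$ for the vibrational component $f_t^{\mathrm{vib}}$, the vibrational entropy $S_{\mathrm{vib}}$ can be expressed in terms of the cumulant averages taken with respect to $W_{\mathrm{vib}}(f_{\mathrm{vib}})$ (see eq S19). A particularly simple expression is obtained when $W_{\mathrm{vib}}(f_{\mathrm{vib}})$ is Gaussian:

$$T S_{\mathrm{vib}} = \frac{\beta}{2}\,\overline{\Delta f_{\mathrm{vib}}^2} \quad (\text{if } W_{\mathrm{vib}}(f_{\mathrm{vib}}) \text{ is Gaussian}) \tag{18}$$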

As we will see below, both of the distribution functions W (f ) and Wvib (fvib ) are well approximated by the Gaussian distribution, and this holds not only in the folded state but also in the unfolded state. In this case, Sconfig and Svib can be simply computed from eqs 12 and 18, respectively, and Sconf can be evaluated through the relation Sconf = Sconfig − Svib . Thus, our theory provides a simple and practical means to obtain the configurational entropy and its dissection for complex systems including their unfolded states.
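Under the Gaussian approximation, eqs 12 and 18 reduce the whole dissection to two variances. The following sketch shows one way this could be coded, given an effective-energy series and its already separated vibrational part (for example, from a Hodrick-Prescott filter). The function name `dissect_entropy`, the $k_B T$ value, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

KT_300K = 0.0019872041 * 300.0  # k_B*T in kcal/mol at 300 K (assumed units)

def dissect_entropy(f, f_vib, kT=KT_300K):
    """Return (T*S_config, T*S_conf, T*S_vib) in the units of f.

    Uses the Gaussian forms T*S = (beta/2) * variance (eqs 12 and 18) and
    S_conf = S_config - S_vib (eq 5).
    """
    beta = 1.0 / kT
    ts_config = 0.5 * beta * np.var(f)   # population variance of f_t
    ts_vib = 0.5 * beta * np.var(f_vib)  # population variance of f_t^vib
    return ts_config, ts_config - ts_vib, ts_vib

# Synthetic example: slow random walk plus independent fast noise (made-up data).
rng = np.random.default_rng(1)
f_conf = np.cumsum(0.05 * rng.standard_normal(5000))
f_vib = 2.0 * rng.standard_normal(5000)
f_vib -= f_vib.mean()  # enforce zero mean, as in eq 4
ts_config, ts_conf, ts_vib = dissect_entropy(f_conf + f_vib, f_vib)
```

The sketch is only meaningful when both distributions are close to Gaussian; otherwise the higher cumulants in eq 11 would have to be included.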

3. COMPUTATIONAL DETAILS

3.1. Molecular Dynamics Simulations

We illustrate the potential of our theory through its application to the protein villin headpiece subdomain (HP-36). 34 To generate protein configurations, we carried out explicit-water molecular dynamics simulations of this protein at a temperature of 300 K and a pressure of 1 bar. Three independent 1 µs simulations for the folded state were performed starting from the NMR structure (Protein Data Bank code 1VII), 34 whereas ten independent 1 µs simulations for the unfolded state were initiated from heat-denatured structures, whose details were reported previously. 27 Figure 1 displays the representative HP-36 structures taken from the folded- and unfolded-state simulations.

3.2. Effective Energy Calculations

From each 1 µs trajectory of the three independent folded-state and ten independent unfolded-state simulations, we took $M = 100{,}000$ protein configurations with a 10 ps time interval, to be denoted as $\mathbf{r}_u(t)$ with $t = 1, \ldots, M$. They were used to compute the effective energy $f_t = E_u(\mathbf{r}_u(t)) + G_{\mathrm{solv}}(\mathbf{r}_u(t))$ as a function of simulation time $t$. $E_u(\mathbf{r}_u(t))$ was calculated directly from the force field used in the simulation, whereas $G_{\mathrm{solv}}(\mathbf{r}_u(t))$ was obtained from the 3D-RISM theory. 31 This theory computes, for each protein configuration, the 3D distribution function $g_\gamma(\mathbf{r})$ of the water site $\gamma$ (oxygen or hydrogen) at position $\mathbf{r}$ by solving self-consistently the following two equations for the total correlation function $h_\gamma(\mathbf{r}) = g_\gamma(\mathbf{r}) - 1$ and the direct correlation function $c_\gamma(\mathbf{r})$:

$$h_\gamma(\mathbf{r}) = \sum_{\gamma'} \int d\mathbf{r}' \, \chi_{\gamma\gamma'}(|\mathbf{r} - \mathbf{r}'|) \, c_{\gamma'}(\mathbf{r}') \tag{19}$$

and

$$h_\gamma(\mathbf{r}) = \begin{cases} \exp[d_\gamma(\mathbf{r})] - 1 & \text{for } d_\gamma(\mathbf{r}) \le 0 \\ d_\gamma(\mathbf{r}) & \text{for } d_\gamma(\mathbf{r}) > 0 \end{cases} \tag{20}$$

Here, $\chi_{\gamma\gamma'}(r)$ is the site-site water susceptibility function, and $d_\gamma(\mathbf{r}) = -u_\gamma(\mathbf{r})/(k_B T) + h_\gamma(\mathbf{r}) - c_\gamma(\mathbf{r})$ with $u_\gamma(\mathbf{r})$ being the protein-water interaction potential. $G_{\mathrm{solv}}$ for each protein configuration can then be obtained using the following analytical formula: 31

$$G_{\mathrm{solv}} = \rho k_B T \sum_{\gamma} \int d\mathbf{r} \left[ \frac{1}{2} h_\gamma(\mathbf{r})^2 \, \Theta(-h_\gamma(\mathbf{r})) - c_\gamma(\mathbf{r}) - \frac{1}{2} h_\gamma(\mathbf{r}) \, c_\gamma(\mathbf{r}) \right] \tag{21}$$

Here, ρ denotes the number density of water, and Θ(x) is the Heaviside step function.
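To make eqs 20 and 21 concrete, the sketch below implements the closure of eq 20 as a pointwise function and evaluates the sum in eq 21 on a uniform grid, given arrays of $h_\gamma$ and $c_\gamma$ that are assumed to have been obtained elsewhere. It does not solve the 3D-RISM equations themselves; the function names, the array layout (site index first), and the grid handling are hypothetical choices made for illustration.

```python
import numpy as np

def closure_h(d):
    """Eq 20: h = exp(d) - 1 where d <= 0, and h = d where d > 0."""
    d = np.asarray(d, dtype=float)
    # np.minimum keeps exp() from overflowing on the branch that is discarded.
    return np.where(d <= 0.0, np.expm1(np.minimum(d, 0.0)), d)

def solvation_free_energy(h, c, rho, kT, voxel_volume):
    """Eq 21 on a uniform grid; h and c have shape (n_sites, nx, ny, nz)."""
    h = np.asarray(h, dtype=float)
    c = np.asarray(c, dtype=float)
    theta = (h < 0.0).astype(float)  # Heaviside step Theta(-h)
    integrand = 0.5 * h**2 * theta - c - 0.5 * h * c
    # Sum over water sites and grid points approximates sum_gamma of the integral.
    return rho * kT * voxel_volume * integrand.sum()
```

The value of the step function at $h = 0$ does not matter here because it multiplies $h^2$.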

3.3. Optimal Choice of the Smoothing Parameter

The optimal choice of the smoothing parameter λ in Hodrick-Prescott filtering (see eq 3) was made using two-fold cross-validation. 35 In this optimization, the time series $f_t$ ($t = 1, \ldots, M$) is divided into two partitions, $f_t^{(1)}$ for $t = 1, 3, \ldots, M - 1$ and $f_t^{(2)}$ for $t = 2, 4, \ldots, M$. We first take $f_t^{(1)}$ as the training set, and perform Hodrick-Prescott filtering to obtain its smooth part, $f_t^{\mathrm{conf}(1)}$ for $t = 1, 3, \ldots, M - 1$. We then carry out the cubic-spline interpolation to get a “prediction” for the smooth part for $t = 2, 4, \ldots, M$, and this prediction is compared with the testing set $f_t^{(2)}$ and the standard error is computed. This procedure is repeated by taking $f_t^{(2)}$ as the training set and $f_t^{(1)}$ as the testing set, and the cross-validation standard error $s_{\mathrm{CV}}(\lambda)$ is computed as the average of the two standard errors. To find the optimal value, λ is varied on a grid to search for a minimum of $s_{\mathrm{CV}}(\lambda)$. The cross-validation profiles, $s_{\mathrm{CV}}(\lambda)$ versus $\log_{10} \lambda$, based on the representative simulation trajectories are shown in Figure S1, from which the optimal value λ = 6400 was determined. This value is close to 1600 used in the original work. 29 (As can be inferred from Figure S1, the influence of λ varies with the logarithm of λ, in which case $\log_{10} 6400 \approx 3.8$ and $\log_{10} 1600 \approx 3.2$ are rather close.) We also investigated the results with λ = 640 and 64000 to demonstrate the robustness of our conclusions with respect to the choice of λ.
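The cross-validation loop described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: linear interpolation (`np.interp`) stands in for the cubic spline, the "standard error" is taken to be the root-mean-square deviation on the held-out points, the filter is solved with dense matrices, and the data are synthetic. The names `hp_trend` and `two_fold_cv_error` are hypothetical.

```python
import numpy as np

def hp_trend(f, lam):
    """Slow part of f from eq 3, via the dense system (I + lam * D^T D) s = f."""
    f = np.asarray(f, dtype=float)
    d = np.diff(np.eye(f.size), n=2, axis=0)  # second-difference matrix
    return np.linalg.solve(np.eye(f.size) + lam * d.T @ d, f)

def two_fold_cv_error(f, lam):
    """Average held-out RMS error of the interpolated trend for one lambda."""
    f = np.asarray(f, dtype=float)
    t = np.arange(f.size)
    folds = (t[0::2], t[1::2])  # odd- and even-numbered points
    errors = []
    for train, test in (folds, folds[::-1]):
        trend = hp_trend(f[train], lam)
        # Linear interpolation stands in for the cubic spline of the text;
        # points outside the training range take the nearest end value.
        pred = np.interp(test, train, trend)
        errors.append(np.sqrt(np.mean((f[test] - pred) ** 2)))
    return 0.5 * (errors[0] + errors[1])

# Grid search on a synthetic series (made-up data, not simulation output).
rng = np.random.default_rng(2)
x = np.arange(300)
series = np.sin(2 * np.pi * x / 150.0) + 0.5 * rng.standard_normal(x.size)
lam_grid = 10.0 ** np.arange(-2, 7)
cv = np.array([two_fold_cv_error(series, lam) for lam in lam_grid])
lam_best = lam_grid[np.argmin(cv)]
```

A logarithmically spaced grid is used because, as noted above, the influence of λ varies with its logarithm.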

4. RESULTS

Representative $f_t$ curves for the folded and unfolded states, quantifying the protein dynamics in the respective regions of the free-energy landscape, are shown in Figures 2a and d, respectively. We observe in these curves the presence of slowly varying and quickly oscillating components; that is to say, the center of oscillation drifts on a timescale much longer than the oscillation period, and this feature is more evident in the unfolded-state curve (Figure 2d). This reflects the dynamics on the rugged free-energy landscape, consisting of rapid vibrations in individual free-energy wells and slow conformational transitions between them.

Figure 2 also displays the dissection of $f_t$ into the conformational ($f_t^{\mathrm{conf}}$) and vibrational ($f_t^{\mathrm{vib}}$) components based on Hodrick-Prescott filtering with λ = 6400. We find that the conformational part of the unfolded-state dynamics exhibits a larger variation than that of the folded-state dynamics (cf. Figures 2b and e), reflecting the fact that a larger number of distinct free-energy wells are explored in the unfolded state. On the other hand, we observe that the widths of the vibrational dynamics do not change significantly over time and that the magnitude of the vibrational fluctuations is comparable in the folded and unfolded states (Figures 2c and f).

To examine the dependence of the slow and fast components on the choice of λ, we also carried out the dissection with λ = 640 (Figure S2) and 64000 (Figure S3). We find that a smoother $f_t^{\mathrm{conf}}$ curve is obtained with a larger value of λ, as is consistent with the aforementioned nature of Hodrick-Prescott filtering. Thus, quantitative aspects of the conformational and vibrational components are affected by the value of λ. However, this λ dependence should not be taken as a flaw of our dissection method. In fact, dividing the dynamics into the conformational and vibrational terms necessarily calls for the introduction of a barrier cutoff. 13 For example, the dynamics between free-energy wells separated by a barrier of 0.7 kcal/mol is regarded as a conformational transition with the cutoff of 0.6 kcal/mol, but as vibrational dynamics with the cutoff of 0.8 kcal/mol.
Thus, the larger portion of the dynamics is classified as vibrational with the larger cutoff, which amounts to choosing a larger value of the smoothing parameter λ. Nevertheless, as we show below, qualitative features of the conformational and vibrational components remain invariant under the change in λ. In particular, it is demonstrated that changes in conformational and

vibrational entropy upon folding are insensitive to λ.

Probability distribution functions $W(f)$ of the effective energy $f$ for the folded and unfolded states are shown in Figures 3a and c, respectively. $W(f)$ is well approximated by a Gaussian distribution for both the folded and unfolded states, as a result of the central limit theorem (ref 27). This allows us to compute the protein configurational entropy $S_{\mathrm{config}}$ for the folded and unfolded states through eq 12. Thus, our approach based on the effective energy (i.e., the "energetic approach") provides a simple recipe for computing the free energy difference ($\Delta F = \Delta f - T\Delta S_{\mathrm{config}}$). We previously found for HP-36 that the folding free energy (−2.5 kcal/mol) computed in this way agrees with the experimental data (−2.3 to −3.2 kcal/mol) (ref 27). We also demonstrated that this small value results from a large cancellation between the effective energy change ($\Delta f$ = −22.1 kcal/mol), which quantifies the difference in height between the folded- and unfolded-state regions of the free-energy landscape, and the protein configurational entropy change ($T\Delta S_{\mathrm{config}}$ = −19.5 kcal/mol), which is associated with the magnitude of the fluctuations in the respective regions (ref 27).

We also observe that the probability distribution $W_{\mathrm{vib}}(f_{\mathrm{vib}})$ of the vibrational component is well approximated by a Gaussian distribution for both the folded and unfolded states (Figures 3b and d). This enables us not only to compute the vibrational entropy $S_{\mathrm{vib}}$ from eq 18, but also to evaluate the conformational entropy $S_{\mathrm{conf}}$ through the relation $S_{\mathrm{conf}} = S_{\mathrm{config}} - S_{\mathrm{vib}}$. The results for $S_{\mathrm{config}}$, $S_{\mathrm{conf}}$, and $S_{\mathrm{vib}}$ from all the independent trajectories, as well as their averages and changes upon folding, are shown in Table 1. The magnitude of $S_{\mathrm{vib}}$ is significantly larger than that of $S_{\mathrm{conf}}$ in each of the folded and unfolded states. However, the change in $S_{\mathrm{vib}}$ upon folding is rather small; consequently, the configurational entropy change upon folding is dominated by the conformational component. These features are robust: they remain invariant under the change in the smoothing parameter λ (see Tables S1 and S2, which display the results for λ = 640 and 64000, respectively). In particular, when the changes in these quantities upon folding are considered, the insensitivity to λ holds at the quantitative level: for λ = 640,
$T\Delta S_{\mathrm{conf}}$ and $T\Delta S_{\mathrm{vib}}$ are −23.8 ± 2.6 and 4.3 ± 0.9 kcal/mol, respectively; for λ = 6400, they are −23.9 ± 2.5 and 4.5 ± 0.9 kcal/mol; and for λ = 64000, they are −23.7 ± 2.5 and 4.2 ± 1.0 kcal/mol. Thus, the changes in the entropy components upon folding are independent of λ within the statistical errors.
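The Hodrick-Prescott dissection discussed in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the standard penalized least-squares form of the filter (squared deviation from the data plus λ times the squared second differences of the trend), and the function name `hp_filter`, the synthetic series, and its sampling are assumptions made for the example. The appropriate λ depends on the sampling interval of the trajectory, which is not stated in this section, so the values 640, 6400, and 64000 are reused here only to show the trend becoming smoother as λ grows.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def hp_filter(y, lam):
    """Return (trend, residual) from Hodrick-Prescott filtering of a 1-D series.

    The trend minimizes sum((y - trend)**2) + lam * sum(diff(trend, 2)**2).
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-difference operator: each row is (1, -2, 1); shape (n - 2, n).
    d2 = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    # Normal equations of the penalized least-squares problem.
    system = sparse.identity(n) + lam * (d2.T @ d2)
    trend = spsolve(sparse.csc_matrix(system), y)
    return trend, y - trend


# Synthetic stand-in for an effective-energy time series (NOT simulation data).
rng = np.random.default_rng(0)
t = np.arange(2000)
slow = 20.0 * np.sin(2.0 * np.pi * t / t.size)  # mimics slow inter-well drift
fast = rng.normal(0.0, 5.0, size=t.size)        # mimics fast in-well motion
f_t = -100.0 + slow + fast

roughness = {}
for lam in (640, 6400, 64000):
    f_conf, f_vib = hp_filter(f_t, lam)  # trend -> "conf", residual -> "vib"
    # Sum of squared second differences: smaller means a smoother trend.
    roughness[lam] = float(np.sum(np.diff(f_conf, n=2) ** 2))
    print(f"lambda={lam}: roughness={roughness[lam]:.4f}, std(f_vib)={f_vib.std():.3f}")
```

In this sketch the trend plays the role of $f_t^{\mathrm{conf}}$ and the residual the role of $f_t^{\mathrm{vib}}$; a series that is exactly linear in time has zero second differences and is returned unchanged as trend.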

5. DISCUSSION

The loss of configurational entropy has been well recognized as the major unfavorable effect in the energetics of protein folding (ref 36). However, it has remained an open issue whether the entropic loss arises mainly from the conformational contribution or from the vibrational term. In this regard, it was pointed out on the basis of the harmonic approximation that the vibrational entropy is similar in magnitude between a folded protein and a random coil in a single potential well (ref 6). By combining this computational result with experimental implications, it was suggested that, although the vibrational entropy is significantly larger than the conformational entropy, the most important entropy change upon folding originates from the latter (ref 6). This was further corroborated by entropy estimations based on several empirical methods (ref 37). The first microscopic calculations of the conformational ($S_{\mathrm{conf}}$) and vibrational ($S_{\mathrm{vib}}$) contributions in the folded and unfolded states, presented in this work, fully support this conclusion. Indeed, we find that the magnitude of $S_{\mathrm{vib}}$ is significantly larger than that of $S_{\mathrm{conf}}$ in each of the folded and unfolded states, but that $S_{\mathrm{vib}}$ does not vary much upon folding; therefore, the change in configurational entropy upon folding is dominated by the conformational contribution.

While the main focus of the present work is the ruggedness of the protein free-energy landscape, our computational method also has the potential to offer a fully microscopic characterization of the "funneled" nature of the landscape. The funneled free-energy landscape has provided a conceptual framework for understanding many aspects of
protein folding (refs 38–40). Its essence is embodied in the following model of the free energy (ref 41):

$$F(n) = E(n) - \frac{\Delta E^2(n)}{2 k_B T} - T S_0(n) \qquad (22)$$

Here, we used the original notation in ref 41: $E(n)$ is the average energy that takes into account the solvation effect as well, $\Delta E^2(n)$ represents the distribution in energy, and $S_0(n)$ refers to the conformational entropy, expressed as a function of the configurational similarity $n$ to the native structure ($n = 1$ corresponds to complete similarity and $n = 0$ to no similarity to the native structure). In this model, the funneled shape of the landscape arises mainly because both the average energy $E(n)$ and the entropy $S_0(n)$ decrease as the native structure is approached (i.e., as $n \to 1$) (ref 41). By introducing the state characterized by $n$, we obtain from our approach:

$$\hat{F}(n) = f(n) - \frac{\Delta f_{\mathrm{vib}}^2(n)}{2 k_B T} - T S_{\mathrm{conf}}(n) \qquad (23)$$

The correspondence between eqs 22 and 23 should be self-evident. Whereas the terms in eq 22, in particular the entropy term $S_0(n)$, are typically handled with phenomenological models, all the terms in eq 23 can be computed with our method, provided that a sufficient number of protein configurations are available for each $n$. Recent computer simulations are starting to offer such trajectories covering the full range of $n$ (refs 42–44), and our method will enable a characterization of the funneled free-energy landscape from first principles.

Interestingly, recent computational studies on small molecules have observed that changes in configurational entropy upon binding can be attributed more to the vibrational component ($\Delta S_{\mathrm{vib}}$) than to the conformational one ($\Delta S_{\mathrm{conf}}$) (refs 13, 14), which contrasts with the case of protein folding. Similar results indicating a greater importance of the vibrational entropy were also reported for protein-ligand binding and protein-protein association (refs 21, 22, 45). These observations indicate that, to elucidate in detail the role of entropy in intrinsically disordered proteins, which often exhibit coupled folding and binding (ref 46), it is indispensable to have a
means that is not only applicable to disordered proteins but also gives access to both the conformational and vibrational entropy. These two requirements render our computational method promising because the energetic approach, on which our dissection method is based, has been successfully applied to intrinsically disordered proteins (ref 28). Our approach will thus lead to fascinating applications in uncovering the molecular mechanisms by which intrinsically disordered proteins fulfill their functions through entropy.
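As an illustration of how the terms of eq 23 might be evaluated for one value of $n$, the sketch below works from samples of $f$ and $f_{\mathrm{vib}}$. It rests on an assumption that should be checked against eqs 12 and 18, which are not reproduced in this section: that for Gaussian-distributed fluctuations the entropy term equals the variance divided by $2 k_B T$, as the second term of eq 23 suggests for the vibrational part. The function names, the temperature of 300 K, and the synthetic samples are all hypothetical.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def ts_from_variance(samples, temperature):
    """T*S in kcal/mol under the ASSUMED Gaussian relation T*S = var / (2 kB T)."""
    return float(np.var(samples)) / (2.0 * KB * temperature)


def free_energy_eq23(f, f_vib, temperature):
    """Evaluate the terms of eq 23 for one bin of the similarity n."""
    ts_config = ts_from_variance(f, temperature)
    ts_vib = ts_from_variance(f_vib, temperature)  # second term of eq 23
    ts_conf = ts_config - ts_vib                   # S_conf = S_config - S_vib
    f_hat = float(np.mean(f)) - ts_vib - ts_conf   # eq 23, term by term
    return f_hat, ts_conf, ts_vib


# Synthetic samples for a single hypothetical n bin (NOT simulation data).
rng = np.random.default_rng(1)
temperature = 300.0  # K, assumed for illustration
f_vib = rng.normal(0.0, 18.0, size=50_000)
f_conf = rng.normal(-120.0, 8.0, size=50_000)
f = f_conf + f_vib

f_hat, ts_conf, ts_vib = free_energy_eq23(f, f_vib, temperature)
print(f"F_hat = {f_hat:.1f}, T*S_conf = {ts_conf:.1f}, T*S_vib = {ts_vib:.1f} kcal/mol")
```

Under the stated assumption, the vibrational and conformational terms add up to the total variance term, so the result reduces to the mean of $f$ minus its variance over $2 k_B T$; the split matters only for attributing the entropy to its two components.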

Acknowledgement

This work was supported by the Samsung Science and Technology Foundation under Project Number SSTF-BA1401-13.

Supporting Information Available

Appendixes A–C for technical details on the derivation of some results; the cross-validation profile for determining the optimal smoothing parameter λ; and the dissection of the protein dynamics and configurational entropy with λ = 640 and 64000. This material is available free of charge via the Internet at http://pubs.acs.org/.

References

(1) Frederick, K. K.; Marlow, M. S.; Valentine, K. G.; Wand, A. J. Conformational Entropy in Molecular Recognition by Proteins. Nature 2007, 448, 325–329.
(2) Boehr, D. D.; Nussinov, R.; Wright, P. E. The Role of Dynamic Conformational Ensembles in Biomolecular Recognition. Nat. Chem. Biol. 2009, 5, 789–796.
(3) Smock, R. G.; Gierasch, L. M. Sending Signals Dynamically. Science 2009, 324, 198–203.

(4) Tzeng, S.-R.; Kalodimos, C. G. Protein Activity Regulation by Conformational Entropy. Nature 2012, 488, 236–240.
(5) Wand, A. J. The Dark Energy of Proteins Comes to Light: Conformational Entropy and its Role in Protein Function Revealed by NMR Relaxation. Curr. Opin. Struct. Biol. 2013, 23, 75–81.
(6) Karplus, M.; Ichiye, T.; Pettitt, B. M. Configurational Entropy of Native Proteins. Biophys. J. 1987, 52, 1083–1085.
(7) Flock, T.; Weatheritt, R. J.; Latysheva, N. S.; Babu, M. M. Controlling Entropy to Tune the Functions of Intrinsically Disordered Regions. Curr. Opin. Struct. Biol. 2014, 26, 62–72.
(8) Andricioaei, I.; Karplus, M. On the Calculation of Entropy from Covariance Matrices of the Atomic Fluctuations. J. Chem. Phys. 2001, 115, 6289–6292.
(9) Schäfer, H.; Mark, A. E.; van Gunsteren, W. F. Absolute Entropies from Molecular Dynamics Simulation Trajectories. J. Chem. Phys. 2000, 113, 7809–7813.
(10) Schlitter, J. Estimation of Absolute and Relative Entropies of Macromolecules Using the Covariance Matrix. Chem. Phys. Lett. 1993, 215, 617–621.
(11) Levy, R. M.; Karplus, M.; Kushick, J.; Perahia, D. Evaluation of the Configurational Entropy for Proteins: Application to Molecular Dynamics Simulations of an α-Helix. Macromolecules 1984, 17, 1370–1374.
(12) Karplus, M.; Kushick, J. N. Method for Estimating the Configurational Entropy of Macromolecules. Macromolecules 1981, 14, 325–332.
(13) Chang, C.-E.; Gilson, M. K. Free Energy, Entropy, and Induced Fit in Host-Guest Recognition: Calculations with the Second-Generation Mining Minima Algorithm. J. Am. Chem. Soc. 2004, 126, 13156–13164.

(14) Chang, C.-E. A.; Chen, W.; Gilson, M. K. Ligand Configurational Entropy and Protein Binding. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 1534–1539.
(15) Carlsson, J.; Åqvist, J. Calculations of Solute and Solvent Entropies from Molecular Dynamics Simulations. Phys. Chem. Chem. Phys. 2006, 8, 5385–5395.
(16) Meirovitch, H. Methods for Calculating the Absolute Entropy and Free Energy of Biological Systems Based on Ideas from Polymer Physics. J. Mol. Recognit. 2009, 23, 153–172.
(17) Killian, B. J.; Kravitz, J. Y.; Gilson, M. K. Extraction of Configurational Entropy from Molecular Simulations via an Expansion Approximation. J. Chem. Phys. 2007, 127, 024107.
(18) Hnizdo, V.; Darian, E.; Fedorowicz, A.; Demchuk, E.; Li, S.; Singh, H. Nearest-Neighbor Nonparametric Method for Estimating the Configurational Entropy of Complex Molecules. J. Comput. Chem. 2007, 28, 655–668.
(19) Hnizdo, V.; Tan, J.; Killian, B. J.; Gilson, M. K. Efficient Calculation of Configurational Entropy from Molecular Simulations by Combining the Mutual-Information Expansion and Nearest-Neighbor Methods. J. Comput. Chem. 2008, 29, 1605–1614.
(20) King, B. M.; Silver, N. W.; Tidor, B. Efficient Calculation of Molecular Configurational Entropies Using an Information Theoretic Approximation. J. Phys. Chem. B 2012, 116, 2891–2904.
(21) Chang, C.-E. A.; McLaughlin, W. A.; Baron, R.; Wang, W.; McCammon, J. A. Entropic Contributions and the Influence of the Hydrophobic Environment in Promiscuous Protein-Protein Association. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 7456–7461.
(22) Killian, B. J.; Kravitz, J. Y.; Somani, S.; Dasgupta, P.; Pang, Y.-P.; Gilson, M. K. Configurational Entropy in Protein-Peptide Binding: Computational Study of Tsg101
Ubiquitin E2 Variant Domain with an HIV-Derived PTAP Nonapeptide. J. Mol. Biol. 2009, 389, 315–335.
(23) Hensen, U.; Lange, O. F.; Grubmüller, H. Estimating Absolute Configurational Entropies of Macromolecules: The Minimally Coupled Subspace Approach. PLoS ONE 2010, 5, e9179.
(24) Suárez, E.; Díaz, N.; Suárez, D. Entropy Calculations of Single Molecules by Combining the Rigid-Rotor and Harmonic-Oscillator Approximations with Conformational Entropy Estimations from Molecular Dynamics Simulations. J. Chem. Theory Comput. 2011, 7, 2638–2653.
(25) Fenley, A. T.; Killian, B. J.; Hnizdo, V.; Fedorowicz, A.; Sharp, D. S.; Gilson, M. K. Correlation as a Determinant of Configurational Entropy in Supramolecular and Protein Systems. J. Phys. Chem. B 2014, 118, 6447–6455.
(26) Chong, S.-H.; Ham, S. Configurational Entropy of Protein: A Combined Approach Based on Molecular Simulation and Integral-Equation Theory of Liquids. Chem. Phys. Lett. 2011, 504, 225–229.
(27) Chong, S.-H.; Ham, S. Protein Folding Thermodynamics: A New Computational Approach. J. Phys. Chem. B 2014, 118, 5017–5025.
(28) Chong, S.-H.; Ham, S. Conformational Entropy of Intrinsically Disordered Protein. J. Phys. Chem. B 2013, 117, 5503–5509.
(29) Hodrick, R. J.; Prescott, E. C. Postwar U.S. Business Cycles: An Empirical Investigation. J. Money Credit Banking 1997, 29, 1–16.
(30) Lazaridis, T.; Karplus, M. Thermodynamics of Protein Folding: A Microscopic View. Biophys. Chem. 2003, 100, 367–395.

(31) Imai, T.; Harano, Y.; Kinoshita, M.; Kovalenko, A.; Hirata, F. A Theoretical Analysis on Hydration Thermodynamics of Proteins. J. Chem. Phys. 2006, 125, 024911.
(32) van Kampen, N. G. Stochastic Processes in Physics and Chemistry, 3rd ed.; North-Holland: Amsterdam, 2007.
(33) Weinan, E. Principles of Multiscale Modeling; Cambridge University Press: Cambridge, 2011.
(34) McKnight, C. J.; Matsudaira, P. T.; Kim, P. S. NMR Structure of the 35-Residue Villin Headpiece Subdomain. Nat. Struct. Biol. 1997, 4, 180–184.
(35) Eilers, P. H. C. A Perfect Smoother. Anal. Chem. 2003, 75, 3631–3636.
(36) Dill, K. A. Dominant Forces in Protein Folding. Biochemistry 1990, 29, 7133–7155.
(37) Doig, A. J.; Sternberg, M. J. E. Side-Chain Conformational Entropy in Protein Folding. Protein Sci. 1995, 4, 2247–2251.
(38) Wolynes, P. G.; Onuchic, J. N.; Thirumalai, D. Navigating the Folding Routes. Science 1995, 267, 1619–1620.
(39) Dill, K. A.; Chan, H. S. From Levinthal to Pathways to Funnels. Nat. Struct. Biol. 1997, 4, 10–19.
(40) Oliveberg, M.; Wolynes, P. G. The Experimental Survey of Protein-Folding Energy Landscapes. Q. Rev. Biophys. 2005, 38, 245–288.
(41) Bryngelson, J. D.; Onuchic, J. N.; Socci, N. D.; Wolynes, P. G. Funnels, Pathways, and the Energy Landscape of Protein Folding: A Synthesis. Proteins 1995, 21, 167–195.
(42) Lane, T. J.; Shukla, D.; Beauchamp, K. A.; Pande, V. S. To Milliseconds and Beyond: Challenges in the Simulation of Protein Folding. Curr. Opin. Struct. Biol. 2013, 23, 58–65.

(43) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How Fast-Folding Proteins Fold. Science 2011, 334, 517–520.
(44) Freddolino, P. L.; Harrison, C. B.; Liu, Y.; Schulten, K. Challenges in Protein-Folding Simulations. Nat. Phys. 2010, 6, 751–758.
(45) Thorpe, I. F.; Brooks III, C. L. Molecular Evolution of Affinity and Flexibility in the Immune System. Proc. Natl. Acad. Sci. U.S.A. 2007, 104, 8821–8826.
(46) Sugase, K.; Dyson, H. J.; Wright, P. E. Mechanism of Coupled Folding and Binding of an Intrinsically Disordered Protein. Nature 2007, 447, 1021–1025.

Figures

[Figure 1: image not reproduced. Labels in panels a and b: free energy landscape, configuration space, folded state, unfolded state, conformational, vibrational.]

Figure 1: (a) Schematic representation of the free-energy landscape. Representative HP-36 structures from the ten independent unfolded-state simulations and from the three independent folded-state simulations are also displayed. (b) Protein dynamics on the free-energy landscape can be described as a series of vibrational dynamics within individual wells and intervening conformational transitions between them.

[Figure 2: image not reproduced. Folded state (panels a–c) and unfolded state (panels d–f). Panels a and d: dynamics on the landscape, $f_t$ [kcal/mol] versus time [ns]. Panels b and e: conformational part, $f_t^{\mathrm{conf}}$ [kcal/mol] versus time [ns]. Panels c and f: vibrational part, $f_t^{\mathrm{vib}}$ [kcal/mol] versus time [ns]. Time axes run from 0 to 500 ns.]

Figure 2: Dissection of protein dynamics ($f_t$) on the free-energy landscape. (a–c) $f_t$ curve based on the representative folded-state simulation as a function of simulation time (a) and its separation into the conformational component $f_t^{\mathrm{conf}}$ (b) and vibrational component $f_t^{\mathrm{vib}}$ (c) using Hodrick-Prescott filtering with λ = 6400. (d–f) Corresponding results based on the representative unfolded-state simulation.

[Figure 3: image not reproduced. Folded state: (a) probability distribution of $f$, $W(f)$ versus $f$ [kcal/mol]; (b) probability distribution of $f_{\mathrm{vib}}$, $W_{\mathrm{vib}}(f_{\mathrm{vib}})$ versus $f_{\mathrm{vib}}$ [kcal/mol]. Unfolded state: (c) and (d), same quantities. Values annotated in the folded-state panels: s = 0.074 and 0.053, k = 0.000 and −0.007. Values annotated in the unfolded-state panels: s = 0.049 and 0.068, k = −0.006 and 0.023.]

Figure 3: Probability distribution $W(f)$ of the effective energy $f$ (a) and $W_{\mathrm{vib}}(f_{\mathrm{vib}})$ of the vibrational component $f_{\mathrm{vib}}$ (b) based on the representative folded-state simulation. (c,d) Corresponding results based on the representative unfolded-state simulation. The dashed curve in each panel denotes the fit by the Gaussian distribution. Closeness to the Gaussian distribution is measured by the skewness (s) and excess kurtosis (k) (ref 27), both of which are zero for the Gaussian distribution.
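The Gaussianity measures named in the caption can be computed as in the sketch below. It uses the usual population definitions (third and fourth standardized moments, with 3 subtracted from the fourth); whether ref 27 uses exactly these estimators is an assumption, and the function name and the synthetic samples are illustrative only.

```python
import numpy as np


def skewness_and_excess_kurtosis(x):
    """Return (s, k): standardized third moment and fourth moment minus 3."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**3)), float(np.mean(z**4) - 3.0)


# Synthetic samples (NOT simulation data): one Gaussian, one clearly non-Gaussian.
rng = np.random.default_rng(2)
gaussian = rng.normal(-100.0, 20.0, size=200_000)
skewed = rng.exponential(20.0, size=200_000)

print("gaussian   :", skewness_and_excess_kurtosis(gaussian))
print("exponential:", skewness_and_excess_kurtosis(skewed))
```

For the Gaussian sample both values come out close to zero, while the exponential sample gives values far from zero, which is the contrast the caption relies on.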

Table 1: Dissection of protein configurational entropy with λ = 6400

                              T S_config [kcal/mol]    T S_conf [kcal/mol]    T S_vib [kcal/mol]
folded-state trajectories
  trajectory 1                326.8                    51.4                   275.5
  trajectory 2                335.7                    59.2                   276.5
  trajectory 3                330.0                    51.3                   278.7
  average (a)                 330.8 ± 2.1              54.0 ± 2.2             276.9 ± 0.8
unfolded-state trajectories
  trajectory 1                356.7                    84.0                   272.8
  trajectory 2                343.0                    71.7                   271.4
  trajectory 3                350.3                    76.3                   274.0
  trajectory 4                347.9                    75.5                   272.4
  trajectory 5                349.8                    77.0                   272.7
  trajectory 6                352.0                    80.3                   271.6
  trajectory 7                349.1                    75.9                   273.2
  trajectory 8                356.9                    86.5                   270.4
  trajectory 9                350.4                    74.6                   275.9
  trajectory 10               347.1                    77.4                   269.8
  average (a)                 350.3 ± 1.3              77.9 ± 1.3             272.4 ± 0.5

                              T ΔS_config [kcal/mol]   T ΔS_conf [kcal/mol]   T ΔS_vib [kcal/mol]
  difference                  −19.5 ± 2.5              −23.9 ± 2.5            4.5 ± 0.9

(a) Average ± standard error.
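The arithmetic behind Table 1 can be checked with a short script. The sketch below transcribes the tabulated values, verifies that the conformational and vibrational parts add up to the configurational entropy for each trajectory, and recomputes the changes upon folding as folded average minus unfolded average. The helper name is hypothetical, and the recomputed numbers start from values already rounded to one decimal, so agreement with the table is expected only to within about 0.1 kcal/mol. The free-energy line uses Δf = −22.1 kcal/mol as quoted in the text.

```python
import numpy as np

# T*S values in kcal/mol, transcribed from Table 1 (lambda = 6400).
folded = {
    "config": [326.8, 335.7, 330.0],
    "conf": [51.4, 59.2, 51.3],
    "vib": [275.5, 276.5, 278.7],
}
unfolded = {
    "config": [356.7, 343.0, 350.3, 347.9, 349.8, 352.0, 349.1, 356.9, 350.4, 347.1],
    "conf": [84.0, 71.7, 76.3, 75.5, 77.0, 80.3, 75.9, 86.5, 74.6, 77.4],
    "vib": [272.8, 271.4, 274.0, 272.4, 272.7, 271.6, 273.2, 270.4, 275.9, 269.8],
}


def additivity_residual(state):
    """Per-trajectory T*S_config - (T*S_conf + T*S_vib)."""
    return np.array(state["config"]) - np.array(state["conf"]) - np.array(state["vib"])


# Change upon folding = folded average - unfolded average.
delta = {key: float(np.mean(folded[key]) - np.mean(unfolded[key])) for key in folded}

# Free-energy change from Delta F = Delta f - T Delta S_config.
delta_F = -22.1 - delta["config"]

print("max |residual| folded  :", np.max(np.abs(additivity_residual(folded))))
print("max |residual| unfolded:", np.max(np.abs(additivity_residual(unfolded))))
for key, value in delta.items():
    print(f"T*dS_{key} = {value:.2f} kcal/mol")
print(f"dF = {delta_F:.2f} kcal/mol")
```

The recomputed differences land within 0.1 kcal/mol of the tabulated −19.5, −23.9, and 4.5 kcal/mol, and the resulting free-energy change is within about 0.1 kcal/mol of the −2.5 kcal/mol quoted in the text, which is consistent with rounding of the inputs.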

Graphical TOC Entry

[TOC graphic: image not reproduced. Labels: configurational entropy, conformational, vibrational, folded state, unfolded state.]
