Deriving Coarse-Grained Charges from All-Atom Systems: An Analytic


Article

Deriving Coarse-grained Charges from All-atom Systems: An Analytic Solution. Peter McCullagh, Peter T. Lake, and Martin McCullagh J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00507 • Publication Date (Web): 19 Aug 2016 Downloaded from http://pubs.acs.org on August 22, 2016




Journal of Chemical Theory and Computation

Deriving Coarse-grained Charges from All-atom Systems: An Analytic Solution

Peter McCullagh,† Peter T. Lake,‡ and Martin McCullagh∗,‡

†Department of Statistics, University of Chicago, Chicago, IL, USA
‡Department of Chemistry, Colorado State University, Fort Collins, CO, USA

E-mail: [email protected]

Abstract

An analytic method to assign optimal coarse-grained charges based on electrostatic potential matching is presented. This solution is the infinite size and density limit of grid-integration charge-fitting and is computationally more efficient by several orders of magnitude. The solution is also analytically minimized with respect to coarse-grained positions, which proves to be an extremely important step in reproducing the all-atom electrostatic potential. The joint optimal-charge optimal-position coarse-graining procedure is applied to a number of aggregating proteins using single-site per amino acid resolution. These models provide a good estimate of both the vacuum and Debye-Hückel screened all-atom electrostatic potentials in the vicinity and in the far-field of the protein. Additionally, these coarse-grained models are shown to approximate the all-atom dimerization electrostatic potential energy of ten aggregating proteins with good accuracy.

1 Introduction

Electrostatic interactions are a driving force in a variety of biomolecular processes including protein-DNA assembly, protein-protein aggregation and protein-ligand binding. All-atom molecular dynamics (aaMD) is a powerful tool that has been used to study these processes due, in part, to its relatively accurate treatment of molecular-level electrostatics. Many of these biological processes, however, occur on time- and length-scales that prove difficult to approach using atomic resolution. It is of interest, therefore, to create coarse-grained (CG) models that provide an accurate treatment of electrostatics. To that end we present an analytic approach to compute the optimal coarse-grained charges to match the electrostatic potential (ESP) generated by a given configuration of atomistic point charges. This solution is extremely efficient compared with grid-based integration approaches, and thus provides a platform to optimize both CG position and charge simultaneously.

Protein aggregation is a ubiquitous phenomenon in cellular biology, achieving functions such as actin cytoskeleton growth, microtubular assembly, and virus capsid growth. aaMD models used to investigate these processes treat the solvent, counter ions, and protein molecules as a set of finite volume charges with springs connecting bonded atoms. 1–4 The explicit representation of water allows for an accurate treatment of dielectric screening between solvent-exposed charges and the lack of this screening for buried charges. Implicit solvation approaches such as the Poisson-Boltzmann (PB) equation and its approximate generalized Born form attempt to account for this behavior by using two different dielectric constants (typically 1 for the interior of the protein and 78 for water) and a metric of solvent exposure to determine the level of charge-screening. 5,6 If built from explicit solvent simulation data, bottom-up coarse-graining procedures such as multiscale coarse-graining (MS-CG) 7,8 and relative entropy 9–11 can implicitly capture solvation effects using the pairwise (or higher) potentials of mean force (PMF) between coarse-grained sites.
These approaches require fine-grain simulation data of the exact system of interest and thus are unable to capture transferable behavior. We utilize the importance of charge in these types of systems to develop CG models that provide a consistent electrostatic description of the underlying all-atom model. Charge-fitting for transferable molecular mechanics models dates back to deriving partial charges on atoms from quantum chemical ESPs. The pioneering work by Bayly


et al. used a restrained grid-based least squares approach to calculate optimal atomistic point charges based on quantum mechanically derived ESPs. 12 Bayly et al. noted, however, that grid-based approaches suffer from a number of issues including a poor fit for buried charges. 12 The use of multiple molecular conformations and charge magnitude penalties during the fitting process improved the fit to buried charges. This restrained ESP charge-fitting method has proven to be extremely robust and is still used to derive atomistic charges for aaMD simulations over 20 years later. An analogous protocol for deriving charges on CG sites from underlying all-atom point charges is presented here.

The majority of previous approaches to CG charge-fitting use a lump charge approach in which atoms are assigned to specific CG sites and the charge on the CG site is the sum of the constituent atom charges. 13–18 Terakawa et al. note that this approach does not guarantee a good match to the all-atom electrostatic potential due to the length of charged amino acid side-chains and the possibility of intramolecular hydrogen bonding. 19 Instead, Terakawa et al. employ a grid-based charge-fitting approach in which they fit CG charges to the screened all-atom PB ESP around a set of globular proteins. 19 While this approach is designed to capture aspects of solvent screening, it requires restraints to conserve overall charge and suffers from grid-based convergence issues. Additionally, the approach of Terakawa et al. assumes that a Poisson-Boltzmann screened all-atom ESP can be fit by a Debye-Hückel CG ESP. The method presented here instead fits a vacuum CG ESP to the vacuum all-atom ESP, with the anticipation of applying the appropriate charge-screening function in the CG model simulation. We show that this approach is able to reproduce both the vacuum and Debye-Hückel all-atom ESPs with good accuracy.
The computational efficiency of the charge-fitting approach that we present allows us to simultaneously optimize CG site charge and position based on the same residual. We show this to be an extremely important step to accurately reproduce the all-atom ESP and it is one that has been ignored in previous charge-fitting procedures. CG site placement, however, has been considered based on matching structural 20–24 or dynamic 25–28 aspects of the underlying all-atom system. Our approach to optimizing CG


site position differs from these approaches because we simultaneously optimize site position and interaction potential. This is analogous to optimizing the relative entropy or MS-CG residuals with respect to site position, which would be challenging due to the large computational cost of computing these residuals. Recent work by Rudzinski et al. has made headway in this regard but only compares a discrete number of mappings rather than performing a direct minimization of site position. 29 Additionally, our CG site position optimization does not require a mapping operator, as is required in the MS-CG formalism, because the contribution from all of the atoms is considered for all CG sites in the formulation of our residual.

In the subsequent section we frame the problem, derive our solution and place grid-based integration into the same framework. Examples are provided to show that grid-based integration converges to the analytic solution in the limit of high grid density (Section 3.2) and infinite box volume (Section 3.1). The amino acid derived examples also highlight the need to optimize CG site placement (Sections 3.2 and 3.3).

2 Methods

We are interested first in determining the optimal set of coarse-grained (CG) charges, Q, at given CG positions, R, that approximate the electrostatic potential (ESP) due to all-atom (AA) charges, q, at positions, r. This is formulated as a minimization problem of the residual χ2 :

$$\chi^2(r, q, R, Q) = \int \left| \phi_{AA}(r, q, \vec{x}) - \phi_{CG}(R, Q, \vec{x}) \right|^2 d\vec{x} \tag{1}$$

where φAA (r, q, ~x) is the AA ESP and φCG (R, Q, ~x) is the CG ESP both evaluated at position ~x. We dictate that the AA and CG sites are simple point monopoles, implying


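that:

$$\phi_{AA}(r, q, \vec{x}) = \sum_{i=1}^{n} \frac{q_i}{4\pi\epsilon_0 \|\vec{x} - \vec{r}_i\|} \tag{2}$$

$$\phi_{CG}(R, Q, \vec{x}) = \sum_{i=1}^{N} \frac{Q_i}{4\pi\epsilon_0 \|\vec{x} - \vec{R}_i\|} \tag{3}$$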

where we have n AA sites and N CG sites. It should be noted that this form of the electrostatic potentials implies that the integral in Equation 1 (over $\mathbb{R}^3$) is finite only if the total monopole moments of the AA and CG charge distributions are identical: $\sum_i q_i = \sum_i Q_i$.
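Equations 2 and 3 can be illustrated with a minimal sketch that evaluates the vacuum ESP of a set of point monopoles. The function name `esp`, the toy coordinates and charges, and the unit convention (Coulomb constant of roughly 332.06 kcal Å mol⁻¹ e⁻², a common molecular mechanics value) are assumptions made for illustration and are not taken from the paper.

```python
import math

# Assumed unit convention: 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.0637


def esp(positions, charges, x):
    """Vacuum ESP at point x from point monopoles (form of Equations 2 and 3)."""
    total = 0.0
    for p, q in zip(positions, charges):
        total += COULOMB_K * q / math.dist(x, p)
    return total


# Hypothetical toy system: three "atoms" represented by a single CG site
# that carries their summed charge (so the monopole moments match).
aa_pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
aa_q = [0.4, 0.3, 0.3]
cg_pos = [(1.0 / 3.0, 1.0 / 3.0, 0.0)]
cg_q = [sum(aa_q)]

# Compare the two potentials close to the charges and far away from them.
for point in [(2.0, 0.0, 0.0), (1000.0, 0.0, 0.0)]:
    diff = esp(aa_pos, aa_q, point) - esp(cg_pos, cg_q, point)
    print(point, diff)
```

With equal net charge the two potentials agree in the far field and differ close to the sites, which is the regime the charge and position fitting is meant to improve.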

Minimizing the above residual with respect to coarse-grained charges, Q, yields a set of N linear equations, the kth of which has the form:

$$\sum_{i=1}^{n} q_i f(\vec{r}_i, \vec{R}_k) = \sum_{i=1}^{N} Q_i f(\vec{R}_i, \vec{R}_k) \tag{4}$$

where

$$f(\vec{r}_i, \vec{R}_k) = \int \frac{1}{\|\vec{x} - \vec{r}_i\|} \cdot \frac{1}{\|\vec{x} - \vec{R}_k\|} \, d\vec{x}. \tag{5}$$

The solution to the above integral is the crux of the problem. It is at this juncture that grid-based approaches and our analytic solution differ.
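The step from Equation 1 to Equation 4 can be written out explicitly. The following is a sketch of the standard least-squares argument using only the definitions in Equations 1 to 3 and 5. The manipulation is formal, since each integral taken separately over all space diverges.

```latex
\[
\begin{aligned}
\frac{\partial \chi^2}{\partial Q_k}
  &= -2 \int \left[ \phi_{AA}(r, q, \vec{x}) - \phi_{CG}(R, Q, \vec{x}) \right]
     \frac{1}{4\pi\epsilon_0 \|\vec{x} - \vec{R}_k\|} \, d\vec{x} = 0 \\
\Longrightarrow \quad
\sum_{i=1}^{n} q_i \int \frac{d\vec{x}}{\|\vec{x} - \vec{r}_i\| \, \|\vec{x} - \vec{R}_k\|}
  &= \sum_{i=1}^{N} Q_i \int \frac{d\vec{x}}{\|\vec{x} - \vec{R}_i\| \, \|\vec{x} - \vec{R}_k\|}
\end{aligned}
\]
```

The common factor $(4\pi\epsilon_0)^{-2}$ cancels from both sides, leaving the pairwise integrals $f$ of Equation 5 as the only quantities that need to be evaluated.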

2.1 Analytic Solution

Our approach is to restrict the inner product integral in Equation 5 to $\|\vec{x}\| \le X$, and subsequently to take the limit as $X \to \infty$. The bounded integral has the solution:

$$\int_{\|\vec{x}\| \le X} \frac{d\vec{x}}{\|\vec{x} - \vec{r}_i\| \, \|\vec{x} - \vec{R}_k\|} = 4\pi X - 2\pi \|\vec{r}_i - \vec{R}_k\| + O(X^{-1})$$

[...]

10 Å), the ESP begins to decay and becomes a 10/R potential beyond 20 Å. The grid, grid with Lagrange multiplier and analytic solutions all qualitatively capture this behavior. The lump charge approach over-predicts the ESP in the vicinity of the protein but approaches 10/R

asymptotically due to net charge conservation. The grid with Lagrange multiplier and analytic solutions track each other perfectly throughout the entire domain of the data, suggesting that a cubic grid with side length of 60 Å is large enough to converge the grid with Lagrange multiplier solution. The grid method, on the other hand, deviates from the analytic solution due to a lack of net charge conservation. Increased weight of the near-field in the grid approach leads to improved agreement with the atomistic ESP at radial distances less than 17 Å as compared to the analytic solution.

Much like in the lysine example, the fit to the atomistic ESP can be improved by minimizing the residual in Equation 16 with respect to both CG charge and CG site placement. With optimized CG sites, the CG ESP (orange curve in Figure 5) follows the atomistic ESP to within 10 kT/e for all radial distances, outperforming the grid method in the near field and asymptotically agreeing with the all-atom ESP due to charge conservation. This result demonstrates that the asymptotically correct charges derived from the analytic solution presented here can accurately describe both the near- and far-fields of the all-atom ESP when the positions of the CG sites are optimized. Maps of the all-atom and CG ESPs, similar to those of lysine in Figure 4, show that the CG site charge and position optimization provide an excellent qualitative description of the near-field of the protein (see Figure SI3).

To model how the vacuum-derived CG charges reproduce the ESP in a solvent-like environment we compute the ESP due to the Debye-Hückel charge-screening model.
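A Debye-Hückel screened ESP of point charges can be sketched as below, using the standard screened Coulomb form exp(−r/λ_D)/(ε_r r). The function name, the default Debye length of 10 Å, the relative dielectric constant of 78, and the unit convention are illustrative assumptions. The parameter values actually used for the figures are not stated in this passage.

```python
import math

# Assumed unit convention: 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.0637


def debye_huckel_esp(positions, charges, x, debye_length=10.0, eps_r=78.0):
    """Screened ESP at x: sum_i K q_i exp(-r_i / debye_length) / (eps_r r_i).

    debye_length (Angstrom) and eps_r are hypothetical defaults chosen for
    illustration only.
    """
    total = 0.0
    for p, q in zip(positions, charges):
        r = math.dist(x, p)
        total += COULOMB_K * q * math.exp(-r / debye_length) / (eps_r * r)
    return total


# Hypothetical usage: the same vacuum-fit charges are reused unchanged, and
# only the interaction kernel is swapped for the screened one.
sites = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
fit_charges = [1.0, -0.5]
print(debye_huckel_esp(sites, fit_charges, (15.0, 0.0, 0.0)))
```

Because the charges are fit in vacuum, switching between the vacuum and screened potentials requires no refitting in this sketch.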


ID 1R6R 34 ) which is composed of 80 amino acids (1382 atoms). The secondary and tertiary structure of this protein is similar to its West Nile analog but it has a larger overall charge of +18e. CG sites are placed at the COM of each of the 80 amino acids and the four CG charge-fitting approaches are applied. The resulting vacuum ESPs as a function of radial distance from the protein COM are given in Figure 6(a). The behavior is very similar to that of the West Nile analog in that the grid with Lagrange multiplier and analytic solutions again track each other and are consistent with the atomistic ESP at radial distances greater than 17 Å. The best CG approach is to optimize the analytic residual with respect to charge and site placement, resulting in the orange curve in Figure 6(a). Similar to the result for the West Nile protein example, optimizing CG site placement is found to be extremely important in reproducing the atomistic ESP. Similar trends are noted for the Debye-Hückel screened ESP plotted in Figure 6(b).

While it is insightful to look at the radial ESP, the proteins are not actually spherically symmetric so it is useful to probe the ESP in another way. We compute the dimerization electrostatic potential energy from the experimental structures of the protein dimers in vacuum by direct summation of the Coulomb interactions between the two monomers. In addition to the two proteins discussed above, the dimerization electrostatic potential energy is computed for eight other aggregating proteins (PDB IDs 1QGT, 35 2WQH, 36 2TMV, 37 4XHT, 38 4Y2F, 4ZMK, 39 4ZOU, 40 and 5COS 41 ). The results for the analytic CG charges placed on amino acid COM sites and on optimized sites are plotted versus the atomistic values in Figure 7.
Both CG site placement models qualitatively capture the atomistic trends, and optimizing the position of the CG sites improves the quantitative agreement between the CG and atomistic potential energies for all but three systems (PDB IDs 4XHT, 5COS, and 4Y2F). An average reduction in accuracy of 5% is observed for these three systems, while optimizing CG site positions yields an average improvement of 30% for the other seven systems. The agreement between the atomistic model and the optimized CG position model improves with increasing electrostatic potential energy due to the increasing dominance of the


monopole moment repulsion between the two monomers. The dramatic improvement seen in optimizing the CG positions at an atomistic potential energy of ∼ 1500 kcal/mol (PDB ID 2WQH) is due to the presence of four lysines at the protein-protein interface that are particularly poorly represented by COM mapping as described in the previous example. Overall, the agreement between the dimerization electrostatic potential energy of the charge-fit model with optimized positions with the atomistic model suggests that this approach can be used to study the charge component of protein-protein aggregation.
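The direct summation of Coulomb interactions between two monomers described above can be sketched as a double loop over inter-monomer pairs. The function name, the toy coordinates and charges, and the unit convention (kcal/mol with distances in Å) are assumptions for illustration. The same routine applies to either all-atom or CG point charges.

```python
import math

# Assumed unit convention: 1/(4*pi*eps0) in kcal*Angstrom/(mol*e^2).
COULOMB_K = 332.0637


def dimer_coulomb_energy(pos_a, q_a, pos_b, q_b):
    """Vacuum Coulomb energy between monomer A and monomer B.

    Only inter-monomer pairs are summed; intra-monomer terms are excluded.
    """
    energy = 0.0
    for pa, qa in zip(pos_a, q_a):
        for pb, qb in zip(pos_b, q_b):
            energy += COULOMB_K * qa * qb / math.dist(pa, pb)
    return energy


# Hypothetical two-site monomers separated along the x axis.
mono_a_pos = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
mono_a_q = [1.0, 0.5]
mono_b_pos = [(20.0, 0.0, 0.0), (23.8, 0.0, 0.0)]
mono_b_q = [1.0, 0.5]

print(dimer_coulomb_energy(mono_a_pos, mono_a_q, mono_b_pos, mono_b_q))
```

Evaluating the function once with atomistic charges and once with CG charges at the same dimer geometry gives the pair of values compared in a plot such as Figure 7.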

4 Conclusions

An analytic solution to coarse-grained charge-fitting based on an electrostatic potential matching residual is presented and validated. This approach is found to have a number of benefits over grid-based solutions including guaranteed charge conservation, a lack of convergence issues and a significant reduction in computational cost. The latter benefit is leveraged to minimize the same residual with respect to CG site placement. Optimizing CG site placement is found to be an extremely important step to accurately approximate the all-atom ESP. The charge and placement optimization protocol was found to approximate the radial ESP and dimerization electrostatic potential energy of the all-atom models with high accuracy.

While the results from our approach are promising, a CG model that best approximates the AA ESP is only one component needed to construct a CG model that can accurately capture aggregation behavior. Additional work is necessary to determine the appropriate dielectric screening model for aqueous aggregation, but the fact that our approach accurately captures both the vacuum and the Debye-Hückel ESP behavior is promising. The treatment of internal degrees of freedom for the CG particles must also be addressed. Beyond this, the hydrophobic contributions to aggregation behavior must be investigated.


5 Appendix A: Proof of Integral Solution

Let a, b be two points in real Euclidean space, $\mathbb{R}^3$, and let X be a large positive scalar. Define the bounded inner-product integral

$$I_{a,b}(X) = \int_{\|x\| \le X} \frac{dx}{\|x - a\| \cdot \|x - b\|},$$

and let $D_{a,b}(X) = I_{a,b}(X) - I_{0,0}(X)$, where $I_{0,0}(X) = 4\pi X$. The goal is to show that $I_{a,b}(X) = 4\pi X - 2\pi\|a - b\| + O(X^{-1})$ for large X, or, equivalently, that $D_{a,b}(\infty) = -2\pi\|a - b\|$.

The proof is in two stages. We first show that $D_{a,b}(X)$ has a finite limit $D_{a,b}(\infty)$ satisfying $D_{ga,gb}(\infty) = D_{a,b}(\infty)$ for every transformation $g : \mathbb{R}^3 \to \mathbb{R}^3$ in the set of Euclidean congruences, i.e., translation, rotation and reflection. Since the Euclidean norm is a maximal invariant, $D_{a,b}(\infty)$ is a function of $\|a - b\|$. Second, for all scalar multiples $\lambda$, we show that the difference satisfies $D_{\lambda a,\lambda b}(\infty) = |\lambda| D_{a,b}(\infty)$, which implies $D_{a,b}(\infty) \propto \|a - b\|$.

To begin the proof, a change of variables $x' = a - x$ gives

$$\begin{aligned}
D_{a,a}(\infty) &= \int_{\mathbb{R}^3} \left( \frac{1}{\|x - a\|^2} - \frac{1}{\|x\|^2} \right) dx \\
&= \int_{\mathbb{R}^3} \left( \frac{1}{\|x'\|^2} - \frac{1}{\|x' - a\|^2} \right) dx' \\
&= -D_{a,a}(\infty),
\end{aligned}$$

implying Da,a (∞) = 0 for every a. For Euclidean congruences, the transformation


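$x = gx' = g_1(x' - c)$ with orthogonal $g_1$ gives

$$\begin{aligned}
D_{ga,gb}(\infty) &= \int_{\mathbb{R}^3} \left( \frac{1}{\|x - ga\| \cdot \|x - gb\|} - \frac{1}{\|x\|^2} \right) dx \\
&= \int_{\mathbb{R}^3} \left( \frac{1}{\|gx' - ga\| \cdot \|gx' - gb\|} - \frac{1}{\|gx'\|^2} \right) dx' \\
&= \int_{\mathbb{R}^3} \left( \frac{1}{\|x' - a\| \cdot \|x' - b\|} - \frac{1}{\|x' - c\|^2} \right) dx' \\
&= \int_{\mathbb{R}^3} \left( \frac{1}{\|x' - a\| \cdot \|x' - b\|} - \frac{1}{\|x'\|^2} + \frac{1}{\|x'\|^2} - \frac{1}{\|x' - c\|^2} \right) dx' \\
&= D_{a,b}(\infty) - D_{c,c}(\infty) = D_{a,b}(\infty),
\end{aligned}$$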

implying that $D_{a,b}(\infty)$ is a function of the Euclidean distance $\|a - b\|$, for example, a polynomial in $\|a - b\|$. Finally, the transformation $x = \lambda x'$ for scalar $\lambda$, positive or negative, with Jacobian $|\lambda|^3$ gives

$$\begin{aligned}
D_{\lambda a,\lambda b}(\infty) &= \int_{\mathbb{R}^3} \left( \frac{1}{\|x - \lambda a\| \cdot \|x - \lambda b\|} - \frac{1}{\|x\|^2} \right) dx \\
&= \int_{\mathbb{R}^3} \left( \frac{1}{\lambda^2 \|x' - a\| \cdot \|x' - b\|} - \frac{1}{\lambda^2 \|x'\|^2} \right) |\lambda|^3 \, dx' \\
&= |\lambda| \, D_{a,b}(\infty),
\end{aligned}$$

implying $D_{a,b}(\infty) \propto \|a - b\|$. This argument does not establish the proportionality constant, $(-2\pi)$, but the Cauchy-Schwarz inequality

$$I_{a,b}(X) \le \sqrt{I_{a,a}(X) \, I_{b,b}(X)} = 4\pi X + O(X^{-1})$$

implies that the constant is negative, which is sufficient for optimization purposes. Hence, for any configuration of points $a_1, \ldots, a_n$ in $\mathbb{R}^3$, the $n \times n$ inner product matrix of negative Euclidean distances $A_{ij} = -\|a_i - a_j\|$ is positive definite on contrasts. In


other words,

$$\chi^2 = 2\pi q' A q = -2\pi \sum_{ij} \|a_i - a_j\| \, q_i q_j \ge 0$$

provided that the net charge is zero, $\sum_{i=1}^{n} q_i = 0$. For application to coarse-grain approximation, $a_1, \ldots, a_n$ is the combined list of AA and CG sites, while q is the combined list of AA charges and negative CG charges.
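The quadratic form above suggests a direct numerical recipe, sketched here under stated assumptions: the pairwise kernel is taken to be −‖a_i − a_j‖ as in this appendix, net-charge conservation is imposed with an explicit Lagrange multiplier, and the resulting small linear system is solved by plain Gaussian elimination. The function names and the toy data are hypothetical, and this is not necessarily how the authors' released code is organized.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is nonsingular (e.g. all CG sites are distinct).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_cg_charges(aa_pos, aa_q, cg_pos):
    """CG charges minimizing the distance-kernel residual at fixed CG sites."""
    n_cg = len(cg_pos)
    # Row k: sum_j |R_k - R_j| Q_j - mu = sum_i |R_k - r_i| q_i
    a = [[math.dist(rk, rj) for rj in cg_pos] + [-1.0] for rk in cg_pos]
    b = [sum(math.dist(rk, ri) * qi for ri, qi in zip(aa_pos, aa_q))
         for rk in cg_pos]
    # Last row: net charge conservation, sum_j Q_j = sum_i q_i
    a.append([1.0] * n_cg + [0.0])
    b.append(sum(aa_q))
    return solve_linear(a, b)[:n_cg]  # drop the multiplier mu


def residual(aa_pos, aa_q, cg_pos, cg_q):
    """chi^2 = -2 pi sum_ij |a_i - a_j| c_i c_j with c = (q, -Q)."""
    sites = list(aa_pos) + list(cg_pos)
    c = list(aa_q) + [-q for q in cg_q]
    return -2.0 * math.pi * sum(
        math.dist(si, sj) * ci * cj
        for si, ci in zip(sites, c)
        for sj, cj in zip(sites, c))


# Hypothetical toy system: four atoms mapped onto two CG sites.
aa_pos = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0), (4.5, 0.0, 1.0)]
aa_q = [0.5, -0.2, 0.4, 0.3]
cg_pos = [(0.75, 0.0, 0.0), (3.75, 0.25, 0.5)]
cg_q = fit_cg_charges(aa_pos, aa_q, cg_pos)
print(cg_q, residual(aa_pos, aa_q, cg_pos, cg_q))
```

Because `residual` is cheap to evaluate, the same function could serve as the objective for a numerical optimizer over `cg_pos`, which is the kind of joint charge and position fit discussed in the main text.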

Acknowledgement

MM would like to thank Colorado State University for start-up funding.

6 Associated Content

6.1 Code

All code used to generate the data in this paper can be found at: http://github.org/mccullaghlab/Coarse-grained-Charge-Fitting.

Supporting Information Available

Additional information is provided on the derivation of the ensemble average of the analytic charge-fitting procedure as well as a full matrix representation of the Lagrange grid charge-fitting procedure. A representation of the ESP around a protein from the all-atom model and the analytic CG fit charge models is also provided. This material is available free of charge via the Internet at http://pubs.acs.org/.

References

(1) Gebremichael, Y.; Chu, J.-W.; Voth, G. A. Biophys. J. 2008, 95, 2487–2499.


(2) Pfaendtner, J.; Lyman, E.; Pollard, T. D.; Voth, G. A. J. Mol. Biol. 2010, 396, 252–263.
(3) Saunders, M. G.; Voth, G. A. J. Mol. Biol. 2011, 413, 279–291.
(4) Zhao, G.; Perilla, J. R.; Yufenyuy, E. L.; Meng, X.; Chen, B.; Ning, J.; Ahn, J.; Gronenborn, A. M.; Schulten, K.; Aiken, C.; Zhang, P. Nature 2013, 497, 643–646.
(5) Bashford, D.; Case, D. A. Annu. Rev. Phys. Chem. 2000, 51, 129–152.
(6) Onufriev, A.; Bashford, D.; Case, D. A. J. Phys. Chem. B 2000, 104, 3712–3720.
(7) Izvekov, S.; Voth, G. A. J. Phys. Chem. B 2005, 109, 2469–2473.
(8) Izvekov, S.; Voth, G. A. J. Chem. Phys. 2005, 123, 134105.
(9) Shell, M. S. J. Chem. Phys. 2008, 129, 144108.
(10) Chaimovich, A.; Shell, M. S. J. Chem. Phys. 2011, 134, 094112.
(11) Rudzinski, J. F.; Noid, W. G. J. Chem. Phys. 2011, 135, 214101.
(12) Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. J. Phys. Chem. 1993, 97, 10269–10280.
(13) Kim, Y. C.; Hummer, G. J. Mol. Biol. 2008, 375, 1416–1433.
(14) Azia, A.; Levy, Y. J. Mol. Biol. 2009, 393, 527–542.
(15) Zarrine-Afsar, A.; Zhang, Z.; Schweiker, K. L.; Makhatadze, G. I.; Davidson, A. R.; Chan, H. S. Proteins 2012, 80, 858–870.
(16) Okazaki, K.-i.; Sato, T.; Takano, M. J. Am. Chem. Soc. 2012, 134, 8918–8925.
(17) Chu, X.; Wang, Y.; Gan, L.; Bai, Y.; Han, W.; Wang, E.; Wang, J. PLoS Comput. Biol. 2012, 8, e1002608.


(18) Cui, H.; Mim, C.; Vázquez, F. X.; Lyman, E.; Unger, V. M.; Voth, G. A. Biophys. J. 2013, 104, 404–411.
(19) Terakawa, T.; Takada, S. J. Chem. Theory Comput. 2014, 10, 711–721.
(20) Tschöp, W.; Kremer, K.; Hahn, O.; Batoulis, J.; Bürger, T. Acta Polym. 1998, 49, 75–79.
(21) Arkhipov, A.; Freddolino, P. L.; Schulten, K. Structure 2006, 14, 1767–1777.
(22) Harmandaris, V. A.; Reith, D.; van der Vegt, N. F. A.; Kremer, K. Macromol. Chem. Phys. 2007, 208, 2109–2120.
(23) Zhang, Z.; Voth, G. A. J. Chem. Theory Comput. 2010, 6, 2990–3002.
(24) Sinitskiy, A. V.; Saunders, M. G.; Voth, G. A. J. Phys. Chem. B 2012, 116, 8363–8374.
(25) Gohlke, H.; Thorpe, M. F. Biophys. J. 2006, 91, 2115–2120.
(26) Stepanova, M. Phys. Rev. E 2007, 76, 051918.
(27) Zhang, Z.; Lu, L.; Noid, W. G.; Krishna, V.; Pfaendtner, J.; Voth, G. A. Biophys. J. 2008, 95, 5073–5083.
(28) Guttenberg, N.; Dama, J. F.; Saunders, M. G.; Voth, G. A.; Weare, J.; Dinner, A. R. J. Chem. Phys. 2013, 138, 094111.
(29) Rudzinski, J. F.; Noid, W. G. J. Phys. Chem. B 2014, 118, 8295–8312.
(30) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiórkiewicz-Kuczera, J.; Yin, D.; Karplus, M. J. Phys. Chem. B 1998, 102, 3586–3616.


(31) MacKerell, A. D.; Feig, M.; Brooks, C. L. J. Comput. Chem. 2004, 25, 1400–1415.
(32) Cao, Z.; Voth, G. A. J. Chem. Phys. 2015, 143, 243116–11.
(33) Dokland, T.; Walsh, M.; Mackenzie, J. M.; Khromykh, A. A.; Ee, K.-H.; Wang, S. Structure 2004, 12, 1157–1163.
(34) Ma, L.; Jones, C. T.; Groesch, T. D.; Kuhn, R. J.; Post, C. B. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 3414–3419.
(35) Wynne, S. A.; Crowther, R. A.; Leslie, A. G. Mol. Cell 1999, 3, 771–780.
(36) Krachler, A. M.; Sharma, A.; Kleanthous, C. Proteins 2010, 78, 2131–2143.
(37) Namba, K.; Pattanayek, R.; Stubbs, G. J. Mol. Biol. 1989, 208, 307–325.
(38) Xie, S.; Mortusewicz, O.; Ma, H. T.; Herr, P.; Poon, R. Y. C.; Helleday, T.; Qian, C. Mol. Cell 2015, 60, 163–176.
(39) Deng, W.; Wu, J.; Wang, F.; Kanoh, J.; Dehe, P.-M.; Inoue, H.; Chen, J.; Lei, M. Cell Res. 2015, 25, 881–884.
(40) Yin, J.; Wan, B.; Sarkar, J.; Horvath, K.; Wu, J.; Chen, Y.; Cheng, G.; Wan, K.; Chin, P.; Lei, M.; Liu, Y. Nucleic Acids Res. 2016, 44, 4871–4880.
(41) Jensen, J. L.; Balbo, A.; Neau, D. B.; Chakravarthy, S.; Zhao, H.; Sinha, S. C.; Colbert, C. L. Biochemistry 2015, 54, 5867–5877.


Table of Contents graphic. Coarse-grained charges and positions are determined using an analytic approach to match the coarse-grained and all-atom electrostatic potentials.
