Are AMBER Force Fields and Implicit Solvation Models Additive? A


Article

Are AMBER force fields and implicit solvation models additive? A folding study with a balanced peptide test set Melina K. Robinson, Jacob I. Monroe, and M. Scott Shell J. Chem. Theory Comput., Just Accepted Manuscript • DOI: 10.1021/acs.jctc.6b00788 • Publication Date (Web): 12 Oct 2016 Downloaded from http://pubs.acs.org on October 16, 2016


Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Theory and Computation


[Figure: average RMSD (Å) versus time (ns, 0 to 60). Panels: 2I9M (helical), 1E0Q (beta sheet). Series: Best, Native; Worst, Native; Best, Extended; Worst, Extended.]

[Figure: average secondary structure percentage for each force field (ff14SB, ff14SBonlysc, ff14ipq, ff96) paired with igb5* and igb8. Panels: helical peptides, hairpin peptides. Series: Alpha Helix, Beta Sheet.]

[Figure: average percent top cluster for each force field (ff14SB, ff14SBonlysc, ff14ipq, ff96) paired with igb5* and igb8. Panels: helical peptides, hairpin peptides. Series: Native, Extended.]

[Figure: average number of salt bridges for each force field (ff14SB, ff14SBonlysc, ff14ipq, ff96) paired with igb5* and igb8. Series: Native, Extended.]


Are AMBER force fields and implicit solvation models additive? A folding study with a balanced peptide test set

Melina K. Robinson,† Jacob I. Monroe,† M. Scott Shell*

Department of Chemical Engineering, University of California – Santa Barbara

October 5, 2016

Abstract: Implicit solvation models have long been sought as routes to significantly increase the speed and capabilities of biomolecular simulations. However, it has not always been clear that force fields developed independently of solvation models can together accurately predict secondary structure and folding, and whether the separate influences of the solvation and force field models can be described as independent and additive (versus synergistic). Here, we test two implicit solvation models with several recently-developed protein force fields, within the AMBER simulation package. We create a representative set of five helical and five hairpin peptides, 11-20 amino acid residues in length, and calculate folded structures using replica exchange molecular dynamics simulations for all force field / solvent / peptide combinations, each with two instances using distinct starting configurations. In general, we find that no force field / solvent combination successfully folds all peptides, and that the hairpin peptides are more difficult to capture. That being said, the older ff96/igb5* combination does a reasonable job in folding multiple secondary structures, while ff14SB/igb5* and ff14ipq/igb8 work well for helical and hairpin motifs, respectively. All combinations give rise to similar numbers of salt bridges, except for solvent models paired with ff14ipq, which slightly enhances them. Interestingly, we are unable statistically to decouple the effects of force field, solvent model, and peptide secondary structure on performance, such that particular combinations can have specific effects. These results suggest that future efforts might benefit from co-development of implicit models with force fields, or from the use of emerging coarse-graining strategies that extract solvation effects in a bottom-up manner.



† These authors contributed equally to this work.

*Corresponding author: [email protected]


Introduction

Computer simulations of peptides and proteins have been critical to understanding their structures and in turn functions, but have required an extraordinary effort to develop accurate but computationally tractable models, two goals that often seem at odds. Starting with the work of Levitt and Warshel1, numerous protein models have been proposed in order to achieve correctly folded structures, with tradeoffs between accuracy, resolution, and tractability. Using extensive computational resources, a number of atomistic simulations of small proteins have been shown to correctly reproduce folded structures starting from unfolded states2–6. Implicit solvation models have been of particular interest as they replace explicitly-represented water with effective intraprotein interactions and thus dramatically reduce simulation demands2,7–11. Indeed, removing the solvent degrees of freedom is among the most desirable ways to reduce simulation times, while retaining full protein resolution and in principle using a force field that is thermodynamically consistent with the original, explicit-water all-atom model. However, while excellent theoretical and algorithmic advances have occurred12–15, there does not yet seem to exist a gold standard for implicit solvent simulations, even in approach. Moreover, it has not been clear how protein intramolecular force field parameterizations interact with and are influenced by implicit models.

The aim of this work is to perform a simple test on several recent force fields in combination with two implicit solvation models, assessing the ability of each pairing to correctly fold a diverse set of short helical and hairpin peptides. This is distinguished from some recent efforts in force field development, which have focused on obtaining the correct conformational ensembles and behavior of intrinsically disordered peptides and unfolded states in general10,16–20. Accurate reproduction of such disordered states is particularly important if predictions of the mechanisms and kinetics of folding are sought3,5,21. Additionally, any temperature dependence of unfolding requires the correct relative stabilities of the full ensemble of folded and unfolded states.8,22–24 However, α-helices and β-sheets are the most common secondary structures found within proteins and have historically been difficult to balance with many force fields in both explicit and implicit solvent11,25–29. Since this still appears to be a predominant issue in the field29, we focus on assessing this balance in model peptide systems that have well-defined folded and secondary structures.

We choose a particular set of test peptides (see Table 1) based on two criteria. First, they all exhibit a well-defined helical or hairpin fold, but their (relatively short) length allows replica exchange simulation methods to achieve extensive sampling and equilibration with a reasonable computational effort. Second, we include an equal number of helical and hairpin peptides, so as to explore the range of potential helical or beta bias. The use of a cohort of test systems also reduces potential investigator bias in making interpretations based on the particulars of one or two peptide folding studies. We seek not only to characterize the performance of these models, but also to identify the common ways by which the implicit solvation models influence the force field (e.g., are there ‘good’ or ‘bad’ solvent models across the board, or is there non-additive behavior or synergy with the force field?).

Optimization of protein and peptide models of this sort clearly requires consideration of two components: the force field itself and the implicit solvent model used. In some cases, these models are developed in tandem8,30, but more often each is developed independently. In fact, it is common to use explicit solvent simulations as part of the force field optimization procedure17,28,31–34, and the practice has been to associate a particular water model with each protein force field; yet such an association is then completely lost when considering implicit solvation. We focus here on force fields and implicit solvation models that are a part of the AMBER molecular dynamics simulation package, with the hope of achieving better cooperativity between these models than might be expected by mixing force field and implicit solvent implementations from different packages. Protein force field models contain descriptions of and parameters pertaining to all possible bonded and non-bonded interactions between allowed atom types. Implicit solvent models provide a continuum-level effect of water molecules on a solute.
In principle, it is possible to exactly decouple the effect of an explicit solvent into an effective intra-solute interaction free energy, to obtain a thermodynamically rigorous implicit model. If there are $N$ solute atoms embedded in a sea of $M$ solvent atoms, the total potential energy can be separated into solute-solute (X-X), solute-solvent (X-S) and solvent-solvent (S-S) interactions:

$$U(\mathbf{r}^N, \mathbf{s}^M) = U_{XX}(\mathbf{r}^N) + U_{XS}(\mathbf{r}^N, \mathbf{s}^M) + U_{SS}(\mathbf{s}^M) \tag{1}$$

where $U$ gives the total potential energy, and $\mathbf{r}^N$ and $\mathbf{s}^M$ are the coordinates of the solute and solvent atoms, respectively. Then, the total configurational partition function,

$$Z = \int\!\!\int e^{-\beta U_{XX}(\mathbf{r}^N) - \beta U_{XS}(\mathbf{r}^N, \mathbf{s}^M) - \beta U_{SS}(\mathbf{s}^M)} \, d\mathbf{r}^N \, d\mathbf{s}^M \tag{2}$$

can be rewritten in terms of the solute atoms alone:

$$Z = \int e^{-\beta U_{XX}(\mathbf{r}^N) - \beta W(\mathbf{r}^N)} \, d\mathbf{r}^N \tag{3}$$

Here, $\beta = 1/k_B T$ as usual and $W(\mathbf{r}^N)$ is the effective solvent-mediated interaction free energy obtained from integrating out the solvent degrees of freedom,

$$W(\mathbf{r}^N) = -k_B T \ln \int e^{-\beta U_{XS}(\mathbf{r}^N, \mathbf{s}^M) - \beta U_{SS}(\mathbf{s}^M)} \, d\mathbf{s}^M \tag{4}$$
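To make the step from Eq. (2) to Eq. (3) concrete, the following minimal sketch integrates out a single solvent coordinate numerically and checks that the solute-only partition function built from the effective free energy of Eq. (4) matches the full one. The three energy functions, the value of kT, and the grid are purely hypothetical choices for illustration; they do not correspond to any model used in this work.

```python
import math

K_B_T = 0.596          # kcal/mol, roughly 300 K (illustrative value)
BETA = 1.0 / K_B_T

# Hypothetical one-dimensional energy terms (one solute and one solvent coordinate).
def u_xx(r):
    return 0.5 * r * r

def u_xs(r, s):
    return 0.8 * (r - s) ** 2

def u_ss(s):
    return 0.3 * s ** 4

def grid(lo, hi, n):
    """Return n evenly spaced points on [lo, hi] and the spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)], step

def w_eff(r, s_pts, ds):
    """Eq. (4): W(r) = -kT ln of the solvent integral (simple Riemann sum)."""
    integral = sum(math.exp(-BETA * (u_xs(r, s) + u_ss(s))) for s in s_pts) * ds
    return -K_B_T * math.log(integral)

def z_explicit(r_pts, dr, s_pts, ds):
    """Eq. (2): integrate over both solute and solvent coordinates."""
    total = 0.0
    for r in r_pts:
        for s in s_pts:
            total += math.exp(-BETA * (u_xx(r) + u_xs(r, s) + u_ss(s)))
    return total * dr * ds

def z_implicit(r_pts, dr, s_pts, ds):
    """Eq. (3): integrate over the solute coordinate only, using W(r)."""
    return sum(math.exp(-BETA * (u_xx(r) + w_eff(r, s_pts, ds))) for r in r_pts) * dr

r_pts, dr = grid(-6.0, 6.0, 241)
s_pts, ds = grid(-6.0, 6.0, 241)
z2 = z_explicit(r_pts, dr, s_pts, ds)
z3 = z_implicit(r_pts, dr, s_pts, ds)
print(z2, z3)  # the two values agree to within floating-point round-off
```

In this toy case the effective interaction is cheap to tabulate because there is only one solvent coordinate; for a real solute the same integral runs over all solvent degrees of freedom for every solute configuration.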

In principle, with a reference explicit solvent model, Eqn. (4) provides the ideal interaction that any implicit strategy seeks to model. However, it is a major challenge to use this approach directly, as it is generally a highly multibody interaction that is neither easily described by simple theoretical forms nor implemented in a computationally efficient manner. Instead, the conventional tactic in implicit models is to take a more physically-motivated approach by separately modeling solvation electrostatics and then including a surface area term for non-polar solvation energetics. The surface area term technically absorbs many protein-solvent terms, including van der Waals interactions, the penalty for cavity formation, hydrophobic interactions, and solvent entropy effects. However, all are usually approximated through a macroscopic approach that uses an effective surface tension with the molecular surface area. The electrostatic term is more complicated and is generally modelled by considering separate dielectric media for the interior of the protein and the solvent. Such a scheme is well represented by numerical solution of the Poisson-Boltzmann (PB) equation; however, it is often more convenient and computationally efficient to employ an analytical approximation. This Generalized Born (GB) approach, first proposed by Still and coworkers35, assigns each atom an intrinsic radius, usually close to that described by its Lennard-Jones (LJ) parameters, and a then-calculated Born radius that describes the extent of its burial in the protein and exposure to solvent. An efficient, pairwise method for approximating the Born radii was first proposed by Hawkins, Cramer, and Truhlar36,37. Improvements have since focused on increasingly accurate calculations of Born radii13,38,39. One popular such method in the AMBER simulation package is due to Onufriev, Bashford, and Case38, denoted igb5 according to the AMBER input options.
However, this approach has been found to over-stabilize salt bridges7,40, and subsequent authors have suggested a slight modification in which the intrinsic radii of hydrogens attached to charged nitrogen atoms are decreased41 to correct for this. This has been our approach in earlier work25,42,43 and we continue to use it here, denoting the model igb5*. We also test a more recent model, igb8, that already contains a revised set of intrinsic radii.13 The igb8 implicit solvent also introduces an expanded parameter to refine the “neck” correction of Mongan and coworkers in calculating Born radii39, following the same functional form but utilizing a new fitting procedure and more extensive training set data.

The AMBER protein force fields have evolved largely independently of their implicit solvation model counterparts. The first AMBER force field, ff94, was developed by performing detailed quantum mechanical calculations of glycine and alanine dipeptides in vacuum44. The result was found to overemphasize helical behavior11,45, and thus the backbone dihedral terms were empirically corrected, yielding the ff96 force field46. Subsequently, many reparametrizations of the AMBER family of force fields have been performed8,17,28,31–34,47, though we do not present an exhaustive list here. The main focus until recently has been to perform detailed quantum mechanical calculations and subsequently modify backbone torsions in order to balance helix and hairpin preference, mostly so that simulations in explicit solvent match experimental measures of distances and secondary structure preference. Such optimization is rare in the context of implicit solvent8,30,48. Later modifications of force fields, such as for ff99SB-ILDN34, include new torsion terms for side chains as well. In the recently developed ff14SBonlysc model31, side chains of all amino acids were completely re-derived. The ff14SB model includes these new parameters, but also includes a backbone correction based on simulations of (Ala)5 in TIP3P49 explicit water. The ff14ipq force field32 presents a new approach in that quantum calculations are no longer performed in the gas phase, but in a reaction field determined by expected configurations of TIP4P-Ew50 water molecules.
In this methodology, all backbone and side chain torsions are determined in conjunction with IPolQ charges51. It should be noted that none of these newer force field developments (ff14SBonlysc, ff14SB, and ff14ipq) make use of experimental data, such as scalar couplings or chemical shift deviations, to directly parametrize the force field, as implemented in other recent developments28,52,53.

Newly released with AMBER1454, the ff14SBonlysc, ff14SB, and ff14ipq force fields have not yet been extensively tested. Here we compare these alongside the much older ff96 force field, which we have found to be particularly successful in folding small peptides when used with implicit solvation models, as described in Shell, Ritterson, and Dill25, and in Lin and Shell42. While a study of ff14SBonlysc was performed in implicit solvent for larger peptides and proteins4, to our knowledge the ff14SB and ff14ipq force fields have only been tested in explicit solvent. Since both force fields were parametrized with explicit solvent, it is especially interesting to track their performance in implicit solvent simulations. Can force fields developed independently of implicit solvation models accurately predict secondary structure? Further, do any of these force fields or implicit solvation models innately bias peptides towards helical or hairpin structures?

Methods

Table 1 shows the 10 peptides, of lengths 11-20 residues, that we use to examine the performance of several force fields and implicit solvent models included in the AMBER14 package. We focus on five helical and five beta hairpin peptides to investigate the balance of secondary structure propensities. Reference structures for 7 of these peptides are obtained from the Protein Data Bank (PDB): 1CB3, 1E0Q, 1GB1, 1HRX, 1L2Y, 1J4M, and 2I9M. The remaining three peptides are 15-β, C-peptide, and EK peptide. The native structure of 15-β was obtained through private communication with the authors of Santiveri et al.55. This peptide was also studied by Kim et al.24, who used replica exchange molecular dynamics to produce a fold similar to the one used here. The C-peptide is a helical fragment of the protein ribonuclease A that exhibits partial helicity at 0°C as studied by Baldwin and colleagues56. The EK peptide was also studied by Baldwin and coworkers57 and is characterized by the repeating AEAAKA sequence that gives rise to helical character. We use the two-repeat sequence in our studies, which is estimated to have 40% helicity at 270 K. Because we focus here on single native structures, and not on full conformational ensembles and populations, for both the C- and EK peptides we use an ideal, energy-minimized helix as the reference structure.

The four studied AMBER force fields are ff14SB, ff14SBonlysc, ff14ipq, and ff96. The ff96 force field is an older parameterization that we have found to perform well for both α and β secondary structures in a range of peptides25. In the present study, we test two implicit solvation models, igb5 and igb8, with each of the force fields. For the igb5 implicit solvent, we include a correction to minimize the over-stabilization of salt bridges that had been found with this model41, denoted igb5* in our study. We do not employ this correction with igb8 because the solvent model already includes modifications to correct for the over-stabilization present in igb5.
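For orientation, the electrostatic part of a Generalized Born model can be sketched with the canonical pairwise functional form of Still and coworkers. The sketch below takes the effective Born radii as given inputs; the way those radii are computed is what distinguishes GB variants such as igb5 and igb8, and it is not reproduced here. The unit-conversion constant, the dielectric values, and the two test geometries are assumptions made for illustration, not settings taken from this study or from AMBER.

```python
import math

COULOMB = 332.06  # kcal*Angstrom/(mol*e^2); approximate conversion factor (assumed)

def gb_polar_energy(coords, charges, born_radii, eps_in=1.0, eps_out=78.5):
    """Generalized Born polar solvation energy (kcal/mol), Still functional form.

    coords: list of (x, y, z) in Angstrom; charges in units of e;
    born_radii: effective Born radii in Angstrom, supplied by the caller.
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # the double sum includes the i == j self terms
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            rr = born_radii[i] * born_radii[j]
            f_gb = math.sqrt(r2 + rr * math.exp(-r2 / (4.0 * rr)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total

# A single +1 ion with a 2 Angstrom Born radius reduces to the Born formula.
ion = gb_polar_energy([(0.0, 0.0, 0.0)], [1.0], [2.0])
# An oppositely charged pair 4 Angstrom apart (hypothetical geometry).
pair = gb_polar_energy([(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)], [1.0, -1.0], [2.0, 2.0])
print(ion, pair)
```

The pair is less favorably solvated than two isolated ions, which is the desolvation penalty that a salt bridge pays on forming. The balance between this penalty and the direct Coulomb attraction depends on the radii assigned to the charged groups.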


N- and C-terminal caps (ACE and NME, respectively) are added to the ends of the peptides, consistent with previous peptide folding studies25,58,59.

| Peptide | Sequence | % Alpha | % Beta | Potential Number of Salt Bridges | Refs |
|---|---|---|---|---|---|
| 15 Beta | SESYINPDGTWTVTE | 0 | 53 | 0 | 24,55 |
| 1CB3 | IDYWLAHKALA | 36 | 0 | 1 | 60–62 |
| 1E0Q | MQIFVKTLDGKTITLEV | 0 | 71 | 2 | 8,63–65 |
| 1GB1 | GEWTYDDATKTFTVTE | 0 | 63 | 1 | 7,8,11,24,40,45,58,66–83 |
| 1HRX | SWTWENGKWTWK | 0 | 67 | 1 | 62,84 |
| 1J4M | RGKWTYNGITYEGR | 0 | 43 | 1 | 85,86 |
| 1L2Y | NLYIQWLKDGGPSSGRPPPS | 35 | 0 | 1 | 8,41,87–92 |
| 2I9M | SAAEAYAKRIAEAMAKG | 71 | 0 | 2 | 24,93 |
| C Peptide | KETAAAKFERQHM | 85 | 0 | 2 | 7,8,45,56,94–98 |
| EK Peptide | AEAAKAAEAAKA | 83 | 0 | 2 | 8,59,99 |

Table 1. Peptides studied in this work. Peptide names or PDB codes and sequences are shown. Percentage alpha and beta refer to the secondary structure composition of each native structure. The maximum possible number of salt bridges based on sequence pairing of acidic and basic residues is also shown for each peptide. The references in the rightmost column refer to a sampling of earlier experimental and computational studies of these systems, and are provided for illustration.
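The salt bridge column of Table 1 can be reproduced with a simple counting rule: pair each acidic residue with at most one basic residue, so the maximum is the smaller of the two counts. The sketch below assumes that acidic means Asp/Glu and basic means Lys/Arg, with histidine and the capped termini excluded. The exact procedure is not spelled out in the text, so this rule is an assumption that happens to match the tabulated values.

```python
ACIDIC = set("DE")  # Asp, Glu
BASIC = set("KR")   # Lys, Arg (His excluded; an assumption of this sketch)

def max_salt_bridges(sequence):
    """Upper bound on simultaneous one-to-one acidic/basic pairings."""
    n_acid = sum(res in ACIDIC for res in sequence)
    n_base = sum(res in BASIC for res in sequence)
    return min(n_acid, n_base)

# Sequences and tabulated values copied from Table 1.
TABLE_1 = {
    "15 Beta": ("SESYINPDGTWTVTE", 0),
    "1CB3": ("IDYWLAHKALA", 1),
    "1E0Q": ("MQIFVKTLDGKTITLEV", 2),
    "1GB1": ("GEWTYDDATKTFTVTE", 1),
    "1HRX": ("SWTWENGKWTWK", 1),
    "1J4M": ("RGKWTYNGITYEGR", 1),
    "1L2Y": ("NLYIQWLKDGGPSSGRPPPS", 1),
    "2I9M": ("SAAEAYAKRIAEAMAKG", 2),
    "C Peptide": ("KETAAAKFERQHM", 2),
    "EK Peptide": ("AEAAKAAEAAKA", 2),
}

for name, (seq, tabulated) in TABLE_1.items():
    print(name, max_salt_bridges(seq), tabulated)
```

Only 15 Beta, which has no Lys or Arg, gives zero under this rule.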

We conduct simulations using replica exchange molecular dynamics (REMD) to accelerate sampling, which swaps replicas across a range of temperatures and in effect allows the heating and cooling of conformations while preserving Boltzmann ensembles100. Our simulations use 8-11 replicas per peptide with temperatures between 270 and 400 K, with 5 swap attempts between neighboring replicas per 20 ps simulation cycle. These settings are similar to those used in studies by Chodera and colleagues101 and result in ~40-50% acceptance of swaps for all replicas. We use the SANDER program within AMBER14 to generate peptide trajectories and an in-house Python wrapper to conduct replica swaps and gather data42,43. We run the peptides for 60 ns, allowing them to come to equilibrium during the first 50 ns42, using the last 10 ns for analysis. We perform two REMD simulations for each force field / solvent / peptide case: one with all replicas starting from the native structure and the other initiated from a fully extended structure (all backbone dihedrals set to 180°). This allows a consistency check to validate the convergence of simulations, as the initial structure should not bias the final equilibrium ensemble.

We characterize the force fields using multiple metrics computed from the production period of the simulation trajectories: (1) Dominant configurations are identified using an agglomerative, hierarchical k-means-like clustering algorithm based on root-mean-square deviation (RMSD).25 The percentage of time that each peptide spends in its dominant configuration (that is, the top cluster population) is an important metric of fold stability. (2) We determine the RMSD from the native structure for both the top cluster and the average over all frames from the 10 ns production stage. (3) The average percentage of alpha and beta character for each peptide during the production period is found using the DSSP secondary structure annotation software by Kabsch et al.102 to assign alpha, beta, or random structure states to each residue at each trajectory frame. (4) Finally, we consider the average number of salt bridges formed by the peptides during production. Based on sequence alone, nine of the ten peptides are capable of forming salt bridges in principle.
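The replica swap step described above can be sketched with the standard Metropolis criterion for temperature replica exchange, which preserves the Boltzmann ensemble at each temperature. This is a generic illustration, not the in-house wrapper used here. The geometric spacing of the temperature ladder is an assumption, since the text gives only the range and the number of replicas, and the function names and example energies are hypothetical.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def geometric_ladder(t_min, t_max, n):
    """n temperatures from t_min to t_max with a constant ratio (assumed spacing)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]

def swap_probability(e_i, t_i, e_j, t_j):
    """Acceptance probability min(1, exp[(beta_i - beta_j) * (E_i - E_j)])."""
    delta = (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_swap(e_i, t_i, e_j, t_j, rng=random):
    """Return True if the two neighboring replicas should exchange configurations."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)

ladder = geometric_ladder(270.0, 400.0, 8)
print([round(t, 1) for t in ladder])
# Potential energies in kcal/mol (made-up numbers for two neighboring replicas).
print(swap_probability(-100.0, ladder[0], -90.0, ladder[1]))
```

A swap that moves the lower-energy configuration to the colder temperature is always accepted; the reverse is accepted with a probability that falls off with the energy gap and the temperature spacing, which is why closely spaced ladders give higher acceptance rates.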

Results and Discussion

Figure 1. Force field/implicit solvent model combinations vary widely in folding success. The best (right) and worst (left) combinations for each peptide are shown using native and top cluster structures. The native structure is in green, while the simulated one is in blue. The RMSD from the native structure is reported below each case.


Table 2. Summary of simulation results. All quantities are reported for the last 10 ns (production period) for the runs in which peptides were initialized in an extended, unfolded structure. Procedures to calculate each quantity are described in Methods.

| Peptide | Force Field / Solvent | % Top Cluster | Top RMSD (Å) | Avg. RMSD (Å) | % Alpha | % Beta | Avg. Salt Bridges |
|---|---|---|---|---|---|---|---|
| 15 Beta | ff14SB/igb5* | 26 | 6.5 | 6.1 | 37 | 1 | 0 |
| 15 Beta | ff14SB/igb8 | 83 | 6.6 | 6.7 | 53 | 0 | 0 |
| 15 Beta | ff14SBonlysc/igb5* | 70 | 1.1 | 2.4 | 1 | 37 | 0 |
| 15 Beta | ff14SBonlysc/igb8 | 52 | 6.9 | 5.6 | 17 | 8 | 0 |
| 15 Beta | ff14ipq/igb5* | 47 | 1 | 3 | 0 | 29 | 0 |
| 15 Beta | ff14ipq/igb8 | 37 | 0.9 | 2.5 | 0 | 47 | 0 |
| 15 Beta | ff96/igb5* | 55 | 1.1 | 2.7 | 0 | 42 | 0 |
| 15 Beta | ff96/igb8 | 29 | 8.1 | 4.7 | 12 | 33 | 0 |
| 1CB3 | ff14SB/igb5* | 76 | 2.6 | 2.5 | 58 | 0 | 0 |
| 1CB3 | ff14SB/igb8 | 67 | 4.1 | 3 | 48 | 0 | 0 |
| 1CB3 | ff14SBonlysc/igb5* | 36 | 3.3 | 2.9 | 29 | 0 | 0 |
| 1CB3 | ff14SBonlysc/igb8 | 70 | 4.1 | 3 | 42 | 0 | 0 |
| 1CB3 | ff14ipq/igb5* | 43 | 3.4 | 3 | 36 | 0 | 0.5 |
| 1CB3 | ff14ipq/igb8 | 47 | 2.6 | 2.9 | 51 | 0 | 0.1 |
| 1CB3 | ff96/igb5* | 84 | 2.7 | 2.6 | 73 | 1 | 0 |
| 1CB3 | ff96/igb8 | 64 | 5.7 | 4.1 | 16 | 37 | 0 |
| 1E0Q | ff14SB/igb5* | 43 | 6.6 | 7.1 | 60 | 1 | 0.2 |
| 1E0Q | ff14SB/igb8 | 63 | 9.7 | 8.2 | 72 | 0 | 0.2 |
| 1E0Q | ff14SBonlysc/igb5* | 19 | 6.4 | 6.1 | 19 | 4 | 0.4 |
| 1E0Q | ff14SBonlysc/igb8 | 41 | 5.8 | 5.1 | 25 | 13 | 0.4 |
| 1E0Q | ff14ipq/igb5* | 84 | 3 | 3 | 0 | 51 | 0.5 |
| 1E0Q | ff14ipq/igb8 | 59 | 2.2 | 2.4 | 0 | 59 | 0.7 |
| 1E0Q | ff96/igb5* | 59 | 2 | 2.5 | 2 | 54 | 0.2 |
| 1E0Q | ff96/igb8 | 72 | 3.7 | 3.2 | 0 | 57 | 0.5 |
| 1GB1 | ff14SB/igb5* | 65 | 5.7 | 5.9 | 58 | 0 | 0.4 |
| 1GB1 | ff14SB/igb8 | 55 | 6.4 | 6.4 | 60 | 0 | 0.6 |
| 1GB1 | ff14SBonlysc/igb5* | 61 | 5.5 | 5.3 | 36 | 2 | 0.2 |
| 1GB1 | ff14SBonlysc/igb8 | 42 | 6.4 | 6.1 | 47 | 0 | 0.7 |
| 1GB1 | ff14ipq/igb5* | 56 | 6.3 | 5.6 | 33 | 4 | 0.1 |
| 1GB1 | ff14ipq/igb8 | 77 | 6.7 | 5.9 | 44 | 2 | 0.7 |
| 1GB1 | ff96/igb5* | 40 | 6.8 | 6 | 56 | 1 | 0.8 |
| 1GB1 | ff96/igb8 | 69 | 1.9 | 2.1 | 1 | 62 | 0.2 |
| 1HRX | ff14SB/igb5* | 21 | 6.9 | 4.7 | 33 | 0 | 0.4 |
| 1HRX | ff14SB/igb8 | 49 | 6.6 | 5 | 28 | 1 | 0.6 |
| 1HRX | ff14SBonlysc/igb5* | 54 | 4.4 | 3.6 | 12 | 7 | 0 |
| 1HRX | ff14SBonlysc/igb8 | 40 | 0.6 | 3 | 6 | 31 | 0.8 |
| 1HRX | ff14ipq/igb5* | 46 | 1.7 | 2.8 | 2 | 17 | 0.1 |
| 1HRX | ff14ipq/igb8 | 54 | 3.6 | 2.9 | 0 | 23 | 0.7 |
| 1HRX | ff96/igb5* | 61 | 0.8 | 2 | 1 | 41 | 0.3 |
| 1HRX | ff96/igb8 | 51 | 3.4 | 3 | 0 | 51 | 0.3 |
| 1J4M | ff14SB/igb5* | 29 | 5.7 | 4.7 | 25 | 1 | 1 |
| 1J4M | ff14SB/igb8 | 47 | 5.2 | 4.8 | 41 | 0 | 0.5 |
| 1J4M | ff14SBonlysc/igb5* | 50 | 4.1 | 4.1 | 4 | 2 | 1.6 |
| 1J4M | ff14SBonlysc/igb8 | 70 | 3.3 | 3.4 | 7 | 17 | 0.5 |
| 1J4M | ff14ipq/igb5* | 48 | 4.3 | 3.6 | 3 | 8 | 2.2 |
| 1J4M | ff14ipq/igb8 | 54 | 3.3 | 3.2 | 0 | 38 | 0.8 |
| 1J4M | ff96/igb5* | 37 | 1.7 | 3 | 1 | 40 | 1.2 |
| 1J4M | ff96/igb8 | 58 | 1.6 | 2.5 | 0 | 55 | 0.9 |
| 1L2Y | ff14SB/igb5* | 55 | 2.2 | 1.6 | 36 | 0 | 0.5 |
| 1L2Y | ff14SB/igb8 | 82 | 0.7 | 1.2 | 37 | 0 | 0.8 |
| 1L2Y | ff14SBonlysc/igb5* | 46 | 0.7 | 1.9 | 30 | 0 | 1 |
| 1L2Y | ff14SBonlysc/igb8 | 80 | 0.8 | 1.4 | 32 | 0 | 0.9 |
| 1L2Y | ff14ipq/igb5* | 33 | 2.6 | 3.2 | 25 | 2 | 1.2 |
| 1L2Y | ff14ipq/igb8 | 34 | 2.9 | 3.2 | 32 | 0 | 0.6 |
| 1L2Y | ff96/igb5* | 47 | 1 | 3 | 38 | 1 | 0.6 |
| 1L2Y | ff96/igb8 | 45 | 4.2 | 4.9 | 30 | 4 | 0 |
| 2I9M | ff14SB/igb5* | 66 | 1.6 | 2.2 | 68 | 0 | 0.7 |
| 2I9M | ff14SB/igb8 | 70 | 5.9 | 5.1 | 58 | 0 | 1.2 |
| 2I9M | ff14SBonlysc/igb5* | 60 | 7.4 | 5.5 | 22 | 0 | 1 |
| 2I9M | ff14SBonlysc/igb8 | 84 | 6.1 | 5.5 | 46 | 1 | 1.1 |
| 2I9M | ff14ipq/igb5* | 28 | 4.1 | 5 | 43 | 2 | 1.9 |
| 2I9M | ff14ipq/igb8 | 44 | 6.9 | 6 | 29 | 2 | 1.6 |
| 2I9M | ff96/igb5* | 73 | 1.8 | 1.7 | 78 | 0 | 0.9 |
| 2I9M | ff96/igb8 | 44 | 8.4 | 8.5 | 0 | 47 | 1.8 |
| C Peptide | ff14SB/igb5* | 32 | 4.2 | 3.9 | 18 | 0 | 0.8 |
| C Peptide | ff14SB/igb8 | 45 | 4.8 | 3.8 | 30 | 0 | 0.2 |
| C Peptide | ff14SBonlysc/igb5* | 48 | 3.6 | 3.9 | 30 | 0 | 1.3 |
| C Peptide | ff14SBonlysc/igb8 | 32 | 4.9 | 4.8 | 11 | 11 | 0.5 |
| C Peptide | ff14ipq/igb5* | 25 | 4.7 | 4.4 | 7 | 1 | 2.2 |
| C Peptide | ff14ipq/igb8 | 38 | 7.4 | 5.5 | 9 | 10 | 0.9 |
| C Peptide | ff96/igb5* | 44 | 5 | 3.7 | 58 | 1 | 0.9 |
| C Peptide | ff96/igb8 | 72 | 7.1 | 7.3 | 0 | 43 | 1.7 |
| EK Peptide | ff14SB/igb5* | 67 | 0.6 | 1.9 | 52 | 0 | 1.3 |
| EK Peptide | ff14SB/igb8 | 64 | 0.7 | 2.1 | 47 | 0 | 1 |
| EK Peptide | ff14SBonlysc/igb5* | 33 | 3.2 | 3 | 27 | 0 | 1 |
| EK Peptide | ff14SBonlysc/igb8 | 32 | 0.8 | 3.2 | 26 | 0 | 0.8 |
| EK Peptide | ff14ipq/igb5* | 52 | 0.6 | 2.9 | 31 | 0 | 0.4 |
| EK Peptide | ff14ipq/igb8 | 19 | 3.7 | 4 | 14 | 1 | 0.9 |
| EK Peptide | ff96/igb5* | 78 | 1 | 1.9 | 72 | 1 | 0.4 |
| EK Peptide | ff96/igb8 | 46 | 0.9 | 4.1 | 38 | 26 | 0.6 |
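One informal way to read Table 2 for additivity is to split a force field by solvent table into a grand mean, a force field effect, a solvent effect, and a residual interaction term; if the two contributions were strictly additive, the residuals would vanish. The sketch below applies this decomposition to the top-cluster RMSD values for 2I9M copied from Table 2. It is an illustration for a single peptide and a single run with no significance test, and it is not the statistical analysis performed in this work.

```python
# Top-cluster RMSD (Angstrom) for 2I9M, copied from Table 2 (extended starts).
TOP_RMSD_2I9M = {
    "ff14SB": {"igb5*": 1.6, "igb8": 5.9},
    "ff14SBonlysc": {"igb5*": 7.4, "igb8": 6.1},
    "ff14ipq": {"igb5*": 4.1, "igb8": 6.9},
    "ff96": {"igb5*": 1.8, "igb8": 8.4},
}

def additive_decomposition(table):
    """Split table[row][col] into grand mean + row effect + column effect + interaction."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    grand = sum(table[r][c] for r in rows for c in cols) / (len(rows) * len(cols))
    row_eff = {r: sum(table[r][c] for c in cols) / len(cols) - grand for r in rows}
    col_eff = {c: sum(table[r][c] for r in rows) / len(rows) - grand for c in cols}
    interaction = {
        r: {c: table[r][c] - (grand + row_eff[r] + col_eff[c]) for c in cols}
        for r in rows
    }
    return grand, row_eff, col_eff, interaction

grand, ff_eff, solvent_eff, interaction = additive_decomposition(TOP_RMSD_2I9M)
print("solvent effects:", solvent_eff)
for ff, terms in interaction.items():
    print(ff, {solvent: round(value, 2) for solvent, value in terms.items()})
```

Residuals comparable in size to the main effects indicate that the solvent model shifts the RMSD by different amounts depending on the force field. The same decomposition could be applied to class averages over the helical or hairpin peptides.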

Figure 1 shows the best and worst force field/implicit solvent combinations for each of the test peptides, as determined by top-cluster RMSD from the native structure. It is clear that the success of the folding simulations varies widely, as nearly all force field/implicit solvent combinations appear as both best and worst cases for different peptides. Clearly, no single combination is perfectly transferable across all systems studied. This emphasizes the need to examine a relatively large sample of peptides containing a variety of secondary structures, as we do here, to assess the strengths and weaknesses of each force field. Table 2 lists the results for each peptide and force field/implicit solvent combination starting from extended structure. Figures 2-6 show graphically much of this information. As can be seen in Figure 2, the most consistent model combination for all peptides is ff96/igb5*. The error bars in Figure 2 indicate the standard deviation of the performance for each force field over the different peptides, but extreme RMSD values above 9 Å and below 1 Å are also observed in several cases (such as with the best and worst peptides described above). Among both helical and hairpin peptides, the ff96/ig5* model generates one of the lowest average RMSDs. Another combination that yields a low RMSDs for helical peptides is ff14SB/igb5*. Hairpin peptides, on the other hand, are well approximated by ff14ipq/igb8, as well as ff96/igb5*. Moreover, the two models that perform particularly poorly for helical peptides, ff96/igb8 and ff14ipq/igb8, do perform well for hairpin peptides. It is notable that there are not clear differences between the effectiveness of the implicit solvent models, except for the particular force fields ff14ipq and ff96. In helical peptides, igb8 performs worse than igb5* with both of these force fields, whereas it performs equivalently to or better than igb5* in hairpin peptides. 
In general, most peptides perform well with some force field/implicit solvent combinations but poorly with others. However, 1GB1 and C-peptide are particularly hard to model and yield high RMSDs for nearly all of the models. On the other hand, 1L2Y and EK


peptide are very well modeled by most force field combinations. When comparing performance based on RMSD values, it is important to keep in mind that, while a low RMSD is indicative of a well-performing model, variation among low-RMSD simulations is more informative than among high-RMSD simulations. High RMSDs indicate a poor model, but differences between models with high RMSDs are not as meaningful since the simulated structure already differs sufficiently from the known one that it could represent a wide range of misfolds. Difficulties with the C-peptide may not be too surprising, as the stability of its alpha-helical structure was found to strongly depend on pH in experiment56, and the level of dielectric screening in simulation7,45. Part of this helical stability may depend on the relative populations of native and non-native salt bridges7,56,94, which is a particularly difficult interaction to balance with implicit solvation models25,41.

[Figure 2 plot. Panels: helical peptides; hairpin peptides. Vertical axis: Average RMSD (Å). Horizontal axis: Force Field (ff14SB, ff14SBonlysc, ff14ipq, ff96), each paired with igb5* and igb8. Legend: Native (Top Cluster); Native (Average); Extended (Top Cluster); Extended (Average).]

Figure 2. Average RMSD for force field/implicit solvent combinations from native (blue) and extended (red) structures. The top cluster conformation and the averaged RMSD are reported over the 10 ns analysis time.


Though the 1GB1 peptide has been thoroughly studied, its structure in solution has not been conclusively characterized. Experimental measurements of hairpin population at low temperatures have varied between 40% (ref 103) and 80% (ref 79). In simulations, a variety of structures have been observed, with salt bridges stabilizing or destabilizing the native hairpin7,11,40, and even the presence of transiently stable helical states2,58,77. Even comparisons of our results for ff96/igb5* with past studies employing igb5 are difficult. In our studies, 1GB1 is predicted to be predominantly helical by ff96/igb5*, whereas in previous studies by Shell et al.25 it was found that ff96 combined with igb5 correctly predicted a hairpin structure. However, in that study all structures started from the native configuration and only 10 ns of REMD were performed. Interestingly, similar results were obtained by Zhou11, and more recently Shao et al.104, who used an effective salt concentration of 0.2 M, which may reduce salt bridge propensity similarly to the igb5* correction. The very recent work of Maffucci and Contini18 found that the ff96/igb5 combination predicts helical structure, similarly to our studies here. While this variety of results suggests that 1GB1 is a difficult test system for studies of force fields, it further emphasizes the need to test a wide variety of peptide systems.

[Figure 3 plot. Panels: 2I9M (helical); 1E0Q (beta sheet). Vertical axis: Average RMSD (Å). Horizontal axis: Time (ns), 0 to 60. Legend: Best, Native; Worst, Native; Best, Extended; Worst, Extended.]

Figure 3. RMSD versus time for selected peptides. Illustrative best and worst results are shown for a helical and hairpin case, starting from native and extended structure.


Typically, simulations employing the same force field/solvent combination that begin from the extended and native structures perform similarly for each peptide. Wherever the RMSDs of runs starting from native and extended configurations disagree, the results may be poorly converged. Figure 3 illustrates how RMSD changes over time for simulations starting from native and extended structures for the best and worst performing force fields for two representative peptides, the helical 2I9M and hairpin 1E0Q. For both peptides, the best force field/implicit solvent combination results in stable native structures, while the worst combination clearly drives the peptide away from native conformations. It is also clear that convergence is fairly rapid (about 20 ns), which is typical of most simulations in this study. This convergence time matches previous implicit solvent simulations for peptides similar to those studied here42. A notable exception is 1GB1 with ff96/igb8. While the extended simulation appears to quickly converge to a low RMSD around 2 Å, the simulation starting from a native structure exhibits a rapid jump from around 2 Å to 7 Å around 18 ns into the trajectory at 270 K. The high-RMSD configuration then remains stable as a helix until the end of the simulation. Interestingly, this may demonstrate that ff96/igb8 predicts both hairpin and helical states to be close in stability at 270 K, separated by a large intermediate barrier with little swapping between them, as also observed for ff94 in explicit solvent58. While simulations of ff96/igb5* do not exhibit such behavior, it is possible that a similar scenario is present for this combination, which would explain the wide diversity of results described above.
This may generally indicate that, at least for the 1GB1 peptide, ff96 overstabilizes both helix and hairpin secondary structures relative to disordered states, even if the balance between these two common secondary structures seems reasonable. A rigorous test of this hypothesis would require converged relative populations of structures, which likely necessitates simulation lengths beyond the scope of this study. Examining the native and extended top cluster RMSD against each other for each force field/implicit solvent combination (Figure S1) shows relatively good agreement and further demonstrates convergence for most simulations. Occasional discrepancies may be attributed to the relatively small 10 ns analysis window. Over such a short period, fluctuations away from the most stable state can statistically bias the top cluster observed or average RMSD non-negligibly. In fact, Best and Mittal29 found that it can take 200 ns of REMD for the correct populations of configurations to converge. Assuming that most of these fluctuations are due to swaps with


higher temperature replicas, we can partially assess this sampling bias by including data from all replicas in the clustering and average RMSD analysis, using a reweighting strategy that calculates the contribution each higher temperature replica makes at the lowest temperature of interest. We find that including appropriately re-weighted configurations from all replicas does not significantly change the cluster percentages or average RMSD for any simulation, suggesting that differences are predominantly attributable to thermal fluctuations and that longer analysis periods are necessary to converge populations quantitatively.

[Figure 4 plot. Panels: helical peptides; hairpin peptides. Vertical axis: Avg. sec. structure %. Horizontal axis: Force Field (ff14SB, ff14SBonlysc, ff14ipq, ff96), each paired with igb5* and igb8. Legend: Alpha Helix; Beta Sheet.]

Figure 4. Average secondary structure content for each force field/implicit solvent combination, broken down by helical and hairpin peptides. Averaging across the native structures of peptides, the native helix content for alpha peptides is 62% and beta sheet content for hairpin peptides is 59%, as calculated from Table 1 (shown as dashed lines).
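The reweighting of higher-temperature replica data to the lowest temperature, mentioned in the convergence discussion above, can be illustrated with a minimal sketch. The text does not specify the estimator that was used, so the code below assumes the simplest option, single-replica Boltzmann reweighting, in which a frame with potential energy U sampled at temperature T_s receives a weight proportional to exp[-(1/kT_t - 1/kT_s) U] at the target temperature T_t. The function names, energies, and RMSD values are hypothetical and serve only as an illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def reweight_to_target(energies, temp_sampled, temp_target):
    """Normalized weights for frames sampled at temp_sampled (K),
    reweighted to temp_target (K). Assumes simple Boltzmann reweighting,
    which is an illustrative choice and not necessarily the paper's method."""
    beta_s = 1.0 / (K_B * temp_sampled)
    beta_t = 1.0 / (K_B * temp_target)
    log_w = [-(beta_t - beta_s) * u for u in energies]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def weighted_mean(values, weights):
    """Weighted average of an observable such as RMSD."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical frames from a 300 K replica: potential energies (kcal/mol)
# and RMSDs from native (Å), reweighted to 270 K.
energies = [-105.0, -100.0, -95.0]
rmsds = [1.5, 3.0, 6.0]
weights = reweight_to_target(energies, temp_sampled=300.0, temp_target=270.0)
print(weights, weighted_mean(rmsds, weights))
```

Combining several replicas in one estimate would normally call for a multi-histogram approach (for example WHAM or MBAR) rather than this per-replica weighting; the sketch only shows why low-energy frames from warmer replicas dominate the contribution at the colder target temperature.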

Helical peptides have lower RMSDs than hairpin peptides for the majority of the force field/solvent combinations tested (Figure 2). This may hint at a helical bias for many of these force fields (except for the ff96/igb8 combination, which seems to favor hairpins instead), and is further explored in Figure 4. ff14SB (and to a lesser extent ff14SBonlysc) with either implicit solvent model emphasizes helix formation regardless of the secondary structure of the native peptide. ff96/igb5* and ff14ipq/igb8 are similar in that they are both able to distinguish between


helical and hairpin peptides and perform differently in each case. ff96/igb5* strongly encourages helix formation in alpha peptides and is able to make the switch to mostly beta sheet content in hairpin peptides, although it performs better with helices. ff14ipq/igb8 follows an opposite trend in that it does better with hairpins than with helices but is still able to produce a higher percentage of the appropriate secondary structure. All secondary structure percentages are lower than the native averages for each peptide type (Table 1). This is expected if the native state is not the only sampled configuration, and would be more suitably compared to experimental circular dichroism spectroscopy or J-coupling data, which is not available for all peptides that we study.

[Figure 5 plot. Panels: helical peptides; hairpin peptides. Vertical axis: Avg. percent top cluster. Horizontal axis: Force Field (ff14SB, ff14SBonlysc, ff14ipq, ff96), each paired with igb5* and igb8. Legend: Native; Extended.]

Figure 5. Average population of the dominant configurational cluster. Results from the native and extended structures are shown for helical and hairpin peptides.

We also calculate the average percentage of time that each force field/implicit solvent combination spends in the top cluster, displayed in Figure 5. High top cluster percentages seem to reflect low RMSDs for helical peptides, but less so for hairpin peptides. For example, with helical peptides, ff96/igb5*, ff14SB/igb5*, and ff14SB/igb8 all have high average dominant cluster percentages and result in low RMSDs. These three force field/implicit solvent combinations are the only cases with average top cluster percentages over 55% for helical


peptides. They display a combined average RMSD of 2.8 Å from native structure, which is lower than any of the average top cluster RMSDs for models with dominant cluster populations below 55%. These trends are visualized in Figure S2. The best models for hairpin peptides, ff96/igb5* and ff14ipq/igb8, do not appear to have higher top cluster percentages than poor-performing models in hairpin peptides. This may further suggest helical bias, but might also accurately reflect generally lower stability of hairpin peptides or structures (which have fewer intramolecular backbone hydrogen bonds than comparable-length helices). When top cluster RMSD is examined against top cluster percentage for hairpin peptides, no notable trends emerge. Both low and high RMSDs are observed with high top cluster percentages across all force field/implicit solvent combinations, indicating that dominant cluster percentages cannot be used to predict the effectiveness of these models in hairpin peptides. For lower-populated clusters, there is great variability in the RMSD values observed across peptides. In many cases, peptides with well-performing force field/implicit solvent combinations show the same secondary structure as the native peptide in the majority of nondominant cluster structures. However, in these cases the variability in RMSD from native structure is wide, and occupying the same secondary structure does not necessarily indicate that these lower clusters are good approximations to native structure. In addition, within well-performing combinations, we frequently see lower clusters that have secondary structure characteristics opposite of native, although this is usually seen at very low cluster percentages. For example, starting from native structure with ff96/igb8, 1J4M has a top cluster with a low RMSD, but three lower cluster structures show non-hairpin character, with populations all less than 1%.
The opposite trend is often seen using poor force field/implicit solvent combinations that yield a dominant cluster with a high RMSD. These cases frequently have the majority of lower-populated clusters far from the native structure, but often have a small number of non-dominant clusters with the appropriate secondary structure. Runs from extended structure of 1GB1 are good examples of this across all force fields except ff96/igb8, which performs well in this case. In 1GB1's poor-performing force field/implicit solvent combinations, nearly all clusters have high RMSDs and only two clusters appear with an RMSD from native structure below 3.0 Å throughout these seven models.
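The cluster bookkeeping discussed above (dominant cluster percentage, and which lower-populated clusters lie near the native structure) can be sketched as follows. This is a minimal illustration, assuming that each analyzed frame has already been assigned a cluster label and that an RMSD from native is available for each cluster representative; the function names and example data are hypothetical, and the 3.0 Å threshold is borrowed from the 1GB1 discussion purely for illustration.

```python
from collections import Counter


def cluster_populations(labels):
    """Return (cluster_id, percent) pairs, most populated cluster first."""
    counts = Counter(labels)
    n_frames = len(labels)
    return [(cid, 100.0 * c / n_frames) for cid, c in counts.most_common()]


def summarize_clusters(labels, rep_rmsd, native_cutoff=3.0):
    """Summarize the dominant cluster and list clusters whose representative
    RMSD from native (Å) falls below native_cutoff (illustrative threshold)."""
    pops = cluster_populations(labels)
    top_id, top_pct = pops[0]
    near_native = [cid for cid, _ in pops if rep_rmsd[cid] < native_cutoff]
    return {
        "top_cluster": top_id,
        "top_percent": top_pct,
        "top_rmsd": rep_rmsd[top_id],
        "near_native_clusters": near_native,
    }


# Hypothetical frame-to-cluster assignments over an analysis window,
# with hypothetical representative RMSDs (Å) for each cluster.
labels = [0] * 6 + [1] * 3 + [2] * 1
rep_rmsd = {0: 1.2, 1: 4.8, 2: 2.5}
print(summarize_clusters(labels, rep_rmsd))
```

A summary of this kind makes it easy to see the two situations described in the text: a low-RMSD dominant cluster accompanied by sparsely populated non-native clusters, or a high-RMSD dominant cluster accompanied by a few near-native minor clusters.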


It is also possible for force field/implicit solvent combinations to produce a wide variety in RMSDs of the clusters to native structure, which is typical of a less-stable folded structure. For example, 1HRX (starting from extended) has clusters with RMSDs of 0.6-5.5 Å with ff14SBonlysc/igb8, and 2.52-8.94 Å with ff96/igb8. Still other cases generate clusters with a particularly narrow range of RMSDs. In general, this indicates a very stable conformation, which can be either close to or far from the native structure. The clusters of 1CB3, when simulated from both native and extended structure, fall within a notably narrow range of RMSDs across all force field/implicit solvent combinations. The vast majority of the cluster RMSDs are within 2-5 Å, with no clusters exceeding 5 Å. The only exceptions to this are three low-populated clusters (less than or equal to 1%) with RMSDs from native structure between 1.5 and 1.9 Å. The discrepancy from the native structure here is likely due to all models over-predicting the stability of the native helix (also see SI). In experiment, the helix is only found to form within the central residues, with the ends remaining unstructured61.

[Figure 6 plot. Vertical axis: Avg. no. of salt bridges, 0.0 to 1.0. Horizontal axis: Force Field (ff14SB, ff14SBonlysc, ff14ipq, ff96), each paired with igb5* and igb8. Legend: Native; Extended.]

Figure 6. Average number of salt bridges formed with each force field/implicit solvent combination. Results are shown from native (blue) and extended (red) structures. The maximum number of salt bridges that may simultaneously form for each peptide is given in Table 1; it ranges from 0 to 2, with an average over all peptides of 1.3.
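As a rough illustration of how an average salt bridge count such as the one in Figure 6 might be computed, the sketch below counts acidic/basic residue pairs whose side-chain oxygen and nitrogen atoms come within a distance cutoff in each frame. The criterion used in the study is not stated in this section, so the 4.0 Å cutoff, the function names, and the coordinates are all assumptions made only for the example.

```python
import math


def count_salt_bridges(acidic_groups, basic_groups, cutoff=4.0):
    """Count acid/base residue pairs with any O...N distance below cutoff (Å).

    acidic_groups: one list of carboxylate O coordinates per acidic residue.
    basic_groups: one list of side-chain N coordinates per basic residue.
    The 4.0 Å default is an assumed, commonly used value.
    """
    n_bridges = 0
    for o_atoms in acidic_groups:
        for n_atoms in basic_groups:
            if any(math.dist(o, n) < cutoff for o in o_atoms for n in n_atoms):
                n_bridges += 1
    return n_bridges


def average_salt_bridges(frames, cutoff=4.0):
    """Average the per-frame count over (acidic_groups, basic_groups) frames."""
    return sum(count_salt_bridges(a, b, cutoff) for a, b in frames) / len(frames)


# Two hypothetical frames with one acidic and one basic residue each.
frame_bridged = ([[(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]], [[(3.0, 0.0, 0.0)]])
frame_open = ([[(0.0, 0.0, 0.0), (2.2, 0.0, 0.0)]], [[(10.0, 0.0, 0.0)]])
print(average_salt_bridges([frame_bridged, frame_open]))
```

Note that this pairwise count lets one residue participate in several bridges at once, so it can exceed the "simultaneous" maximum listed in Table 1 unless pairs are additionally restricted.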

As mentioned earlier, accurate prediction of salt bridge propensities can be key to determination of peptide native structures, and has previously been a focal point during the refinement of implicit solvent models30,41. The potential (theoretical) number of salt bridges that can be formed simultaneously in each peptide (Table 1) ranges from 0 to 2. However, the average number of salt bridges actually formed is not strictly proportional to the potential


number. On average, our results show that alpha peptides form more salt bridges than beta ones, as expected due to the higher potential for salt bridge formation in the helical structure. Every force field/implicit solvent combination generates a greater average number of salt bridges in helical than in hairpin peptides when simulated from an extended structure, particularly ff14ipq/igb5*, ff96/igb8, and ff14SBonlysc/igb5* (shown in Figure S3). Most models show approximately the same average number of salt bridges, with the exceptions of ff14ipq/igb5* and ff14ipq/igb8, which emphasize salt bridge formation slightly more (Figure 6). These results are perhaps not surprising, as the developers of ff14ipq noted an overstabilization of salt bridges in the GB1 peptide, as well as enhanced affinity of amine nitrogens of lysine residues in the K19 peptide for the backbone32. This is attributed to a reduction of polar atom intrinsic radii when these atoms interact with explicit water molecules. While this achieves correct solvation free energies for amino acids, it seems to result in an imbalance of solute-solute and solute-solvent charged interactions. Since in all other force fields, both implicit solvent models seem to result in similar numbers of salt bridges, our studies lend support to the idea that the charges used in ff14ipq overestimate the strength of electrostatic interactions. The increased salt bridge formation in ff14ipq/igb5* is seen significantly more in helical peptides than in hairpin peptides (Figure S3). In fact, the average number of salt bridges formed in simulations from extended structure for helical peptides is over double that of hairpin peptides with this model. Since the helical peptides studied here are anticipated to have more salt bridges than the hairpin peptides, this might seem to suggest that the correct salt bridges are predicted by ff14ipq/igb5*, but are significantly overstabilized.
However, from Figure 1 and the Supporting Information, it is apparent that helical structures are poorly represented by this combination, indicating that non-native salt bridges likely formed. To more rigorously pin down any biases of force fields or solvent models, we perform an ANOVA for each of percentage helix, percentage beta, and number of salt bridges treated as dependent variables, with the force field, implicit solvent model, and peptide type as the independent variables (see SI). Unfortunately, in all cases, low p-values for cross correlations indicate that the effects of all independent variables are strongly coupled in a non-linear fashion. Thus we cannot confidently deconvolute the effects of force field, solvent, or peptide type to observe the independent effect of each. This is not to say that previously mentioned trends for specific force field and implicit solvent combinations are not significant. For instance, it is fairly clear that igb8


biases ff14ipq and ff96 towards hairpin conformations; however, it does not consistently bias all force fields towards hairpins. In such a case, the ANOVA would not be able to fully demonstrate independence of the force field from the implicit solvent model, as our results show. This gives further support to the co-development and parametrization of force fields and implicit solvent models, which seem to exhibit strong non-additivities even when treated as separate entities for the purposes of model development.
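To make the notion of non-additivity concrete, the sketch below computes the sums of squares of a balanced two-way ANOVA with an interaction term. It is a simplified, hypothetical illustration: the analysis described in the text uses three factors (force field, implicit solvent model, and peptide type) and is reported in the SI, whereas this example uses only two factors, made-up helix percentages, and placeholder level names ("ffA", "gbX", and so on). A large interaction F statistic relative to the main effects is the signature of effects that cannot be separated into independent force field and solvent contributions.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction.

    cells maps (level_a, level_b) -> list of replicate observations,
    with the same number of replicates (at least 2) in every cell.
    Returns sums of squares and the interaction F statistic.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n_rep = len(next(iter(cells.values())))
    grand = mean(x for obs in cells.values() for x in obs)
    mean_a = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    mean_b = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    mean_ab = {key: mean(obs) for key, obs in cells.items()}

    ss_a = n_rep * len(b_levels) * sum((mean_a[a] - grand) ** 2 for a in a_levels)
    ss_b = n_rep * len(a_levels) * sum((mean_b[b] - grand) ** 2 for b in b_levels)
    # Interaction: what remains of each cell mean after removing both main effects.
    ss_ab = n_rep * sum(
        (mean_ab[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_err = sum((x - mean_ab[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_ab = (len(a_levels) - 1) * (len(b_levels) - 1)
    df_err = len(a_levels) * len(b_levels) * (n_rep - 1)
    f_ab = (ss_ab / df_ab) / (ss_err / df_err)
    return {"ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab, "ss_err": ss_err, "f_ab": f_ab}


# Hypothetical percent-helix data in which the solvent effect reverses
# sign between the two force fields (a strongly non-additive case).
coupled = {
    ("ffA", "gbX"): [60.0, 62.0],
    ("ffA", "gbY"): [30.0, 32.0],
    ("ffB", "gbX"): [40.0, 42.0],
    ("ffB", "gbY"): [50.0, 52.0],
}
print(two_way_anova(coupled))
```

Converting the F statistic into a p-value requires the F distribution with (df_ab, df_err) degrees of freedom, which is omitted here to keep the example dependency-free.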

Conclusions

We presented REMD simulations for a variety of short helical or hairpin peptides using combinations of four AMBER force fields and two implicit solvent models. For each peptide, we performed simulations from both native and extended structures and examined the resulting RMSD from native. We find that no combination accurately models all peptides, but in general, ff96/igb5* performs most consistently. Helical peptides are best approximated by ff14SB/igb5* and hairpin peptides generally perform well with ff14ipq/igb8. However, these combinations are not uniformly robust for every simulation and demonstrate variability even among peptides with the same secondary structure. The RMSD comparison between simulations starting from native and extended structure shows that for the majority of the peptide/force field/implicit solvent combinations, convergence occurs rather quickly and is maintained throughout the remainder of the REMD run. When we examine the average secondary structure content for each force field/implicit solvent combination based on peptide secondary structure, we find that all models except ff96/igb8 favor helix formation in helical peptides. Interestingly, most models do not approximate beta-sheet percentages well, and only ff14ipq/igb8, ff96/igb5*, and ff96/igb8 favor hairpin formation over helices. This is consistent with our findings that RMSDs are typically lower for helical peptides than for hairpin peptides. Top cluster percentages vary widely, with some high percentages indicating stability of configurations both far and near to native. Salt bridge formation is similar in all models except ff14ipq, which may over-stabilize them. Overall, the peptide test set that we studied presented a challenging target for any force field and implicit solvent combination.
Because the peptides here are all relatively short (between 11 and 20 residues), the stability of their folds is likely marginal compared to that of larger proteins, such that native structures in the latter are more tolerant to force field errors. The


recent successes of the ff14SBonlysc/igb8 combination in folding a set of larger proteins4 would seem to support this idea. For peptides short enough to exhibit a single secondary structure, small differences between secondary structure free energy basins are critical to observing the correct configuration. Compared to larger proteins, these systems lack tertiary contacts that may further differentiate the thermodynamic stability of various secondary structures. In this study, we did not investigate the use of more computationally expensive implicit solvent models that more accurately approximate the polar30,105,106 or non-polar12,107 contributions to the free energy of solvation. Previously it has been noted that advances in the accuracy of the non-polar solvation term beyond a simple proportionality to SASA may greatly enhance implicit solvation models108. The relative success of explicit solvent simulations suggests that even more accurate implicit solvent models may benefit from parametrization steps aimed at correctly reproducing the configurational ensembles observed in these simulations. However, as noted by Nguyen and coworkers,13 such a fitting procedure would need to carefully consider transferability. By simultaneously or individually parametrizing polar and/or non-polar terms based on peptide configurational ensembles, either might begin to reflect information theoretically belonging to the other. While this might lead to a higher-quality model for the peptide training set used, these changes could be less transferable to other peptide sequences, let alone other types of molecules beyond peptides and proteins. Emerging strategies that depart completely from the Generalized Born/SASA approach may also be useful. 
In particular, bottom-up coarse-graining strategies now provide systematic tools for developing interaction potentials when degrees of freedom are removed and may prove a fruitful new strategy; for example, a recent approach parameterized mean-field multibody implicit solvation potentials from explicit-water simulations that were able to capture cooperativity in hydrophobic interactions.109 At the same time, an easier and more practical approach may be to simply parametrize force field and implicit solvent models simultaneously. Force fields are already parametrized in the context of specific explicit water models, and have variable performance in alternate ones. Small modifications to the force field to optimize for a certain water model can lead to large improvements in accuracy17. Similar approaches to optimize force fields in tandem with implicit solvent models could prove highly beneficial in more accurately reproducing the folded structures of short peptides and the conformational states of proteins.


Acknowledgements

We gratefully acknowledge the National Science Foundation for financial support (Project DMR-1312548). J.I.M. acknowledges support from the NSF Graduate Research Fellowship Program (Grant No. DGE 1144085).

Supporting Information

Supporting Information includes: visual representations of top clusters, associated RMSD values, and secondary structure information for each simulation with a given peptide/force field/implicit solvent combination; figures S1-S3 mentioned in the text; and ANOVA results for percent helix, percent hairpin, and average number of salt bridges, each as a function of force field, implicit solvent model, and type of peptide. This information is available free of charge via the Internet at http://pubs.acs.org


References

(1) Levitt, M.; Warshel, A. Computer Simulation of Protein Folding. Nature 1975, 253 (5494), 694–698.

(2) Zagrovic, B.; Sorin, E. J.; Pande, V. β-Hairpin Folding Simulations in Atomistic Detail Using an Implicit Solvent Model. J. Mol. Biol. 2001, 313 (1), 151–169.

(3) Piana, S.; Lindorff-Larsen, K.; Shaw, D. E. How Robust Are Protein Folding Simulations with Respect to Force Field Parameterization? Biophys. J. 2011, 100 (9), L47–L49.

(4) Nguyen, H.; Maier, J.; Huang, H.; Perrone, V.; Simmerling, C. Folding Simulations for Proteins with Diverse Topologies Are Accessible in Days with a Physics-Based Force Field and Implicit Solvent. J. Am. Chem. Soc. 2014, 136, 13959–13962.

(5) Lindorff-Larsen, K.; Maragakis, P.; Piana, S.; Eastwood, M. P.; Dror, R. O.; Shaw, D. E. Systematic Validation of Protein Force Fields against Experimental Data. PLoS One 2012, 7 (2), e32131.

(6) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How Fast Folding Proteins Fold. Science 2011, 334 (6055), 517–520.

(7) Felts, A. K.; Harano, Y.; Gallicchio, E.; Levy, R. M. Free Energy Surfaces of β-Hairpin and α-Helical Peptides Generated by Replica Exchange Molecular Dynamics with the AGBNP Implicit Solvent Model. Proteins Struct. Funct. Bioinforma. 2004, 56 (2), 310–321.

(8) Kim, E.; Jang, S.; Pak, Y. Consistent Free Energy Landscapes and Thermodynamic Properties of Small Proteins Based on a Single All-Atom Force Field Employing an Implicit Solvation. J. Chem. Phys. 2007, 127 (14), 145104.

(9) Jang, S.; Kim, E.; Pak, Y. Direct Folding Simulation of α-Helices and β-Hairpins Based on a Single All-Atom Force Field with an Implicit Solvation Model. Proteins Struct. Funct. Bioinforma. 2006, 66 (1), 53–60.

(10) Hayre, N. R.; Singh, R. R. P.; Cox, D. L. Evaluating Force Field Accuracy with Long-Time Simulations of a β-Hairpin Tryptophan Zipper Peptide. J. Chem. Phys. 2011, 134 (3), 35103.

(11) Zhou, R. Free Energy Landscape of Protein Folding in Water: Explicit vs. Implicit Solvent. Proteins Struct. Funct. Genet. 2003, 53 (2), 148–161.

(12) Gallicchio, E.; Paris, K.; Levy, R. M. The AGBNP2 Implicit Solvation Model. J. Chem. Theory Comput. 2009, 5 (9), 2544–2564.

(13) Nguyen, H.; Roe, D. R.; Simmerling, C. Improved Generalized Born Solvent Model Parameters for Protein Simulations. J. Chem. Theory Comput. 2013, 9 (4), 2020–2034.

(14) Fennell, C. J.; Kehoe, C.; Dill, K. A. Oil/water Transfer Is Partly Driven by Molecular Shape, Not Just Size. J. Am. Chem. Soc. 2010, 132 (1), 234–240.

(15) Arthur, E. J.; Brooks, C. L. Parallelization and Improvements of the Generalized Born Model with a Simple sWitching Function for Modern Graphics Processors. J. Comput. Chem. 2016, 103695, 927–939.

(16) Chen, W.; Shi, C.; Shen, J. Nascent β-Hairpin Formation of a Natively Unfolded Peptide Reveals the Role of Hydrophobic Contacts. 2015, 109 (August), 630–638.

(17) Best, R. B.; Mittal, J. Protein Simulations with an Optimized Water Model: Cooperative Helix Formation and Temperature-Induced Unfolded State Collapse. J. Phys. Chem. B 2010, 114 (46), 14916–14923.

(18) Maffucci, I.; Contini, A. An Updated Test of AMBER Force Fields and Implicit Solvent Models in Predicting the Secondary Structure of Helical, β-Hairpin, and Intrinsically Disordered Peptides. J. Chem. Theory Comput. 2016, 12 (2), 714–727.

(19) Palazzesi, F.; Prakash, M. K.; Bonomi, M.; Barducci, A. Accuracy of Current All-Atom Force-Fields in Modeling Protein Disordered States. J. Chem. Theory Comput. 2015, 11 (1), 2–7.

(20) Smith, M. D.; Rao, J. S.; Segelken, E.; Cruz, L. Force-Field Induced Bias in the Structure of Aβ 21–30: A Comparison of OPLS, AMBER, CHARMM, and GROMOS Force Fields. J. Chem. Inf. Model. 2015, 55 (12), 2587–2595.

(21) Mittal, J.; Best, R. B. Tackling Force-Field Bias in Protein Folding Simulations: Folding of Villin HP35 and Pin WW Domains in Explicit Water. Biophys. J. 2010, 99 (3), L26–L28.

(22) Best, R. B.; Mittal, J.; Feig, M.; MacKerell, A. D. Inclusion of Many-Body Effects in the Additive CHARMM Protein CMAP Potential Results in Enhanced Cooperativity of α-Helix and β-Hairpin Formation. Biophys. J. 2012, 103 (5), 1045–1051.

(23) Cino, E. A.; Choy, W.; Karttunen, M. Comparison of Secondary Structure Formation Using 10 Different Force Fields in Microsecond Molecular Dynamics Simulations. J. Chem. Theory Comput. 2012, 8 (8), 2725–2740.


(24)

Kim, E.; Jang, S.; Pak, Y. Direct Folding Studies of Various α and β Strands Using Replica Exchange Molecular Dynamics Simulation. J. Chem. Phys. 2008, 128 (17), 175104.

(25)

Shell, M. S.; Ritterson, R.; Dill, K. A. A Test on Peptide Stability of AMBER Force Fields with Implicit Solvation. J. Phys. Chem. B 2008, 112 (22), 6878–6886.

(26)

Matthes, D.; De Groot, B. L. Secondary Structure Propensities in Peptide Folding Simulations: A Systematic Comparison of Molecular Mechanics Interaction Schemes. Biophys. J. 2009, 97 (2), 599–608.

(27)

Best, R. B.; Buchete, N.-V.; Hummer, G. Are Current Molecular Dynamics Force Fields Too Helical? Biophys. J. 2008, 95 (1), L07–L09.

(28)

Best, R. B.; Hummer, G. Optimized Molecular Dynamics Force Fields Applied to the Helix-Coil Transition of Polypeptides. J. Phys. Chem. B 2009, 113 (26), 9004–9015.

(29)

Best, R. B.; Mittal, J. Balance between α and β Structures in Ab Initio Protein Folding. J. Phys. Chem. B 2010, 114 (26), 8790–8798.

(30)

Chen, J.; Im, W.; Brooks, C. L. Balancing Solvation and Intramolecular Interactions: Toward a Consistent Generalized Born Force Field. J. Am. Chem. Soc. 2006, 128 (11), 3728–3736.

(31)

Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J. Chem. Theory Comput. 2015, 11 (8), 3696–3713.

(32)

Cerutti, D. S.; Swope, W. C.; Rice, J. E.; Case, D. A. ff14ipq: A Self-Consistent Force Field for Condensed-Phase Simulations of Proteins. J. Chem. Theory Comput. 2014, 10 (10), 4515–4534.

(33)

Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins 2006, 65, 712–725.

(34)

Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber ff99SB Protein Force Field. Proteins Struct. Funct. Bioinforma. 2010, 78 (8), 1950–1958.

(35)

Still, W. C.; Tempczyk, A.; Hawley, R. C.; Hendrickson, T. Semianalytical Treatment of Solvation for Molecular Mechanics and Dynamics. J. Am. Chem. Soc. 1990, 112 (16), 6127–6129.

(36)

Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Parametrized Models of Aqueous Free Energies of Solvation Based on Pairwise Descreening of Solute Atomic Charges from a Dielectric Medium. J. Phys. Chem. 1996, 100 (51), 19824–19839.

(37)

Hawkins, G. D.; Cramer, C. J.; Truhlar, D. G. Pairwise Solute Descreening of Solute Charges from a Dielectric Medium. Chem. Phys. Lett. 1995, 246 (1–2), 122–129.

(38)

Onufriev, A.; Bashford, D.; Case, D. A. Modification of the Generalized Born Model Suitable for Macromolecules. J. Phys. Chem. B 2000, 104 (15), 3712–3720.

(39)

Mongan, J.; Simmerling, C.; McCammon, J. A.; Case, D. A.; Onufriev, A. Generalized Born Model with a Simple, Robust Molecular Volume Correction. J. Chem. Theory Comput. 2007, 3 (1), 156–169.

(40)

Lwin, T. Z.; Luo, R. Force Field Influences in β-Hairpin Folding Simulations. Protein Sci. 2006, 15 (11), 2642–2655.

(41)

Geney, R.; Layten, M.; Gomperts, R.; Hornak, V.; Simmerling, C. Investigation of Salt Bridge Stability in a Generalized Born Solvent Model. J. Chem. Theory Comput. 2006, 2 (1), 115–127.

(42)

Lin, E.; Shell, M. S. Convergence and Heterogeneity in Peptide Folding with Replica Exchange Molecular Dynamics. J. Chem. Theory Comput. 2009, 5 (8), 2062–2073.

(43)

Lin, E. I.; Shell, M. S. Can Peptide Folding Simulations Provide Predictive Information for Aggregation Propensity? J. Phys. Chem. B 2010, 114 (36), 11899–11908.

(44)

Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 1995, 117 (19), 5179–5197.

(45)

Yoda, T.; Sugita, Y.; Okamoto, Y. Secondary-Structure Preferences of Force Fields for Proteins Evaluated by Generalized-Ensemble Simulations. Chem. Phys. 2004, 307, 269–283.

(46)

Kollman, P. A.; Dixon, R.; Cornell, W.; Fox, T.; Chipot, C.; Pohorille, A. The Development/Application of a “Minimalist” Organic/Biochemical Molecular Mechanic Force Field Using a Combination of Ab Initio Calculations and Experimental Data. In Computer Simulation of Biomolecular Systems, Vol. 3; Wilkinson, A., Weiner, P., van Gunsteren, W. F., Eds.; Elsevier, 1997; pp 83–96.

(47)

Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; Caldwell, J.; Wang, J.; Kollman, P. A Point-Charge Force Field for Molecular Mechanics Simulations of Proteins Based on Condensed-Phase Quantum Mechanical Calculations. J. Comput. Chem. 2003, 24 (16), 1999–2012.

(48)

Perez, A.; MacCallum, J. L.; Brini, E.; Simmerling, C.; Dill, K. A. Grid-Based Backbone Correction to the ff12SB Protein Force Field for Implicit-Solvent Simulations. J. Chem. Theory Comput. 2015, 11 (10), 4770–4779.

(49)

Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79 (2), 926.

(50)

Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L.; Head-Gordon, T. Development of an Improved Four-Site Water Model for Biomolecular Simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120 (20), 9665–9678.

(51)

Cerutti, D. S.; Rice, J. E.; Swope, W. C.; Case, D. A. Derivation of Fixed Partial Charges for Amino Acids Accommodating a Specific Water Model and Implicit Polarization. J. Phys. Chem. B 2013, 117 (8), 2328–2338.

(52)

Li, D.-W.; Brüschweiler, R. NMR-Based Protein Potentials. Angew. Chemie Int. Ed. 2010, 49 (38), 6778–6780.

(53)

Nerenberg, P. S.; Head-Gordon, T. Optimizing Protein-Solvent Force Fields to Reproduce Intrinsic Conformational Preferences of Model Peptides. J. Chem. Theory Comput. 2011, 7 (4), 1220–1230.

(54)

Case, D. A.; Babin, V.; Berryman, J. T.; Betz, R. M.; Cai, Q.; Cerutti, D. S.; Cheatham, III, T. E.; Darden, T. A.; Duke, R. E.; Gohlke, H.; Goetz, A. W.; Gusarov, S.; Homeyer, N.; Janowski, P.; Kaus, J.; Kolossváry, I.; Kovalenko, A.; Lee, T. S.; LeGrand, S.; Luchko, T.; Luo, R.; Madej, B.; Merz, K. M.; Paesani, F.; Roe, D. R.; Roitberg, A.; Sagui, C.; Salomon-Ferrer, R.; Seabra, G.; Simmerling, C. L.; Smith, W.; Swails, J.; Walker, R. C.; Wang, J.; Wolf, R. M.; Wu, X.; Kollman, P. A. AMBER 14; University of California, San Francisco, 2014.

(55)

Santiveri, C. M.; Pantoja-Uceda, D.; Rico, M.; Jiménez, M. A. Beta-Hairpin Formation in Aqueous Solution and in the Presence of Trifluoroethanol: A (1)H and (13)C Nuclear Magnetic Resonance Conformational Study of Designed Peptides. Biopolymers 2005, 79 (3), 150–162.

(56)

Bierzynski, A.; Kim, P. S.; Baldwin, R. L. A Salt Bridge Stabilizes the Helix Formed by Isolated C-Peptide of RNase A. Proc. Natl. Acad. Sci. U. S. A. 1982, 79 (April), 2470–2474.

(57)

Scholtz, J. M.; Barrick, D.; York, E. J.; Stewart, J. M.; Baldwin, R. L. Urea Unfolding of Peptide Helices as a Model for Interpreting Protein Unfolding. Proc. Natl. Acad. Sci. U. S. A. 1995, 92 (1), 185–189.

(58)

García, A. E.; Sanbonmatsu, K. Y. Exploring the Energy Landscape of a Beta Hairpin in Explicit Solvent. Proteins 2001, 42 (3), 345–354.

(59)

Ghosh, T.; Garde, S.; García, A. E. Role of Backbone Hydration and Salt-Bridge Formation in Stability of Alpha-Helix in Solution. Biophys. J. 2003, 85 (5), 3187–3193.

(60)

Demarest, S. J.; Fairman, R.; Raleigh, D. P. Peptide Models of Local and Long-Range Interactions in the Molten Globule State of Human α-Lactalbumin. J. Mol. Biol. 1998, 283 (1), 279–291.

(61)

Demarest, S. J.; Hua, Y.; Raleigh, D. P. Local Interactions Drive the Formation of Nonnative Structure in the Denatured State of Human α-Lactalbumin: A High Resolution Structural Characterization of a Peptide Model in Aqueous Solution. Biochemistry 1999, 38 (22), 7380–7387.

(62)

Okur, A.; Strockbine, B.; Hornak, V.; Simmerling, C. Using PC Clusters to Evaluate the Transferability of Molecular Mechanics Force Fields for Proteins. J. Comput. Chem. 2003, 24 (1), 21–31.

(63)

Zerella, R.; Chen, P. Y.; Evans, P. A.; Raine, A.; Williams, D. H. Structural Characterization of a Mutant Peptide Derived from Ubiquitin: Implications for Protein Folding. Protein Sci. 2000, 9 (11), 2142–2150.

(64)

Jang, S.; Shin, S.; Pak, Y. Molecular Dynamics Study of Peptides in Implicit Water: Ab Initio Folding of β-Hairpin, β-Sheet, and ββα-Motif. J. Am. Chem. Soc. 2002, 124 (18), 4976–4977.

(65)

Ulmschneider, J. P.; Jorgensen, W. L. Polypeptide Folding Using Monte Carlo Sampling, Concerted Rotation, and Continuum Solvation. J. Am. Chem. Soc. 2004, 126 (6), 1849–1857.

(66)

Eaton, W. A.; Munoz, V.; Hagen, S. J.; Jas, G. S.; Lapidus, L. J.; Henry, E. R.; Hofrichter, J. Fast Kinetics and Mechanisms in Protein Folding. Annu. Rev. Biophys. Biomol. Struct. 2000, 29, 327–359.

(67)

Blanco, F. J.; Serrano, L. Folding of Protein G B1 Domain Studied by the Conformational Characterization of Fragments Comprising Its Secondary Structure Elements. Eur. J. Biochem. 1995, 230 (2), 634–649.

(68)

Honda, S.; Kobayashi, N.; Munekata, E. Thermodynamics of a Beta-Hairpin Structure: Evidence for Cooperative Formation of Folding Nucleus. J. Mol. Biol. 2000, 295 (2), 269–278.

(69)

Kobayashi, N.; Honda, S.; Yoshii, H.; Munekata, E. Role of Side-Chains in the Cooperative Beta-Hairpin Folding of the Short C-Terminal Fragment Derived from Streptococcal Protein G. Biochemistry 2000, 39 (21), 6564–6571.

(70)

Derrick, J. P.; Wigley, D. B. The Third IgG-Binding Domain from Streptococcal Protein G: An Analysis by X-Ray Crystallography of the Structure Alone and in a Complex with Fab. J. Mol. Biol. 1994, 243 (5), 906–918.

(71)

Gallagher, T.; Alexander, P.; Bryan, P.; Gilliland, G. L. Two Crystal Structures of the B1 Immunoglobulin-Binding Domain of Streptococcal Protein G and Comparison with NMR. Biochemistry 1994, 33, 4721–4729.

(72)

Lian, L. Y.; Derrick, J. P.; Sutcliffe, M. J.; Yang, J. C.; Roberts, G. C. K. Determination of the Solution Structures of Domains II and III of Protein G from Streptococcus by 1H Nuclear Magnetic Resonance. J. Mol. Biol. 1992, 228 (4), 1219–1234.

(73)

Zhou, R.; Berne, B. J.; Germain, R. The Free Energy Landscape for β Hairpin Folding in Explicit Water. Proc. Natl. Acad. Sci. 2001, 98 (26), 14931–14936.

(74)

Zhou, R.; Berne, B. J. Can a Continuum Solvent Model Reproduce the Free Energy Landscape of a β-Hairpin Folding in Water? Proc. Natl. Acad. Sci. 2002, 99 (20), 12777–12782.

(75)

Dinner, A. R.; Lazaridis, T.; Karplus, M. Understanding β-Hairpin Formation. Proc. Natl. Acad. Sci. 1999, 96 (16), 9068–9073.

(76)

Pande, V. S.; Rokhsar, D. S. Molecular Dynamics Simulations of Unfolding and Refolding of a β-Hairpin Fragment of Protein G. Proc. Natl. Acad. Sci. U. S. A. 1999, 96 (16), 9062–9067.

(77)

Nguyen, P. H.; Stock, G.; Mittag, E.; Hu, C.-K.; Li, M. S. Free Energy Landscape and Folding Mechanism of a β-Hairpin in Explicit Water: A Replica Exchange Molecular Dynamics Study. Proteins Struct. Funct. Bioinforma. 2005, 61 (4), 795–808.

(78)

Gronenborn, A.; Filpula, D.; Essig, N.; Achari, A.; Whitlow, M.; Wingfield, P.; Clore, G. A Novel, Highly Stable Fold of the Immunoglobulin Binding Domain of Streptococcal Protein G. Science 1991, 253 (5020), 657–661.

(79)

Munoz, V.; Henry, E. R.; Hofrichter, J.; Eaton, W. A. A Statistical Mechanical Model for β-Hairpin Kinetics. Proc. Natl. Acad. Sci. 1998, 95 (11), 5872–5879.

(80)

Roccatano, D.; Amadei, A.; Di Nola, A.; Berendsen, H. J. C. A Molecular Dynamics Study of the 41–56 β-Hairpin from B1 Domain of Protein G. Protein Sci. 1999, 8 (10), 2130–2143.

(81)

Kolinski, A.; Ilkowski, B.; Skolnick, J. Dynamics and Thermodynamics of β-Hairpin Assembly: Insights from Various Simulation Techniques. Biophys. J. 1999, 77 (6), 2942–2952.

(82)

Bryant, Z.; Pande, V. S.; Rokhsar, D. S. Mechanical Unfolding of a Beta-Hairpin Using Molecular Dynamics. Biophys. J. 2000, 78 (February), 584–589.

(83)

Ma, B.; Nussinov, R. Molecular Dynamics Simulations of a Beta-Hairpin Fragment of Protein G: Balance between Side-Chain and Backbone Forces. J. Mol. Biol. 2000, 296 (4), 1091–1104.

(84)

Cochran, A. G.; Skelton, N. J.; Starovasnik, M. A. Tryptophan Zippers: Stable, Monomeric β-Hairpins. Proc. Natl. Acad. Sci. 2001, 98 (10), 5578–5583.

(85)

Pastor, M. T.; López de la Paz, M.; Lacroix, E.; Serrano, L.; Pérez-Payá, E. Combinatorial Approaches: A New Tool to Search for Highly Structured Beta-Hairpin Peptides. Proc. Natl. Acad. Sci. U. S. A. 2002, 99 (2), 614–619.

(86)

Kim, E.; Yang, C.; Jang, S.; Pak, Y. Free Energy Landscapes of a Highly Structured β-Hairpin Peptide and Its Single Mutant. J. Chem. Phys. 2008, 129 (16), 165104.

(87)

Neidigh, J. W.; Fesinmeyer, R. M.; Andersen, N. H. Designing a 20-Residue Protein. Nat. Struct. Biol. 2002, 9 (6), 425–430.

(88)

Paschek, D.; Nymeyer, H.; García, A. E. Replica Exchange Simulation of Reversible Folding/unfolding of the Trp-Cage Miniprotein in Explicit Solvent: On the Structure and Possible Role of Internal Water. J. Struct. Biol. 2007, 157 (3), 524–533.

(89)

Pitera, J. W.; Swope, W. Understanding Folding and Design: Replica-Exchange Simulations of “Trp-Cage” Miniproteins. Proc. Natl. Acad. Sci. U. S. A. 2003, 100 (13), 7587–7592.

(90)

Zhou, R. Trp-Cage: Folding Free Energy Landscape in Explicit Water. Proc. Natl. Acad. Sci. 2003, 100 (23), 13280–13285.

(91)

Simmerling, C.; Strockbine, B.; Roitberg, A. E. All-Atom Structure Prediction and Folding Simulations of a Stable Protein. J. Am. Chem. Soc. 2002, 124 (38), 11258–11259.

(92)

Snow, C. D.; Nguyen, H.; Pande, V. S.; Gruebele, M. Absolute Comparison of Simulated and Experimental Protein-Folding Dynamics. Nature 2002, 420 (6911), 102–106.

(93)

Pantoja-Uceda, D.; Pastor, M. T.; Salgado, J.; Pineda-Lucena, A.; Pérez-Payá, E. Design of a Bivalent Peptide with Two Independent Elements of Secondary Structure Able to Fold Autonomously. J. Pept. Sci. 2008, 14 (7), 845–854.

(94)

Shoemaker, K. R.; Kim, P. S.; Brems, D. N.; Marqusee, S.; York, E. J.; Chaiken, I. M.; Stewart, J. M.; Baldwin, R. L. Nature of the Charged-Group Effect on the Stability of the C-Peptide Helix. Proc. Natl. Acad. Sci. U. S. A. 1985, 82 (8), 2349–2353.

(95)

Hansmann, U. H. E.; Okamoto, Y. Tertiary Structure Prediction of C-Peptide of Ribonuclease A by Multicanonical Algorithm. J. Phys. Chem. B 1998, 102 (4), 653–656.

(96)

Okamoto, Y.; Fukugita, M.; Nakazawa, T.; Kawai, H. α-Helix Folding by Monte Carlo Simulated Annealing in Isolated C-Peptide of Ribonuclease A. Protein Eng. Des. Sel. 1991, 4 (6), 639–647.

(97)

Kim, P.; Baldwin, R. Intermediates In The Folding Reactions Of Small Proteins. Annu. Rev. Biochem. 1990, 59 (1), 631–660.

(98)

Yoda, T.; Sugita, Y.; Okamoto, Y. Comparisons of Force Fields for Proteins by Generalized-Ensemble Simulations. Chem. Phys. Lett. 2004, 386 (4–6), 460–467.

(99)

Marqusee, S.; Baldwin, R. L. Helix Stabilization by Glu-...Lys+ Salt Bridges in Short Peptides of de Novo Design. Proc. Natl. Acad. Sci. 1987, 84 (24), 8898–8902.

(100)

Sugita, Y.; Okamoto, Y. Replica-Exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999, 314 (1–2), 141–151.

(101)

Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Dill, K. A. Long-Time Protein Folding Dynamics from Short-Time Molecular Dynamics Simulations. Multiscale Model. Simul. 2006, 5 (4), 1214–1226.

(102)

Kabsch, W.; Sander, C. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22 (12), 2577–2637.

(103)

Blanco, F. J.; Rivas, G.; Serrano, L. A Short Linear Peptide That Folds into a Native Stable Beta-Hairpin in Aqueous Solution. Nat. Struct. Biol. 1994, 1 (9), 584–590.

(104)

Shao, Q.; Yang, L.; Gao, Y. Q. A Test of Implicit Solvent Models on the Folding Simulation of the GB1 Peptide. J. Chem. Phys. 2009, 130 (19), 195104.

(105)

Im, W.; Lee, M. S.; Brooks, C. L. Generalized Born Model with a Simple Smoothing Function. J. Comput. Chem. 2003, 24 (14), 1691–1702.

(106)

Lee, M. S.; Salsbury, F. R.; Brooks, C. L. Novel Generalized Born Methods. J. Chem. Phys. 2002, 116 (24), 10606–10614.

(107)

Wagoner, J. A.; Baker, N. A. Assessing Implicit Models for Nonpolar Mean Solvation Forces: The Importance of Dispersion and Volume Terms. Proc. Natl. Acad. Sci. 2006, 103 (22), 8331–8336.

(108)

Chen, J.; Brooks III, C. L. Implicit Modeling of Nonpolar Solvation for Simulating Protein Folding and Conformational Transitions. Phys. Chem. Chem. Phys. 2008, 10 (4), 471–481.

(109)

Sanyal, T.; Shell, M. S. Coarse-Grained Models Using Local-Density Potentials Optimized with the Relative Entropy: Application to Implicit Solvation. J. Chem. Phys. 2016, 145 (3), 034109.

For Table of Contents Only