Charged Termini on the Trp-Cage Roughen the Folding Energy

May 27, 2015 - We study the energy landscape and thermodynamics of the zwitterionic variant of the widely studied TC5b Trp-cage protein using replica ...
1 downloads 11 Views 2MB Size
Article pubs.acs.org/JPCB

Charged Termini on the Trp-Cage Roughen the Folding Energy Landscape Charles A. English and Angel E. García* Department of Physics and Astronomy and The Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York 12180, United States ABSTRACT: We study the energy landscape and thermodynamics of the zwitterionic variant of the widely studied TC5b Trp-cage protein using replica exchange molecular dynamics simulations. We show that the addition of two charge groups at the termini has dramatic consequences to the folding landscape. First, the addition of charged ends increases the equilibration time of the simulation by a factor of 2.5 over a variant with terminal capping. Second, we identify the formation of two long-lived metastable states not present in the capped ends variant structural ensemble. The population of these metastable states is higher at lower temperatures; furthermore, these states are determined to be low energy states, relative to the folded state. The first of the metastable states is a folding intermediate structure which is characterized by a non-native charge pair. The second is characterized by significant β sheet content. We show through potential of mean force (PMF) calculations that the PMF between two charge groups is a poor predictor of the prevalence of a particular ion pair in the unfolded structural ensemble. Finally, by analyzing the energy differences between the folded state, unfolded states, and the metastable states, we show that the stabilization of these metastable states is not only due to favorable Coulomb interactions but also due to strain in the dihedral angles. Our results show that, even for a simple protein, the folding landscape can be extremely complex and significantly altered by simple changes to the charge states of the sequence.



state.5 Numerous simulations of the Trp-cage have examined the effects of various force fields and both implicit and explicit solvents,7−16 and simulation techniques.8,9,17−21 Other studies have examined the thermodynamics,22,23 kinetics,6,24−26 structure and stability in urea, TMAO, and other cosolvents,27−31 protonation effects,32 sequence variants,33 and effects of confinement on the stability of Trp-cage.34,35 Generally, protein chains found in nature are zwitterionic, having a positively charged N-terminus and a negatively charged C-terminus under normal physiological conditions. However, peptides may be designed with neutral, capped ends to optimize stability and solubility. Commonly, an acetyl group is added to the N-terminus and an N-methyl group is added to the C-terminus, thus eliminating the terminal charges. In computational studies of Trp-cage, a number of simulations are performed with neutral caps placed on the termini. For example, studies on the Trp-cage by Zhou et al.18 and Huang et al.17 and by our group29,30,32−34 employ the use of neutralizing terminal caps. In contrast, a number of more recent studies have been performed in which the termini of the Trp-cage are simulated with charges.23,36 Experiments typically employ the zwitterionic variant of proteins commonly found in nature.5 In this study, a zwitterionic variant of the TC5b Trp-cage is

INTRODUCTION Charged groups play an important role in the stability of protein structure and folding kinetics.1,2 As such, the addition of charged groups by mutation or changes in solvent conditions may have large effects on the folding and conformational ensemble of a protein. For example, it is thought that the charged groups of a protein play a role in limiting aggregation while increasing folding times.3 Other work has shown charges can play a role in limiting the aggregation propensity of an amyloid.4 Experiments on proteins have difficulty in determining changes to non-native structure. For this reason, all-atom simulations are an important tool in examinations of mutational effects, in the intermediate and the unfolded structural ensembles. Since solvent molecules play a large role in charge pair screening, explicit solvent and all-atom models are preferred, even required, thus limiting the number of proteins possible for study to relatively smaller proteins with fast folding times (∼1 μs). The Trp-cage, a designed protein5 with a folding time of 4 μs,6 consisting of 20 residues, is a good candidate for a study on the effects of the addition of charges on the structural ensemble of a protein. The Trp-cage variant known as TC5b, with the sequence NLYIQWLKDGGPSSGRPPPS, is well studied in the literature and possesses an α helix (residues 1−8), a 3−10 turn (residues 12−14), and a polyproline II segment (residues 16−20) with the main protein chain shielding the hydrophobic tryptophan side chain (TRP(6)) in the center of the cage, in the folded © XXXX American Chemical Society

Received: March 2, 2015 Revised: May 15, 2015

A

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B simulated with replica exchange molecular dynamics.37,38 The structural ensemble is then characterized and compared with a previous study performed for the TC5b variant with acetylated and N-methylated ends39 using the same simulation parameters. We find that, in addition to a longer equilibration time for the charged end variant, the nonfolded structural ensemble of the charged ends variant showed a dramatic increase in the variety of structures sampled, particularly in the formation of metastable non-native states. The formation of these metastable states is rare (1 replica each), but they are long-lived (more than 1 μs). We also examine non-native charge pair interactions observed in the simulations. We find the formation of a metastable intermediate can be attributed to the N-terminus contacting the aspartic acid (ASP(9)) side chain. However, the creation of a metastable state in the unfolded ensemble cannot be directly attributable to a non-native charge pair. Rather, this state is stabilized by the N-terminus forming contacts with the carbonyl groups of the protein backbone. Furthermore, we show the likelihood to form particular non-native charge pairs in the unfolded state is not directly related to the strength of the electrostatic pair energy. This finding is demonstrated through potential of mean force (PMF) calculations for interactions between single amino acids which are then compared with histograms of the non-native charge pair contacts in the unfolded state observed in the simulations.

was found to be less than 0.22 nm). The criterion for a folded state is justified for the neutral end variant of TC5b in another publication and is retained for the purposes of comparison.39 In order to gain information about which structures were associated with specific histogram populations, configurations from the simulation trajectories corresponding to the peak in question were extracted from the full simulation. The Daura clustering algorithm,45 as implemented in GROMACS, was then run using a 0.10 nm cutoff on the extracted configurations to determine the structural content of the histogram peaks. To determine whether the calculated centroids are associated with a single replica or sampled by multiple replicas, we calculated Cα RMSD values for the simulation trajectories with respect to the centroid structures. Structures were considered as corresponding to the metastable states if the Cα RMSD values were under 0.30 nm. This cutoff was justified from RMSD histograms calculated with respect to each of the centroid metastable states. Finally, contact maps were generated from distances between all Cα atoms. Two contact Cα atoms were in contact if they were within 0.60 nm of each other. In order to calculate the PMF between charged amino acids, umbrella sampling46 molecular dynamics runs were analyzed with the weighted histogram analysis method (WHAM).47 A PMF measures the average energy of interaction between two entities in a simulation as a function of a reaction coordinate. Two individual amino acids were placed in a simulation box, representing the corresponding residues in the protein possessing the charge pair of interest. N- and C-terminal residues (ASN and SER) were simulated with only a single acetyl or N-methyl cap on the N- (for SER) or C-terminus (for ASN), thereby ensuring the same charge properties as in the protein sequence. Residues found elsewhere in the chain were created with acetyl and N-methyl caps on both termini. Preparation of the simulation boxes was performed in the same manner as previously described for the whole protein but with the number of solvent molecules added corresponding to a 1.1 nm distance between the edge of the cubic box and the closest atom found in either of the two amino acid residues. The reaction coordinate chosen was the distance between the central atom of the charged atomic group of each residue. For example, the distance between ASP and ARG side chains was defined as the distance between the Cγ of the aspartic acid and the Cζ atom of the arginine. Umbrellas were placed every 0.02 nm covering the interval from 0.20 to 1.0 nm. The force constant used for the umbrellas was 5000 kJ/mol/nm2.



METHODS We performed replica exchange molecular dynamics (REMD)37,38 simulations of the sequence NLYIQWLKDGGPSSGRPPPS. We compared results to those of the sequence Ace-NLYIQWLKDGGPSSGRPPPS-NMeth, which were previously published.39 The simulations were performed using the GROMACS 4.5.1 software package40 and the AMBER99sb force field.11 The replica temperatures were chosen such that 40 replicas were used covering a span of temperatures from 275 to 500 K with a uniform 20% exchange rate. The temperatures required for a uniform exchange rate were found using the method outlined in Garcia et al.41 and using a Mathematica script supplied in the Supporting Information of English et al.33 As in previous work, a stochastic dynamics integrator was used as the thermostat. Two independent REMD simulations were run. The first was initialized with a randomized collapsed structure and was denoted as the unfolded initial conditions (UIC) simulation. The second used the folded PDB (PDB code 1L2Y, frame 1) structure determined from NMR results as initial conditions. This simulation is denoted in this paper as the folded initial conditions (FIC) simulation. The simulations were run with PME electrostatics42 using real space cutoffs of 1.0 nm. A 1.0 nm cutoff was used for Lennard-Jones interactions. The simulation box was cubic and was solvated using 2983 TIP3P water molecules.43 Two chloride ions and one sodium ion were added to make the box net charge neutral. The temperature and pressure of the box were equilibrated to 300 K by a 10 ps simulation with the Berendsen thermostat and to 1 bar with a 10 ns simulation using the Berendsen barostat. The box size obtained after the temperature and pressure equilibration was constant throughout the REMD production simulation. No pressure coupling was used. Bond constraints were simulated using the LINCS algorithim.44 The UIC and FIC simulations were run until a steady state was reached, and both simulations yielded a similar number of replicas folded (Cα RMSD with respect to the PDB structure



RESULTS AND DISCUSION The convergence time for the number of replicas folded in the UIC simulation was determined to be 1500 ns (Figure 1A). After 1500 ns, the number of replicas folded in the UIC simulation reached a value that remained nearly constant for the rest of the simulation. In contrast, the same equilibration mark occurred by 400 ns for the capped ends variant. For the capped ends sequence, the correlation time was determined to be 50 ns, while, for the charged ends sequence, the correlation time was found to be 500 ns. The correlation time was taken to be the time necessary for the system to display all characteristic behavior. As such, the factor of 10 increase indicates a much rougher landscape for the charged ends variant than for the capped ends sequence. Additionally, time blocks of double the correlation time were taken to be an appropriate length to compute errors in the calculated averages.48 On the basis of the correlation time and the equilibration time for the folded state, B

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 1. (A) Time per replica elapsed vs the fraction of replicas folded out of 40. UIC and FIC simulations are plotted for TC5b charged ends as well as the UIC plot for the capped ends variant. (B) Melting curve for TC5b capped ends UIC (red) and charged ends UIC simulations (black). Populations for the intermediate metastable state (green) and unfolded metastable state (blue) in the charged ends simulations as functions of temperature are given.

the last 600 ns of the UIC capped ends simulation and the last 2000 ns of the UIC charged ends simulation were used for analysis. A melting curve (Figure 1B) was calculated for both sequence variants. The folding temperature of the sequence was defined as the temperature at which half the configurations of the simulation were folded. The melting temperature was found to be 320 ± 8 K for the capped ends variant and 321 ± 1 K for the charged ends variant. The melting temperatures were therefore found not to be statistically different. More interestingly, it was found the fraction of configurations folded decreased by 0.11 at the lowest temperatures for the charged ends sequence than for the capped ends sequence. The difference is due to a shift in population away from the folded state to the two low energy metastable states. The metastable states act as a third state that competes with the folded and unfolded states. The lower stability for the charged ends variant is in conflict with the experimental results of Barua et al.,49 who found the capping of the C-terminus resulted in lower stability for the Trp-cage at 280 K. Barua et al. attributed the increase in stability to a likely contact between the N- and C-termini. In our simulations, we could not find evidence of a significant increase in contacts between the N- and C-termini over the capped ends sequence. The populations of each metastable state are given in Figure 1B as a function of temperature. In order to verify the 0.22 nm RMSD cutoff remained appropriate to define the folded state, RMSD histograms were constructed with a bin width of 0.01 nm, which are shown in Figure 2A. By observing that the populations that comprise the two peaks corresponding to the folded state, centered at 0.9 and 0.20 nm, are zero at values slightly greater than 0.22 nm, we can conclude that this cutoff distance is a good criterion to distinguish folded and unfolded states. The two peaks below 0.22 nm correspond to two sets of structures with slightly different 3−10 turn conformations that interconvert in the ns time scale.39 RMSD histograms were also calculated for 500 ns time blocks of the UIC simulation to determine the presence of metastable non-native states, shown in Figure 2B. From the constructed RMSD histograms, counts for both sequences in the folded state were nearly identical. However, the RMSD histograms constructed from the 500 ns time blocks show that the individual folded substates do not equilibrate until after 2000 ns, with the secondary folded substate losing population during the equilibration process. In the intermediate region (defined as 0.22−0.50 nm), the histogram changing over time (Figure 2B) for the charged ends sequence indicates the presence of a metastable structure with a Cα RMSD between 0.33 and 0.39 nm. This peak was not found in the capped ends

Figure 2. (A) RMSD histogram for charged and capped ends Trp-cage variants at 283 K (lowest temperature simulated) and 319 K (melting temperature). (B) RMSD histograms for UIC charged ends simulation separated into 500 ns segments. (C) History of replicas characterized by folding status in 1 ns time blocks. A replica’s folding status is given by the legend on the right, with red corresponding to all frames in the 1 ns time block being folded and yellow none. The metastable states are shown as dark green (corresponding to replica 15, and named the intermediate metastable state) and orange (corresponding to replica 39, named the unfolded metastable). The two peaks below 0.22 nm, corresponding to the folded state, represent changes in the 3−10 helix turn.39 The system interconverts quickly (ns time scale) between these two forms, and both forms are considered to be in the folded state.

sequence RMSD histogram. The peak declines from the time periods between 1500 and 2000 ns and 2500 and 3000 ns, at which point the state disappears. To determine how many replicas adopt metastable structures, a diagram of the status of all replicas at a given time was constructed (Figure 2C). The intermediate metastable state is only stably adopted by a single replica but is relatively long-lived (about 1.0 μs). The histograms for the capped ends sequence indicate the presence of a metastable state with a Cα RMSD between 0.46 and 0.50 nm not present for the charged ends sequence. In the unfolded region (Cα RMSD greater than 0.50 nm), the charged ends sequence histograms indicate the presence of two metastable populations, a primary one with Cα RMSDs between 0.53 and 0.60 nm and a less prominent, secondary one with Cα RMSDs between 0.60 and 0.63 nm. The former peak grows over time, while the latter one disappears by 3.0 μs. The prominent metastable state in the unfolded ensemble is also attributed to a single replica adopting the configuration (trace for replica 39 in Figure 2C, colored orange). For this metastable state, a set of 100 MD simulations was run at 321 K starting from the calculated centroid structure (see Figure 3B). After 150 ns, none of the 100 replicas had unfolded, suggesting that this metastable state is very stable. Furthermore, the adoption of both metastable states appears to be independent of the folding process. It has been suggested that, in order to fold, a replica need not sample the entire landscape of the unfolded state.50 It has been reported that Trp-cage will, on average, sample about 27% of the unfolded state before folding.51 To identify and characterize the structures of the metastable states associated with these histogram peaks, Daura clustering45 was performed on the trajectories at 283 K with a 0.10 nm cutoff. The calculated centroid structures for each of the metastable states are shown in Figure 3. The states are C

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 3. Most prevalent centroid structures generated from single linkage clustering: (A) intermediate charged ends metastable state, (B) unfolded charged ends β sheet state, (C) charged ends unfolded metastable state (RMSD between 0.60 and 0.63 nm), and (D) intermediate capped ends metastable state.

Figure 4. Contact maps for the intermediate state of (A) TC5b capped ends and (B) TC5b charged ends. Contact maps for the unfolded state of (C) TC5b capped ends and (D) TC5b charged ends. A residue is considered in contact if the distance between Cα atoms is less than 0.6 nm.

characterized as follows. Figure 3A is the metastable intermediate found in the nonequilibrated UIC simulation for the charged ends variant. The structure is in a native-like arrangement with a charge pair contact between the Nterminus and the negatively charged aspartic acid side chain. In this structure, the aspartic acid side chain also contacts the positively charged arginine side chain, as is present in the native state of the Trp-cage. The unfolded metastable state present in the charged ends UIC simulation, Figure 3B, is characterized by a large β sheet content and is similar in structure to a beta hairpin motif. Although the N-terminus does not appear to be in contact with any particular side chain, the structure suggests the N-terminus is contacting the midchain carbonyl groups in the backbone. This conclusion is supported by contact maps, shown in Figure 4, which reveal a large frequency of contacts between the N-terminus (residue 1) and residues 8−15 (Figure 4D) not present for the capped ends variant (Figure 4C). The secondary unfolded UIC charged end metastable structure appears as Figure 3C. This structure shows the native state salt bridge and is similar to the β sheet structure with the Nterminal section of the backbone crossing the C-terminal section of the backbone. Finally, the capped end metastable state (Figure 3D) is characterized as possessing the native state salt bridge and forms a beta-hairpin-like structure as well. The contact maps for both forms of Tc5b show long-range contacts between TRP(6) and PRO(12) and ARG(16) and PRO(18). These contacts suggest the formation of a hydrophobic cluster as observed in NMR experiments at high urea concentrations.27,28 NMR studies of the unfolded state of TC5b at 6 M urea also suggest the presence of significant non-random-coil structures. Rogne et al.,28 in agreement with Mok at el.,27 showed that ILE(4), LEU(7), PRO(12), and ARG(16) participate in a hydrophobic cluster and that there is no alpha helical structure in the denatured state. These observations are consistent with the contact maps and with the structures of the metastable states identified by clustering. However, we cannot make assignments of our identified

meltable states with the NMR observables under denaturing conditions. A set of dictionary of secondary structure of proteins (DSSP) calculations52 were used to determine what impact the charged ends make on secondary structures. DSSP calculations were performed for both the charged and capped ends sequences and are shown in Figure 5. From the DSSP analysis, it is clear that the addition of charged ends has little effect on the alpha helical content for the intermediate and unfolded states. The β sheet content for the unfolded ensemble, for the first nine residues, shows two peaks at residues 2−3 and 7−8. The percentage of β sheet content comprising these peaks indicates

Figure 5. DSSP analysis of configurations sampled at 283 K showing (A) alpha helical, (B) β sheet, (C) 3−10 turn, and (D) random coil content by residue for the capped ends intermediate ensemble (black), capped ends unfolded ensemble (red), charged ends intermediate ensemble (green), and charged ends unfolded ensemble (blue). D

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B a quadrupling of the population with this secondary structure over the capped ends variant. This increase is primarily due to the addition of the metastable state in the unfolded ensemble for the charged ends variant. There is a greater variation in structures assigned to 3−10 turn type structures for the charged ends variant over the capped ends sequence in the intermediate ensemble. Most interestingly, however, is the lack of significant change between the two charge variants in the random coil count for almost all residues, with the exception of the alpha helical region for the charged ends ensemble, which indicates a slight drop in random coil content. This indicates that the dramatic increase in β sheet content for the charged ends unfolded ensemble does not significantly reduce the amount of random coil in the ensembles. Rather, the secondary structure shifts are created by eliminating other secondary structures in the capped ends nonfolded ensembles, such as bends and turns. The energy differences between the folded and intermediate metastable states, the folded and unfolded metastable states, and the folded and unfolded states (excluding metastable states) were calculated. The results are shown in Table 1 for

Figure 6. Average potential energy levels of different states within the TC5b charged ends system. The average potential energy for the unfolded state is shown at the top, with the two metastable states’ average energies shown at the bottom. The average potential energies are calculated with respect to the average energy of the folded state.

each charge pair according to folding status (e.g., folded, intermediate, or unfolded) and the charged status of the termini. The results are shown in Figure 7 as the difference

Table 1. Energy Differences for a Transition from the Folded State to the Intermediate Metastable State, from the Folded State to the Unfolded Metastable State, and from the Folded State to the General Unfolded State (Not Including Either Metastable State)a

bonded interactons Lennard-Jones interactions Couloumb interactions total a

F to Int MS (kJ/mol)

F to Unf MS (kJ/mol)

F to Unf (kJ/mol)

−16 8

−25 17

−26 21

−5

−7

18

−13

−15

13

The energies are broken down by interaction type.

283 K. For the transition from the folded state to the unfolded state, we observed a significant decrease in energy associated with the bonded interactions term, indicating a more relaxed structure. The decrease in energy is more than compensated for by a corresponding increase in energy associated with both Coulomb and Lennard-Jones interactions, resulting in a net increase in energy associated with this transition. Similarly, changing from the folded state to either metastable state maintains the decrease in bonded interaction energy. However, this transition manages to decrease the Coulomb interaction energy with more modest gains in Lennard-Jones interaction energies. This results in a net loss in energy for both metastable states when compared to the folded state of the Trp-cage. Additionally, the energy change measurements also support the observation that charge interactions play a key role in stabilizing these metastable states. The energy levels of each of the states are displayed in Figure 6. In order to confirm which individual non-native charge pairs were prevalent in the intermediate and unfolded states of the charged ends sequence, histograms were constructed of the distances between attractive non-native charge groups in the protein sequence introduced by the charged termini. The four possible attractive charge pairs involving a charged terminus were the N-terminus with ASP(9), the N- and C-termini, the C-terminus with LYS(8), and the C-terminus with ARG(16). The native state salt bridge between ASP(9) and ARG(16) was also included in the analysis. The histograms were calculated for

Figure 7. Differences in populations for charge pair distances between charged and capped ends sequences. Positive (negative) numbers indicate a particular distance is more (less) likely in the charged ends sequence. The lines are folded state - black, intermediate - red, and unfolded - green for (A) N-terminus−ASP(9), (B) N-terminus− ASP(9) 1.5−2.5 μs, (C) N-terminus−C-terminus, (D) LYS(8)−Cterminus, (E) ARG(16)−C-terminus, and (F) ASP(9)−ARG(16).

between the normalized histograms calculated with positive and negative values corresponding to more prevalence in the charged ends and capped ends variants, respectively. It was found that the N-terminal contact with ASP(9) (Figure 7A) shows a significant increase in its population at a charge pair distance of 0.60 nm for the unfolded state of the charged ends sequence. This contact is the most prevalent change for all attractive charge pair contacts between capped and charged end sequences and is associated with the unfolded beta-hairpin-like metastable state. There is also a significant increase in contact population for the intermediate state at an N-terminus−ASP(9) distance of 0.4 nm, due to the intermediate metastable state. E

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B The other possible non-native charge pair distance distributions show little change, however. Though theorized to be a stabilizing factor for the charged ends sequence over the capped ends sequence by Barua et al., the N-terminus shows only a slight increase in probability to contact the C-terminus in the unfolded state and a slight decrease in the intermediate state (Figure 7B). The last two non-native attractive charge pairs, the C-terminus and LYS(8) (Figure 7C) and the Cterminus with ARG(16) (Figure 7D), show only minimal counts or differences in any states at short distances (less than 0.6 nm), though there is a slight probability of these contacts forming for the charged ends sequence. Finally, the native state salt bridge between ASP(9) and ARG(16) (Figure 7E) forms in all states for both variants. The charged ends sequence possesses an increase in population in the unfolded ensemble at a separation distance of 0.4 nm and a decrease at 0.4 nm in both the folded and intermediate ensembles. In order to estimate the free energy of all possible charge pairs without the effects of the protein sequence, PMF calculations were performed as described in the Methods section. The results are given in Figure 8, and energy minima

Table 2. Minimum Contact Average Free Energies, as Determined from PMF Measurementsa primary PMF min. (kJ/mol) N−ASP(9) N−C LYS(8)−C ARG(16)−C ASP(9)−ARG(16) N−GLY-O a

−1.3 −1.1 −1.4 −5.4 −4.0 −0.3

± ± ± ± ± ±

0.5 0.6 0.4 0.9 1.4 0.3

secondary PMF min. (kJ/mol) 0.4 ± 0.1 ± 0.9 ± 0.2 ± 0.2 ± none

0.3 0.2 0.4 0.6 1.0

Primary and secondary contact average free energies are shown.

attractive PMF minima of 5.4 ± 0.9 and 4.0 ± 1.4 kJ/mol. As a partial validation, the energy of interaction of the ASP(9)− ARG(16) salt bridge has experimentally been determined to be 4.7 ± 1.3 kJ/mol in the Trp-cage.49,53 Finally, the direct contact of the N-terminus to the protein backbone was tested, yielding an energy minimum of 0.3 ± 0.3 kJ/mol. In no case tested did a secondary PMF minimum conclusively suggest an attractive energy was possible with a water molecule between charge groups, suggesting secondary PMF minima do not play a role in the possible formation of charge pairs. This suggests the stabilization of the unfolded charged ends metastable state by a secondary charge pair contact is unlikely. However, most interestingly, the results show that PMF measurements of interactions between charged residues prove to be poor predictors of which charge pairs are actually observed in the simulation. The PMFs suggest that the most energetically favorable non-native charge pair is the ARG(16)−C-terminus contact, yet charge pair histograms for this contact show little increase in populations going from the capped ends TC5b sequence to the charged ends variant. This suggests that, if the ARG(16)−C-terminus contact were to form, the free energy loss from the resulting charge pair would be more than offset by free energy gains from unfavorable conformations and contacts. The PMFs also indicate the N-terminal−ASP(9) contact alone is not more energetically favorable than any of the other possible charge pair contacts.



CONCLUSIONS We studied the effects of charged termini on the structure and stability of the Trp-cage. We find that the folded structure and stabilities of the two sequences are similar but that the free energy landscapes are dramatically different. Most noticeably, charged ends drastically increased the amount of time required to reach a steady-state equilibrium starting from unfolded structures over a capped end variant. This finding is consistent with work that suggests that the addition of charges frustrates the folding process and increases the folding time.3 The changes in charge states did not affect the folded structure of the Trp-cage yet did result in a decrease in stability at the lowest temperatures. This is inconsistent with previous experiments.49 Experimental results had suggested a possible contact between the two termini as a stabilizing factor over the capped ends sequence. While PMF results show that the AMBER99sb force field gives a favorable average energy of interaction for a contact between the two termini, we did not observe a significant population of N- and C-termini contacts. The fact that only one replica samples each metastable state precludes any reliable estimate of the proportion of the ensemble that comprises these states. We also cannot reliably predict what the probability of an event resulting in these states

Figure 8. Average free energies of interaction calculated from umbrella sampling and WHAM for the (A) N-terminus−ASP(9) contact, (B) N-terminus−C-terminus contact, (C) LYS(8)−C-terminus contact, (D) ARG(16)−C-terminus contact, (E) ASP(9)−ARG(16) contact, and (F) N-terminus−GLY carbonyl group contact.

and distances are given in Table 2. Also included in the PMF calculations was a model for a simple hydrogen bond between the N-terminus and a GLY carbonyl in order to determine the free energy of attraction associated with a hydrogen bond made with a carbonyl group in the protein backbone itself. The PMF was defined as zero at 1.0 nm, and all values for the PMF are measured relative to this. The N−ASP(9), N−C, and LYS(8)− C contacts all yield attractive (negative with respect to 1.0 nm) PMFs of 1.3 ± 0.5, 1.1 ± 0.6, and 1.4 ± 0.4 kJ/mol, respectively, while ARG(16)−C and ASP(9)−ARG(16) have F

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

(4) Hou, L.; Kang, I.; Marchant, R. E.; Zagorski, M. G. Methionine 35 Oxidation Reduces Fibril Assembly of the Amyloid Aβ-(1−42) Peptide of Alzheimer’s Disease. J. Biol. Chem. 2002, 277, 40173− 40176. (5) Neidigh, J. W.; Fesinmeyer, R. M.; Andersen, N. H. Designing a 20-residue protein. Nat. Struct. Biol. 2002, 9, 425−430. (6) Qiu, L.; Pabit, S. A.; Roitberg, A.; Hagen, S. J. Smaller and Faster: The 20-Residue Trp-Cage Protein Folds in 4 μs. J. Am. Chem. Soc. 2002, 124, 12952−12953. (7) Simmerling, C.; Strockbine, B.; Roitberg, A. E. All-atom structure prediction and folding simulations of a stable protein. J. Am. Chem. Soc. 2002, 124, 11258−11259. (8) Pitera, J. W.; Swope, W. Understanding folding and design: replica exchange simulations of ‘Trp-cage’ fly miniproteins. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 7582−7592. (9) Juraszek, J.; Bolhuis, P. G. Sampling the multiple folding mechanisms of Trp-cage in explicit solvent. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 15859−15864. (10) Zhan, L. X.; Chen, J. Z. Y.; Liu, W. K. Computational study of the Trp-cage miniprotein based on the ECEPP/3 force field. Proteins 2007, 66, 436−443. (11) Hornak, V.; Abel, R.; Okur, A. Comparison of multiple amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712−725. (12) Seibert, M. M.; Patriksson, A.; Hess, B.; van der Spoel, D. Reproducible polypeptide folding and structure prediction using molecular dynamics simulations. J. Mol. Biol. 2005, 354, 173−183. (13) Paschek, D.; Nymeyer, H.; Garcia, A. E. Replica exchange simulations of reversible folding/unfolding of the Trp-cage miniprotein in explicit solvent: on the structure and possible role of internal water. J. Struct. Biol. 2007, 157, 524−533. (14) Patriksson, A.; Adams, C. M.; Kjeldsen, F.; Zubarev, R. A.; van der Spoel, D. A direct comparison of protein structure in the gas and solution phase: the TRP-cage. J. Phys. Chem. B 2007, 111, 13147− 13150. (15) Snow, C. D.; Zagrovic, B.; Pande, V. J. The Trp cage: folding kinetics and unfolded state topology via molecular dynamics simulations. J. Am. Chem. Soc. 2002, 124, 14548−14549. (16) Chowdhury, S.; Lee, M. C.; Xiong, G. M.; Duan, Y. Ab initio folding simulation of the Trp-cage miniprotein approaches NMR resolution. J. Mol. Biol. 2003, 327, 711−717. (17) Huang, X. H.; Hagen, M.; Kim, B.; Friesner, R. A.; Zhou, R. H.; Berne, B. J. Replica exchange with solute tempering: efficiency in large scale systems. J. Phys. Chem. B 2007, 111, 5405−5410. (18) Zhou, R. H. Trp-cage: folding free energy landscape in explicit water. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 13280−13285. (19) Juraszek, J.; Bolhuis, P. G. Rate constant and reaction coordinate of Trp-cage folding in explicit water. Biophys. J. 2008, 95, 4246−4257. (20) Paschek, D.; Hempel, S.; Garcia, A. E. Computing the stability diagram of the Trp-cage miniprotein. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 17754−17759. (21) Kannan, S.; Zacharias, M. Folding simulations of Trp-cage miniprotein in explicit solvent using biasing potential replica-exchange molecular dynamics simulations. Proteins 2009, 76, 448−460. (22) Streicher, W. W.; Makhatadze, G. I. Unfolding thermodyamics of Trp-cage, a 20 residue miniproteinm studied by differential scanning calorimetry and circular dichroism spectroscopy. Biochemistry 2007, 46, 2876−2880. (23) Hatch, H. W.; Stillinger, F. H.; Debenedetti, P. G. Computational Study of the Stability of the Miniprotein Trp-cage, the GB1 βHairpin, and the AK16 Peptide, under Negative Pressure. J. Phys. Chem. B 2014, 118, 7761−7769. (24) Nikiforovich, G. V.; Andersen, N. H.; Fesinmeyer, R. M.; Frieden, C. Possible locally driven pathways of TC5b, a 20-residue protein. Proteins 2003, 52, 292−302. (25) Neuweiler, H.; Doose, S.; Sauer, M. A microscopic view of miniprotein folding: enhanced folding efficiency through formation of an intermediate. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 16650− 16655.

is with the current amount of sampling. This is further supported by the results of 100 independent molecular dynamics simulations using the unfolded metastable state as the starting configuration. Running simulations at the folding temperature, none of the 100 individual trajectories had escaped the metastable state by the end of 150 ns. Since both replicas remain in the metastable states for more than 1 μs in the REMD case, they are very energetically stable, as shown by the energy differences found in Table 1. The results suggest that even the simplest proteins could possess long-lived metastable states that require events rarer than a folding event to sample. It also suggests that the unfolded states do not necessarily equilibrate at the same rate as the folded state. However, neither metastable state is a required step in order to reach the folded state. We have characterized the differences between the structural ensembles of the TC5b Trp-cage with acetylated and Nmethylated ends and with charged ends. The folded structures remain the same. Additionally, it was found that the folding temperature remains the same within the errors observed. In contrast, the charged ends sequence takes significantly longer to equilibrate, suggesting a much rougher folding landscape. Metastable states were determined to be present in the charged ends structural ensemble. However, these structures were not seen in the capped ends simulation. Examining the metastable states from the RMSD histogram, we identify the non-native charge interactions that stabilize the metastable states. PMF calculations predict the formation of the native state ASP(9)− ARG(16) salt bridge and the ARG(16)−C-terminus as the most energetically favorable of all possible attractive charge pairs. However, the centroid structures as well as charge pair histograms fail to show the formation of these charge pairs. Our predictions obviously depend on the choice of force field and simulation methods. However, our calculations show that the Amber99sb force field can reliably describe the structure, folding temperature, and other thermodynamic quantities39 reasonably well. In addition, it shows that small changes at the ends of the sequence can result in energy landscapes with different roughness.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS Portions of the simulations were performed on the Stampede cluster supported by the Extreme Science and Engineering Discovery Environment (XSEDE), grant number MCB130178, which is supported by NSF grant ACI-1053575. The work was funded by NSF MCB-1050966.



REFERENCES

(1) Sanchez-Ruiz, J. M.; Makhatadze, G. I. To charge or not to charge? Trends Biotechnol. 2001, 19, 132−135. (2) Loladze, V. V.; Ibarra-Molero, B.; Sanchez-Ruiz, J. M.; Makhatadze, G. I. Engineering a thermostable protein via optimization of charge-charge interactions on the protein surface. Biochemistry 1999, 38, 16419−16423. (3) Kurnik, M.; Hedberg, L.; Danielsson, J.; Oliveberg, M. Folding without charges. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 5705−5710. G

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B (26) Ahmed, Z.; Beta, I. A.; Mikhonin, A. V.; Asher, S. A. UVresonance Raman thermal unfolding study of Trp-cage shows that it is not a simple two-state miniprotein. J. Am. Chem. Soc. 2005, 127, 10943−10950. (27) Mok, K. H.; Kuhn, L. T.; Goez, M.; Day, I.; Lin, J. C.; Andersen, N. H.; Hore, P. J. A pre-existing hydrophobic collapse in the unfolded state of an ultrafast folding protein. Nature 2007, 447, 106−109. (28) Rogne, P.; Ozdowy, P.; Richter, C.; Saxena, K.; Schwalbe, H.; Kuhn, L. Atomic-Level Structure Characterization of an Ultrafast Folding Mini-Protein Denatured State. PLoS One 2012, 7, 1−13. (29) Canchi, D. R.; Paschek, D.; Garcia, A. E. Equilibrium study of protein denaturation by urea. J. Am. Chem. Soc. 2010, 132, 2338−2344. (30) Canchi, D. R.; Garcia, A. E. Backbone and side-chain contributions in protein denaturation by urea. Biophys. J. 2011, 100, 1526−1533. (31) Canchi, D. R.; Garcia, A. E. Cosolvent effects on protein stability. Annu. Rev. Phys. Chem. 2013, 64, 273−293. (32) Jimenez-Cruz, C. A.; Makhatadze, G. I.; Garcia, A. E. Protonation/deprotonation effects on the stability of the Trp-cage miniprotein. Phys. Chem. Chem. Phys. 2011, 13, 17056−17063. (33) English, C. A.; Garcia, A. E. Folding and unfolding thermodynamics of the TC10b Trp-cage miniprotein. Phys. Chem. Chem. Phys. 2014, 16, 2748−2757. (34) Tian, J.; Garcia, A. E. Simulation studies of protein folding/ unfolding equilibrium under polar and nonpolar confinement. J. Am. Chem. Soc. 2011, 133, 15157−15164. (35) Marino, K. A.; Bolhuis, P. G. Confinement-Induced States in the Folding Landscape of the Trp-cage Miniprotein. J. Phys. Chem. B 2012, 116, 11872−11880. (36) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Kleperis, J.; Dror, R.; Shaw, D. E. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 2010, 78, 1950−1958. (37) Sugita, Y.; Okamoto, Y. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett. 1999, 314, 141−151. (38) Hansmann, U. H. E.; Okamoto, Y. New Monte Carlo algorithms for protein folding. Curr. Opin. Struct. Biol. 1999, 9, 177−183. (39) Day, R.; Paschek, D.; Garcia, A. E. Microsecond simulations of the folding/unfolding thermodynamics of the Trp-cage miniprotein. Proteins 2010, 78, 1889−1899. (40) Hess, B.; Kutzner, C.; van der Spoel, D.; Lindahl, E. GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput 2008, 4, 435−447. (41) Garcia, A. E.; Herce, H. D.; Paschek, D. Simulations of temperature and pressure unfolding of peptides and proteins with replica exchange molecular dynamics. Annu. Rep. Comput. Chem. 2006, 2, 83−95. (42) Darden, T.; Darrin, Y.; Pedersen, L. Particle mesh Ewald: An N log(N) method for Ewald sums in large systems. J. Chem. Phys. 1993, 98, 10089−10092. (43) Jorgensen, W.; Chandrasekhar, J.; Madura, J.; Impey, R.; Klein, M. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926−935. (44) Hess, B.; Bekker, H.; Berendsen, H.; J, F. LINCS: A linear constraint solver for molecular simulations. J. Comput. Chem. 1998, 18, 1463−1472. (45) Daura, X.; Suter, R.; van Gunsteren, W. Validation of molecular simulation by comparison with experiment: rotational reorientation of tryptophan in water. J. Chem. Phys. 1999, 110, 3049−3055. (46) Torrie, G. M.; Valleau, J. P. Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys. 1977, 23, 187−199. (47) Kumar, S.; Bouzida, D.; Swendsen, R.; Kollman, P. A.; Rosenberg, J. The weighted histogram analysis method for free-energy calculations on biomolecules. J. Comput. Chem. 1992, 13, 1011−1021. (48) Frenkel, D.; Smit, B. Understanding Molecular Simulation, 2nd ed.; Elsevier: San Diego, CA, 2002. (49) Barua, B.; Lin, J. C.; Williams, V. D.; Kummler, P.; Neidigh, J. W.; Andersen, N. H. The Trp-cage: optimizing the stability of a globular miniprotein. Protein Eng., Des. Sel. 2008, 21, 171−185.

(50) Levy, R. M.; Dai, W.; Deng, N.; Makarov, D. E. How long does it take to equilibrate the unfolded state of a protein? Protein Sci. 2013, 22, 1459−1465. (51) Deng, N.; Dai, W.; Levy, R. M. How Kinetics within the Unfolded State affects Protein Folding: An Analysis Based on Markov State Models and an Ultra-Long MD Trajectory. J. Phys. Chem. B 2013, 117, 13787−13799. (52) Kabsch, W.; Sander, C. Dictionary of protein seconday structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22, 2577−2637. (53) Williams, D. V.; Byrne, A.; Stewart, J.; Andersen, N. H. Optimal Salt Bridge for Trp-Cage Stabilization. Biochemistry 2011, 50, 1143− 1152.

H

DOI: 10.1021/acs.jpcb.5b02040 J. Phys. Chem. B XXXX, XXX, XXX−XXX