Article Cite This: J. Chem. Inf. Model. 2018, 58, 111−118
pubs.acs.org/jcim
Parallel Tempering of Dark Matter from the Ebola Virus Proteome: Comparison of CHARMM36m and CHARMM22 Force Fields with Implicit Solvent Mark A. Olson* Molecular and Translational Sciences, USAMRIID, Frederick, Maryland 21702, United States ABSTRACT: Intrinsically disordered proteins are characterized by their large manifold of thermally accessible conformations and their related statistical weights, making them an interesting target of simulation studies. To assess the development of a computational framework for modeling this distinct class of proteins, this work examines temperature-based replicaexchange simulations to generate a conformational ensemble of a 28-residue peptide from the Ebola virus protein VP35. Starting from a prefolded helix−β-turn−helix topology observed in a crystallographic assembly, the simulation strategy tested is the recently refined CHARMM36m force field combined with a generalized Born solvent model. A comparison of two replica-exchange methods is provided, where one is a traditional approach with a fixed set of temperatures and the other is an adaptive scheme in which the thermal windows are allowed to move in temperature space. The assessment is further extended to include a comparison with equivalent CHARMM22 simulation data sets. The analysis finds CHARMM36m to shift the minimum in the potential of mean force (PMF) to a lower fractional helicity compared with CHARMM22, while the latter showed greater conformational plasticity along the helix-forming reaction coordinate. Among the simulation models, only the adaptive tempering method with CHARMM36m found an ensemble of conformational heterogeneity consisting of transitions between α-helix−β-hairpin folds and unstructured states that produced a PMF of fractional fold propensity in qualitative agreement with circular dichroism experiments reporting a disordered peptide.
1. INTRODUCTION Intrinsically disordered proteins (IDPs) that encompass “dark matter” proteomes play a fundamental role in the regulation and function of protein association networks.1−4 Their hallmark is large-scale conformational heterogeneity when free in solution while finding a folded topology upon forming a multimeric assembly. An illustrative example is a 28-residue peptide region extracted from the Ebola virus protein VP35.5 The X-ray crystallographic structure of the peptide (designated as NPBP) bound to the Ebola virus protein NP shows a helix−β-turn−helix topology. When free in solution, NPBP transitions to a disordered ensemble as observed by circular dichroism (CD) spectroscopy.5 What makes the disorder− order transition of NPBP of larger interest is that when NPBP is added to a solution of 50% trifluoroethanol (TFE), the CD spectrum shows that the peptide transitions from an unstructured ensemble to structures containing helical folds. The hidden propensity of NPBP and conceivably that of many other IDPs makes atomistic simulation studies challenging to capture a heterogeneous conformational ensemble without being overly biased by the susceptibility to fold. The challenge is amplified by the inherent deficiencies of all-atom force fields and solvent models that tend to overstabilize fold propensities. The topic of this work is to test the recently reported CHARMM36m force-field parametrization6 in a simulation of NPBP using a temperature-based replica-exchange (ReX) method.7 The force field is a refinement of an earlier version This article not subject to U.S. Copyright. Published 2017 by the American Chemical Society
to improve the accuracy in polypeptide backbone conformational ensembles for IDPs. Although the development of CHARMM36m is primarily intended for explicit-solvent simulations, the work here applies the force field with a generalized Born (GB) solvent approximation represented by the GBMV2 model.8 GBMV2 is one of the most accurate GB models9 and reproduces the thermal stabilities of small α-helical and β-folded proteins observed experimentally.10−12 Given the continued evaluation of force fields13,14 and optimization studies of GB models,15 it is of general interest to determine whether the combined CHARMM36m/GBMV2 simulation strategy provides a framework for modeling IDPs.16−18 While GBMV2 is a computationally efficient model compared with explicit-solvent simulations, there remains the question of whether implicit-solvent models can correctly shift the density of states of a prefolded IDP on an energy landscape hindered by multiple kinetic traps to an ensemble of disordered states favored by configurational entropy. To help with the assessment of the CHARMM36m/GBMV2 simulation model, two different strategies of applying ReX methods are compared, where one is a traditional approach with a fixed set of sampling temperatures and the other is an adaptive scheme in which the windows are allowed to dynamically move in temperature space between two temperReceived: August 29, 2017 Published: November 29, 2017 111
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
To model the movement of clients, a thermal current j can be defined from eq 3 by
ature extremes. The underlying idea of the adaptive-T method is an enhanced sampling technique to enrich the population of windows and their exchanges near a so-called “phase transition” temperature to improve basin-hopping across conformational states. The scheme was first proposed by Hansmann and coworkers,19 and Troyer and co-workers.20 Here the implementation of the adaptive-T ReX algorithm is based on earlier work of modeling the unfolding−folding dynamics of the 57-residue protein SH3.21 In that study it was observed that the adaptive-T approach produced a significantly lower melting transition temperature than the traditional fixed-T method, leading to excellent agreement with experimental observations. Other noted applications of the adaptive-T ReX method include structure refinement of knowledge-based protein models22,23 and simulation analysis of the trapping of a metastable conformational state in a crystallographic protein homodimeric assembly.24 Given the broad applicability of the adaptive-T ReX simulation strategy for modeling protein folding events, it is of general interest to determine whether this enhanced sampling approach is beneficial in generating an ensemble of less-cooperative transitions that are known to govern IDPs. To extend the work reported here, the CHARMM36m/GBMV2 simulation results are compared to a re-examination of CHARMM22/GBMV2 simulation data sets taken from an earlier study.25 The comparison centers on computing potentials of mean force (PMFs) using the parallel-tempering weighted histogram analysis method26 (PTWHAM) and the multistate Bennett acceptance ratio (MBAR) method.27
j = D(T )η(T )
⎛ T ⎞[2/(N − 1)] Ti + 1 ≤ ⎜ max ⎟ Ti ⎝ Tmin ⎠
pi̇ = fi − γi pi + R i + λ gi
gi = ⟨pi⟩L
(7)
where ⟨...⟩L denotes a local average. The interval is further defined as L = tL/δt, where tL is the local averaging time and δt is the time step along the simulation trajectory. The simulation methodology and parameter set applied here are similar to that given in an earlier study of the NPBP peptide using the CHARMM22 force field with the GBMV2 solvent model. 25 Here CHARMM36m was applied using the CHARMM simulation program31 (version c41b1) and the utilities and programming libraries of the Multiscale Modeling Tools for Structural Biology (MMTSB).32 An integration time step of 2 fs was used. In regard to the SGLD parameters, the friction constant γi was set to 1 ps−1 for all heavy atoms, the guiding factor λ in eq 6 was set to a value of 1, and the time average of the momentum (tL) was set to 1 ps. Nonbonded interaction cutoff parameters for electrostatics and nonpolar terms were set at a radius of 22 Å with a 2 Å potential switching function. Covalent bonds between heavy atoms and hydrogen atoms were constrained using the SHAKE algorithm.33 The GBMV2 parameters within MMTSB were set to gbmvabeta = −12 and gbmvap3 = 0.65. Modifications of these parameters were first explored by Chocholoušová and Feig34 in a study of simulation strategies to enhance energy conservation in the application of GBMV2 in molecular dynamics. The parameter gbmvabeta is defined as a scaling factor to make the molecular volume sharper or smoother at
(1)
(2)
where βa = 1/kBTa, in which kB is Boltzmann’s constant and Ta is the temperature of replica client a, and Ea is the potential energy of client a (similar expressions apply for client b). In an adaptive scheme, eq 1 is replaced by an algorithm that attempts to maximize the number of times replica clients progress in round trips from Tmin to Tmax. Depending on the last temperature extreme visited, each replica client is labeled as either cold or hot.21 Histograms ncold(T) and nhot(T) over temperature space are constructed to collect the numbers of clients visiting each temperature window. The fraction of cold clients visiting temperature T is modeled as ncold(T ) ncold(T ) + nhot(T )
(6)
where ṗi is the rate of change of the momentum of particle i, fi is the force acting on the particle, γi is the friction constant, Ri defines the random force, and gi is a memory function, which is scaled by an ad hoc guiding factor λ. The memory function gi is defined by the moving average of the momentum seen by the system over an ensemble interval L:
where i = 1, 2, ..., N and the number of replica clients is given by N. After a specified number of simulation integration time steps, neighboring replica clients a and b swap temperatures with a probability given by the Metropolis energy criterion:28
f (T ) =
(5)
where the lower and upper values of Ti are set to Tmin and Tmax, respectively.21 CHARMM-Based Simulations. For conformational sampling of the 28-residue NPBP peptide, the self-guided Langevin dynamics (SGLD) method developed by Wu and Brooks29,30 was combined with ReX using a strategy first reported in modeling of the Trp-cage mini-protein.11 The SGLD equation of motion is given by
T1 = Tmin
p(a ↔ b) = min[1, e(βa − βb)(Eb − Ea)]
(4)
where D(T) is the diffusivity and η(T) is the probability that any client will reside at T. The current j can be maximized by altering the temperatures such that f(Ti) increases linearly as a function of the temperature index i. A continuous f(T) is constructed from an interpolation of the computed values at the current Ti, and then a search is conducted for the new temperatures where f(T) = i/(N − 1). To avoid clustering of all of the windows around the same temperature, a constraint is introduced such that no neighboring temperatures can be more than two geometric spacing units apart:
2. METHODS Replica-Exchange Strategies. The temperatures of the replica clients in a traditional approach are fixed at specific values during the simulations and can be modeled as a geometrically spaced set of N − 1 intervals from the minimum temperature Tmin to the maximum temperature Tmax: ⎛ Tmax ⎞[1/(N − 1)] Ti + 1 = Ti ⎜ ⎟ ⎝ Tmin ⎠
df dT
(3) 112
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
Illustrated in Figure 1 are PMFs at 300 K evaluated using PTWHAM. Comparison between the fixed-T ReX and the
the protein−solvent dielectric boundary. Likewise, the scaling parameter gbmvap3 can be adjusted to match Poisson solvation energies. The effect of varying both parameters in molecular dynamics simulations was presented in a study on modeling of the heat capacity of SH3 by Yeh et al.10 The last parameter in GBMV2 is the hydrophobic cavitation term, which was modeled by applying a phenomenological surface tension coefficient set to a value of 0.015 kcal mol−1 Å−2. Simulations were carried out using 24 replica clients, and the frequency of exchanges was set to 1 ps−1. Temperatures were geometrically spaced between Tmin = 300 K and Tmax = 475 K. The NPBP peptide was modeled for 200 ns of simulation time per client, generating an ensemble of 4.8 μs. There were 200 000 culled structures per temperature, which were used in the analysis to compute the secondary structure by the DSSP algorithm35 and the radius of gyration (Rg). PMFs (denoted as the measure WT at temperature T) were computed using order parameters of fractional helicity ( f H), Rg, and Z-score of the potential energies (E) from the CHARMM36m and CHARMM22 force fields as input to PTWHAM and MBAR calculations. Further evaluations were applied using an order parameter of fractional fold comprising α−β secondarystructure elements from DSSP. It should be noted that the PMFs generated by the SGLD method are prone to small deviations from a canonical description due to the ad hoc force term in eq 6, which is added to achieve enhanced sampling (see, e.g., refs 11 and 30). While an algorithmic scheme has been reported to reweight the biases of local averages of forces and momenta,36 the application is cumbersome and exceeds the purpose of this work, where the deviation is thought to be small for a peptide.11 Further analysis of the generated conformations was conducted using MMTSB to evaluate the population density of states by clustering methods that included hierarchical clustering (conformational and potential energy values) and kmeans clustering (conformational). Additional examination was conducted of the root-mean-square deviation (RMSD) in selected backbone angles Φ and Ψ from the initial folded peptide structure. The RMSD values were computed across the ensemble for two peptide segments and provided as input into a PTWHAM calculation.
3. RESULTS AND DISCUSSION The initial conformation of the NPBP peptide (numbered from residues 20 to 47) bound to Ebola virus protein NP has the topology of a helix−β-turn−helix fold and shows a fractional helicity of f H = 0.43 from DSSP. The two helices are partitioned as Trp28 to Thr35 and Val40 to Ile43. The initial fold compactness is given by the dimension Rg = 10 Å and appears more collapsed than an estimate for a comparable unfolded 28-residue peptide showing Rg ∼ 13 Å.37 Potentials of Mean Force for Helix Formation. Given the reported experimental CD spectrum without TFE showing the lack of α-helix formation, in contrast to that found in the bound NPBP and the observation with TFE in stabilizing a hidden helix propensity,5 the simulation models were first assessed by the calculation of PMFs using an order parameter of fractional helicity. While the CD spectroscopy data provide a qualitative description of the ensemble of dynamic conformations, comparison with simulations give valuable insight into the accuracy of the force fields and sampling methods to model the formation of secondary structure elements and their transitions.
Figure 1. Conformational and energy landscapes computed from the four CHARMM/GBMV2 replica-exchange simulation models: (A) CHARMM36m fixed-T ReX, (B) CHARMM36m adaptive-T ReX, (C) CHARMM22 fixed-T ReX, and (D) CHARMM22 adaptive-T ReX. The order parameters are fractional helicity, radius of gyration, and Z-score of potential energies. The plots display data extracted at the sampling temperature T = 300 K. In the displayed free energy scale, the color blue represents minima in the PMFs and is followed in order by the colors green, yellow, and red, the last of which denotes high-energy states of low population.
adaptive-T method for CHARMM36m shows a helical fold profile with an overall similar formation for the two sampling algorithms. For the fixed-T method, the minimum in WT( f H, Rg) is observed at f H = 0.26 and Rg = 7.7 Å, while the adaptive-T ReX gives an identical location of f H = 0.26 and Rg = 7.8 Å, thus showing convergence in searching for the minimum. Nevertheless, several differences between the two simulation models can be detected. The transition from the minimum in WT(f H, Rg) to a unstructured state where f H = 0 113
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
Figure 2. Data extracted from parallel tempering. (A) Final replica-exchange temperature sets for the four model simulation methodologies. The blue-colored line and symbols represent the adaptive-T method with the CHARMM36m force field, and the red-colored line and symbols depict CHARMM22 with adaptive-T sampling; the black-colored line and symbols depict the fixed-T method using either force field. (B) Average temperature over 24 replica clients during the simulation for each force field and simulation sampling method. Colored lines have the same meaning as in (A). (C) Metropolis exchange rates computed for each simulation. Blue color denotes the CHARMM36m adaptive-T sampling simulation, red color denotes the CHARMM22 adaptive-T method, green color denotes CHARMM36m using the fixed-T method, and purple color represents the CHARMM22 fixed-T method. (D) Fractions of cold replica clients as functions of the temperature index. Blue- and red-colored lines and symbols denote CHARMM36m and CHARMM22 adaptive-T simulations, respectively, and the black dashed line represents a linear relationship.
and Rg is set to an identical value as in the minimum yields for the fixed-T ReX a unfolding free energy difference of 3.3kBT and for the adaptive approach a value of 2.2kBT. An appearance of this lower-free-energy transition is noticeable through changes in the potential energy Z-score profiles, where the adaptive method provides greater population density at low fractional helicity. There are significant distinctions between the PMFs calculated from CHARMM36m and CHARMM22 as well as between those for the fixed-T ReX and adaptive-T methods for CHARMM22. For the fixed-T method, CHARMM22 shows the minimum at f H = 0.53 and Rg = 9.8 Å. A connecting local minimum is detected at f H = 0.37 and Rg = 7.9 Å with a free energy difference of only 0.05kBT from the global minimum. By comparison, the adaptive-T ReX method for CHARMM22 yields a minimum at f H = 0.37 and Rg = 8.8 Å with a nearby basin at f H = 0.16 and Rg = 8.8 Å) with a 0.13kBT free energy difference from the minimum. The conformational transition from the minimum to the unstructured state WT(f H = 0, Rg) for CHARMM22 requires 1.1kBT using the fixed-T approach and 1.5kBT using the adaptive-T method. Both values are noticeably lower than the corresponding values from CHARMM36m, indicating differences in basin funnel depth. From the plots in Figure 1, CHARMM36m lowers the weight of helix propensity compared with CHARMM22, yet the conformational plasticity along the helix-forming order parameter is favored by CHARMM22. The application of the adaptive-T ReX method for CHARMM22 shows a manifold of connecting multiple basins, while sampling from CHARMM36m is more confined to discrete states and shows
a strong bias in funneling conformations to a helical basin that can be characterized as a large kinetic trap. The distinction is evident in comparing the potential energy Z-score landscapes, where CHARMM22 simulation shuttles conformations among the major basins through a pathway of favorable exchanges among the probability density distribution. To check the accuracy of PTWHAM and further characterize the energy landscape topology in applying the CHARMM force fields with GBMV2, WT( f H, Rg) was recalculated using the MBAR method. The analysis shows overall consistency for CHARMM36m, with a minimum at f H = 0.26 and Rg = 8.0 Å, equivalent to that calculated from PTWHAM, and similar values for CHARMM22 to yield a minimum at f H = 0.46 and Rg = 8.6 Å. One notable difference for both force fields and sampling methods is that MBAR yields deeper helical basins when transitioning to unstructured conformations. The analysis shows ∼8k B T for CHARMM36m and ∼4k B T for CHARMM22. Nevertheless, the rank order of free energies that model transitions among basins from MBAR is similar to that found by PTWHAM. Finally, for each simulation model, comparing WT(f H, Rg) over the last 50 ns gives an estimate of the statistical variance in determining contours of the major free energy basins and shows an approximate upper-bound variance of ±0.8kBT. Analysis of Parallel Tempering Strategies. To develop a better understanding of the methodologies in parallel tempering and their application to conformational sampling, Figure 2 displays data extracted from the simulations. The first plot (Figure 2A) illustrates the final replica-exchange temperature profiles for the fixed-T ReX and adaptive-T methods. The 114
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
results show that allowing the windows to move in temperature space during a simulation produces significant deviations from the traditional method, and the dissimilarity depends on the applied force field. Application of CHARMM36m yields a broad-spanning cluster of windows at a lower temperature profile than that observed from CHARMM22. Figure 2B shows the average temperature window computed over the 24 clients along the simulation trajectory. Combined, the plots offer a unique perspective on the force fields and provide a measure of temperature-induced biases required to achieve conformational diffusion across potential energy barriers in locating the free energy minimum. Figure 2C illustrates the replica exchange rates, and as expected, the fixed-T method shows a flat exchange profile for both force fields. Of the two adaptive-T simulations, CHARMM36m yields a maximum exchange of ∼96% in a critical sampling regime of the potential energy, while for CHARMM22 the maximum is ∼93%. In both, the high exchanges are compensated with significantly reduced exchanges at the temperature end points to prevent aggregate clustering of all clients.21 As an estimate of the overall exchanges and the mobility of replica clients, Figure 2D shows the fractions of cold clients f(T) computed from eq 3 for the CHARMM36m and CHARMM22 simulations. The simulations follow an optimal linear relationship anticipated for IDPs because of a lack of a sharp phase transition temperature common for two-state folding events.21 The drop at high temperatures, while noticeably less abrupt than that observed for unfolding proteins with well-defined tertiary folds,21 is attributed to limited round-trip excursions by the clients between the extremes, typifying a universal problem of parallel sampling methods when applying all-atom protein models. Peptide Fold Analysis. Displayed in Figure 3A is the starting NPBP conformation bound to the Ebola virus protein NP. As noted above, NPBP is composed of two helical regions, residues Trp28 to Thr35 (denoted as α1) and residues Val40 to Ile43 (denoted as α2). Illustrated in Figure 3B are conformations extracted from hierarchical clustering of the CHARMM36m/GBMV2-generated ensemble selected at 300 K from the adaptive-T simulation. Shown are population estimates (P) for each cluster together with extracted structures that best represent the median energies from the CHARMM36m/GBMV2 potential energy function. A comparison is made with structures from CHARMM22/GBMV2 adaptive-T sampling (Figure 3C). The CHARMM36m/GBMV2 adaptive-T simulation shows that the conformation located at the sampled potential energy minimum (Emin) contains a short helix ( f H = 0.25) in the α1 region. For the two largest clusters, short helical conformations are likewise observed in α1; however, one of the clusters transitions to conformers exhibiting short β-hairpins in the Cterminal region, including α2. Further transitions are observed among the clusters in both extending and shorting the helical length. By comparison, the conformation at Emin of the CHARMM22/GBMV2 adaptive-T sampling simulation contains a helix−turn−helix fold with f H = 0.46, and the largest cluster is of similar general fold. Figure 3D reports the Cα RMSDs of structures culled from the simulation models at 300 K relative to the prefolded NPBP conformation. The analysis shows that the RMSD values are primarily in the range of 5−9 Å, and the CHARMM36m/GBMV2 model with adaptive sampling yields the highest net value while the adaptive method
Figure 3. Analysis of peptide folds. (A) The initial prefolded conformation observed in the crystallographic assembly, where the small peptide represents NPBP and the molecular surface represents Ebola virus NP. (B) Clustering of conformations taken at T = 300 K from the CHARMM36m/GBMV2 adaptive-T ReX simulation. The energy Emin denotes the minimum conformational potential energy sampled from the simulation. Values of P report the population percentages for the illustrated clusters. (C) Conformations extracted from the CHARMM22/GBMV2 adaptive-T ReX simulation. (D) Plots of Cα RMSD (in Å) vs conformation index at T = 300 K from the prefolded conformation. The blue-colored line and gray data set represent the CHARMM36m/GBMV2 adaptive-T simulation results, the black-colored line represents the CHARMM36m/GBMV2 fixed-T results, the red-colored line represents the CHARMM22/GBMV2 adaptive-T results, and the purple-colored line represents the CHARMM22/GBMV2 fixed-T simulation.
using CHARMM22/GBMV2 accounts for the lowest value. All of the model simulations show good convergence with statistical deviations no greater than ±1 Å. The observation of β-hairpins in conformational sampling of NPBP is unique to the CHARMM36m/GBMV2 energy function. Moreover, the population density depends on the ReX method for sampling. When the fixed-T simulation approach is applied, structures culled at 300 K are observed to yield a total P = 5% for populating β-hairpins, whereas the adaptive method found a strikingly higher population of P = 42%. Of the latter data set, sampled conformations are largely three-stranded β-hairpins. The significant difference between the two sampling methods can conceivably be understood from the temperature profiles reported in Figure 2A,B. Maintaining a large pool of thermal windows with high Metropolis exchanges around ∼350 K rather than annealing via stepwise hierarchical windows allows for more localized diffusion and thus greater attempts confined at basin-hopping transitions to overcome an 115
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
Figure 4. Population distribution landscapes and extracted conformations from the CHARMM36m/GBMV2 adaptive-T ReX results. (A) Plot of RMSDs in (Φ, Ψ) space computed from the prefolded NPBP conformation for residues Ser30-Glu31-Gln32. (B) Plot of RMSDs for residues Val40Ser41-Asp42. (C) Conformational landscape of fractional fold containing α-helix and β-hairpin structures vs radius of gyration. (D) Conformations taken from the CHARMM36m/GBMV2 adaptive-T simulation. The colors applied in the PMFs and their scales are noted in Figure 1.
III). Other conformers with low-energy transitions are illustrated in Figure 4D. The thermally accessible III−IV transition in WT( f H, Rg) is illustrative of the dynamic conformational heterogeneity found using the CHARMM36m/GBMV2 adaptive-T model and is consistent with the CD spectrum reported without TFE.5 Moreover, the detected α−β folded state is surprisingly consistent with unbiased predictions of secondary structure by a consensus approach.38 The lack of β-folded states in the CD spectrum with TFE is not unexpected because the additive compound is known to preferentially overstabilize the formation of helical structures. Fold Stiffness and Compactness Propensity. Illustrated in Figure 5 are plots of peptide fold formation in terms of helicity and compactness as functions of sampling temperature. Shown are statistical averages over data sets for CHARMM36m and CHARMM22 generated from both ReX methods. A separate plot of fold formation from the CHARMM36m/ GBMV2 adaptive-T simulation that reports the contribution of α−β conformations is included. Also shown in both plots is an assessment of the statistical averages for the CHARMM36m fixed-T ReX simulation by overlaying values of f H and Rg where WT( f H, Rg) = 0 along the temperature profile computed from MBAR. As anticipated from the PMFs, the analysis shows the CHARMM22 models to exhibit an extended thermal stability in helix formation greater than CHARMM36m (Figure 5A). Unlike CHARMM22, CHARMM36m shows only a minor
entropy-driven barrier from cooperative interactions of forming a short β-hairpin. To further investigate the folding transitions found by CHARMM36m/GBMV2 using adaptive-T sampling, Figure 4 illustrates a PTWHAM assessment of RMSDs in the backbone angles Φ and Ψ from the prefolded peptide for the threeresidue segments Ser30-Glu31-Gln32 (α1 region) and Val40Ser41-Asp42 (α2 region). The analysis finds for α1 (Figure 4A) the highest populated basin to be located at small RMSD differences while overall the landscape shows conformational populations spread out among the profile. For α2 (Figure 4B), a strikingly different result is obtained. The landscape shows a “hot” population region of deviations largely removed from the initial structure, containing multiple transitions among short α2 helix folds and β-hairpins. Figure 4C shows the conformational probability distribution computed from PTWHAM using an order parameter of fractional fold, rather than helicity as shown in Figure 1B. Changing the order parameter to include β-folds reveals a significant finding of shifting the minimum in WT(f H, Rg) to f H = 0 and Rg = 10.4 Å. The weighted probability distribution calculated by modifying the order parameter reveals that the potential energies in PTWHAM are strongly correlated with fractional fold propensity. Transition from the minimum to the populated basin at WT( f H = 0.47, Rg = 7.9 Å) in Figure 4C yields a low free energy barrier of only 0.03kBT for interconversion between the unfolded state (designated as IV in Figure 4D) and an α1-helix−β-hairpin fold (designated as 116
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
4. CONCLUSIONS This study investigated the application of the CHARMM36m force field with the GBMV2 implicit-solvent model in a replicaexchange simulation to calculate the conformational ensemble of a 28-residue IDP from the Ebola virus protein VP35. A comparison of two parallel tempering methods was provided, where one was a standard fixed-T sampling protocol and the other was a strategy in which the temperature windows move in temperature space, adapting to conformational sampling of the potential energy function. Comparisons were made with equivalent CHARMM22/GBMV2 simulation data sets. The central issue of the study is the applicability of potential energy functions applied in parallel tempering algorithms as a computational approach for modeling of large-scale conformational heterogeneity. The measure of success was the accuracy in replicating a disordered conformational ensemble of the peptide with a large manifold of thermally accessible states as reported from CD experiments. Starting from a helix−turn− helix topology, the results revealed that CHARMM36m using tempering methods produced a PMF of lower fractional helicity than CHARMM22, yet the latter force field captured more conformational plasticity along the helix-forming reaction coordinate. When the order parameter was changed to include β-sheet folding in addition to α-helix formation, only the adaptive tempering method with CHARMM36m found highly populated α-helix−β-hairpin transitions that led to a significant increase in conformational heterogeneity derived from multiple discrete states. With this model, a PMF of fractional fold propensity was calculated and revealed a free energy minimum containing an unstructured state, in qualitative uniformity with CD experiments showing the lack of folded structures.
Figure 5. Fold stiffness and compactness of generated conformations by the simulation models along the temperature profile. (A) Average fractional fold vs T. The red-colored line and circles represent a statistical average of f H from the CHARMM36m/GBMV2 fixed-T simulation, and the green-colored line and symbols represent f H values from the CHARMM36m/GBMV2 adaptive-T simulation. The black line and symbols represent the net peptide fold ( f H + β-hairpins) from the CHARMM36m/GBMV2 adaptive-T simulation, and the bluecolored line and symbols represent values of f H where WT( f H, Rg) = 0 for the CHARMM36m/GBMV2 fixed-T data set computed using MBAR. The dashed line represents CHARMM22/GBMV2 fixed-T results, and the purple-colored line and symbols represent the CHARMM22/GBMV2 adaptive-T simulation. (B) Average Rg vs T using the line colors and symbols noted above. Typical standard deviations are given by the CHARMM36m/GBMV2 fixed-T simulation: for f H, the standard deviation is ∼0.10 at T = 300 K and ∼0.02 at T = 475 K; for Rg, the standard deviation is ∼0.38 Å at T = 300 K and ∼2.64 Å at T = 475 K. The error in determining the minima in WT(f H, Rg) from MBAR along the temperature coordinate is ∼0.2kBT.
■
AUTHOR INFORMATION
Corresponding Author
*E-mail:
[email protected]. ORCID
Mark A. Olson: 0000-0002-5259-0343 Notes
The author declares no competing financial interest.
■
ACKNOWLEDGMENTS Financial support for this work was provided by the U.S. Department of Defense Threat Reduction Agency. The opinions or assertions contained herein are the private views of the author and are not to be construed as official or to reflect the views of the U.S. Army or the U.S. Department of Defense. This article has been approved for public release with unlimited distribution.
difference between the adaptive-T and fixed-T samplings of helicity. By including β-hairpin formation, the adaptive-T CHARMM36m results show a higher initial contribution followed by decay, consistent with the transient nature of cooperative interactions of stabilizing a β-hairpin. This is followed by a profile similar to the fixed-T sampling for CHARMM36m. One of the concerns of applying implicit-solvent descriptions to modeling of large-scale conformational heterogeneity is that the sampled structures will be overly collapsed (see, e.g., the comparison of solvent models in ref 25). Figure 5B shows that all of the simulation models are plagued by extended compactness of the generated conformations. As a noted benchmark, the prefolded peptide conformation shows Rg ∼ 10 Å in the multimeric complex. Only the adaptive-T method with CHARMM36m found a free energy minimum of an unstructured state with Rg ∼ 10 Å, in reasonable accord with a typical unfolded 28-residue peptide showing Rg ∼ 13 Å.37 Conceivably to help avoid generating overcollapsed conformations, incorporating a volume-based nonpolar solvation free energy term in GBMV2 could offset unfavorable biases.39
■
REFERENCES
(1) Bhowmick, A.; Brookes, D. H.; Yost, S. R.; Dyson, H. J.; FormanKay, J. D.; Gunter, D.; Head-Gordon, M.; Hura, G. L.; Pande, V. S.; Wemmer, D. E.; Wright, P. E.; Head-Gordon, T. Finding Our Way in the Dark Proteome. J. Am. Chem. Soc. 2016, 138, 9730−9742. (2) Arai, M.; Sugase, K.; Dyson, H. J.; Wright, P. E. Conformational Propensities of Intrinsically Disordered Proteins Influence the Mechanism of Binding and Folding. Proc. Natl. Acad. Sci. U. S. A. 2015, 112, 9614−9619. (3) Wright, P. E.; Dyson, H. J. Intrinsically Unstructured Proteins: Re-Assessing the Protein Structure-Function Paradigm. J. Mol. Biol. 1999, 293, 321−331. (4) Dyson, H. J.; Wright, P. E. Intrinsically Unstructured Proteins and their Functions. Nat. Rev. Mol. Cell Biol. 2005, 6, 197−208.
117
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118
Journal of Chemical Information and Modeling
Article
of Single Domain Antibodies? Protein Eng., Des. Sel. 2015, 28, 395− 402. (24) Olson, M. A.; Legler, P. M.; Goldman, E. R. Comparison of Replica Exchange Simulations of a Kinetically Trapped Protein Conformational State and its Native Form. J. Phys. Chem. B 2016, 120, 2234−40. (25) Olson, M. A. On the Helix Propensity in Generalized Born Solvent Descriptions of Modeling the Dark Proteome. Front. Mol. Biosci. 2017, 4, 3. (26) Chodera, J. D.; Swope, W. C.; Pitera, J. W.; Seok, C.; Dill, K. A. Use of the Weighted Histogram Analysis Method for the Analysis of Simulated and Parallel Tempering Simulations. J. Chem. Theory Comput. 2007, 3, 26−41. (27) Shirts, M. R.; Chodera, J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129, 124105−10. (28) Metropolis, N.; Rosenbluth, A.; Rosenbluth, M.; Teller, A.; Teller, E. Equation of State Calculations by Fast Computing Machines. J. Chem. Phys. 1953, 21, 1087−1092. (29) Wu, X.; Brooks, B. R. Self-Guided Langevin Dynamics Simulation Method. Chem. Phys. Lett. 2003, 381, 512−518. (30) Wu, X.; Damjanovic, A.; Brooks, B. R. Efficient and Unbiased Sampling of Biomolecular Systems in the Canonical Ensemble: A Review of Self-Guided Langevin Dynamics. Adv. Chem. Phys. 2012, 150, 255−326. (31) Brooks, B. R.; Brooks, C. L., III; Mackerell, A. D., Jr.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; et al. CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30, 1545−1614. (32) Feig, M.; Karanicolas, J.; Brooks, C. L., 3rd. MMTSB Tool Set: Enhanced Sampling and Multiscale Modeling Methods for Applications in Structural Biology. J. Mol. Graphics Modell. 2004, 22, 377−395. (33) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular Dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327−341. (34) Chocholoušová, J.; Feig, M. Balancing an Accurate Representation of the Molecular Surface in Generalized Born Formalisms with Integrator Stability in Molecular Dynamics Simulations. J. Comput. Chem. 2006, 27, 719−29. (35) Kabsch, W.; Sander, C. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 1983, 22, 2577−2637. (36) Wu, X.; Brooks, B. R. Toward Canonical Ensemble Distribution from Self-Guided Langevin Dynamics Simulation. J. Chem. Phys. 2011, 134, 134108. (37) Kohn, J. E.; Millett, I. S.; Jacob, J.; Zagrovic, B.; Dillon, T. M.; Cingel, N.; et al. Random-coil Behavior and the Dimensions of Chemically Unfolded Proteins. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 12491−12496. (38) Kieslich, C. A.; Smadbeck, J.; Khoury, G. A.; Floudas, C. A. conSSert: Consensus SVM Model for Accurate Prediction of Ordered Secondary Structure. J. Chem. Inf. Model. 2016, 56, 455−461. (39) Lee, M. S.; Olson, M. A. Comparison of Volume and Surface Area Nonpolar Solvation Free Energy Terms for Implicit Solvent Simulations. J. Chem. Phys. 2013, 139, 044119.
(5) Leung, D. W.; Borek, D.; Luthra, P.; Binning, J. M.; Anantpadma, M.; Liu, G.; Harvey, I. B.; Su, Z.; Endlich-Frazier, A.; Pan, J.; Shabman, R. S.; Chiu, W.; Davey, R. A.; Otwinowski, Z.; Basler, C. F.; Amarasinghe, G. K. An Intrinsically Disordered Peptide from Ebola Virus VP35 Controls Viral RNA Synthesis by Modulating Nucleoprotein-RNA Interactions. Cell Rep. 2015, 11, 376−389. (6) Huang, J.; Rauscher, S.; Nawrocki, G.; Ran, T.; Feig, M.; de Groot, B. L.; Grubmüller, H.; MacKerell, A. D., Jr. CHARMM36m: An Improved Force Field for Folded and Intrinsically Disordered Proteins. Nat. Methods 2017, 14, 71−73. (7) Ishikawa, Y.; Sugita, Y.; Nishikawa, T.; Okamoto, Y. Ab Initio Replica Exchange Monte Carlo Method for Cluster Studies. Chem. Phys. Lett. 2001, 333, 199−206. (8) Lee, M. S.; Feig, M.; Salsbury, F. R., Jr.; Brooks, C. L., 3rd. New Analytic Approximation to the Standard Molecular Volume Definition and its Application to Generalized Born Calculations. J. Comput. Chem. 2003, 24, 1348−1356. (9) Feig, M.; Onufriev, A.; Lee, M. S.; Im, W.; Case, D. A.; Brooks, C. L., III. Performance Comparison of Generalized Born and Poisson Methods in the Calculation of Electrostatic Solvation Energies for Protein Structures. J. Comput. Chem. 2004, 25, 265−284. (10) Yeh, I. C.; Lee, M. S.; Olson, M. A. Calculation of Protein Heat Capacity from Replica-Exchange Molecular Dynamics Simulations with Different Implicit Solvent Models. J. Phys. Chem. B 2008, 112, 15064−15073. (11) Lee, M. S.; Olson, M. A. Protein Folding Simulations Combining Self-Guided Langevin Dynamics and Temperature-Based Replica Exchange. J. Chem. Theory Comput. 2010, 6, 2477−2487. (12) Chaudhury, S.; Olson, M. A.; Tawa, G.; Wallqvist, A.; Lee, M. S. Efficient Conformational Sampling in Explicit Solvent using a Hybrid Replica Exchange Molecular Dynamics Method. J. Chem. Theory Comput. 2012, 8, 677−687. (13) Song, D.; Wang, W.; Ye, W.; Ji, D.; Luo, R.; Chen, H.-F. ff14IDPs Force Field Improving the Conformation Sampling of Intrinsically Disordered Proteins. Chem. Biol. Drug Des. 2017, 89, 5− 15. (14) Ye, W.; Ji, D.; Wang, W.; Luo, R.; Chen, H.-F. Test and Evaluation of ff99IDPs Force Field for Intrinsically Disordered Proteins. J. Chem. Inf. Model. 2015, 55, 1021−1029. (15) Lee, K. H.; Chen, J. Optimization of the GBMV2 Implicit Solvent Force Field for Accurate Simulation of Protein Conformational Equilibria. J. Comput. Chem. 2017, 38, 1332−1341. (16) Best, R. B. Computational and Theoretical Advances in Studies of Intrinsically Disordered Proteins. Curr. Opin. Struct. Biol. 2017, 42, 147−154. (17) Chong, S. H.; Chatterjee, P.; Ham, S. Computer Simulations of Intrinsically Disordered Proteins. Annu. Rev. Phys. Chem. 2017, 68, 117−134. (18) Bhowmick, A.; Brookes, D. H.; Yost, S. R.; Dyson, H. J.; Forman-Kay, J. D.; Gunter, D.; Head-Gordon, M.; Hura, G. L.; Pande, V. S.; Wemmer, D. E.; Wright, P. E.; Head-Gordon, T. Finding our Way in the Dark Proteome. J. Am. Chem. Soc. 2016, 138, 9730−9742. (19) Trebst, S.; Troyer, M.; Hansmann, U. H. Optimized Parallel Tempering Simulations of Proteins. J. Chem. Phys. 2006, 124, 174903−174909. (20) Katzgraber, G.; Trebst, S.; Huse, D. A.; Troyer, M. FeedbackOptimized Parallel Tempering Monte Carlo. J. Stat. Mech.: Theory Exp. 2006, 2006, P03018. (21) Lee, M. S.; Olson, M. A. Comparison of Two Adaptive Temperature-Based Replica Exchange Methods Applied to a Sharp Phase Transition of Protein Unfolding-Folding. J. Chem. Phys. 2011, 134, 244111−224417. (22) Olson, M. A.; Lee, M. S. Evaluation of Unrestrained ReplicaExchange Simulations using Dynamic Walkers in Temperature Space for Protein Structure Refinement. PLoS One 2014, 9, e96638. (23) Olson, M. A.; Zabetakis, D.; Legler, P. M.; Turner, K. B.; Anderson, G. P.; Goldman, E. R. Can Template-Based Protein Models Guide the Design of Sequence Fitness for Enhanced Thermal Stability 118
DOI: 10.1021/acs.jcim.7b00517 J. Chem. Inf. Model. 2018, 58, 111−118