Article pubs.acs.org/JCTC
Effective Conformational Sampling in Explicit Solvent with Gaussian Biased Accelerated Molecular Dynamics Qiang Shao*,†,‡ and Weiliang Zhu†,‡ †
Drug Discovery and Design Center, CAS Key Laboratory of Receptor Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China ‡ University of Chinese Academy of Sciences, Beijing, 100049, China S Supporting Information *
ABSTRACT: In this Article, a user-friendly Gaussian biased accelerated molecular dynamics (GbAMD) method is presented that uses a sum of Gaussians of potential energies as the biased force to accelerate the conformational sampling. The easy parameter setting of GbAMD is demonstrated in a variety of simulation tests for the conformational transitions of proteins with various complexity including the folding of Trpcage, GB1p, and HP35 peptides as well as the functional conformational changes of nCaM and HIV-1 PR proteins. Additionally, the ability of GbAMD in conformational sampling and free-energy evaluation is quantitatively assessed through the comparison of GbAMD simulations on the folding of α-helical Trpcage and β-hairpin GB1p with the accompanying standard dual boost AMD and conventional MD (cMD) simulations. While GbAMD can fold both peptides into their native structures repeatedly in individual trajectories, AMD can only fold Trpcage and cMD fails the folding in both cases. As a result, only GbAMD can quantitatively measure the properties of the equilibrium conformational ensemble of protein folding consistent with experimental data. Also notable is that the structural properties of the indispensable unfolded and transition states in the folding pathways of Trpcage and GB1p characterized by GbAMD simulations are in great agreement with previous simulations on the two peptides. In summary, GbAMD has an effective conformational sampling ability that provides a convenient and effective access for simulating the structural dynamics of biomolecular systems.
■
been developed and applied in software including CHARMM,5 AMBER,6 GROMOS,7 etc. Ideally, one force field should be suitable for describing the dynamics of all kinds of protein folding. However, a variety of simulations have indicated that the force fields in practice have the tendencies to favor a certain secondary structure over another8−11 (e.g., the AMBER FF96 favors β-structures, whereas the other force fields such as FF99, FF03, and their variants favor α-helical conformations). Therefore, transferability of force field is still desirable. Intriguingly, the force fields developed more recently display improved protein secondary structure balance.12−14 For instance, as coupled with explicit solvent models, the FF99SB-ILDN and RSFF2 force fields can fold fast-fold proteins with diverse structures and evaluate the folding/ unfolding thermodynamic quantities consistent with experimental data, respectively.11,15 The FF14SBonlysc force field can fold these proteins even coupled with the implicit solvent model (GB-Neck2), although the native structures are not the lowest free-energy states for specific proteins.16 On the other hand, the time-scale gap between simulation and experiment has been shortened to a great extent by recent
INTRODUCTION While molecular dynamics (MD) simulation has been extensively applied in the biological field, its reliability in modeling the physical transition of a biomolecular system largely relies on two factors: the accuracy of the molecular force field and the efficiency of conformational sampling. Protein folding, which has been posing challenges for simulation for decades, is arguably an ideal model process to indicate the limitation that the above-mentioned factors exert on molecular simulation.1 A folding process includes the formation of secondary and tertiary structures. The varied physicochemical properties of amino acids result in the complexity of interresidue interactions (particularly side chain interactions), which leads to the rugged configuration of the free-energy landscape.2,3 As proteins behave themselves on free-energy landscapes, they are easy to be trapped in transient intermediate states for durations exceeding the time scale of simulation. In general, the barrier-crossing transitions on the complex freeenergy surfaces make the folding usually occur at microseconds to milliseconds.4 Such a time scale is generally too short for currently available experimental techniques but meanwhile too long for MD simulations. Triggered by the crucial importance of force field in molecular simulation, a great number of force fields have © 2017 American Chemical Society
Received: March 7, 2017 Published: July 31, 2017 4240
DOI: 10.1021/acs.jctc.7b00242 J. Chem. Theory Comput. 2017, 13, 4240−4252
Article
Journal of Chemical Theory and Computation
well as the nature of the potential energy surface. In addition, the nature of eq 1 (the larger bias potential being added at lower energy range) may result in an undersampling of the functionally important states lying in the low energy region. To make an even sampling in both low and high energy regions rather than enhancing the sampling in a specific region, a sum of Gaussian-type functions has been used to bias the potential energy of the system of interest,43 similar to the Gaussian terms in metadynamics method except that the Gaussian functions are deposited along the potential energy rather than the trajectory of collective variables. Despite the success in folding short peptide (Trpcage) in an implicit solvent simulation,43 the ability of this method in making a uniform sampling on the potential energy in a large explicit solvent simulation whose transition covers an extensive potential energy surface is certainly deficient because of the difficulty in setting a large quantity of parameters. Here, a user-friendly Gaussian biased accelerated molecular dynamics (GbAMD) method is presented with the aim to utilize easily set simulation parameters to accelerate the conformational sampling of molecular simulation. Such a method differs from the standard AMD including GaMD in two aspects: the detailed formula of the bias potential and the manner in which the bias potential is added. While AMD applies a single boost potential with either the original and
encouraging advances in computational power. By using the specialized Anton supercomputer, the unbiased MD folding simulations of small proteins consisting of 10−80 residues are extended to approximate time scales of 0.1−3 ms.11,12 Nevertheless, despite the respectable computational power invested, one should notice that the contemporary availability of such an initiative in the broad scientific community is limited. The protein folding simulations on more commonly available general-purpose computational resources are often run with the use of various enhanced conformational sampling methods, which have been developed to sample the configuration space of protein in accelerated manners.17−21 A straightforward strategy of enhanced sampling is to add a bias potential or force that smoothens the potential energy surface and thus facilitates the biomolecular conformational transition crossing over the high-energy barrier. In targeted or steered MD simulation (TMD22 or SMD23), a harmonic energy function is added to drag the biomolecule of interest to the target conformation. Similarly, in umbrella sampling simulation, a series of predefined harmonic energy functions is applied in individual windows separately to enhance the sampling along the transition pathway.24−26 In addition, the added bias potentials or forces are continuously updated to converge to the free energy in adaptive biasing force (ABF)27−29 and metadynamics30 methods. For all of these biasing simulation methods, certain reaction coordinates (or collective variables) are needed along which the bias forces are applied.17 The definition of the reaction coordinates, however, often requires expert knowledge of the studied systems. Furthermore, the predefined reaction coordinates largely place constraints on the pathway and conformational space to be sampled during the biased simulation, leading to the inconvenience in subsequent free-energy landscape analysis.17,31 Accelerated molecular dynamics (AMD) is another representative biasing simulation method which adjusts the potential energy surface by adding a non-negative boost potential function taking the form32,33 ΔV (r ) =
(E − V (r ))2 α + E − V (r )
most common form of ΔV (r ) = 1
(E − V (r ))2 32,33 α + E − V (r ) 2 41
or the
harmonic form of ΔV (r ) = 2 k(E − V (r )) , the GbAMD here is built as a sum of Gaussian energy functions (see details in the “Computational Methods” section), which are supposed to essentially yield a Gaussian distributed bias force that could give the accurate recovery of free-energy landscape. In addition, unlike in AMD in which the bias force is mainly boosted below a reference energy, the Gaussian bias force in GbAMD is applied all over the desired potential energy range. To make it simple to implement for computation, the Gaussian bias energies in GbAMD are centered on evenly distributed points in an expanded potential energy range and their heights are simply linearly increased from the low to high energy regions. As a result, the entire potential energy range should be sampled in an enhanced manner and the bias is more severe at high energy level. Accordingly, the simulation can sample highenergy and sparsely populated transient states and meanwhile guarantee that low-energy states are not omitted or undersampled. The easy setting of the parameters in GbAMD (see the procedure in Figure 1 and the detailed description in the “Computational Methods” section) should allow for its application on large explicit solvent simulations. To indicate the reliability of GbAMD in conformational sampling, we tested explicit solvent GbAMD simulations on the conformational transitions of a series of biomolecular systems, including the folding of three peptides (Trpcage TC5b,44 the C-terminal β-hairpin (residues 41−56) from the B1 domain of protein G (GB1p),45 and the thermostable C-terminal fragment (residues 42−76) of the chicken headpiece (HP35)),46 the open-to-close conformational change of the N-terminal domain of calmodulin (nCaM, residues of Leu4-Lys75),47,48 and the semiopen-to-open conformational change of human immunodeficiency virus type 1 protease (HIV-1 PR, 198 amino acids).49 With conveniently set parameters, GbAMD can search for the folding events and conformational changes in short simulation times (