Predicting RNA Structures via a Simple van der ... - ACS Publications

Dec 29, 2016 - conjunction with the OPC water via an unequal Lorentz−. Berthelot combination ..... This study was supported by National Research Fou...
0 downloads 0 Views 1MB Size
Letter pubs.acs.org/JCTC

Predicting RNA Structures via a Simple van der Waals Correction to an All-Atom Force Field Changwon Yang,† Manho Lim,† Eunae Kim,‡ and Youngshang Pak*,† †

Department of Chemistry and Institute of Functional Materials, Pusan National University, Busan 46241, South Korea College of Pharmacy, Chosun University, Gwangju 61452, South Korea



S Supporting Information *

ABSTRACT: We proposed a simple van der Waals backbone correction (O2′ and OP) to the AMBER ff12 force field in conjunction with the OPC water via an unequal Lorentz− Berthelot combination rule. As tested on four different tetranuceotides such as r(GACC), r(CCCC), r(AAAA), and r(CAAU), this new force field correctly captured each native fold as the largest population. For a RNA tetraloop (UUCG) tested, the stability of its native fold is substantially improved. tetraloop, it is uncertain that the ff99+Chen_Garcia can reproduce the native fold in a global free energy minimum. Recently, Bergonzo et al. presented a modified version of ff12 (ff12+vdWbb) targeting RNAs.10 This new force field is a variant of the standard ff12 with van der Waals (vdW) corrections to the backbone phosphate oxygen atoms (OP, O3′, and O5′; see Figure S1 for the definition of oxygen atom types). This force field was tested against two simple RNA tetramers, r(GACC) and r(CCCC), using four different water models: TIP3P,11 SPC/E,12 TIP4P-Ew,13 and OPC.14 The results showed that the OPC water model, which was previously optimized to improve bulk water properties, performed the best with ff12+vdWbb, thereby implying an importance of using a more decent water model. In particular, ff12+vdWbb predicted the population of the A-form native structure of r(GACC) in a closer agreement with the experiment. In contrast, this force field predicted the native structure (A-form) of r(CCCC) incorrectly as a very minor conformation. Considering this limited applicability, ff12+vdWbb correction needs to be revised further to possibly expand the transferability of ff12+vdWbb. More recently, GilLey et al. improved the standard ff12 by adding the backbone torsion potentials derived from the X-ray ensemble protein data bank (PDB).15 Interestingly, this new force field (ff12+pdb) appeared to perform better than ff12+vdWbb. The ff12+pdb successfully predicted the NMR native structures of both r(GACC) and r(CCCC) in their largest population. The ff12+pdb, however, fell short of observing the native structure (A-form) of r(AAAA). As noted above, all previous attempts to improve the RNA force fields led to a partial success, reassuring an inherent complexity of the RNA structure predictions using all-atom MD. Most recently, using several variants of AMBER force field, enhanced MD simulations of various RNA

T

he prediction of RNA structures is of fundamental importance to understand their intrinsic functions and biological behaviors in cellular environments. A molecular dynamics simulation (MD) is a powerful tool for predicting the biomolecular structures and dynamics at an atomistic resolution. A computer simulation of the RNA structures via all-atom MD is computationally challenging due to the very flexible and complex nature of RNA structures. Although notable advances in all-atom MD simulations previously have been reported in the folding of small proteins with various structural motifs,1,2 progress in RNA folding simulations at an all-atom level has been rather limited. This is mainly because the empirical force fields for nucleic acids have not reached a level of accuracy for properly describing the complex interactions exhibited by the RNA backbone, ribose group, base stacking, and base pair hydrogen bonding. Until now, there have been many attempts to improve the quality of allatom RNA force fields. For example, ff12 is the standard AMBER force field for nucleic acids with improved α/γ backbone torsions and glycosidic χ angles (OL3).3,4 Using enhanced sampling MD methods in combination with ff12, the native-like states of UUCG and GAGA tetraloops were captured, but the equilibrium ensemble for each tetraloop was not fully established.5,6 On the other hand, starting from the standard AMBER ff99, Chen and Garcia modified this force field by optimizing the base stacking and glycosidic torsions (ff99+Chen_Garcia).7 Using this ff99+Chen_Garcia force field, the native-like structures of several RNA tetraloops (GCAA, UUCG, and CUUG) were observed simultaneously from their replica exchange MD (REMD) simulation, but these simulation ensembles were not converged. In a subsequent study with this modified force field, the native fold of the GCAA tetraloop was shown to be the lowest free energy minimum state.8 For the UUCG tetraloop, however, its native fold deviated substantially from the most populated state, as evidenced by multiple long time enhanced sampling MD simulations.9 For the CUUG © 2016 American Chemical Society

Received: August 17, 2016 Published: December 29, 2016 395

DOI: 10.1021/acs.jctc.6b00808 J. Chem. Theory Comput. 2017, 13, 395−399

Letter

Journal of Chemical Theory and Computation

Figure 1. Simulated population with respect to the RMSD values for four different RNA tetramers of (A) r(GACC), (B) r(CCCC), (C) r(AAAA), and (D) r(CAAU). Each of all tetramers tested has an A-form reference. This A-form reference structure was created using the Make-na server: http://structure.usc.edu/make-na/server.html. The RMSD value was obtained from each A-form reference structure. For each tetramer, two independent RECT simulations initiating from random and intercalated structures (run 1 and run 2) were run for 3 μs per replica at 277 K. For r(GACC) and r(CCCC), extra simulations using the same RECT protocol were performed using the standard Lorentz−Berthelot (LB) combination rule with the scaled vdW radii for OP and O2′. The resulting RMSD distributions (in magenta) are given in the panels (A) and (B). The experimental population of the major conformer was denoted for r(GACC)17 and r(CCCC).18

group by weakening the H-bonds between the phosphate group and water molecules. This study proposes a simple and effective vdW correction scheme to possibly increase the accuracy of ff12+vdWbb. For this purpose, a vdW correction was introduced by focusing on the two oxygen atom types: OP in the phosphate group and O2′ in the ribose. Here, the backbone O3′ and O5′ atoms in the phosphate group were excluded from the present correction to minimize any residual effects on the backbone torsional profiles of the standard ff12. For only these two oxygen atom types (OP and O2′) in the RNA part, the corresponding σ values were uniformly increased by 5% from those of ff12. Most importantly, different Lorentz−Berthelot combination rules were employed, so that these scaled values were applied to the intramolecular interaction of RNA, whereas the original (unscaled) values of ff12 were used for the intermolecular interaction between OP/O2′ (RNA) and OW (water molecule) (see Table S1). This unequal combination rule effectively modulates the strength of the H-bonds between the phosphate and hydroxyl groups, while the hydration effect of these groups remains intact. As a result, this scheme increased selectively the desolvation penalty of these functional groups to reduce the population of the problematic intercalated structure. Here, using this new force field (ff12+vdWYP) with the OPC water model, it was demonstrated that all the NMR native structures (A-form) of four different sequences of

tetranucleotide systems on multiple long time scales indicated that intercalated structures were commonly observed instead of the native structures.16 These intercalated structures were stabilized by overestimated hydrogen bonds (H-bonds) between the highly charged phosphate group in the backbone and the hydroxyl group in ribose. Similar intercalated structures were also found in an earlier simulation result with ff12+vdWbb.10 In contrast to these simulation results, there was no NMR evidence of these intercalated structures.16−18 Therefore, balancing this overestimating H-bond between the phosphate and ribose groups will be necessary to enhance the quality of the AMBER-based RNA force field. In this regard, one can possibly achieve an enhancement by modulating the desolvation penalty of these two functional groups via a simple vdW correction. In the previous vdWbb correction, the σ values of the phosphate oxygen atoms (OP and O3′/O5′) increased by factors of 5.2 and 5.3%, respectively. Originally, Steinbrecher et al. obtained these scale factors in an effort to reproduce the hydration free energy of several bio-organic phosphates.19 Increasing the vdW radii of the phosphate oxygen atoms by these factors appeared to destabilize the intercalated structure by reducing the unrealistic H-bond association of these two groups. This vdWbb correction, however, may not be highly effective, because increasing the σ values of the phosphate oxygen atoms can also affect the hydration of the phosphate 396

DOI: 10.1021/acs.jctc.6b00808 J. Chem. Theory Comput. 2017, 13, 395−399

Letter

Journal of Chemical Theory and Computation tetranucleotides, r(GACC), r(CCCC), r(AAAA), and r(CAAU), were correctly captured in their major population state. For an enhanced MD sampling of the tetramers, the replica exchange with collective variable tempering (RECT) simulation20 was employed (see the Supporting Information for detailed setup protocols). In addition, for a benchmark test of the modified force field targeting an RNA tetraloop, a parallel tempering meta-dynamics (PTMETAD)21 simulation method was used (see the Supporting Information). Figure 1 shows the RMSD distributions of the four RNA tetramers in comparison with earlier simulation results.10,15 As demonstrated in the two test cases of r(GACC) and r(CCCC) tetramers, the unequal combination rule with the present vdW scaling appears to be better than the standard vdW scaling in terms of locating the A-form reference as the major population. The standard combination rule tends to enhance the population of conformers deviating from the A-form. This tendency is largely pronounced in the case of r(CCCC). In the simulation of the r(GACC) tetramer, all three force fields, ff12+vdWbb,10 ff12+pdb,15 and ff12+vdWYP, produced the native-like structure (A-form) correctly in the major populated state. Interestingly, the ff12+vdWbb sampled a non-negligible population (25%) of a minor conformer near RMSD = 2.5 Å.10 On the other hand, this minor population was suppressed to approximately 4% when the ff12+vdWYP was used (see Figure S2 for cluster analysis results). This reduction of the minor conformer from 25% to 4% is more in line with ff12+pdb, but it is in less agreement with NMR. In terms of r(CCCC), the ff12+vdWbb failed to reproduce the native fold, but both ff12+pdb and ff12+vdWYP correctly captured the native structure in the dominant population. For r(AAAA), only ff12+vdWYP could correctly predict the native fold of r(AAAA) as the largest population. In the case of r(CAAU), the ff12+vdWYP produced a relatively small population of the native-like state (41%), but this native-like state is still most abundant. As the second largest population (17%), the conformer occurring at RMSD = 5.1 Å is primarily the 1-3-4 stacked structure (Figure S2). Besides this type of stacked structure, more diverse non-native contacts of the pyrimidine and purine base pair were consistently observed as minor populations. Since this non-native base stacking in r(CAAU) is not supported by the NMR experiment,16 ff12+vdWYP/OPC tends to overestimate the base stacking interaction, as evidenced by stacking free energy data simulated for various dinucleotides (Table S2). Figure 2 presents the number of false positives obtained from the RECT simulation on all these four tetramers. The number of false positives is defined as the number of the predicted distances below 5 Å but not observed experimentally. This number can be a good indicator for assessing the reliability of the simulated structures.15,16 Furthermore, the number of false negatives, which is the number of the predicted distances above 5 Å among the experimentally observed nuclear overhauser effect (NOE) pairs, was given in Figure 2. Overall, the ff12+vdWYP consistently yielded much smaller numbers of the false positives than ff12+pdb.15 Also, the number of false negatives from the ff12+vdWYP was relatively small; 1 for r(GACC), r(CCCC), and r(AAAA) and 3 for r(CAAU). For a more extensive evaluation of ff12+vdWYP, the simulated NOE data was compared to the NMR experiment (Figure S3), and the simulated backbone dihedral distributions were given in Figure S4. The simulated NOE data agree reasonably well with the NMR experiment. In terms of the backbone angle

Figure 2. (A) Number of false positives that are defined to be the predicted distances below 5 Å that were not observed by NOE experiment. (B) Number of false negatives that are defined to be the distances predicted to be above 5 Å among experimentally observed NOE pairs. Herein, the earlier experimental NOE data16−18 for r(GACC), r(CCCC), r(AAAA), and r(CAAU) were used to compute the false positives and negatives.

distributions, the backbone α and ζ angle distribution was reported to be important in modulating RNA structures.15 For each of the four tetranucleotides, the corresponding distribution of α and ζ angles obtained from ff12+vdWYP/OPC clearly points to the A-form reference of α (gauche-) and ζ (gauche-) (Figure S4). Until now, it was shown that the standard ff12 could be improved by the simple vdW correction and the unequal Lorentz−Berthelot combination rule in conjunction with the OPC water model. Although the ff12+vdWYP/OPC is successful in reproducing the native fold of various tetranucleotides tested, it is unclear if this force field can handle more challenging RNA tetraloops. Thus, far, existing RNA force fields have not been much successful in simulating the folding of tetraloops and their thermodynamic stability.9,22 In the present study, a computationally challenging UUCG tetraloop (ccUUCGgg) was chosen to probe its stability of the native fold. For this tetraloop, the earlier PTMETAD simulation study with the standard ff12/tip3p reported that the UUCG native fold is not the global free energy minimum; the free energy difference upon folding ΔGU→F was more than 7 kcal/mol at T = 300.9 K.22 Here, the same simulation protocol (PTMETAD) in conjunction with ff12+vdWYP/OPC was used to map the free energy profile of the UUCG tetraloop as a function of eRMSD (Figure 3). In a direct comparison with the earlier free energy profile obtained with the ff12/tip3p, the current ff12+vdWYP/OPC greatly improved the energetics of the native fold of the UUCG loop (ΔGU→F = 3 kcal/mol) 397

DOI: 10.1021/acs.jctc.6b00808 J. Chem. Theory Comput. 2017, 13, 395−399

Letter

Journal of Chemical Theory and Computation

RNA tetramers, the use of a reliable water model such as OPC seems to play a role by offering a more reasonable description of solute and solvent interactions. Despite the current improvement observed in tetramer folding, it should be pointed out that some caveat was found in terms of reproducing the NMR population of a subtle minor structure of the GACC tetramer. In the current equilibrium ensemble of this tetramer, the minor conformer (near A-form) exhibits an extrusion of the C4 base from the native C3−C4 stacking position. As shown by Table S2, in comparison with experiment, obviously the ff12+vdWYP/OPC overestimated the stacking free energy of the CC pair. Therefore, this reduced population of the minor conformer may be related to the force field artifact of overestimating stacking free energy of the CC pair (Table S2). As indicated by the enhanced MD simulation (PTMETAD) on the UUCG tetraloop (Figure 3), the use of ff12+vdWYP/ OPC lowered the free energy of the native fold by 4 kcal/mol compared with the standard ff12/tip3p result. This improvement is encouraging and worth emphasizing, but unfortunately the non-negligible stability gap between theory and experiment still exists. As a result, we anticipate that other force field artifacts, such as overestimating base stacking and underestimating base pairing interactions described in earlier reports,7,16,23,24 need to be resolved for further enhancing the accuracy of the RNA force field.



ASSOCIATED CONTENT

S Supporting Information *

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.6b00808. Details of simulation methods, the modified vdW parameters, simulated stacking free energy of various dinucleotides, cluster analysis result, NOE analysis result, and the backbone dihedral angle distributions of (ζi, αi+1) for four tetranucleotides (PDF) Implementation files of ff12+vdWYPOPC (GROMACS format: ff12_vdw_yp.ff.zip) (ZIP)

Figure 3. (A) Free energy profiles at 300.9 K as a function of eRMSD resulting from using ff12/tip3p (red line)22 and ff12_vdWYP/OPC (black line). (B) Folding free energy change at 300.9 K relative to that obtained from ff12/tip3p: ΔΔG ≡ ΔGf(X) − ΔGf (ff12/tip3p), where X denotes the method for comparison. The symbols α, ζ, and α+ζ indicate the previous dihedral correction for α, ζ, and both α and ζ angles.22 Experimental folding free energy of the UUCG tetraloop is ΔGf(Expt.) = −2.8 kcal/mol.25



(Figure 3A). Clearly, this improvement is better than what was previously achieved by additional dihedral corrections to ff12/ tip3p (Figure 3B) but not yet sufficient for correct folding of the tetraloop. As mentioned above, the previous backbone dihedral correction using the PDB database of dinucleotides (ff12+pdb) enhanced the accuracy of ff12.15 This correction was carried out by one robust and self-consistent scheme to improve the RNA force field via an empirical correction of the backbone dihedral angles. Despite such advances, the applicability of ff12+pdb is somewhat limited due to the issue of portability.15 In the present modification, however, the backbone dihedral terms remain virtually the same as those of the standard ff12. Only the vdW radii of the two oxygen atom types were slightly tuned with the two Lorentz−Berthelot combination rules applied separately for intra- and intermolecular vdW interaction parts. This unequal combination turns out more effective than the standard combination rule in terms of predicting the major population of A-form. This simple vdW modification with the unequal combination led to noticeable progress toward correctly predicting the native fold of a more expanded set of various RNA tetramers (A-form). In the process of achieving such a successful simulation outcome of

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Youngshang Pak: 0000-0002-7083-8672 Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS This study was supported by National Research Foundation (NRF-2013R1A1A2059437, NRF-2014R1A4A1001690). Y.P. thanks the Korea Institute of Science and Technology Information (KISTI) for computational resources under the “KSC-2015-C3-046 program”. Also, the use of computing systems at the Daegu-Gyeongbuk Institute of Science & Technology (DGIST) was greatly appreciated.



REFERENCES

(1) Lindorff-Larsen, K.; Piana, S.; Dror, R. O.; Shaw, D. E. How fastfolding proteins fold. Science 2011, 334, 517−520. (2) Jiang, F.; Wu, Y. D. Folding of Fourteen Small Proteins with a Residue-Specific Force Field and Replica-Exchange Molecular Dynamics. J. Am. Chem. Soc. 2014, 136, 9536−9539.

398

DOI: 10.1021/acs.jctc.6b00808 J. Chem. Theory Comput. 2017, 13, 395−399

Letter

Journal of Chemical Theory and Computation (3) Perez, A.; Marchan, I.; Svozil, D.; Sponer, J.; Cheatham, T. E., 3rd; Laughton, C. A.; Orozco, M. Refinement of the AMBER Force Field for Nucleic Acids: Improving the Description of α/γ Conformers. Biophys. J. 2007, 92, 3817−3829. (4) Zgarbova, M.; Otyepka, M.; Sponer, J.; Mladek, A.; Banas, P.; Cheatham, T. E.; Jurecka, P. Refinement of the Cornell et al. Nucleic Acids Force Field Based on Reference Quantum Chemical Calculations of Glycosidic Torsion Profiles. J. Chem. Theory Comput. 2011, 7, 2886−2902. (5) Kuhrova, P.; Banas, P.; Best, R. B.; Sponer, J.; Otyepka, M. Computer Folding of RNA Tetraloops? Are We There Yet? J. Chem. Theory Comput. 2013, 9, 2115−2125. (6) Haldar, S.; Kuhrova, P.; Banas, P.; Spiwok, V.; Sponer, J.; Hobza, P.; Otyepka, M. Insights into Stability and Folding of GNRA and UNCG Tetra loops Revealed by Microsecond Molecular Dynamics and Well-Tempered Metadynamics. J. Chem. Theory Comput. 2015, 11 (8), 3866−3877. (7) Chen, A. A.; Garcia, A. E. High-resolution reversible folding of hyperstable RNA tetraloops using molecular dynamics simulations. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 16820−16825. (8) Miner, J. C.; Chen, A. A.; Garcia, A. E. Free-energy landscape of a hyperstable RNA tetraloop. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, 6665−6670. (9) Bergonzo, C.; Henriksen, N. M.; Roe, D. R.; Cheatham, T. E. Highly sampled tetranucleotide and tetraloop motifs enable evaluation of common RNA force fields. RNA 2015, 21, 1578−1590. (10) Bergonzo, C.; Cheatham, T. E. Improved Force Field Parameters Lead to a Better Description of RNA Structure. J. Chem. Theory Comput. 2015, 11, 3969−3972. (11) Jorgensen, W. L. Quantum and statistical mechanical studies of liquids. 10. Transferable intermolecular potential functions for water, alcohols, and ethers. Application to liquid water. J. Am. Chem. Soc. 1981, 103, 335−340. (12) Berendsen, H. J. C.; Grigera, J. R.; Straatsma, T. P. The missing term in effective pair potentials. J. Phys. Chem. 1987, 91, 6269−6271. (13) Horn, H. W.; Swope, W. C.; Pitera, J. W.; Madura, J. D.; Dick, T. J.; Hura, G. L.; Head-Gordon, T. Development of an improved four-site water model for biomolecular simulations: TIP4P-Ew. J. Chem. Phys. 2004, 120, 9665−9678. (14) Izadi, S.; Anandakrishnan, R.; Onufriev, A. V. Building Water Models: A Different Approach. J. Phys. Chem. Lett. 2014, 5 (21), 3863−3871. (15) Gil-Ley, A.; Bottaro, S.; Bussi, G. Empirical Corrections to the Amber RNA Force Field with Target Metadynamics. J. Chem. Theory Comput. 2016, 12, 2790−2798. (16) Condon, D. E.; Kennedy, S. D.; Mort, B. C.; Kierzek, R.; Yildirim, I.; Turner, D. H. Stacking in RNA: NMR of Four Tetramers Benchmark Molecular Dynamics. J. Chem. Theory Comput. 2015, 11, 2729−2742. (17) Yildirim, I.; Stern, H. A.; Tubbs, J. D.; Kennedy, S. D.; Turner, D. H. Benchmarking AMBER force fields for RNA: comparisons to NMR spectra for single-stranded r(GACC) are improved by revised χ torsions. J. Phys. Chem. B 2011, 115, 9261−9270. (18) Tubbs, J. D.; Condon, D. E.; Kennedy, S. D.; Hauser, M.; Bevilacqua, P. C.; Turner, D. H. The nuclear magnetic resonance of CCCC RNA reveals a right-handed helix, and revised parameters for AMBER force field torsions improve structural predictions from molecular dynamics. Biochemistry 2013, 52, 996−1010. (19) Steinbrecher, T.; Latzer, J.; Case, D. A. Revised AMBER Parameters for Bioorganic Phosphates. J. Chem. Theory Comput. 2012, 8, 4405−4412. (20) Gil-Ley, A.; Bussi, G. Enhanced Conformational Sampling Using Replica Exchange with Collective-Variable Tempering. J. Chem. Theory Comput. 2015, 11, 1077−1085. (21) Bussi, G.; Gervasio, F. L.; Laio, A.; Parrinello, M. Free-energy landscape for beta hairpin folding from combined parallel tempering and metadynamics. J. Am. Chem. Soc. 2006, 128, 13435−13441.

(22) Bottaro, S.; Banas, P.; Sponer, J.; Bussi, G. Free Energy Landscape of GAGA and UUCG RNA Tetraloops. J. Phys. Chem. Lett. 2016, 7, 4032−4038. (23) Kuhrova, P.; Best, R. B.; Bottaro, S.; Bussi, G.; Sponer, J.; Otyepka, M.; Banas, P. Computer Folding of RNA Tetraloops: Identification of Key Force Field Deficiencies. J. Chem. Theory Comput. 2016, 12, 4534−4548. (24) Brown, R. F.; Andrews, C. T.; Elcock, A. H. Stacking Free Energies of All DNA and RNA Nucleoside Pairs and DinucleosideMonophosphates Computed Using Recently Revised AMBER Parameters and Compared with Experiment. J. Chem. Theory Comput. 2015, 11, 2315−2328. (25) Molinaro, M.; Tinoco, I. Use of ultra stable UNCG tetraloop hairpins to fold RNA structures: thermodynamic and spectroscopic applications. Nucleic Acids Res. 1995, 23, 3056−3063.

399

DOI: 10.1021/acs.jctc.6b00808 J. Chem. Theory Comput. 2017, 13, 395−399