B: Liquids, Chemical and Dynamical Processes in Solution, Spectroscopy in Solution
A Linear Interaction Energy Model for Cavitand Host-Guest Binding Affinities Joel José Montalvo-Acosta, Paulina Pacak, Diego Enri Barreto-Gomes, and Marco Cecchini J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.8b03245 • Publication Date (Web): 04 Jun 2018 Downloaded from http://pubs.acs.org on June 4, 2018
The Journal of Physical Chemistry
A Linear Interaction Energy Model for Cavitand Host-Guest Binding Affinities
Joel José Montalvo-Acosta, Paulina Pacak, Diego Enri Barreto Gomes and Marco Cecchini∗
Laboratoire d’Ingénierie des Fonctions Moléculaires, UMR7177 CNRS, Université de Strasbourg, F-67083 Strasbourg Cedex, France
June 1, 2018
Abstract
Host-guest systems provide excellent models to explore molecular recognition in solution, along with relevant technological applications from drug carriers to chemosensors. Here, we present a linear interaction energy (LIE) model to predict the binding affinity in host-guest systems with remarkable efficiency and predictive power. Using four host families including cucurbiturils, octa acids, and β-cyclodextrin, and a large set (49) of chemically diverse guests, we demonstrate that binding-affinity predictions with a RMSE < 1.5 kcal/mol from experiments can be obtained with a few nanoseconds of Molecular Dynamics. The parameters of the LIE model are shown to be transferable among host-guest families and the quality of the predictions to be essentially force-field independent. Inclusion of the strain energy of the host in the bound state appears to be critically important to improve the quality of the predictions, particularly when the host and the guest have comparable sizes. Unsuccessful predictions for 28 additional highly charged and bulky guests of cucurbit[7]uril indicate future directions for improvement.
Introduction
Host-guest complexes have attracted significant interest in recent years from both experimental and computational chemists. 1–3 The host is typically a small synthetic molecule with a well-defined cavity or cleft, where a number of compounds (i.e. the guests) bind with remarkable affinity and/or selectivity. 4 The formation of host-guest complexes in solution is driven by the same non-covalent forces that steer protein-ligand binding (i.e. hydrogen bonding, electrostatic and van der Waals interactions, etc.), which makes them suitable model systems to explore molecular recognition in solution. 5–7 In addition, a number of synthetic hosts were shown to be interesting targets for technological applications as chemosensors, biomimetics, solubility enhancers, reaction containers or drug carriers, 8–11 and the design of scaffolds that bind specific families of guests potently and selectively is currently an active research field. 12,13 In this context, the development of accurate and efficient numerical approaches to evaluate the binding constant in host-guest systems may open the way to the rational design of molecular function(s).

The accurate calculation of the free energy of binding in solution, ΔG°_b, remains a grand challenge in computational chemistry. 14 Although a number of numerical strategies have been proposed, from fast and crude scoring functions to the most accurate and expensive quantum chemistry methods, there is no universal approach to the binding constant. 15 Among the pool of available methods, “end-point” approaches such as LIE 16–18 and MM/PBSA 19 provide efficient (though approximate) strategies to estimate the free energy of binding and have been recently used in drug discovery. 20 The great advantage of these methods is that they sample only the configurational space of the initial and final states of the binding reaction, which drastically increases the efficiency of the calculations relative to more rigorous approaches. However, the quality of their predictions has been questioned: the accuracy of LIE depends on a set of empirical and typically nontransferable parameters, 21,22 while that of MM/PBSA is limited by the evaluation of the solvent contribution by continuum electrostatics. 23

In this letter, we present a LIE model for cavitand host-guest systems with remarkable efficiency and predictive power. The parameters of the model are shown to be transferable among chemically diverse hosts sharing a ring-like shape and to provide binding affinity predictions within 1.5 kcal/mol from experiments. The usefulness of the model is illustrated by the numerical evaluation of the differential binding affinity of cucurbituril (CB[n]) hosts for a set of steroid guests. 24

∗ Corresponding author: [email protected]
Material and Methods
The theory of LIE 16,18 states that ΔG°_b can be obtained from the ensemble averages of the electrostatic and van der Waals contributions to the interaction energy of the ligand with its surroundings in the bound and the unbound states as

$$\Delta G^{\circ}_{b} = \beta\left(\left\langle U^{\mathrm{elec}}_{L-s}\right\rangle_{b} - \left\langle U^{\mathrm{elec}}_{L-s}\right\rangle_{ub}\right) + \alpha\left(\left\langle U^{\mathrm{vdw}}_{L-s}\right\rangle_{b} - \left\langle U^{\mathrm{vdw}}_{L-s}\right\rangle_{ub}\right) \qquad (1)$$

where α and β are empirical parameters, the symbol ⟨ ⟩ indicates ensemble averages typically collected by Molecular Dynamics (MD) simulations, and the subscripts b and ub refer to the bound and unbound states, respectively. LIE parameters for cavitand host-guest systems were generated using a training set of 14 complexes based on the cucurbit[7]uril (CB7) host for which experimental binding affinities in water were available; 6 see Figure S1. The ability of CB7 to bind a highly diverse set of ligands, 25 i.e. rigid/flexible, neutral/charged, and alkyl/aromatic compounds, makes it an ideal framework for training the model. For each guest, classical MD simulations with an explicit treatment of the solvent were carried out both in the bound and free state in solution using the General Amber Force Field (GAFF); 26 see Supporting Information (SI) for details. The LIE parameters were then obtained by linear fitting of the ligand/surrounding interaction energies
versus experimental binding affinities using Eq. 1; see Figure S2 and Tables S1-S2. Linear fitting of the GAFF simulation results produced a good correlation with experiments (i.e. RMSE of 1.35 kcal/mol and R² of 0.62; Table S3) and yielded α = 0.43 and β = 0.20. We note that these values are significantly different from those that are typically used in protein-ligand binding (β = 0.33–0.5 and α = 0.18), 18 with the coefficient for the electrostatic interactions β being approximately half the theoretical value of 0.5, and α being twice as large.

The predictive character of this LIE model was assessed using a test set of 49 chemically diverse cavitand host-guest complexes. The test set included 15 complexes of an octa acid (OAH), 6 complexes of the tetra-endomethyl octa acid (OAM), 22 complexes of CB7, and 6 complexes of β-cyclodextrin (BCD); see Figure 1 for the hosts and Figures S3-S5 for the guests. The OAH and OAM complexes were all part of the Statistical Assessment of the Modeling of Proteins and Ligands blind challenges versions 2015 and 2016 (SAMPL4 and SAMPL5) 6,7 and provide a stringent benchmark for any computational approach. The 22 hydrocarbon guests in complex with CB7 were used in the HYDROPHOBE challenge, a recent experimental/computational benchmark of numerical approaches to the binding constant. 27 Finally, BCD is a flexible cavitand host that has been used as a solubility enhancer for drug formulation. 28
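The fitting step described above can be illustrated with a minimal sketch. It assumes that Eq. 1 is fitted by ordinary least squares without an intercept, which is what the form of Eq. 1 suggests; the actual fitting protocol is described in the SI and may differ. The function names `fit_lie` and `predict_lie` are hypothetical, and the energies below are synthetic placeholders generated for illustration, not data from this work.

```python
import numpy as np

def fit_lie(d_vdw, d_elec, dg_exp):
    """Least-squares estimate of (alpha, beta) in Eq. 1, with no intercept.

    d_vdw, d_elec: bound-minus-unbound differences of the average
    ligand-surrounding van der Waals and electrostatic energies (kcal/mol).
    dg_exp: experimental binding free energies (kcal/mol).
    """
    X = np.column_stack([np.asarray(d_vdw, float), np.asarray(d_elec, float)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(dg_exp, float), rcond=None)
    alpha, beta = coef
    return float(alpha), float(beta)

def predict_lie(d_vdw, d_elec, alpha, beta):
    """Evaluate Eq. 1 for given energy differences and LIE parameters."""
    return alpha * np.asarray(d_vdw, float) + beta * np.asarray(d_elec, float)

# Synthetic placeholder data for 14 hypothetical complexes (NOT from the paper).
rng = np.random.default_rng(0)
d_vdw = rng.uniform(-25.0, -5.0, size=14)
d_elec = rng.uniform(-20.0, 5.0, size=14)
dg_exp = 0.43 * d_vdw + 0.20 * d_elec + rng.normal(0.0, 0.5, size=14)

alpha, beta = fit_lie(d_vdw, d_elec, dg_exp)
dg_fit = predict_lie(d_vdw, d_elec, alpha, beta)
rmse_fit = float(np.sqrt(np.mean((dg_fit - dg_exp) ** 2)))
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}, RMSE = {rmse_fit:.2f} kcal/mol")
```

Once α and β are fixed on the training set, `predict_lie` is all that is needed to score a new complex from its four ensemble averages, which is why the method only requires simulations of the two end states.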
Results and Discussion
The predictive strength of the LIE model was assessed using the root mean square error (RMSE) and the mean absolute error (MAE) as metrics. The results in Figure 2 show a striking correlation with the experimental determinations (R = 0.81) with a calculated RMSE of 1.08 kcal/mol (Tables S4 and S6). Note that this error is lower than any other reported in SAMPL4 and SAMPL5 using a variety of computational methods. 6,7 Remarkably, accurate predictions were obtained for the OAH (RMSE = 0.66 kcal/mol), OAM (RMSE = 1.06 kcal/mol) and BCD (RMSE = 1.48 kcal/mol) hosts individually, which were not part of the training set. Based on these results, we conclude that the LIE parameters above are transferable among chemically diverse host families. In the case of the 22 CB7-hydrocarbon complexes, a RMSE of 1.17 kcal/mol was obtained, surpassing the accuracy obtained by more rigorous methods based on expensive quantum calculations (RMSE = 1.94 kcal/mol) and/or extensive sampling based on MD (RMSE = 5.05 kcal/mol). 27 The accuracy of the predictions above indicates that a straightforward LIE model is able to capture the details modulating the host-guest binding affinity in solution. In addition, the statistical analysis in SI shows that the LIE parameters as well as the performance of the model are essentially independent of the training set, providing numerical predictions with a RMSE […] 4 kcal/mol. Moreover, the results in Figure 3b show that inclusion of the strain energy of the host overshoots the experimental binding free energy systematically, thereby failing to correct the numerical predictions. Since the evaluation of the strain energy here required the use of an implicit-solvent model, we suspect that our numerical protocol is suboptimal (if not inadequate) with formally charged ligands.
The development of better performing protocols for accurate strain-energy corrections on complexation is currently under investigation and will be reported elsewhere. Overall, the inaccurate predictions on the Muddana dataset highlight some of the shortcomings of the current LIE implementation and suggest future directions for improvement.
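The error metrics quoted throughout this section (RMSE, MAE and the correlation coefficient R) can be computed as in the sketch below. The helper names are hypothetical and the four value pairs are made-up numbers chosen only to exercise the functions; they are not results from this work. R is taken here to be the Pearson correlation coefficient, which is an assumption.

```python
import numpy as np

def rmse(pred, exp):
    """Root mean square error between predicted and experimental values."""
    diff = np.asarray(pred, float) - np.asarray(exp, float)
    return float(np.sqrt(np.mean(diff ** 2)))

def mae(pred, exp):
    """Mean absolute error between predicted and experimental values."""
    diff = np.asarray(pred, float) - np.asarray(exp, float)
    return float(np.mean(np.abs(diff)))

def pearson_r(pred, exp):
    """Pearson correlation coefficient between predictions and experiment."""
    return float(np.corrcoef(np.asarray(pred, float), np.asarray(exp, float))[0, 1])

# Made-up binding free energies in kcal/mol (illustration only).
dg_exp_demo = [-5.0, -7.0, -9.0, -11.0]
dg_pred_demo = [-6.0, -7.0, -8.0, -11.0]

print(f"RMSE = {rmse(dg_pred_demo, dg_exp_demo):.2f} kcal/mol")
print(f"MAE  = {mae(dg_pred_demo, dg_exp_demo):.2f} kcal/mol")
print(f"R    = {pearson_r(dg_pred_demo, dg_exp_demo):.2f}")
```

RMSE penalizes large individual errors more strongly than MAE, so reporting both gives a sense of whether the overall error is dominated by a few outliers.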
Conclusion
In conclusion, we have presented a LIE model for cavitand host-guest binding affinities that is transferable among chemically diverse families, accurate and reliable, producing predictions with a RMSE < 1.5 kcal/mol in a large test set including 49 guests and four different hosts. Our model is computationally efficient, its performance is essentially independent of the training set, and it produces converged results within a few nanoseconds of MD, which opens the way to high-throughput computational screenings. The semi-empirical character of the model was shown to absorb most of the systematic error of the force field, making the predictions essentially force-field independent; this is a considerable advantage over other physics-based approaches that cannot be more accurate than the model of energetics in use. Finally, the inclusion of the strain energy of the host in the calculation of the binding affinity, which is absent in the original LIE formulation, was shown to improve the quality of the predictions substantially, especially when hosts and guests have similar sizes. Nonetheless, the current formulation of LIE failed to predict the binding affinity of ultra-tight (femto- to atto-molar) binders to CB7 and was shown to overestimate the strain energy of the host in complex with bulky and formally charged guests. The usefulness of a LIE formulation for host-guest recognition was demonstrated through the accurate prediction of steroid binding to cucurbituril hosts, which are technologically relevant for the development of chemosensors.
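To put the "femto- to atto-molar" range in the energy units used for the RMSE values, the standard relation ΔG° = RT ln(K_d/c°) with c° = 1 M can be applied. The sketch below is illustrative only: the function name is hypothetical and the temperature of 298.15 K is an assumption, not a value taken from this work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def dg_from_kd(kd_molar, temperature=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd / c0) with a 1 M standard concentration c0.
    The default temperature (298.15 K) is an assumption for illustration.
    """
    if kd_molar <= 0.0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)

for label, kd in [("femtomolar", 1e-15), ("attomolar", 1e-18)]:
    print(f"{label}: Kd = {kd:.0e} M -> dG = {dg_from_kd(kd):.1f} kcal/mol")
```

Under these assumptions, femtomolar and attomolar binders correspond to roughly −20.5 and −24.6 kcal/mol, i.e. binding free energies far outside the range spanned by typical host-guest complexes.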
Acknowledgment. This work was financially supported by the Fondation pour la Recherche Médicale (DBI20141231319). Support from the Agence Nationale de la Recherche (ANR) through the LabEx project Chemistry of Complex Systems (CSC-MCE-15), and the International Center for Frontier Research in Chemistry (icFRC) is gratefully acknowledged. The work was granted access to the HPC resources of CINES under the allocation 2016-[0710142] made by GENCI (Grand Equipement National de Calcul Intensif).

Supporting Information Available: Computational details for performing MD simulations; practical aspects for computing ΔG°_b by LIE and the associated statistical error; assessment of the impact of the training set on the performance of the LIE model; analysis of the statistical convergence of the binding affinity predictions; calculation of the strain energy of the host; chemical structures of all guest molecules; decomposition of the calculated ΔG°_b into the van der Waals and electrostatic contributions in the bound and the unbound states; summary of statistical metrics used to assess the accuracy of the LIE model(s). This material is available free of charge on the ACS Publications website.
References
(1) Stoddart, J. Host–guest chemistry. Annu. Rep. Progr. Chem. B 1988, 85, 353–386.
(2) Lee, J. W.; Samal, S.; Selvapalam, N.; Kim, H.-J.; Kim, K. Cucurbituril homologues and derivatives: new opportunities in supramolecular chemistry. Acc. Chem. Res. 2003, 36, 621–630.
(3) Grimme, S. Supramolecular binding thermodynamics by dispersion-corrected density functional theory. Chem. - Eur. J. 2012, 18, 9955–9964.
(4) Mobley, D. L.; Gilson, M. K. Predicting binding free energies: Frontiers and benchmarks. Annu. Rev. Biophys. 2017, 46, 531–558.
(5) Biedermann, F.; Uzunova, V. D.; Scherman, O. A.; Nau, W. M.; De Simone, A. Release of high-energy water as an essential driving force for the high-affinity binding of cucurbit[n]urils. J. Am. Chem. Soc. 2012, 134, 15318–15323.
(6) Muddana, H. S.; Fenley, A. T.; Mobley, D. L.; Gilson, M. K. The SAMPL4 host–guest blind prediction challenge: an overview. J. Comput.-Aided Mol. Des. 2014, 28, 305–317.
(7) Yin, J.; Henriksen, N. M.; Slochower, D. R.; Shirts, M. R.; Chiu, M. W.; Mobley, D. L.; Gilson, M. K. Overview of the SAMPL5 host–guest challenge: Are we doing better? J. Comput.-Aided Mol. Des. 2017, 31, 1–19.
(8) Lehn, J.-M. Perspectives in Supramolecular Chemistry–From Molecular Recognition towards Molecular Information Processing and Self-Organization. Angew. Chem., Int. Ed. Engl. 1990, 29, 1304–1319.
(9) Lehn, J.-M. From supramolecular chemistry towards constitutional dynamic chemistry and adaptive chemistry. Chem. Soc. Rev. 2007, 36, 151–160.
(10) Steed, J. W.; Atwood, J. L.; Gale, P. A. Definition and emergence of supramolecular chemistry; Wiley Online Library, 2012.
(11) Furusho, Y.; Rahman, I. M.; Hasegawa, H.; Izatt, N. E. Application of Molecular Recognition Technology to Green Chemistry: Analytical Determinations of Metals in Metallurgical, Environmental, Waste, and Radiochemical Samples; John Wiley & Sons, 2016; p 271.
(12) Liu, W.; Samanta, S. K.; Smith, B. D.; Isaacs, L. Synthetic mimics of biotin/(strept)avidin. Chem. Soc. Rev. 2017, 46, 2391–2403.
(13) Ogoshi, T.; Yamagishi, T.-a.; Nakamoto, Y. Pillar-shaped macrocyclic hosts pillar[n]arenes: new key players for supramolecular chemistry. Chem. Rev. 2016, 116, 7937–8002.
(14) Jensen, J. H. Predicting accurate absolute binding energies in aqueous solution: thermodynamic considerations for electronic structure methods. Phys. Chem. Chem. Phys. 2015, 17, 12441–12451.
(15) Montalvo-Acosta, J. J.; Cecchini, M. Computational Approaches to the Chemical Equilibrium Constant in Protein–ligand Binding. Mol. Inform. 2016, 35, 555–567.
(16) Åqvist, J.; Medina, C.; Samuelsson, J.-E. A new method for predicting binding affinity in computer-aided drug design. Protein Eng., Des. Sel. 1994, 7, 385–391.
(17) Åqvist, J.; Marelius, J. The linear interaction energy method for predicting ligand binding free energies. Comb. Chem. High Throughput Screening 2001, 4, 613–626.
(18) Gutiérrez-de Terán, H.; Åqvist, J. Linear interaction energy: method and applications in drug design; Springer, 2012; pp 305–323.
(19) Kollman, P. A.; Massova, I.; Reyes, C.; Kuhn, B.; Huo, S.; Chong, L.; Lee, M.; Lee, T.; Duan, Y.; Wang, W.; Donini, O.; Cieplak, P.; Srinivasan, J.; Case, D. A.; Cheatham, T. E. Calculating Structures and Free Energies of Complex Molecules: Combining Molecular Mechanics and Continuum Models. Acc. Chem. Res. 2000, 33, 889–897.
(20) Homeyer, N.; Stoll, F.; Hillisch, A.; Gohlke, H. Binding free energy calculations for lead optimization: assessment of their accuracy in an industrial drug design context. J. Chem. Theory Comput. 2014, 10, 3331–3344.
(21) Huang, D.; Caflisch, A. Efficient evaluation of binding free energy using continuum electrostatics solvation. J. Med. Chem. 2004, 47, 5791–5797.
(22) Zhou, T.; Huang, D.; Caflisch, A. Is quantum mechanics necessary for predicting binding free energy? J. Med. Chem. 2008, 51, 4280–4288.
(23) Roux, B.; Simonson, T. Implicit solvent models. Biophys. Chem. 1999, 78, 1–20.
(24) Lazar, A. I.; Biedermann, F.; Mustafina, K. R.; Assaf, K. I.; Hennig, A.; Nau, W. M. Nanomolar binding of steroids to cucurbit[n]urils: selectivity and applications. J. Am. Chem. Soc. 2016, 138, 13022–13029.
(25) Assaf, K. I.; Nau, W. M. Cucurbiturils: from synthesis to high-affinity binding and catalysis. Chem. Soc. Rev. 2015, 44, 394–418.
(26) Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174.
(27) Assaf, K. I.; Florea, M.; Antony, J.; Henriksen, N. M.; Yin, J.; Hansen, A.; Qu, Z.-w.; Sure, R.; Klapstein, D.; Gilson, M. K.; Grimme, S.; Nau, W. M. HYDROPHOBE Challenge: A Joint Experimental and Computational Study on the Host–Guest Binding of Hydrocarbons to Cucurbiturils, Allowing Explicit Evaluation of Guest Hydration Free-Energy Contributions. J. Phys. Chem. B 2017, 121, 11144–11162.
(28) Loftsson, T.; Duchene, D. Cyclodextrins and their pharmaceutical applications. Int. J. Pharm. 2007, 329, 1–11.
(29) Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.; Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.; Mackerell, A. D. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields. J. Comput. Chem. 2010, 31, 671–690.
(30) Muddana, H. S.; Gilson, M. K. Calculation of host–guest binding affinities using a quantum-mechanical energy model. J. Chem. Theory Comput. 2012, 8, 2023–2033.
Figure 1: Chemical structures of host molecules CB7, CB8, BCD, OAH and OAM (extra methyl groups are in yellow) used in this study.
Figure 2: Experimental vs calculated binding free energy values in aqueous solution for host-guest systems of the test set from the GAFF LIE model.
Figure 3: Experimental vs calculated binding free energy values in aqueous solution for CB[7/8]-steroid complexes (A) and the Muddana set (B) in this study.