Simulations versus Grid-based Methods Authors

given to the software providers who were free to use a different method for the protein preparation. All solvent mapping methods were used in their st...
4 downloads 10 Views 2MB Size
Subscriber access provided by Universitaetsbibliothek | Johann Christian Senckenberg

Article

Shedding Light on Important Waters for Drug Design: Simulations versus Grid-based Methods Denis Bucher, Pieter FW Stouten, and Nicolas Triballeau J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.7b00642 • Publication Date (Web): 28 Feb 2018 Downloaded from http://pubs.acs.org on March 1, 2018

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Shedding Light on Important Waters for Drug Design: Simulations versus Grid-based Methods Authors: Denis Bucher*,†, Pieter Stouten,‡ Nicolas Triballeau† † ‡

Galapagos SASU, 102 Avenue Gaston Roussel, 93230 Romainville, France Galapagos NV, Generaal De Wittelaan L11 A3, 2800 Mechelen, Belgium

*

Corresponding author: [email protected]

Abstract Water molecules play an important role in the association of drugs with their pharmaceutical targets. For this reason, calculating the energetic contribution of water is essential to make accurate predictions of compounds’ affinity and selectivity. Water molecules can also modify the binding mode of compounds by forming water bridges, or clusters, that stabilize a particular orientation of the ligand. Several computational methods have been developed for solvent mapping, but few studies have attempted to compare them in a drug design context. In this paper, four commercially available solvent mapping tools (SZMAP, WaterFLAP, 3D-RISM, WaterMap) are evaluated on three different protein targets. The methods were compared by looking at their ability to predict the SAR of lead compounds. All methods were found to be useful to some degree and to improve the predictions from docking alone. However, the only simulation-based approach tested, WaterMap, was found in some cases to be more accurate than grid-based methods.

Introduction Thermodynamic properties describe the affinity of drugs for their targets. For this reason, the accurate calculation of thermodynamic properties has long been a quest of computational chemistry. Unfortunately the physical processes underlying molecular recognition are complex, and the calculation of thermodynamic quantities from theory alone remains challenging. For most of the 20th century, medicinal chemists, and later computational chemists, have largely relied on empirical models to guide their synthetic efforts. But the recent development of better physical models, force fields, and dedicated tools to setup and analyze such calculations, has led to a regain of interest in the calculation of thermodynamic properties. As a consequence, the characterization of solvation patterns around drugs and binding sites has now become an important objective for molecular modelling in the industry.1–3 Hydrophobic surfaces have been estimated to cover ~75% of the surface drug sites,4 indicating that the hydrophobic effect is one of the main energetic contributions to ligand binding and recognition. Co-crystallized structures of protein-ligand complexes at high resolution have shed light on individual water molecules that play an important stabilizing effect, for instance, by creating bridges between the ligand and the receptor.5,6 A network of H-bond enriched water molecules around the ligand has also been associated with improved binding affinity.7–9 While it is not always possible to determine accurately the energetic contribution of individual water molecules to ligand binding, Dunitz provided an empirical upper bound of 7 cal/mol K (about 2 kcal/mol at 300 K) for the entropy cost of transferring a water molecule from the bulk to a binding site.10 Similar estimates for the enthalpy of water ordering gave −3.8 kcal/mol.11 In a hydrophobic binding site, most buried water molecules can be expected to be H-bond depleted and ligands that are able to displace these 1 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 16

enthalpically unstable water molecules are found to display improved potency.12,13 Various solvent mapping methods have been developed over the years with the goal of helping medicinal chemists and modelers identify stable and unstable (i.e. “happy” and “unhappy”) water molecules in correct locations. These methods aim to complement more rigorous computationally approaches such as free energy calculations.7,14,15 The ability to assign a free energy estimate to each water site is quite appealing, as it can help provide rapid interpretations of the SAR in terms of the displacement of individual water molecules. Because solvent mapping methods can differ greatly in their implementations, sampling techniques, and accuracy; in the present work, an attempt was made to perform an objective evaluation of four water models in a drug design context. Two major approaches that exist are based on either grid techniques or molecular dynamics (MD) simulations. Grid-based methods are faster because they do not try to generate all the possible multi-water configurations. The algorithm in these methods generally analyzes the best location for placing a single water probe. Their speed makes grid-based methods ideally suited for virtual screening (VS) where a large number of compounds must be ranked. Instead, methods based on MD simulations attempt to numerically generate all possible multi-water conformations, and use statistical analysis to find the lowest energy and therefore most relevant configurations.16 While simulationbased techniques are more computer-intensive, their superiority lies in their ability to describe accurately complex water networks, and they provide insight into the thermodynamic signature of water molecules.17 A common limitation of MD methods is that they can suffer from convergence problems when the simulation time is too short for the molecules to reach equilibrium (the sampling problem).16,18 In this paper, we compared three grid-based algorithms for solvent mapping (SZMAP,19,20 WaterFLAP,21,22 and 3D-RISM23,24), and one simulation-based algorithm (WaterMap)25,26. Because the free energy of individual water molecules is difficult to calculate reliably or measure experimentally, a direct, fully quantitative metric is not easily accessible.27 In a drug design context, the most important feature of a solvent mapping method is its ability to impact projects, which is related to its ability to assist the prioritization of synthetic ideas by providing clear pictures and predictions of the compound activities. Therefore, to carry out this investigation, we chose to focus on how solvent mapping methods can retrospectively rationalize trends observed in structure-activity relations (SAR). Three systems were included in this study: autotaxin and two kinase targets. These test systems were chosen as the SAR could not be rationalized in terms of direct interactions between the protein and the ligand, suggesting a role for water molecules. With this limited test set, our goal was not to provide definite conclusions about the merit and limitations of each solvent mapping method; however, it is hoped that it will encourage other groups to publish such studies and start a discussion. Results are presented separately for each system, and conclusions are drawn at the end of the manuscript.

Methods The following solvent mapping methods were considered: SZMAP, WaterFLAP, 3D-RISM, WaterMap. The theory behind these methods has been discussed in more details in a recent article.3 SZMAP19,20 was developed by OpenEye. The method works by computing on a grid an effective Poisson-Boltzmann potential. The energy is then evaluated by a single water probe molecule, which can sample multiple orientations. This helps account for some of the rotational and translational entropy. Due to the independent particle approximation, water is described as a fluid lacking 2 ACS Paragon Plus Environment

Page 3 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

correlation. In this study, the 1.2.0 version of SZMAP was used. WaterFLAP21,22 is another continuum solvent method, which was developed by Molecular Discovery. It is an extension of the GRID Molecular interaction fields,22,28 initially developed for virtual screening. The method contains several empirical parameters, and the energy is computed using a spherical water probe. 3DRISM23,24 is commercialized by the Chemical Computing Group. The method is formulated as integral equations, and derives the properties of water from the three-dimensional distribution of average densities obtained using the density functional theory of liquids. Compared to the independent particle approximation or continuum models, 3D-RISM produces an approximate average solvent distribution around a rigid solute, which attempts to incorporate molecular correlations in an effective way. WaterMap25,26 is commercialized by Schrodinger Inc., and is based on the inhomogeneous solvation theory (IST).29,30 The WaterMap tool analyzes the result of a short (2 ns) MD simulation in which the protein is typically held rigid and water molecules are able to enter and leave the binding pocket. A clustering algorithm is then applied to assign populations from MD simulations. The free energy of each water site is calculated by IST, which estimate the local average interaction energy contribution to water binding, as well as the entropic penalty due to increased ordering relative to the bulk. A related alternative to the WaterMap method has been proposed in the grid implementation of IST (GIST),31 which has been integrated into DOCK3.7.32 However, in the present study only WaterMap was considered. MD simulations run were performed with Desmond33 and the OPLS3 force field.34All methods were evaluated as a blind study directly by the software provider. The initial structures were prepared with Schrodinger’s Protein Preparation Wizard35 and given to the software providers who were free to use a different method for the protein preparation. All solvent mapping methods were used in their standard parameterization. Only structural information was communicated about the protein targets and their ligands, but without providing any activity data. In all cases, except WaterFLAP, the software company provided us with an evaluation license, which we used to: (1) calculate the maps and check that the results can be replicated by a computational chemist; and (2) evaluate the usability of the software, such as the speed and ease of use.

Results System A Autotaxin (system A) is a target in idiopathic pulmonary fibrosis (IPF) for which Galapagos has developed GLPG1690 (Figure 1B, ligand 2), a first-in-class inhibitor. In the FLORA Phase 2a study in IPF patients, Galapagos reported a halt in disease progression, target engagement, and favorable safety and tolerability. Galapagos also recently disclosed the co-crystallized X-ray structure of human autotaxin with ligand 2.36,37 Figure 1A shows the historical progression of the compounds leading to the discovery of ligand 2. The X-ray structure was resolved during the project but did not immediately translate into the design of more potent compounds. During the first half of the optimization process, 157 compounds were synthesized with 6 substitutions made on the thiazole moiety. At the time, the team did not have access to solvent mapping tools, and the first potentiating nitrile group on a thiazole substituent was discovered by a classical optimization process (Figure 1B). The introduction of a nitrile group was found to improve potency by an average of 0.84 log. Once it was discovered, about 1/3 of compounds featured a substitution on the thiazole, or at the ortho position of the nearby phenyl ring, including the final compound (ligand 2) selected for clinical trials. 3 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 4 of 16

Figure 1: (A) The addition of a nitrile group to Ligand 1 led to the more potent compound Ligand 2, (GLPG1690), which was progressed to clinical trials. (B) The scaffold of the autotaxin inhibitor series is shown with the position of the R2 and R3 substitutent where nitrile groups were found to increase potency. Retrospective analysis shows that as much as 27 examples exist where the addition of a nitrile group was beneficial, with no counterexamples of the transformation reducing affinity. (C) Timeline since the project initiation. Each data point corresponds to a compound tested in biochemical assay. The orange area groups compounds that were made prior to the resolution of the X-ray structure of autotaxin with a co-crystallized ligand. The blue area indicates that the compounds were synthetized after the X-ray structure was solved, but before the discovery of the first nitrile compound by a classical ligand optimization process. The remaining group indicates compounds with and without nitrile substituent that were made after the discovery of the first nitrile compound.

Despite having access to the X-ray structure, optimizing the compounds based on the X-ray structure alone was far from obvious. Figure 2A shows that there were many vectors for substitutions theoretically possible for this series, as shown by red arrows. A detailed look at the protein surface around the ligand shows an empty volume near the thiazole group that could be occupied by a water molecule (Figure 2B), but no water molecule could be seen in the pocket in the X-ray structure, whereas a stable crystal water was seen above the pocket (Figure 2C). The scoring function of the popular Glide docking tool also does not predict a gain in activity for nitrile compounds. A classical 4 ACS Paragon Plus Environment

Page 5 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

ligand optimization process was used in this case and led to the discovery of the nitrile substitutent. To understand this, MD simulations performed after the project was completed showed that the site can be transiently occupied by a water molecule, which is unable to make hydrogen bonds with the protein (Figure 2D), and therefore likely to be highly unstable. The unstable water provides a likely retrospective explanation for the impact of the nitrile group on the potency of the compounds.

Figure 2. (A) Autotaxin surface in the binding site. Red arrows indicate seven regions where the ligand could have been grown without clashing with the protein to illustrate that a substitution on the 5 position of the thiazole group was nonobvious. (B) A finer mesh on the protein surface reveals an empty volume that could accommodate a water molecule near the thiazole. (C) The electron density in the crystallographic direct map (2Fo-Fc@1σ) does not show the presence of a water molecule near the 5 position. (D) MD simulations ran for 50 ns show the existence of a transient water site. The water is not able to make any hydrogen bonds with neighboring protein residues.

The next step in this study was to compare the actual experimental results to predictions from commercial software that try to identify H-bond depleted water molecules, providing useful information for improving the potency of the compounds. Because this study was set up as a blind test, we received pictures of the binding site water molecules directly from software companies in different formats (Figure 3). Recommendations from software providers were communicated orally through online presentations.

5 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 6 of 16

Figure 3. Screen captions of the autotaxin binding site with solvent mapping calculations from different software. The water molecule singled out by a yellow circle is predicted to be responsible for the large shift in potency observed experimentally, when a cyano group is added to the ligand. Scores on individual water molecules are expressed in kcal/mol. (A) SZMAP apo (ligand-free) hydrophobic map overlaid with ligand. Hydrophobic waters are shown as pink stars and labelled with their hydrophobicity score. (B) WaterFLAP apo hydrophobic map overlaid with ligand. Hydrophobic regions are shown in yellow. Water molecules are displayed as spheres, with larger spheres indicating that a water molecule is highly hydrophobic. (C) 3D-RISM apo free energy map, showing regions of possible hydration sites. Red waters are predicted more unstable than bulk, and green waters are predicted more stable than bulk. (D) Watermap apo map overlaid with the ligand. The free energy density is shown as a gradient from green to red, corresponding to regions of increasingly unstable waters. Large spheres show the positions of predicted hydration sites, with their associated free energy score. Red waters are unstable and green waters are more stable than bulk. A ball and stick presentation is used to show the position of autotaxin natural ligand, lysophosphatidic acid (18:1 LPA), as resolved in the 3NKR.pdb structure.

Despite differences in graphical depictions, the computational results were in line with the experimental observations. In particular, all methods indicated the presence of a high energy water near the 5 position of the thiazole moiety. However, some methods were better than others in terms of the clarity of the results. In Figure 3A, the SZMAP apo hydrophobic map is shown. It highlights the water as the most hydrophobic in the binding site, and suggests that its enthalpic binding contribution is unfavorable, with a score of 2.35. This indicates that displacing the water could improve the ligand potency. However, several other water molecules also display high hydrophobic scores between +1 and +2. This result suggests that growing the compound in the direction of other high energy waters could also improve ligand binding, in similar amounts, which is not consistent with the observed SAR in this project (data not shown). . In Figure 3B, the WaterFLAP results are shown. Here the hydrophobic water near the thiazole was not ranked among the top 3 most hydrophobic waters. Instead other 6 ACS Paragon Plus Environment

Page 7 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

suggestions were made by the software provider to grow the ligand into a region of unstable water molecules, as shown by purple circles in Figure 3B. The 3D-RISM result is shown in Figure 3C. As with WaterFLAP, the water of interest did not have a free energy score that would place it in the top 5 of water molecules to replace, and therefore, 3D-RISM is not in line with the observation that the best water to displace is near the thiazole group. Finally, the WaterMap results are shown in Figure 3D. The apo map highlights the water near the thiazole group as the highest free energy, or most unstable, water near the ligand. In Figure 3D, it is shown that a continuous chain of H-bond depleted watermolecules is observed surrounding the ligand. These unstable waters are able to trace out the path of a lipid substrate found in the X-ray structure 3NKR.pdb. This strongly suggests that the displacement of water molecules participates in the natural ligand association process with the enzyme. Going back to our series, the thiazole moiety is found to miss the most unstable water molecule in the chain, which is indeed displaced by addition of a nitrile group. Both the apo (ligand-free) and holo (ligand-bound) WaterMaps are instructive (Figure 4). In the apo map, the ligand is shown to displace a high energy water, estimated at a free energy 7.5 kcal/mol higher than bulk. This may indicate that WaterMap overestimates the water positive free energy, since the free energy measured experimentally for the nitrile addition is on average only 1.0 kcal/mol. However, it is also possible that the free energy of the unstable water in the apo map does not fully translate into improving ligand binding, because the presence of the ligand can lead to distortions in the water network, and to the creation of new unstable waters. An observation that WaterMap may overestimate the water positive free energy was made by Graves et al., who have suggested that the neglect of some high order entropic terms, such as the effect of the solute on the water-water correlation contributions,26 could be a limitation in the current implementation of WaterMap that may explain an overestimate of the positive free energy of isolated waters.3 A similar observation has been reported by Lee and co-workers when replacing a water predicted by WaterMap to be unstable by +9.1 kcal/mol, using a chemical modification in their lead compound.38 The binding affinity was improved from 4.6 to 0.2 nM, which corresponds to a free energy of only 1.64 kcal/mol. Despite this possible limitation of WaterMap, the qualitative picture given for autotaxin is still very valuable for ligand design. WaterMap does not appear to overestimate all waters in the apo map, but rather, to exaggerate the positive free energy of the most unstable water, which would have been sufficient for guiding the chemistry effort in the right direction. The apo map should preferably be interpreted together with the holo map. In the holo map, the high energy water was not observed and a vacuum bubble appeared next to the ligand. The lack of a recognized site for a water molecule was further investigated, as it could be caused by a convergence problems with the default WaterMap simulation length (2 ns), and the restraints applied to the protein backbone structure. The picture given by longer unrestraint MD simulations (50 ns) was found to be consistent with the WaterMap view. Although, a water molecule was able to enter and leave the pocket, it was never able to stay inside the pocket for more than a few picoseconds, indicating that a vacuum bubble may be a better description for this site in the presence of the ligand. According to this view, the ligand without a nitrile group does not just miss opportunities for binding; it sets up a feature – a vacuum bubble – that opposes binding. This rationalization is consistent with the experimental observation that the addition of a nitrile leads to a gain in potency, either at 5 position of the thiazole or at the ortho position on the phenyl ring.

7 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 8 of 16

Figure 4. Autotaxin WaterMap. A) The apo map is shown with waters that do not overlap with the ligand. Stable waters are shown in green and unstable waters in red. B), and C), Zooming in on the thiazole group, and comparing both the apo and holo WaterMap, shows that a vacuum bubble (opaque surface) is created in the presence of the ligand, which is a feature that opposes binding. The ligand surface is shown as a mesh.

System B For this kinase system, a pair of ligands was studied for which a 5-fold increase in activity was observed between the R and the S stereoisomers. The isopropyl region of the ligand, which varies in the two poses (Figure 5), displays no direct interactions with the protein and is solvent-exposed. Therefore, solvation effects were identified as a likely explanation for the difference in affinity.

8 ACS Paragon Plus Environment

Page 9 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 5. (A) System B kinase with ligand shown as blue surface. (B) Difference between the S and R stereoisomers which is causing an increase in activity by more than 5-fold. (C) Position of the stereocenter (indicated by a star), in a region that is solvent exposed, as shown here by MD simulations with a truncated compound (methylamide). (D) Overlay between the modeled pose for the R (brown) and S (green) stereoisomers in the binding site.

Software providers were asked to rank the two compounds. With the exception of 3D-RISM, all methods correctly identified the S stereoisomer as more potent due to solvation effects. The SZMAP result (Figure 6A) shows the existence of a water in a highly hydrophobic site that is replaced by the ammonium. A second hydrophobic water is displaced by the isopropyl of the S isomer, but not by the R isomer. Similarly, the WaterFLAP result shows the presence of a water in a hydrophobic site, as a yellow sphere. The presence and size of the sphere indicates it is one of the most interesting opportunities in the binding site for displacing a water molecule (Figure 6B). Results from 3D-RISM were less convincing as both R and S stereoisomers were seen to displace a high energy water, and it is unclear from the map which stereoisomer is supposed to be more stable. Finally, the WaterMap calculation is in agreement with the SZMAP and WaterFLAP calculations. It shows the presence of an additional unstable water that is displaced by the S isomer, and not displaced by the R isomer. The estimated free energy for displacing the water is 1.2 kcal/mol, which is in the range of the experimental result (0.88 kcal/mol for a 5.4-fold increase). During MD simulations the water is unable to make any hydrogen bonds with the protein, and thermodynamic partitioning indicates that it is unstable both for enthalpic and entropic reasons (∆H=0.42 and -T∆S=0.80 kcal/mol).

9 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 10 of 16

Figure 6. Screen captions of solvent mapping calculations for the System B kinase with different software: (A) SZMAP apo hydrophobic map overlaid with the isopropyl part of the ligand, showing both R and S stereoisomers. Hydrophobic sites for water are shown in purple. The star in the SZMAP depiction indicates several possible orientations for the water dipole moment. (B) WaterFLAP apo hydrophobic map overlaid with the ligand. Highly hydrophobic sites are shown in yellow, and hydrophilic sites are in light blue. The size of the sphere is proportional to the energy of the site. (C) 3D-RISM apo free energy map overlaid with the ligand. Red regions indicate high free energy (unstable) locations for water, while green regions indicate stable low free energy locations. (D) WaterMap apo map overlaid with ligand for the R and S stereoisomers. All water spheres are green, as no color code was used.

System C A second kinase system was used to test the four solvent mapping methods. Here, the goal was to retrospectively explain the improvement in potency that had been achieved in this series during lead optimization, and rank 4 compounds in a congeneric series. Chemistry efforts have shown that relatively small variations in the substituents are able to provide a >300-fold gain in inhibitory potency. Inspection of in-house X-ray structures indicated that the side chains do not directly interact with the protein, suggesting that solvation effects are likely to be involved.

10 ACS Paragon Plus Environment

Page 11 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Figure 7. (A) Four ligands found during ligand optimization showing increasing potency thanks to favorable solvation effects. (B) Ranking of the ligand with Glide SP and WM/MM scoring (WaterMap). The error bars on the Glide SP scores were calculated by averaging the scores for the top 5 poses. It was found that the noise of the method is larger than the differences between compounds. Moreover, taking the scores of the best pose for each compound did not correctly rank the compounds. In contrast, WM/MM scoring (WaterMap) was able to capture the solvent effect and gave a correct ranking of the compounds.

Qualitatively all methods succeeded in providing some retrospective explanation for the improved potency of the best compounds (data not shown). But only WaterMap proved sufficiently accurate to allow for the correct ranking of the four ligands displayed in Figure 7. In this case, conventional scoring with Glide SP was also insufficient to correctly rank the compounds. To provide more insights into the value of including energies from explicit waters, a comparison was made with the implicit solvent MM-GBSA method. Residues around 5 Ang of the ligand were allowed to relax during the calculation, as it improved the correlation between MM-GBSA and experiments, when looking at a 35 compounds dataset from the project (R2=0.38, versus R2=0.13, for a rigid protein). On the same 35 compounds test set, Glide SP did not show any correlation (R2=0.01), and WM/MM 11 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 12 of 16

scoring showed only a slight improvement over the docking score (R2=0.17). However, even with the addition of flexibility in the protein, MM-GBSA was unable to provide a correct ranking for the 4 compounds of interest, as the last 2 compounds were inverted with values of (-27.1/-42.2/-54.3/-48.7 kcal/mol). Only WM/MM scoring and FEP could correctly rank the 4 compounds. This dataset suggests that, while MM-GBSA may be a better method for the general purpose ranking of diverse compounds, WM/MM scoring performs best when ranking compounds from a congeneric series. A similar promising result of WM/MM scoring has been reported in the literature for a congeneric series binding to the A2A Receptor.39

Discussion and Conclusions The goal of this study was to assess four different solvent mapping tools that have been developed for drug discovery. However, the study does not attempt to provide a definite answer as to which algorithm should be used, since this would require a much larger test set, and will ultimately strongly depend on the user’s goals and acquaintance with a particular method. In this paper, the merits of solvent mapping tools were evaluated on real examples from drug design in the industry. On this set of examples, it was found that solvent mapping tools are helpful to address optimization problems. As shown by our examples, the tools can be used to identify the most promising vectors for growing the molecules, and can assist the interpretation of counter-intuitive SAR. In the case of autotaxin, both WaterMap and grid-based methods provided a qualitatively correct picture of a high energy water molecule near the ligand, in line with experimental results. Thus, from this example, it is clear that any solvent mapping tool is better than no tool at all. In all the examples given, it was found that the clarity of the calculated water maps is important. In that aspect, our preference goes to both WaterFLAP and WaterMap in terms of clarity and readability. MD simulations take longer to compute (hours rather than minutes), but this approach was found to be generally worthwhile, even in an industrial setting where timelines tend to be short. In the last example, system C, a correct prediction was obtained for the ranking of compounds, which could directly assist synthetic prioritization. Despite this good result, WaterMap was not aimed at becoming a quantitative method for lead optimization because it only calculates the solvent contribution to the free energy. Other methods such as alchemical free energy calculations are now better suited for this task.40 But as a solvent mapping method, the accuracy of WaterMap appears very satisfactory, and in terms of user-experience, WaterMap was also found to be superior to current grid-based methods in that it provided a clearer picture of the water network. In the WaterMap calculations presented here, the protein was held rigid during MD simulations and only a few terminal rotors – mainly hydroxyls – were allowed to move. Restraints are important to avoid averaging the calculated thermodynamic quantities over several protein conformations, which would lower the interpretability of the water maps. However, despite the benefits of restraints, their use can become problematic for many hydration sites, where the number and strength of water-solute H-bonds may be over- or underestimated due to conformational variability of the acceptor and/or donor orientations/positions over time that is not accounted for with fixed degrees of solute freedom. In addition, some systems require a water molecule, not present during the initial setup, to reach buried locations that are not accessible without a conformational change in the protein. A recent development to address this problem is the use of Grand Canonical Monte Carlo (GCMC) simulations to help water molecules reach buried sites inside the protein. 12 ACS Paragon Plus Environment

Page 13 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

Graphic processing units (GPUs) may also be used in the future to accelerate MD simulations and better sample water sites. We hope to see more interest in these methods in the future, as well as other studies that try to compare solvent mapping using real examples of drug design.

Acknowledgments We thank Simon Cross (Molecular Discovery), Gunther Stahl (OpenEye), Markus Kossner (CCG), Daniel D. Robinson (Schrödinger), and Nathalie Lacoste (Schrödinger), for their assistance with the calculations presented. We would also like to show our gratitude to Cornel Catana and Miriam Lopez Ramos for many discussions on the role of water in drug design, and for suggesting one of the kinase project (system B) selected. We thank Nicolas Desroy for providing us with details on the autotaxin project, and Raphael Geney and Gregory Bar for carefully reading the manuscript.

References (1)

(2) (3)

(4)

(5)

(6)

(6)

(8)

(9)

(10) (11)

Mason, J. S.; Bortolato, A.; Weiss, D. R.; Deflorian, F.; Tehan, B.; Marshall, F. H. High End GPCR Design: Crafted Ligand Design and Druggability Analysis Using Protein Structure, Lipophilic Hotspots and Explicit Water Networks. Silico Pharmacol. 2013, 1, 23. Bodnarchuk, M. S. Water, Water, Everywhere… It’s Time to Stop and Think. Drug Discov. Today 2016, 21 (7), 1139–1146. Graves, A. P.; Wall, I. D.; Edge, C. M.; Woolven, J. M.; Cui, G.; Le Gall, A.; Hong, X.; Raha, K.; Manas, E. S. A Perspective on Water Site Prediction Methods for Structure Based Drug Design. Curr. Top. Med. Chem. 2017, 17 (23), 2599–2616. Snyder, P. W.; Lockett, M. R.; Moustakas, D. T.; Whitesides, G. M. Is It the Shape of the Cavity, or the Shape of the Water in the Cavity? Eur. Phys. J. Spec. Top. 2014, 223 (5), 853– 891. Lu, Y.; Wang, R.; Yang, C.-Y.; Wang, S. Analysis of Ligand-Bound Water Molecules in HighResolution Crystal Structures of Protein−Ligand Complexes. J. Chem. Inf. Model. 2007, 47 (2), 668–675. García-Sosa, A. T. Hydration Properties of Ligands and Drugs in Protein Binding Sites: Tightly-Bound, Bridging Water Molecules and Their Effects and Consequences on Molecular Design Strategies. J. Chem. Inf. Model. 2013, 53 (6), 1388–1405. Michel, J.; Tirado-Rives, J.; Jorgensen, W. L. Energetics of Displacing Water Molecules from Protein Binding Sites: Consequences for Ligand Optimization. J. Am. Chem. Soc. 2009, 131, 15403–15411. Krimmer, S. G.; Cramer, J.; Betz, M.; Fridh, V.; Karlsson, R.; Heine, A.; Klebe, G. Rational Design of Thermodynamic and Kinetic Binding Profiles by Optimizing Surface Water Networks Coating Protein-Bound Ligands. J. Med. Chem. 2016, 59 (23), 10530–10548. Snyder, P. W.; Mecinović, J.; Moustakas, D. T.; Thomas, S. W.; Harder, M.; Mack, E. T.; Lockett, M. R.; Héroux, A.; Sherman, W.; Whitesides, G. M. Mechanism of the Hydrophobic Effect in the Biomolecular Recognition of Arylsulfonamides by Carbonic Anhydrase. Proc. Natl. Acad. Sci. 2011, 108 (44), 17889–17894. Dunitz, J. D. The Entropic Cost of Bound Water in Crystals and Biomolecules. Science 1994, 264 (5159), 670–671. Ladbury, J. E. Just Add Water! The Effect of Water on the Specificity of Protein-Ligand Binding Sites and Its Potential Application to Drug Design. Chem. Biol. 1996, 3 (12), 973– 980.

13 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

(12)

(13) (14)

(15) (16) (17) (18)

(19) (20)

(21)

(22) (23) (24) (25)

(26)

(27)

(28)

(29) (30) (31)

(32)

(33)

Page 14 of 16

Murray, C. W.; Carr, M. G.; Callaghan, O.; Chessari, G.; Congreve, M.; Cowan, S.; Coyle, J. E.; Downham, R.; Figueroa, E.; Frederickson, M.; et al. Fragment-Based Drug Discovery Applied to Hsp90. Discovery of Two Lead Series with High Ligand Efficiency. J. Med. Chem. 2010, 53 (16), 5942–5955. Setny, P.; Baron, R.; McCammon, J. A. How Can Hydrophobic Association Be Enthalpy Driven? J. Chem. Theory Comput. 2010, 6 (9), 2866–2871. Wade, R. C.; Mazor, M. H.; McCammon, J. A.; Quiocho, F. A. A Molecular Dynamics Study of Thermodynamic and Structural Aspects of the Hydration of Cavities in Proteins. Biopolymers 1991, 31 (8), 919–931. Ross, G. A.; Bodnarchuk, M. S.; Essex, J. W. Water Sites, Networks, And Free Energies with Grand Canonical Monte Carlo. J. Am. Chem. Soc. 2015, 137 (47), 14930–14943. Miao, Y.; Baudry, J. Active-Site Hydration and Water Diffusion in Cytochrome P450cam: A Highly Dynamic Process. Biophys. J. 2011, 101 (6), 1493–1503. de Beer, S.; Vermeulen, N.; Oostenbrink, C. The Role of Water Molecules in Computational Drug Design. Curr. Top. Med. Chem. 2010, 10 (1), 55–66. Bernardi, R. C.; Melo, M. C. R.; Schulten, K. Enhanced Sampling Techniques in Molecular Dynamics Simulations of Biological Systems. Biochim. Biophys. Acta BBA - Gen. Subj. 2015, 1850 (5), 872–877. SZMAP, 1.0.0; OpenEye Scientific Software Inc.: Santa Fe, NM, USA, 2011. Bayden, A. S.; Moustakas, D. T.; Joseph-McCarthy, D.; Lamb, M. L. Evaluating Free Energies of Binding and Conservation of Crystallographic Waters Using SZMAP. J. Chem. Inf. Model. 2015, 55 (8), 1552–1565. Pastor, M.; Cruciani, G.; Watson, K. A. A Strategy for the Incorporation of Water Molecules Present in a Ligand Binding Site into a Three-Dimensional Quantitative Structure- Activity Relationship Analysis. J. Med. Chem. 1997, 40 (25), 4089–4102. Cross, S.; Cruciani, G. Molecular Fields in Drug Discovery: Getting Old or Reaching Maturity? Drug Discov. Today 2010, 15 (1–2), 23–32. Beglov, D.; Roux, B. An Integral Equation to Describe the Solvation of Polar Molecules in Liquid Water. J. Phys. Chem. B 1997, 101 (39), 7821–7826. Truchon, J.-F.; Pettitt, B. M.; Labute, P. A Cavity Corrected 3D-RISM Functional for Accurate Solvation Free Energies. J. Chem. Theory Comput. 2014, 10 (3), 934–941. Young, T.; Abel, R.; Kim, B.; Berne, B. J.; Friesner, R. A. Motifs for Molecular Recognition Exploiting Hydrophobic Enclosure in Protein–ligand Binding. Proc. Natl. Acad. Sci. 2007, 104 (3), 808–813. Abel, R.; Young, T.; Farid, R.; Berne, B. J.; Friesner, R. A. Role of the Active-Site Solvent in the Thermodynamics of Factor Xa Ligand Binding. J. Am. Chem. Soc. 2008, 130 (9), 2817– 2831. Huggins, D. J.; Payne, M. C. Assessing the Accuracy of Inhomogeneous Fluid Solvation Theory in Predicting Hydration Free Energies of Simple Solutes. J. Phys. Chem. B 2013, 117 (27), 8232–8244. Reynolds, C. A.; Wade, R. C.; Goodford, P. J. Identifying Targets for Bioreductive Agents: Using GRID to Predict Selective Binding Regions of Proteins. J. Mol. Graph. 1989, 7 (2), 103–108. Lazaridis, T. Inhomogeneous Fluid Approach to Solvation Thermodynamics. 1. Theory. J. Phys. Chem. B 1998, 102 (18), 3531–3541. Li, Z.; Lazaridis, T. Thermodynamic Contributions of the Ordered Water Molecule in HIV-1 Protease. J. Am. Chem. Soc. 2003, 125 (22), 6636–6637. Nguyen, C. N.; Young, T. K.; Gilson, M. K. Grid Inhomogeneous Solvation Theory: Hydration Structure and Thermodynamics of the Miniature Receptor Cucurbit[7]Uril. J. Chem. Phys. 2012, 137 (4), 044101. Balius, T. E.; Fischer, M.; Stein, R. M.; Adler, T. B.; Nguyen, C. N.; Cruz, A.; Gilson, M. K.; Kurtzman, T.; Shoichet, B. K. Testing Inhomogeneous Solvation Theory in Structure-Based Ligand Discovery. Proc. Natl. Acad. Sci. U. S. A. 2017, 114 (33), E6839–E6846. Bowers, K. J.; Chow, E.; Xu, H.; Dror, R. O.; Eastwood, M. P.; Gregersen, B. A.; Klepeis, J. L.; Kolossvary, I.; Moraes, M. A.; Sacerdoti, F. D.; et al. Scalable Algorithms for Molecular 14 ACS Paragon Plus Environment

Page 15 of 16 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Journal of Chemical Information and Modeling

(34)

(35)

(36)

(37)

(38)

(39)

(40)

Dynamics Simulations on Commodity Clusters. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing; SC ’06; ACM: New York, NY, USA, 2006. Harder, E.; Damm, W.; Maple, J.; Wu, C.; Reboul, M.; Xiang, J. Y.; Wang, L.; Lupyan, D.; Dahlgren, M. K.; Knight, J. L.; et al. OPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and Proteins. J. Chem. Theory Comput. 2016, 12 (1), 281–296. Sastry, G. M.; Adzhigirey, M.; Day, T.; Annabhimoju, R.; Sherman, W. Protein and Ligand Preparation: Parameters, Protocols, and Influence on Virtual Screening Enrichments. J. Comput. Aided Mol. Des. 2013, 27 (3), 221–234. Joncour, A.; Desroy, N.; Housseman, C.; Bock, X.; Bienvenu, N.; Cherel, L.; Labeguere, V.; Peixoto, C.; Annoot, D.; Lepissier, L.; et al. Discovery, Structure–Activity Relationship, and Binding Mode of an Imidazo[1,2-a]Pyridine Series of Autotaxin Inhibitors. J. Med. Chem. 2017, 60 (17), 7371–7392. Desroy, N.; Housseman, C.; Bock, X.; Joncour, A.; Bienvenu, N.; Cherel, L.; Labeguere, V.; Rondet, E.; Peixoto, C.; Grassot, J.-M.; et al. Discovery of 2-[[2-Ethyl-6-[4-[2-(3Hydroxyazetidin-1-Yl)-2-Oxoethyl]Piperazin-1-Yl]-8-Methylimidazo[1,2-a]Pyridin-3Yl]Methylamino]-4-(4-Fluorophenyl)Thiazole-5-Carbonitrile (GLPG1690), a First-in-Class Autotaxin Inhibitor Undergoing Clinical Evaluation for the Treatment of Idiopathic Pulmonary Fibrosis. J. Med. Chem. 2017, 60 (9), 3580–3590. Lee, K. L.; Ambler, C. M.; Anderson, D. R.; Boscoe, B. P.; Bree, A. G.; Brodfuehrer, J. I.; Chang, J. S.; Choi, C.; Chung, S.; Curran, K. J.; et al. Discovery of Clinical Candidate 1{[(2S,3S,4S)-3-Ethyl-4-Fluoro-5-Oxopyrrolidin-2-Yl]Methoxy}-7-Methoxyisoquinoline-6Carboxamide (PF-06650833), a Potent, Selective Inhibitor of Interleukin-1 Receptor Associated Kinase 4 (IRAK4), by Fragment-Based Drug Design. J. Med. Chem. 2017, 60 (13), 5521–5542. Higgs, C.; Beuming, T.; Sherman, W. Hydration Site Thermodynamics Explain SARs for Triazolylpurines Analogues Binding to the A2A Receptor. ACS Med. Chem. Lett. 2010, 1 (4), 160–164. Mobley, D. L.; Klimovich, P. V. Perspective: Alchemical Free Energy Calculations for Drug Discovery. J. Chem. Phys. 2012, 137 (23), 230901.

15 ACS Paragon Plus Environment

Journal of Chemical Information and Modeling 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 16 of 16

For Table of Contents Only (TOC):

16 ACS Paragon Plus Environment