COSMO Scoring Function at the DFTB3-D3H4 Level: Unique

We have recently introduced the “SQM/COSMO” scoring function which combines a semiempirical quantum mechanical description of noncovalent interact...
0 downloads 0 Views 942KB Size
Letter pubs.acs.org/jcim

SQM/COSMO Scoring Function at the DFTB3-D3H4 Level: Unique Identification of Native Protein−Ligand Poses Adam Pecina,† Susanta Haldar,† Jindřich Fanfrlík,† René Meier,‡ Jan Ř ezác,̌ † Martin Lepšík,† and Pavel Hobza*,†,§ †

Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic, v.v.i., Flemingovo nám. 2, 16610 Prague 6, Czech Republic ‡ Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany § Regional Centre of Advanced Technologies and Materials, Palacký University, 77146 Olomouc, Czech Republic S Supporting Information *

to the crystal) can be reproduced in up to 80% of PL cases.4,6−8 However, classical scoring functions (SF) often fail to identify the correct binding mode as the best scoring pose (especially for metalloproteins, halogenated ligands, inorganic ligands etc.).4 Thus, the reliable identification of the native PL pose for a diverse set of PL complexes by using a single SF still remains a challenging task nowadays.3,4,9 All SFs, although based on different approaches, encounter problems.10 For a reliable description of PL noncovalent interactions, the general solution is the use of quantum mechanics (QM) which adds the description of quantum phenomena, such as charge transfer, which play roles in e.g. halogen bonding. However, QM methods have much higher computational requirements and scale unfavorably with system size. Despite these limitations, the rewards of QM in PL scoring are such that it is used in various setups to increase speed (e.g., fragmentation or QM/MM treatment).11−16 A good compromise between speed and accuracy was found in semiempirical QM (SQM) methods; AM1 method combined with empirical dispersion (D) and Poisson−Boltzmann (PB) implicit solvent made the first SQM-based SF,17 which proved advantageous especially for metalloprotein binding.18 However, the accuracy of the AM1-D method is not sufficient for general use. We have thus developed corrections for dispersion, hydrogen- and halogenbonding in SQM methods; the combination with the PM6 method19 yielded PM6-D3H4X20−22 which gives highly accurate interaction energies in diverse classes of systems.23 Our PM6-D3H4X-based SF24,25 approximates the PL binding affinity as a sum of the gas-phase interaction energy (ΔEint), the interaction solvation/desolvation free energy (ΔΔGsolv), the change of the conformational “free” energy (ΔG′confw), and the entropy change upon binding (−TΔS). The SQM SF is applicable to the wide chemical space without additional parametrization.23 The generality of our SF was shown for variety of PL complexes.24−29 The QM nature of our SF enabled us to extend it for covalent PL binding.30 We have recently simplified the SQM-based SF for virtual screening based on our experience. This novel scheme, “SQM/ COSMO SF”, features two dominant terms ΔEint, and ΔΔGsolv

ABSTRACT: We have recently introduced the “SQM/ COSMO” scoring function which combines a semiempirical quantum mechanical description of noncovalent interactions at the PM6-D3H4X level and the COSMO implicit model of solvation. This approach outperformed standard scoring functions but faced challenges with a metalloprotein featuring a Zn2+···S− interaction. Here, we invoke SCC-DFTB3-D3H4, a higher-level SQM method, and observe improved behavior for the metalloprotein and high-quality results for the other systems. This method holds promise for diverse protein−ligand complexes including metalloproteins.

V

irtual screening, as a computational counterpart to experimental high-throughput screening, has been used since 1990 to search for new drugs. The major techniques involved are molecular docking which generates ligand binding poses and scoring which ranks them by predicted affinities. Due to the extreme pressure on the timing, docking/scoring methods have utilized crude approximations which in turn hampered their usefulness.1 However, due to recent methodological advances, docking/scoring have been successful in discovering dozens of new protein ligands.2 The first task of docking/scoring is the prediction of native ligand pose in protein−ligand (PL) complexes. This so-called “Docking Power”3 or “Sampling Power”4 has been repeatedly tested for dozens of docking/scoring programs across diverse PL complexes.5 The successes are fair in general − native ligand binding modes (defined as poses with RMSD < 2 Å as compared © 2017 American Chemical Society

Published: January 3, 2017 127

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132

Letter

Journal of Chemical Information and Modeling and neglects expensive SQM optimizations.31 The CPU demands of SQM/COSMO SF are significantly reduced. ΔEint describes various types of gas-phase PL noncovalent interactions, whose reliable description is a challenging task, when most of computational methods give significant systematic and random errors.32 We have been successful with the PM6D3H4X method.24−27,29 Here, we further increase the accuracy of the SQM/COSMO SF by employing another promising method, self-consistent charges density functional tight-binding (SCC-DFTB, abbreviated here as DFTB) scheme augmented with the D3H4 dispersion and hydrogen-bonding corrections.21,33 The main advantage of DFTB is that it uses integrals precalculated using density functional theory (DFT) and the remaining empirical parameters are fitted to DFT or high-level calculations. This makes the method more robust and accurate in comparison with other SQM methods. Additionally, DFTB should in principle be more efficient than the classical SQM methods (such as PM6). However, the implementation used34 is several times more demanding than PM6 in MOPAC with the linear scaling algorithm.35,36 A linear-scaling and massively parallelizable procedure for DFTB method is actively being developed.37,38 For more details on the approximations involved in various SQM methods and in DFTB and their relevance to the description of noncovalent interactions, we refer the reader to a recent review by Christensen et al.39 Similarly to other SQM methods, DFTB itself does not describe noncovalent interactions well enough for drug design applications. Again, London dispersion is missing and hydrogen bonds are seriously underestimated. These issues can be addressed by additional corrections. There are multiple options available in the DFTB framework and we have recently examined them carefully in order to select the best setup for practical use. On the basis of this survey, we have chosen the empirical D3H4X corrections which can be easily integrated with DFTB, provide high accuracy, and are reasonably robust for application to biomolecules.38 It must be noted that due to its empirical nature, the transferability of the H4 correction is limited, but the studied systems lie within the chemical space it was parametrized for. Compared to the original work introducing the D3H4X corrections,21 we now use the most recent version of the DFTB including the third-order terms and the 3OB SlaterKoster parameters developed for it.40−43 The 3OB set now covers not only elements present in organic molecules42 but also halogens and some alkali, alkaline-earth, and transition metals,40,42−45 which are needed for modeling biomolecular systems. This setup was named DFTB3 (note that the standalone DFTB3 includes also damping of the gamma function for hydrogen43 which we do not use here because it is replaced by the empirical corrections). The change of the DFTB parameters required a reparametrization of the D3H4X corrections. The parametrization procedure followed exactly the protocol described in ref 21. Both corrections are parametrized on the relevant subsets of the S66x8 data set46 of dissociation curves. The main advantage of DFTB in the context of this work is its robustness when dealing with less common interaction motifs for which no specific corrections are being used. More detailed analysis of the performance of various DFTB-based methods with corrections for noncovalent interactions including DFTB3D3H4 is being published separately.38 ΔΔGsolv usually gives large positive values which partially compensate ΔEint. We have shown47 that COSMO implicit solvent model48 provides more accurate solvation free energies than generalized Born (GB; igb =1).49 Although there is charge-

dependent PB model implemented in CHARMM,50 it is extremely time-consuming and impractical. Here, the SQM/ COSMO SF is computed as a sum of the ΔEint at the DFTB3D3H4 level and the ΔΔGsolv term evaluated at the COSMO48 level from PM6/COSMO calculation using MOPAC.35 The default parameters provided by MOPAC are used. Optimized atomic radii48 are used where available, for other elements including transition metals, van der Waals radii are scaled by the default factor of 1.17. ΔEint and ΔΔGsolv terms do not correlate because they are due to different physicochemical properties and thus can be combined. This enables us to compare solvent models, i.e. also DFTB3-D3H4 ΔEint combined with GB from AMBER/GB49 calculations. The new variant of the SQM/COSMO SF is assessed by its ability to identify the native pose in four challenging PL complexes from our previous study.31 In that pilot study, we have cautiously selected these four unrelated PL complexes that feature difficult-to-handle noncovalent interactions. Their crystal structures were determined at reasonable resolution and the ligand electron density was well distinguishable. Acetylcholinesterase (AChE, PDB: 1E66)51 represents the systems with two binding pockets and halogenated ligand; TNF-α converting enzyme (TACE, PDB: 3B92)52 is a metalloprotein having Zn2+ coordinated by S− of the ligand and also three water molecules in the binding site; Aldose reductase (AR, PDB: 2IKJ)53 represents the protein with the cofactor in the active site; and HIV-1 protease (HIV PR, PDB: 1NH0)54 has a large, flexible, and charged ligand together with a structural water molecule in the binding site. All the studied systems were prepared carefully (for details about structure preparation and protonation states see the SI). Previously, we generated for these systems a set of 2865 nonredundant poses using four popular docking programs − Glide (ver. 60014, mmshare 23014), PLANTS (ver. 1.2), AutoDock Vina (MGLTools 1.5.7rc1), and GOLD (ver. 5.1, cppbuild).31 All the studied geometries are now provided in the SI. These docking poses were rescored by seven widely used SFs and compared to two physics-based SFs. The classical SFs were empirical, regression-based functions Glide XP,55 PLANTS PLP,56 AutoDock Vina,57 Chemscore (CS),58 Goldscore (GS),59 and ChemPLP56 which use different terms to describe vdW contacts, lipophilic surface coverage, hydrogen bonding, ligand strain, desolvation or metal interaction. The second category was knowledge-based SF represented by Astex Statistical Potential (ASP).60 ASP uses an atom−atom distance potential derived from a database of PL complexes and incorporates some Chemscore terms. For the seven standard SFs we used recommended protocols. The two physics-based SFs were represented by AMBER/GB which combines the ff03-GAFF MM force fields with GB implicit solvent49,61,62 and our SQM/ COSMO SF at the PM6-D3H4X level.31 Here, we add more robust variant of SQM/COSMO at the DFTB3-D3H4 level.21,33,38 For physics-based SFs, only hydrogens were relaxed by AMBER/GB method as described previously.31 The first global view of SF sampling power is an enrichment plot. It shows the % of cases in which the lowest energy ligand of a given SF has defined RMSD to the crystal pose (Figure 1). We can see that SQM/COSMO SF at the DFTB3-D3H4 level makes a perfectly vertical curve (100% of the four cases between 0−2 Å RMSD), which indicates a perfect scoring function. The second best SFs are SQM/COSMO at the PM6-D3H4X level and Gold GS which were able to identify the native poses in 75% of cases. Note that the native pose is defined to have RMSD of 0−0.5 Å from the crystal pose due to the inaccuracies of X-ray 128

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132

Letter

Journal of Chemical Information and Modeling

correct distribution of the relative scores vs RMSD data points is characterized by positive energy values for decoy poses. Deviations from this behavior can be quantified by the number of false-positive (FP) solutions, i.e. all the poses which scored better than the native pose which is defined above (Figure 3A, Table S2). Number of FPs increased by one also indicates the rank of the X-ray pose (ideally the lowest-ranking pose) among all docking poses. It should be noted that the number of FPs is independent of normalization. As another criterion, we previously introduced RMSDmax (Figure 3B).31 RMSDmax is maximal RMSD of the ligand from the X-ray pose in a defined interval of the relative normalized score. In such a window, we can find poses with differing RMSD. An ideal behavior of a SF is that these poses have low RMSD, in other words that their RMSDmax is low. This criterion simulates a scenario in which one would not only consider the best-scoring pose but also those within a narrow energy window (normalized score). RMSDmax should not be confused with the “lowest RMSD” which describes the best pose produced by a docking program, i.e. the pose that is the most similar to the X-ray structure. The SQM/COSMO SF at the PM6-D3H4X level had the best performance among the tested SFs.26 It had zero FPs for three systems. FPs were found only for the TACE metalloprotein (39 FP solutions). It also had the lowest RMSDmax value within the window of 5; on average 0.88 Å. It thus went even far beyond the standard limit of recognition of the correct binding pose (2 Å).9 Following SQM/COSMO, the very good performing SFs were GOLD ASP and GOLD CS (Figure 3). We note here that the physics-based AMBER/GB would qualify to this category, if we excluded the FPs for the TACE metalloprotein. GOLD ASP and GOLD CS had single-digit FPs for the three PL systems except TACE which had 59 and 54 FPs, respectively. Still, these were the second lowest numbers of FPs after SQM/COSMO, compared to other SFs.31 The RMSDmax values were within the standard range of native pose recognition (2 Å)91.78 and 1.28 Å on average, respectively. (We note here that AMBER/GB has a comparable RMSDmax of 1.62 Å.) Obviously, the TACE metalloprotein was the most difficult one, which gave 39 FPs even for SQM/COSMO SF at the PM6D3H4X level. The other SFs, however, had significantly higher number of FPs (Figure 3A). Here we show that the performance of the SQM/COSMO SF scheme at the DFTB3-D3H4 level is even better than at the PM6-D3H4X level. The number of FPs is zero for all the systems (Figure 3A). The second criterion, RMSDmax, for the interval of the normalized relative scores below 5 (energy, arbitrary units) is shown in Figure 3B. The SQM/COSMO SF at DFTB3-D3H4 level shows the lowest RMSDmax values of 0.43 Å on average, whereas its average value is 0.88 Å at the PM6-D3H4X level. Similarly, the SQM/COSMO SF at the DFTB3-D3H4 level significantly outperforms all other SFs in the intervals of the normalized score of 10 and 20 (Tables S3−S5). These results show the importance of electronic-structure theory and its quality for the description of PL binding. For example, possible charge transfer energy between ligands and metal ions in metalloproteins are not described by classical forcefields or statistical potentials, but they are described by the SQM/ COSMO SF, better so at the DFTB3-D3H4 than the PM6D3H4X level due to the less empirical nature of DFTB3 as opposed to PM6.43,45 Despite the impressive performance of DFTB3-D3H4 method shown here we note that the method still needs further development, especially for charged ligands.45

Figure 1. Enrichment plot for all scoring functions in the four PL systems.

crystallography.63,64 AMBER/GB and PLANTS PLP have the same performance with enrichment of 50% within RMSD of 0.5 Å. All other SFs perform worse. The sampling performance of a SF is judged by unambiguous identification of the native pose (RMSD = 0−0.5 Å to the crystal) as the lowest-scoring structure as compared to decoy poses. All the generated ligand poses and the crystal pose were rescored by ten SFs and RMSD from the crystal pose measured. The plots of raw scores vs RMSD for all the methods and all the PL systems are shown in Appendix A in the SI. In order to be able to compare the methods which are on different scales, these scores were transformed into relative numbers with respect to the score of the crystal pose. The relative scores were scaled for visualization (Appendix B in the SI) and RMSDmax criterion (see below). For the second global view of the performance of all ten SFs, we simplified these “clouds of points” (i.e. relative normalized scores vs RMSD) by showing the lower boundary of all energies (for details, see SI) in a single graph (Figure 2). An ideal shape of the curve is sharply ascending in the low RMSD region and not descending below zero in higher RMSD region. As we can see, the SQM/COSMO SFs are the closest to such a behavior. For detailed quantification of SFs performance, multiple analyses of the data have been carried out. Following the funnel hypothesis of sampling of PL binding mentioned above, the

Figure 2. Plots of lower energy boundaries of relative normalized scores against RMSD for the studied PL systems. 129

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132

Letter

Journal of Chemical Information and Modeling

Figure 3. (A) Number of FPs as defined in the text. (B) RMSDmax within a window of 5 (energy, arbitrary units) of the normalized score.

A reliable solvation model is also important as is seen by the combinations of SQM methods with GB which provide considerably worse results, 105 and 144 FPs for PM6-D3H4X/ GB and DFTB3-D3H4/GB, respectively (see Table S6). Justification of DFTB3-D3H4 Results. To understand the superior behavior of DFTB3-D3H4 for the TACE metalloprotein, we have studied in more details selected interactions (S−···Zn2+, SO2···HN, and SO2···H2O) which are potentially difficult to describe by SQM methods (Figure S2).42,43,45 To be able to obtain reference energies by advanced QM methods, we constructed small models of these interactions. The small model for S−···Zn2+ interaction consisted of ca. 50 atoms comprising Zn2+, three imidazoles (for His405, His409, and His415) and acetic acid (Glu406) and CH3−CH2−S− fragment representing the ligand. In the case of SO2···H2O interaction the small system consisted of Thr347, Leu348, Gly349, Wat524 and CH3−SO2− Phe fragment of the ligand. The benchmark gas-phase interaction energy was calculated at the RI-MP2/def-QZVPP level and corrected for BSSE. We compared it to several DFT-D3 methods and two SQM methods (Table S7). Errors for the SO2···H2O interaction are for all methods consistently small (below 3 kcal/mol). The errors are larger in the case of the S−···Zn2+ interaction. Here, only DFT-D3 calculations with the def2-QZVP basis set have error below 2.0 kcal/mol (Table S7). The error of PM6-D3H4X method is 9.5 kcal/mol. In the case of DFTB3-D3H4 method, it is considerably less, 5.4 kcal/mol. DFTB3-D3H4 thus provides considerably better agreement with benchmark results than PM6-D3H4X. Further, the DFTB3-D3H4 results are even comparable to those obtained with DFT-D3/TPSS/TZVPP and DFT-D3/BLYP/ DZVP65 methods. Similar findings were obtained in many model systems with Zn2+, Cu+, and Cu2+ metal ions.43,45 To conclude, we have recently introduced the SQM/COSMO SF at the PM6-D3H4X level that outperformed eight commonly used SFs in recognizing the native pose in PL complexes. Moreover, it surpassed the common limits and evaluated even small changes in the geometry of the ligand binding. FPs were only found in the case of metalloprotein. Here, we show that the results of the SQM/COSMO SF for this system can be significantly improved by applying more accurate DFTB3-

D3H4 method combined with COSMO implicit solvation. For the three other systems the high-quality performance of PM6D3H4X was retained. The success of the SQM/COSMO SF is due to the high-quality description of both dominant terms and also probably due to fortuitous compensations of the neglected scoring terms. These very promising results are being further validated in other PL complexes in our laboratory. Even though the DFTB3-D3H4 is more demanding than the PM6-D3H4X method, it is still orders of magnitude less time-consuming than the most economic DFT. Given the supercomputer power, the time requirements for the SQM/COSMO SF allow us to calculate thousands of docking poses in reasonable time. We are showing here that the SQM/COSMO SF pushes the accuracy of SFs toward the experimental limits. Because of its generality, comparability across the chemical space and no need for any ad hoc system-specific parameters, we propose the SQM/COSMO SF be used as a computational tool for accurate mediumthroughput refinement in later stages of virtual screening or as a reference method to judge the performance of other SFs.



ASSOCIATED CONTENT

* Supporting Information S

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00513. Details about structure preparation, DFTB3-D3H4 parameters, score normalization, RMSDmax within windows of 10 and 20 of the normalized score, number of false positive solutions, the Zn2+···S− distance scan plot, timing of SQM methods, relative raw energy and normalized score values plotted against RMSD values (PDF) Geometries of the studied systems (proteins, X-ray, and docking poses) (ZIP)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. ORCID

Jan Ř ezác:̌ 0000-0001-6849-7314 Pavel Hobza: 0000-0001-5292-6719 130

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132

Letter

Journal of Chemical Information and Modeling Notes

ment: Distinguishing between Binders and Decoys in Cytochrome c Peroxidase. J. Chem. Inf. Model. 2011, 51, 93−101. (16) Antony, J.; Grimme, S.; Liakos, D. G.; Neese, F. Protein-Ligand Interaction Energies with Dispersion Corrected Density Functional Theory and High-Level Wave Function Based Methods. J. Phys. Chem. A 2011, 115, 11210−11220. (17) Raha, K.; Merz, K. M. A Quantum Mechanics-based Scoring Function: Study of Zinc Ion-mediated Ligand Binding. J. Am. Chem. Soc. 2004, 126, 1020−1021. (18) Raha, K.; Merz, K. M. Large-scale Validation of a Quantum Mechanics Based Scoring Function: Predicting the Binding Affinity and the Binding Mode of a Diverse Set of Protein-Ligand Complexes. J. Med. Chem. 2005, 48, 4558−4575. (19) Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods V: Modification of NDDO Approximations and Application to 70 Elements. J. Mol. Model. 2007, 13, 1173−1213. (20) Rezac, J.; Fanfrlik, J.; Salahub, D.; Hobza, P. Semiempirical Quantum Chemical PM6 Method Augmented by Dispersion and HBonding Correction Terms Reliably Describes Various Types of Noncovalent Complexes. J. Chem. Theory Comput. 2009, 5, 1749−1760. (21) Rezac, J.; Hobza, P. Advanced Corrections of Hydrogen Bonding and Dispersion for Semiempirical Quantum Mechanical Methods. J. Chem. Theory Comput. 2012, 8, 141−151. (22) Rezac, J.; Hobza, P. A Halogen-bonding Correction for the Semiempirical PM6 Method. Chem. Phys. Lett. 2011, 506, 286−289. (23) Rezac, J.; Hobza, P. Benchmark Calculations of Interaction Energies in Noncovalent Complexes and Their Applications. Chem. Rev. 2016, 116, 5038−5071. (24) Fanfrlik, J.; Bronowska, A. K.; Rezac, J.; Prenosil, O.; Konvalinka, J.; Hobza, P. A Reliable Docking/Scoring Scheme Based on the Semiempirical Quantum Mechanical PM6-DH2 Method Accurately Covering Dispersion and H-Bonding: HIV-1 Protease with 22 Ligands. J. Phys. Chem. B 2010, 114, 12666−12678. (25) Lepsik, M.; Rezac, J.; Kolar, M.; Pecina, A.; Hobza, P.; Fanfrlik, J. The Semiempirical Quantum Mechanical Scoring Function for In Silico Drug Design. ChemPlusChem 2013, 78, 921−931. (26) Fanfrlik, J.; Kolar, M.; Kamlar, M.; Hurny, D.; Ruiz, F. X.; Cousido-Siah, A.; Mitschler, A.; Rezac, J.; Munusamy, E.; Lepsik, M.; Matejicek, P.; Vesely, J.; Podjarny, A.; Hobza, P. Modulation of Aldose Reductase Inhibition by Halogen Bond Tuning. ACS Chem. Biol. 2013, 8, 2484−2492. (27) Fanfrlik, J.; Ruiz, F. X.; Kadlcikova, A.; Rezac, J.; Cousido-Siah, A.; Mitschler, A.; Haldar, S.; Lepsik, M.; Kolar, M. H.; Majer, P.; Podjarny, A. D.; Hobza, P. The Effect of Halogen-to-Hydrogen Bond Substitution on Human Aldose Reductase Inhibition. ACS Chem. Biol. 2015, 10, 1637−1642. (28) Vorlova, B.; Nachtigallova, D.; Jiraskova-Vanickova, J.; Ajani, H.; Jansa, P.; Rezac, J.; Fanfrlik, J.; Otyepka, M.; Hobza, P.; Konvalinka, J.; Lepsik, M. Malonate-based Inhibitors of Mammalian Serine Racemase: Kinetic Characterization and Structure-based Computational Study. Eur. J. Med. Chem. 2015, 89, 189−197. (29) Cousido-Siah, A.; Ruiz, F. X.; Fanfrlik, J.; Gimenez-Dejoz, J.; Mitschler, A.; Kamlar, M.; Vesely, J.; Ajani, H.; Pares, X.; Farres, J.; Hobza, P.; Podjarny, A. IDD388 Polyhalogenated Derivatives as Probes for an Improved Structure-Based Selectivity of AKR1B10 Inhibitors. ACS Chem. Biol. 2016, 11, 2693−2705. (30) Fanfrlik, J.; Brahmkshatriya, P. S.; Rezac, J.; Jilkova, A.; Horn, M.; Mares, M.; Hobza, P.; Lepsik, M. Quantum Mechanics-Based Scoring Rationalizes the Irreversible Inactivation of Parasitic Schistosoma mansoni Cysteine Peptidase by Vinyl Sulfone Inhibitors. J. Phys. Chem. B 2013, 117, 14973−14982. (31) Pecina, A.; Meier, R.; Fanfrlik, J.; Lepsik, M.; Rezac, J.; Hobza, P.; Baldauf, C. The SQM/COSMO Filter: Reliable Native Pose Identification Based on the Quantum-Mechanical Description of Protein-Ligand Interactions and Implicit COSMO Solvation. Chem. Commun. 2016, 52, 3312−3315. (32) Faver, J. C.; Benson, M. L.; He, X. A.; Roberts, B. P.; Wang, B.; Marshall, M. S.; Kennedy, M. R.; Sherrill, C. D.; Merz, K. M. Formal

The authors declare no competing financial interest.



ACKNOWLEDGMENTS We are thankful to Agnieszka K. Bronowska for her valuable comments on the manuscript. We are also grateful to all the reviewers for inspiring comments, which contributed positively to the quality of the manuscript. This work was part of the Research Project RVO: 61388963 of the Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic. This work was also supported by the Czech Science Foundation [A.P., S.H., J.F., M.L., and P.H. from grant no. P208/ 12/G016 and J.Ř . from grant no. P208/16-11321Y]. This work was supported by the Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center−LM2015070” as well as from project LO1305 (P.H.).



REFERENCES

(1) Schneider, G. Virtual Screening: An Endless Staircase? Nat. Rev. Drug Discovery 2010, 9, 273−276. (2) Irwin, J. J.; Shoichet, B. K. Docking Screens for Novel Ligands Conferring New Biology. J. Med. Chem. 2016, 59, 4103−4120. (3) Cheng, T. J.; Li, X.; Li, Y.; Liu, Z. H.; Wang, R. X. Comparative Assessment of Scoring Functions on a Diverse Test Set. J. Chem. Inf. Model. 2009, 49, 1079−1093. (4) Wang, Z.; Sun, H. Y.; Yao, X. J.; Li, D.; Xu, L.; Li, Y. Y.; Tian, S.; Hou, T. J. Comprehensive Evaluation of Ten Docking Programs on a Diverse Set of Protein-Ligand Complexes: The Prediction Accuracy of Sampling Power and Scoring Power. Phys. Chem. Chem. Phys. 2016, 18, 12964−12975. (5) Yuriev, E.; Agostino, M.; Ramsland, P. A. Challenges and Advances in Computational Docking: 2009 in Review. J. Mol. Recognit. 2011, 24, 149−164. (6) Jain, A. N.; Nicholls, A. Recommendations for Evaluation of Computational Methods. J. Comput.-Aided Mol. Des. 2008, 22, 133−139. (7) Kontoyianni, M.; McClellan, L. M.; Sokol, G. S. Evaluation of Docking Performance: Comparative Data on Docking Algorithms. J. Med. Chem. 2004, 47, 558−565. (8) Gohlke, H.; Hendlich, M.; Klebe, G. Knowledge-based Scoring Function to Predict Protein-Ligand Interactions. J. Mol. Biol. 2000, 295, 337−356. (9) Warren, G. L.; Andrews, C. W.; Capelli, A. M.; Clarke, B.; LaLonde, J.; Lambert, M. H.; Lindvall, M.; Nevins, N.; Semus, S. F.; Senger, S.; Tedesco, G.; Wall, I. D.; Woolven, J. M.; Peishoff, C. E.; Head, M. S. A Critical Assessment of Docking Programs and Scoring Functions. J. Med. Chem. 2006, 49, 5912−5931. (10) Grinter, S. Z.; Zou, X. Q. Challenges, Applications, and Recent Advances of Protein-Ligand Docking in Structure-Based Drug Design. Molecules 2014, 19, 10150−10176. (11) Soderhjelm, P.; Kongsted, J.; Ryde, U. Ligand Affinities Estimated by Quantum Chemical Calculations. J. Chem. Theory Comput. 2010, 6, 1726−1737. (12) Hayik, S. A.; Dunbrack, R.; Merz, K. M. Mixed Quantum Mechanics/Molecular Mechanics Scoring Function To Predict ProteinLigand Binding Affinity. J. Chem. Theory Comput. 2010, 6, 3079−3091. (13) Berg, L.; Mishra, B. K.; Andersson, C. D.; Ekstrom, F.; Linusson, A. The Nature of Activated Non-classical Hydrogen Bonds: A Case Study on Acetylcholinesterase-Ligand Complexes. Chem. - Eur. J. 2016, 22, 2672−2681. (14) Chaskar, P.; Zoete, V.; Rohrig, U. F. Toward On-The-Fly Quantum Mechanical/Molecular Mechanical (QM/MM) Docking: Development and Benchmark of a Scoring Function. J. Chem. Inf. Model. 2014, 54, 3137−3152. (15) Burger, S. K.; Thompson, D. C.; Ayers, P. W. Quantum Mechanics/Molecular Mechanics Strategies for Docking Pose Refine131

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132

Letter

Journal of Chemical Information and Modeling Estimation of Errors in Computed Absolute Interaction Energies of Protein-Ligand Complexes. J. Chem. Theory Comput. 2011, 7, 790−797. (33) Elstner, M.; Hobza, P.; Frauenheim, T.; Suhai, S.; Kaxiras, E. Hydrogen Bonding and Stacking Interactions of Nucleic Acid Base Pairs: A Density-Functional-Theory Based Treatment. J. Chem. Phys. 2001, 114, 5149−5155. (34) Aradi, B.; Hourahine, B.; Frauenheim, T. DFTB+, A Sparse Matrix-based Implementation of the DFTB Method. J. Phys. Chem. A 2007, 111, 5678−5684. (35) Stewart, J. J. P. MOPAC2016; Stewart Computational Chemistry: Colorado Springs, CO, 2016. (36) Stewart, J. J. P. Application of the PM6 Method to Modeling Proteins. J. Mol. Model. 2009, 15, 765−805. (37) Nishizawa, H.; Nishimura, Y.; Kobayashi, M.; Irle, S.; Nakai, H. Three Pillars for Achieving Quantum Mechanical Molecular Dynamics Simulations of Huge Systems: Divide-and-conquer, Density-functional Tight-binding, and Massively Parallel Computation. J. Comput. Chem. 2016, 37, 1983−1992. (38) Rezac, J.; Miriyala, V. M. Description of Non-Covalent Interactions in SCC-DFTB Methods. J. Comput. Chem. 2017, DOI: 10.1002/jcc.24725. (39) Christensen, A. S.; Kubar, T.; Cui, Q.; Elstner, M. Semiempirical Quantum Mechanical Methods for Noncovalent Interactions for Chemical and Biochemical Applications. Chem. Rev. 2016, 116, 5301− 5337. (40) Gaus, M.; Cui, Q. A.; Elstner, M. DFTB3: Extension of the SelfConsistent-Charge Density-Functional Tight-Binding Method (SCCDFTB). J. Chem. Theory Comput. 2011, 7, 931−948. (41) Gaus, M.; Goez, A.; Elstner, M. Parametrization and Benchmark of DFTB3 for Organic Molecules. J. Chem. Theory Comput. 2013, 9, 338−354. (42) Gaus, M.; Lu, X. Y.; Elstner, M.; Cui, Q. Parameterization of DFTB3/3OB for Sulfur and Phosphorus for Chemical and Biological Applications. J. Chem. Theory Comput. 2014, 10, 1518−1537. (43) Lu, X. Y.; Gaus, M.; Elstner, M.; Cui, Q. Parametrization of DFTB3/3OB for Magnesium and Zinc for Chemical and Biological Applications. J. Phys. Chem. B 2015, 119, 1062−1082. (44) Kubillus, M.; Kubar, T.; Gaus, M.; Rezac, J.; Elstner, M. Parameterization of the DFTB3 Method for Br, Ca, Cl, F, I, K, and Na in Organic and Biological Systems. J. Chem. Theory Comput. 2015, 11, 332−342. (45) Gaus, M.; Jin, H. Y.; Demapan, D.; Christensen, A. S.; Goyal, P.; Elstner, M.; Cui, Q. DFTB3 Parametrization for Copper: The Importance of Orbital Angular Momentum Dependence of Hubbard Parameters. J. Chem. Theory Comput. 2015, 11, 4205−4219. (46) Rezac, J.; Riley, K. E.; Hobza, P. S66: A Well-balanced Database of Benchmark Interaction Energies Relevant to Biomolecular Structures. J. Chem. Theory Comput. 2011, 7, 2427−2438. (47) Kolar, M.; Fanfrlik, J.; Lepsik, M.; Forti, F.; Luque, F. J.; Hobza, P. Assessing the Accuracy and Performance of Implicit Solvent Models for Drug Molecules: Conformational Ensemble Approaches. J. Phys. Chem. B 2013, 117, 5950−5962. (48) Klamt, A.; Schuurmann, G. COSMO - A New Approach to Dielectric Screening in Solvents with Explicit Expressions for the Screening Energy and Its Gradient. J. Chem. Soc., Perkin Trans. 2 1993, 2, 799−805. (49) Tsui, V.; Case, D. A. Theory and Applications of the Generalized Born Solvation Model in Macromolecular Simulations. Biopolymers 2000, 56, 275−291. (50) Brooks, B. R.; Brooks, C. L.; Mackerell, A. D.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; Caflisch, A.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, M.; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M. CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30, 1545−1614. (51) Dvir, H.; Wong, D. M.; Harel, M.; Barril, X.; Orozco, M.; Luque, F. J.; Munoz-Torrero, D.; Camps, P.; Rosenberry, T. L.; Silman, I.;

Sussman, J. L. 3D Structure of Torpedo Californica Acetylcholinesterase Complexed with Huprine X at 2.1 Angstrom Resolution: Kinetic and Molecular Dynamic Correlates. Biochemistry 2002, 41, 2970−2981. (52) Bandarage, U. K.; Wang, T.; Come, J. H.; Perola, E.; Wei, Y.; Rao, B. G. Novel Thiol-based TACE Inhibitors. Part 2: Rational Design, Synthesis, and SAR of Thiol-containing Aryl Sulfones. Bioorg. Med. Chem. Lett. 2008, 18, 44−48. (53) Steuber, H.; Heine, A.; Klebe, G. Structural and Thermodynamic Study on Aldose Reductase: Nitro-substituted Inhibitors with Strong Enthalpic Binding Contribution. J. Mol. Biol. 2007, 368, 618−638. (54) Brynda, J.; Rezacova, P.; Fabry, M.; Horejsi, M.; Stouracova, R.; Sedlacek, J.; Soucek, M.; Hradilek, M.; Lepsik, M.; Konvalinka, J. A Phenylnorstatine Inhibitor Binding to HIV-1 Protease: Geometry, Protonation, and Subsite-pocket Interactions Analyzed at Atomic Resolution. J. Med. Chem. 2004, 47, 2030−2036. (55) Friesner, R. A.; Murphy, R. B.; Repasky, M. P.; Frye, L. L.; Greenwood, J. R.; Halgren, T. A.; Sanschagrin, P. C.; Mainz, D. T. Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein-Ligand Complexes. J. Med. Chem. 2006, 49, 6177−6196. (56) Korb, O.; Stutzle, T.; Exner, T. E. Empirical Scoring Functions for Advanced Protein-Ligand Docking with PLANTS. J. Chem. Inf. Model. 2009, 49, 84−96. (57) Trott, O.; Olson, A. J. AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function, Efficient Optimization, and Multithreading. J. Comput. Chem. 2010, 31, 455−461. (58) Eldridge, M. D.; Murray, C. W.; Auton, T. R.; Paolini, G. V.; Mee, R. P. Empirical Scoring Functions 1. The Development of a Fast Empirical Scoring Function to Estimate the Binding Affinity of Ligands in Receptor Complexes. J. Comput.-Aided Mol. Des. 1997, 11, 425−445. (59) Jones, G.; Willett, P.; Glen, R. C.; Leach, A. R.; Taylor, R. Development and Validation of a Genetic Algorithm for Flexible Docking. J. Mol. Biol. 1997, 267, 727−748. (60) Mooij, W. T. M.; Verdonk, M. L. General and Targeted Statistical Potentials for Protein-Ligand Interactions. Proteins: Struct., Funct., Genet. 2005, 61, 272−287. (61) Duan, Y.; Wu, C.; Chowdhury, S.; Lee, M. C.; Xiong, G. M.; Zhang, W.; Yang, R.; Cieplak, P.; Luo, R.; Lee, T.; Caldwell, J.; Wang, J. M.; Kollman, P. A Point-Charge Force Field for Molecular Mechanics Simulations of Proteins Based on Condensed-Phase Quantum Mechanical Calculations. J. Comput. Chem. 2003, 24, 1999−2012. (62) Wang, J. M.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and Testing of a General Amber Force Field. J. Comput. Chem. 2004, 25, 1157−1174. (63) Kirchmair, J.; Wolber, G.; Laggner, C.; Langer, T. Comparative Performance Assessment of the Conformational Model Generators Omega and Catalyst: A Large-scale Survey on the Retrieval of Proteinbound Ligand Conformations. J. Chem. Inf. Model. 2006, 46, 1848− 1861. (64) Warren, G. L.; Do, T. D.; Kelley, B. P.; Nicholls, A.; Warren, S. D. Essential Considerations for Using Protein-Ligand Structures in Drug Discovery. Drug Discovery Today 2012, 17, 1270−1281. (65) Fanfrlik, J.; Holub, J.; Ruzickova, Z.; Rezac, J.; Lane, P.; Wann, D.; Hnyk, D.; Ruzicka, A.; Hobza, P. Competition between Halogen, Hydrogen and Dihydrogen Bonding in Brominated Carboranes. ChemPhysChem 2016, 17, 3373−3376.

132

DOI: 10.1021/acs.jcim.6b00513 J. Chem. Inf. Model. 2017, 57, 127−132