J. Med. Chem. 2002, 45, 1585-1597


Development of New Hydrogen-Bond Descriptors and Their Application to Comparative Molecular Field Analyses†

Markus Böhm and Gerhard Klebe*

Department of Pharmaceutical Chemistry, University of Marburg, Marbacher Weg 6, D-35032 Marburg, Germany

Received September 10, 2001

Knowledge-based descriptors extracted from composite crystal-field environments in crystal data have been developed for the description of interaction properties of small molecules. Using SuperStar, seven diverse probe atoms have been selected to reflect the most important physicochemical properties. The general application of these descriptors in comparative molecular field analysis has been investigated using a data set of thermolysin inhibitors, and a comparison to the GRID program has been performed. We especially focused on hydrogen-bond donor and acceptor properties by selecting a carbonyl and amino group as suitable probes. Their performance has been compared to that of the hydrogen-bond descriptors presently implemented in CoMSIA (comparative molecular similarity indices analysis). The newly developed descriptors produced significantly improved statistics for the correlation analyses if they are exclusively applied or, even better, applied in combination with other CoMSIA descriptors. Two methodologically different approaches have been tested to approximate the developed descriptors. Both significantly reduce the required computational effort, in particular for large data sets. The graphical interpretation of the field contributions of hydrogen-bonding properties elucidates additional features compared to those obtained from the original CoMSIA method. They provide valuable support for the design of improved inhibitors.

Introduction

The binding of a small-molecule ligand to a protein receptor is a mutual molecular recognition process of two partners performed through nonbonded interactions. The most prominent interactions responsible for the firm immobilization of a ligand at the binding site are hydrogen bonds.1 Being rather directional in nature, they determine the orientation of a ligand with respect to the protein.
This predominance in directional properties is not necessarily correlated with an equivalent control of the affinity-determining parameters.2 Hydrogen bonding only contributes to ligand binding affinity if the hydrogen-bond inventory between the solvent and protein environment results in an overall increase or loss of the strength of hydrogen bonding. This is usually only the case if the hydrogen bonds of a ligand, which are formed with the neighboring solvent water molecules in its unbound state, are replaced by charge-assisted hydrogen bonds at the binding site or if a particular ligand functional group does not match a suitable hydrogen-bond-forming countergroup in the protein. However, the predominant influence of hydrogen bonding on the spatial immobilization of ligands makes it an important feature in structure- or ligand-based design.

The usage of field-based descriptors is well established for the analysis of “hot spots” in protein-binding sites. The most prominent approach used in this context is the GRID program introduced by Peter Goodford.3-6

† Dedicated to Prof. Dr. Johann Gasteiger on the occasion of his 60th birthday.
* To whom correspondence should be addressed. Phone: +49 6421 28 21313. Fax: +49 6421 28 28994. E-mail: klebe@mailer.uni-marburg.de.

10.1021/jm011039x CCC: $22.00 © 2002 American Chemical Society. Published on Web 03/16/2002

Journal of Medicinal Chemistry, 2002, Vol. 45, No. 8

GRID calculates interaction energies to a specific probe by applying an empirical force field. In the MCSS (multiple copy simultaneous search) method of Miranker et al.,7-10 energetically favorable orientations and positions of various functional groups are calculated using a CHARMM potential. The directional nature of hydrogen bonds is used in HSITE11,12 for predicting potential hydrogen-bonding sites in proteins. This approach is based on crystallographic information retrieved from the CSD (Cambridge Structural Database).13 On the basis of the HSITE approach, the program HBMAP14,15 also makes use of composite crystal-field information derived from the CSD to calculate hydrogen-bond probability values. In the X-SITE program,16-18 a data set of selected high-resolution protein structures from the PDB (Protein Data Bank)19,20 was used to produce spatial distributions of atomic contact preferences for different atom types.

Comparative molecular field analyses are applied to correlate and partition structural changes among ligands with overall variations in their binding affinities. They are performed if a reference protein structure is available from a crystallographic study or, in the absence of such a reference, to reveal some initial ideas about a putative receptor model. In the first case, descriptors representing hydrogen-bond properties are used to elucidate those areas in the protein where hydrogen-bonding features among a set of ligands can partly explain trends in the biological data. In the latter situation, the spatial characterization of putative hydrogen-bonding sites around the ligands is of utmost importance to support and define the construction of a receptor model.21-26 Furthermore, in both situations a significant 3D QSAR model, incorporating the evaluation of reliable hydrogen-bond descriptors, is extremely valuable for probing bioisosterism for the replacement of functional groups in ligands. Such substitutions could be attempted for various reasons, e.g., tailoring ligands toward improved ADMET properties27-30 or simply to circumvent a conflicting patent situation with potential competitors.

The relevance and reliability of such an analysis of hydrogen-bonding properties strongly depend on the quality of the hydrogen-bond descriptors used in a comparative molecular field analysis. Recently, we extended the CoMSIA method31,32 by considering separate fields for hydrogen bonding.33 In our approach, putative hydrogen-bonding sites have been generated around functional groups of the molecules included in the analysis. Such donor and acceptor sites were placed according to a set of generic hydrogen-bonding sites derived from experimental information.34,35 Subsequently, these sites have been used as centers to place Gaussian-type functions approximately describing the spatial region where putative hydrogen-bonding partners could be expected. These descriptors have the advantage of being fast to assign and compute; however, they are only approximate in nature. Thus, we decided to develop and test some more sophisticated hydrogen-bond descriptors for 3D QSAR analyses. In particular, we focused on knowledge-based descriptors because the systematic study of composite crystal-field environments has meanwhile resulted in a much more reliable source of information about the spatial orientation of hydrogen bonds. The experimental data compiled in IsoStar36 by statistical means can be employed in terms of propensity distributions to localize nonbonded contacts around molecules. Recently, Verdonk et al. developed the program SuperStar,37,38 which utilizes the experimental information collected in IsoStar to map binding-site features.
Similarly, knowledge-based pair-preferences found in protein-ligand complexes39,40 have been employed to localize preferred interaction sites in protein pockets.41

In the present paper we describe the development and performance of new hydrogen-bond descriptors for comparative molecular field analysis. A previously studied data set of thermolysin inhibitors31,33,42 has been used to test and validate these descriptors. The 3D QSAR analysis, including these new descriptors, has been performed in the formal setup of the CoMFA/CoMSIA method. The obtained results are examined with respect to their predictive power and their relevance and clarity to accentuate graphically the hydrogen-bonding contribution maps derived from the correlation analysis. The obtained features will be discussed in terms of the crystal structure of thermolysin as a reference.

Methods

Requirements for New Descriptors. The description of hydrogen bonding in the current version of CoMSIA (comparative molecular similarity indices analysis)31 is based on a restricted set of rules derived from crystallographic data.34 At first, the molecule under consideration is analyzed in terms of its hydrogen-bonding donor and acceptor groups. Putative positions (“pseudoatoms”) of complementary acceptor and donor sites are then generated in a possible receptor environment. The locations of these pseudoatoms are assigned according to predefined distances, angles, and dihedral angles. The last values were parametrized from experimental data

Böhm and Klebe

found in crystal structures showing the representative ligand fragment involved in hydrogen bonding. A composite picture of such interactions in space is compiled by superimposing all retrieved examples. Finally, the pseudoatoms are placed in the centers of such distributions in space. Around each of the thus-positioned pseudoatom sites a spherical Gaussian function is calculated. Additionally, a rough weighting scheme is applied to the different pseudoatoms. Focusing on the centers of these distributions revealed a limited set of rules used in CoMSIA. As an example, the CoMSIA donor and acceptor fields of an aliphatic amide group together with its corresponding pseudoatom positions are shown in Figure 1.

A weakness of the current implementation is that the description of the hydrogen bonds is quite approximate in nature because of the above-mentioned simplification and idealization. A clear advance toward a more sophisticated consideration would be the usage of information directly obtained from experimental data. Recently, a knowledge base of nonbonded interactions has been compiled in the database IsoStar.36 Crystallographic data from the CSD13 and PDB19,20 are collected in terms of scatter plots containing the distribution of certain contact groups around central groups.

Figure 1. CoMSIA donor and acceptor fields of an aliphatic amide group together with the corresponding pseudoatom positions. The donor field (left) is produced by positioning one generic pseudoatom (magenta sphere) along the projected NH direction with a distance of 1.9 Å from the hydrogen atom. Two pseudoatoms along the projected direction of the lone pairs at the oxygen with a distance of 1.8 Å from the carbonyl oxygen are placed to generate the acceptor field (right). At the pseudocenter positions, spherical Gaussians are generated, indicating regions favorable for putative acceptor (left) or donor (right) groups.
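The pseudoatom-centered Gaussian fields described for CoMSIA can be illustrated with a minimal Python/NumPy sketch. It is not the SYBYL implementation: the function name `gaussian_field`, the unit weight, and the toy N-H geometry are assumptions made for illustration. Only the spherical Gaussian form, the 1.9 Å pseudoatom distance, the 1 Å lattice, and the attenuation factor of 0.3 quoted in the Methods are taken from the text.

```python
import numpy as np

def gaussian_field(grid_points, pseudoatoms, weights, alpha=0.3):
    """Sum of weighted spherical Gaussians centred on pseudoatom positions.

    grid_points: (G, 3) lattice coordinates in Å
    pseudoatoms: (P, 3) pseudoatom coordinates in Å
    weights:     (P,) weight per pseudoatom (illustrative values)
    alpha:       attenuation factor of the Gaussian distance dependence
    """
    grid_points = np.asarray(grid_points, dtype=float)
    pseudoatoms = np.asarray(pseudoatoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squared distance between every lattice point and every pseudoatom
    diff = grid_points[:, None, :] - pseudoatoms[None, :, :]
    r2 = np.sum(diff ** 2, axis=-1)
    return np.exp(-alpha * r2) @ weights

# Hypothetical amide N-H: hydrogen at the origin, N-H vector along +x,
# so the acceptor pseudoatom sits 1.9 Å beyond the hydrogen on the x axis.
pseudo = np.array([[1.9, 0.0, 0.0]])
axis = np.arange(-4.0, 5.0, 1.0)  # 1 Å lattice spacing
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])
field = gaussian_field(grid, pseudo, weights=[1.0])
```

In this toy case the field peaks at the lattice point closest to the pseudoatom and decays smoothly with distance, which is the behavior the donor field in Figure 1 depicts.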
Currently, IsoStar (version 1.3) contains about 18 000 scatter plots retrieved for more than 300 central groups and 48 contact groups. The program SuperStar37,38 utilizes this information by superimposing such scatter plots onto the relevant functional groups exposed to a protein-binding site. The scatter plots are converted into density maps and subsequently normalized to represent probability maps. These latter probabilities can be interpreted as propensity maps; for example, a propensity of p = 4.0 at a certain position means that such a group is 4 times more likely to be found at this site than expected at random. A more detailed description of the SuperStar methodology is given elsewhere.37 For example, Figure 2 shows the experimentally observed distributions of carbonyl and NH contact groups around an aliphatic amide group together with the corresponding propensity plots. The results obtained are qualitatively similar to those obtained by the original CoMSIA method (see Figure 1 and discussion section). Up to now, the application and validation of SuperStar in the literature have been largely limited to the analysis of protein-binding sites. Nevertheless, SuperStar maps can also be generated around small molecules and thus serve as fields in a comparative molecular field analysis. In this paper, we applied SuperStar fields for their general usage in 3D QSAR. Furthermore, we show that some of these maps are particularly suited to characterize hydrogen-bonding properties.
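The conversion of a scatter plot into a propensity map can be sketched as follows. This is a simplified illustration and not SuperStar's actual normalization, which is described in ref 37: the function name `propensity_map`, the toy coordinates, and the expected random density of one contact per Å³ are all assumptions.

```python
import numpy as np

def propensity_map(contact_xyz, edges, expected_density):
    """Bin contact-group positions and divide by the density expected at random.

    contact_xyz:      (N, 3) contact positions from superimposed scatter plots, in Å
    edges:            list of three 1D arrays with the voxel edges along x, y, z
    expected_density: contacts per Å^3 expected for a random distribution (assumed known)
    """
    counts, _ = np.histogramdd(np.asarray(contact_xyz, dtype=float), bins=edges)
    voxel_volume = np.prod([e[1] - e[0] for e in edges])
    observed_density = counts / voxel_volume
    return observed_density / expected_density

# Toy scatter plot: four contacts fall into one 1 Å voxel, one into another.
contacts = [[0.2, 0.3, 0.4], [0.5, 0.5, 0.5], [0.7, 0.1, 0.9],
            [0.9, 0.8, 0.2], [1.5, 1.5, 1.5]]
edges = [np.arange(0.0, 3.0, 1.0)] * 3  # voxel edges at 0, 1, 2 Å on each axis
p = propensity_map(contacts, edges, expected_density=1.0)
```

Here the voxel holding four contacts obtains p = 4.0, which corresponds to the reading given in the text: the contact group is found there four times more often than expected at random.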

Hydrogen-Bond Descriptors

Figure 2. Experimentally observed distributions of carbonyl and NH contact groups around an aliphatic amide group (upper row) together with the corresponding propensity plots (lower row). The distributions show all contact groups found within the sum of the van der Waals radii of the atoms involved. Most carbonyl contact groups are detected around the amide NH, orienting their oxygen atom toward the NH group (upper left). The amide carbonyl group is predominantly surrounded by NH contact groups placing their hydrogen atoms toward the amide oxygen. The resulting propensity distributions are contoured at three levels (blue, 2.0; red, 4.0; yellow, 8.0; see text for explanation).

Finally, we present two different approaches to approximate the SuperStar maps by a fast algorithm without losing important information.

Descriptors Derived from SuperStar. The first step in the application of SuperStar to small molecules is the decomposition of the molecule under consideration into appropriate central groups that match the fragments in IsoStar. Compared to proteins, where only 20 natural amino acids have to be considered, the variety of possible central groups is much larger in the case of small molecules. For the selection of appropriate central fragments a well-balanced compromise has to be found: (1) fragments that are too small and generalized do not sufficiently represent the details in physicochemical properties of the decomposed molecule; in contrast, (2) fragments that are too detailed and specialized cannot be used because limited data hamper the compilation of statistically significant composite crystal-field environments. A ligand containing, for example, a sulfonamide group would be represented best by a single sulfonamide fragment, but because the data for this fragment are too scarce, a statistically insignificant distribution would have to be considered.
Therefore, it is advisable to split this group into a sulfone fragment and a planar or pyramidal NH group.43 All central groups used in our study are listed in Table 1. Similar considerations have to be followed for the selection of contact groups. Thus, we utilized only those contact groups for which a sufficient amount of data could be found. Furthermore, a nonredundant set of these contact groups has been selected to reflect the most important physicochemical properties. Table 2 lists the considered contact groups together with their assigned physicochemical property.

All comparative molecular field evaluations were performed using SYBYL44 on a Silicon Graphics O2 machine (225 MHz MIPS R10000 processor). The implementation of SuperStar in SYBYL for its usage in a comparative molecular field analysis was realized via an SPL (SYBYL Programming Language) script. For each molecule in the data set, the SuperStar fields for the considered probe atoms were calculated and subsequently imported into a SYBYL molecular spreadsheet. The generated molecular fields were finally processed in a PLS analysis similar to the CoMFA/CoMSIA protocol (see below).

To validate the results obtained by SuperStar, we also generated models using the original CoMSIA property fields (steric, electrostatic, hydrophobic, hydrogen-bond donor and acceptor properties) as implemented in SYBYL. All CoMSIA field calculations were performed with standard parameters and an attenuation factor α of 0.3 for the Gaussian-type distance dependence.31 Partial atomic charges were computed using the AM1 Hamiltonian45 as implemented in MOPAC.46,47 Furthermore, the methodologically similar GRID program was employed by utilizing five probes (Table 3) analogous to those applied in SuperStar. For all SuperStar, CoMSIA, and GRID field calculations, a uniform lattice with a 1 Å grid spacing was generated to allow for a consistent comparison of the statistical results and their graphical interpretation. By use of the CoMFA standard scaling option, equal weights for each field used in the PLS analysis were assigned. Cross-validated analyses were done by means of the leave-one-out (LOO) procedure using the enhanced SAMPLS method.48 The optimal number of components was determined in such a way that each additional component had to increase the q² (cross-validated r²) value by at least 5%. This procedure takes into account the principle of parsimony by selecting the smallest number of significant components.49,50 Usually this value also corresponded to the lowest sPRESS value. The same number of components was subsequently used to derive the final 3D QSAR models. The “minimum σ” standard deviation (column filtering in SYBYL) during the conventional analyses (no cross-validation) was set to a threshold such that at least 10% of the variables are considered in the PLS analysis.
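The component-selection rule stated above can be written down as a short Python sketch. Two points are assumptions rather than statements from the text: the 5% criterion is read here as a relative gain over the previous q² value (an absolute gain of 0.05 is another possible reading), and the q² values in the example are invented for illustration. The helper names `q2_loo` and `select_n_components` are hypothetical.

```python
def q2_loo(y_observed, y_predicted_loo):
    """Cross-validated q2 = 1 - PRESS / SS from leave-one-out predictions."""
    mean_y = sum(y_observed) / len(y_observed)
    press = sum((y - yp) ** 2 for y, yp in zip(y_observed, y_predicted_loo))
    ss = sum((y - mean_y) ** 2 for y in y_observed)
    return 1.0 - press / ss

def select_n_components(q2_values, min_gain=0.05):
    """Return the number of PLS components to keep.

    q2_values[i] is the LOO q2 obtained with i + 1 components.
    A further component is accepted only if it raises q2 by at least
    min_gain relative to the previous value (assumed interpretation).
    """
    best = 1
    for n in range(2, len(q2_values) + 1):
        previous, current = q2_values[n - 2], q2_values[n - 1]
        if current - previous >= min_gain * abs(previous):
            best = n
        else:
            break  # parsimony: stop at the first insignificant component
    return best

# Invented q2 series for 1 to 5 components
n_opt = select_n_components([0.40, 0.52, 0.60, 0.61, 0.62])
```

With this invented series the fourth component adds only 0.01 to q², so three components are retained.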
This selection was necessary because, unlike in CoMFA51-53 where Lennard-Jones and Coulomb energies expressed in kcal/mol are calculated, the CoMSIA and SuperStar fields are defined in arbitrary units and propensity values, respectively. Consequently, in comparison to CoMFA, no physical unit can be defined as minimum-σ or threshold value, e.g., as the level of the contour maps (see discussion section).

We applied the above-described fields to a previously studied data set of thermolysin inhibitors (Table 4).33 The experimentally determined biological activities of all thermolysin inhibitors were used as pKi values (Table 5). A training set of 61 inhibitors was used to develop the 3D QSAR models. (The atomic coordinates of all molecules of the thermolysin data set are available from the authors upon request.) In addition, 15 molecules not included in the training set were selected as a test set to assess the predictive power of the derived models. We therefore calculated the predictive r² value according to the definition given by Cramer et al.51 Recently, we applied the CoMSIA method to a data set of serine protease inhibitors.54 By regarding all possible combinations of the five available property fields (see above), we detected strong interdependencies among the individual fields. Therefore, we also determined to what extent the SuperStar propensity maps are intercorrelated by testing some possible combinations of fields (Figure 3).

Descriptors To Map Hydrogen-Bond Properties. Because this study focuses mainly on hydrogen-bond descriptors, we selected two suitable probes describing hydrogen-bonding properties of ligand molecules: (1) a carbonyl oxygen probe representing acceptor properties and (2) an amino hydrogen (“any NH”) probe serving as a group with donor properties (Table 2).
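The predictive r² used for the test set can be sketched in a few lines of Python. The formula follows the commonly used CoMFA definition, in which the squared prediction errors of the test set (PRESS) are compared with the squared deviations of the test-set activities from the mean activity of the training set (SD). The function name and the pKi values in the example are hypothetical.

```python
def predictive_r2(y_train, y_test, y_test_predicted):
    """Predictive r2 = (SD - PRESS) / SD.

    SD:    sum of squared deviations of the test-set activities
           from the mean activity of the training set
    PRESS: sum of squared differences between observed and
           predicted test-set activities
    """
    mean_train = sum(y_train) / len(y_train)
    sd = sum((y - mean_train) ** 2 for y in y_test)
    press = sum((y - yp) ** 2 for y, yp in zip(y_test, y_test_predicted))
    return (sd - press) / sd

# Hypothetical pKi values
train = [4.0, 5.0, 6.0]  # training-set mean is 5.0
perfect = predictive_r2(train, [4.0, 6.0], [4.0, 6.0])
mean_only = predictive_r2(train, [4.0, 6.0], [5.0, 5.0])
```

A model that reproduces the test-set activities exactly gives a value of 1, whereas a model that only predicts the training-set mean gives 0; negative values are possible for models that predict worse than the mean.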
To adhere to a uniform nomenclature in this paper, we want to emphasize that the calculation of a SuperStar acceptor field considers all acceptor groups of a ligand molecule and maps donor properties in the protein neighborhood of this ligand molecule using an amino hydrogen probe. On the other hand, by use of the carbonyl oxygen probe, the donor functional groups of a ligand are regarded to compute the donor field that maps acceptor properties in a putative protein environment. Substituting the original CoMSIA donor and acceptor fields by the corresponding ones from SuperStar allowed us a direct comparison of the statistics of the derived 3D QSAR models along with their graphical interpretation. The computational procedure was


Table 1. List of Central Groups Extracted from IsoStar and Used To Generate Propensity Maps around the Considered Inhibitors(a)

a The expected donor (D) or acceptor (A) property of a central group is indicated.
a Abbreviations: C_al, aliphatic (sp3) carbon; C_ar, aromatic or sp2 carbon.
b Fragments possessing only weak donor or acceptor property (max propensity, 0.7).

The limitation due to grid spacing also explains the improvement of the models once the grid spacing used for the development of the hydrogen-bond descriptors was changed from 0.7 to 1 Å. When only the donor and acceptor fields were used, the q² value increased from 0.46 to 0.52, and if all five fields were considered, q² increased from 0.67 to 0.71 (α = 0.7). As mentioned above, one important advantage of approximating the SuperStar fields is their fast calculation. Nissink et al.55 noted a saving in computational requirements by a factor of 5 and more for calculating SuperStar fields for a protein-binding site. In the case of calculating the hydrogen-bond property fields (i.e., two probes) of 76 thermolysin inhibitors, SuperStar required about 30 min of computing time. When the nonspherical and spherical fitted descriptors were applied, the calculation time was reduced to 8 and 7 min, respectively, which corresponds to a reduction by a factor of about 4.

Graphical Interpretation. To exemplify the graphical interpretation of the field contributions of the newly developed descriptors, the hydrogen-bonding properties were analyzed for the fitted SuperStar fields approximated with spherical Gaussians and taking the original


CoMSIA results as reference. Detailed analysis showed that the original SuperStar descriptors as well as those fitted by the nonspherical Gaussians produce comparable results. In Figure 8 the hydrogen-bond acceptor properties are shown. Areas within the receptor site are highlighted where putative hydrogen bonding partners in the enzyme can interact with ligand functional groups. As mentioned above, acceptor properties indicated as required at the receptor site originate from the correlation of donor groups present in the set of ligands under consideration. At first glance, the contour maps derived from CoMSIA and SuperStar display qualitatively similar features. For example, the three red contoured areas derived from CoMSIA (Figure 8, left) are also highlighted using SuperStar hydrogen-bond fields (Figure 8, right). Two of the contours encompass the carbonyl oxygen atoms in the terminal amide group of Asn 111 and Asn 112, indicating the presence of acceptor functionality at the receptor site to be favorable for binding affinity. This is in agreement with the fact that these amino acids are frequently involved as acceptors interacting with hydrogen-bond donor groups of potent thermolysin inhibitors. The third red area is positioned next to the OH group of Tyr 157, indicating a favorable acceptor facility in this area. An additional red isopleth occurs in the SuperStar map next to the position of a structurally important water molecule (Figure 8, water1). It is predicted to act as an acceptor mediating a hydrogen bond between a ligand and the backbone functional groups of Trp 115. Yellow contoured regions indicate areas for which protein acceptor groups are unlikely to be present. In both diagrams, one such contour encloses the NH2 group of Asn 112, predicting that an acceptor property should be missing in this area. In fact, the presence of the NH2 group of this residue exhibiting a donor facility confirms this condition indicated by the models. 
Different from CoMSIA, the SuperStar model produces a second yellow isopleth next to backbone atoms of Ala 113, restraining the occurrence of an acceptor in this region. This is in accordance with the presence of the backbone NH of Ala 113 acting as a donor group, even though the backbone carbonyl of the same residue is closely adjacent. A further area encompassing a second key water molecule (Figure 8, water2) can be correlated with the presence of protein functional groups. This structural water forms a hydrogen bond to the carboxamide oxygen of Gln 225. It is predicted to be unlikely to expose acceptor functionality to a bound ligand. The hydrogen-bond donor properties for both the CoMSIA and SuperStar model are depicted in Figure 9. Again, all contours produced by the original CoMSIA are similarly detected in the SuperStar map. Three principal contours, in blue, indicate the requirement for donor groups in the receptor to improve the binding affinity of inhibitors. In agreement with the superimposed protein environment, all fall next to protein residues exhibiting donor properties (NH groups of Asn 112, Trp 115, and Arg 203). The additional isopleths in blue in the SuperStar map are more difficult to assign to neighboring groups of the protein. A yellow contour encompassing the region next to the catalytic zinc highlights an area where the presence of a hydrogen-bond donor would reduce affinity. This relatively large


Figure 8. Diagram of the isocontour plot of stdev × coeff field contributions for hydrogen-bond acceptor properties (left, plot from original CoMSIA model; right, plot from the model using SuperStar propensities fitted by spherical Gaussians). Superimposed are some of the aligned inhibitors (white) together with key residues (green-blue) in the active site of thermolysin. The solvent-accessible surface is indicated as a solid surface. Structurally important water molecules are drawn as red spheres; zinc is in cyan. Red isopleths (contour level left, 0.001; right, 0.015) enclose areas where an acceptor group from the protein will be favorable for binding. Regions encompassed by yellow contours (contour level left, -0.005; right, -0.007) highlight hydrogen-bond acceptor capabilities at the receptor site that do not enhance affinity.

Figure 9. Diagram of the isocontour plot of stdev × coeff field contributions for hydrogen-bond donor properties (left, plot from original CoMSIA model; right, plot from the model using SuperStar propensities fitted by spherical Gaussians). Blue contoured regions (contour level left, 0.003; right, 0.009) indicate areas where the presence of a donor group at the receptor enhances binding affinity. No improvement in affinity can be expected in areas surrounded by yellow isopleths (contour level left, -0.008; right, -0.014).
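The stdev × coeff contribution maps contoured in Figures 8 and 9 can be approximated by the following Python/NumPy sketch. It assumes that the contribution at each lattice point is the standard deviation of the field values over the training set multiplied by the corresponding PLS coefficient. The sample standard deviation (`ddof=1`), the function names, and the toy numbers are assumptions; only the contour levels 0.015 and -0.007 are taken from the Figure 8 caption.

```python
import numpy as np

def stdev_coeff(field_matrix, coefficients, ddof=1):
    """Per-lattice-point contribution: stdev of the field column times its PLS coefficient.

    field_matrix: (n_molecules, n_grid_points) field values
    coefficients: (n_grid_points,) PLS regression coefficients
    """
    X = np.asarray(field_matrix, dtype=float)
    return X.std(axis=0, ddof=ddof) * np.asarray(coefficients, dtype=float)

def contour_masks(contribution, favorable_level, unfavorable_level):
    """Boolean masks of lattice points inside the positive and negative isopleths."""
    return contribution >= favorable_level, contribution <= unfavorable_level

# Toy example: three molecules, three lattice points, invented coefficients
X = [[0.0, 1.0, 2.0],
     [0.0, 3.0, 2.0],
     [0.0, 5.0, 4.0]]
coeff = [0.5, 0.01, -0.01]
contribution = stdev_coeff(X, coeff)
favorable, unfavorable = contour_masks(contribution, 0.015, -0.007)
```

A lattice point whose field value does not vary across the data set contributes nothing, whatever its coefficient, which is why such columns can be filtered out before the PLS analysis.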

isopleth is produced by the correlation of the various zinc-binding groups present in the different ligands. Since these metal coordinating groups usually exhibit an oxygen or nitrogen with a lone pair, the indication of favorable acceptor groups is reasonable. The occurrence of a structural water molecule (Figure 9, water1) coinciding with the yellow contour of the SuperStar map indicates that this water is unlikely to act as a donor. This is in accordance with the results from the hydrogen-bond acceptor properties (see above) where this water was predicted to expose acceptor properties toward a ligand. In contrast, the same water molecule in the CoMSIA map falls next to a blue contoured area indicating favorable donor properties toward the protein, even though it is positioned close to the yellow contoured area. In summary, the graphical interpretation of the

generated models produced qualitatively similar contour maps. However, the more detailed SuperStar descriptors compared to the approximate ones in the original CoMSIA provide additional information valuable for improved ligand design. This is apparent from the highly specified contours, which in turn correspond to structural features to be discovered in the neighboring receptor site.

Conclusions

In the present study, the successful development of knowledge-based descriptors derived from SuperStar has been demonstrated to achieve significant improvement in 3D QSAR analysis. In principle, any descriptor for an intermolecular interaction could be derived from


composite crystal-field environments and included in a comparative molecular field analysis. We selected a subset of seven diverse probe atoms that should reflect in a representative way the most important physicochemical properties such as hydrophobic or hydrogen-bonding features. In a comparative study we demonstrated that for the hydrogen-bonding properties only, the propensity information directly taken from crystal data achieved significant improvement. Thus, we suggest a mixed model, using the original CoMSIA fields for steric, electrostatic, and hydrophobic properties and taking propensities from composite crystal-field environments for hydrogen-bond donor and acceptor properties. Using a carbonyl oxygen and an amino hydrogen probe in SuperStar to represent hydrogen-bond acceptor and donor properties, respectively, we could perform a direct comparison between the original CoMSIA hydrogen-bond descriptors and the new SuperStar derived ones. The statistical performance as well as the graphical interpretation was compared. The significantly improved statistical results indicate that the new SuperStar descriptors provide additional information resulting in better correlations and higher predictive power of the developed QSAR models. Furthermore, the graphical interpretations of the CoMFA-type contribution maps show more detailed features directly corresponding to properties actually reflected in the surrounding protein environment. Accordingly, more conclusive indications of how to optimize a particular ligand with respect to its hydrogen-bonding features in a subsequent design process can be deduced. The calculation of SuperStar descriptors can be computationally quite demanding for large data sets. Therefore, it is recommended that these descriptors be approximated in order to speed up the calculation time without losing too much information.
Surprisingly, the statistical results even improved when the fitted descriptors were applied, supposedly because some of the background noise present in the maps of experimental origin is reduced.55 The graphical interpretation of the models resulted in qualitatively similar maps, with the SuperStar maps showing some additional information. However, more data sets have to be analyzed to further validate this observation.

Acknowledgment. The authors are grateful to Thomas Mietzner (BASF) who helped implement a routine for the fitting of propensity distributions to spherical Gaussians. Willem Nissink (University of Marburg, now CCDC) is warmly acknowledged for making available the nonspherical fitting program and for many fruitful discussions. We thank in particular Marcel Verdonk (CCDC, now Astex) for his support on the SuperStar program and its adaptation for usage in 3D QSAR. We are grateful to CCDC for making IsoStar available to us. Tripos kindly provided us with a copy of its modeling package SYBYL. Generous financial support by the German bmb+f (Grant No. 311 681) is gratefully acknowledged.

References

(1) Jeffrey, G. A.; Saenger, W. Hydrogen bonding in biological structures; Springer: Berlin, 1991.
(2) Böhm, H. J.; Klebe, G. What Can We Learn from Molecular Recognition in Protein-Ligand Complexes for the Design of New Drugs? Angew. Chem., Int. Ed. Engl. 1996, 35, 2588-2614.

(3) Goodford, P. J. A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J. Med. Chem. 1985, 28, 849-857.
(4) Boobbyer, D. N.; Goodford, P. J.; McWhinnie, P. M.; Wade, R. C. New hydrogen-bond potentials for use in determining energetically favorable binding sites on molecules of known structure. J. Med. Chem. 1989, 32, 1083-1094.
(5) Wade, R. C.; Clark, K. J.; Goodford, P. J. Further development of hydrogen bond functions for use in determining energetically favorable binding sites on molecules of known structure. 1. Ligand probe groups with the ability to form two hydrogen bonds. J. Med. Chem. 1993, 36, 140-147.
(6) Wade, R. C.; Goodford, P. J. Further development of hydrogen bond functions for use in determining energetically favorable binding sites on molecules of known structure. 2. Ligand probe groups with the ability to form more than two hydrogen bonds. J. Med. Chem. 1993, 36, 148-156.
(7) Miranker, A.; Karplus, M. Functionality maps of binding sites: a multiple copy simultaneous search method. Proteins 1991, 11, 29-34.
(8) Caflisch, A.; Miranker, A.; Karplus, M. Multiple copy simultaneous search and construction of ligands in binding sites: application to inhibitors of HIV-1 aspartic proteinase. J. Med. Chem. 1993, 36, 2142-2167.
(9) Caflisch, A.; Schramm, H. J.; Karplus, M. Design of dimerization inhibitors of HIV-1 aspartic proteinase: a computer-based combinatorial approach. J. Comput.-Aided Mol. Des. 2000, 14, 161-179.
(10) English, A. C.; Groom, C. R.; Hubbard, R. E. Experimental and computational mapping of the binding surface of a crystalline protein. Protein Eng. 2001, 14, 47-59.
(11) Danziger, D. J.; Dean, P. M. Automated site-directed drug design: a general algorithm for knowledge acquisition about hydrogen-bonding regions at protein surfaces. Proc. R. Soc. London, Ser. B 1989, 236, 101-113.
(12) Danziger, D. J.; Dean, P. M. Automated site-directed drug design: the prediction and observation of ligand point positions at hydrogen-bonding regions on protein surfaces. Proc. R. Soc. London, Ser. B 1989, 236, 115-124.
(13) Lommerse, J. P.; Taylor, R. Characterising non-covalent interactions with the Cambridge Structural Database. J. Enzyme Inhib. 1997, 11, 223-243.
(14) Mills, J. E.; Dean, P. M. Three-dimensional hydrogen-bond geometry and probability information from a crystal survey. J. Comput.-Aided Mol. Des. 1996, 10, 607-622.
(15) Mills, J. E.; Perkins, T. D.; Dean, P. M. An automated method for predicting the positions of hydrogen-bonding atoms in binding sites. J. Comput.-Aided Mol. Des. 1997, 11, 229-242.
(16) Singh, J.; Saldanha, J.; Thornton, J. M. A novel method for the modelling of peptide ligands to their receptors. Protein Eng. 1991, 4, 251-261.
(17) Laskowski, R. A.; Thornton, J. M.; Humblet, C.; Singh, J. X-SITE: use of empirically derived atomic packing preferences to identify favourable interaction regions in the binding sites of proteins. J. Mol. Biol. 1996, 259, 175-201.
(18) Wallace, A. C.; Laskowski, R. A.; Singh, J.; Thornton, J. M. Molecular recognition by proteins: protein-ligand interactions from a structural perspective. Biochem. Soc. Trans. 1996, 24, 280-284.
(19) Bernstein, F. C.; Koetzle, T. F.; Williams, G. J.; Meyer, E. E., Jr.; Brice, M. D.; et al. The protein data bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 1977, 112, 535-542.
(20) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; et al. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235-242.
(21) Snyder, J. P.; Rao, S. N.; Koehler, K. F.; Vedani, A. Minireceptors and pseudoreceptors. In 3D QSAR in Drug Design: Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, The Netherlands, 1993; pp 336-354.
(22) Vedani, A.; Zbinden, P.; Snyder, J. P. Pseudo-receptor modeling: a new concept for the three-dimensional construction of receptor binding sites. J. Recept. Res. 1993, 13, 163-177.
(23) Schmetzer, S.; Greenidge, P.; Kovar, K. A.; Schulze-Alexandru, M.; Folkers, G. Structure-activity relationships of cannabinoids: a joint CoMFA and pseudoreceptor modelling study. J. Comput.-Aided Mol. Des. 1997, 11, 278-292.
(24) Sippl, W.; Stark, H.; Holtje, H. D. Development of a binding site model for histamine H3-receptor agonists. Pharmazie 1998, 53, 433-437.
(25) Schleifer, K. J. Pseudoreceptor model for ryanodine derivatives at calcium release channels. J. Comput.-Aided Mol. Des. 2000, 14, 467-475.
(26) Schafferhans, A.; Klebe, G. Docking ligands onto binding site representations derived from proteins built by homology modelling. J. Mol. Biol. 2001, 307, 407-427.

(27) Ekins, S.; Waller, C. L.; Swaan, P. W.; Cruciani, G.; Wrighton, S. A.; et al. Progress in predicting human ADME parameters in silico. J. Pharmacol. Toxicol. Methods 2000, 44, 251-272. (28) Lipinski, C. A. Drug-like properties and the causes of poor solubility and poor permeability. J. Pharmacol. Toxicol. Methods 2000, 44, 235-249. (29) Eddershaw, P. J.; Beresford, A. P.; Bayliss, M. K. ADME/PK as part of a rational approach to drug discovery. Drug Discovery Today 2000, 5, 409-414. (30) Li, A. P. Screening for human ADME/Tox drug properties in drug discovery. Drug Discovery Today 2001, 6, 357-366. (31) Klebe, G.; Abraham, U.; Mietzner, T. Molecular similarity indices in a comparative analysis (CoMSIA) of drug molecules to correlate and predict their biological activity. J. Med. Chem. 1994, 37, 4130-4146. (32) Klebe, G. Comparative molecular similarity indices analysis: CoMSIA. Perspect. Drug Discovery Des. 1998, 12-14, 87-104. (33) Klebe, G.; Abraham, U. Comparative molecular similarity index analysis (CoMSIA) to study hydrogen-bonding properties and to score combinatorial libraries. J. Comput.-Aided Mol. Des. 1999, 13, 1-10. (34) Klebe, G. The use of composite crystal-field environments in molecular recognition and the de novo design of protein ligands. J. Mol. Biol. 1994, 237, 212-235. (35) Klebe, G.; Mietzner, T. A fast and efficient method to generate biologically relevant conformations. J. Comput.-Aided Mol. Des. 1994, 8, 583-606. (36) Bruno, I. J.; Cole, J. C.; Lommerse, J. P.; Rowland, R. S.; Taylor, R.; et al. IsoStar: a library of information about nonbonded interactions. J. Comput.-Aided Mol. Des. 1997, 11, 525-537. (37) Verdonk, M. L.; Cole, J. C.; Taylor, R. SuperStar: a knowledge-based approach for identifying interaction sites in proteins. J. Mol. Biol. 1999, 289, 1093-1108. (38) Verdonk, M. L.; Cole, J. C.; Watson, P.; Gillet, V.; Willett, P.
SuperStar: improved knowledge-based interaction fields for protein binding sites. J. Mol. Biol. 2001, 307, 841-859. (39) Muegge, I.; Martin, Y. C. A general and fast scoring function for protein-ligand interactions: a simplified potential approach. J. Med. Chem. 1999, 42, 791-804. (40) Gohlke, H.; Hendlich, M.; Klebe, G. Knowledge-based scoring function to predict protein-ligand interactions. J. Mol. Biol. 2000, 295, 337-356. (41) Gohlke, H.; Hendlich, M.; Klebe, G. Predicting binding modes, binding affinities and “Hot Spots” for protein-ligand complexes using a knowledge-based scoring function. Perspect. Drug Discovery Des. 2000, 20, 115-144. (42) DePriest, S. A.; Mayer, D.; Naylor, C. B.; Marshall, G. R. 3D-QSAR of angiotensin-converting enzyme and thermolysin inhibitors: a comparison of CoMFA models based on deduced and experimentally determined active site geometries. J. Am. Chem. Soc. 1993, 115, 5372-5384. (43) Inspection of crystal structures in the CSD containing a sulfonamide group reveals entries with both planar and pyramidal NH fragments. About 60% of the sulfonamides in these examples possess a planar NH, and 40% show a pyramidal NH group. In our analysis we tried to model the properties of this fragment according to the majority of occurrences. (44) SYBYL Molecular Modeling Package, version 6.7; Tripos, Inc. (1699 South Hanley Road, Suite 303, St. Louis, MO 63144), 2000. (45) Dewar, M. J. S.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. P. AM1: a new general purpose quantum mechanical molecular model. J. Am. Chem. Soc. 1985, 107, 3902-3909. (46) Stewart, J. J. MOPAC: a semiempirical molecular orbital program. J. Comput.-Aided Mol. Des. 1990, 4, 1-105. (47) MOPAC: a general molecular orbital package, version 6.0; QCPE #455: J. J. P. Stewart, Stewart Computational Chemistry, 15210 Paddington Circle, Colorado Springs, CO 80921. (48) Bush, B. L.; Nachbar, R. B., Jr. Sample-distance partial least squares: PLS optimized for many variables, with application to CoMFA. J. Comput.-Aided Mol. Des. 1993, 7, 587-619. (49) Thibaut, U.; Folkers, G.; Klebe, G.; Kubinyi, H.; Merz, A.; et al. Recommendations for CoMFA studies and 3D QSAR publications. In 3D QSAR in Drug Design: Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, The Netherlands, 1993; pp 711-716. (50) Kubinyi, H.; Abraham, U. Practical problems in PLS analyses. In 3D QSAR in Drug Design: Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, The Netherlands, 1993; pp 717-728. (51) Cramer, R. D., III; Patterson, D. E.; Bunce, J. D. Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Am. Chem. Soc. 1988, 110, 5959-5967. (52) Cramer, R. D., III; DePriest, S. A.; Patterson, D. E.; Hecht, P. The Developing Practice of Comparative Molecular Field Analysis. In 3D QSAR in Drug Design: Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, The Netherlands, 1993; pp 443-485. (53) Folkers, G.; Merz, A.; Rognan, D. CoMFA: scope and limitations. In 3D QSAR in Drug Design: Theory, Methods and Applications; Kubinyi, H., Ed.; ESCOM: Leiden, The Netherlands, 1993; pp 583-618. (54) Böhm, M.; Stürzebecher, J.; Klebe, G. Three-dimensional quantitative structure-activity relationship analyses using comparative molecular field analysis and comparative molecular similarity indices analysis to elucidate selectivity differences of inhibitors binding to trypsin, thrombin, and factor Xa. J. Med. Chem. 1999, 42, 458-477. (55) Nissink, J. W. M.; Verdonk, M. L.; Klebe, G. Simple knowledge-based descriptors to predict protein-ligand interactions. Methodology and validation. J. Comput.-Aided Mol. Des. 2000, 14, 787-803. (56) Hodgkin, E. E.; Richards, W. G. Molecular similarity based on electrostatic potential and electric field. Int. J. Quantum Chem., Quantum Biol. Symp. 1987, 14, 105-110.

JM011039X