

J. Med. Chem. 2005, 48, 4076-4086

BRUTUS: Optimization of a Grid-Based Similarity Function for Rigid-Body Molecular Superposition. 1. Alignment and Virtual Screening Applications

Anu J. Tervo,†,‡ Toni Rönkkö,† Tommi H. Nyrönen,‡ and Antti Poso*,†

Department of Pharmaceutical Chemistry, University of Kuopio, P. O. Box 1627, 70211 Kuopio, Finland, and CSC-Scientific Computing Limited, P. O. Box 405, 02101 Espoo, Finland

Received November 2, 2004

We have developed a fast grid-based algorithm, BRUTUS, for rigid-body molecular superposition and similarity searching. BRUTUS aligns molecules using field information derived from charge distributions and van der Waals shapes of the compounds. Molecules can have similar biological properties if their charge distributions and shapes are similar, even though they have different chemical structures; that is, BRUTUS can identify compounds possessing similar properties, regardless of their structures. In this paper, we present two applications of BRUTUS. First, BRUTUS was used to superimpose five sets of inhibitors. Second, two sets of known inhibitors were searched from a database, and the results were analyzed using self-organizing maps. We demonstrate that BRUTUS is successful in superimposing compounds using molecular fields and, importantly, is fast and accurate enough for virtual screening of chemical databases using a standard personal computer. This fast and efficient molecular-field-based algorithm is applicable to virtual screening of structurally diverse, active molecules.

Introduction

Virtual screening of chemical databases to identify novel lead compounds has become an important tool in drug design.1,2 In recent years, several successful virtual screening experiments have been reported,3-11 which demonstrates the usefulness of the methodology in modern drug discovery. This has raised interest in the development of new methods and algorithms for virtual screening. We have introduced our novel approach for molecular superposition and similarity searching, BRUTUS, which aligns compounds based on their molecular fields.12 In BRUTUS, electrostatic fields are modeled with electrostatic potentials (ESPs) that are derived from precalculated point charges. Steric fields are derived from the van der Waals (vdW) shape of the studied compounds.
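As a rough illustration of how an ESP field can be derived from precalculated point charges on a regular grid, the following minimal sketch evaluates a plain Coulomb sum at each grid point. It is not the BRUTUS implementation, which is described in ref 12. The function name `esp_on_grid`, the kcal/mol unit convention, and the `min_dist` clamp are assumptions made only for this example; the default spacing of 2.0 Å mirrors the field resolution used later in this paper.

```python
import math

# Coulomb constant in kcal*Å/(mol*e^2); a common molecular-mechanics value
# (the choice of units is an assumption of this sketch).
COULOMB_KCAL = 332.0636


def esp_on_grid(atoms, origin, shape, spacing=2.0, min_dist=1.0):
    """Return {(i, j, k): potential} for a regular grid.

    atoms:    iterable of (x, y, z, q), coordinates in Å, partial charge in e
    origin:   (x, y, z) position of grid point (0, 0, 0)
    shape:    number of grid points along x, y, and z
    spacing:  grid resolution in Å
    min_dist: hypothetical clamp that avoids the singularity at atom centres
    """
    atoms = list(atoms)
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                potential = 0.0
                for x, y, z, q in atoms:
                    # Clamp short distances so the sum stays finite.
                    d = max(math.dist(point, (x, y, z)), min_dist)
                    potential += COULOMB_KCAL * q / d
                grid[(i, j, k)] = potential
    return grid


# Example: a single unit charge at the origin, sampled at two grid points.
single_charge_field = esp_on_grid([(0.0, 0.0, 0.0, 1.0)],
                                  origin=(0.0, 0.0, 0.0), shape=(2, 1, 1))
```

A dictionary keyed by grid indices keeps the sketch dependency-free; an array-based layout would be the natural choice for fields of realistic size.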
Thus, BRUTUS concentrates on the fields that describe intermolecular recognition properties of molecules instead of molecular structures. In this paper, we evaluate the ability of BRUTUS to generate molecular alignments and conduct similarity searching. A more detailed description of the BRUTUS algorithm is presented elsewhere.12

Many physicochemical and biological properties of compounds (e.g., activity because of complex formation with a target protein) are fundamentally related to the electron density surrounding the atomic nuclei of the molecule. Two molecules can possess similar biological properties if their charge distributions and shapes are similar, even though they have different molecular structures. To compare the similarity between two compounds, chemical features that lead to their observed properties are first superimposed. Several publications have reported different superposition methodologies, and these have been extensively reviewed by Lemmen et al.13 Some reported superposition methods are based on the steric and electrostatic properties of molecules,14-21 resembling the approach used in BRUTUS. Field-based alignment methods are less dependent on matching specific structural features of compounds than traditional atom-based approaches and tend to be better in aligning structurally diverse compounds.22

Similarity searching is a common methodology in virtual screening.23-25 The typical aim of similarity searching is to discover compounds from chemical databases that exhibit similar molecular properties toward a given target molecule or a set of molecules, for example, similar binding affinity for a target protein. Several reported similarity searching approaches are based on structural or substructural similarities of compounds26-28 or on their molecular shape.29-33 It is commonly recognized that compounds having similar structures often have similar properties. However, similarity searches that concentrate exclusively on structural similarity may fail to identify structurally dissimilar compounds that have similar molecular fields and thus could have similar biological properties. Therefore, the advantage of BRUTUS, compared to many other similarity searching methods, is its ability to locate compounds that have similar molecular fields regardless of their structure and thus identify molecules with diverse structures but similar molecular properties.

In similarity searching, BRUTUS first aligns compounds on the basis of their molecular fields, and the numerical characterization of the aligned field data is then classified using a self-organizing map (SOM)34 procedure. Similarity searching approaches based on molecular fields,35,36 or on the utilization of field-derived data,37-39 have been previously described. BRUTUS is a fast and efficient field-based method applicable for virtual screening of chemical databases in a reasonable

* To whom correspondence should be addressed. Tel: +358-17162462. Fax: +358-17-162456. Email: [email protected].
† Department of Pharmaceutical Chemistry, University of Kuopio.
‡ CSC-Scientific Computing Limited.

10.1021/jm049123a CCC: $30.25 © 2005 American Chemical Society Published on Web 05/18/2005

BRUTUS, 1

Journal of Medicinal Chemistry, 2005, Vol. 48, No. 12 4077

Figure 1. Molecular superposition of the HIV-1 PR inhibitors. Compound 12 (green) was aligned with the template molecule 1 (red) using Gasteiger-Marsili charges. The alignment of these structurally different inhibitors yielded a rmsd value of 1.153 Å against the experimentally determined alignment of 12 (blue).

time; therefore, it may be of great value in future drug design experiments.

Results and Discussion

Molecular Alignments. BRUTUS was used to generate rigid-body mutual superpositions for compounds in five individual inhibitor sets. The sets consisted of 36 human immunodeficiency virus protease (HIV-1 PR) inhibitors, 14 human rhinovirus coating protein (HRV-14) inhibitors, 12 elastase inhibitors, 8 thermolysin inhibitors, and 6 matrix metalloproteinase 8 (MMP-8) inhibitors (see the Supporting Information for the structures of the compounds). The size of the selected inhibitors varied between 39 and 162 atoms per molecule, and the inhibitors contained diverse functional groups. All of the inhibitors were obtained from enzyme-inhibitor complexes that were retrieved from the Protein Data Bank (PDB).40

For superpositions, one inhibitor from each set was chosen to be the template molecule with which the others were aligned. BRUTUS was assumed to better superimpose different-sized compounds when a large- or medium-sized template was selected, that is, when the template had energy fields extensive enough for the alignments.12 It was also desirable for the template to contain all or most of the energy fields required for biological activity. Considering these points, the following template inhibitors were selected: HIV-1 PR inhibitor 1 (Figure 1), HRV-14 inhibitor 2 (Figure 2), elastase inhibitor 3 (Figure 3), thermolysin inhibitor 4 (Figure 4), and MMP-8 inhibitor 5 (Figure 5).

At the moment, BRUTUS operates on rigid molecules. Therefore, inhibitors were used in their crystallographically determined binding conformations. Each inhibitor

was first arbitrarily rotated and positioned, followed by a superposition with the template of the corresponding inhibitor set. A root mean square distance (rmsd) value between the superimposed inhibitor and the experimental alignment of the same inhibitor was then calculated. BRUTUS was allowed to generate five superpositions for each inhibitor, and the alignment yielding the best rmsd value was taken into account. The effect of partial charges on aligning inhibitors was also of interest. Therefore, the superpositions were generated by consecutively using four different types of partial charges: Gasteiger-Hückel, Gasteiger-Marsili, MMFF94, and scaled MOPAC 6.0 MNDO/ESP charges implemented in Sybyl.41 A field resolution of 2.0 Å was chosen to obtain an estimate of the superposing accuracy used in similarity searching.

In any evaluation of molecular alignments, a large rmsd can be misleading, as it may result from a modest overall fit or from a precise fit in one structural part of the compounds and a clearly inaccurate fit in another. In addition, the magnitude of the rmsd value is dependent on the number of atoms in a compound. Therefore, larger rmsd values should be accepted for the alignments of compounds that are larger in size. Some positional error is introduced by solving resolutions of the X-ray structure data from the enzyme-inhibitor complexes (as a rule of thumb, approximately 1/6 of the resolution42) and the subsequent alignment procedure of these complexes. As a field-based alignment method, BRUTUS itself introduces some structural inaccuracies into the superpositions.12 In their studies, Parretti et al. have observed that optimal alignments based on electrostatic potentials are often different from optimal structural fitting.16 This indicates that an optimal field-based alignment that is generated by BRUTUS may vary from the optimal structural alignment. Considering these points, alignments yielding a rmsd value better than 2.5 Å were considered to be adequate. Alignments of a single inhibitor that yielded rmsd values of 2.0 Å or worse, with at least two types of partial charges, were also visually inspected to determine likely causes for the lower superimposing accuracy. Individual rmsd values for each superimposed inhibitor are presented in Table 1. Percentages of the inhibitor alignments that yielded an rmsd value below 2.5 Å are presented in Table 2.

Figure 2. Molecular superposition of the HRV-14 inhibitors showing reverse binding modes. Compound 52 (green) was aligned with the template molecule 2 (red) using MMFF94 charges, yielding a rmsd value of 1.194 Å against the experimentally determined alignment of 52 (blue). Coulomb ESP fields at the -40/40 level for both inhibitors are also illustrated. Areas having negative potential are colored in blue and areas having positive potential are colored in red.

(1) HIV-1 PR Inhibitors. This series contained 36 inhibitors of various sizes and with several types of molecular structures and functional groups. HIV-1 PR is a C2-symmetrical dimer; therefore both symmetrical and asymmetrical inhibitors were present in the set. Because of the large structural diversity, inhibitors in this set exhibited differences within their binding modes and the volume occupied in the HIV-1 PR active site by different inhibitors varied to some extent. Good rmsd values were obtained for inhibitors with a high number of atoms (e.g., inhibitors 6, 8, 10, 15, 35, and 36, the total number of atoms was over 70 for each compound). This suggests a good superimposing ability for BRUTUS. Inaccuracy in the superpositions was observed for those inhibitors that had one or two large side chains in a different orientation than the template inhibitor 1 (14, 22, 26, and 33). However, acceptable

alignments were also obtained for comparable cases (27, 29, 31, and 38). Two superimposed, structurally different HIV-1 PR inhibitors having an unobvious experimental alignment are presented in Figure 1. This alignment would be extremely difficult to achieve on the basis of the molecular structures of the inhibitors, but BRUTUS’s superposition based on molecular fields was successful. (2) HRV-14 Inhibitors. The 14 inhibitors in this set were structurally similar, with terminal heterocycles and a bridging aliphatic or aromatic moiety. These compounds occupied mostly the same volume in the HRV-14 active site, except for 46, which was bound deeper within the enzyme binding pocket. The HRV-14 inhibitors represented two distinct binding modes, which differed in the reverse orientation of the entire inhibitor. A reverse binding mode was observed for 45, 51, and 52 when compared to the template 2. Eleven of the HRV-14 inhibitors were superimposed successfully with all partial charge types, irrespective of their binding modes. Superposition of the inhibitors 44 and 46 resulted in two and three inaccurate alignments, respectively, but BRUTUS managed to find acceptable alignments for these inhibitors as well (Table 1). Figure 2 shows the alignment of two HRV-14 inhibitors, 2 and 52, binding in a reverse orientation at the HRV-14 active site. Molecular fields were visualized in Sybyl separately for both inhibitors using MOLCAD+


Figure 3. Molecular superposition of the elastase inhibitors. Compound 63 (green) was aligned with the template molecule 3 (red) using MMFF94 charges, yielding a rmsd value of 1.014 Å against the experimentally determined alignment of 63 (blue).

Figure 4. Molecular superposition of the thermolysin inhibitors. Compound 67 (green) was aligned with the template molecule 4 (red) using MNDO/ESP charges, yielding a rmsd value of 0.941 Å against the experimentally determined alignment of 67 (blue).

Coulomb ESP at the -40/40 level. Areas with negative potential are shaded in blue, and areas with positive potential are shaded in red. Reverse alignment was not expected on the basis of the molecular structures. However, on the basis of the molecular fields, it is clear that two reasonable alignments can be generated for these inhibitors. Methods relying on structural similarity may have difficulties in detecting a reverse binding mode for HRV-14 inhibitors.26,28 However, BRUTUS focuses on molecular fields instead of structural similarity; thus, it was able to recognize both of the alignment alternatives for these inhibitors. Previously, the reverse orientations for HRV-14 inhibitors were successfully recognized by Klebe22 with the field-based superposition method SEAL18 and with the surface shape-based method described by Cosgrove et al.43 Reasonable reverse alignment alternatives for structurally similar compounds have also been described by Kubinyi and coworkers44 when using the SEAL method in their studies of the well-known Cramer steroid set.45

Figure 5. Molecular superposition of the MMP-8 inhibitors. Compound 73 (green) was aligned with the template molecule 5 (red) using Gasteiger-Hückel charges, yielding a rmsd value of 1.137 Å against the experimentally determined alignment of 73 (blue).

(3) Elastase Inhibitors. Structurally analogous elastase inhibitors can exhibit several different binding modes,46 and a single template molecule covering all of the possible binding pockets of the elastase active site was not available. The inhibitors of this set were only checked for common, rather than distinct, molecular volume with the medium-size template 3; thus, the binding modes of the selected inhibitors and the occupied subpockets in the elastase active site differed to some extent. BRUTUS succeeded in superimposing inhibitors having a similar binding mode with the template (58, 61, 62, and 64) but had difficulties in superimposing inhibitors that occupied different subpockets in the elastase active site (54, 56, 57, 59, and 60). However, it cannot be expected that all of the inhibitors would be correctly aligned, since the template did not cover extensively all of the energy fields that would have been necessary to accommodate all of the superpositions.12 Difficulties in superimposing elastase inhibitors with various binding modes have been previously reported by several authors.26,28 The elastase inhibitor set can be seen as a demonstration of the importance of a similar binding mode for superimposed compounds. If there is too little informa-


Table 1. Individual rmsd Values for the Superimposed Inhibitors

                               rmsd (Å)
compd  PDB ID^a  GastHück^b  GastMars^c  MMFF94  MNDO/ESP

HIV-1 PR inhibitors; superimposed with template molecule 1 (1qbr)
6      1a8g      1.395       0.956       0.690   1.010
7      1ajv      0.921       0.921       0.956   0.921
8      1ajx      0.831       0.581       0.531   0.497
9      1d4h      2.210       0.533       0.458   0.890
10     1d4i      1.006       1.264       1.160   0.765
11     1d4j      0.725       0.649       1.152   1.069
12     1d4k      3.866       1.153       1.208   1.438
13     1d4l      0.998       0.991       0.970   0.995
14     1dif      4.058       5.239       4.058   1.150
15     1dmp      0.520       0.583       0.678   0.678
16     1ebw      1.368       0.580       1.313   0.809
17     1eby      1.227       1.227       1.227   1.050
18     1ebz      0.942       0.656       2.115   0.913
19     1ec0      0.346       0.856       2.007   2.892
20     1ec1      1.014       3.054       1.337   2.020
21     1ec2      1.183       1.935       1.492   3.323
22     1ec3      3.237       2.589       0.791   0.907
23     1g2k      0.695       0.598       1.068   0.764
24     1g35      1.044       0.556       0.707   0.707
25     1hef      0.774       0.993       0.618   0.658
26     1heg      2.249       0.786       2.249   2.806
27     1hos      1.941       1.894       1.960   3.184
28     1hpv      0.798       0.589       1.220   0.491
29     1hpx      2.824       0.567       1.757   0.669
30     1hwr      0.762       1.003       0.920   0.920
31     1hxw      1.142       0.781       0.781   0.781
32     1mtr      1.035       1.035       0.882   1.044
33     1ody      3.439       2.979       3.544   2.644
34     1qbs      0.425       0.778       0.569   0.526
35     1qbt      0.494       0.494       0.427   0.629
36     1qbu      0.377       0.816       0.377   0.816
37     1upj      2.042       0.952       0.952   0.791
38     1vij      0.717       0.283       1.874   2.694
39     2bpy      2.236       2.236       0.697   1.195
40     7upj      1.145       3.859       3.549   1.978

HRV-14 inhibitors; superimposed with template molecule 2 (1ruc)
41     1hri      1.815       0.914       0.781   2.252
42     1hrv      1.703       1.828       1.443   1.443
43     1r08      1.989       1.368       1.368   1.368
44     1r09      4.398       1.262       1.371   7.548
45     1rud^d    0.478       1.597       0.864   0.766
46     1vrh      4.958       7.255       4.958   1.251
47     2hwb      0.545       0.743       0.582   0.834
48     2hwc      0.751       0.586       0.395   0.699
49     2r04      1.879       2.194       0.697   0.977
50     2r07      0.719       0.719       0.325   0.465
51     2rm2^d    1.663       2.034       2.034   1.162
52     2rs3^d    1.257       1.257       1.194   0.918
53     2rs5      0.679       0.690       0.669   0.945

Elastase inhibitors; superimposed with template molecule 3 (1ele)
54     1b0e      5.488       1.838       1.737   5.480
55     1bma      0.870       0.891       0.544   0.948
56     1eas      3.679       2.001       3.313   3.128
57     1eat      5.042       4.378       4.640   5.002
58     1ela      1.237       1.128       0.919   0.412
59     1elb      3.181       2.802       4.499   3.762
60     1elc      6.430       6.656       6.725   4.072
61     1eld      0.702       0.596       0.648   0.549
62     1h9l      0.744       1.226       1.216   3.255
63     1hv7      3.756       4.234       1.014   1.924
64     7est      0.528       0.689       0.716   0.360

Thermolysin inhibitors; superimposed with template molecule 4 (5tmn)
65     1os0      0.753       0.889       0.816   1.027
66     1thl      1.097       1.097       0.420   1.074
67     1tlp      1.227       0.786       0.786   0.941
68     1tmn      0.752       0.987       0.817   1.625
69     4tmn      1.060       1.227       1.099   1.765
70     5tln      0.905       1.413       1.211   4.185
71     6tmn      0.270       0.270       0.270   0.500

MMP-8 inhibitors; superimposed with template molecule 5 (1mmb)
72     1a85      0.825       2.902       1.533   2.248
73     1a86      1.137       1.139       1.157   1.079
74     1bzs      0.218       1.308       1.197   0.543
75     1jj9      4.740       2.265       1.145   1.357
76     1kbc      1.689       0.752       0.587   0.444

a PDB ID of the corresponding enzyme-inhibitor complex. b Gasteiger-Hückel charges. c Gasteiger-Marsili charges. d HRV-14 inhibitors showing a reverse binding mode to 2.
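The rmsd evaluation and the 2.5 Å adequacy criterion described in the text can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `rmsd` and `percent_below` are hypothetical, atoms are assumed to be listed in the same order in both coordinate sets, and the numbers are copied from the thermolysin rows of Table 1 (compounds 65-71).

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square distance (Å) between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def percent_below(values, cutoff=2.5):
    """Percentage of rmsd values strictly below the cutoff (in Å)."""
    return 100.0 * sum(v < cutoff for v in values) / len(values)


# Thermolysin inhibitors 65-71 from Table 1, one list per charge type.
thermolysin = {
    "GastHuck": [0.753, 1.097, 1.227, 0.752, 1.060, 0.905, 0.270],
    "GastMars": [0.889, 1.097, 0.786, 0.987, 1.227, 1.413, 0.270],
    "MMFF94":   [0.816, 0.420, 0.786, 0.817, 1.099, 1.211, 0.270],
    "MNDO/ESP": [1.027, 1.074, 0.941, 1.625, 1.765, 4.185, 0.500],
}

# Share of adequate alignments (rmsd < 2.5 Å) for each charge type.
adequate = {name: percent_below(vals) for name, vals in thermolysin.items()}

# Toy check of rmsd: every atom is displaced by exactly 1 Å along z.
toy_rmsd = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])
```

Under the best-of-five protocol described in the text, the value reported for an inhibitor would correspond to the minimum of `rmsd` over its five generated superpositions.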

Table 2. Percentages of Alignments that Yielded rmsd Values Better than 2.5 Å in the Five Superimposed Inhibitor Series (% rmsd