J. Med. Chem. 1999, 42, 2498-2503
Evaluation of PMF Scoring in Docking Weak Ligands to the FK506 Binding Protein

Ingo Muegge,* Yvonne C. Martin, Philip J. Hajduk, and Stephen W. Fesik

Pharmaceutical Products Division, Abbott Laboratories, Abbott Park, Illinois 60064-3500

Received February 11, 1999
A new knowledge-based scoring function (PMF-score), implemented in the DOCK4 program, was used to screen a database of 3247 small molecules for binding to the FK506 binding protein (FKBP). The computational ranking of these compounds was compared to the binding affinities measured by NMR. It was demonstrated that small, weakly binding molecules have, on average, higher computational scores than nonbinders and are enriched in the upper ranks of the computational scoring lists. In addition, the results obtained with the PMF scoring function were superior (by 30-120% larger enrichment factors) to those obtained with the standard force field score of DOCK4. The reliable ranking of small, weakly binding molecules offers new ways of designing building blocks in combinatorial libraries as well as SAR by NMR libraries with an increased chance of identifying suitable lead compounds for drug design.

Introduction

The fast and cost-effective identification of suitable lead compounds is an extremely important step in the drug-discovery process. In the search for such leads, biological screening techniques like high-throughput screening have usually been employed to screen proprietary databases of hundreds of thousands of compounds.1 However, these databases often do not contain a molecule with the desired properties that binds to a specific target macromolecule. To increase the chances of finding suitable leads, more compounds can be included in the library by buying or synthesizing them. Combinatorial chemistry can be used to prepare large libraries of tens or even hundreds of thousands of compounds.2,3 However, since the number of compounds that could theoretically be synthesized is so overwhelmingly large (>10^60), combinatorial libraries contain only a tiny fraction of the conceivable molecular space. Thus one would ideally like to guide library design by selecting a set of compounds to make for a particular target.
* To whom correspondence should be addressed. Present address: Bayer Corp., 400 Morgan Lane, West Haven, CT 06516. Phone: (203) 812-5139. Fax: (203) 812-3505. E-mail: [email protected].

One way of determining molecular fragments that bind to the individual pockets of a protein target is through the use of a recently introduced NMR-based screening method called SAR by NMR.4 Using this technique, small organic molecules are identified that bind to proximal subsites of a protein.4 Binding is determined by the observation of 15N or 1H amide chemical shift changes in two-dimensional 15N heteronuclear single-quantum correlation (15N HSQC) spectra. When two ligands that bind to proximal binding sites have been identified, the incorporation of a linker between the two molecules can produce a high-affinity ligand. There are two major advantages of this method. First, the binding site of the molecules can be rapidly identified based on the chemical shift changes observed in the NMR spectra. Second, even molecules with low binding affinities (in the millimolar range) can be reliably identified. Similar to building blocks in combinatorial chemistry, the molecules to be linked should be small since the final ligand that is built from these small molecules should ideally have a molecular weight less than 500 Da.5

In principle, another approach for identifying small molecules that bind to proteins is by computational methods. Computational approaches have the advantage over SAR by NMR, or other experimental approaches, of not requiring any protein sample. Indeed, computational methods have been widely used to aid in the design of combinatorial libraries6 by implementing molecular diversity and cluster methods to increase the chances of finding active compounds in biological screening.7 Ideally, however, computational methods can be used to screen a library of compounds for binding to a biological target. The key is to develop algorithms like docking8-16 or de novo design17,18 and scoring functions18-28 that reliably predict binding affinities of small molecules to biological targets.

Although the design of reliable scoring functions is a long-standing issue in computational chemistry, the accuracy of existing scoring functions is not yet high enough to make reliable predictions of protein-ligand binding affinities for a large number of molecules in a reasonable time. Most scoring functions were evaluated by demonstrating that they can identify strongly binding molecules in databases. Unfortunately, strong binders are unlikely to be found in an existing database. Furthermore, assuming additivity in binding affinity of the building blocks of a library, it will be much more effective to dock and score putative building blocks than the enumerated library itself.
Therefore, a computational scoring function must be able to solve the much more challenging problem of identifying small, weakly binding molecules that can serve as building blocks in the design of combinatorial libraries or SAR by NMR libraries. Recently, we introduced a new knowledge-based scoring function (potential of mean force score (PMF-score))
10.1021/jm990073x CCC: $18.00 © 1999 American Chemical Society Published on Web 06/29/1999
that combines the accuracy of empirical scoring functions with the advantage of higher generality and therefore wider applicability.28 It was shown that PMF-score performs better than the most prominent empirical scoring function used in docking and de novo design programs17,20 by scoring a wide variety of protein-ligand structures of the Brookhaven Protein Data Bank29 (PDB) and modeled HIV-1 protease inhibitors. Here we report the implementation of this new scoring function into the DOCK4 program9,11 and the use of this DOCK4/PMF-score approach to dock a library of 3247 small molecules into the FK506 binding site of FKBP. The scores obtained by this computational approach were compared to the FKBP binding affinities of these compounds measured by NMR. This is the first time that a docking/scoring approach has systematically tackled the challenging task of identifying weakly binding molecules.

Methods

Experimental Determination of Dissociation Constants. Ligand binding was detected by acquiring sensitivity-enhanced 15N HSQC spectra30 on uniformly 15N-labeled FKBP in the presence and absence of added compound. NMR samples consisted of 0.3 mM FKBP in an aqueous buffer composed of 50 mM phosphate, 100 mM NaCl, 10% D2O, pH 6.5. The compounds were added as solutions in perdeuterated DMSO. All NMR spectra were acquired on a Bruker DMX500 spectrometer at 303 K. For compounds which produced measurable chemical shift changes at concentrations of 1.0 mM, dissociation constants were obtained by monitoring the average-weighted chemical shift changes [$\Delta(^{1}\mathrm{H},^{15}\mathrm{N}) = (\delta(^{1}\mathrm{H})^{2} + (\delta(^{15}\mathrm{N})/5)^{2})^{0.5}$] of the backbone amides for residues D37, S38, V55, I56, W59, I90, I91, and F99 as a function of ligand concentration over the range of 0-2 mM. Data were fit using a single-binding site model. A least-squares grid search was performed by varying the values of KD and the chemical shift of the fully saturated protein.
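The fitting procedure described above can be illustrated with a minimal Python sketch. The paper states only that a single-binding site model and a least-squares grid search were used; the quadratic 1:1 binding expression, the grid ranges, the function names, and the synthetic data below are assumptions made for illustration, not the authors' implementation.

```python
import math

def weighted_shift(d_h, d_n):
    """Average-weighted shift change (ppm): sqrt(d_h^2 + (d_n / 5)^2)."""
    return math.sqrt(d_h ** 2 + (d_n / 5.0) ** 2)

def single_site_shift(l_tot, kd, d_max, p_tot=0.3):
    """Predicted shift change for 1:1 binding (assumed model form).

    l_tot, kd, p_tot are in mM; d_max is the shift change of the
    fully saturated protein in ppm. p_tot defaults to 0.3 mM FKBP.
    """
    s = p_tot + l_tot + kd
    bound = (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / 2.0
    return d_max * bound / p_tot

def grid_search(l_values, shifts, kd_grid, dmax_grid, p_tot=0.3):
    """Return (kd, d_max, sse) with the smallest sum of squared residuals."""
    best = None
    for kd in kd_grid:
        for d_max in dmax_grid:
            sse = sum(
                (single_site_shift(l, kd, d_max, p_tot) - y) ** 2
                for l, y in zip(l_values, shifts)
            )
            if best is None or sse < best[2]:
                best = (kd, d_max, sse)
    return best

# Synthetic titration over 0-2 mM ligand (made-up values, not measured data).
ligand_mM = [0.0, 0.25, 0.5, 1.0, 1.5, 2.0]
observed = [single_site_shift(l, kd=0.5, d_max=0.3) for l in ligand_mM]

kd_grid = [0.05 * i for i in range(1, 201)]    # 0.05 to 10 mM (assumed range)
dmax_grid = [0.01 * i for i in range(1, 101)]  # 0.01 to 1.00 ppm (assumed range)
kd_fit, dmax_fit, sse = grid_search(ligand_mM, observed, kd_grid, dmax_grid)
print(f"KD = {kd_fit:.2f} mM, saturated shift = {dmax_fit:.2f} ppm")
```

In this sketch the fit is run once per residue; averaging the resulting KD values over residues would then follow as a separate step.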
The reported dissociation constants are averages over those residues for which the average-weighted chemical shift difference between the free and bound states was greater than 0.1 ppm. For compounds which produced no chemical shift changes at concentrations of 1.0 mM, dissociation constants were estimated to be greater than 10.0 mM.

Validation of Molecular Structure and Aqueous Solubility. Molecular structures and solubility were validated for selected compounds by analyzing one-dimensional 1H NMR spectra obtained on compounds dissolved to 200 µM in either CDCl3, DMSO-d6, or D2O. Compounds were considered to be sufficiently soluble for SAR by NMR analysis when the compounds were observed in NMR spectra obtained in D2O and there was no visible precipitation.

Implementing the PMF-Score into DOCK4. The PMF-score of a protein-ligand complex is a knowledge-based measure of its binding free energy. It is calculated as the sum over all protein-ligand atom pair interaction free energies Aij(r) as a function of the atom pair distance r by
$$\text{PMF-score} = \sum_{kl} A_{ij}(r) \tag{1}$$
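As a rough illustration of eq 1, the following Python sketch sums tabulated pair potentials over all protein-ligand atom pairs. The atom type labels, the potential values, the bin width, and the rule that untabulated pairs or distances contribute zero are all hypothetical choices for this example; they are not the published PMF parameters or the DOCK4 implementation.

```python
import math

# Hypothetical lookup tables for A_ij(r): one list of values per
# (protein atom type, ligand atom type) pair, on a regular distance grid.
# All labels and numbers are invented for illustration.
BIN_WIDTH = 1.0  # grid spacing in angstroms (assumed)
POTENTIALS = {
    ("C_aliphatic", "C_aliphatic"): [0.5, -0.2, -0.4, -0.1],
    ("O_acceptor", "N_donor"): [1.0, -0.8, -0.3, 0.0],
}

def pair_energy(p_type, l_type, r):
    """Look up A_ij(r) for one atom pair.

    Assumption: pairs without a table, or distances beyond the
    tabulated range, contribute zero.
    """
    table = POTENTIALS.get((p_type, l_type))
    if table is None:
        return 0.0
    idx = int(r / BIN_WIDTH)
    if idx >= len(table):
        return 0.0
    return table[idx]

def pmf_score(protein_atoms, ligand_atoms):
    """Sum A_ij(r) over every protein-ligand atom pair."""
    total = 0.0
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            r = math.dist(p_xyz, l_xyz)  # Euclidean distance (Python 3.8+)
            total += pair_energy(p_type, l_type, r)
    return total

# Toy complex: two protein atoms and two ligand atoms.
protein = [("O_acceptor", (0.0, 0.0, 0.0)), ("C_aliphatic", (3.0, 0.0, 0.0))]
ligand = [("N_donor", (1.5, 0.0, 0.0)), ("C_aliphatic", (3.0, 2.5, 0.0))]
score = pmf_score(protein, ligand)
print(f"toy PMF-score: {score:.2f}")
```

A grid-based docking program would typically precompute such sums on a lattice around the binding site rather than loop over all pairs for every pose; the double loop here is kept only because it mirrors the summation directly.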