J. Chem. Inf. Model. 2007, 47, 2374-2382
A Pharmacophore Map of Small Molecule Protein Kinase Inhibitors
Malcolm J. McGregor†
EMD Serono Research Institute, One Technology Place, Rockland, Massachusetts 02370
Received July 11, 2007
Protein kinases have emerged as one of the major drug target classes that are amenable to the development of small molecule inhibitors. They share a conserved structural similarity in the region of the ATP binding site, where most inhibitors interact. Using a pharmacophore approach, we have explored the features of protein-ligand interactions for a set of 220 kinase crystal structures from the Protein Data Bank (PDB). The resulting “pharmacophore map” shows the interactions made by all ligands with their receptors simultaneously. This gives insight that has been applied in the design of kinase screening sets and combinatorial libraries. An algorithm is described that scores small molecule structures for goodness of fit to the resulting binding site description obtained from the analysis. The algorithm identifies a pose that is close to the crystal structure pose for the majority of ligands, with only the 2D chemical structure as input and no knowledge of the crystal structure from which it was derived. Application of the algorithm to a test screening set gave a useful enrichment of active compounds.
INTRODUCTION
Protein kinases are currently receiving a great amount of attention as drug targets for the development of orally available small molecule inhibitors.1 They are critical components of cellular signal transduction pathways, the control of which has applications in the treatment of important diseases, particularly in the fields of oncology2,3 and inflammation. Although of relatively recent origin, several protein kinase inhibitors have been approved for the treatment of cancer and many more are in various stages of clinical development and in the pipelines of major pharmaceutical companies.4,5
Protein kinases are a superfamily of enzymes that number over 500 in the human genome,6 for which there is a great deal of structural information available.7 They share a common structural fold and catalytic mechanism that have been described in detail.8,9 The catalytic mechanism involves ATP, which provides a phosphate group for phosphorylation of tyrosine, serine, or threonine residues of a protein substrate. Although kinase domains exist in different molecular and cellular contexts and have different mechanisms of activation,10 the ATP binding site is highly conserved in both sequence and structure across all members of the superfamily. Most small molecule inhibitors bind to this site and compete with ATP.11
Many pharmaceutical companies are currently involved in large scale screening of small molecule compounds for activity against kinase targets. The binding site similarities common to all or most members of the superfamily enable the design of kinase-focused screening sets.12,13 There are third party suppliers that also provide commercially available compound screening sets that attempt to exploit knowledge about kinase binding sites.
We have analyzed a set of kinase ligand:receptor cocrystal structures from the Protein Data Bank (PDB) using a pharmacophore approach, to arrive at a description of interactions made by kinase ligands with their respective targets. This gives insight into the features of ligands that are required for activity against any kinase, i.e., similarities, and also the features that confer selectivity between them, i.e., differences. The wealth of data now available gives a precise description of the structural features that a small molecule needs to possess to be a high affinity kinase ligand.
In the early days of kinase inhibitor design it was thought that target selectivity might be very difficult to achieve. Subsequent experience has shown that some selectivity can be achieved by exploiting the structural differences that exist between receptors;14-17 however, cross-reactivity remains an issue.
When screening compounds against kinase targets, an important consideration is how to design or put together a screening set of compounds that (1) are compatible with the ATP binding site and (2) contain novel chemotypes with regard to intellectual property. A computational algorithm has been devised that uses the results of the present analysis to select compounds that are compatible with the binding site description given by the pharmacophore map. The algorithm fits the PDB ligands to the pharmacophore map to within a close approximation of the crystal structure pose the majority of the time, without any direct knowledge of the receptor. This method has been used to select compounds and design combinatorial libraries that form the basis of a kinase focused screening set.
† Current address: Accelrys, Inc., 10188 Telesis Court, Suite 100, San Diego, CA 92121. E-mail: [email protected].
MATERIALS AND METHODS
1. Crystal Structure Set for Analysis. We have constructed a database of protein kinase small molecule ligands derived from crystal structures from the PDB,18 from which a subset of structures was chosen for analysis. In general, the criterion was the presence of a unique drug-like inhibitor in the ATP binding site. Where duplicate ligand structures occur, only the one with the best crystallographic resolution
10.1021/ci700244t CCC: $37.00 © 2007 American Chemical Society Published on Web 10/17/2007
was retained. The resulting set consisted of 220 structures. The crystal structure resolution was between 1.3 and 3.5 Å with a median of 2.1 Å. The molecular weight of the ligands was between 149 and 590, with a mean of 379 and a standard deviation of 90. Structures with ATP and ATP analogues were excluded; although many drug-like ligands mimic the adenine part of ATP, they generally do not mimic the sugar and phosphate regions, so including ATP analogues was considered likely to skew the analysis. Structures with ligands that do not make the ligand H-bond acceptor (L-HBA) interaction with the hinge region were excluded. This is the most commonly seen interaction in kinase ligands and is described in more detail below. The several ligands that do not make this interaction were not considered interesting from a drug development point of view. Therefore, as a rule for our purposes, all kinase ligands make this interaction. Interestingly, three closely related kinases (Pim1, Pim2, and Pim3) have a proline residue in the receptor in the position of the residue that makes this interaction, and therefore ligands cannot make this H-bond. These structures were therefore also eliminated from this analysis (should they become targets for drug design, they would be treated separately from a modeling point of view). Finally, the PI3K lipid kinases, although sharing the same overall fold and binding site features of protein kinases, were considered sufficiently divergent from the protein kinases to be excluded.
2. Common Reference Frame. The crystal structure with PDB code “1ATP” was used as a reference structure. This is murine PKA bound to ATP. It was one of the earliest crystal structures to be solved of a protein kinase19 and has been used as a reference structure in previous analyses.20,21 An important part of the present analysis involved transforming the coordinates of each structure to a common reference frame.
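The transformation to a common reference frame in this section is a rigid-body least-squares superimposition. The following is a minimal sketch of such a fit, assuming the Kabsch algorithm via singular value decomposition and NumPy; the paper does not name the specific algorithm or software used, and the function name `kabsch_superimpose` and the toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_superimpose(mobile, reference):
    """Least-squares rigid-body fit of `mobile` onto `reference`.

    Both inputs are (N, 3) arrays of equivalent atoms in the same order.
    Returns (R, t, rmsd) such that mobile @ R.T + t approximates reference.
    Assumption: Kabsch/SVD is used here for illustration only.
    """
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Center both coordinate sets on their centroids.
    mob_centroid = mobile.mean(axis=0)
    ref_centroid = reference.mean(axis=0)
    P = mobile - mob_centroid
    Q = reference - ref_centroid

    # 3x3 covariance matrix and its SVD.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated mobile centroid onto the reference centroid.
    t = ref_centroid - R @ mob_centroid

    fitted = mobile @ R.T + t
    rmsd = np.sqrt(((fitted - reference) ** 2).sum(axis=1).mean())
    return R, t, rmsd

# Toy usage: rotate and shift a small point set, then recover the fit.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0],
                [0.0, 1.5, 1.0], [2.0, 0.5, 3.0]])
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
mob = ref @ Rz.T + np.array([5.0, -2.0, 1.0])
R, t, rmsd = kabsch_superimpose(mob, ref)
```

Once `R` and `t` have been obtained from the fitted subset of atoms, the same transform can be applied to every atom of the structure, ligand included, which is what places all ligands in one frame for comparison.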
The transformation was done by performing a least-squares superimposition of each structure onto the reference structure using the Cα coordinates of a subset of residues. These residues, in 1ATP numbering, were chosen to be 57-59, 70-72, 103-105, 118-124, 126-129, 168-173, and 181-184. These are 30 residues in 7 continuous regions that define the core of the structure around the binding site that is common to all protein kinases, the conformation of which does not change appreciably upon kinase activation (Figure 1). The task is then to identify the equivalent residues in each structure. This is usually clear from visual inspection on computer graphics, but to speed the process, a computer program was written that examines the coordinates of each structure, determines which subset of residues has a distance matrix that most closely resembles that of the reference structure, and then performs an rms superimposition using this subset of atoms. The selection was then checked manually by visual inspection, and all the alignments were observed to be correct. All structures had a root mean square (rms) deviation of