“In Situ Cross-Docking” To Simultaneously Address Multiple Targets


J. Med. Chem. 2005, 48, 3122

Christoph A. Sotriffer* and Ingo Dramburg†

Department of Pharmaceutical Chemistry, University of Marburg, Marbacher Weg 6, D-35032 Marburg, Germany

Received January 27, 2005

Abstract: In standard docking, every target structure requires separate docking calculations. To overcome this limitation, an approach is presented by which multiple proteins can be addressed simultaneously in a single docking run. This “in situ cross-docking” is built on a grid-based docking method and follows the idea that grids calculated for single binding sites may be joined to one common grid. Docking then allows for a direct selection of the optimal target by the ligand being docked. The approach is technically feasible and can lead to significant time savings over conventional cross-docking.

As a computational method to investigate protein-ligand interactions, docking has become a routine tool in structure-based drug design.¹ Despite its undeniable merits, however, docking still faces a series of limitations, one of which may be described as the “single-structure paradigm”. Most standard docking procedures operate on the basis of a single conformation of a single target protein, so that multiple proteins can only be considered serially, not in a truly parallel fashion. To investigate the selectivity of a ligand, for example, separate docking runs need to be carried out for each target of interest. Such “cross-docking” is time-consuming, not only in terms of computing time but also with respect to setup and analysis of all separate runs. Clearly, simultaneous docking to multiple targets would be advantageous. It could be used to address issues of selectivity and protein flexibility and could provide new opportunities for more effectively harnessing the increasing wealth of protein structures in the postgenomic era.²

Here, we show in a pilot study how, in a conceptually simple way, a grid-based docking method can be used to address multiple proteins in parallel, i.e., to perform cross-docking in situ. In grid-based docking, the protein binding site is represented by a grid in which every grid point holds the precalculated energies of interaction of different probe atoms with the entire protein. The main idea of in situ cross-docking is that a single grid may hold more than one protein. This can be achieved in two ways: either a single large grid encompassing all proteins is calculated, which implies that the entire protein surface is considered as search space, or separate grids calculated for each protein binding site are joined to one grid, as shown in Figure 1.
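The layout of such a joined grid can be illustrated with a minimal sketch. It assumes cubic single-site grids placed side by side along one axis with a repulsive spacer between neighbours, using the 30 Å box and 3 Å spacer given for Figure 1. The function names, the choice of a single stacking axis, and the convention of measuring the coordinate from the low edge of the joined grid are illustrative assumptions and not part of any docking program.

```python
from typing import Optional


def joined_grid_length(n_sites: int, box: float = 30.0, spacer: float = 3.0) -> float:
    """Length (in Å) of the joined grid along the stacking axis."""
    if n_sites < 1:
        raise ValueError("need at least one binding site")
    # n boxes plus one spacer between each pair of neighbouring boxes
    return n_sites * box + (n_sites - 1) * spacer


def site_at(x: float, n_sites: int, box: float = 30.0, spacer: float = 3.0) -> Optional[int]:
    """Index of the binding-site grid that contains coordinate x.

    x is measured from the low edge of the joined grid (an assumption).
    Returns None if x lies in a repulsive spacer or outside the grid.
    """
    if x < 0 or x > joined_grid_length(n_sites, box, spacer):
        return None
    # Each site occupies one "period" of box + spacer along the axis.
    index, offset = divmod(x, box + spacer)
    if offset > box:
        return None  # inside the repulsive layer
    return int(index)
```

A helper like `site_at` is one way to read off which target a docked pose has selected: the position of the pose along the stacking axis identifies the single-site grid it ended up in.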
Applying the former variant would be inefficient whenever the approximate binding regions of the proteins are known, which is normally the case in standard docking scenarios. Accordingly, only the latter variant is further analyzed here.

To investigate in practice whether in situ cross-docking is technically feasible, AutoDock³ with its Lamarckian genetic algorithm (LGA) was used as the docking tool and simple test systems were set up. One test set consisted of the following protein-ligand complexes: antisteroid antibody DB3 with bound progesterone (PDB 1dbb), retinol-binding protein (RBP) complexed with axerophthene (PDB 1fen), and chorismate mutase (CM) in complex with an endo-oxabicyclic transition state analogue, abbreviated herein as EOB-TSA (PDB 2cht).

As a first test before in situ cross-docking, standard docking to each single protein was carried out to ensure that each complex on its own can be successfully docked with the applied search routine and scoring function. Accordingly, the ligands were removed from the protein structures, a cubic 30³ Å³ grid centered on the binding site was calculated for each target, and the corresponding ligand was docked to its protein with a standard protocol using 10 independent LGA runs with a maximum of 1.5 × 10⁶ energy evaluations in each run (further details about the methods are provided as Supporting Information). In all three cases, rigid ligand docking provided the correct binding mode as the only result (i.e., in 10 of 10 runs), with root-mean-square deviations (rmsd) from the X-ray structure of 0.49, 0.66, 0.57 Å and free energy scores of -10.18, -9.73, -7.38 kcal/mol for 1dbb, 1fen, 2cht, respectively. Since the ligands of 1dbb, 1fen, and 2cht have one, four, and three torsional degrees of freedom, flexible ligand docking was carried out next. Using the same standard

Figure 1. Construction of a joined grid for multiple protein binding sites. First, a standard grid of 30 Å × 30 Å × 30 Å centered at the binding site is defined for each protein (A). Then, these grids are aligned next to each other, but separated by a 3 Å-wide high-energy grid that acts as a repulsive layer between the binding sites to avoid artifacts (B). Finally, the whole combination is used in docking like a usual single grid (C).

* To whom correspondence should be addressed: phone +49/6421/2825822, fax +49/6421/2828994, email: [email protected].
† Present address: BioSolveIT GmbH, D-53757 St. Augustin, Germany.

10.1021/jm050075j CCC: $30.25 © 2005 American Chemical Society Published on Web 04/09/2005

Letters

Journal of Medicinal Chemistry, 2005, Vol. 48, No. 9 3123

Table 1. Comparison of Conventional Rigid Docking to a Single Target with Rigid In Situ Cross-Docking to Three Proteins (for the first set of ligands: DB3, RBP, and CM; for the second set of ligands: Th, AChE, and HPr) or to All Six Proteins Simultaneouslyᵃ

| ligand (PDB) | standard protocol, conventional single-target: rmsd | standard protocol, in situ cross-docking three proteins: rmsd | standard protocol, in situ cross-docking six proteins: rmsd | fastest successful protocol, conventional single-target: rmsd | ee | fastest successful protocol, in situ cross-docking three proteins: rmsd | ee | fastest successful protocol, in situ cross-docking six proteins: rmsd | ee |
|---|---|---|---|---|---|---|---|---|---|
| progesterone (1dbb) | 0.49 | 0.50 | 0.51 | 0.45 | 50k | 0.40 | 50k | 0.46 | 50k |
| axerophthene (1fen) | 0.66 | 0.67 | 0.67 | 0.65 | 10k | 0.65 | 50k | 0.67 | 100k |
| EOB-TSA (2cht) | 0.57 | 0.57 | 0.56 | 0.46 | 50k | 0.51 | 50k | 0.42 | 50k |
| NAPAP (1ets) | 0.58 | 0.58 | 0.58 | 0.62 | 100k | 0.62 | 100k | 0.59 | 500k |
| Aricept (1eve) | 0.19 | 0.19 | 0.19 | 0.18 | 10k | 0.25 | 25k | 0.22 | 50k |
| XK-203 (1hvr) | 0.45 | 0.45 | 0.45 | 0.10 | 10k | 0.09 | 10k | 0.09 | 10k |

ᵃ For each ligand, the docking result with the best score is reported and its rmsd (in Å) from the corresponding crystal structure is shown. The left half of the table provides the results obtained with the standard protocol (1.5 × 10⁶ = 1500k energy evaluations), the right half those generated with the fastest successful protocol (ee = energy evaluations).

Table 2. Comparison of Conventional Flexible Docking to a Single Target with Flexible In Situ Cross-Docking, Using the Same Format as in Table 1ᵃ

| ligand (PDB) | standard protocol, conventional single-target: rmsd | standard protocol, in situ cross-docking three proteins: rmsd | standard protocol, in situ cross-docking six proteins: rmsd | fastest successful protocol, conventional single-target: rmsd | ee | fastest successful protocol, in situ cross-docking three proteins: rmsd | ee | fastest successful protocol, in situ cross-docking six proteins: rmsd | ee |
|---|---|---|---|---|---|---|---|---|---|
| progesterone (1dbb) | 0.37 | 0.35 | 0.38 | 0.37 | 100k | 0.42 | 100k | 0.38 | 500k |
| axerophthene (1fen) | 1.29 | 1.18 | 0.82 | 0.62 | 50k | 0.94 | 50k | 0.70 | 100k |
| EOB-TSA (2cht) | 0.91 | 0.90 | 0.54 | 0.31 | 25k | 1.08 | 25k | 0.62 | 500k |
| NAPAP (1ets) | 3.30 | 68.62 | 69.88 | 1.49 | 7500k | 1.72 | 45000k | | >90000k |
| Aricept (1eve) | 1.65 | 1.47 | 9.41 | 0.47 | 500k | 1.47 | 1500k | 1.18 | 6000k |
| XK-203 (1hvr) | 0.59 | 28.17 | 61.88 | 0.87 | 3000k | 1.14 | 7500k | 1.11 | 9000k |

ᵃ See footnote of Table 1 for further explanations.

protocol, equally good results were obtained, with at least 9 of the 10 runs converging to the same top-ranked solution (= result with best score) with rmsd values of 0.37, 1.29, 0.91 Å and free energy scores of -10.26, -9.98, -7.92 kcal/mol for 1dbb, 1fen, 2cht, respectively.

To test for selectivity, conventional cross-docking was carried out next. Each ligand was flexibly docked to its two “non-native receptors” (e.g., progesterone to RBP and CM), using the same separate grids for each protein as above and a search protocol twice as long, with 3.0 × 10⁶ energy evaluations, to ensure proper optimization. For a selective ligand, docking should yield the energetically best result with the “native” target. For the systems investigated here, this is indeed the case: progesterone clearly binds best to DB3 (free energy score of -10.26 versus -9.35 and -9.45 kcal/mol for the other two targets), axerophthene binds best to RBP (-11.22 versus -9.75 and -9.05 kcal/mol), and EOB-TSA binds best to CM (-7.92 versus -6.18 and -5.05 kcal/mol). Given that the correct selectivity is obtained in conventional cross-docking, the test system should be suitable for probing the feasibility of the new in situ approach.

For in situ cross-docking, the three grids previously used for each protein were combined to one large grid, but internally separated by a 3 Å wide layer of repulsive grid points to avoid artificial docking results across the border of two joined grids (Figure 1). The resulting 96 Å × 30 Å × 30 Å grid represents a single search space for all three binding sites in parallel. As a first test, using the same standard protocol as above (1.5 × 10⁶ energy evaluations), the ligands were docked rigidly to the joined grid to analyze whether the correct binding site of the correct target is found. The results summarized in Table 1 (top left part) illustrate this to be

the case: for each of the three ligands, the top-ranked result identified the correct protein target, and the experimental binding mode was reproduced with the same rmsd and free energy score as in standard docking to a single protein. It is worth emphasizing that the search was not biased by a starting point, since the applied LGA generates an initial population of ligand positions randomly distributed across the grid. Furthermore, the same search parameters as in the single-target rigid docking were used.

Flexible in situ cross-docking with the joined grid was tested next, and here too the results indicate that the approach is feasible. With the same standard protocol, the correct binding mode was obtained in each case as the top-ranked result, with rmsd and free energy score similar to those of the single-protein docking (cf. Table 2, top left part). This corresponds to a 5-fold speed-up compared to conventional cross-docking as carried out above (i.e., with 1.5 × 10⁶ energy evaluations for the native target and 3.0 × 10⁶ for each of the two non-native targets, or 7.5 × 10⁶ in total, versus 1.5 × 10⁶ for the joined grid). Measured this way, however, the gain in efficiency depends mainly on the preset length of the optimization protocols; for a fair comparison, information about the fastest successful protocols is required.

Accordingly, a systematic variation of the docking protocol was undertaken to identify the fastest possible settings for achieving successful binding-mode prediction. A docking prediction was defined as “successful” if, out of 10 runs, at least 2 top-ranked results with rmsd ≤ 2 Å are obtained; in addition, the free energy score of the top-ranked result should be within 0.5 kcal/mol of the single-target rigid docking result obtained with the standard protocol. Docking runs with decreasing numbers of energy evaluations (and, thus, shorter computation times) were carried out. For single-protein docking, maxima of 1.0 × 10⁶, 5.0 × 10⁵, 1.0 × 10⁵, 5.0 × 10⁴, 2.5 × 10⁴, and 1.0 × 10⁴ energy evaluations per run were used as variants (from here on, “10³ energy evaluations” is abbreviated as “1k ee”; in this notation, the tested variants are 1000k, 500k, 100k, 50k, 25k, and 10k ee). Surprisingly, successful rigid docking to single-protein grids was observed with as few as 10k ee for 1fen and 50k ee for 1dbb and 2cht (cf. Table 1). For flexible docking, the fastest successful protocols were 100k ee for 1dbb, 50k ee for 1fen, and 25k ee for 2cht (cf. Table 2). These settings are all much faster than what is commonly used in AutoDock protocols.

To check whether in situ cross-docking would also yield successful predictions with faster settings, docking runs were carried out with the fastest successful protocol identified in conventional single-target docking of the corresponding ligand; if no satisfying result was obtained, the search length was increased (according to the variants given above) until successful docking was achieved. As shown in Tables 1 and 2, in most cases the same protocol as in single-target docking was sufficient for successful identification of the preferred binding mode among the three targets in the grid: for the rigid runs, this is true for progesterone and EOB-TSA (axerophthene required 50k ee in in situ cross-docking instead of 10k ee in conventional docking); for the flexible runs, it applies to all three ligands. Conventional cross-docking to each separate target would require three docking simulations with the corresponding protocol, whereas in situ cross-docking requires only one. Accordingly, significant amounts of computing time can indeed be saved with in situ cross-docking.
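The success criterion and the stepwise lengthening of the search can be sketched as follows. This is only an illustration under stated assumptions: the `dock` callable is a hypothetical stand-in for whatever drives the docking program, and it is assumed to return the rmsd values of those runs that ended in the top-ranked solution together with the score of that solution. The list of search lengths combines the six variants named above with the 1500k standard protocol.

```python
from typing import Callable, Optional, Sequence, Tuple

# Search lengths in units of 1k energy evaluations, shortest first:
# the six variants listed in the text plus the 1500k standard protocol.
VARIANTS_K = (10, 25, 50, 100, 500, 1000, 1500)

# Hypothetical docking driver: takes a search length in k ee and returns
# (rmsd values of the runs in the top-ranked solution, its score).
DockFn = Callable[[int], Tuple[Sequence[float], float]]


def is_successful(top_rmsds: Sequence[float], top_score: float,
                  reference_score: float, min_hits: int = 2,
                  rmsd_cutoff: float = 2.0, score_window: float = 0.5) -> bool:
    """At least `min_hits` top-ranked results with rmsd <= cutoff, and a
    top score within `score_window` kcal/mol of the reference score."""
    hits = sum(1 for rmsd in top_rmsds if rmsd <= rmsd_cutoff)
    return hits >= min_hits and abs(top_score - reference_score) <= score_window


def fastest_successful(dock: DockFn, reference_score: float, start_k: int,
                       variants_k: Sequence[int] = VARIANTS_K) -> Optional[int]:
    """Try search lengths from `start_k` upwards; return the first that succeeds."""
    for ee_k in sorted(variants_k):
        if ee_k < start_k:
            continue  # skip protocols shorter than the starting point
        top_rmsds, top_score = dock(ee_k)
        if is_successful(top_rmsds, top_score, reference_score):
            return ee_k
    return None  # no tested protocol met the criterion
```

Here `reference_score` plays the role of the single-target rigid docking score obtained with the standard protocol, and `start_k` is the fastest successful protocol found for the same ligand in conventional single-target docking.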
Besides the system discussed so far, a further test set consisted of three targets complexed with more flexible ligands: thrombin (Th) with NAPAP (PDB 1ets), acetylcholinesterase (AChE) with Aricept (1eve), and HIV protease (HPr) with XK-203 (1hvr). Again, for each of these proteins a 30 Å × 30 Å × 30 Å grid was centered on the binding site and standard docking was carried out first. Here, too, the correct selectivities could be obtained in conventional cross-docking of each ligand to each single protein: NAPAP was found to bind best to Th (and not to the other targets), Aricept best to AChE, and XK-203 best to HPr. Thus, this test system should also be suitable for probing the feasibility of the new in situ approach. As far as rigid docking is concerned, the situation is very similar to what has been observed for the previous complexes (1dbb, 1fen, 2cht). In all three cases, rigid docking to the single native target with the standard protocol provided the correct binding mode as the only result, with rmsd values of 0.58, 0.19, 0.45 Å and free energy scores of -10.82, -9.27, -14.32 kcal/mol for 1ets, 1eve, 1hvr, respectively. In situ cross-docking with the standard protocol to all three proteins simultaneously gave exactly the same result (Table 1, bottom left) at a third of the calculation time. Again, protocols much faster than the standard settings of 1500k ee could be successfully applied: for conventional docking to single-protein grids, 100k ee were sufficient in the case of 1ets and 10k ee in the case of 1eve and 1hvr (Table 1, bottom right). Almost the same is true for rigid docking to the


Figure 2. Alignment of the grids for construction of the joined grid holding six protein binding sites simultaneously, illustrated along with the native ligand of the corresponding protein. The spacer between the grids is 4 Å wide, leading to joined-grid dimensions of 200 Å × 30 Å × 30 Å.

combined grid holding all three proteins: successful docking was already achieved with 100k, 25k, and 10k ee for NAPAP, Aricept, and XK-203, respectively, resulting in a net saving of computing time in all three cases.

Flexible docking of the ligands in this second test set is more challenging, given the higher number of rotatable bonds, which is 8 for NAPAP, 6 for Aricept, and 10 for XK-203. In fact, in conventional docking with the standard protocol a correct binding mode on rank 1 was obtained only for Aricept (in 5 of 10 runs) and XK-203 (in 1 of 10 runs), while NAPAP was not satisfactorily docked (problems in reproducing the 1ets binding mode have also been reported in the literature⁴). To fully satisfy the criteria of successful docking defined above, 7500k ee were necessary for NAPAP and 3000k ee for XK-203; for Aricept, by contrast, 500k ee were already sufficient (cf. Table 2). In agreement with these results for single-protein docking, flexible in situ cross-docking with the standard protocol predicted the correct binding pocket and binding mode only for the 1eve complex (1.47 Å rmsd). Successful predictions were also possible for the other two ligands, but longer protocols were required (XK-203: 7500k ee; NAPAP: 45000k ee). For estimates of computational efficiency, these data need to be compared with the fastest successful protocols of conventional docking, which would have to be run three times in normal cross-docking. Based on such a comparison, in situ cross-docking is faster for XK-203 (7500k ee versus 3 × 3000k ee), equally fast for Aricept (1500k ee versus 3 × 500k ee), and slower for NAPAP (45000k ee versus 3 × 7500k ee).

To investigate whether the in situ cross-docking approach could also be extended to more targets, the proteins of both test sets were used to set up a joined grid of six proteins (Figure 2).
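The efficiency comparison for the flexible runs of the second test set can be reproduced from the energy-evaluation counts quoted in the text and in Table 2. The sketch below assumes, as the text does, that conventional cross-docking runs the fastest successful single-target protocol once per target; the dictionary layout and the function name are illustrative only.

```python
# Fastest successful protocols for flexible docking of the second test set,
# in units of 1k energy evaluations (values quoted in the text and Table 2).
CONVENTIONAL_K = {"NAPAP": 7500, "Aricept": 500, "XK-203": 3000}
IN_SITU_K = {"NAPAP": 45000, "Aricept": 1500, "XK-203": 7500}


def compare(ligand: str, n_targets: int = 3):
    """Return (conventional total, in situ total, ratio) for one ligand.

    A ratio above 1 means the single in situ run needs fewer energy
    evaluations than n_targets separate conventional runs.
    """
    conventional_total = n_targets * CONVENTIONAL_K[ligand]
    in_situ_total = IN_SITU_K[ligand]
    return conventional_total, in_situ_total, conventional_total / in_situ_total


for ligand in CONVENTIONAL_K:
    conventional, in_situ, ratio = compare(ligand)
    if ratio > 1:
        verdict = "faster"
    elif ratio == 1:
        verdict = "equally fast"
    else:
        verdict = "slower"
    print(f"{ligand}: {conventional}k vs {in_situ}k ee, in situ is {verdict}")
```

Counting energy evaluations ignores the setup and analysis effort of the separate runs, so the comparison reflects search length only.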
As before, the single-protein grids were aligned next to each other, a narrow repulsive layer was inserted between them, and the whole assembly was combined into one single grid holding six protein binding sites simultaneously. Rigid docking of the six different ligands to this grid using the standard protocol provided the correct binding pocket and binding mode for all ligands with rmsd values