J. Med. Chem. 2006, 49, 2969-2978
The Influence of Target Family and Functional Activity on the Physicochemical Properties of Pre-Clinical Compounds
Richard Morphy†
Medicinal Chemistry Department, Organon Laboratories, Newhouse, Lanarkshire ML1 5SH, U.K.
Received December 5, 2005
The target families of greatest interest in drug discovery can be differentiated on the basis of the physicochemical properties of their pre-clinical ligands. The ligands for peptidergic targets, such as peptide GPCRs and integrin receptors, possess significantly higher median property values than those for aminergic targets, such as monoamine transporters and GPCRs. The ligands for peptide GPCRs were found to be less efficient, in terms of their binding energy per unit of molecular weight or lipophilicity, than ligands for monoamine GPCRs. The changes in the property values during the optimization process were found to vary only slightly across the target families, with the main determinant of the drug-likeness of the optimized compounds being the profile of the starting compounds. Agonists for monoamine GPCRs, opioid receptors and ion channels were typically smaller and less lipophilic than the antagonists, but there was no difference between the agonists and the antagonists for peptide GPCRs and nuclear receptors.

Introduction

The influence of physicochemical properties on the pharmacokinetic behavior of drug molecules has been the subject of intense interest over the past few years since the publication of Lipinski’s seminal work on the rule-of-5 (RO5) in 1997.1 This was followed by the work of Teague et al.2 and Hann et al.,3 which highlighted the fact that molecules tend to increase in MW and cLogP during optimization. More recent work has examined the influence of the degree of (pre-)clinical advancement of molecules on physicochemical properties as well as the influence of the disease area,4 launch date,5 and route of administration.6 Over recent years, there has been an increasing amount of anecdotal evidence indicating that the discovery of orally active ligands for some targets and target families is more challenging than that of others.
The aim of this work was to study the influence of target family on the physicochemical properties of pre-clinical ligands and thereby provide, for the first time, some quantitative data supporting these perceived differences. The influence of functional activity was also explored. Over the past few years, a database (called SCOPE) has been assembled at Organon, which consists of a large number of optimizations extracted predominantly from the primary literature. For each selected publication, the structures of both the starting compound and the most highly optimized compound have been extracted.

Data Source. Currently, the SCOPE database contains a total of 1860 optimizations, 1630 (88%) from the literature, and 230 (12%) from internal Organon projects. Each entry was annotated by target family, and this annotation allowed a detailed analysis of its influence on the physicochemical properties of the optimized compounds and on the changes in those properties during optimization. The distribution of the major target families within the database is shown in Figure 1. For reasons of statistical validity, only target families that represented 2% or more of the total database were considered in this analysis, representing in total 89% of the database. These families included those of greatest current interest in drug discovery. Many entries also contained information about binding affinity and functional activity, and this allowed a further analysis of the relationship between physicochemical and biological properties. The database contains predominantly pre-clinical compounds, although a very small number of compounds that reached the market are included (8 in total).

The majority of the entries are from the year 2000 onward, when a systematic program of abstraction of four major medicinal chemistry journals began (Bioorganic & Medicinal Chemistry, Bioorganic & Medicinal Chemistry Letters, European Journal of Medicinal Chemistry, and Journal of Medicinal Chemistry). A smaller number of entries from the 1990-1999 period were also added. The year-by-year distribution of entries is shown in Figure 2.

SCOPE is not a comprehensive database of all optimized compounds from the most recent literature. Because its principal aim is to capture information about the optimization process itself, only publications containing clearly identified starting and optimized compounds are abstracted. The abstraction policy for the SCOPE database is to select, as the optimized compound, the compound that was subjected to the most rigorous testing, and this is usually highlighted in the publication abstract or conclusion. It is not necessarily the most potent compound in the primary in vitro assay but rather the compound with the most rounded properties overall in terms of, for example, in vitro and in vivo potency, selectivity, and pharmacokinetic properties.

† To whom correspondence should be addressed. Phone: +44 (0)1698 736000. Fax: +44 (0)1698 736187. E-mail: [email protected].

Property Calculations and Statistical Analysis.
For each optimized compound from the full SCOPE set, the target family subsets, and the functional activity subsets, six physicochemical properties were calculated: molecular weight (MW), cLogP, polar surface area (PSA),7 the number of hydrogen bond acceptors (HBA), the number of hydrogen bond donors (HBD), and the number of rotatable bonds (RB).8 Because the database also contained the starting compound in each case, the changes in these same properties during optimization could also be studied. At the start of this work, it became apparent that for some of the target family and functional subsets, the property distributions for the optimized compounds and trajectories were not
10.1021/jm0512185 CCC: $33.50 © 2006 American Chemical Society Published on Web 04/13/2006
Journal of Medicinal Chemistry, 2006, Vol. 49, No. 10
Morphy
Figure 1. Target family distribution of SCOPE database entries.

Table 1. Molecular Weight (MW) Data Classified by Target Family

| target family subset | number of entries | starting compd median (mean) | optimized compd median (mean) | optimized compd lower quartile | optimized compd upper quartile | change in property median (mean) | 95% CI (change in property)a | p value (change in property)b |
|---|---|---|---|---|---|---|---|---|
| full SCOPE set | 1680 | 382 (393) | 422 (435) | 353 | 504 | 30 (42) | (32, 39) | 0 |
| esterases | 32 | 349 (336) | 383 (412) | 339 | 435 | 51 (76) | (35, 95) | 0 |
| GPCRs (all)c | 755 | 391 (402) | 433 (440) | 358 | 517 | 30 (38) | (30, 39) | 0 |
| GPCRs (monoamine) | 326 | 347 (347) | 375 (377) | 295 | 444 | 24 (30) | (21, 33) | 0 |
| GPCRs (peptide) | 309 | 451 (465) | 510 (513) | 434 | 581 | 44 (48) | (39, 54) | 0 |
| integrin receptor | 41 | 454 (449) | 466 (496) | 436 | 575 | 33 (47) | (12, 78) | 0.006 |
| ion channels | 158 | 311 (328) | 364 (373) | 300 | 429 | 33 (45) | (29, 47) | 0 |
| kinases | 120 | 349 (360) | 392 (406) | 343 | 470 | 33 (46) | (29, 56) | 0 |
| nuclear receptors | 138 | 410 (406) | 421 (431) | 381 | 492 | 18 (25) | (12, 33) | 0 |
| oxidases | 59 | 314 (314) | 357 (355) | 315 | 403 | 21 (41) | (16, 56) | 0 |
| phosphodiesterases | 38 | 415 (409) | 462 (465) | 392 | 569 | 39 (56) | (28, 79) | 0 |
| proteases | 211 | 421 (427) | 467 (468) | 409 | 524 | 32 (41) | (29, 49) | 0 |
| transferases | 56 | 451 (502) | 521 (539) | 392 | 693 | 30 (38) | (19, 50) | 0 |
| transporters | 69 | 299 (306) | 325 (335) | 270 | 395 | 25 (29) | (15, 36) | 0 |

a The 1-sample Wilcoxon 95% confidence interval. b The p value from the 1-sample Wilcoxon signed rank test of median = 0 versus median ≠ 0. c The GPCR (monoamine) and GPCR (peptide) subsets are contained within the GPCR (all) subset.
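The change-in-property columns of Table 1 rest on a 1-sample Wilcoxon signed rank test of the per-compound differences (optimized minus starting). The following is a minimal sketch of that kind of test, not the procedure actually used for SCOPE. It assumes a two-sided test with the large-sample normal approximation, without tie or continuity corrections, and the MW values are invented for illustration. The function name `wilcoxon_signed_rank` and the variable names are hypothetical.

```python
import math
from statistics import NormalDist, median


def wilcoxon_signed_rank(changes):
    """Two-sided 1-sample Wilcoxon signed rank test of median = 0.

    Returns (W+, p) where W+ is the sum of ranks of the positive
    differences and p comes from the normal approximation
    (no tie or continuity correction; an assumption of this sketch).
    """
    diffs = [d for d in changes if d != 0]  # zero differences are discarded
    n = len(diffs)
    if n == 0:
        raise ValueError("all differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, p


# Invented starting / optimized MW pairs (NOT taken from SCOPE).
start_mw = [310, 352, 401, 298, 450, 377, 329, 415, 366, 342]
optimized_mw = [345, 380, 398, 340, 512, 401, 371, 460, 389, 390]
changes = [o - s for s, o in zip(start_mw, optimized_mw)]

w_plus, p_value = wilcoxon_signed_rank(changes)
print("median change:", median(changes))
print("W+ =", w_plus, "p =", round(p_value, 4))
```

For small subsets an exact null distribution would be preferable to the normal approximation, and a statistics package would normally be used to obtain the accompanying confidence interval.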
normally distributed but were skewed and often included extreme outliers. For reasons of consistency, all of the data sets were analyzed using nonparametric rank statistical methods rather than parametric t-tests. The Wilcoxon signed rank test was used to examine the significance of the changes in the properties during optimization, and the Mann-Whitney rank test was used to explore the significance of the differences in properties between the target family subsets. For that reason, more emphasis in the analysis and interpretation was placed upon the median values, although the mean values are also quoted for reference purposes.

Results

Molecular Weight. The median MW for the full set of 1680 optimized compounds was 422, and the mean was 435 (Table 1). These values are notably higher than the reported values for marketed oral drugs. For example, Vieth et al. reported a median MW of 322 and a mean MW of 344 for 1202 oral drugs.6 Vieth et al.6 and Blake et al.9 reported median MWs of 415 (mean of 448) and 393, respectively, for a range of preclinical compounds, and these figures are consistent with the SCOPE averages. Among the target family subsets, the median MW was highest for the peptide GPCR ligands and the
transferase inhibitors with values of 510 and 521, respectively (Table 1). At the other extreme, the ligands for transporters had a median MW of just 325. The median increase in MW during optimization was 30 (mean 42). This was lower than the median MW increase of 70 reported by Oprea et al.10 during the optimization of leads to drugs. However, the SCOPE mean increase is identical to the mean MW increase of 42 reported by Hann et al.3 for the lead to drug process. There is a statistically significant increase in MW for all of the target family subsets (p value < 0.001), indicating that this underlying trend is consistent and highly pronounced during the process of optimization. The biggest increases in MW were found for the peptide GPCR and esterase groups, 44 and 51, respectively. The lowest MW change was found for the nuclear receptor ligands (21). The target family subsets were divided into clusters by determining whether the MWs of the optimized compounds showed statistically significant differences from each other (p value < 0.1).11 The MWs of the peptide GPCR, transferase, and integrin receptor optimized compounds were not significantly different but were all significantly higher than the MWs for the protease and phosphodiesterase (PDE) optimized compounds. In turn, the MWs for the protease and PDE subsets
Figure 2. Year-by-year distribution of SCOPE database entries.
Figure 3. Classification of target families on the basis of six physicochemical properties for their optimized ligands (MW, cLogP, polar surface area (PSA), the number of hydrogen bond acceptors (HBA), the number of hydrogen bond donors (HBD), and the number of rotatable bonds (RB)).
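The clusters in Figure 3 come from pairwise Mann-Whitney comparisons between target family subsets, with p < 0.1 taken as a significant difference. The sketch below shows one such pairwise comparison. It assumes a two-sided test with the normal approximation and no tie correction, which may differ from the software settings used for the published analysis. The two MW lists are invented, and the names `mann_whitney_u`, `peptide_like`, and `monoamine_like` are hypothetical.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney rank test for two independent samples.

    Returns (U for sample x, p) using the normal approximation
    (no tie or continuity correction; an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    # Pool the samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda item: item[0])
    total = n1 + n2

    # Sum the ranks of sample x, giving tied values their average rank.
    rank_sum_x = 0.0
    i = 0
    while i < total:
        j = i
        while j + 1 < total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            if pooled[k][1] == 0:
                rank_sum_x += average_rank
        i = j + 1

    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p


# Invented optimized-compound MW values for two subsets (NOT from SCOPE).
peptide_like = [510, 434, 581, 495, 530, 602, 470, 555]
monoamine_like = [375, 295, 444, 360, 410, 330, 388, 301]

u, p_value = mann_whitney_u(peptide_like, monoamine_like)
same_cluster = p_value >= 0.1  # threshold quoted in the text
print("U =", u, "p =", round(p_value, 4), "same cluster:", same_cluster)
```

Repeating this comparison across the subsets, ordered by median, and merging neighbors whose p value is not below the threshold is one way to arrive at a sequence of clusters. The exact grouping procedure is not spelled out in the text, so that step is left out here.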
were not significantly different from each other but were significantly higher than those for the nuclear receptors. Similarly, the MWs for the kinase and esterase subsets were not different from each other but were significantly higher than those for the monoamine GPCRs, ion channels, and oxidases. The transporters had the lowest median MW ligands of all. Overall, this process of cross-comparisons using the Mann-Whitney test created a sequence of six statistically defined MW clusters (Figure 3). The higher molecular weights of the optimized compounds from the uppermost cluster could be due to a higher MW of
the starting compounds, a higher increase during optimization, or a combination of the two. When the MWs for peptide GPCRs, integrin receptors, proteases, and transferases were compared with those for monoamine GPCRs, transporters, and oxidases, it was clear that for the former the starting compounds were much larger, and, in addition, for peptide GPCRs and integrin receptors, there was a greater increase in size during optimization (Table 1). For the full SCOPE database, the receptor antagonists had the highest median MW, followed by the enzyme inhibitors, the transporter inhibitors, and then the receptor agonists (Table
Table 2. Physicochemical Property Data for the Optimized Compounds Classified by Target Family

| target family subset | function | number of entriesa | MW median (mean) | cLogP median (mean) | PSA median (mean) | HBA median (mean) | HBD median (mean) | RB median (mean) |
|---|---|---|---|---|---|---|---|---|
| full set | agonist | 314 | 402 (405) | 4.1 (4.2) | 48 (53) | 3 (4) | 1 (2) | 5 (6) |
| full set | antagonist | 682 | 437 (448) | 4.4 (4.3) | 55 (60) | 4 (4) | 1 (2) | 6 (7) |
| full set | inhibitor | 675 | 412 (435) | 3.4 (3.4) | 66 (74) | 5 (5) | 2 (2) | 6 (7) |
| GPCRs (all) | agonist | 118 | 380 (377) | 3.8 (3.8) | 39 (46) | 3 (3) | 1 (2) | 6 (6) |
| GPCRs (all) | antagonist | 380 | 445 (458) | 4.5 (4.5) | 53 (57) | 4 (4) | 1 (1) | 6 (7) |
| GPCRs (monoamine) | agonist | 61 | 269 (329) | 3.3 (3.2) | 36 (41) | 3 (3) | 2 (2) | 4 (5) |
| GPCRs (monoamine) | antagonist | 148 | 394 (395) | 4.2 (4.0) | 41 (43) | 3 (3) | 1 (1) | 5 (5) |
| GPCRs (peptide) | agonistb | 38 | 446 (455) | 4.9 (4.8) | 42 (48) | 4 (4) | 1 (1) | 6 (7) |
| GPCRs (peptide) | agonistc | 12 | 548 (520) | 5.6 (5.8) | 64 (61) | 4 (4) | 1 (2) | 9 (8) |
| GPCRs (peptide) | antagonist | 165 | 523 (535) | 5.0 (5.1) | 65 (69) | 5 (5) | 1 (2) | 8 (9) |
| ion channels | agonist | 25 | 285 (312) | 2.1 (2.2) | 45 (47) | 3 (4) | 1 (1) | 3 (5) |
| ion channels | antagonist | 94 | 369 (388) | 3.2 (3.1) | 57 (64) | 4 (4) | 2 (2) | 5 (6) |
| nuclear receptors | agonist | 109 | 419 (425) | 4.9 (5.3) | 50 (53) | 3 (4) | 1 (1) | 4 (5) |
| nuclear receptors | antagonist | 22 | 406 (429) | 4.7 (5.0) | 53 (53) | 5 (4) | 1 (1) | 5 (6) |
a The dataset used for the functional activity analysis was different from that used for the target family analysis because a functional activity assignment was not available for all SCOPE entries. b Full peptide GPCR agonist subset containing opioid ligands. c Peptide GPCR agonist subset minus opioid ligands. Bold text indicates Mann-Whitney p
0.8) between MW and RB and between PSA and HBA. The least correlated and most independent properties were cLogP and HBD. A table of Pearson coefficients of correlation for all 6 properties is provided in the Supporting Information.
(14) Black, J. Drugs from emasculated hormones: the principle of syntopic antagonism. Science 1989, 245, 486-493.
(15) Beeley, N. R. A. Can peptides be mimicked? Drug Discovery Today 2000, 5, 354-363.
(16) Beaumont, K.; Schmid, E.; Smith, D. A. Oral delivery of G protein-coupled receptor modulators: An explanation for the observed class difference. Bioorg. Med. Chem. Lett. 2005, 15, 3658-3664.
(17) Lipinski, C. A. Drug-like properties and the causes of poor solubility and poor permeability. J. Pharmacol. Toxicol. Methods 2000, 44, 235-249.
(18) Klabunde, T.; Hessler, G. Drug design strategies for targeting G-protein-coupled receptors. ChemBioChem 2002, 3, 928-944.
(19) Mirzadegan, T.; Diehl, F.; Ebi, B.; Bhakta, S.; Polsky, I.; McCarley, D.; Mulkins, M.; Weatherhead, G. S.; Lapierre, J. M.; Dankwardt, J.; Morgans, D.; Wilhelm, R.; Jarnagin, K. Identification of the binding site for a novel class of CCR2b chemokine receptor antagonists: binding to a common chemokine receptor motif within the helical bundle. J. Biol. Chem. 2000, 275, 25562-25571.
(20) Bondensgaard, K.; Ankersen, M.; Thogersen, H.; Hansen, B. S.; Wulff, B. S.; Bywater, R. P. Recognition of privileged structures by G-protein coupled receptors. J. Med. Chem. 2004, 47, 888-899.
(21) Lajiness, M. S.; Vieth, M.; Erickson, J. Molecular properties that influence oral drug-like behavior. Curr. Opin. Drug Discovery Dev. 2004, 7, 470-477.
(22) Congreve, M.; Carr, R.; Murray, C.; Jhoti, H. A ‘rule of three’ for fragment-based lead discovery? Drug Discovery Today 2003, 8, 876-877.