This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

Article Cite This: ACS Omega 2018, 3, 17295−17308

http://pubs.acs.org/journal/acsodf

Computational Analysis of Kinase Inhibitors Identifies Promiscuity Cliffs across the Human Kinome

Filip Miljković and Jürgen Bajorath*


Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany

ABSTRACT: Kinase inhibitors are high-priority drug candidates for a variety of therapeutic applications. Accordingly, there has been a rapid growth in the number of kinase inhibitors and volumes of associated activity data. A paradigm for the use of kinase inhibitors in oncology is that these compounds have multitarget activities and elicit their therapeutic effects through polypharmacology. An analysis of kinase inhibitors and associated activity data from medicinal chemistry has so far only identified small subsets of highly promiscuous kinase inhibitors. In this study, we have collected inhibitors of human kinases and their activity data from seven public repositories, curated, and combined these data, yielding more than 112 000 inhibitors with well-defined activity measurements from which qualitative target annotations were derived. An analysis of these unprecedentedly large data sets revealed that nearly 40% of human kinase inhibitors have multikinase activities but that only 4% are known to be active against five or more kinases. However, structurally analogous inhibitors often displayed significant differences in the number of kinase annotations, leading to the formation of nearly 16 000 “promiscuity cliffs”. Moreover, 2236 promiscuity cliffs (14.03%) were formed by kinase inhibitors at different stages of clinical development. Overall, these cliffs suggested many target hypotheses for kinase inhibitors, taking data incompleteness into consideration, as well as hypotheses for structural modifications leading to kinase selectivity. Furthermore, from network representations, pathways comprising sequences of promiscuity cliffs were extracted that revealed unexpected structure−promiscuity relationships. To enable follow-up investigations, all promiscuity cliffs formed by human kinase inhibitors will be made freely available.



INTRODUCTION

Kinase inhibitors play a major role in drug discovery.1,2 Originally, kinase inhibitors were successfully applied in oncology, where their therapeutic efficacy was found to be largely due to polypharmacology.3,4 However, the clinical use of kinase inhibitors has been further expanded to other therapeutic areas such as immunology and inflammation or metabolic diseases, where target selectivity of inhibitors plays an important role.5−7 It is thus not surprising that the topic of kinase inhibitor selectivity versus promiscuity has been intensely investigated over the past decade and continues to be a much debated issue.8−13 Selectivity analysis is far from being a simple task, given the binding characteristics of kinase inhibitors and the many experimental variables that need to be considered. Furthermore, apparent promiscuity of compounds including kinase inhibitors is often associated with undesired effects such as artifacts resulting from assay interference.14−16 However, promiscuity also refers to the presence of true multitarget activities of compounds that represent the molecular basis of polypharmacology.17,18

A mapping of signature fragments of kinase inhibitors adopting different binding modes revealed by X-ray crystallography19,20 has shown that more than 95% of the currently available inhibitors of human kinases are type I inhibitors.21,22 These inhibitors block the adenosine triphosphate (ATP) cofactor binding site that is largely conserved across the kinome, so they are expected to be promiscuous.21 Subsets of highly promiscuous kinase inhibitors have indeed been identified including anticancer drugs,22 consistent with their polypharmacology. Promiscuous kinase inhibitors include approved drugs as well as inhibitors at different stages of clinical development. A representative example is provided by sunitinib, a multikinase type I inhibitor, whose targets include vascular endothelial growth factor receptor 2 and platelet-derived growth factor receptor β kinases. Polypharmacology associated with sunitinib and various other promiscuous kinase inhibitors has proven essential for their efficacy in cancer treatment.2−4

On the other hand, analyses of the available kinase inhibitors and associated activity data have not supported the often assumed general promiscuity of these inhibitors. For example, in the first large-scale analysis, 18 653 publicly available inhibitors with activity against 266 human kinases were identified for which high-confidence activity data were available.22 On the basis of Ki and IC50 measurements, 68 and 77% of all the inhibitors were only annotated with a single human kinase, respectively, and only ∼1% of the inhibitors were active against five or more kinases.22 Two years later, the number of human kinase inhibitors with available high-confidence data had more than doubled and 43 331 inhibitors with activity against 286 human kinases were available.23 However, despite the rapid growth in kinase inhibitors, 76.5% of all the inhibitors were only annotated with a single kinase on the basis of combined Ki and IC50 measurements; again, only ∼1% of all the inhibitors had reported activity against five or more kinases.23

There is the possibility that the dominance of kinase inhibitors with single target annotations is at least partly due to data incompleteness24 because only a confined subset of kinase inhibitors have been subjected to kinome profiling. On the other hand, the mean promiscuity degree (PD) of ∼1.5 determined by activity data analyses22,23 did not significantly increase when the activity data confidence criteria were gradually relaxed and increasing numbers of activity measurements considered or primary kinase screening assays were analyzed. The PD is defined as the number of unique kinase annotations available for an inhibitor and serves as a qualitative measure of compound promiscuity. The PD is derived on the basis of well-defined activity measurements including, among others, (assay-dependent) IC50 and (assay-independent) Ki values. Although potency values reported on the basis of different types of measurements should not be directly compared, they are qualified to serve as sources for target annotations.

In the light of the findings discussed above, large-scale activity data analysis did not provide support for the view that ATP site-directed kinase inhibitors might generally be promiscuous. Moreover, kinase inhibitor activity profiles are multifaceted. For example, within the subset of promiscuous kinase inhibitors, in part strong target selectivity tendencies for individual kinases were detected, resulting from differential potency for multiple kinases.25−27 In this context, it should also be noted that large-scale analyses of kinase inhibitor activity data reported so far were exclusively22,23,26 or mostly25,27 based on ChEMBL,28 the major public repository for compounds and activity data from medicinal chemistry. Hence, one might consider revisiting kinase inhibitor analysis by integrating data from different repositories that have become available over time.

The number of available kinase inhibitors and volumes of associated activity data steadily grow, which reflects intense efforts to advance inhibitors to preclinical and clinical development in different therapeutic areas. However, there is no simple correlation between increasing amounts of available data and clinical advancements in the kinase inhibitor field, especially because requirements for kinase inhibitors considered for different therapeutic applications depart from standards established in drug discovery including, first and foremost, kinase promiscuity and ensuing polypharmacology.5,6

The promiscuity cliff (PC) data structure was introduced previously to explore the structural basis of multitarget activities of small molecules.29−31 A PC is defined as a pair of structurally analogous compounds that have a large difference in the number of target annotations.29 PCs have been identified in screening libraries29 and compound sets from ChEMBL including kinase inhibitors.30 The PC data structure is useful for exploring structure−promiscuity relationships and deriving additional target hypotheses for structural analogues of extensively tested kinase inhibitors, especially those that have advanced to the clinic or have been approved as drugs. Structurally related compounds have often not been extensively tested. Therefore, PCs involving such inhibitors immediately suggest follow-up experiments, given likely data incompleteness.

Herein, we extend the systematic analysis of kinase inhibitors and their promiscuity on the basis of inhibitors and activity data that were selected from different source databases and combined, yielding unprecedented coverage of the human kinome. The analysis was combined with a systematic assessment of the PCs formed by human kinase inhibitors and PC pathways extracted from network representations.

Received: October 29, 2018
Accepted: December 3, 2018
Published: December 13, 2018

© 2018 American Chemical Society

DOI: 10.1021/acsomega.8b02998 ACS Omega 2018, 3, 17295−17308



MATERIALS AND METHODS

General Compound Selection Criteria. Compounds were represented as canonical SMILES.32 The following selection criteria were generally applied: (1) Only inhibitors of human kinases having UniProt33 IDs were selected. (2) The permitted molecular weight range was [200, 900] Da. (3) Potency had to be reported using a standard concentration or constant (such as IC50, Ki, or Kd) and a numerically specified value with standard unit (such as units μM, nM, or pM). All potency measurements were recorded as the negative decadic logarithm. (4) A potency threshold of 10 μM was applied (pPOT ≥ 5). (5) If multiple potency values were reported for the same kinase, the highest value was selected. (6) Each kinase annotation of an inhibitor was recorded as a separate “interaction”.

Source Databases and Data Curation. Databases were accessed in September 2018, except PubChem, which was accessed in June 2017. The following database-specific curation and selection criteria were applied.

ChEMBL. From ChEMBL28 release 24, human kinase inhibitors were selected if inhibition of single kinases (target type “SINGLE PROTEIN”) in direct interaction assays (relationship type “D”) at the highest level of confidence (confidence score “9”) was reported using the standard activity relationship “=”. In addition, consistent activity records were required (e.g., excluding compounds designated as “active”, “inactive”, and/or “inconclusive” in the same record).

PubChem. From PubChem,34,35 primary, confirmatory, and panel assays for human kinases were obtained that reported potency measurements with μM or nM activity units. PubChem’s target GI numbers were mapped to the corresponding UniProt IDs. Only compounds with a consistent designation as active with standard relationship “=” for a human kinase assay or across different assays for the same kinase were considered.

Probes and Drugs Portal. The Probes and Drugs Portal combines activity data from ∼50 different sources.36 Human kinases from the Portal data were mapped to UniProt IDs. Kinase inhibitors with potency measurements such as pIC50, pKi, or pKd were selected.

BindingDB. From BindingDB,37 inhibitors of human kinases with available pIC50, pKi, pKd, or pEC50 were selected.

PDBbind. Protein−ligand complexes in PDBbind38 were filtered for human kinases with UniProt IDs and associated PDB39 codes for single targets. Reported compound activity measurements included IC50, Kd, and Ki values with standard relationship. Because PDBbind only provides PDB codes for complexes, these codes were searched for matches in the KLIFS database40 from which the corresponding inhibitors were obtained.
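The general compound selection criteria (1)−(6) listed under Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout, the function names, and the ASCII unit label "uM" (standing in for μM) are assumptions made for illustration, and "highest value" in criterion (5) is interpreted here as the highest pPOT.

```python
import math

# Exponent that converts a value in the given unit to mol/L (value * 10**-exp).
# "uM" is an ASCII stand-in for μM (assumption for this sketch).
UNIT_EXPONENT = {"uM": 6, "nM": 9, "pM": 12}


def to_ppot(value, unit):
    """Negative decadic logarithm of a potency value given in uM, nM, or pM."""
    return UNIT_EXPONENT[unit] - math.log10(value)


def select_interactions(records, mw_range=(200.0, 900.0), min_ppot=5.0):
    """Return {(compound, kinase): best pPOT} for records passing the filters.

    Each record is a hypothetical tuple:
    (compound_id, uniprot_id, molecular_weight_da, value, unit).
    """
    best = {}
    for compound, kinase, mol_weight, value, unit in records:
        if not kinase:  # criterion (1): kinase must carry a UniProt ID
            continue
        if not mw_range[0] <= mol_weight <= mw_range[1]:  # criterion (2)
            continue
        if unit not in UNIT_EXPONENT or value <= 0:  # criterion (3)
            continue
        ppot = to_ppot(value, unit)
        if ppot < min_ppot:  # criterion (4): 10 μM threshold
            continue
        key = (compound, kinase)  # criterion (6): one interaction per pair
        if ppot > best.get(key, float("-inf")):  # criterion (5)
            best[key] = ppot
    return best
```

Each key of the returned dictionary corresponds to one compound−kinase "interaction" in the sense of criterion (6).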


ProteomicsDB. Results of a profiling study of clinical kinase inhibitors41 have been made available in ProteomicsDB.42 From this data set, measurements designated as “high confidence” were selected, given as Kd values with standard relationship, yielding 215 human kinase inhibitors.

Drug Target Commons. Compound annotations with human kinases in the Drug Target Commons database43 were filtered for UniProt IDs, standard relationship “=”, and single-target assays. The database refers to compounds using their ChEMBL IDs. Therefore, qualifying kinase inhibitors were retrieved from ChEMBL.

Unifying Kinase Inhibitor Data from Different Sources. To combine data from different sources, assemble unique inhibitors, and evaluate compound sourcing, ChEMBL was used as a reference database. Compounds selected from all databases were mapped to ChEMBL and the overlap was determined. Then, inhibitors not contained in other databases were extracted from ChEMBL by applying the selection criteria specified above. Finally, it was determined how many unique human kinase inhibitors were obtained from each database. Unification of kinase data from different sources is generally hindered by the application of different assay systems, activity detection technologies, and experimental conditions such as varying ATP concentrations in the assays. These variables typically lead to different activity read-outs. Moreover, inconsistencies in data curation may lead to further bias in judging and comparing activity profiles. Therefore, we analyzed our data selection for potential inconsistencies in activity data across different data sources. Moreover, only 4.9% of the interactions had activity variations exceeding one order of magnitude, thus lending credence to the data curation and selection process. In this limited number of cases, for formal consistency, the highest reported activity value was selected and recorded to establish an interaction.

Alerts for Pan-Assay Interference Compounds (PAINS). For PC analysis, kinase inhibitors were screened for pan-assay interference compounds (PAINS)14,15 using three public filters available in ChEMBL,28 RDKit,44 and ZINC.45 Although it is by no means certain that compounds containing PAINS substructures will cause assay interference and activity artifacts,46,47 excluding potential false-positives is of critical relevance for defining PCs. This is the case because single-assay interference compounds with artificial target annotations might give rise to many incorrect PCs. Therefore, kinase inhibitors with PAINS alerts were excluded from PC analysis.

Promiscuity Cliffs. PCs formed by human kinase inhibitors were identified by systematically searching for transformation size-restricted matched molecular pairs (MMPs).48 An MMP is defined as a pair of compounds that are only distinguished by a chemical modification at a single site,49,50 termed a transformation.50 The MMPs were then screened for a participating inhibitor with a PD of 1−4 and another inhibitor with a larger PD value, yielding a PD difference (ΔPD) of 5 or more. The MMPs meeting these PD/ΔPD conditions were classified as PCs. PC networks in which nodes represent PC compounds and edges pairwise PCs were generated with Cytoscape.51 Furthermore, phylogenetic trees of the human kinome52 were drawn with KinMap.53

RESULTS AND DISCUSSION

Human Kinase Inhibitors. A total of 112 624 unique inhibitors were identified that were active against 426 human kinases (pPOT ≥ 5). These inhibitors covered 82.2% of the human kinome (518 kinases)52 and formed a total of 234 740 unique compound−kinase interactions. For 97.2% of these interactions, only one type of potency measurement (e.g., Ki) was available. Furthermore, IC50, Ki, and Kd values represented 96.4% of all potency measurements, with IC50 representing two thirds of the data (67.5%), followed by Ki (22.3%) and Kd (6.2%) values. When a more stringent potency threshold of 100 nM (pPOT ≥ 7) was applied to this set, 69 774 inhibitors were obtained that covered 408 human kinases.

Our previous analysis of kinase inhibitors23 was exclusively based on ChEMBL and ChEMBL-specific data selection criteria. Here, the scope of compound and activity data analysis was expanded and data curation and selection criteria were balanced to cover seven databases. To evaluate compound selection, ChEMBL release 24 was used as a reference database and found to contain 83 647 of the 112 624 inhibitors (74.3%). These compounds had at least one human kinase annotation in ChEMBL, not taking data confidence criteria into consideration.

Table 1 reports the number of inhibitors and interactions that were uniquely contributed by individual databases. A subset of 3457 inhibitors was present only in ChEMBL and in no other database. With 27 277 compounds, BindingDB provided by far the largest fraction of unique inhibitors, which formed 44 547 interactions. BindingDB was followed by ChEMBL and PubChem (444 unique inhibitors). In total, 31 753 inhibitors originated from a single source database. In addition to uniquely contributed compounds, 681 inhibitors not contained in ChEMBL were shared by two or more other databases. The consolidated set of 112 624 human kinase inhibitors provided the basis for our subsequent analysis.

Table 1. Unique Inhibitors and Interactions Originating from Different Databases^a

no   database                  unique inhibitors   unique interactions
1    ChEMBL                    3457                6807
2    PubChem                   444                 471
3    Probes and Drugs Portal   188                 1971
4    BindingDB                 27 277              44 547
5    PDBbind                   365                 380
6    ProteomicsDB              6                   126
7    Drug Target Commons       16                  16
∑                              31 753              54 318

^a The table reports the number of inhibitors and compound−kinase interactions that were uniquely contributed by each database. Taken together, 31 753 inhibitors originated from only one of the source databases. The remaining 80 871 of the total of 112 624 qualifying inhibitors were shared by two or more databases.

Promiscuity Analysis. For promiscuity analysis, each defined compound−kinase interaction yielded an individual target annotation for an inhibitor, whose sum gave its PD. In addition, for each inhibitor with multikinase activity, the “nanomolar ratio” was determined as the proportion of nM relative to (nM + μM) potency measurements. The so-defined nM ratio served as a measure of the strength and relevance of interactions involving promiscuous kinase inhibitors.

Figure 1 shows the distribution of PD values of the 112 624 human kinase inhibitors. The majority of inhibitors (61%) had only a single kinase annotation. More than a third of the inhibitors had known activity against two to four kinases, whereas only 4% were active against five or more kinases (4510 inhibitors). Among these, 1% (1392 inhibitors) had 10 or more kinase annotations, thus representing the subset of most promiscuous inhibitors across the human kinome. With mean and median PD of 2.1 and 1.0, respectively, kinase inhibitor promiscuity was overall only slightly higher across 426 human kinases than indicated by a mean PD of 1.5, which was previously determined on the basis of 43 331 inhibitors with high-confidence activity data for 286 human kinases that exclusively originated from ChEMBL.23 Thus, although our current analysis was based on many more compounds and a much larger kinome coverage, the assessment of global promiscuity among kinase inhibitors remained consistent with earlier findings. For promiscuous kinase inhibitors (PD ≥ 2), a mean and median PD of 3.7 and 2.0 was obtained, respectively. Only a small subset of the inhibitors had a high promiscuity.

Figure 1. Distribution of promiscuity degrees. A pie chart shows the distribution of PD values for the set of 112 624 human kinase inhibitors.

In addition to assessing kinase inhibitor promiscuity, it was also of interest to determine which kinase groups might form the largest numbers of inhibitor interactions. For the subset of promiscuous kinase inhibitors, the highest recorded number of interactions was found for tyrosine kinases (group TK) (66 011 interactions; 42.8% of all interactions), followed by CMGC (22 816; 14.8%) and AGC (16 097; 10.4%) kinases.

Figure 2 reports the distribution of nM ratios for decreasing numbers of promiscuous inhibitors with increasing PD values. The widest distribution was observed for inhibitors with at least five target annotations, yielding a median value of 0.4 corresponding to 40% nM potency values. For inhibitors with a minimum of 10 target annotations, nM ratios were significantly reduced, yielding a median of 0.2. However, with further increase in PD thresholds, the distributions remained essentially constant. About half of 538 inhibitors with a PD of at least 30 had nM ratios of 0.2 or greater. Hence, highly promiscuous inhibitors were frequently active in the nanomolar range against multiple kinases.

Figure 2. Distribution of nanomolar ratios for inhibitors with increasing PD values. Boxplots monitor distributions of nM ratios (vertical axis on the left) for subsets of inhibitors at different PD thresholds (horizontal axis). The blue curve reports the number of inhibitors in each set (vertical axis on the right). Boxplots report the smallest value (bottom line), first quartile (lower boundary of the box), median value (thick line), third quartile (upper boundary of the box), largest value (top line), and outliers (points below the smallest or above the largest value).

Moreover, promiscuity patterns of inhibitors greatly varied. Figure 3 shows four representative examples of inhibitors with more than 50 kinase annotations and the distributions of their activities across the kinome. As can be seen, highly promiscuous inhibitors were either active against kinases from different groups with similar frequency, corresponding to a wide distribution of activities across the human kinome, or predominantly targeting individual groups such as tyrosine kinases. In addition, the distribution of μM vs nM potencies substantially varied. In some instances, nM potencies of inhibitors were largely confined to single kinase groups, in others they were distributed over different groups. Thus, inhibitors displayed diversified promiscuity patterns, which revealed differential activities across the kinome, even for highly promiscuous inhibitors.

Promiscuity Cliffs. Next, we systematically searched for structurally analogous kinase inhibitors forming PCs, which required the consideration of additional analysis criteria. First, inhibitors with PAINS alerts were excluded from PC analysis to minimize the risk of false-positive PC assignments. Second, a data-driven PD difference (ΔPD) criterion for cliff formation was established. A total of 7132 inhibitors with PAINS alerts were detected (6.3%), thus only a small proportion, which included 4177 PAINS among 68 361 inhibitors with single kinase annotations and 2955 PAINS among 44 263 promiscuous inhibitors. Following the removal of PAINS, the PD value distribution was re-calculated for the remaining 41 308 promiscuous inhibitors, which again yielded a mean and median of 3.7 and 2.0, respectively, the same as for all promiscuous inhibitors including PAINS, indicating that inhibitors with PAINS alerts were in general not highly promiscuous.

We then determined the PD distribution for the subset of promiscuous inhibitors with five or more target annotations, which yielded a median PD of 6. Hence, on the PD scale, the top ∼2% of kinase inhibitors had PD values of 6 or greater. These compounds were considered as candidates for highly promiscuous PC partners. Therefore, we set the ΔPD threshold for PC formation to 5. Accordingly, the PC of smallest magnitude involving an inhibitor with a single kinase annotation was formed with a qualifying structural analogue having a PD of 6. In addition, we set the criterion that weakly promiscuous cliff partners were limited to inhibitors with PD values ranging from 1 to 4. Application of this criterion ensured that no PCs were formed by pairs of highly promiscuous inhibitors.

PC analysis was then based on a total of 105 492 inhibitors without PAINS alerts. A large number of PCs (15 939) were identified that involved 10 741 inhibitors (10.2%) including 1653 compounds with PD values of 6−295. We also determined that 2236 PCs (14.0%) were formed by 129 kinase inhibitors at different stages of clinical development and close structural analogues. Nearly all (i.e., 126) clinical inhibitors forming PCs were highly promiscuous cliff partners (PD ≥ 6), and 68 of the clinical inhibitors formed at least 10 PCs and thus served as “promiscuity hubs” in a PC network, as further discussed below.
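The promiscuity degree, the nanomolar ratio, and the PD/ΔPD conditions for promiscuity cliffs described in this section can be summarized in a short Python sketch. The function names and input layouts are hypothetical, the matched molecular pairs are assumed to be supplied by a separate MMP search that is not reproduced here, and counting pPOT > 6 (potency below 1 μM) as "nM" is an assumption, since the text does not state the exact boundary.

```python
from collections import defaultdict


def promiscuity_degrees(interactions):
    """PD = number of unique kinase annotations per inhibitor."""
    kinases = defaultdict(set)
    for compound, kinase in interactions:
        kinases[compound].add(kinase)
    return {compound: len(targets) for compound, targets in kinases.items()}


def nanomolar_ratio(ppot_values, nm_cutoff=6.0):
    """Proportion of nM among (nM + uM) potency values of one inhibitor.

    Assumption: values with pPOT > nm_cutoff are counted as nanomolar.
    """
    n_nm = sum(1 for ppot in ppot_values if ppot > nm_cutoff)
    return n_nm / len(ppot_values)


def is_promiscuity_cliff(pd_a, pd_b, max_low_pd=4, min_delta_pd=5):
    """True if one partner has PD 1-4 and the PD difference is 5 or more."""
    low, high = sorted((pd_a, pd_b))
    return 1 <= low <= max_low_pd and high - low >= min_delta_pd


def find_cliffs(mmp_pairs, pd):
    """Keep the matched molecular pairs that meet the PD/ΔPD conditions."""
    return [(a, b) for a, b in mmp_pairs if is_promiscuity_cliff(pd[a], pd[b])]
```

With the default thresholds, the smallest possible cliff pairs an inhibitor with PD 1 and an analogue with PD 6, and two inhibitors that both have PD values of 5 or more can never form a cliff.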


Figure 3. Promiscuity patterns. Shown are pairs of highly promiscuous inhibitors displaying different promiscuity patterns. ChEMBL IDs are reported above the compounds and their PD values below in blue circles. For each inhibitor, a phylogenetic tree of the human kinome is shown onto which its kinase annotations are mapped. Each dot represents a kinase the inhibitor is active against. Dots are color-coded according to compound potency (green, nanomolar; yellow, micromolar).

Figure 4 shows the distribution of ΔPD values over all PCs. More than half of the PCs (55%) had ΔPD values of 10 or more and 5879 PCs (37%) ΔPD values of 20 or more. Thus, significant numbers of large-magnitude PCs were identified. In Figure 5, exemplary PCs of increasing magnitude are shown, which reveal small structural changes that distinguish inhibitors with increasingly large differences in promiscuity. As such, each PC encodes (i) additional target hypotheses for weakly or nonpromiscuous inhibitors (taking data incompleteness into consideration) and (ii) hypotheses for structural changes that might be responsible for achieving target selectivity or trigger promiscuity. Accordingly, computationally identified PCs provide a wealth of opportunities for follow-up investigations.

Figure 4. Distribution of ΔPD values for promiscuity cliffs. A pie chart shows the distribution of ΔPD values for the set of 15 939 PCs.

Figure 5. Exemplary promiscuity cliffs. For each ΔPD category in Figure 4, an exemplary PC is given. For each inhibitor, the PD value is reported and chemical modifications distinguishing PC partners are color-coded.

Promiscuity Cliff Network. We then generated a global network from all 15 939 PCs in which nodes represented inhibitors and edges pairwise PC relationships. The global PC network was found to consist of a total of 622 clusters with two to 633 inhibitors per cluster, with a mean of 17.3 and median of 6.5 inhibitors. These clusters contained between one and 1351 PCs, with a mean and median of 25.6 and 6.0 PCs per cluster. Thus, PCs were typically formed by groups of structurally related inhibitors, similar to what has been observed for activity cliffs,54,55 the majority of which are formed in a “coordinated” manner,55 as well as for “interaction cliffs”, which take protein−ligand interaction similarity into account, in addition to structural similarity.56 As discussed below, coordination of PCs further increased the structural context information for promiscuity analysis. Figure 6a shows exemplary PC clusters from the global network. As can be seen, these clusters vary greatly in their size, topology, and complexity.

Promiscuity Cliff Pathways. PC clusters served as a source of “PC pathways” (PCPs), as also illustrated in Figure 6a. A PCP represents a linear substructure (subgraph) of a PC cluster and a data structure for the extraction of structure−promiscuity relationships from the clusters. Figure 6b−g shows a variety of PCPs of increasing length that are traced in clusters. A simple PCP is depicted in Figure 6b, which was isolated from a PC cluster with a “star” topology, resulting from a central highly promiscuous inhibitor forming PCs with many others. This PCP consists of only two PCs of smallest possible magnitude (i.e., PD values of 1 and 6). Figure 6c shows a PCP from a cluster containing a highly promiscuous inhibitor (PD 56) and a number of weakly promiscuous analogues. The highly promiscuous inhibitor has a substituent that is chemically distinct from those of its analogues, which might be responsible for its high promiscuity, providing experimentally testable hypotheses.

The PCP depicted in Figure 6d combines inhibitors with single kinase annotations and others with varying degrees of promiscuity including one of the most promiscuous inhibitors identified (compound 6, PD 165). It is striking to observe how small chemical modifications along the PCP relate nonpromiscuous and highly promiscuous inhibitors to each other, for example, compounds 3 (PD 1) and 4 (PD 14) or compounds 6 (PD 165) and 7 (PD 1). A characteristic feature of PC sequences forming PCPs is that they consist of alternating structural analogues with low and high PD values. As such, the PCP uncovers multiple structure−promiscuity relationships that can be further investigated. Even a medium-sized PCP, such as the one shown in Figure 6d, provides many additional target hypotheses for inhibitors as well as hypotheses for structural modifications altering promiscuity. The information provided by a single PCP would be sufficient for initiating an experimental program to further explore a kinase inhibitor analogue series.

As shown in Figure 6e−g, PCPs of increasing lengths can be isolated from increasingly large and complex clusters to extract structure−promiscuity relationship information from them. Both PCPs in Figure 6e,f are characterized by the presence of inhibitors with significantly varying PD values. Furthermore, the PCP in Figure 6g organizes a number of large-magnitude PCs involving inhibitors with single target annotations and highly promiscuous ones. It also illustrates that large series of overlapping PCs might encompass inhibitors of varying size and structural complexity. In cluster VI (Figure 6a) from which this PCP was extracted, compounds 11 and in particular 13 represent promiscuity hubs (with a PD value of 16 and 34, respectively), which have many weakly promiscuous or nonpromiscuous near neighbors. Such promiscuity hubs and their neighbors represent prime candidates for exploring the role of data incompleteness in subsequent kinase profiling assays as well as molecular origins of experimentally confirmed differences in promiscuity. Figure 7 shows exemplary clinical kinase inhibitors and their PC network neighborhoods.


Figure 6. Promiscuity cliff pathways. (a) Selected clusters (I−VI) from the global PC network and their composition. Nodes represent inhibitors and edges pairwise PCs. Nodes are color-coded according to different PD ranges. Pathways formed by sequences of PCs are traced using thick black edges and selected compounds are numbered. In (b)−(g), pathways from clusters I−VI in (a) are depicted in detail. The PD values of the participating inhibitors are reported applying the same color code as in (a) and iterative structural modifications that distinguish inhibitors along the paths are shown in red.
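The network representation described in this caption (inhibitors as nodes, pairwise PCs as edges, clusters, and pathways formed by sequences of PCs) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the compound identifiers and PD values are invented toy data, a pathway is traced as a shortest path between two chosen inhibitors, and a promiscuity hub is assumed to be a node with at least `min_neighbors` PC partners that all have lower PD values. The function names are hypothetical.

```python
from collections import defaultdict, deque


def build_pc_network(pc_pairs):
    """Adjacency map: each inhibitor ID maps to the set of its PC partners."""
    network = defaultdict(set)
    for a, b in pc_pairs:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def pc_clusters(network):
    """Connected components of the PC network, largest first."""
    seen, clusters = set(), []
    for start in network:
        if start in seen:
            continue
        cluster, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            cluster.add(node)
            for neighbor in network[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)


def trace_pathway(network, start, end):
    """Shortest sequence of PCs linking two inhibitors (assumption: BFS path)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == end:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(network.get(node, ())):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None  # the two inhibitors are not in the same cluster


def promiscuity_hubs(network, pd_values, min_neighbors=3):
    """Nodes with many PC partners that all have lower PD (assumed definition)."""
    return sorted(
        node
        for node, neighbors in network.items()
        if len(neighbors) >= min_neighbors
        and all(pd_values[node] > pd_values[n] for n in neighbors)
    )


# Invented toy data: IDs and PD values are hypothetical, not taken from Figure 6.
pd_values = {"A": 1, "B": 14, "C": 1, "D": 20, "E": 2, "F": 1, "X": 1, "Y": 9}
pc_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F"), ("X", "Y")]

network = build_pc_network(pc_pairs)
pathway = trace_pathway(network, "A", "E")
print([len(c) for c in pc_clusters(network)])
print(pathway, [pd_values[c] for c in pathway])
print(promiscuity_hubs(network, pd_values))
```

In the toy pathway, PD values alternate between low and high along the path, which mirrors the characteristic pattern described for PCPs in the text. A shortest path is only one convenient way of selecting a pathway; longer or manually selected sequences of PCs would fit the same data structure.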

Figure 7. Network environment of clinical kinase inhibitors. Shown are exemplary clinical kinase inhibitors together with their neighborhoods in the global PC network. The node corresponding to each clinical kinase inhibitor is encircled, and its PD value is given below the inhibitor. The representation is according to Figure 6.

Concluding Discussion. In this work, we have combined kinase inhibitors and associated activity data from different public repositories, yielding an unprecedentedly large collection of over 112 000 inhibitors with well-defined potency measurements and achieving 82% coverage of the human kinome. On the basis of these data, the much debated issue of kinase inhibitor promiscuity was revisited. As in previous analyses focused on ChEMBL data, low global promiscuity was detected for human kinase inhibitors, with mean and median PD values of 2.1 and 1.0, respectively. Only 4% of the currently available inhibitors were active against five or more human kinases. However, a small subset (∼1 to 2%) of highly promiscuous inhibitors was identified that did not contain PAINS substructures. Interestingly, highly promiscuous inhibitors displayed different nM ratios and activity distributions across the kinome. Special emphasis was put on systematically identifying PCs and PCPs. The PC concept was first introduced when analyzing compound array experiments.29 Another previous investigation established a promiscuity ontology of structurally related compounds on the basis of calculated fingerprint similarity and distinguished promiscuous from selective compounds.57 In our current analysis, we have systematically analyzed PCs, PC clusters, and PCPs to explore the structural modifications associated with large promiscuity differences. Therefore, data-driven PC criteria were established. A large number of PCs (∼16 000) were identified that were predominantly formed in a coordinated manner, as revealed by network analysis. We introduced the PCP concept to extract structure−promiscuity relationships from PC clusters and organize them in an interpretable form. The analysis uncovered many structurally analogous inhibitors with large PD value differences and chemical modifications converting highly promiscuous into weakly promiscuous or nonpromiscuous inhibitors and vice versa. Observed large differences in promiscuity between structural analogues were surprising and might be due to multiple reasons, as discussed.
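The global promiscuity statistics and the PC definition summarized above can be illustrated with a minimal Python sketch. It assumes that the PD value of an inhibitor equals its number of distinct kinase annotations and that a PC is a pair of structural analogues whose PD values differ by at least a chosen threshold. The threshold `min_delta`, the function names, and the toy annotations are hypothetical; the data-driven PC criteria actually applied in the study are not restated in this section.

```python
from statistics import mean, median


def promiscuity_degrees(annotations):
    """PD per inhibitor, taken here as the number of distinct kinase annotations."""
    return {cpd: len(set(kinases)) for cpd, kinases in annotations.items()}


def summarize(pd_values, cutoff=5):
    """Mean and median PD plus the fraction of inhibitors with PD >= cutoff."""
    values = list(pd_values.values())
    return {
        "mean": mean(values),
        "median": median(values),
        "fraction_at_or_above_cutoff": sum(v >= cutoff for v in values) / len(values),
    }


def promiscuity_cliffs(analog_pairs, pd_values, min_delta):
    """Analogue pairs whose PD values differ by at least min_delta (hypothetical)."""
    return [
        (a, b)
        for a, b in analog_pairs
        if abs(pd_values[a] - pd_values[b]) >= min_delta
    ]


# Invented toy data: compound IDs, annotations, and analogue pairs are hypothetical.
annotations = {
    "cpd1": ["ABL1"],
    "cpd2": ["ABL1", "SRC", "KIT", "EGFR", "LCK", "FYN"],
    "cpd3": ["SRC"],
    "cpd4": ["EGFR", "ERBB2"],
}
analog_pairs = [("cpd1", "cpd2"), ("cpd3", "cpd4")]

pd_values = promiscuity_degrees(annotations)
print(summarize(pd_values))
print(promiscuity_cliffs(analog_pairs, pd_values, min_delta=5))
```

The analogue pairs are passed in as given because the structural relationship (for example, a matched molecular pair) would be determined separately with a cheminformatics toolkit; only the promiscuity comparison is sketched here.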


Systematic analysis of PC clusters and PCPs revealed many structure−promiscuity relationships and additional target hypotheses for inhibitors. As such, our study provides an example for large-scale computational data analysis and generation of data structures that provide a basis for experimental design. Therefore, following publication of this work, our kinase inhibitor data, PCs, and PC clusters will be made freely available to enable follow-up investigations. In summary, our analysis has yielded

1. a large kinase inhibitor collection from different sources, achieving 82% coverage of the human kinome;
2. a detailed view of kinase promiscuity across the kinome;
3. approximately 16 000 PCs, which suggest target hypotheses of kinase inhibitors for follow-up investigations;
4. a global PC network from which PC clusters can be extracted;
5. PC pathways that can be directly used to explore structure−promiscuity relationships in medicinal chemistry.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +49-228-7369-100.

ORCID

Jürgen Bajorath: 0000-0002-0557-5714

Author Contributions

The study was carried out and the manuscript written with contributions of all the authors. All the authors have approved the final version of the manuscript.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The authors thank Ctibor Škuta of the Probes and Drugs Portal for providing kinase inhibitor data. The authors also thank Chemical Computing Group for providing an academic license for the Molecular Operating Environment used for descriptor calculations and OpenEye Scientific Software for providing an academic license for software toolkits used for compound standardization and MMP generation.

REFERENCES

(1) Cohen, P. Protein Kinases - the Major Drug Targets of the Twenty-First Century? Nat. Rev. Drug Discovery 2002, 1, 309−315.
(2) Kinase Drug Discovery; Ward, R. A.; Goldberg, F. W., Eds.; RSC: Cambridge, U.K., 2011.
(3) Knight, Z. A.; Lin, H.; Shokat, K. M. Targeting the Cancer Kinome through Polypharmacology. Nat. Rev. Cancer 2010, 10, 130−137.
(4) Gross, S.; Rahal, R.; Stransky, N.; Lengauer, C.; Hoeflich, K. P. Targeting Cancer with Kinase Inhibitors. J. Clin. Invest. 2015, 125, 1780−1789.
(5) Simmons, D. L. Targeting Kinases: A New Approach to Treating Inflammatory Rheumatic Diseases. Curr. Opin. Pharmacol. 2013, 13, 426−434.
(6) Laufer, S.; Bajorath, J. New Frontiers in Kinases: Second Generation Inhibitors. J. Med. Chem. 2014, 57, 2167−2168.
(7) Wu, P.; Nielsen, T. E.; Clausen, M. H. Small-Molecule Kinase Inhibitors: An Analysis of FDA-Approved Drugs. Drug Discovery Today 2016, 21, 5−10.
(8) Karaman, M. W.; Herrgard, S.; Treiber, D. K.; Gallant, P.; Atteridge, C. E.; Campbell, B. T.; Chan, K. W.; Ciceri, P.; Davis, M. I.; Edeen, P. T.; Faraoni, R.; Floyd, M.; Hunt, J. P.; Lockhart, D. J.; Milanov, Z. V.; Morrison, M. J.; Pallares, G.; Patel, H. K.; Pritchard, S.; Wodicka, L. M.; Zarrinkar, P. P. A Quantitative Analysis of Kinase Inhibitor Selectivity. Nat. Biotechnol. 2008, 26, 127−132.
(9) Anastassiadis, T.; Deacon, S. W.; Devarajan, K.; Ma, H.; Peterson, J. R. Comprehensive Assay of Kinase Catalytic Activity Reveals Features of Kinase Inhibitor Selectivity. Nat. Biotechnol. 2011, 29, 1039−1045.
(10) Cheng, A. C.; John Eksterowicz, J.; Geuns-Meyer, S.; Sun, Y. Analysis of Kinase Inhibitor Selectivity Using a Thermodynamics-Based Partition Index. J. Med. Chem. 2010, 53, 4502−4510.
(11) Levitzki, A. Tyrosine Kinase Inhibitors: Views of Selectivity, Sensitivity, and Clinical Performance. Annu. Rev. Pharmacol. Toxicol. 2013, 53, 161−185.
(12) Müller, S.; Chaikuad, A.; Gray, N. S.; Knapp, S. The Ins and Outs of Selective Kinase Inhibitor Development. Nat. Chem. Biol. 2015, 11, 818−821.
(13) Elkins, J. M.; Fedele, V.; Szklarz, M.; Abdul Azeez, K. R.; Salah, E.; Mikolajczyk, J.; Romanov, S.; Sepetov, N.; Huang, X. P.; Roth, B. L.; Al Haj Zen, A.; Fourches, D.; Muratov, E.; Tropsha, A.; Morris, J.; Teicher, B. A.; Kunkel, M.; Polley, E.; Lackey, K. E.; Atkinson, F. L.; Overington, J. P.; Bamborough, P.; Müller, S.; Price, D. J.; Willson, T. M.; Drewry, D. H.; Knapp, S.; Zuercher, W. J. Comprehensive Characterization of the Published Kinase Inhibitor Set. Nat. Biotechnol. 2016, 34, 95−103.
(14) Baell, J. B.; Holloway, G. A. New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays. J. Med. Chem. 2010, 53, 2719−2740.
(15) Baell, J.; Walters, M. A. Chemistry: Chemical Con Artists Foil Drug Discovery. Nature 2014, 513, 481−483.
(16) Baell, J. B.; Nissink, J. W. M. Seven Year Itch: Pan-Assay Interference Compounds (PAINS) in 2017: Utility and Limitations. ACS Chem. Biol. 2018, 13, 36−44.
(17) Hu, Y.; Bajorath, J. Compound Promiscuity - What Can We Learn From Current Data. Drug Discovery Today 2013, 18, 644−650.
(18) Gilberg, E.; Jasial, S.; Stumpfe, D.; Dimova, D.; Bajorath, J. Highly Promiscuous Small Molecules from Biological Screening Assays Include Many Pan-Assay Interference Compounds but also Candidates for Polypharmacology. J. Med. Chem. 2016, 59, 10285−10290.
(19) Liu, Y.; Gray, N. S. Rational Design of Inhibitors that Bind to Inactive Kinase Conformations. Nat. Chem. Biol. 2006, 2, 358−364.
(20) Zhao, Z.; Wu, H.; Wang, L.; Liu, Y.; Knapp, S.; Liu, Q.; Gray, N. S. Exploration of Type II Binding Mode: A Privileged Approach for Kinase Inhibitor Focused Drug Discovery? ACS Chem. Biol. 2014, 9, 1230−1241.
(21) Gavrin, L. K.; Saiah, E. Approaches to Discover Non-ATP Site Kinase Inhibitors. Med. Chem. Commun. 2013, 4, 41−51.
(22) Hu, Y.; Furtmann, N.; Bajorath, J. Current Compound Coverage of the Kinome. J. Med. Chem. 2015, 58, 30−40.
(23) Dimova, D.; Bajorath, J. Assessing Scaffold Diversity of Kinase Inhibitors Using Alternative Scaffold Concepts and Estimating the Scaffold Hopping Potential for Different Kinases. Molecules 2017, 22, No. e730.
(24) Mestres, J.; Gregori-Puigjane, E.; Valverde, S.; Sole, R. V. Data Completeness − The Achilles Heel of Drug-Target Networks. Nat. Biotechnol. 2008, 26, 983−984.
(25) Stumpfe, D.; Tinivella, A.; Rastelli, G.; Bajorath, J. Promiscuity of Inhibitors of Human Protein Kinases at Varying Data Confidence Levels and Test Frequencies. RSC Adv. 2017, 7, 41265−41271.
(26) Miljković, F.; Bajorath, J. Exploring Selectivity of Multikinase Inhibitors across the Human Kinome. ACS Omega 2018, 3, 1147−1153.
(27) Miljković, F.; Bajorath, J. Reconciling Selectivity Trends from a Comprehensive Kinase Inhibitor Profiling Campaign with Known Activity Data. ACS Omega 2018, 3, 3113−3119.
(28) Gaulton, A.; Hersey, A.; Nowotka, M.; Bento, A. P.; Chambers, J.; Mendez, D.; Mutowo, P.; Atkinson, F.; Bellis, L. J.; Cibrián-Uhalte, E.; Davies, M.; Dedman, N.; Karlsson, A.; Magariños, M. P.; Overington, J. P.; Papadatos, G.; Smit, I.; Leach, A. R. The ChEMBL Database in 2017. Nucleic Acids Res. 2017, 45, D945−D954.


(29) Dimova, D.; Hu, Y.; Bajorath, J. Matched Molecular Pair Analysis of Small Molecule Microarray Data Identifies Promiscuity Cliffs and Reveals Molecular Origins of Extreme Compound Promiscuity. J. Med. Chem. 2012, 55, 10220−10228.
(30) Dimova, D.; Gilberg, E.; Bajorath, J. Identification and Analysis of Promiscuity Cliffs Formed by Bioactive Compounds and Experimental Implications. RSC Adv. 2017, 7, 58−66.
(31) Dimova, D.; Bajorath, J. Rationalizing Promiscuity Cliffs. ChemMedChem 2018, 13, 490−494.
(32) Weininger, D. SMILES, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31−36.
(33) The UniProt Consortium. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 2018, 46, 2699.
(34) Kim, S.; Thiessen, P. A.; Bolton, E. E.; Chen, J.; Fu, G.; Gindulyte, A.; Han, L.; He, J.; He, S.; Shoemaker, B. A.; Wang, J.; Yu, B.; Zhang, J.; Bryant, S. H. PubChem Substance and Compound Databases. Nucleic Acids Res. 2016, 44, D1202−D1213.
(35) Wang, Y.; Bryant, S. H.; Cheng, T.; Wang, J.; Gindulyte, A. B.; Shoemaker, A.; Thiessen, P. A.; He, S.; Zhang, J. PubChem BioAssay: 2017 Update. Nucleic Acids Res. 2017, 45, D955−D963.
(36) Skuta, C.; Popr, M.; Muller, T.; Jindrich, J.; Kahle, M.; Sedlak, D.; Svozil, D.; Bartunek, P. Probes & Drugs Portal: An Interactive, Open Data Resource for Chemical Biology. Nat. Methods 2017, 14, 759−760.
(37) Gilson, M. K.; Liu, T.; Baitaluk, M.; Nicola, G.; Hwang, L.; Chong, J. BindingDB in 2015: A Public Database for Medicinal Chemistry, Computational Chemistry and Systems Pharmacology. Nucleic Acids Res. 2016, 44, D1045−D1053.
(38) Liu, Z.; Su, M.; Han, L.; Liu, J.; Yang, Q.; Li, Y.; Wang, R. Forging the Basis for Developing Protein−Ligand Interaction Scoring Functions. Acc. Chem. Res. 2017, 50, 302−309.
(39) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235−242.
(40) Kooistra, A. J.; Kanev, G. K.; van Linden, O. P. J.; Leurs, R.; de Esch, I. J. P.; de Graaf, C. KLIFS: A Structural Kinase-Ligand Interaction Database. Nucleic Acids Res. 2016, 44, D365−D371.
(41) Klaeger, S.; Heinzlmeir, S.; Wilhelm, M.; Polzer, H.; Vick, B.; Koenig, P. A.; Reinecke, M.; Ruprecht, B.; Petzoldt, S.; Meng, C.; Zecha, J.; Reiter, K.; Qiao, H.; Helm, D.; Koch, H.; Schoof, M.; Canevari, G.; Casale, E.; Depaolini, S. R.; Feuchtinger, A.; Wu, Z.; Schmidt, T.; Rueckert, L.; Becker, W.; Huenges, J.; Garz, A. K.; Gohlke, B. O.; Zolg, D. P.; Kayser, G.; Vooder, T.; Preissner, R.; Hahne, H.; Tõnisson, N.; Kramer, K.; Götze, K.; Bassermann, F.; Schlegl, J.; Ehrlich, H. C.; Aiche, S.; Walch, A.; Greif, P. A.; Schneider, S.; Felder, E. R.; Ruland, J.; Médard, G.; Jeremias, I.; Spiekermann, K.; Kuster, B. The Target Landscape of Clinical Kinase Inhibitors. Science 2017, 358, No. eaan4368.
(42) Schmidt, T.; Samaras, P.; Frejno, M.; Gessulat, S.; Barnert, M.; Kienegger, H.; Krcmar, H.; Schlegl, J.; Ehrlich, H. C.; Aiche, S.; Kuster, B.; Wilhelm, M. ProteomicsDB. Nucleic Acids Res. 2018, 46, D1271−D1281.
(43) Tang, J.; Tanoli, Z.-U.-R.; Ravikumar, B.; Alam, Z.; Rebane, A.; Vähä-Koskela, M.; Peddinti, G.; van Adrichem, A. J.; Wakkinen, J.; Jaiswal, A.; Karjalainen, E.; Gautam, P.; He, L.; Parri, E.; Khan, S.; Gupta, A.; Ali, M.; Yetukuri, L.; Gustavsson, A.-L.; Seashore-Ludlow, B.; Hersey, A.; Leach, A. R.; Overington, J. P.; Repasky, G.; Wennerberg, K.; Aittokallio, T. Drug Target Commons: A Community Effort to Build a Consensus Knowledge Base for Drug-Target Interactions. Cell Chem. Biol. 2018, 25, 224−229.
(44) RDKit: Cheminformatics and Machine Learning Software. http://www.rdkit.org, 2013.
(45) Sterling, T.; Irwin, J. J. ZINC 15 − Ligand Discovery for Everyone. J. Chem. Inf. Model. 2015, 55, 2324−2337.
(46) Capuzzi, S. J.; Muratov, E. N.; Tropsha, A. Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS. J. Chem. Inf. Model. 2017, 57, 417−427.
(47) Jasial, S.; Hu, Y.; Bajorath, J. How Frequently Are Pan-Assay Interference Compounds Active? Large-Scale Analysis of Screening Data Reveals Diverse Activity Profiles, Low Global Hit Frequency, and Many Consistently Inactive Compounds. J. Med. Chem. 2017, 60, 3879−3886.
(48) Hu, X.; Hu, Y.; Vogt, M.; Stumpfe, D.; Bajorath, J. MMP-Cliffs: Systematic Identification of Activity Cliffs on the Basis of Matched Molecular Pairs. J. Chem. Inf. Model. 2012, 52, 1138−1145.
(49) Kenny, P. W.; Sadowski, J. Structure Modification in Chemical Databases. In Chemoinformatics in Drug Discovery; Oprea, T. I., Ed.; Wiley-VCH: Weinheim, Germany, 2005; pp 271−285.
(50) Hussain, J.; Rea, C. Computationally Efficient Algorithm to Identify Matched Molecular Pairs (MMPs) in Large Data Sets. J. Chem. Inf. Model. 2010, 50, 339−348.
(51) Smoot, M. E.; Ono, K.; Ruscheinski, J.; Wang, P. L.; Ideker, T. Cytoscape 2.8: New Features for Data Integration and Network Visualization. Bioinformatics 2011, 27, 431−432.
(52) Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. The Protein Kinase Complement of the Human Genome. Science 2002, 298, 1912−1934.
(53) Eid, S.; Turk, S.; Volkamer, A.; Rippmann, F.; Fulle, S. KinMap: A Web-Based Tool for Interactive Navigation through Human Kinome Data. BMC Bioinformatics 2017, 18, No. 16.
(54) Maggiora, G. M. On Outliers and Activity Cliffs − Why QSAR Often Disappoints. J. Chem. Inf. Model. 2006, 46, 1535.
(55) Stumpfe, D.; Hu, Y.; Dimova, D.; Bajorath, J. Recent Progress in Understanding Activity Cliffs and their Utility in Medicinal Chemistry. J. Med. Chem. 2014, 57, 18−28.
(56) Méndez-Lucio, O.; Kooistra, A. J.; de Graaf, C.; Bender, A.; Medina-Franco, J. L. Analyzing Multitarget Activity Landscapes Using Protein−Ligand Interaction Fingerprints: Interaction Cliffs. J. Chem. Inf. Model. 2015, 55, 251−262.
(57) Yongye, A. B.; Medina-Franco, J. L. Data Mining of Protein-Binding Profiling Data Identifies Structural Modifications That Distinguish Selective and Promiscuous Compounds. J. Chem. Inf. Model. 2012, 52, 2454−2461.
