
Pharmaceutical Modeling

Determination of a Focused Mini Kinase Panel for Early Identification of Selective Kinase Inhibitors Scott D. Bembenek, Gavin Hirst, and Taraneh Mirzadegan J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.8b00222 • Publication Date (Web): 24 May 2018 Downloaded from http://pubs.acs.org on May 25, 2018


Journal of Chemical Information and Modeling

Determination of a Focused Mini Kinase Panel for Early Identification of Selective Kinase Inhibitors

Scott D. Bembenekτ∗, Gavin Hirstθ and Taraneh Mirzadeganτ

τDiscovery Sciences and θImmunology, Janssen Research & Development, San Diego, CA 92121, USA.

ACS Paragon Plus Environment


ABSTRACT

We analyzed an extensive data set of 3000 Janssen kinase inhibitors (spanning some 40 therapeutic projects) profiled at 414 kinases in the DiscoverX KINOMEscan to better understand the necessity of using such a full kinase panel versus simply profiling one’s compound at a much smaller number of kinases, or mini kinase panel (MKP), to assess its selectivity. To this end, we generated a series of MKPs over a range of sizes and of varying kinase membership using Monte Carlo simulations. By defining the kinase hit index (KHI), we quantified a compound’s selectivity based on the number of kinases it hits. We find that certain combinations (rather than a random selection) of kinases can result in a much lower average error. Indeed, we identified a focused MKP with a 45.1% improvement in the average error (compared to random) that yields an overall correlation of R² = 0.786 – 0.826 for the KHI compared to the full kinase panel value. Unlike using a full kinase panel, which is both time and cost restrictive, a focused MKP is amenable to the triaging of all early-stage compounds. In this way, promiscuous compounds are filtered out early on, leaving the most selective compounds for lead optimization.


INTRODUCTION

An important standard for a kinase inhibitor’s progression to drug candidacy is its selectivity for the kinase target of interest relative to the rest of the kinome. Ideally, all early leads would be tested in a full kinase panel to determine their kinase selectivity profile. In this way, one would then determine the number of kinases hit (per some activity cutoff) by each of these compounds. However, profiling every early lead is a costly, time-consuming endeavor, and therefore, one needs to be mindful of the compounds chosen. Often, reduction to practice means the testing of a minimal set of lead compounds against a much smaller number of kinases. The individual kinases chosen to constitute such a mini kinase panel (MKP) vary depending on the primary kinase target and therapeutic area bias. The major goal of this work was to determine a highly reliable MKP for selectivity determination that was: 1) cost effective; 2) time efficient; 3) amenable to screening of all/most of one’s early leads; and 4) unbiasedly/rationally chosen. Clearly, such a MKP would better enable the identification of those compounds that should move forward for lead optimization.

Indeed, the concept of using a MKP to evaluate kinase selectivity has been presented in the literature. Most notable is the work of Brandt et al.¹, which was motivated (in part) by an earlier contention that “… small assay panels do not provide a robust measure of selectivity.”² From their analyses, they concluded that a rationally designed MKP could be used to estimate the degree of promiscuity. Nonetheless, they had access to only small data sets. Moreover, attempts, with varying rates of success, have been made to devise models to correctly predict the actual kinase profile of a compound, and in this way, allow the determination of the selectivity.³,⁴,⁵ Here, we interrogate our extensive data set using a Monte Carlo approach both to understand the uses and limitations of MKPs in general, and to identify an optimal (or focused) MKP.

As part of our internal kinase efforts, we have generated an extensive amount of kinase profiling data from the DiscoverX KINOMEscan²,⁶ on a core set of 3368 Janssen kinase inhibitors spanning some 40 therapeutic projects; this data set afforded us the opportunity for some detailed analysis. Our initial goal was to understand the effect of the kinase panel size in determining a compound’s selectivity, regardless of the actual kinases therein. To this end, we ran Monte Carlo simulations to generate MKPs over a range of sizes and of varying kinase membership. By defining the kinase hit index (KHI), we were able to quantify a compound’s selectivity based on the number of kinases it hits. Therefore, for a given MKP size, comparison of the KHI obtained for each compound in each of the MKPs (generated from the Monte Carlo simulations) to that of the full KHI (based on the kinase profiling data) allowed for the determination of the average error one incurs from using the smaller sized MKP. Repeating this procedure over a range of MKP sizes allowed us to assess how this average error varies as a function of the MKP size.

From this analysis, several important aspects were revealed. First, we find that most of the improvement one obtains from using a larger kinase panel quickly tapers off at a MKP size of approximately 50 kinases. In other words, while one does gain improvement in the average error with increasing MKP size (as expected), the rate of improvement past a MKP size of 50 is substantially less. Therefore, we have established a reliable upper bound for a MKP size.


Secondly, we find that one can reduce the average error by choosing particular combinations of kinases to comprise a focused MKP rather than making this selection at random. In particular, we find that with such a focused MKP, we are able to obtain essentially the same average error with a MKP size of 20 as with MKP sizes between 50 – 100 kinases, where the latter were chosen randomly; this is a 45.1% improvement in the average error compared to a random selection at the same MKP size of 20 kinases. Finally, we obtain a correlation between R² = 0.786 – 0.826 for the KHI calculated with this focused MKP versus the full kinase panel. In this way, we have established a reliable lower bound, upper bound, and an optimal range for the MKP size. Furthermore, we have shown that using a focused MKP, rather than choosing one at random, substantially lowers the average error and results in a good correlation with the full kinase panel result for the KHI. This clearly shows that by using such a focused MKP, one can reliably assess a compound’s selectivity; a full kinase panel is not needed to assess a compound’s kinase selectivity (at least at the early stages of a therapeutic project), although we recommend using a full kinase panel at the later stages.

Herein, we will begin by briefly discussing the Janssen kinase inhibitor set (more details have been given elsewhere⁷) used in our analysis. The criteria for determining a hit will be introduced, from which the definition of the KHI will naturally follow. An overview of the Monte Carlo simulations used to assess the effect of size and kinase membership of a MKP will be described. The strategy for identifying the focused MKP will then be overviewed. Finally, we will summarize our results and draw conclusions.


MATERIALS AND METHODS

Janssen Kinase Inhibitor Test Set

Some 40 Janssen kinase projects over the past decade have resulted in a rich source of chemical assets for these therapeutic targets. Indeed, approximately 77000 Janssen compounds (the “77K set”) include a kinase annotation. These compounds have a variety of different origins: validated high-throughput screening and medium-throughput screening; kinase-focused combinatorial library design; lead optimization efforts; and synthesis based on in silico modeling predictions. We selected 3368 compounds (the “3K set”) from the diverse set of chemotypes to profile in the DiscoverX KINOMEscan assay. Prior to this selection, compounds were chosen for profiling primarily based on the hit-to-lead and lead optimization needs of therapeutic projects. However, the current selection was based on chemical and biological filtering. In this way, approximately 2600 compounds were selected by therapeutic project chemists, approximately 400 compounds were selected from a Janssen screening library of diverse bioactivity, and approximately 360 compounds were selected from the Janssen inventory based on their activity cliffs⁸.

Primary displacement efficacy (DE) profiling at 1 µM compound concentration was done over a DiscoverX KINOMEscan panel of 456 kinases (which included 394 human wild type kinases). In this assay²,⁶, test compounds compete with an immobilized reference ligand for binding to the kinase active site. Hits are determined by measuring the amount of kinase captured in test versus control samples via a quantitative PCR method able to detect the associated DNA label. Compounds for dose response (KD) validation were selected using multiple criteria.


Overall, the 3K set provides good representation of the larger Janssen kinase chemical space within the expected limitations, and therefore, we decided this would be the best test set to use in our analysis. After a final curation of the data, the test set was trimmed down to exactly 3000 compounds with coverage over 414 kinases.

In order to perform our analysis, it was necessary to quantify what constituted a hit at a given kinase. Heuristically, we imposed a hit cutoff of pKD ≥ 7 or DE ≥ 85% (at 1 µM) as it would best leverage both the single-point and dose-response data available. Moreover, a measure of selectivity for a given compound was also required. Although a variety of measures have been reported,²,⁹⁻¹⁶ from our imposed hit cutoff criteria a natural measure was simply the ratio given by the number of kinases hit divided by the total number of kinases tested; this then defined our kinase hit index (KHI). In Figure 1, we show the KHI (selectivity) distribution of the 3000 compounds used in the test set. Overall, we see there is representative coverage with more emphasis on the selective kinase inhibitor space. (Note that as a result of our hit cutoff definition, 546 of the 3000 compounds (18.2%) end up not being hits at any of the kinases in the panel of 414 and are not shown.)
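The hit cutoff and the KHI definition above can be illustrated with a minimal Python sketch. The record layout, the function names, and the toy values are our own assumptions for illustration; in particular, the text does not say how a compound with both a pKD and a DE value is handled, so here either measurement meeting its cutoff counts as a hit.

```python
from typing import Mapping, Optional, Tuple

# Hit cutoff quoted in the text: pKD >= 7 or DE >= 85% (at 1 uM).
PKD_CUTOFF = 7.0
DE_CUTOFF = 85.0

# Hypothetical record per kinase: (pKD or None, DE in % or None).
Measurement = Tuple[Optional[float], Optional[float]]


def is_hit(pkd: Optional[float], de: Optional[float]) -> bool:
    """Return True if either available measurement meets its cutoff (assumed rule)."""
    if pkd is not None and pkd >= PKD_CUTOFF:
        return True
    return de is not None and de >= DE_CUTOFF


def kinase_hit_index(profile: Mapping[str, Measurement]) -> float:
    """KHI = number of kinases hit / total number of kinases tested."""
    if not profile:
        raise ValueError("profile must contain at least one kinase")
    hits = sum(is_hit(pkd, de) for pkd, de in profile.values())
    return hits / len(profile)


# Toy profile with made-up values, for illustration only.
example = {
    "KINASE_A": (7.4, 92.0),   # hit by both criteria
    "KINASE_B": (None, 88.0),  # hit by DE only
    "KINASE_C": (6.1, 60.0),   # not a hit
    "KINASE_D": (None, 10.0),  # not a hit
}
print(kinase_hit_index(example))  # 0.5
```

A KHI near 0 then indicates a selective compound and a KHI near 1 a promiscuous one, which is the sense in which the index is used in the rest of the analysis.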


Figure 1. The kinase hit index (KHI) distribution for the test set. For display purposes, the compounds (18.2%) not hitting any of the 414 kinases in the full kinase panel are not shown.

Monte Carlo Simulations

As noted earlier, the actual design of a MKP is very subjective. To better understand the resulting implications, we constructed MKPs over a range of sizes and of varying kinase membership using Monte Carlo simulations. As part of this process, first, a file containing each of the 3K compounds with each of the kinases they hit (from the total set of 414 kinases) at the cutoff of pIC50 = 7 was created; this can be thought of as the master kinase–compound hit map. Now, for the given MKP size of interest, 100000 MKPs were randomly generated. Next, each kinase for a given MKP was joined against the master kinase–compound hit map to create the hit submap for this MKP. From this, one then calculates the KHI for each of the 3K compounds based on the chosen MKP. Note that if a given compound is very selective, such that none of the kinases it hits are part of a given MKP, the join will not pull this compound from the master kinase–compound hit map. In other words, the given compound does not hit any kinases in the MKP for the hit cutoff of pIC50 = 7, which results in a calculated KHI of zero based on the given MKP. In this way, the compound is “flagged” as more selective relative to the other compounds in the set.

At every MKP size and for each MKP generated therein, the KHI was calculated for a given compound. Thereafter, the standard deviation of these values against the full KHI (based on the kinase profile data) was calculated. In this way, the error of using a MKP of that particular size was determined for that compound. Clearly, one wishes to know in general the overall error to be expected when testing a set of representative kinase inhibitors. To this end, we also calculated the average standard deviation (⟨SD⟩) over all 3000 compounds in the test set; this then defined our average error and optimization parameter. In other words, mathematically we define the error of using a MKP of given size for the ith compound of the total number of M compounds as:

$$\mathrm{Error}_i \overset{\mathrm{def}}{=} \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(\mathrm{KHI}_{ij} - \mathrm{Full\ KHI}_i\right)^2} \qquad (1)$$

where the index j sums over each Monte Carlo simulation with the total number being given by N. With this, the average error is defined as:

$$\langle SD \rangle \overset{\mathrm{def}}{=} \text{Average Error} = \frac{1}{M}\sum_{i=1}^{M}\mathrm{Error}_i \qquad (2)$$


where once again, M is the total number of compounds, which is 3000 in our case. We then multiply the average error by the total number of kinases in the full kinase panel (414 kinases), which then sets the units.

Our initial efforts focused on covering the full range of MKP sizes. The chosen panel sizes were: 1, 5, 10, 12, 20, 30, 40, 43, 50, 60, 70, 80, 90, 100, 200, 300, and 414. We generated 100000 MKPs at each of these 17 panel sizes. Here, we define the total number of simulations to be the number of MKPs generated (trials) multiplied by the total number of panel sizes. This gives 1.7 million simulations in all. We note that while 100000 trials is substantial, it is clear that the kinase trial space is extensive (for example, there are ~10³³ total trials at a MKP size of 20), and therefore, one needs to have a sense of how well the average error is converging as a function of the trial size. Using 4 trial sizes (1000, 10000, 25000, 100000) at a MKP size of 20, we made an initial assessment of the convergence. Going from smallest to largest trial size, we find the average error (now to 3 decimal places, in number of kinases) to be 14.292, 14.222, 14.214, and 14.228, respectively. Clearly, we see excellent convergence (even between 10000 and 25000 trials), thus assuring us of the result at 100000 trials. (We have also checked the convergence for most of the other MKP sizes, and we find this trend to hold in general.)
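The Monte Carlo procedure and the error definitions above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under our own assumptions, not the authors' implementation: the master hit map is represented as a boolean compound-by-kinase matrix, the per-compound error is taken as the root-mean-square deviation from the full KHI over N random panels, and the function name `average_error` and the toy data are hypothetical.

```python
import math

import numpy as np


def average_error(hit_matrix: np.ndarray, panel_size: int, n_trials: int,
                  seed: int = 0) -> float:
    """Estimate the average error (in kinases) for random panels of one size.

    hit_matrix: boolean array of shape (M compounds, K kinases); True = hit.
    """
    rng = np.random.default_rng(seed)
    n_compounds, n_kinases = hit_matrix.shape
    full_khi = hit_matrix.mean(axis=1)            # full-panel KHI per compound
    sq_dev = np.zeros(n_compounds)
    for _ in range(n_trials):
        # One random MKP: panel_size distinct kinase columns.
        panel = rng.choice(n_kinases, size=panel_size, replace=False)
        mkp_khi = hit_matrix[:, panel].mean(axis=1)
        sq_dev += (mkp_khi - full_khi) ** 2
    error_i = np.sqrt(sq_dev / n_trials)          # per-compound error
    return float(error_i.mean() * n_kinases)      # average, scaled to kinases


# Synthetic hit map for illustration only (not the Janssen data).
rng = np.random.default_rng(1)
toy_hits = rng.random((50, 40)) < 0.1
err_small = average_error(toy_hits, panel_size=5, n_trials=200)
err_large = average_error(toy_hits, panel_size=20, n_trials=200)
print(err_small, err_large)

# Number of distinct 20-kinase panels from 414 kinases (order of 10**33).
n_panels = math.comb(414, 20)
print(f"{n_panels:.2e}")
```

Calling `average_error` with increasing `n_trials` at a fixed panel size gives a simple convergence check of the kind described above, and using the full panel size returns an error of zero, as in the last row of Table 1.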

RESULTS AND DISCUSSION

Average Error Versus Mini Kinase Panel Size

In Figure 2, we show the average error versus the MKP size; Table 1 gives the numerical results. From this plot, it is clear that (as expected) one does see an improvement in the average error as the size of the MKP is increased. However, this improvement quickly drops off, and most of it has been realized by a MKP size of approximately 50 kinases. Thus, we propose 50 kinases as the maximum size for a MKP.

Figure 2. The plot of the average error (⟨SD⟩) versus 17 mini kinase panel (MKP) sizes (1, 5, 10, 12, 20, 30, 40, 43, 50, 60, 70, 80, 90, 100, 200, 300, and 414). Each data point represents 100000 trials, resulting in a total of 1.7 million simulations. Most of the improvement from increasing the MKP size occurs by 50 kinases.


Table 1. The average error (⟨SD⟩) for 17 MKP sizes (1, 5, 10, 12, 20, 30, 40, 43, 50, 60, 70, 80, 90, 100, 200, 300, and 414). Each data point represents 100000 trials, resulting in a total of 1.7 million simulations.

| Mini Kinase Panel Size | ⟨SD⟩ (Kinases) |
|---|---|
| 1 | 65.1 |
| 5 | 29.0 |
| 10 | 20.4 |
| 12 | 18.5 |
| 20 | 14.2 |
| 30 | 11.5 |
| 40 | 9.8 |
| 43 | 9.4 |
| 50 | 8.6 |
| 60 | 7.8 |
| 70 | 7.1 |
| 80 | 6.5 |
| 90 | 6.1 |
| 100 | 5.7 |
| 200 | 3.3 |
| 300 | 2.0 |
| 414 | 0.0 |
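The tapering of the improvement can be made concrete by computing, from the Table 1 values, the reduction in average error per added kinase between consecutive panel sizes. The short Python sketch below is our own illustration; the variable names are hypothetical and the inputs are the rounded table values.

```python
# Values transcribed from Table 1: MKP size and average error (in kinases).
sizes = [1, 5, 10, 12, 20, 30, 40, 43, 50, 60, 70, 80, 90, 100, 200, 300, 414]
avg_error = [65.1, 29.0, 20.4, 18.5, 14.2, 11.5, 9.8, 9.4, 8.6,
             7.8, 7.1, 6.5, 6.1, 5.7, 3.3, 2.0, 0.0]

rows = list(zip(sizes, avg_error))

# Error reduction per added kinase between consecutive panel sizes.
marginal_gain = [
    (s0, s1, (e0 - e1) / (s1 - s0))
    for (s0, e0), (s1, e1) in zip(rows, rows[1:])
]

for s0, s1, gain in marginal_gain:
    print(f"{s0:>3} -> {s1:>3}: {gain:.3f} kinases of error per added kinase")
```

On these rounded values, the gain is about 9 kinases of error per added kinase when going from a panel size of 1 to 5, and it stays below 0.1 for every step that starts at a panel size of 50 or larger.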

Focused Mini Kinase Panel Determination

From our efforts, we obtained a thorough understanding of the variation in average error with the MKP size when choosing the kinase members at random. We now wanted to know if we could substantially improve the average error by using a focused MKP. To answer this, we calculated the average error for each of the 100000 MKPs generated at a size of 20. We then identified the MKP giving the smallest average error.
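The selection step described above amounts to scoring every generated panel and keeping the one with the smallest average error. The Python sketch below illustrates this under stated assumptions: for a single panel we take the per-compound error to be the absolute difference between the panel KHI and the full KHI, averaged over compounds and scaled by the full panel size. The function names and the synthetic data are hypothetical and are not the authors' code.

```python
import numpy as np


def panel_errors(hit_matrix: np.ndarray, panels: list) -> np.ndarray:
    """Average error (in kinases) of each candidate panel.

    Assumption: for one panel, the per-compound error is
    |KHI_panel - KHI_full|, which is then averaged over all compounds.
    """
    n_kinases = hit_matrix.shape[1]
    full_khi = hit_matrix.mean(axis=1)
    errors = []
    for panel in panels:
        panel_khi = hit_matrix[:, panel].mean(axis=1)
        errors.append(np.abs(panel_khi - full_khi).mean() * n_kinases)
    return np.array(errors)


def select_focused_panel(hit_matrix: np.ndarray, panels: list):
    """Return (index, error) of the panel with the smallest average error."""
    errors = panel_errors(hit_matrix, panels)
    best = int(np.argmin(errors))
    return best, float(errors[best])


# Synthetic data for illustration only (not the Janssen set).
rng = np.random.default_rng(0)
hits = rng.random((200, 60)) < 0.15
candidates = [rng.choice(60, size=20, replace=False) for _ in range(500)]
best_index, best_error = select_focused_panel(hits, candidates)
print(best_index, round(best_error, 2))
```

Because the best panel is simply the minimum over the sampled trials, it is the best of the panels generated, not necessarily the best of all possible panels.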


The average error obtained with this focused MKP was 7.8 kinases, whereas that determined from random selection was 14.2 kinases; this is a 45.1% improvement. To put this improved average error in perspective, we note (once again referring to Table 1) that, for a random selection of 100000 trials, the average error was 8.6 kinases at a MKP size of 50 and 5.7 kinases at a MKP size of 100. As noted earlier, most of the improvement in average error occurs around a MKP size of 50. Therefore, as the average error for the focused MKP at a size of 20 falls within the range of average errors obtained for the MKP sizes between 50 – 100, we considered it to be “highly optimized.” This becomes even clearer when considering the focused MKP at a size of 12. Here, we found an enrichment of 46.1%, a similar value to that obtained for the focused MKP at a size of 20. However, the average error increased to 10.0 kinases, and is therefore less desirable as it falls outside the “highly optimized” range. Thus, we propose 20 kinases as the minimum size for a MKP.

An artifact of the small size of the focused MKP is a strong “binning effect” of the calculated KHI at increments of 1/20 = 0.05. In other words, there is an inherent uncertainty in the exact precision. Nonetheless, we do find a good correlation of R² = 0.786 between the KHI obtained with the full kinase panel of 414 kinases and that calculated from the focused MKP. To further explore this uncertainty in precision, we sliced a given “bin” into 9 slices, and then randomly, but symmetrically, displaced all points within the bin. In Figure 3, we show this correlation. Using this approach, a modest improvement in the correlation is obtained with R² = 0.826; therefore, we estimate our overall correlation to be within the range R² = 0.786 – 0.826.
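The arithmetic behind the quoted improvement, and one possible reading of the bin-slicing procedure, can be sketched in Python as follows. Several points are assumptions on our part: R² is taken as the squared Pearson correlation (the text does not state how it was computed), and the displacement is implemented by moving each binned KHI value to the center of one of 9 equal slices of its bin, chosen at random. The function names are hypothetical.

```python
import numpy as np

PANEL_SIZE = 20
BIN_WIDTH = 1.0 / PANEL_SIZE  # KHI from a 20-kinase panel moves in steps of 0.05


def percent_improvement(random_error: float, focused_error: float) -> float:
    """Relative reduction in average error versus random selection, in %."""
    return 100.0 * (random_error - focused_error) / random_error


def r_squared(x, y) -> float:
    """Squared Pearson correlation (assumed definition of R^2)."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r * r)


def jitter_within_bin(khi, n_slices: int = 9, seed: int = 0) -> np.ndarray:
    """Displace binned KHI values symmetrically within their bin.

    One possible reading of the text: each value moves to the center of one
    of n_slices equal slices of its bin, so offsets are symmetric about zero.
    """
    rng = np.random.default_rng(seed)
    khi = np.asarray(khi, dtype=float)
    offsets = (np.arange(n_slices) - (n_slices - 1) / 2) * (BIN_WIDTH / n_slices)
    return khi + rng.choice(offsets, size=khi.shape)


print(round(percent_improvement(14.2, 7.8), 1))   # 45.1
print(round(percent_improvement(18.5, 10.0), 1))  # 45.9
```

The second value, computed from the rounded errors at a panel size of 12 (18.5 and 10.0 kinases), comes out at about 45.9% rather than the reported 46.1%; the difference presumably reflects the use of unrounded errors in the original calculation.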


Figure 3. The correlation (R² = 0.826) between the full kinase hit index (KHI) and that determined using the focused mini kinase panel (MKP).


In Figure 4, we show the focused MKP versus the full kinase panel distributions on the kinome tree drawn using the DiscoverX kinome (TREEspot) viewer (http://treespot.discoverx.com/TREEspot.aspx), and in Table 2 we list the individual kinases. Although, in general, the kinases of the focused MKP distribute themselves over most of the kinase families, we see that some families have more representation than others, while others have none. Most likely such a distribution would not have been anticipated a priori without the aid of the Monte Carlo simulations and the use of the average error as the optimization parameter.


Figure 4. The focused mini kinase panel (MKP) (left) versus the full kinase panel (right) distributions as they appear on the kinome tree drawn using the DiscoverX kinome (TREEspot) viewer.

Table 2. The kinases comprising the focused mini kinase panel (MKP). The kinase family classification is via the DiscoverX kinome (TREEspot) viewer.

| Kinase Target | Kinase Family |
|---|---|
| GRK1 | AGC |
| PRKCQ | AGC |
| RIOK1 | Atypical |
| CASK | CAMK |
| CHEK2 | CAMK |
| CDC2L5 | CMGC |
| CDKL1 | CMGC |
| HIPK3 | CMGC |
| ABL1(Q252H)-phosphorylated | Mutant |
| EGFR(L747-T751del,Sins) | Mutant |
| CAMKK2 | Other |
| IKK-epsilon | Other |
| STK35 | Other |
| TLK1 | Other |
| PAK1 | STE |
| BRK | TK |
| PDGFRB | TK |
| PYK2 | TK |
| ALK | TK/Mutant |
| BMPR1A | TKL |
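The uneven family representation noted in the text can be tallied directly from Table 2. The short Python sketch below is only a convenience for the reader; the dictionary is transcribed from the table and the variable names are our own.

```python
from collections import Counter

# Focused MKP members and their families, transcribed from Table 2.
focused_mkp = {
    "GRK1": "AGC", "PRKCQ": "AGC", "RIOK1": "Atypical",
    "CASK": "CAMK", "CHEK2": "CAMK",
    "CDC2L5": "CMGC", "CDKL1": "CMGC", "HIPK3": "CMGC",
    "ABL1(Q252H)-phosphorylated": "Mutant",
    "EGFR(L747-T751del,Sins)": "Mutant",
    "CAMKK2": "Other", "IKK-epsilon": "Other", "STK35": "Other", "TLK1": "Other",
    "PAK1": "STE", "BRK": "TK", "PDGFRB": "TK", "PYK2": "TK",
    "ALK": "TK/Mutant", "BMPR1A": "TKL",
}

family_counts = Counter(focused_mkp.values())
for family, count in family_counts.most_common():
    print(f"{family:<10} {count}")
```

The tally shows four kinases in the Other family, three each in CMGC and TK, and single members for Atypical, STE, TK/Mutant, and TKL.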

While the focused MKP does give the smallest average error for the 100000 trials performed, there were other MKPs that gave an average error very close in value. In particular, we find that the second best MKP (with an average error of 8.0) gives R² = 0.772 – 0.819, in near coincidence with the focused MKP range. Moreover, we note that for the top 100 MKPs, the range in average error was 7.8 – 8.4, while the overall range (all 100000) was 7.8 – 37.6. In Figure 5, we show the distribution of average errors over all 100000 MKPs.

Figure 5. The distribution of average errors over all 100000 mini kinase panels (MKPs) generated to determine the focused mini kinase panel (MKP).

Finally, we point out that the overlap in kinase members between the second best and focused MKPs consists of only two kinases. Thus, we postulate that the focused MKP is not terminally unique, but rather sits as the best MKP within a cluster of other MKPs of similar average error. (See the supporting information for more details on the kinase and kinase family diversity of the focused set and its nearest neighbors.)

In Figure 6, we see the selectivity distribution of the full kinase panel versus the focused MKP (only the compound hits are shown; the compounds with a KHI of zero have been removed). Here we see that the focused MKP represents the fuller selectivity space very well. The main difference between them occurs in the more selective (lower KHI) region, where we find that the focused MKP simply has less coverage by comparison to the full kinase panel. However, as we tend towards the less selective (higher KHI) region, we find that the coverage between the distributions becomes identical.

Figure 6. The selectivity distribution of the full kinase panel versus the focused mini kinase panel (MKP) for the compound hits only; compounds with a zero KHI have been removed.


We can make sense of this by considering that for a “perfectly promiscuous” compound the MKP size and membership are not important, as its KHI will always be 1. At the other extreme, a “perfectly selective” compound will hit in a given MKP only if the MKP actually contains that particular kinase, where its KHI will then scale as 1 over the number of kinases in the MKP. Therefore, the real challenge of a viable MKP is to accurately represent the range within these extremes where the majority of compounds reside. To quantitatively understand this, one calculates the “ratio of distributions” by taking a MKP selectivity distribution and dividing it by the full 414 kinase panel distribution (e.g., Figure 6). This ratio (multiplied by 100) then represents the % confidence of the MKP over the selectivity of the test set. In the supporting information (see Figure S1), we have done this for the best sets and their top 20 nearest neighbors (based on the average error) for the MKPs of panel size 20, 50, and 100. The results shown represent the average value for a given MKP size along with the one-sigma error bars.
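The “ratio of distributions” can be sketched as a per-bin ratio of two histograms. The Python example below is a minimal illustration under our own assumptions: both distributions are taken as compound counts per KHI bin over the same bin edges, compounds with a KHI of zero are assumed to have been removed beforehand, and bins that are empty in the full-panel distribution are reported as NaN. The function name, the bin edges, and the toy values are hypothetical.

```python
import numpy as np


def ratio_of_distributions(mkp_khi, full_khi, bin_edges) -> np.ndarray:
    """Per-bin percentage: 100 * (MKP count) / (full-panel count)."""
    mkp_counts, _ = np.histogram(mkp_khi, bins=bin_edges)
    full_counts, _ = np.histogram(full_khi, bins=bin_edges)
    ratio = np.full(full_counts.shape, np.nan)
    nonzero = full_counts > 0
    ratio[nonzero] = 100.0 * mkp_counts[nonzero] / full_counts[nonzero]
    return ratio


# Toy KHI values, for illustration only; zero-KHI compounds already removed.
full_khi = [0.02, 0.03, 0.04, 0.30, 0.35, 0.90]
mkp_khi = [0.05, 0.30, 0.35, 0.90]   # two selective compounds dropped out
edges = [0.0, 0.1, 0.5, 1.0]

print(ratio_of_distributions(mkp_khi, full_khi, edges))
```

In this toy case the ratio is about 33% in the most selective bin and 100% in the other two, mirroring the qualitative picture described above, in which the MKP loses coverage mainly among the most selective compounds.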

CONCLUSIONS

We analyzed a set of 3000 Janssen kinase inhibitors profiled at 414 kinases in the DiscoverX KINOMEscan to examine the necessity of using a full kinase panel (as opposed to using a smaller sized mini kinase panel (MKP)) to assess a compound’s selectivity. We quantified a compound’s selectivity by defining a hit cutoff (pKD ≥ 7 or DE ≥ 85% (at 1 µM)) and a corresponding kinase hit index (KHI) (the ratio of kinases hit divided by the total number of kinases tested). Monte Carlo simulations were then performed to randomly generate a series of MKPs over a range of sizes and of varying kinase membership. A key quantity defined by us, and that served as our optimization parameter, was the average error. The average error of using a smaller sized MKP compared with using the full kinase panel (here, 414 kinases) was evaluated over the course of 1.7 million Monte Carlo simulations.

Primarily, we found that the average error in the KHI depends on the MKP size. As anticipated, an improvement in the average error is seen as the size of the MKP is increased. However, we also find that this improvement quickly drops off around a MKP size of 50 kinases. Thus, we propose 50 kinases as the maximum size for a MKP. Notably, one can reduce the average error by using particular combinations of kinases. With such a focused MKP of 20 kinases, we obtained an average error comparable to that of using a MKP size of 50 – 100 kinases (chosen at random). Further, the KHI obtained with the focused MKP is in excellent agreement with that of the full kinase panel (R² = 0.786 – 0.826). We consider our focused MKP to be “highly optimized,” and we propose 20 kinases as the minimum size for a MKP.

Once again, this recommendation is made with the intention of using the focused MKP to primarily filter out the more promiscuous compounds. Compounds that are very selective are more dependent on the kinase membership since they hit a much smaller number of kinases. If some or all of the kinases they hit are not present, a reliable value of the KHI will be difficult to calculate. However, it is worth noting that in the latter case, the KHI of the compound will be zero, which then “flags” it as a more selective compound relative to the others in the test set. If one is interested in determining the KHI more reliably for selective compounds and using the KHI to filter out promiscuous compounds, we recommend going to a bigger focused MKP. (In the supporting information, we show the selectivity coverage of MKPs at panel sizes of 20, 50, and 100, and the resulting enrichment.)


We found other MKPs that gave an average error very close in value to that of the focused MKP. The second best MKP has an average error of 8.0 (and gives R² = 0.772–0.819, which is close to the correlation range obtained with the focused MKP). Moreover, the top 100 MKPs had an average error range of 7.8–8.4. Interestingly, we found that the second best and focused MKPs share only two kinase members. Thus, we conclude that the focused MKP is not unique. In other words, the focused MKP (the one with the smallest average error) is simply part of a cluster of MKPs with similar average error, yet varying kinase membership. (In the supporting information, we calculate the kinase and kinase family diversity of the top 20 MKPs at a panel size of 20.) Although we have not yet explored this, it is reasonable to expect that even where the overlap between individual kinase members is low among MKPs of similar average error, there may be good overlap when comparing the similarities of the active sites of their kinase members. Such an analysis could provide a physical foundation for our results. Clearly, our study illustrates that one does not need a full kinase panel to reasonably assess a compound’s selectivity. Moreover, at the early stages of a therapeutic project, the use of a MKP can be very efficient and result in both cost and time savings. Nonetheless, at Janssen, we still profile later-stage compounds in a full kinase panel.
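The membership comparison described above can be sketched as a simple set calculation. The panels and kinase labels below are hypothetical placeholders, not the MKPs from this work, and the Jaccard similarity is one possible overlap measure that we did not necessarily use.

```python
from itertools import combinations

def shared_members(panel_a, panel_b):
    """Kinases present in both panels."""
    return set(panel_a) & set(panel_b)

def jaccard(panel_a, panel_b):
    """Jaccard similarity of two panels (0 = disjoint, 1 = identical)."""
    a, b = set(panel_a), set(panel_b)
    return len(a & b) / len(a | b) if a | b else 1.0

# Hypothetical panels with placeholder kinase labels (not the paper's MKPs).
panels = {
    "focused": {"K01", "K02", "K03", "K04", "K05"},
    "second_best": {"K04", "K05", "K06", "K07", "K08"},
    "third": {"K01", "K06", "K09", "K10", "K11"},
}

# Report the shared kinases and similarity for every pair of panels.
for (name_a, a), (name_b, b) in combinations(panels.items(), 2):
    print(name_a, name_b, sorted(shared_members(a, b)), round(jaccard(a, b), 3))
```

A low similarity between panels of comparable average error is consistent with a cluster of near-equivalent MKPs rather than a single best panel.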


AUTHOR INFORMATION
Corresponding Author
*Phone: +1-858-320-3375. E-mail: [email protected]
ORCID
Scott D. Bembenek: 0000-0003-1756-7188
Notes
The authors declare no competing financial interest. The kinome trees for the TOC graphic were made using KinMap (http://www.kinhub.org/kinmap/).


REFERENCES

1. Brandt, P.; Jensen, A.J.; Nilsson, J. Small Kinase Assay Panels Can Provide a Measure of Selectivity. Bioorg. Med. Chem. Lett. 2009, 19, 5861–5863.
2. Karaman, M.W.; Herrgard, S.; Treiber, D.K.; Gallant, P.; Atteridge, C.E.; Campbell, B.T.; Chan, K.W.; Ciceri, P.; Davis, M.I.; Edeen, P.T.; Faraoni, R.; Floyd, M.; Hunt, J.P.; Lockhart, D.J.; Milanov, Z.V.; Morrison, M.J.; Pallares, G.; Patel, H.K.; Pritchard, S.; Wodicka, L.M.; Zarrinkar, P.P. A Quantitative Analysis of Kinase Inhibitor Selectivity. Nat. Biotechnol. 2008, 26, 127–132.
3. Martin, E.; Mukherjee, P.; Sullivan, D.; Jansen, J. Profile-QSAR: A Novel Meta-QSAR Method That Combines Activities Across the Kinase Family to Accurately Predict Affinity, Selectivity, and Cellular Activity. J. Chem. Inf. Model. 2011, 51, 1942–1956.
4. Bora, A.; Avram, S.; Ciucanu, I.; Raica, M.; Avram, S. Predictive Models for Fast and Effective Profiling of Kinase Inhibitors. J. Chem. Inf. Model. 2016, 56, 895–905.
5. Merget, B.; Turk, S.; Eid, S.; Rippmann, F.; Fulle, S. Profiling Prediction of Kinase Inhibitors: Toward the Virtual Assay. J. Med. Chem. 2017, 60, 474–485.
6. Davis, M.; Hunt, J.P.; Herrgard, S.; Ciceri, P.; Wodicka, L.M.; Pallares, G.; Hocker, M.; Treiber, D.K.; Zarrinkar, P.P. Comprehensive Analysis of Kinase Inhibitor Selectivity. Nat. Biotechnol. 2011, 29, 1046–1051.
7. Jacoby, E.; Tresadern, G.; Bembenek, S.; Wroblowski, B.; Buyck, C.; Neefs, J.; Rassokhin, D.; Poncelet, A.; Hunt, J.; van Vlijmen, H. Extending Kinome Coverage by Analysis of Kinase Inhibitor Broad Profiling Data. Drug Discov. Today 2015, 20, 652–658.
8. Maggiora, G.M. On Outliers and Activity Cliffs – Why QSAR Often Disappoints. J. Chem. Inf. Model. 2006, 46, 1535–1535.
9. Graczyk, P.P. Gini coefficient: A New Way to Express Selectivity of Kinase Inhibitors Against a Family of Kinases. J. Med. Chem. 2007, 50, 5773–5779.
10. Sciabola, S.; Stanton, R.V.; Wittkopp, S.; Wildman, S.; Moshinsky, D.; Potluri, S.; Xi, H. Predicting Kinase Selectivity Profiles Using Free-Wilson QSAR Analysis. J. Chem. Inf. Model. 2008, 48, 1851–1867.
11. Caffrey, D.R.; Lunney, E.A.; Moshinsky, D.J. Prediction of Specificity-Determining Residues for Small-Molecule Kinase Inhibitors. BMC Bioinformatics 2008, 9, 491–505.
12. Smyth, L.A.; Collins, I. Measuring and Interpreting the Selectivity of Protein Kinase Inhibitors. J. Chem. Biol. 2009, 2, 131–151.
13. Cheng, A.C.; Eksterowicz, J.; Geuns-Meyer, S.; Sun, Y. Analysis of Kinase Inhibitor Selectivity Using a Thermodynamics-Based Partition Index. J. Med. Chem. 2010, 53, 4502–4510.
14. Anastassiadis, T.; Deacon, S.W.; Devarajan, K.; Ma, H.; Peterson, J.R. Comprehensive Assay of Kinase Catalytic Activity Reveals Features of Kinase Inhibitor Selectivity. Nat. Biotechnol. 2011, 29, 1039–1045.
15. Posy, S.L.; Hermsmeier, M.A.; Vaccaro, W.; Ott, K.; Todderud, G.; Lippy, J.S.; Trainor, G.L.; Loughney, D.A.; Johnson, S.R. Trends in Kinase Selectivity: Insights for Target Class-Focused Library Screening. J. Med. Chem. 2011, 54, 54–66.
16. Uitdehaag, J.C.M.; Zaman, G.J.R. A Theoretical Entropy Score As a Single Value to Express Inhibitor Selectivity. BMC Bioinformatics 2011, 12, 94–104.


Figure 1


Figure 2


Figure 3


Figure 4


Figure 5


Figure 6


Table of Contents (TOC):


Table 1


Table 2
