Performance of Dark Chemical Matter in High Throughput Screening

Article pubs.acs.org/jmc

Ingo Muegge* and Prasenjit Mukherjee
Boehringer Ingelheim Pharmaceuticals Inc., 900 Ridgebury Road, P.O. Box 368, Ridgefield, Connecticut 06877-0368, United States

ABSTRACT: A statistical analysis of 203 high-throughput screens was conducted to study the propensity of small molecules in the Boehringer Ingelheim (BI) screening deck to show biological activity after having previously tested inactive in a growing number of screening assays. Dark chemical matter (DCM) compounds, which have been tested and found to be inactive in 50 or more assays, exhibit hit rates that are comparable to those of compounds tested in far fewer assays. Only compounds tested as inactive in 125 or more assays started showing a hit rate deterioration of up to 40% compared to compounds tested in fewer than 25 assays. The observed large number of DCM compounds in the BI screening deck is found to be in line with the expected fraction of DCM calculated based on a probability analysis. The analysis suggests not only that DCM compounds have the chance to occasionally provide valuable hits associated with higher selectivity, as recently shown by Novartis (Nat. Chem. Biol. 2015, 11, 958), but also that there is little compelling reason to exclude DCM compounds from screening decks in favor of previously untested or less tested compounds.

1. INTRODUCTION
High-throughput screening (HTS) has been a major source of drug discovery leads in the past 2 decades.1,2 During the past 10 years, in part driven by combinatorial chemistry,3 HTS compound collections have grown to more than one million compounds at many large pharmaceutical companies engaged in small molecule drug discovery4 as well as in academia.5 For a long time, HTS compound pool optimization has been influenced by distributions of physicochemical properties of small molecules associated with desirable traits such as oral bioavailability, good solubility, and cell permeability of screening hits and lead compounds.6−10 In recent years drug discovery efficiency and effectiveness have been discussion points for the pharmaceutical industry.11 Reducing the cost of HTS and associated hit triaging, as well as the desire to run a variety of assays with higher information content including phenotypic screens, has led to new HTS strategies involving a smaller number of compounds. Screening pools at pharmaceutical companies are being revisited, improving the quality of compounds using analytical methods and also eliminating redundancies in chemical space.12 Other strategies of reducing the number of compounds to be screened include the stratification of screening compounds allowing for diversity-based or druglikeness13-based selections of screening plates for subscreens.14−16

The vast chemical space of drug-size small molecules has been estimated to include ∼10^60 compounds.17 In comparison, the number of bioactive small molecules with relevance for drug discovery is very small. Therefore, the odds of adding compounds to a screening deck that will ever show bioactivity against any target of interest seem to be quite long. Nonetheless, the average success rate of HTS campaigns is about 50%.4 Still, there are a large number of HTS campaigns

where no hits can be identified, involving to a larger extent intractable targets such as protein−protein interactions.18 Although there is a significant bias in today’s screening collections toward biologically active compounds already,19 attempts have been made to increase the number of hits in HTS screens by selecting subsets of compounds with increased biodiversity (the number of unique targets modulated by all compounds in a screening deck). Petrone et al. showed that biodiverse screening sets outperform structurally diverse subsets in hit rate and number of hit scaffolds for a variety of target classes.20 One would conclude that an enrichment of a screening deck with future screening hits can therefore be reasonably achieved by increasing biodiversity. However, the goal of many screens is often to find novel chemical matter with high selectivity and optimizable properties. Therefore, for a screening deck one may not choose exclusively compounds with prior bioactivity observed. Rather, structural diversity and novelty concepts may continue to play a role in designing screening decks.

Following this thought, the question arises of how to assess the utility of compounds that have not been associated with activity against a target of interest yet. This question is, of course, addressed with every new HTS that is carried out with these compounds. However, with each screen the “darkness” of the compounds increases if they do not hit. Andrew Pope defines “dark” chemical matter (DCM) as compounds that have been screened in 50 or more HTS campaigns and did not show an effect of more than 50%.21 Dark chemical matter is being observed prominently in large screening decks of pharmaceutical companies. The Novartis deck, for instance, is reported to contain 41% of compounds

Received: July 13, 2016
Published: October 20, 2016

DOI: 10.1021/acs.jmedchem.6b01038 J. Med. Chem. 2016, 59, 9806−9813

Journal of Medicinal Chemistry

Article

found inactive previously (Table 1). Compounds found inactive in 1−24 preceding HTS campaigns were combined in bin 1 (Dark1−24) for each new HTS screen. Compounds with higher numbers of previous HTS inactivity findings were grouped into bins 2−8 (Dark25−49, Dark50−74, Dark75−99, Dark100−124, Dark125−149, Dark150−174, and Dark175−202, respectively). Previously untested compounds were treated in a separate bin. Note that as a consequence of this binning, there are fewer HTS data sets containing compounds with a higher number of previously inactive screens (see last column of Table 1). The number of participating compounds for each bin varies significantly among HTS screens. While Table 1 lists the median number of participating compounds, Supporting Information Figure S1 shows the distribution of participating compounds for bin Dark1−24 for 202 of the 203 HTS assays (the first assay contains no Dark1−24 compounds). The chronological setup of the HTS analysis allows for a prospective view on DCM hit rates taken from the perspective of each HTS campaign. This setup permits us to draw relevant conclusions on expected DCM hit rates for future screens.

2.1. Elimination of Activity Bias. Not unexpectedly, the summary of historic screening campaigns at BI (Figure 1) illustrates that some target classes such as kinases and GPCRs are represented much more prominently than others. Especially among kinases, but also among other protein target classes, ligands have a propensity for cross activity.
For instance, the marketed drug imatinib hits a number of kinases including BCR−ABL1, PDGFR, and KIT; the marketed drug dasatinib, a pan-tyrosine kinase inhibitor, hits the Src kinase family (SRC, YES, FYN, and LYN), BCR−ABL1, KIT, PDGFR, BMX, the Eph family, as well as VEGFR2 and TIE2.23 Because of the increased tendency of an active small molecule to hit other target family members, there is a bias leading to higher hit rates of previously active compounds compared to dark or untested compounds in a statistical analysis of hit rates in our data set. Figure 2 illustrates this bias. The figure shows a distribution of compounds binned by how often they hit in the 203 HTS assays (white bars). In addition, we show the distribution of activity events caused by compounds binned the same way (black bars). Activity events are calculated by multiplying the number of compounds by the number of assays they hit. For instance, each compound in bin 7 will generate seven activity events. We observe that more than 60% of compounds hit exactly once. However, they account for only slightly more than 30% of activity events. Compounds hitting in multiple assays contribute disproportionately to the total number of activity events. This bias, although associated with a smaller number of compounds, leads to a significantly higher hit rate of previously active chemical matter. Figure 3 shows how the median hit rate in the 203 HTS assays increases when compounds hitting in multiple assays are included. The hit rate of assays with an unrestricted number of activity events per compound was found to be 0.18, while the hit rate of compounds that hit in exactly one assay is more than 3 times smaller (0.05). To remove any bias on hit rates due to compounds hitting in multiple assays, we considered only compounds that are measured as active no more than once across the 203 HTS assays.
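The bookkeeping behind activity events and the single-hit filter can be illustrated with a short Python sketch. It is not the authors' code: the function names, the `hits_per_compound` mapping, and the toy values are hypothetical and serve only to show how a few promiscuous compounds can dominate the activity-event count.

```python
from collections import Counter


def activity_event_shares(hits_per_compound):
    """Return {n_hits: (fraction of active compounds, fraction of activity events)}.

    hits_per_compound maps a hypothetical compound ID to the number of
    assays in which that compound was active. Compounds with zero hits
    are ignored because they generate no activity events.
    """
    compounds_per_bin = Counter(n for n in hits_per_compound.values() if n >= 1)
    total_compounds = sum(compounds_per_bin.values())
    # A compound that hits in n assays contributes n activity events.
    events_per_bin = {n: n * count for n, count in compounds_per_bin.items()}
    total_events = sum(events_per_bin.values())
    return {
        n: (compounds_per_bin[n] / total_compounds, events_per_bin[n] / total_events)
        for n in sorted(compounds_per_bin)
    }


def active_at_most_once(hits_per_compound):
    """Keep only compounds measured as active no more than once."""
    return {cid for cid, n in hits_per_compound.items() if n <= 1}


# Toy data (made up): six single-hit compounds, two compounds hitting twice,
# and one compound hitting in seven assays.
toy = {f"c{i}": 1 for i in range(6)}
toy.update({"c6": 2, "c7": 2, "c8": 7})

shares = activity_event_shares(toy)
for n_hits, (frac_compounds, frac_events) in shares.items():
    print(n_hits, round(frac_compounds, 3), round(frac_events, 3))
```

In this toy set the single compound with seven hits contributes more activity events than the six single-hit compounds together, which is the kind of imbalance the filter is meant to remove.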
Compounds hitting in two or more assays were excluded from all analyses presented in this article.

2.2. Calculation of Expected Hit Rates of DCM as a Function of the Number of HTS Assays. The majority of compounds in BI’s screening deck are DCM compounds. To determine if this seemingly surprising fact points to a generally lower propensity of DCM to hit in an assay, we decided to compare observed and expected DCM fractions after a number of HTS screens. To this end we calculate a probability of observing a fraction of DCM as follows. Assuming that each compound i has the same chance of being active in a given HTS screen k, the probability pi of a compound to remain dark after participating in n HTS screens with the individual HTS hit rates, hk, can be expressed as

tested in 100 or more HTS campaigns without demonstrating activity against a single target.20 In fact, Petrone et al. used this finding as an argument to focus on biodiverse compound subsets for HTS.20 The questions in an age of smaller, more efficient screening decks are indeed: “Will these compounds ever hit?” and “At what point does the increased risk that a compound will not hit in the next screen outweigh the desirability of increased potential selectivity if the compound does hit eventually?” A recent report by researchers from Novartis started to shed some light on these questions. Using a series of prospective examples, Wassermann et al. showed that DCM sometimes generates potent hits with unique activity and selectivity profiles that make them interesting starting points for lead optimization.22

The work presented here goes beyond demonstrating that DCM can generate valuable hits. We present a rigorous statistical analysis of 203 high-throughput screening (HTS) campaigns at Boehringer Ingelheim (BI) conducted between 1999 and 2012, suggesting that hit rates of DCM compounds are comparable to those of compounds that have not been tested frequently before. We will also show that a comparison of hit rates against bioactive compounds at large is less useful because the hit rate of the latter is largely governed by compounds that hit in multiple assays, thereby significantly inflating assay hit rates.

2. MATERIALS AND METHODS
A total of 203 HTS data sets comprising between 406 000 and 985 000 compounds obtained at BI between 1999 and 2012 were used for the analysis. The HTS campaigns included 50 kinases, 39 GPCRs, 8 ion channels, and other target families (Figure 1). We define activity of a compound as showing IC50, EC50, KI, or KD of less than 10 μM in the relevant concentration response assay following hit confirmation. A total of 1.2 million individual compounds were included in the analysis. To unambiguously classify DCM participating in each screen, the HTS data sets were ordered chronologically starting with the oldest. Analyzing the HTS screens in this order is essential for monitoring the fate of DCM compounds as they get tested in subsequent assays, either increasing in darkness (number of screens without hitting) or leaving the DCM space by registering as a hit. As HTS run date we arbitrarily assigned the first date on which 2000 compounds were tested at a single concentration. Only compounds with registration dates prior to the respective HTS run date were included in the analyses. Concentration−response data of participating HTS compounds were only included if generated after the HTS run date, avoiding potential artifacts coming from up-front testing of known reference ligands. For each HTS campaign, participating compounds were grouped into eight bins according to the number of HTS assays in which they were tested and

Figure 1. Distribution of target classes screened in 203 HTS campaigns at BI between 1999 and 2012.

$$p_i = \prod_{k=1}^{n} (1 - h_k) \tag{1}$$
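As a worked illustration of eq 1, the following Python sketch multiplies the per-screen probabilities of not hitting. The function name and the constant hit rate used in the loop are assumptions chosen for illustration only; hit rates are treated as fractions between 0 and 1, and the numbers are not taken from the BI data set.

```python
import math


def prob_remaining_dark(hit_rates):
    """Eq 1: probability that a compound is inactive in every one of n screens.

    hit_rates is a sequence of per-screen hit rates h_k, expressed as
    fractions between 0 and 1.
    """
    for h in hit_rates:
        if not 0.0 <= h <= 1.0:
            raise ValueError("hit rates must be fractions between 0 and 1")
    return math.prod(1.0 - h for h in hit_rates)


# Hypothetical scenario: the same hit rate in every screen.
ASSUMED_HIT_RATE = 0.0005  # illustrative value, not a measured one

for n_screens in (25, 50, 100, 200):
    p_dark = prob_remaining_dark([ASSUMED_HIT_RATE] * n_screens)
    print(f"{n_screens:>3} screens: p(dark) = {p_dark:.4f}")
```

Because the product runs over individual screens, the same function also accepts a list of different per-screen hit rates, which is closer to how eq 1 is stated.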


Table 1. Median Number of Compounds and Hit Rates for DCM of Different Degrees of Darkness

| DCM bin number | DCM bin name | criteria (a) | median number of compounds | median hit rate | number of HTS screens containing DCM (b) |
|---|---|---|---|---|---|
| 1 | Dark1−24 | not active in 1−24 previous screens | 65410 | 0.051 | 202 |
| 2 | Dark25−49 | not active in 25−49 previous screens | 62162 | 0.049 | 178 |
| 3 | Dark50−74 | not active in 50−74 previous screens | 61458 | 0.048 | 153 |
| 4 | Dark75−99 | not active in 75−99 previous screens | 70893 | 0.040 | 128 |
| 5 | Dark100−124 | not active in 100−124 previous screens | 120835 | 0.048 | 103 |
| 6 | Dark125−149 | not active in 125−149 previous screens | 151397 | 0.046 | 78 |
| 7 | Dark150−174 | not active in 150−174 previous screens | 184729 | 0.042 | 53 |
| 8 | Dark175−202 | not active in 175−202 previous screens | 310701 | 0.029 | 28 |
| 9 | untested | compounds not previously tested | 6670 | 0.042 | 82 |

(a) In all cases at least 1000 compounds of a DCM bin had to be present in the screen to be included in the analysis. In addition to HTS screens without any participation of a given DCM bin, this rule affected only the “untested” bin. The somewhat arbitrary cutoff was chosen to avoid cases in which the hit rates are inflated (or deflated) due to larger statistical fluctuations associated with low numbers of participating compounds. As a result, the number of HTS screens with participation of “untested” compounds was reduced from a possible 203 to only 82.
(b) The number of HTS screens with DCM participation decreases with increasing degree of “darkness”. For instance, Dark1−24 compounds participated in all but the first screen, and Dark25−49 compounds were observed from the 26th screen onward, reducing the number of HTS screens used to analyze their hit rates by 25 (from 203 to 178). Dark175−202 compounds emerged only after 175 screens, leaving only 28 HTS data sets to be analyzed.

To assess the propensity of DCM compounds to hit in future assays, we calculate ⟨pi⟩ as a function of the number of screens they participated in. To achieve reasonable statistics in assessing the hit rates of DCM, we bin compounds with participation in n screens into eight bins ranging from bin 1 (1−24 HTS screen participations) to bin 8 (more than 174 HTS participations). The hit rates for each assay, hk, are calculated independent of the binning introduced above. However, compound participation in each of these bins varies significantly. For instance, there are far more compounds present in bin 8 than in any other bin (Table 1). The unequal distribution of compounds across bins introduces a bias to the calculation of the expected probabilities ⟨pi⟩ of finding DCM. Here is an example of how this bias manifests itself. The hit rate of compounds in bin 8 is significantly lower than that of bin 1 (Table 1). Due to the large number of bin 8 compounds, the overall hit rate for assays with bin 8 compound participation will therefore be lower than expected from an equal distribution of compounds across all bins. The lower hit rates, hk, biased by a disproportionate number of bin 8 compounds with lower hit rates, lead to a higher expected probability of DCM in bins 1−7. To remove this bias, we introduce scaling factors SFb for each bin b, ensuring that the influence of hk as calculated for the entire hit set independent of bin membership of compounds does not bias the expected hit rates in the individual bins.
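The binning by number of previous inactive screens and the per-bin scaling factor, defined in eq 3 as the ratio of the total number of compounds to the number of compounds in bin b, can be sketched in Python as follows. The function names, the use of bin 9 for untested compounds (following the row order of Table 1), and the compound counts are illustrative assumptions, not the authors' implementation or data.

```python
BIN_WIDTH = 25  # bins 1-7 cover 1-24, 25-49, ..., 150-174 previous inactive screens
LAST_DARK_BIN = 8  # bin 8 collects 175 or more previous inactive screens
UNTESTED_BIN = 9  # assumption: untested compounds get the number used in Table 1


def darkness_bin(n_previous_inactive):
    """Map the number of previous inactive screens to a DCM bin number."""
    if n_previous_inactive < 0:
        raise ValueError("number of previous screens cannot be negative")
    if n_previous_inactive == 0:
        return UNTESTED_BIN
    return min(n_previous_inactive // BIN_WIDTH + 1, LAST_DARK_BIN)


def scaling_factors(compounds_per_bin):
    """Eq 3: SF_b = N_total / N_b for every non-empty bin b."""
    n_total = sum(compounds_per_bin.values())
    return {b: n_total / n_b for b, n_b in compounds_per_bin.items() if n_b > 0}


# Hypothetical counts: a small bin 1 next to a much larger bin 8.
toy_counts = {1: 100, 8: 400}
toy_sf = scaling_factors(toy_counts)
print(darkness_bin(24), darkness_bin(25), darkness_bin(180))
print(toy_sf)
```

Compounds in sparsely populated bins receive larger factors, so each bin carries the same total weight when the factors are summed over its members.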

Figure 2. Distribution of occurrences of active compounds (white) and number of activity events (black) across 203 HTS campaigns.

$$SF_b = \frac{N_{\mathrm{total}}}{N_b} \tag{3}$$

Ntotal is the total number of compounds in all bins, and Nb is the number of compounds in bin b. Each compound i, based on its unique membership in a bin, can now be assigned a correction factor, SFi, where SFi = SFb if compound i is a member of bin b. Using these correction factors, we calculate an adjusted expected hit rate hk′ for each screen k as

Figure 3. Tukey box plot of hit rates across 203 HTS campaigns for compounds with a decreasing maximum number of HTS assay hits.

$$h_k' = \frac{\sum_{l=1}^{N_{\mathrm{act}}(k)} SF_l(k)}{\sum_{m=1}^{N_{\mathrm{all}}(k)} SF_m(k)} \tag{4}$$

where Nact(k) is the number of hits in screen k, Nall(k) is the number of compounds participating in screen k, SFl(k) is the correction factor for each active compound l in screen k, and SFm(k) are the individual correction factors for all compounds m (actives and inactives) in screen k. With these adjusted assay hit rates hk′ we now calculate adjusted probabilities of inactivity of compound i after participating in n HTS screens as

We calculate the average probability for a collection of N compounds to be inactive after a number of HTS campaigns as

$$\langle p_i \rangle = \frac{1}{N} \sum_{i=1}^{N} p_i \tag{2}$$


∼30% of all DCM1−24 compounds that have not hit in the previous 24 assays show activity now and are no longer dark thereby depleting the DCM1−24 pool. Also, comparing the slopes of the boundaries between the different DCM bins gives a first visual impression of the rates with which DCM is hitting in future assays. To compare the hit rates of compounds with different degrees of darkness, we opted for analyzing the median hit rates (rather than averages) due to large fluctuations in the hit rates and the finding that the hit rates do not show a normal distribution (Figure 5, Supporting Information Figure S2). Figure 6 shows the median hit rates of chemical matter of

$$p_i' = \prod_{k=1}^{n} (1 - h_k') \tag{5}$$

and

$$\langle p_i' \rangle_b = \frac{1}{N_b} \sum_{i=1}^{N_b} p_i' \tag{6}$$

as average adjusted probability of inactivity for compounds belonging to bin b.
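The chain from the adjusted hit rate (eq 4) to the adjusted probability of inactivity (eq 5) and its bin average (eq 6) can be sketched in Python under stated assumptions. The function names, the input layout (one list of adjusted hit rates per compound), and the toy screen are hypothetical; the scaling factors are taken as given inputs, computed as in eq 3.

```python
import math


def adjusted_hit_rate(sf_of_actives, sf_of_all):
    """Eq 4: sum of scaling factors over the actives of screen k divided by
    the sum of scaling factors over all compounds participating in screen k."""
    return sum(sf_of_actives) / sum(sf_of_all)


def adjusted_prob_dark(adjusted_hit_rates):
    """Eq 5: probability of remaining inactive across n screens, using the
    adjusted hit rates h_k' of the screens the compound took part in."""
    return math.prod(1.0 - h for h in adjusted_hit_rates)


def mean_adjusted_prob_dark(hit_rates_per_compound):
    """Eq 6: average of p_i' over the compounds of one bin."""
    probs = [adjusted_prob_dark(rates) for rates in hit_rates_per_compound]
    return sum(probs) / len(probs)


# Toy screen (made up): two bin 1 compounds with SF = 5.0 and eight bin 8
# compounds with SF = 1.25; exactly one bin 1 compound is active.
sf_all = [5.0] * 2 + [1.25] * 8
sf_actives = [5.0]

raw_rate = len(sf_actives) / len(sf_all)           # unweighted hit rate
weighted_rate = adjusted_hit_rate(sf_actives, sf_all)
print(raw_rate, weighted_rate)
```

In this toy screen the weighted rate is higher than the raw rate because the only hit comes from the small bin, whose members carry the larger scaling factor.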

3. RESULTS
Figure 4 illustrates the distribution of previously inactive and untested chemical matter for a chronologically ordered series of

Figure 4. Distribution of DCM compounds for a chronological set of HTS campaigns conducted at BI between 1999 and 2012. Compounds of different degrees of darkness are colored starting on the left with compounds inactive in 1−24 previous screens followed on the right by compounds inactive in 25−49, 50−74, 75−99, 100−124, 125−149, 150−174, and more than 174 screens, respectively. New, untested compounds entering each screen are colored in black.

Figure 6. Median HTS hit rates of DCM for 203 HTS campaigns at BI between 1999 and 2012. The median number of compounds associated with each bin is given in parentheses. Significant differences in median hit rates in comparison to the Dark1−24 bin are shown with one, two, or three asterisks corresponding to p values of