Perspective pubs.acs.org/jmc

Informing the Selection of Screening Hit Series with in Silico Absorption, Distribution, Metabolism, Excretion, and Toxicity Profiles

Miniperspective

John M. Sanders,*,†,⊥ Douglas C. Beshore,*,‡,⊥ J. Christopher Culberson,† James I. Fells,† Jason E. Imbriglio,‡ Hakan Gunaydin,† Andrew M. Haidle,‡ Marc Labroli,‡ Brian E. Mattioni,†,∥ Nunzio Sciammetta,‡ William D. Shipe,‡ Robert P. Sheridan,† Linda M. Suen,‡ Andreas Verras,† Abbas Walji,‡ Elizabeth M. Joshi,§ and Tjerk Bueters§

†Modeling & Informatics, ‡Discovery Chemistry, and §Pharmacokinetics, Pharmacodynamics, and Drug Metabolism, Merck & Co., Inc., Kenilworth, New Jersey 07065, United States

ABSTRACT: High-throughput screening (HTS) has enabled millions of compounds to be assessed for biological activity, but challenges remain in the prioritization of hit series. While biological, absorption, distribution, metabolism, excretion, and toxicity (ADMET), purity, and structural data are routinely used to select chemical matter for further follow-up, the scarcity of historical ADMET data for screening hits limits our understanding of early hit compounds. Herein, we describe a process that utilizes a battery of in-house quantitative structure−activity relationship (QSAR) models to generate in silico ADMET profiles for hit series to enable more complete characterizations of HTS chemical matter. These profiles allow teams to quickly assess hit series for desirable ADMET properties or suspected liabilities that may require significant optimization. Accordingly, these in silico data can direct ADMET experimentation and profoundly impact the progression of hit series. Several prospective examples are presented to substantiate the value of this approach.



INTRODUCTION

For decades, corporate and academic drug discovery groups have used robotic high-throughput screening (HTS) techniques to explore large compound collections for specific and desired biochemical and cell-based activities.1,2 Primary screens of millions of compounds are typically conducted using one concentration and a single replicate (Figure 1). Activity in this primary screen is used to advance a significantly smaller subset of the initial library (typically ∼1%) to a confirmation screen, often carried out at the same concentration as the primary screen but with additional replicates. Beyond the confirmation screen, several replicates of 8- or 10-point concentration response curves are usually generated for an even smaller subset of the collection, and these experiments are often performed using additional and orthogonal proteins and formats in order to better understand selectivity, pharmacology, and mechanisms of action.3,4 This frequently marks the end of the HTS campaign itself and is accompanied by the transfer of a substantial dossier to the discovery program team, which must then decide how to allocate biological and chemical resources in order to identify and prioritize chemical starting points for a drug discovery program.

The first task the team must complete is biological validation of the hits identified during the screening campaign (Figure 1). At our company, this process submits samples of hit molecules that are available within our sample collection to biochemical, cell-based, and biophysical assays of on- and off-target systems and ensures that the activity observed in the HTS campaign arises from bona fide ligand−target interactions. Following biological validation, samples must retain the desired biological activity following repurification or resynthesis in order to be considered chemically validated (Figure 1). This represents the first step of post-HTS wet chemistry support and often requires a significant time and resource investment, necessitating a careful selection process before individual compounds receive this level of attention.

Received: October 26, 2016. Published: April 18, 2017. © 2017 American Chemical Society

DOI: 10.1021/acs.jmedchem.6b01577 J. Med. Chem. 2017, 60, 6771−6780

Journal of Medicinal Chemistry

Perspective

Figure 1. Schematic depicting the workflow and decision points during and after a typical HTS campaign. Orange boxes indicate assays conducted in a high-throughput format; blue boxes indicate assays conducted by the program team; the open arrow indicates the transition from the HTS screening group to the local project team.

In order to inform decision-making in the processes described above, HTS data are presented to drug discovery teams together with information on compound purity, physical properties, selectivity through prior HTS and non-HTS testing histories, ADMET properties, mechanism of action, and chemical structure-related data. These data sets, while extremely powerful and informative for driving decisions during the biological and chemical validation steps, have not eliminated the need for significant chemistry investments to optimize compounds during lead optimization. In fact, it is nearly always the case that compounds identified through HTS require several years of chemical modification and biological experimentation before a suitable drug candidate is identified.4 One of the challenges of lead optimization is the discovery of a compound that possesses the required combination of potency, selectivity, ADMET, safety, formulation, and manufacturing properties.5 Of these criteria, a substantial portion of the optimization effort is often devoted to solving ADMET issues.

While understanding the ADMET properties of hit series is widely recognized as being key to a successful optimization campaign,6−8 it is often challenging to compare hit series to one another at the earliest stages of lead identification because only a small fraction of the compound collection has any historical measured ADMET data.9 As a result, teams use calculated parameters7 such as molecular weight, log P, polar surface area (PSA), Lipinski’s Ro5,10 and their intuition about chemical structure and viability as surrogates for experimental ADMET data during hit prioritization.11 The generation of experimental ADMET data is often delayed until hit series have been expanded and chemically evolved into more “mature” series, at which time they frequently reveal the need for extensive ADMET optimization in order to return a pipeline candidate with an acceptable dose and dosing frequency.12 Because resources are finite and chemical space is relatively unlimited, it is imperative that discovery program teams selectively invest available resources in the chemical matter having the highest likelihood of delivering an efficacious and safe drug candidate in a timely fashion.13,14

Working from the hypothesis that the good ADMET properties of a promising hit series would be retained during lead optimization, as drug candidates often strongly resemble screening hits, we reasoned that the identification of compound classes possessing favorable overall property profiles during the initial assessment of HTS hits would facilitate and accelerate drug discovery efforts.15 Rather than rely on simple physicochemical parameters (e.g., log P), we have taken the approach of using QSAR models for ADMET end points to generate in silico ADMET (isADMET) profiles for hit series in order to inform teams and enable more thorough characterizations of their HTS hits. These profiles are useful for the identification of series with promising ADMET properties, the identification and assessment of tractability around ADMET liabilities, and the compilation of a portfolio of hits with different strengths and weaknesses to hedge against unforeseen complications at later discovery stages. When these profiles are leveraged early in the lead identification process, prior to a discovery team committing significant resources and time to hit series (Figure 1), isADMET profiles can meaningfully alter the perceived tractability of screening hits and, we assert, alter the course of a drug discovery program effort.

To date, there are a limited number of reports describing the in silico characterization of HTS hits using QSAR models built for specific ADMET end points in addition to (or in place of) routinely calculated physical properties,16 but we are unaware of descriptions of more complete in silico profiling. Described herein is a strategy that we have developed to broadly characterize chemical starting points for drug discovery programs by integrating detailed isADMET properties with decision-enabling visualizations to inform discovery teams on series selection and potential solutions to ADMET issues. These visualizations allow for the rapid digestion of large amounts of data and facilitate the identification of promising chemical series and/or liabilities that these series may possess. Several prospective examples are presented to substantiate the value of the approach.



METHODS

Over the years, our company has internally developed a number of QSAR models based on ADMET data obtained for project compounds in our corporate collection. Models are available for human and rat intrinsic clearance (CLint), human P-glycoprotein (P-gp)-mediated efflux, and apparent permeability (Papp),17 in addition to others.18 These models are typically built on highly diverse data sets comprising tens of thousands of unique chemical structures, and we have demonstrated that these models offer good performance for the prospective evaluation of chemical matter (data not shown). While these models are not necessarily quantitatively accurate, they are in all cases able to meaningfully enrich for categorical compound properties. For instance, when predicting the human P-gp efflux ratio for a given compound, a prediction ratio greater than 4 has been shown, through subsequent measurement, to significantly enrich for experimentally determined P-gp efflux substrates (Figure 2A). An accurate numerical prediction of the efflux ratio is not necessary for the purposes described herein, and this model offers significant value to teams in the categorical classification of a molecule as a P-gp efflux substrate or nonsubstrate. Although this example is particularly relevant for programs directed at CNS targets, we generally find that the categorical classification afforded by our QSAR models adds significant value during the early phases of a program when measured data are often scarce,


using our internal P-gp QSAR model unless a wide PSA middle bin of 80−120 Å² is adopted (now capturing 1214 compounds, Figure 2C, versus the 379 compounds contained by the middle bin of predicted P-gp efflux ratios in Figure 2A). This enrichment comes at the expense of the numbers of true negative and true positive P-gp substrates falling into low- and high-PSA ranges (Figure 2A and Figure 2C) and limits the utility of using PSA as a single-parameter decision-making tool for this end point. Findings of this nature motivated us to utilize explicit QSAR predictions for ADMET properties instead of exclusively making use of physical property-based approaches.

To account for metabolism, we have utilized intrinsic clearance (CLint) rather than apparent clearance because correcting for nonspecific binding to microsomal proteins (fu,mic) has been shown to improve in vitro−in vivo correlations.20 In evaluating in silico and in vitro CLint, the free fraction of compound is incorporated to give an unbound CLint using the following relationship:

\[ \mathrm{CL_{int}} = \frac{\mathrm{CL_{int,app}}}{f_{\mathrm{u,mic}}} \]

where CLint,app represents the apparent loss of parent from a metabolic stability assessment in microsomes of the appropriate species and fu,mic is either the measured microsomal unbound fraction or a predicted value. In this work, measured CLint values are calculated from measured CLint,app and fu,mic values, whereas predicted CLint values are calculated from QSAR predictions of CLint,app and QSAR or physical-properties-based predictions of fu,mic.21 In addition to using QSAR models for microsomal CLint, we also make use of predicted rat in vivo intrinsic hepatic clearance, CLint, in order to account for compounds whose primary clearance pathway may not be driven by cytochrome P450-mediated oxidative metabolism. In this case, in vivo CLint is back-calculated from the well-stirred model of hepatic clearance using QSAR predictions for in vivo plasma clearance (CLp) and unbound fraction in plasma (fu,p) and an assumed blood-to-plasma ratio of unity.22

We have defined property guidelines in our isADMET battery that we believe are appropriate to inform teams on HTS chemical matter (Table 1). Predicted values for compounds are displayed in a heat map, where color gradients are defined by desired and undesired ranges (Table 1) for a given end point (Figure 3). We favor visualization of these data via heat maps because they facilitate the interpretation of large data sets and are standard practice in genome-wide association studies (GWAS) and other large data analysis platforms.23 Desirable predictions are rendered in green, moderate or indeterminate predictions are rendered in yellow, and undesirable predictions are rendered in red, though we emphasize categorical enrichment over absolute numerical accuracy. The value of this approach is that it permits dozens of series to be rapidly assessed and qualitatively compared to one another.
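As a minimal sketch of the two clearance calculations described in the text, the Python below computes unbound CLint from CLint,app and fu,mic and back-calculates in vivo CLint by rearranging the well-stirred model, CL = Q·fu·CLint/(Q + fu·CLint). The function names, the `q_h` argument (hepatic blood flow, for which no value is given here and which the caller must supply), the assumption that plasma clearance is entirely hepatic, and the choice to return `None` when clearance reaches or exceeds hepatic blood flow are all illustrative assumptions rather than a description of the in-house implementation.

```python
def unbound_clint(clint_app, fu_mic):
    """Unbound intrinsic clearance: CLint = CLint,app / fu,mic."""
    if not 0.0 < fu_mic <= 1.0:
        raise ValueError("fu_mic must lie in (0, 1]")
    return clint_app / fu_mic


def in_vivo_clint(cl_p, fu_p, q_h, blood_to_plasma=1.0):
    """Back-calculate CLint from the well-stirred model.

    Rearranging CL_b = Q*fu_b*CLint / (Q + fu_b*CLint) gives
    CLint = Q*CL_b / (fu_b * (Q - CL_b)).
    q_h is hepatic blood flow in the same units as cl_p (caller-supplied;
    hypothetical parameter). Returns None when CL_b >= Q, where the
    model has no finite positive solution.
    """
    cl_b = cl_p / blood_to_plasma   # blood clearance
    fu_b = fu_p / blood_to_plasma   # unbound fraction in blood
    if cl_b >= q_h:
        return None
    return q_h * cl_b / (fu_b * (q_h - cl_b))
```

With a blood-to-plasma ratio of unity, as assumed in the text, blood and plasma quantities coincide and the default argument can be left unchanged.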
The gradient values shown in Table 1 were largely based on the experience of the authors and institutional knowledge regarding what property space is generally considered

Figure 2. Categorization accuracy of human P-gp efflux ratio by the QSAR model for human P-gp efflux ratio (A) and by PSA (B, C). QSAR data shown are prospective predictions made for 2403 compounds, drawn from more than 50 therapeutic programs, that were measured between January 2015 and March 2016. Bar charts are colored by experimental P-gp efflux ratio in an MDR1-overexpressing LLC-PK1 cell line for compounds demonstrating good Papp and a lack of endogenous transport in a control LLC-PK1 cell line.
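The kind of categorical comparison summarized in Figure 2 can be illustrated with a short sketch: a predicted value (a QSAR efflux ratio or a PSA) is placed into a low, middle, or high bin, and the fraction of experimentally confirmed substrates is tallied per bin. The function name, argument names, and the cutoffs passed in the usage line are hypothetical; the actual bin edges used for Figure 2A are not stated in the text.

```python
def bin_enrichment(predicted, is_substrate, low_cut, high_cut):
    """Count compounds and the substrate fraction in low/middle/high bins.

    predicted    : iterable of predicted values (e.g., efflux ratio or PSA)
    is_substrate : iterable of bools from the experimental assay
    low_cut, high_cut : bin edges (illustrative; chosen by the caller)
    Returns {bin: (n_compounds, fraction_substrates or None if empty)}.
    """
    counts = {"low": [0, 0], "middle": [0, 0], "high": [0, 0]}
    for value, substrate in zip(predicted, is_substrate):
        if value < low_cut:
            name = "low"
        elif value > high_cut:
            name = "high"
        else:
            name = "middle"
        counts[name][0] += 1
        counts[name][1] += int(substrate)
    return {name: (n, (k / n if n else None)) for name, (n, k) in counts.items()}


# Usage with made-up numbers, for illustration only.
result = bin_enrichment([1, 2, 3.5, 6, 8], [False, False, True, True, True], 3, 5)
```

A model that classifies well concentrates nonsubstrates in the low bin and substrates in the high bin while keeping the middle bin small, which is the comparison drawn between the QSAR model and PSA.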

prohibitively expensive to generate, or not able to be generated due to a compound being unavailable. Although physical properties have long been used as surrogates for ADMET parameters, it has been our experience, and that of others,16 that QSAR models can outperform simple physical property-based predictions of a specific end point while providing key structural insights. For example, we have compared categorical predictions made by our human P-gp QSAR model with predictions obtained using PSA calculations (Figure 2), which are routinely applied for this purpose.19 While PSA is useful for enriching for P-gp nonsubstrates and substrates in low- and high-PSA ranges, respectively (Figure 2B), the classification accuracy does not rise to the level of performance observed when

Table 1. isADMET Heat Map Color Gradient Set Points Used for Lead Identification Projects

in silico model                             desired value (green)   moderate value (yellow)   undesired value (red)
Papp (10−6 cm·s−1)                          20                      10                        1
P-gp efflux ratio                           1                       3                         5
human microsomal CLint (mL·min−1·kg−1)      1                       125                       250
rat microsomal CLint (mL·min−1·kg−1)        1                       250                       500
rat in vivo CLint (mL·min−1·kg−1)           1                       2500                      5000
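One way the set points in Table 1 could be turned into a heat map color is sketched below: each predicted value is mapped to a position on a green-yellow-red gradient, with 0.0 at the green set point, 0.5 at the yellow set point, and 1.0 at the red set point. The piecewise-linear interpolation, the clamping outside the set points, and all names are assumptions made for illustration; the text does not specify how the gradient is computed.

```python
# Set points copied from Table 1 as (green, yellow, red).
SET_POINTS = {
    "Papp": (20, 10, 1),
    "P-gp efflux ratio": (1, 3, 5),
    "human microsomal CLint": (1, 125, 250),
    "rat microsomal CLint": (1, 250, 500),
    "rat in vivo CLint": (1, 2500, 5000),
}


def gradient_position(value, green, yellow, red):
    """Map a value to 0.0 (green) .. 0.5 (yellow) .. 1.0 (red), clamped.

    Works for end points where low values are desirable (CLint, P-gp)
    and where high values are desirable (Papp) by flipping the sign.
    """
    sign = 1.0 if red > green else -1.0
    v, g, y, r = (sign * x for x in (value, green, yellow, red))
    if v <= g:
        return 0.0
    if v >= r:
        return 1.0
    if v <= y:
        return 0.5 * (v - g) / (y - g)
    return 0.5 + 0.5 * (v - y) / (r - y)


# Usage: a predicted P-gp efflux ratio of 4 lies between yellow and red.
position = gradient_position(4, *SET_POINTS["P-gp efflux ratio"])
```

Because the set points are passed as data, a program team can substitute its own values for an end point without changing the function.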


for P-gp efflux as green (desirable) whereas a program seeking minimally CNS-penetrant matter would render these values as red (undesirable). Failure to understand and modify these gradients on a program-specific basis can result in the identification of chemical matter with an undesirable or intractable initial property profile. We have also chosen to treat HTS hits as chemical series by grouping similar compounds together and displaying the isADMET profile of a given series as a single heat map. Series can be defined in myriad ways, but prior to biological validation we assemble series using hierarchical clustering of titrated compounds and prior to chemical validation we use topological similarity searching in our corporate collection to expand each HTS hit into a comprehensive series. The motivation for using different methods at different stages is related to convenience; at early stages, the inclusion of isADMET profiles for hits and their neighbors can be burdensome due to a lack of hit attrition (Figure 1). At later stages, for example prior to chemical validation, leveraging the full sample repository to build comprehensive heat maps is preferable to restricting heat maps to active molecules uncovered during the screen. In many cases, the molecules that are titrated during an HTS campaign are a subset of active molecules identified in the campaign. In order to match the capacity of subsequent screening steps, representatives of active series are chosen for assay as the goal of the campaign is to identify chemical matter rather than to fully characterize its SAR. Further, our screening collection is a subset of the complete sample repository and a large number of molecules would not contribute to heat maps constructed from screened compounds. 
By using topological similarity searching to define the chemical space surrounding HTS hits, we capture many of the compounds that would be considered by a chemistry team to be relevant to a decision-making process about a series. This is beneficial even if neighbor molecules are inactive or have unknown activity at the biological target of interest since analogs can inform on the sensitivity of isADMET properties to perturbations of chemical structure and thus the perceived tractability of ADMET properties in each series. It is our assumption that series that are structurally diverse and well-behaved (i.e., they exhibit desirable predicted properties independent of chemical permutation) will retain desirable ADMET properties during an optimization campaign. Put differently, these series should not require extensive ADMET optimization relative to a series displaying extensive ADMET liabilities at the time of its identification. These concepts are foundational to fragment-based drug discovery, which emphasizes ligand efficiency, and are equally applicable to HTS starting points if attention is paid to ADMET properties.24
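The expansion of a hit into a series by topological similarity searching can be sketched with a Tanimoto comparison of fingerprints. In the example below, fingerprints are represented as plain sets of integer feature identifiers, and the 0.7 threshold, the function names, and the compound identifiers are hypothetical; the fingerprint type and similarity cutoff actually used are not given in the text.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as feature sets."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def expand_series(hit_fp, collection, threshold=0.7):
    """Return IDs of collection compounds similar to the hit.

    collection : dict mapping compound ID -> fingerprint (feature set)
    threshold  : minimum Tanimoto similarity (illustrative default)
    """
    return sorted(
        cid for cid, fp in collection.items()
        if tanimoto(hit_fp, fp) >= threshold
    )


# Usage with toy fingerprints (made-up data).
collection = {
    "cmpd-1": {1, 2, 3, 4},
    "cmpd-2": {1, 2, 3, 9},
    "cmpd-3": {7, 8, 9},
}
neighbors = expand_series({1, 2, 3, 4}, collection, threshold=0.6)
```

Neighbors are kept regardless of measured activity, in line with the point that inactive or untested analogs still inform on how sensitive the predicted properties are to structural change.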

Figure 3. Representative isADMET heat maps for single hit series displaying potentially desirable (A) and undesirable (B) predicted property profiles. Rows represent individual compounds. In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.

to be desirable or undesirable for each end point. For example, molecules with human in vitro CLint values greater than 250 mL·min−1·kg−1 may require significant effort to identify a clinical candidate with adequate bioavailability and half-life to support a once-daily, low dose oral therapy. Similarly, compounds with Papp values exceeding 20 × 10−6 cm·s−1 will frequently have unrestricted distribution and complete absorption, whereas values below 10 × 10−6 cm·s−1 may be associated with disposition complicated by drug transporters and/or incomplete and variable absorption. Coloring can also reflect experimental variability of the assays. For example, we believe that a predicted P-gp efflux ratio of >5 is more indicative of a bona fide P-gp substrate than is a prediction of ∼3. Accordingly, we use yellow in the color gradient to indicate a middle ground prediction space where there is less certainty and some caution is warranted (Figure 2A). In practice, the coloring of these properties requires adjustment by program teams to address program-specific needs. For instance, a CNS program would render small predicted ratios



RESULTS AND DISCUSSION

At the outset of our investigation, we first attempted to address whether or not the use of isADMET profiles would result in the identification or prioritization of different chemical matter than other methods of hit selection. To address this, we retrospectively analyzed the output of several HTS campaigns and compared the decisions that were made by the project teams that prosecuted these campaigns to decisions that would be made on the basis of standard metrics such as the combination of ligand binding efficiency (LBE) and lipophilic ligand efficiency (LLE)25 or the attractiveness of isADMET profiles. By way of example, project 1 was a drug discovery program that pursued a small molecule therapeutic for the treatment of thrombosis. As the project 1 team analyzed their HTS hits, they


discovery of a small molecule therapeutic targeting a protein located in the CNS. In order to address one of the principal challenges of CNS programs, we initially chose to evaluate hit series based upon their propensity for central penetration coupled with attractive ADMET properties, leveraging QSAR models for human P-gp efflux ratio, Papp, human microsomal CLint, rat microsomal CLint, and rat in vivo CLint together with the coloring criteria outlined in Table 1. Project 2 provided an opportunity to directly and prospectively compare decision-making with and without isADMET profiles. Following biological validation, the program team began selecting compounds for chemical validation using physical properties (e.g., MW, ClogP, PSA, etc.) and on-target potency in addition to subjective characteristics such as chemical attractiveness and synthetic tractability. In addition to the compounds initially chosen for chemical validation, a supplemental group of chemical series was identified based solely on a combination of potency and isADMET profiles generated for the hit molecules and related chemical matter following the method described above. Prior to chemical validation, the project 2 team submitted representative compounds from each of the series chosen by both methods to a panel of in vitro ADMET-related assays (P-gp efflux, Papp, and microsomal CLint) to confirm their isADMET profiles. As shown in Figure 5, most of the series identified by

used LBE and LLE in their decision-making, and as a result the compounds that were biologically validated all have LBE ≥ 0.26. For the retrospective analysis, we made the following categorical classifications based on LBE and LLE: biologically validated hits with LLE ≥ 3.5 and LBE ≥ 0.26 or LBE ≥ 0.36 and LLE ≥ 2 were classified as “good”; hits with 0.26 ≤ LBE ≤ 0.36 and 2 ≤ LLE < 3 or LBE ≥ 0.36 and LLE < 2 were classified as “moderate”; hits with LBE < 0.36 and LLE < 2 were classified as “poor.” We compared the biologically validated hits, now classified in terms of their LBE and LLE values, with selections made based on the appearance of isADMET profiles. Interpretation of isADMET heat maps is subjective, but our observations are that teams can readily visualize and identify series as being desirable (Figure 3A) and undesirable (Figure 3B) once they have generally familiarized themselves with the chemical matter; the moderate series generate more discussion in terms of their overall classification. Figure 4 shows how the hit series from project 1

Figure 4. Retrospective comparison of different hit selection methods for project 1. Data points shown are representative molecules from distinct chemical series that were biologically validated. Representative molecules were chosen to indicate the highest level of activity reached by a given series during project 1. Data points are jittered for clarity.

Figure 5. Plot of human microsomal CLint vs human P-gp efflux ratio for project 2 hits. Each data point reflects a representative molecule chosen to characterize a unique chemical series. Blue data points represent compounds chosen by the chemistry team for chemical validation; squares represent compounds chosen for their isADMET properties. Note that there are several overlapping compounds identified as blue squares.
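The LBE and LLE categories used for the retrospective comparison in Figure 4 can be written as a small rule-based classifier. The sketch below encodes the ranges exactly as stated in the text and checks them in the order good, moderate, poor. As stated, the ranges do not cover every combination (for example, 0.26 ≤ LBE < 0.36 with 3 ≤ LLE < 3.5), so the sketch returns "unclassified" for such cases rather than guessing; that label and the function name are assumptions for illustration.

```python
def classify_hit(lbe, lle):
    """Classify a hit as good, moderate, or poor from LBE and LLE.

    Rules follow the ranges given in the text; combinations that match
    none of them are reported as "unclassified" (illustrative choice).
    """
    if (lle >= 3.5 and lbe >= 0.26) or (lbe >= 0.36 and lle >= 2):
        return "good"
    if (0.26 <= lbe <= 0.36 and 2 <= lle < 3) or (lbe >= 0.36 and lle < 2):
        return "moderate"
    if lbe < 0.36 and lle < 2:
        return "poor"
    return "unclassified"
```

Checking "good" first resolves the boundary case of LBE = 0.36 with 2 ≤ LLE < 3, which satisfies both the good and moderate conditions as written.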

scored, with the magenta data points representing series that received synthetic effort to enumerate SAR and perform other characterizations. From this analysis, we concluded that the chemical matter prioritized by isADMET profiles partially overlaps the matter prioritized by the team as well as by LBE and LLE. An interesting finding of this exercise is that one of the series chemically pursued by the team in order to build additional SAR and further characterize the series’ properties generated a poor isADMET profile in our retrospective study (Figure 4, top left). At the time this series was pursued, the team had demonstrated in a binding assay that the mechanism of action for the series was differentiated from other hits found in the HTS campaign. As a result of this pharmacology difference, the team was compelled to advance this series in order to preserve diversity in mechanism of action. This example illustrates that there are many criteria beyond physical properties, potency, and ADMET concerns that must be considered when resourcing decisions are made on drug discovery programs. Factors such as pharmacological mechanism of action, intellectual freedom to operate, and other considerations must be taken into account in order to advance an appropriately balanced chemical portfolio. We next describe the incorporation of isADMET profiles into the real-time prosecution of three HTS campaigns from our neuroscience portfolio. Each of these projects sought the

both approaches exhibited in vitro ADMET properties that were encouraging for hit series (low CLint and Papp and non-efflux substrates of P-gp). This finding validates the prospective application of in silico models for the identification of compounds with good overall in vitro ADMET-related properties. That is, our QSAR models prospectively identified compounds with desirable properties, reinforcing the validation work shown in Figure 2. More importantly, compounds identified for their performance in isADMET models may have little overlap with compounds prioritized by other considerations and provide additional chemical starting points, as found with project 1. In the case of project 2, we were able to generate enthusiasm for series that were not initially chosen by the team by advancing compounds to our in vitro ADMET assays prior to chemical validation. Consideration of isADMET data at this point can alter the perceived risks or attributes of a chemical series and alter decision-making such that series are progressed or halted based on their isADMET profiles. Because in vitro ADMET assays


able to demonstrate that the chemical space surrounding the series was generally consistent with favorable isADMET properties (Figure 6B). We note that in the absence of these isADMET profiles, this series would have been viewed as being “high risk” due to its limited size and likely would have been deprioritized accordingly. As such, we expect virtual libraries like that described for project 2 will become more frequently leveraged to help teams anticipate how the ADMET properties of a series may evolve. Additionally, it is foreseeable that some of the chemical functionalities that frequently appear in SAR exploration, and which consistently result in undesirable ADMET properties (e.g., higher CLint predictions in the case of phenyl), will be replaced by functionally similar but more ADMET-sensitive groups (e.g., oxazole in place of phenyl). Projects 3 and 4 were CNS targets from our neuroscience portfolio. In both cases, following HTS there were tens of biologically validated hits from different chemically distinct series. The projects 3 and 4 teams used isADMET profiles to characterize their chemical matter and to select representative molecules from promising series for in vitro ADMET confirmation. It is worth noting that the compounds submitted for in vitro ADMET characterization were chosen to sample the chemical diversity available within the hit series families while still maintaining good isADMET properties. Additionally, the demonstration of on-target activity was deliberately ignored for a compound to obtain in vitro ADMET characterization. The goal of these studies was to probe how widely the structures of a given series could be modified while still retaining desired ADMET properties, not to screen molecules in order to justify additional resource-intensive studies. Preservation of good in vitro ADMET properties within a chemical series was interpreted as a series having intrinsically good ADMET properties. 
As described above, the projects 3 and 4 teams only advanced series with encouraging isADMET profiles to in vitro ADMET characterization. At the time, these teams took the strategic position that series which were less promising from an isADMET perspective would be deprioritized and only pursued in the event that the series initially deemed to be promising were shown not to be suitable for lead optimization efforts. Fortunately, both teams were pleased to confirm that a number of series identified through utilization of isADMET profiling displayed good overall in vitro ADMET profiles and revisiting series predicted to be unfavorable was not required. Following experimental characterization of several examples of each chemically validated series, we found that series predicted to be attractive from an isADMET perspective were confirmed to display attractive lead-like properties (Figure 7). Compounds that were prospectively identified by QSAR as having low human microsomal CLint values in general exhibited good enrichment for low to moderate CLint (Figure 7A and Figure 7B). Generally speaking, data for compounds fall within 3-fold of their expected values and are categorically consistent with the predictions. Figure 7C−F depicts the relationship between experimental and predicted Papp and human P-gp efflux ratios, in which compounds are shaped and colored by series with the line as unity. Measured Papp values were, with few exceptions, categorically consistent with predictions (Figure 7C and Figure 7D). Similarly, the predictions for P-gp efflux generally led to the identification of non-P-gp substrates or series with tractable P-gp efflux (orange and green-brown triangles, Figure 7E, and green squares, Figure 7F) with the systematic exception of one series (blue pentagons, Figure 7E). Employing this strategy, both teams were able to leverage isADMET predictions to prioritize efforts

require less material than does the chemical validation process, we were able to leverage isADMET profiles to collect additional data for series that were not initially chosen for chemical validation. Compounds demonstrating measured properties consistent with good ADMET starting points were then viewed with a heightened interest that warranted resynthesis or, at a minimum, the submission of related molecules to the primary assay in order to increase our understanding of the series. In the case of project 2, perceptions of chemical matter fitness were altered by this approach and chemistry was initiated for several series in order to perform chemical validation and expand their SAR following in vitro confirmation of favorable isADMET properties. An interesting case that emerged from project 2 was the identification of a “series” containing two compounds with an attractive, albeit limited, isADMET profile due to the absence of a significant body of related chemical matter in our sample collection (Figure 6A). Following in vitro confirmation of the

Figure 6. (A) isADMET heat map for series 1 from project 2, which only contained two compounds from our sample collection. (B) isADMET heat map for a virtual library enumerated around the core of series 1 from project 2. Rows represent individual compounds. The boxed regions indicate where one R-group was held constant across library members (phenyl and oxazole, as indicated). In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.
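The sentinel value of 0 mentioned in the Figure 6 legend can be illustrated with a short sketch. It assumes, purely for illustration, that in vivo CLint is back-calculated from predicted CLp with the well-stirred liver model (cf. ref 22) and that clearance is entirely hepatic; this section does not state the equation actually used. The function name, the neglect of blood-to-plasma partitioning, and the default hepatic blood flow value are hypothetical choices, not the authors' settings.

```python
def in_vivo_clint(cl_p, fu_p, q_h=70.0):
    """Back-calculate in vivo CLint (mL/min/kg) from plasma clearance.

    Assumes the well-stirred model, CL = Q*fu*CLint / (Q + fu*CLint),
    rearranged to CLint = CL*Q / (fu * (Q - CL)).
    q_h is a hypothetical hepatic blood flow (mL/min/kg); blood-to-plasma
    partitioning and extrahepatic clearance are ignored.
    """
    if not 0.0 < fu_p <= 1.0:
        raise ValueError("fu_p must be in (0, 1]")
    if cl_p >= q_h:
        # Clearance at or above hepatic blood flow has no finite solution,
        # so return the sentinel 0 that the heat map legend colors red.
        return 0.0
    return cl_p * q_h / (fu_p * (q_h - cl_p))


# Hypothetical compound: CLp = 35 mL/min/kg, fu,p = 0.5
example_clint = in_vivo_clint(35.0, 0.5)
# Hypothetical compound whose predicted CLp exceeds hepatic blood flow
flagged_clint = in_vivo_clint(80.0, 0.1)
```

Returning a sentinel rather than raising keeps such compounds visible as a distinct color in a heat map instead of dropping them from the profile.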

isADMET properties for one of the two members of series 1, the team virtually enumerated a small library of prophetic compounds to more thoroughly characterize the series from an isADMET perspective. One of the interesting results from this exercise was the observation that some substituents, such as an unsubstituted phenyl group, consistently enriched for high CLint predictions as indicated by the bands of orange and red in the human and rat microsomal CLint columns as well as the rat in vivo CLint column (Figure 6B). Derivatives containing an oxazole in place of a phenyl group at one of the positions in the core resulted in systematic lowering of CLint predictions (Figure 6B), consistent with changes in lipophilicity. From this, we were

DOI: 10.1021/acs.jmedchem.6b01577 J. Med. Chem. 2017, 60, 6771−6780

Journal of Medicinal Chemistry

Perspective

Figure 7. Predicted vs measured scatterplots for human microsomal CLint (A, project 3; B, project 4), Papp (C, project 3; D, project 4), and human P-gp efflux ratio (E, project 3; F, project 4). Compounds are shaped and colored by series and the colors and shapes differ between projects 3 and 4. CLint values are measured in mL·min⁻¹·kg⁻¹; Papp values are measured in 10⁻⁶ cm·s⁻¹.
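The two comparisons that accompany Figure 7, agreement within 3-fold and categorical consistency, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the category cutoffs, and the example values are hypothetical and are not taken from the projects described here.

```python
import math


def fold_error(predicted, measured):
    """Symmetric fold error; 2.0 means off by 2-fold in either direction."""
    if predicted <= 0 or measured <= 0:
        raise ValueError("values must be positive")
    return 10 ** abs(math.log10(predicted / measured))


def category(value, low_cut, high_cut):
    """Bin a value as low, moderate, or high using hypothetical cutoffs."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "moderate"
    return "high"


def summarize(pairs, low_cut, high_cut, max_fold=3.0):
    """Return (fraction within max_fold, fraction in the same category)."""
    if not pairs:
        raise ValueError("need at least one (predicted, measured) pair")
    within = sum(fold_error(p, m) <= max_fold for p, m in pairs)
    agree = sum(
        category(p, low_cut, high_cut) == category(m, low_cut, high_cut)
        for p, m in pairs
    )
    return within / len(pairs), agree / len(pairs)


# Hypothetical (predicted, measured) CLint pairs in mL/min/kg
pairs = [(10.0, 15.0), (40.0, 35.0), (8.0, 30.0), (120.0, 150.0)]
frac_within, frac_same_category = summarize(pairs, low_cut=20.0, high_cut=100.0)
```

Working on the log scale treats over- and under-prediction symmetrically, which matches how a unity line is read on a log-log scatterplot.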

toward identifying chemical matter that possessed favorable measured ADMET properties. We have also incorporated isADMET predictions into multiparameter optimization (MPO) scoring functions. Teams are frequently confronted with many series possessing similar isADMET profiles, and their differentiation, classification, and assignment to different workflows can become complicated. Converting isADMET profiles to numerical scores facilitates consistency in decision-making while also enabling the sorting of hit series by their isADMET profile scores. For instance, a program team seeking molecules that are peripherally restricted could devise a scoring function that rewards series displaying high predicted P-gp efflux ratios and low predicted rat in vivo CLint. Scores averaged over each series would then allow teams to sort their hits to quickly identify hit series that consistently display the desired property profile. This may be particularly beneficial when there are large numbers of compounds that require prioritization, such as prior to biological validation or earlier steps in the screening process (Figure 1). However, we caution that overweighting isADMET profiles prior to acquiring broader biological data sets can result in a loss of pharmacological diversity. Biological validation often includes detailed studies important for identifying and maintaining a healthy portfolio of chemical matter (for example, by demonstrating binding to a specific site on a protein in order to affect a particular pharmacodynamic response). In principle, diversity is never irrevocably lost because teams can always return to earlier steps in the screening process to “retriage” with different criteria; in practice, this can be difficult to accomplish for reasons technical and nontechnical in nature. Project 5, an infectious disease program, serves as an example where an MPO scoring function based on isADMET properties resulted in the identification of a series that addressed a key liability. A number of series identified by project 5 displayed a risk associated with the inhibition of hERG (ref 26) based on their isADMET profiles. For one of these series, series 3, nine molecules had historical data from our hERG binding assay and five of these displayed Ki values ranging from 100 nM to 10 μM. Although predictions for this series were not prospectively validated, our QSAR model flagged the series as having a propensity for hERG binding as indicated by the large proportion of molecules predicted to display ∼1 μM affinity in our binding assay (Figure 8A). In order to help the team focus on series predicted to be less prone to hERG binding, an MPO scoring function based on predicted hERG binding affinity and on-target potency and selectivity was employed. Through this approach, the team immediately identified a member from series 4 that was anticipated to display a substantial advantage over other series with respect to hERG binding. Although this series displays undesirable predicted CLint properties (Figure 8B), it still


anticipated ADMET strengths and liabilities of their hit series. The retrospective and prospective application of our approach has shown that leveraging isADMET profiles results in the identification and selection of previously unidentified chemical matter. We continue to develop and refine this approach as part of our hit identification practices and encourage the medicinal chemistry community to consider similar approaches in their drug discovery efforts.



AUTHOR INFORMATION

Corresponding Authors

*J.M.S.: e-mail, [email protected].
*D.C.B.: e-mail, [email protected].

ORCID

John M. Sanders: 0000-0002-3788-4220
Robert P. Sheridan: 0000-0002-6549-1635
Elizabeth M. Joshi: 0000-0003-3267-0634

Present Address

∥B.E.M.: Eli Lilly and Company, Lilly Corporate Center, DC0710, Indianapolis, IN 46285, U.S.

Author Contributions

⊥J.M.S. and D.C.B. contributed equally.

Notes

The authors declare the following competing financial interest(s): Authors are current or former employees of Merck & Co., Inc., Whitehouse Station, NJ, U.S., and potentially own stock and/or hold stock options in the Company.

Figure 8. isADMET heat maps from project 5: (A) series 3, which was consistently predicted to bind to hERG; (B) series 4, which displayed less propensity for predicted hERG binding. Rows represent individual compounds.

Biographies

John M. Sanders is a computational chemist at Merck & Co., Inc. in West Point, PA, U.S. He obtained a B.A. in Chemistry from Colgate University (2000) and a Ph.D. in Chemistry from the University of Illinois under the guidance of Eric Oldfield (2005).

attracted attention since hERG binding and inhibition represented a key liability for the team. The potential for reduced hERG binding by series 4 is evident in the expanded series predictions, shown in Figure 8B, as a lower fraction of compounds predicted to exhibit hERG binding. Following experimental characterization in our hERG binding assay, seven compounds from series 4 were shown not to display hERG channel binding (Ki > 60 μM, data not shown), in agreement with our QSAR model predictions. In addition to the case examples described above, it is worth noting that other predicted end points can be added to the profiles without significantly complicating their analysis. Leveraging heat maps assists in the digestion of large data sets (in terms of number of rows and columns), and consequently we now include other end points related to safety concerns and drug−drug interaction potential (reversible cytochrome P450 inhibition, CYP3A induction, and hERG, Cav1.2, and Nav1.5 ion channel inhibition). We believe that any statistically robust QSAR model that adds data relevant to decision-making processes can and should be utilized within this approach.
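The series-level MPO scoring discussed in this section can be sketched with the peripherally restricted illustration, which rewards high predicted P-gp efflux ratios and low predicted rat in vivo CLint and then averages scores over each series. This is a minimal sketch, not the scoring function used by any project team: the linear desirability ramps, the threshold values, the equal weighting, and all names and example data are hypothetical.

```python
from collections import defaultdict


def ramp(value, bad, good):
    """Linear desirability: 0 at `bad`, 1 at `good`, clipped to [0, 1]."""
    frac = (value - bad) / (good - bad)
    return max(0.0, min(1.0, frac))


def mpo_score(pgp_efflux_ratio, rat_clint):
    """Equal-weight score for one compound (hypothetical thresholds)."""
    efflux_term = ramp(pgp_efflux_ratio, bad=2.0, good=10.0)  # higher is better
    clint_term = ramp(rat_clint, bad=100.0, good=20.0)        # lower is better
    return (efflux_term + clint_term) / 2.0


def rank_series(compounds):
    """Average compound scores per series and sort best-first.

    compounds: iterable of (series_id, pgp_efflux_ratio, rat_clint).
    """
    by_series = defaultdict(list)
    for series_id, efflux_ratio, clint in compounds:
        by_series[series_id].append(mpo_score(efflux_ratio, clint))
    means = {s: sum(v) / len(v) for s, v in by_series.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical predictions for two series
compounds = [
    ("A", 10.0, 20.0),
    ("A", 6.0, 60.0),
    ("B", 2.0, 100.0),
    ("B", 6.0, 100.0),
]
ranked = rank_series(compounds)
```

Averaging over a series favors series that consistently display the desired profile over series with a single outstanding member; a team could swap in other predicted end points, such as hERG binding affinity, by adding further ramp terms.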

Douglas C. Beshore is a medicinal chemist at Merck & Co., Inc. in West Point, PA, U.S. He obtained his B.S. in Biochemistry from Albright College (1997) and his Ph.D. from the University of Pennsylvania (2007) under the guidance of Prof. Amos B. Smith, III.

J. Christopher Culberson joined Merck & Co., Inc. in West Point, PA, U.S., in 1991 after spending 4 years at Nutrasweet designing new sweeteners. As a member of the molecular modeling group, he was directly involved in oxytocin antagonists, HIV-RT inhibitors, and farnesyl transferase inhibitor projects, as well as having an interest in the hERG ion channel. He actively participated in the design and analysis of our screening collection and screening results in order to build a chemical collection that supports drug discovery efforts. He received his Ph.D. degree from University of Florida in Physical Chemistry under the direction of M. C. Zerner and W. L. Luken.

James I. Fells is a computational chemist in the Modeling & Informatics group at Merck & Co., Inc. in Rahway, NJ, U.S. Prior to joining this company in 2014, he obtained his M.S. and Ph.D. in Organic Chemistry at The University of Memphis, followed by a postdoctoral fellowship in Pharmacology at The University of Tennessee Health Science Center.



CONCLUSIONS

Described herein is an approach to inform drug discovery program teams of the in silico properties of the molecules identified through screening and incorporate these properties into isADMET profiles. The intent of this review is to raise awareness of and advocate for the utilization of QSAR models in the characterization, prioritization, and execution on chemical matter as early as possible in the discovery process. By incorporating these data in readily digestible heat map visualizations, teams are enabled to rapidly analyze the

Jason E. Imbriglio received his Ph.D. from the University of Arizona and was an NIH Postdoctoral Fellow at Boston College. Jason joined Merck & Co., Inc. in Rahway, NJ, U.S., in the discovery chemistry group in 2004, where he currently serves as a Director. Jason’s work at this company has included all phases of discovery chemistry, including chemical biology, target validation, lead identification, and lead optimization, across a number of different therapeutic areas. Jason is an alumnus of the University of Massachusetts, Amherst.


and concentrated on developing new methods in molecular modeling, first in the field of virtual screening and currently in QSAR.

Hakan Gunaydin is a computational chemist at Merck & Co., Inc. in Boston, MA, U.S. He earned B.S. and M.S. degrees in Chemistry from Bogazici University in Turkey in 2001 and 2003, respectively. He earned his Ph.D. in Biochemistry and Molecular Biology from UCLA in 2008 working in the Houk laboratory on the modeling of reaction mechanisms and de novo enzyme design. He worked at Amgen, Inc. for 5 years before joining his current company in 2014.

Linda M. Suen is a Senior Scientist in medicinal chemistry at Merck & Co., Inc. in West Point, PA, U.S. She earned a B.A. in Biochemistry from Barnard College and a Ph.D. in Organic Chemistry from Columbia University under the supervision of Prof. James Leighton.

Andreas Verras is a computational chemist at Merck & Co., Inc. in Kenilworth, NJ, U.S. He earned a B.S. in Chemistry and a B.A. in Creative Writing from Emory University. His Ph.D. was conducted at UCSF in the laboratories of Paul Ortiz de Montellano and Tack Kuntz. Prior to joining his current company, he worked at Syngenta Crop Protection in Basel, Switzerland.

Andrew M. Haidle is a Principal Scientist in Medicinal Chemistry at Merck & Co., Inc. in Boston, MA, U.S. He earned his B.S. in Chemistry and Cell/Molecular Biology in 1998 from the University of Michigan, where he worked on bioorganic chemistry projects involving oligosaccharyltransferase in the laboratory of Prof. James Coward. He then pursued the total synthesis of members of the cytochalasin natural product family under the direction of Prof. Andrew G. Myers at Harvard University, obtaining his Ph.D. from the Department of Chemistry and Chemical Biology in 2004. He has worked on both early and late stage discovery projects in multiple therapeutic areas at Pfizer Inc. and his current company.

Abbas Walji received his Ph.D. from University at Buffalo, The State University of New York and was a postdoctoral scholar at the California Institute of Technology and Princeton University. Abbas joined Merck & Co., Inc. in West Point, PA, U.S., in 2007, where he currently serves as a medicinal chemist in the Discovery Chemistry Modalities group.

Elizabeth M. Joshi obtained a B.S. at Mary Washington College, followed by her Ph.D. in Chemistry under the supervision of Professor Timothy L. Macdonald at the University of Virginia. Following a number of years working as an ADME research scientist at Eli Lilly and Company, she joined Merck & Co., Inc. in Kenilworth, NJ, U.S. in 2013. Her work has focused on model development in the area of drug induced liver injury and drug metabolism, as well as the early application of in silico ADME tools to inform decision making in discovery.

Marc Labroli is an Associate Principal Scientist in Medicinal Chemistry at Merck & Co., Inc. in West Point, PA, U.S. He earned a B.S. in Chemistry from Villanova University in 1992, working in the laboratory of Dr. Walter Zajac. He then earned a Ph.D. in Chemistry from The University of Virginia in 1997 under the supervision of Professor Timothy Macdonald, working on topoisomerase inhibitors. Following postdoctoral research studies at the Scripps Research Institute in La Jolla, CA, under the supervision of Professor Dale Boger he joined Schering-Plough in 2000, which merged with his current company in 2009.

Tjerk Bueters works at Merck & Co., Inc. in West Point, PA, U.S., as a lead for translational PKPD offering strategic, educational, and hands-on support to develop and execute on translational plans in the neuroscience and cardiovascular disease areas. In addition, he serves on several multidisciplinary drug discovery and early development teams, in which he is responsible for the overall pharmacokinetics, pharmacodynamics, and drug metabolism contributions. Before joining his current company, he worked for 8 years at the CNS & Pain unit of AstraZeneca in similar roles. Dr. Bueters has a Ph.D. in Quantitative Pharmacology obtained at Leiden University, The Netherlands, and did his postdoctoral research within neuroscience at the Karolinska Institute in Sweden.

Brian E. Mattioni is an ADME scientist at Eli Lilly in Indianapolis, Indiana. He earned his B.S. in Chemistry from Wingate University and Ph.D. in Computational Chemistry from the Pennsylvania State University under Professor Peter Jurs.

Nunzio Sciammetta is a medicinal chemist at Merck & Co., Inc. in Boston, MA, U.S. He leads a medicinal chemistry team with a focus on enabling synthesis and applying computational design and new chemical technologies to medicinal chemistry projects. His current research interests include parallel medicinal chemistry, property based target design, hit-to-lead discovery, and bRo5 macrocycles in drug discovery. He joined his current company in 2012 and prior to this was a medicinal chemist at the Pfizer Sandwich Laboratories, U.K., and Leo Pharma, Denmark. Nunzio obtained a Ph.D. in Organic Synthesis from the University of Manchester, U.K. (1994), under the supervision of Dr. Andrew C. Regan and undertook postdoctoral studies at the University of Milan, Italy (Prof. C. Scolastico) and Leeds University, U.K. (Prof. R. Grigg).



ACKNOWLEDGMENTS

The authors thank M. Kate Holloway for careful reading of the manuscript and constructive comments and David M. Tellers for discussion and support.

DEDICATION

This work is dedicated to the memory of Frank K. Brown.

William D. Shipe is an Associate Principal Scientist in medicinal chemistry at Merck & Co., Inc. in West Point, PA, U.S. He earned a B.S. in Chemistry with a minor in Mathematics from The Pennsylvania State University in 1999, working in the laboratory of Prof. Raymond L. Funk on the development of new methods for asymmetric induction via chiral auxiliaries and catalysts. He then earned a Ph.D. in Chemistry from The Scripps Research Institute in 2004 under the supervision of Prof. Erik J. Sorensen, completing enantioselective total syntheses of the diterpene guanacastepenes A and E. After leaving Scripps in 2003, he spent 1 year as a visiting graduate student at Princeton University before joining his current company in 2004.

ABBREVIATIONS

ADMET, absorption, distribution, metabolism, excretion, and toxicity; CLint, intrinsic clearance; CLp, plasma clearance; HTS, high throughput screening; fu,mic, unbound fraction in microsomes; fu,p, unbound fraction in plasma; isADMET, in silico absorption, distribution, metabolism, excretion, and toxicity; Papp, apparent permeability; P-gp, P-glycoprotein; QSAR, quantitative structure−activity relationship

Robert P. Sheridan is a computational chemist and cheminformatics expert at Merck & Co., Inc. in Rahway, NJ, U.S. He earned his Ph.D. in Biochemistry from Princeton University and did his postdoctoral work at Fox Chase Cancer Center and Rutgers University before joining Lederle Laboratories in 1983. He joined his current company in 1991

REFERENCES

(1) Mayr, L. M.; Fuerst, P. The future of high-throughput screening. J. Biomol. Screening 2008, 13, 443−448.
(2) Macarron, R.; Banks, M. N.; Bojanic, D.; Burns, D. J.; Cirovic, D. A.; Garyantes, T.; Green, D. V. S.; Hertzberg, R. P.; Janzen, W. P.; Paslay, J. W.; Schopfer, U.; Sittampalam, G. S. Impact of high-throughput screening in biomedical research. Nat. Rev. Drug Discovery 2011, 10, 188−195.
(3) Thorne, N.; Auld, D. S.; Inglese, J. Apparent activity in high-throughput screening: origins of compound-dependent assay interference. Curr. Opin. Chem. Biol. 2010, 14, 315−324.
(4) Hughes, J. P.; Rees, S.; Kalindjian, S. B.; Philpott, K. L. Principles of early drug discovery. Br. J. Pharmacol. 2011, 162, 1239−1249.
(5) Kerns, E. H.; Di, L. Drug-like Properties: Concepts, Structure Design and Methods. From ADME to Toxicity Optimization; Academic Press: San Diego, CA, 2008.
(6) Bleicher, K. H.; Bohm, H. J.; Muller, K.; Alanine, A. I. Hit and lead generation: beyond high-throughput screening. Nat. Rev. Drug Discovery 2003, 2, 369−378.
(7) Davis, A. M.; Keeling, D. J.; Steele, J.; Tomkinson, N. P.; Tinker, A. C. Components of successful lead generation. Curr. Top. Med. Chem. 2005, 5, 421−439.
(8) Ballard, P.; Brassil, P.; Bui, K. H.; Dolgos, H.; Petersson, C.; Tunek, A.; Webborn, P. J. The right compound in the right assay at the right time: an integrated discovery DMPK strategy. Drug Metab. Rev. 2012, 44, 224−252.
(9) van de Waterbeemd, H.; Gifford, E. ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discovery 2003, 2, 192−204.
(10) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 2001, 46, 3−26.
(11) Leeson, P. D.; Springthorpe, B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat. Rev. Drug Discovery 2007, 6, 881−890.
(12) Bueters, T.; Gibson, C.; Visser, S. A. Optimization of human dose prediction by using quantitative and translational pharmacology in drug discovery. Future Med. Chem. 2015, 7, 2351−2369.
(13) Gleeson, M. P.; Hersey, A.; Montanari, D.; Overington, J. Probing the links between in vitro potency, ADMET and physicochemical parameters. Nat. Rev. Drug Discovery 2011, 10, 197−208.
(14) Hop, C. E.; Cole, M. J.; Davidson, R. E.; Duignan, D. B.; Federico, J.; Janiszewski, J. S.; Jenkins, K.; Krueger, S.; Lebowitz, R.; Liston, T. E.; Mitchell, W.; Snyder, M.; Steyn, S. J.; Soglia, J. R.; Taylor, C.; Troutman, M. D.; Umland, J.; West, M.; Whalen, K. M.; Zelesky, V.; Zhao, S. X. High throughput ADME screening: practical considerations, impact on the portfolio and enabler of in silico ADME models. Curr. Drug Metab. 2008, 9, 847−853.
(15) Leeson, P. D.; Young, R. J. Molecular property design: does everyone get it? ACS Med. Chem. Lett. 2015, 6, 722−725.
(16) Desai, P. V.; Sawada, G. A.; Watson, I. A.; Raub, T. J. Integration of in silico and in vitro tools for scaffold optimization during drug discovery: predicting P-glycoprotein efflux. Mol. Pharmaceutics 2013, 10, 1249−1261.
(17) Sherer, E. C.; Verras, A.; Madeira, M.; Hagmann, W. K.; Sheridan, R. P.; Roberts, D.; Bleasby, K.; Cornell, W. D. QSAR prediction of passive permeability in the LLC-PK1 cell line: trends in molecular properties and cross-prediction of Caco-2 permeabilities. Mol. Inf. 2012, 31, 231−245.
(18) Sheridan, R. P.; McMasters, D. R.; Voigt, J. H.; Wildey, M. J. eCounterscreening: using QSAR predictions to prioritize testing for off-target activities and setting the balance between benefit and risk. J. Chem. Inf. Model. 2015, 55, 231−238.
(19) Mahar Doan, K. M.; Humphreys, J. E.; Webster, L. O.; Wring, S. A.; Shampine, L. J.; Serabjit-Singh, C. J.; Adkison, K. K.; Polli, J. W. Passive permeability and P-glycoprotein-mediated efflux differentiate central nervous system (CNS) and non-CNS marketed drugs. J. Pharmacol. Exp. Ther. 2002, 303, 1029−1037.
(20) Giuliano, C.; Jairaj, M.; Zafiu, C. M.; Laufer, R. Direct determination of unbound intrinsic drug clearance in the microsomal stability assay. Drug Metab. Dispos. 2005, 33, 1319−1324.
(21) Turner, D. B.; Yeo, K. R.; Tucker, G. T.; Rostami-Hodjegan, A. Prediction of nonspecific hepatic microsomal binding from readily available physicochemical properties. Drug Metab. Rev. 2006, 38, 162−162.
(22) Yang, J. S.; Jamei, M.; Yeo, K. R.; Rostami-Hodjegan, A.; Tucker, G. T. Misuse of the well-stirred model of hepatic drug clearance. Drug Metab. Dispos. 2007, 35, 501−502.
(23) Eisen, M. B.; Spellman, P. T.; Brown, P. O.; Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 14863−14868.
(24) Erlanson, D. A.; McDowell, R. S.; O’Brien, T. Fragment-based drug discovery. J. Med. Chem. 2004, 47, 3463−3482.
(25) Hopkins, A. L.; Keseru, G. M.; Leeson, P. D.; Rees, D. C.; Reynolds, C. H. The role of ligand efficiency metrics in drug discovery. Nat. Rev. Drug Discovery 2014, 13, 105−121.
(26) Sanguinetti, M. C.; Tristani-Firouzi, M. hERG potassium channels and cardiac arrhythmia. Nature 2006, 440, 463−469.
