Informing the Selection of Screening Hit Series with in Silico


Perspective

Informing the selection of screening hit series with in silico absorption, distribution, metabolism, excretion, and toxicity profiles John M. Sanders, Douglas C. Beshore, J.Chris Culberson, James I. Fells, Jason E. Imbriglio, Hakan Gunaydin, Andrew M. Haidle, Marc Labroli, Brian E. Mattioni, Nunzio Sciammetta, William D. Shipe, Robert P. Sheridan, Linda M Suen, Andreas Verras, Abbas M. Walji, Elizabeth M Joshi, and Tjerk Bueters J. Med. Chem., Just Accepted Manuscript • Publication Date (Web): 18 Apr 2017 Downloaded from http://pubs.acs.org on April 18, 2017


Journal of Medicinal Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Medicinal Chemistry

Informing the selection of screening hit series with in silico absorption, distribution, metabolism, excretion, and toxicity profiles

John M. Sanders†⊥*, Douglas C. Beshore‡⊥, J. Chris Culberson†, James I. Fells†, Jason E. Imbriglio‡, Hakan Gunaydin†, Andrew M. Haidle‡, Marc Labroli‡, Brian E. Mattioni†,∥, Nunzio Sciammetta‡, William D. Shipe‡, Robert P. Sheridan†, Linda M. Suen‡, Andreas Verras†, Abbas Walji‡, Elizabeth M. Joshi§, and Tjerk Bueters§

†Modeling & Informatics, ‡Discovery Chemistry, §Pharmacokinetics, Pharmacodynamics, and Drug Metabolism, Merck & Co., Inc., Kenilworth, New Jersey 07065, United States

Keywords: in silico ADMET, HTS, QSAR, data visualization

ACS Paragon Plus Environment


Abstract

High-throughput screening (HTS) has enabled millions of compounds to be assessed for biological activity, but challenges remain in the prioritization of hit series. While biological, absorption, distribution, metabolism, excretion, and toxicity (ADMET), purity, and structural data are routinely used to select chemical matter for further follow-up, the scarcity of historical ADMET data for screening hits limits our understanding of early hit compounds. Herein, we describe a process that utilizes a battery of in-house quantitative structure activity relationship (QSAR) models to generate in silico ADMET profiles for hit series to enable more complete characterizations of HTS chemical matter. These profiles allow teams to quickly assess hit series for desirable ADMET properties or suspected liabilities that may require significant optimization. Accordingly, these in silico data can direct ADMET experimentation and profoundly impact the progression of hit series. Several prospective examples are presented to substantiate the value of this approach.

Introduction

For decades, corporate and academic drug discovery groups have used robotic high throughput screening (HTS) techniques to enable the exploration of large compound collections (millions of compounds) for specific and desired biochemical and cell-based activities.1, 2 Primary screens of millions of compounds are typically conducted using one concentration and a single replicate (Figure 1). Activity in this primary screen is used to advance a significantly smaller subset of the initial library (e.g., typically ~1%) to a confirmation screen, often carried out at the same concentration as the primary screen but with additional replicates. Beyond the confirmation screen, several replicates of 8- or 10-point concentration response curves are usually generated for an even smaller subset of the collection and these experiments are often performed using additional and orthogonal proteins and formats in order to better understand selectivity, pharmacology, and mechanisms of action.3, 4 This frequently marks the end of the HTS campaign itself and is accompanied by the transfer of a substantial dossier to the discovery program team, which must then decide how to allocate biological and chemical resources in order to identify and prioritize chemical starting points for a drug discovery program.

The first task the team must complete is biological validation of the hits identified during the screening campaign (Figure 1). At our company, this process submits samples of hit molecules that are available within our sample collection to biochemical, cell-based, and biophysical assays of on- and off-target systems and ensures that the activity observed in the HTS campaign arises from bona fide ligand/target interactions. Following biological validation, samples must retain the desired biological activity following repurification or resynthesis in order to be considered chemically validated (Figure 1). This represents the first step of post-HTS wet chemistry support and often requires a significant time and resource investment, necessitating a careful selection process before individual compounds receive this level of attention.

In order to inform decision-making in the processes described above, HTS screening data are presented to drug discovery teams together with information on compound purity, physical properties, selectivity through prior HTS and non-HTS testing histories, ADMET properties, mechanism of action, and chemical structure-related data. These data sets, while extremely powerful and informative for driving decisions during the biological and chemical validation steps, have not eliminated the need for significant chemistry investments to optimize compounds during lead optimization. In fact, it is nearly always the case that compounds identified through HTS require several years of chemical modification and biological experimentation before a suitable drug candidate is identified.4


One of the challenges of lead optimization is the discovery of a compound that possesses the required combination of potency, selectivity, ADMET, safety, formulation, and manufacturing properties.5 Of these criteria, a substantial portion of the optimization effort is often devoted to solving ADMET issues. While understanding the ADMET properties of hit series is widely recognized as being key to a successful optimization campaign,6-8 it is often challenging to compare hit series to one another at the earliest stages of lead identification because only a small fraction of the compound collection has any historical measured ADMET data.9 As a result of this, teams use calculated parameters7 such as molecular weight, logP, polar surface area (PSA), Lipinski’s Ro5,10 and their intuition about chemical structure and viability as surrogates for experimental ADMET data during hit prioritization.11 The generation of experimental ADMET data is often delayed until hit series have been expanded and chemically evolved into more “mature” series at which time they frequently reveal the need for extensive ADMET optimization in order to return a pipeline candidate with an acceptable dose and dosing frequency.12 Because resources are finite and chemical space is relatively unlimited, it is imperative that discovery program teams selectively invest available resources in the chemical matter having the highest likelihood of delivering an efficacious and safe drug candidate in a timely fashion.13, 14 Working from the hypothesis that the good ADMET properties of a promising hit series would be retained during lead optimization, as drug candidates often strongly resemble screening hits, we reasoned that the identification of compound classes possessing favorable overall property profiles during the initial assessment of HTS hits would facilitate and accelerate drug discovery efforts.15


Rather than rely on simple physicochemical parameters (e.g., logP), we have taken the approach of using QSAR models for ADMET endpoints to generate in silico ADMET (isADMET) profiles for hit series in order to inform teams and enable more thorough characterizations of their HTS hits. These profiles are useful for the identification of series with promising ADMET properties, the identification and assessment of tractability around ADMET liabilities, and the compilation of a portfolio of hits with different strengths and weaknesses to hedge against unforeseen complications at later discovery stages. When these profiles are leveraged early in the lead identification process, prior to a discovery team committing significant resources and time to hit series (Figure 1), isADMET profiles can meaningfully alter the perceived tractability of screening hits and, we assert, alter the course of a drug discovery program effort.

To date, there are a limited number of reports describing the in silico characterization of HTS hits using QSAR models built for specific ADMET endpoints in addition to (or in place of) routinely calculated physical properties,16 but we are unaware of descriptions of more complete in silico profiling. Described herein is a strategy that we have developed to broadly characterize chemical starting points for drug discovery programs by integrating detailed isADMET properties with decision-enabling visualizations to inform discovery teams on series selection and potential solutions to ADMET issues. These visualizations allow for the rapid digestion of large amounts of data and facilitate the identification of promising chemical series and/or liabilities that these series may possess. Several prospective examples are presented to substantiate the value of the approach.

Methods


Over the years, our company has internally developed a number of QSAR models based on ADMET data obtained for project compounds in our corporate collection. Models are available for human and rat intrinsic clearance (CLint), human P-glycoprotein (P-gp)-mediated efflux, and apparent permeability (Papp)17, in addition to others.18 These models are typically built on highly diverse datasets comprised of tens of thousands of unique chemical structures and we have demonstrated that these models offer good performance for the prospective evaluation of chemical matter (data not shown).

While these models are not necessarily quantitatively accurate, they are in all cases able to meaningfully enrich for categorical compound properties. For instance, when predicting the human P-gp efflux ratio for a given compound, a predicted ratio greater than 4 has been shown, through subsequent measurement, to significantly enrich for experimentally determined P-gp efflux substrates (Figure 2A). An accurate numerical prediction of the efflux ratio is not necessary for the purposes described herein and this model offers significant value to teams in the categorical classification of a molecule as a P-gp efflux substrate or non-substrate. Although this example is particularly relevant for programs directed at CNS targets, we generally find that the categorical classification afforded by our QSAR models adds significant value during the early phases of a program when measured data are often scarce, prohibitively expensive to generate, or not able to be generated due to compound availability.

Although physical properties have long been used as surrogates for ADMET parameters, it has been our experience, and that of others,16 that QSAR models can outperform simple physical property-based predictions of a specific endpoint while providing key structural insights. For example, we have compared categorical predictions made by our human P-gp QSAR model with predictions obtained using PSA calculations (Figure 2), which are routinely applied for this purpose.19 While PSA is useful for enriching for P-gp non-substrates and substrates in low- and high-PSA ranges, respectively (Figure 2B), the classification accuracy does not rise to the level of performance observed when using our internal P-gp QSAR model unless a wide PSA middle bin of 80-120 Å² is adopted (now capturing 1214 compounds, Figure 2C, versus the 379 compounds contained by the middle bin of predicted P-gp efflux ratios in Figure 2A). This enrichment comes at the expense of the numbers of true negative and true positive P-gp substrates falling into low- and high-PSA ranges (Figures 2A and 2C) and limits the utility of using PSA as a single-parameter decision-making tool for this endpoint. Findings of this nature motivated us to utilize explicit QSAR predictions for ADMET properties instead of exclusively making use of physical property-based approaches.

To account for metabolism, we have utilized intrinsic clearance (CLint) rather than apparent clearance because correcting for non-specific binding to microsomal proteins (fu,mic) has been shown to improve in vitro-in vivo correlations.20 In evaluating in silico and in vitro CLint, the free fraction of compound is incorporated to give an unbound CLint using the following relationship:

CLint = CLint,app / fu,mic

where CLint,app represents the apparent loss of parent from a metabolic stability assessment in microsomes of the appropriate species and fu,mic is either the measured microsomal unbound fraction or a predicted value. In this work, measured CLint values are calculated from measured CLint,app and fu,mic values whereas predicted CLint values are calculated from QSAR predictions of CLint,app and QSAR or physical properties-based predictions of fu,mic.21 In addition to using QSAR models for microsomal CLint, we also make use of predicted rat intrinsic hepatic clearance, CLint, in order to account for compounds whose primary clearance pathway may not be driven by cytochrome P450-mediated oxidative metabolism. In this case, in vivo CLint is back-calculated from the well-stirred model of hepatic clearance using QSAR predictions for in vivo plasma clearance (CLp) and unbound fraction in plasma (fu,p) and an assumed blood-to-plasma ratio of unity.22

We have defined property guidelines in our isADMET battery that we believe are appropriate to inform teams on HTS chemical matter (Table 1). Predicted values for compounds are displayed in a heat map, where color gradients are defined by desired and undesired ranges (Table 1) for a given endpoint (Figure 3). We favor visualization of these data via heat maps because they facilitate the interpretation of large datasets and are the standard practice of genome-wide association studies (GWAS) and other large data analysis platforms.23 Desirable predictions are rendered in green, moderate or indeterminate predictions are rendered in yellow, and undesirable predictions are rendered in red, though we emphasize categorical enrichment over absolute numerical accuracy. The value of this approach is that it permits dozens of series to be rapidly assessed and qualitatively compared to one another.

The gradient values shown in Table 1 were largely based on the experience of the authors and institutional knowledge regarding what property space is generally considered to be desirable or undesirable for each endpoint. For example, molecules with in vitro CLint values greater than 250 mL/min/kg may require significant effort to identify a clinical candidate with adequate bioavailability and half-life to support a once-daily, low-dose oral therapy. Similarly, compounds with Papp values exceeding 20 × 10⁻⁶ cm/s will frequently have unrestricted distribution and complete absorption, whereas values below 10 × 10⁻⁶ cm/s may be associated with disposition complicated by drug transporters and/or incomplete and variable absorption. Coloring can also reflect experimental variability of the assays.
For example, we believe that a predicted P-gp efflux ratio >5 is more indicative of a bona fide P-gp substrate than is a prediction ~3. Accordingly, we use yellow in the color gradient to indicate a middle ground prediction space where there is less certainty and some caution is warranted (Figure 2A). In practice, the coloring of these properties requires adjustment by program teams to address program-specific needs. For instance, a CNS program would render small predicted ratios for P-gp efflux as green (desirable) whereas a program seeking minimally CNS-penetrant matter would render these values as red (undesirable). Failure to understand and modify these gradients on a program-specific basis can result in the identification of chemical matter with an undesirable or intractable initial property profile.

We have also chosen to treat HTS hits as chemical series by grouping similar compounds together and displaying the isADMET profile of a given series as a single heat map. Series can be defined in myriad ways, but prior to biological validation we assemble series using hierarchical clustering of titrated compounds and prior to chemical validation we use topological similarity searching in our corporate collection to expand each HTS hit into a comprehensive series. The motivation for using different methods at different stages is related to convenience; at early stages, the inclusion of isADMET profiles for hits and their neighbors can be burdensome due to a lack of hit attrition (Figure 1). At later stages, for example prior to chemical validation, leveraging the full sample repository to build comprehensive heat maps is preferable to restricting heat maps to active molecules uncovered during the screen. In many cases, the molecules that are titrated during an HTS campaign are a subset of active molecules identified in the campaign. In order to match the capacity of subsequent screening steps, representatives of active series are chosen for assay as the goal of the campaign is to identify chemical matter rather than to fully characterize its SAR.
Further, our screening collection is a subset of the complete sample repository and a large number of molecules would not contribute to heat maps constructed from screened compounds. By using topological similarity searching to define the chemical space surrounding HTS hits, we capture many of the compounds that would be considered by a chemistry team to be relevant to a decision-making process about a series. This is beneficial even if neighbor molecules are inactive or have unknown activity at the biological target of interest since analogs can inform on the sensitivity of isADMET properties to perturbations of chemical structure and thus the perceived tractability of ADMET properties in each series. It is our assumption that series that are structurally diverse and well-behaved (i.e., they exhibit desirable predicted properties independent of chemical permutation) will retain desirable ADMET properties during an optimization campaign. Put differently, these series should not require extensive ADMET optimization relative to a series displaying extensive ADMET liabilities at the time of its identification. These concepts are foundational to fragment-based drug discovery, which emphasizes ligand efficiency, and are equally applicable to HTS starting points if attention is paid to ADMET properties.24
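The clearance corrections and color binning described in the Methods can be illustrated with a short sketch. The Python below is not the authors' implementation: it applies the unbound correction CLint = CLint,app / fu,mic, back-calculates in vivo CLint from the well-stirred model with a blood-to-plasma ratio of one, and bins values as green, yellow, or red. The function names, the requirement that hepatic blood flow be supplied by the caller, the example compound, and every cutoff not quoted in the text (the green CLint cutoff of 100 and the green P-gp cutoff of 2) are assumptions made for illustration only; Table 1 holds the actual guidelines.

```python
def unbound_clint(clint_app, fu_mic):
    """Unbound microsomal CLint = CLint,app / fu,mic."""
    if not 0 < fu_mic <= 1:
        raise ValueError("fu_mic must be in (0, 1]")
    return clint_app / fu_mic


def well_stirred_clint(cl_p, fu_p, q_h):
    """Back-calculate in vivo CLint from the well-stirred model.

    CL_h = Q_h * fu * CLint / (Q_h + fu * CLint), solved for CLint.
    Assumes a blood-to-plasma ratio of 1 (blood CL equals plasma CL)
    and that plasma clearance is entirely hepatic.
    """
    if not 0 < fu_p <= 1:
        raise ValueError("fu_p must be in (0, 1]")
    if cl_p >= q_h:
        return float("inf")  # clearance at or above blood flow cannot be resolved
    return q_h * cl_p / (fu_p * (q_h - cl_p))


def color(value, desirable, undesirable):
    """Bin a value as green, yellow, or red.

    Direction is inferred from the cutoffs: desirable < undesirable
    means lower is better; otherwise higher is better.
    """
    if desirable < undesirable:  # lower is better (e.g. CLint)
        if value <= desirable:
            return "green"
        return "red" if value >= undesirable else "yellow"
    if value >= desirable:  # higher is better (e.g. Papp)
        return "green"
    return "red" if value <= undesirable else "yellow"


# (desirable, undesirable) cutoffs; values marked ASSUMED are not from the text.
CNS_CUTOFFS = {
    "clint_mL_min_kg": (100.0, 250.0),  # 100 ASSUMED; 250 quoted in the text
    "papp_1e-6_cm_s": (20.0, 10.0),     # both quoted in the text
    "pgp_efflux_ratio": (2.0, 5.0),     # 2 ASSUMED; 5 quoted in the text
}


def profile_row(predictions, cutoffs):
    """Map one compound's predicted endpoints to heat-map colors."""
    return {name: color(value, *cutoffs[name]) for name, value in predictions.items()}


# Hypothetical compound; the numbers are invented for illustration.
compound = {
    "clint_mL_min_kg": unbound_clint(clint_app=50.0, fu_mic=0.25),
    "papp_1e-6_cm_s": 25.0,
    "pgp_efflux_ratio": 6.0,
}
print(profile_row(compound, CNS_CUTOFFS))

# A program seeking minimally CNS-penetrant matter could swap the P-gp cutoffs.
PERIPHERAL_CUTOFFS = dict(CNS_CUTOFFS, pgp_efflux_ratio=(5.0, 2.0))
print(profile_row(compound, PERIPHERAL_CUTOFFS))
```

Inferring the direction from the order of the two cutoffs keeps the program-specific adjustment to a single change per endpoint, which mirrors the way the text describes CNS and non-CNS programs coloring the same P-gp prediction differently.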

Results and Discussion

At the outset of our investigation, we first attempted to address whether or not the use of isADMET profiles would result in the identification or prioritization of different chemical matter than other methods of hit selection. To address this, we retrospectively analyzed the output of several HTS campaigns and compared the decisions that were made by the project teams that prosecuted these campaigns to decisions that would be made on the basis of standard metrics such as the combination of ligand binding efficiency (LBE) and lipophilic ligand efficiency (LLE)25 or the attractiveness of isADMET profiles.
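For readers less familiar with the efficiency metrics named above, the sketch below computes them for a single hit. It assumes the conventional definitions, which are not spelled out here: ligand efficiency as 1.37 × pIC50 divided by the heavy-atom count (kcal/mol per heavy atom, near room temperature), used as a stand-in for LBE, and LLE as pIC50 minus cLogP. The function names and the example compound are hypothetical.

```python
import math


def pic50(ic50_molar):
    """Convert an IC50 in mol/L to pIC50."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def ligand_efficiency(pic50_value, heavy_atoms):
    """Assumed stand-in for LBE: 1.37 * pIC50 / heavy-atom count."""
    return 1.37 * pic50_value / heavy_atoms


def lipophilic_ligand_efficiency(pic50_value, clogp):
    """LLE = pIC50 - cLogP (assumed conventional definition)."""
    return pic50_value - clogp


# Hypothetical hit: IC50 = 100 nM, 25 heavy atoms, cLogP = 3.2.
p = pic50(1e-7)
print(round(ligand_efficiency(p, 25), 2))
print(round(lipophilic_ligand_efficiency(p, 3.2), 2))
```

Both metrics normalize potency, the first by molecular size and the second by lipophilicity, which is why they are often used together when ranking hits of very different size and polarity.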


By way of example, Project 1 is a drug discovery program pursuing a small molecule therapeutic for the treatment of thrombosis. As the Project 1 team analyzed their HTS hits, they used LBE and LLE in their decision-making and as a result the compounds that were biologically validated all have LBE > 0.26. For the retrospective analysis, we made the following categorical classifications based on LBE and LLE: biologically validated hits with LLE > 3.5 and LBE > 0.26, or with LBE > 0.36 and 2 ≤ LLE ≤ 3, were classified as “good”; hits with 0.26 ≤ LBE < 0.36 and 2 ≤ LLE ≤ 3, or with LBE > 0.36 and LLE < 2, were classified as “moderate”; hits with LBE < 0.36 and LLE < 2 were classified as “poor.” We compared the biologically validated hits, now classified in terms of their LBE and LLE values, with selections made based on the appearance of isADMET profiles. Interpretation of isADMET heat maps is subjective, but our observations are that teams can readily visualize and identify series as being desirable (Figure 3A) and undesirable (Figure 3B) once they have generally familiarized themselves with the chemical matter; the moderate series generate more discussion in terms of their overall classification. Figure 4 shows how the hit series from Project 1 scored, with the magenta data points representing series that received synthetic effort to enumerate SAR and perform other characterizations. From this analysis, we concluded that the chemical matter prioritized by isADMET profiles partially overlaps the matter prioritized by the team as well as by LBE and LLE. An interesting finding of this exercise is that one of the series chemically pursued by the team in order to build additional SAR and further characterize the series’ properties generated a poor isADMET profile in our retrospective study (Figure 4, top left). At the time this series was pursued, the team had demonstrated in a binding assay that the mechanism of action for the series was differentiated from other hits found in the HTS campaign. As a result of this pharmacology difference, the team was compelled to advance this series in order to preserve diversity in mechanism of action. This example illustrates that there are many criteria beyond physical properties, potency, and ADMET concerns that must be considered when resourcing decisions are made on drug discovery programs. Factors such as pharmacological mechanism of action, intellectual freedom to operate, and other considerations must be taken into account in order to advance an appropriately balanced chemical portfolio.

We next describe the incorporation of isADMET profiles into the real-time prosecution of three HTS campaigns from our neuroscience portfolio. Each of these projects sought the discovery of a small molecule therapeutic targeting a protein located in the CNS. In order to address one of the principal challenges of CNS programs, we initially chose to evaluate hit series based upon their propensity for central penetration coupled with attractive ADMET properties, leveraging QSAR models for human P-gp efflux ratio, Papp, human microsomal CLint, rat microsomal CLint, and rat in vivo CLint together with the coloring criteria outlined in Table 1.

Project 2 provided an opportunity to directly and prospectively compare decision-making with and without isADMET profiles. Following biological validation, the program team began selecting compounds for chemical validation using physical properties (e.g., MW, cLogP, PSA, etc.) and on-target potency in addition to subjective characteristics such as chemical attractiveness and synthetic tractability.

In addition to the compounds initially chosen for chemical validation, a supplemental group of chemical series was identified based solely on a combination of potency and isADMET profiles generated for the hit molecules and related chemical matter following the method described above. Prior to chemical validation, the Project 2 team submitted representative compounds from each of the series chosen by both methods to a panel of in vitro ADMET-related assays (P-gp efflux, Papp, and microsomal CLint) to confirm their isADMET profiles. As shown in Figure 5, most of the series identified by both approaches exhibited in vitro ADMET properties that were encouraging for hit series (good CLint and Papp, and non-efflux substrates of P-gp). This finding validates the prospective application of in silico models for the identification of compounds with good overall in vitro ADMET-related properties. That is, our QSAR models prospectively identified compounds with desirable properties, reinforcing the validation work shown in Figure 2. More importantly, compounds identified for their performance in isADMET models may have little overlap with compounds prioritized by other considerations and provide additional chemical starting points, as found with Project 1.

In the case of Project 2, we were able to generate enthusiasm for series that were not initially chosen by the team by advancing compounds to our in vitro ADMET assays prior to chemical validation. Consideration of isADMET data at this point can alter the perceived risks or attributes of a chemical series and alter decision-making such that series are progressed or halted based on their isADMET profiles. Because in vitro ADMET assays require less material than does the chemical validation process, we were able to leverage isADMET profiles to collect additional data for series that were not initially chosen for chemical validation. Compounds demonstrating measured properties consistent with good ADMET starting points were then viewed with a heightened interest that warranted resynthesis or, at a minimum, the submission of related molecules to the primary assay in order to increase our understanding of the series. In the case of Project 2, perceptions of chemical matter fitness were altered by this approach and chemistry was initiated for several series in order to perform chemical validation and expand their SAR following in vitro confirmation of favorable isADMET properties.
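One way to quantify the kind of categorical agreement between prediction and measurement discussed for Figure 2 and for the Project 2 confirmation assays is to tabulate, for each predicted bin, the fraction of compounds confirmed experimentally. The sketch below does this for P-gp efflux. The upper cutoff of 4 comes from the Methods; the lower cutoff of 2, the function names, and the example records are assumptions made for illustration and are not results from this work.

```python
from collections import defaultdict


def predicted_bin(ratio, low=2.0, high=4.0):
    """Assign a predicted efflux ratio to a categorical bin.

    high=4.0 follows the text; low=2.0 is an ASSUMED cutoff.
    """
    if ratio > high:
        return "substrate"
    if ratio < low:
        return "non-substrate"
    return "indeterminate"


def substrate_fraction_by_bin(records, low=2.0, high=4.0):
    """Fraction of measured substrates within each predicted bin.

    records: iterable of (predicted_ratio, measured_is_substrate) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [n_substrates, n_total]
    for predicted_ratio, measured_is_substrate in records:
        name = predicted_bin(predicted_ratio, low, high)
        counts[name][0] += int(measured_is_substrate)
        counts[name][1] += 1
    return {name: n_sub / n_total for name, (n_sub, n_total) in counts.items()}


# Invented example data, NOT measurements from the paper.
records = [
    (1.2, False), (1.5, False), (1.8, True),   # predicted non-substrates
    (3.0, True), (3.1, False),                 # indeterminate middle bin
    (4.8, False), (6.5, True), (9.0, True),    # predicted substrates
]
for name, fraction in substrate_fraction_by_bin(records).items():
    print(f"{name}: {fraction:.2f}")
```

A model that enriches well shows a low substrate fraction in the non-substrate bin and a high fraction in the substrate bin, while keeping the indeterminate bin small.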


An interesting case that emerged from Project 2 was the identification of a “series” containing two compounds with an attractive, albeit limited, isADMET profile due to the absence of a significant body of related chemical matter in our sample collection (Figure 6A). Following in vitro confirmation of the isADMET properties for one of the two members of Series 1, the team virtually enumerated a small library of prophetic compounds to more thoroughly characterize the series from an isADMET perspective. One of the interesting results from this exercise was the observation that some substituents, such as an unsubstituted phenyl group, consistently enriched for high CLint predictions as indicated by the bands of orange and red in the human and rat microsomal CLint columns as well as the rat in vivo CLint column (Figure 6B). Derivatives containing an oxazole in place of a phenyl group at one of the positions in the core resulted in systematic lowering of CLint predictions (Figure 6B), consistent with changes in lipophilicity. From this, we were able to demonstrate that the chemical space surrounding the series was generally consistent with favorable isADMET properties (Figure 6B). We note that in the absence of these isADMET profiles, this series would have been viewed as being “high risk” due to its limited size and likely would have been deprioritized accordingly. As such, we expect virtual libraries like that described for Project 2 will become more frequently leveraged to help teams anticipate how the ADMET properties of a series may evolve.

Additionally, it is foreseeable that some of the chemical functionalities that frequently appear in SAR exploration, and which consistently result in undesirable ADMET properties (e.g., higher CLint predictions in the case of phenyl), will be replaced by functionally similar but more ADMET-friendly groups (e.g., oxazole in place of phenyl).

Projects 3 and 4 were CNS targets from our neuroscience portfolio. In both cases, HTS yielded tens of biologically validated hits from chemically distinct series. The Project 3 and 4 teams used isADMET profiles to characterize their chemical matter and to select representative molecules from promising series for in vitro ADMET confirmation. It is worth noting that the compounds submitted for in vitro ADMET characterization were chosen to sample the chemical diversity available within the hit series families while still maintaining good isADMET properties. Additionally, demonstrated on-target activity was deliberately not required for a compound to receive in vitro ADMET characterization. The goal of these studies was to probe how widely the structures of a given series could be modified while still retaining desired ADMET properties, not to screen molecules in order to justify additional resource-intensive studies. Preservation of good in vitro ADMET properties within a chemical series was interpreted as evidence that the series had intrinsically good ADMET properties. As described above, the Project 3 and 4 teams only advanced series with encouraging isADMET profiles to in vitro ADMET characterization. At the time, these teams took the strategic position that series which were less promising from an isADMET perspective would be deprioritized and only pursued in the event that the series initially deemed to be promising were shown not to be suitable for lead optimization efforts. Fortunately, both teams confirmed that a number of series identified through isADMET profiling displayed good overall in vitro ADMET profiles, and revisiting series predicted to be unfavorable was not required. Following experimental characterization of several examples of each chemically validated series, we found that series predicted to be attractive from an isADMET perspective were confirmed to display attractive lead-like properties (Figure 7).
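Choosing compounds that sample the chemical diversity of a series while retaining good isADMET predictions can be approximated with a greedy MaxMin selection. The text does not state which selection method the teams used, so the sketch below is an assumption; the feature sets, the `favorable` flag, and all names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of structural feature identifiers."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def pick_diverse(feature_sets, n_picks):
    """Greedy MaxMin: repeatedly add the compound farthest from those already picked."""
    names = sorted(feature_sets)
    if not names or n_picks <= 0:
        return []
    picked = [names[0]]  # arbitrary but deterministic seed compound
    while len(picked) < min(n_picks, len(names)):
        best_name, best_distance = None, -1.0
        for name in names:
            if name in picked:
                continue
            # Distance to the nearest compound that is already selected.
            nearest = min(
                1.0 - tanimoto(feature_sets[name], feature_sets[p]) for p in picked
            )
            if nearest > best_distance:
                best_name, best_distance = name, nearest
        picked.append(best_name)
    return picked


# Hypothetical series members: toy feature sets plus an isADMET verdict.
compounds = {
    "cpd_a": {"features": {1, 2, 3, 4}, "favorable": True},
    "cpd_b": {"features": {1, 2, 3, 5}, "favorable": True},
    "cpd_c": {"features": {7, 8, 9}, "favorable": True},
    "cpd_d": {"features": {10, 11}, "favorable": False},
}

# Keep only compounds with a favorable predicted profile, then pick diverse ones.
favorable = {name: c["features"] for name, c in compounds.items() if c["favorable"]}
selection = pick_diverse(favorable, 2)
print(selection)
```

Filtering on the predicted profile first and diversifying second mirrors the stated intent: the structurally distinct `cpd_c` is chosen over the near-duplicate `cpd_b`, and `cpd_d` is never considered.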

Compounds that were prospectively identified by QSAR as having low human microsomal CLint values generally exhibited good enrichment for low to moderate measured CLint (Figures 7A and 7B). Generally speaking, data for compounds fall within 3-fold of their predicted values and are categorically consistent with the predictions. Figures 7C-7F depict the relationship between experimental and predicted Papp and human P-gp efflux ratios, in which compounds are shaped and colored by series and the line indicates unity. Measured Papp values were, with few exceptions, categorically consistent with predictions (Figures 7C and 7D). Similarly, the predictions for P-gp efflux generally led to the identification of non-P-gp substrates or series with tractable P-gp efflux (orange and green-brown triangles, Figure 7E, and green square, Figure 7F), with the systematic exception of one series (blue pentagons, Figure 7E). Employing this strategy, both teams were able to leverage isADMET predictions to prioritize efforts toward identifying chemical matter that possessed favorable measured ADMET properties.

We have also incorporated isADMET predictions into multi-parameter optimization (MPO) scoring functions. Teams are frequently confronted with many series possessing similar isADMET profiles, and their differentiation, classification, and assignment to different workflows can become complicated. Converting isADMET profiles to numerical scores facilitates consistency in decision-making while also enabling the sorting of hit series by their isADMET profile scores. For instance, a program team seeking molecules that are peripherally restricted could devise a scoring function that rewards series displaying high predicted P-gp efflux ratios and low predicted rat in vivo CLint. Scores averaged over each series would then allow teams to sort their hits to quickly identify hit series that consistently display the desired property profile. This may be particularly beneficial when there are large numbers of compounds that require prioritization, such as prior to biological validation or earlier steps in the screening process (Figure 1). However, we caution that over-weighting isADMET profiles prior to acquiring broader biological datasets can result in a loss of pharmacological diversity. Biological validation often includes detailed studies important for identifying and maintaining a healthy portfolio of chemical matter (for example, by demonstrating binding to a specific site on a protein in order to affect a particular pharmacodynamic response). In principle, diversity is never irrevocably lost because teams can always return to earlier steps in the screening process to “retriage” with different criteria; in practice, this can be difficult to accomplish for both technical and nontechnical reasons. Project 5, an infectious disease program, serves as an example where an MPO scoring function based on isADMET properties resulted in the identification of a series that addressed a key liability.

A number of series identified by Project 5 displayed a risk associated with the inhibition of hERG (ref 26) based on their isADMET profiles. For one of these series, Series 3, nine molecules had historical data from our hERG binding assay and five of these displayed Ki values ranging from 100 nM to 10 μM. Although predictions for this series were not prospectively validated, our QSAR model flagged the series as having a propensity for hERG binding, as indicated by the large proportion of molecules predicted to display ~1 μM affinity in our binding assay (Figure 8A). In order to help the team focus on series predicted to be less prone to hERG binding, an MPO scoring function based on predicted hERG binding affinity and on-target potency and selectivity was employed. Through this approach, the team immediately identified a member from Series 4 that was anticipated to display a substantial advantage over many other series with respect to hERG binding. Although this series displays undesirable predicted CLint properties (Figure 8B), it still attracted attention since hERG binding and inhibition represented a key liability for the team. The potential for reduced hERG binding by Series 4 is clearly demonstrated in the expanded series predictions, shown in Figure 8B, by the lower fraction of compounds predicted to exhibit hERG binding. Following experimental characterization in our hERG binding assay, seven compounds from Series 4 were shown not to display hERG channel binding (Ki values >60 μM, data not shown), in agreement with our QSAR model predictions.

In addition to the case examples described above, it is worth noting that other predicted endpoints can be added to the profiles without significantly complicating their analysis. Heat maps assist in the digestion of large datasets (both in terms of number of rows and columns), and consequently we now include other endpoints related to safety concerns and drug-drug interaction potential (reversible cytochrome P450 inhibition, CYP3A induction, and hERG, Cav1.2, and Nav1.5 ion channel inhibition). We believe that any statistically robust QSAR model which adds data that are relevant to decision-making processes can and should be utilized within this approach.
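The series-level MPO scoring discussed in the case examples (for instance, rewarding high predicted P-gp efflux ratios and low predicted rat in vivo CLint for a peripherally restricted profile) might be sketched as follows. The linear desirability ramps, the equal weighting, the ramp end points (borrowed here from the extremes of Table 1), and the example predictions are all assumptions made for illustration, not the scoring function used by the authors.

```python
from collections import defaultdict


def ramp(value, low, high):
    """Linear desirability: 0 at or below `low`, 1 at or above `high`."""
    fraction = (value - low) / (high - low)
    return max(0.0, min(1.0, fraction))


def peripheral_score(pgp_efflux_ratio, rat_in_vivo_clint):
    """Score in [0, 1]; higher means closer to the assumed peripheral profile."""
    efflux_term = ramp(pgp_efflux_ratio, 1.0, 5.0)           # reward high efflux
    clint_term = 1.0 - ramp(rat_in_vivo_clint, 1.0, 5000.0)  # reward low CLint
    return (efflux_term + clint_term) / 2.0                  # assumed equal weights


def rank_series(predictions):
    """Average compound scores per series and sort series best-first."""
    scores = defaultdict(list)
    for series, efflux, clint in predictions:
        scores[series].append(peripheral_score(efflux, clint))
    means = {series: sum(vals) / len(vals) for series, vals in scores.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Invented predictions: (series, P-gp efflux ratio, rat in vivo CLint).
predictions = [
    ("Series X", 5.0, 1.0),
    ("Series X", 3.0, 1.0),
    ("Series Y", 1.0, 5000.0),
    ("Series Y", 1.0, 2500.5),
]
ranked = rank_series(predictions)
for series, score in ranked:
    print(f"{series}: mean score {score:.3f}")
```

Averaging per series, rather than taking the best compound, favors series that consistently show the desired profile. Note that a predicted in vivo CLint of 0 is used as a sentinel in the heat map legends (predicted CLp exceeding hepatic blood flow), so such values would need to be filtered out or penalized before scoring.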

Conclusions

Described herein is an approach to inform drug discovery program teams of the in silico properties of the molecules identified through screening and to incorporate these properties into isADMET profiles. The intent of this Perspective is to raise awareness of and advocate for the utilization of QSAR models in the characterization, prioritization, and pursuit of chemical matter as early as possible in the discovery process. By incorporating these data in readily digestible heat map visualizations, teams can rapidly analyze the anticipated ADMET strengths and liabilities of their hit series. The retrospective and prospective application of our approach has shown that leveraging isADMET profiles results in the identification and selection of previously unidentified chemical matter. We continue to develop and refine this approach as part of our hit identification practices and encourage the medicinal chemistry community to consider similar approaches in their drug discovery efforts.

Figure 1. Schematic depicting the work flow and decision points during and after a typical HTS campaign. Orange boxes indicate assays conducted in a high-throughput format; blue boxes indicate assays conducted by the program team; the open arrow indicates the transition from the HTS screening group to the local project team.

Figure 2. Categorization accuracy for human P-gp efflux ratio by the QSAR model (A) and by PSA (B and C). QSAR data shown are prospective predictions made for 2,403 compounds from more than 50 therapeutic programs that were measured between January 2015 and March 2016. Bar charts are colored by experimental P-gp efflux ratio in an MDR1-overexpressing LLC-PK1 cell line for compounds demonstrating good Papp and a lack of endogenous transport in a control LLC-PK1 cell line.

Figure 3. Representative isADMET heat maps for single hit series displaying potentially desirable (A) and undesirable (B) predicted property profiles. Rows represent individual compounds. In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.

Figure 4. Retrospective comparison of different hit selection methods for Project 1. Data points shown are representative molecules from distinct chemical series that were biologically validated. Representative molecules were chosen to indicate the highest level of activity reached by a given series during Project 1. Data points are jittered for clarity.

Figure 5. Plot of human microsomal CLint vs. human P-gp efflux ratio for Project 2 hits. Each data point reflects a representative molecule chosen to characterize a unique chemical series. Blue data points represent compounds chosen by the chemistry team for chemical validation; squares represent compounds chosen for their isADMET properties. Note that there are several overlapping compounds identified as blue squares.

Figure 6. A, isADMET heat map for Series 1 from Project 2, which contained only two compounds from our sample collection. B, isADMET heat map for a virtual library enumerated around the core of Series 1 from Project 2. Rows represent individual compounds. The boxed regions indicate where one R-group was held constant across library members (phenyl and oxazole, as indicated). In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.

Figure 7. Predicted vs. measured scatterplots for human microsomal CLint (A, Project 3; B, Project 4), Papp (C, Project 3; D, Project 4), and human P-gp efflux ratio (E, Project 3; F, Project 4). Compounds are shaped and colored by series, and the colors and shapes differ between Projects 3 and 4. CLint values are measured in mL·min⁻¹·kg⁻¹; Papp values are measured in 10⁻⁶ cm·s⁻¹.
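The agreement between predicted and measured values in plots of this kind is commonly summarized as a fold error, and the text describes most compounds as falling within 3-fold of prediction. A minimal sketch of that check is given below; the function names and the example value pairs are hypothetical and are not data from the figure.

```python
def fold_error(predicted, measured):
    """Symmetric fold difference between two positive values (always >= 1)."""
    if predicted <= 0 or measured <= 0:
        raise ValueError("fold error is defined for positive values only")
    return max(predicted, measured) / min(predicted, measured)


def fraction_within_fold(pairs, fold=3.0):
    """Fraction of (predicted, measured) pairs that agree within `fold`."""
    if not pairs:
        raise ValueError("at least one pair is required")
    within = sum(
        1 for predicted, measured in pairs if fold_error(predicted, measured) <= fold
    )
    return within / len(pairs)


# Invented (predicted, measured) CLint pairs for illustration only.
pairs = [(10.0, 20.0), (50.0, 40.0), (100.0, 20.0), (5.0, 15.0)]
print(fraction_within_fold(pairs))
```

Using the larger value over the smaller treats over- and under-prediction symmetrically, which corresponds to a band of constant width around the unity line on a log-log plot.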

Figure 8. isADMET heat maps from Project 5. A, Series 3, which was consistently predicted to bind to hERG. B, Series 4, which displayed less propensity for predicted hERG binding. Rows represent individual compounds.
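The series-level comparison in this figure amounts to asking what fraction of each series is predicted to bind hERG. A minimal sketch of that summary follows; the 10 μM cutoff, the function name, and the predicted Ki values are assumptions chosen for illustration and are not taken from the figure.

```python
def herg_flag_fraction(predicted_ki_um, cutoff_um=10.0):
    """Fraction of compounds whose predicted hERG Ki (in uM) is below the cutoff."""
    if not predicted_ki_um:
        raise ValueError("series has no predictions")
    flagged = sum(1 for ki in predicted_ki_um if ki < cutoff_um)
    return flagged / len(predicted_ki_um)


# Invented predicted Ki values (uM) for two hypothetical series.
series_predictions = {
    "series_p": [0.8, 1.2, 1.0, 30.0],
    "series_q": [60.0, 80.0, 5.0, 100.0],
}

fractions = {
    name: herg_flag_fraction(values) for name, values in series_predictions.items()
}
# List the series with the lowest predicted hERG liability first.
for name, fraction in sorted(fractions.items(), key=lambda item: item[1]):
    print(f"{name}: {fraction:.0%} of members predicted below cutoff")
```

A per-series fraction is less sensitive to a single outlying prediction than a best-compound comparison, which matters when the underlying model has not been prospectively validated for the series.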

Table 1. isADMET heat map color gradient set points used for lead ID projects.

in silico model                          Desired value (green)   Moderate value (yellow)   Undesired value (red)
Papp (10⁻⁶ cm·s⁻¹)                       20                      10                        1
P-gp efflux ratio                        1                       3                         5
Human microsomal CLint (mL·min⁻¹·kg⁻¹)   1                       125                       250
Rat microsomal CLint (mL·min⁻¹·kg⁻¹)     1                       250                       500
Rat in vivo CLint (mL·min⁻¹·kg⁻¹)        1                       2500                      5000
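The set points in Table 1 can be turned into a three-color gradient for a heat map cell. The sketch below is one possible reading, not the authors' implementation: it assumes linear interpolation between set points (a logarithmic scale may be more appropriate for clearance), assumes particular RGB values for green, yellow, and red, and uses hypothetical endpoint keys. The special case for a rat in vivo CLint prediction of 0 follows the figure legends, which color such values red.

```python
# Set points from Table 1 as (desired, moderate, undesired).
SET_POINTS = {
    "papp": (20.0, 10.0, 1.0),
    "pgp_efflux_ratio": (1.0, 3.0, 5.0),
    "human_mic_clint": (1.0, 125.0, 250.0),
    "rat_mic_clint": (1.0, 250.0, 500.0),
    "rat_in_vivo_clint": (1.0, 2500.0, 5000.0),
}

# Assumed RGB triples for the three anchor colors.
GREEN, YELLOW, RED = (0, 170, 0), (255, 255, 0), (255, 0, 0)


def _blend(start, end, t):
    """Linear blend from `start` (t = 0) to `end` (t = 1)."""
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


def heat_color(endpoint, value):
    """Map a predicted value to an RGB color using the Table 1 set points."""
    desired, moderate, undesired = SET_POINTS[endpoint]
    # A prediction of 0 flags predicted CLp above hepatic blood flow.
    if endpoint == "rat_in_vivo_clint" and value == 0:
        return RED
    # Works whether desirable values are high (Papp) or low (CLint, efflux).
    if (value - moderate) * (desired - moderate) >= 0:
        t = (value - moderate) / (desired - moderate)
        return _blend(YELLOW, GREEN, min(1.0, t))
    t = (value - moderate) / (undesired - moderate)
    return _blend(YELLOW, RED, min(1.0, t))


print(heat_color("papp", 25.0))
print(heat_color("human_mic_clint", 300.0))
```

Values beyond the desired or undesired set point are clamped to pure green or pure red, so extreme predictions do not distort the color scale for the rest of the column.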

Corresponding Authors

[email protected] [email protected]

Present Addresses

∥Eli Lilly and Company, Lilly Corporate Center, DC0710, Indianapolis, IN 46285, USA.

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript. ⊥These authors contributed equally.

Conflict of Interest Disclosure

Authors are current or former employees of Merck & Co., Inc., Whitehouse Station, NJ, USA and potentially own stock and/or hold stock options in the Company.

Acknowledgment

The authors thank M. Kate Holloway for careful reading of the manuscript and constructive comments and David M. Tellers for discussion and support.

Abbreviations

ADMET, absorption, distribution, metabolism, excretion, and toxicity; CLint, intrinsic clearance; CLp, plasma clearance; HTS, high throughput screening; fu,mic, unbound fraction in microsomes; fu,p, unbound fraction in plasma; isADMET, in silico ADMET; Papp, apparent permeability; P-gp, P-glycoprotein; QSAR, quantitative structure/activity relationship.

Biography

John M. Sanders is a computational chemist at Merck & Co., Inc. in West Point, Pennsylvania, USA. He obtained a B.A. in chemistry from Colgate University (2000) and a Ph.D. in chemistry from the University of Illinois under the guidance of Eric Oldfield (2005).

Douglas C. Beshore is a medicinal chemist at Merck & Co., Inc. in West Point, Pennsylvania, USA. He obtained his B.S. in biochemistry from Albright College (1997) and his Ph.D. from the University of Pennsylvania (2007) under the guidance of Prof. Amos B Smith, III.

J. Christopher Culberson joined Merck & Co., Inc. in West Point, Pennsylvania, USA in 1991 after spending 4 years at Nutrasweet designing new sweeteners. As a member of the molecular modeling group, he was directly involved in oxytocin antagonist, HIV-RT inhibitor, and farnesyl transferase inhibitor projects, and has an interest in the hERG ion channel. He actively participated in the design and analysis of our screening collection and screening results in order to build a chemical collection that supports drug discovery efforts. He received his Ph.D. degree in physical chemistry from the University of Florida under the direction of M.C. Zerner and W.L. Luken.

James Fells is a computational chemist in the Modeling & Informatics group at Merck & Co., Inc. in Rahway, New Jersey, USA. Prior to joining his company in 2014, he obtained his M.S. and Ph.D. in organic chemistry at The University of Memphis, followed by a postdoctoral fellowship in pharmacology at The University of Tennessee Health Science Center.

Jason Imbriglio received his Ph.D. from the University of Arizona and was an NIH Postdoctoral Fellow at Boston College. Jason joined Merck & Co., Inc. in Rahway, New Jersey, USA in the discovery chemistry group in 2004, where he currently serves as a Director. Jason’s work at his company has included all phases of discovery chemistry, including chemical biology, target validation, lead identification, and lead optimization, across a number of different therapeutic areas. Jason is an alumnus of the University of Massachusetts, Amherst.

Hakan Gunaydin is a computational chemist at Merck & Co., Inc. in Boston, Massachusetts, USA. He earned B.S. and M.S. degrees in Chemistry from Bogazici University in Turkey in 2001 and 2003, respectively. He earned his Ph.D. in Biochemistry and Molecular Biology from UCLA in 2008 working in the Houk laboratory on the modeling of reaction mechanisms and de novo enzyme design. He worked at Amgen, Inc. for five years before joining his current company in 2014.

Andrew Haidle is a Principal Scientist in medicinal chemistry at Merck & Co., Inc. in Boston, Massachusetts, USA. He earned his B.S. in chemistry and cell/molecular biology in 1998 from the University of Michigan, where he worked on bioorganic chemistry projects involving oligosaccharyltransferase in the lab of Prof. James Coward. He then pursued the total synthesis of members of the cytochalasin natural product family under the direction of Prof. Andrew G. Myers at Harvard University, obtaining his Ph.D. from the Chemistry and Chemical Biology department in 2004. He has worked on both early and late stage discovery projects in multiple therapeutic areas at Pfizer Inc. and his current company.

Marc A. Labroli is an Associate Principal Scientist in medicinal chemistry at Merck & Co., Inc. in West Point, Pennsylvania, USA. He earned a B.S. in chemistry from Villanova University in 1992, working in the laboratory of Dr. Walter Zajac. He then earned a Ph.D. in chemistry from The University of Virginia in 1997 under the supervision of Professor Timothy Macdonald, working on topoisomerase inhibitors. Following postdoctoral research studies at the Scripps Research Institute in La Jolla, CA under the supervision of Professor Dale Boger, he joined Schering-Plough in 2000, which merged with his current company in 2009.

Brian E. Mattioni is an ADME scientist at Eli Lilly in Indianapolis, Indiana. He earned his B.S. in chemistry from Wingate University and Ph.D. in computational chemistry from the Pennsylvania State University under Professor Peter Jurs.

Nunzio Sciammetta is a medicinal chemist at Merck & Co., Inc. in Boston, Massachusetts, USA. He leads a medicinal chemistry team with a focus on enabling synthesis, applying computational design and new chemical technologies to medicinal chemistry projects. His current research interests include parallel medicinal chemistry, property based target design, hit-to-lead discovery, and bRo5 macrocycles in drug discovery. He joined his current company in 2012 and prior to this was a medicinal chemist at the Pfizer Sandwich Laboratories, UK, and Leo Pharma, Denmark. Nunzio obtained a Ph.D. in organic synthesis from the University of Manchester, UK (1994) under the supervision of Dr. Andrew C. Regan and undertook postdoctoral studies at the University of Milan, Italy (Prof. C. Scolastico) and Leeds University, UK (Prof. R. Grigg).

William D. Shipe is an Associate Principal Scientist in medicinal chemistry at Merck & Co., Inc. in West Point, Pennsylvania, USA. He earned a B.S. in chemistry with a minor in mathematics from The Pennsylvania State University in 1999, working in the laboratory of Prof. Raymond L. Funk on the development of new methods for asymmetric induction via chiral auxiliaries and catalysts. He then earned a Ph.D. in chemistry from The Scripps Research Institute in 2004 under the supervision of Prof. Erik J. Sorensen, completing enantioselective total syntheses of the diterpene guanacastepenes A and E. After leaving Scripps in 2003, he spent one year as a visiting graduate student at Princeton University before joining his company in 2004.

Robert Sheridan is a computational chemist and cheminformatics expert at Merck & Co., Inc. in Rahway, New Jersey, USA. He earned his Ph.D. in biochemistry from Princeton University and did his postdoctoral work at Fox Chase Cancer Center and Rutgers University before joining Lederle Laboratories in 1983. He joined his current company in 1991 and concentrated on developing new methods in molecular modeling, first in the field of virtual screening, and currently in QSAR.

Linda M. Suen is a Senior Scientist in medicinal chemistry at Merck & Co., Inc. in West Point, Pennsylvania, USA. She earned a B.A. in biochemistry from Barnard College and Ph.D. in organic chemistry from Columbia University under the supervision of Prof. James Leighton.

Andreas Verras is a computational chemist at Merck & Co., Inc. in Kenilworth, New Jersey, USA. He earned a B.S. in chemistry and a B.A. in creative writing from Emory University. His Ph.D. was conducted at UCSF in the labs of Paul Ortiz de Montellano and Tack Kuntz. Prior to joining his current company, he worked at Syngenta Crop Protection in Basel, Switzerland.

Abbas Walji received his Ph.D. from University at Buffalo, The State University of New York and was a postdoctoral scholar at the California Institute of Technology and Princeton University. Abbas joined Merck & Co., Inc. in West Point, Pennsylvania, USA in 2007, where he currently serves as a medicinal chemist in the Discovery Chemistry Modalities group.

Elizabeth M. Joshi obtained a B.S. at Mary Washington College, followed by her Ph.D. in chemistry under the supervision of Professor Timothy L. Macdonald at the University of Virginia. Following a number of years working as an ADME research scientist at Eli Lilly and Company, she joined Merck & Co., Inc. in Kenilworth, New Jersey, USA in 2013. Her work has focused on model development in the areas of drug-induced liver injury and drug metabolism, as well as the early application of in silico ADME tools to inform decision making in discovery.

Tjerk Bueters works at Merck & Co., Inc. in West Point, Pennsylvania, USA as a lead for translational PKPD, offering strategic, educational, and hands-on support to develop and execute translational plans in the neuroscience and cardiovascular disease areas. In addition, he serves on several multidisciplinary drug discovery and early development teams, in which he is responsible for the overall pharmacokinetic, pharmacodynamic, and drug metabolism contributions. Before joining his current company, he worked for 8 years at the neuroscience unit of AstraZeneca in similar roles. Dr. Bueters has a Ph.D. in quantitative pharmacology from Leiden University, The Netherlands, and did his postdoctoral research in neuroscience at the Karolinska Institute in Sweden.

References

1. Mayr, L. M.; Fuerst, P. The future of high-throughput screening. J. Biomol. Screening 2008, 13, 443-448.

2. Macarron, R.; Banks, M. N.; Bojanic, D.; Burns, D. J.; Cirovic, D. A.; Garyantes, T.; Green, D. V. S.; Hertzberg, R. P.; Janzen, W. P.; Paslay, J. W.; Schopfer, U.; Sittampalam, G. S. Impact of high-throughput screening in biomedical research. Nat. Rev. Drug Discovery 2011, 10, 188-195.

3. Thorne, N.; Auld, D. S.; Inglese, J. Apparent activity in high-throughput screening: origins of compound-dependent assay interference. Curr. Opin. Chem. Biol. 2010, 14, 315-324.

4. Hughes, J. P.; Kalindjian, S. B.; Philpott, K. L. Principles of early drug discovery. Br. J. Pharmacol. 2011, 162, 1239-1249.

5. Kerns, E. H.; Di, L. Drug-like Properties: Concepts, Structure Design and Methods: From ADME to Toxicity Optimization. Academic Press: San Diego, CA, 2008.

6. Bleicher, K. H.; Bohm, H. J.; Muller, K.; Alanine, A. I. Hit and lead generation: beyond high-throughput screening. Nat. Rev. Drug Discovery 2003, 2, 369-378.

7. Davis, A. M.; Keeling, D. J.; Steele, J.; Tomkinson, N. P.; Tinker, A. C. Components of successful lead generation. Curr. Top. Med. Chem. 2005, 5, 421-439.

8. Ballard, P.; Brassil, P.; Bui, K. H.; Dolgos, H.; Petersson, C.; Tunek, A.; Webborn, P. J. The right compound in the right assay at the right time: an integrated discovery DMPK strategy. Drug Metab. Rev. 2012, 44, 224-252.

9. van de Waterbeemd, H.; Gifford, E. ADMET in silico modelling: towards prediction paradise? Nat. Rev. Drug Discovery 2003, 2, 192-204.

10. Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Delivery Rev. 2001, 46, 3-26.

11. Leeson, P. D.; Springthorpe, B. The influence of drug-like concepts on decision-making in medicinal chemistry. Nat. Rev. Drug Discovery 2007, 6, 881-890.

12. Bueters, T.; Gibson, C.; Visser, S. A. Optimization of human dose prediction by using quantitative and translational pharmacology in drug discovery. Future Med. Chem. 2015, 7, 2351-2369.

13. Gleeson, M. P.; Hersey, A.; Montanari, D.; Overington, J. Probing the links between in vitro potency, ADMET and physicochemical parameters. Nat. Rev. Drug Discovery 2011, 10, 197-208.

14. Hop, C. E.; Cole, M. J.; Davidson, R. E.; Duignan, D. B.; Federico, J.; Janiszewski, J. S.; Jenkins, K.; Krueger, S.; Lebowitz, R.; Liston, T. E.; Mitchell, W.; Snyder, M.; Steyn, S. J.; Soglia, J. R.; Taylor, C.; Troutman, M. D.; Umland, J.; West, M.; Whalen, K. M.; Zelesky, V.; Zhao, S. X. High throughput ADME screening: practical considerations, impact on the portfolio and enabler of in silico ADME models. Curr. Drug Metab. 2008, 9, 847-853.

15. Leeson, P. D.; Young, R. J. Molecular property design: does everyone get it? ACS Med. Chem. Lett. 2015, 6, 722-725.

16. Desai, P. V.; Sawada, G. A.; Watson, I. A.; Raub, T. J. Integration of in silico and in vitro tools for scaffold optimization during drug discovery: predicting P-glycoprotein efflux. Mol. Pharmaceutics 2013, 10, 1249-1261.

17. Sherer, E. C.; Verras, A.; Madeira, M.; Hagmann, W. K.; Sheridan, R. P.; Roberts, D.; Bleasby, K.; Cornell, W. D. QSAR prediction of passive permeability in the LLC-PK1 cell line: trends in molecular properties and cross-prediction of Caco-2 permeabilities. Mol. Inf. 2012, 31, 231-245.

18. Sheridan, R. P.; McMasters, D. R.; Voigt, J. H.; Wildey, M. J. eCounterscreening: using QSAR predictions to prioritize testing for off-target activities and setting the balance between benefit and risk. J. Chem. Inf. Model. 2015, 55, 231-238.

19. Mahar Doan, K. M.; Humphreys, J. E.; Webster, L. O.; Wring, S. A.; Shampine, L. J.; Serabjit-Singh, C. J.; Adkison, K. K.; Polli, J. W. Passive permeability and P-glycoprotein-mediated efflux differentiate central nervous system (CNS) and non-CNS marketed drugs. J. Pharmacol. Exp. Ther. 2002, 303, 1029-1037.

20. Giuliano, C.; Jairaj, M.; Zafiu, C. M.; Laufer, R. Direct determination of unbound intrinsic drug clearance in the microsomal stability assay. Drug Metab. Dispos. 2005, 33, 1319-1324.

21. Turner, D. B.; Yeo, K. R.; Tucker, G. T.; Rostami-Hodjegan, A. Prediction of nonspecific hepatic microsomal binding from readily available physicochemical properties. Drug Metab. Rev. 2006, 38, 162-162.

22. Yang, J. S.; Jamei, M.; Yeo, K. R.; Rostami-Hodjegan, A.; Tucker, G. T. Misuse of the well-stirred model of hepatic drug clearance. Drug Metab. Dispos. 2007, 35, 501-502.

23. Eisen, M. B.; Spellman, P. T.; Brown, P. O.; Botstein, D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U. S. A. 1998, 95, 14863-14868.

24. Erlanson, D. A.; McDowell, R. S.; O'Brien, T. Fragment-based drug discovery. J. Med. Chem. 2004, 47, 3463-3482.

25. Hopkins, A. L.; Keseru, G. M.; Leeson, P. D.; Rees, D. C.; Reynolds, C. H. The role of ligand efficiency metrics in drug discovery. Nat. Rev. Drug Discovery 2014, 13, 105-121.

26. Sanguinetti, M. C.; Tristani-Firouzi, M. hERG potassium channels and cardiac arrhythmia. Nature 2006, 440, 463-469.

Figure 1. Schematic depicting the workflow and decision points during and after a typical HTS campaign. Orange boxes indicate assays conducted in a high-throughput format; blue boxes indicate assays conducted by the program team; the open arrow indicates the transition from the HTS group to the local project team.

Figure 2. Categorization accuracy for human P-gp efflux ratio by the QSAR model (A) and by PSA (B and C). QSAR data shown are prospective predictions for 2,403 compounds, drawn from more than 50 therapeutic programs, that were measured between January 2015 and March 2016. Bar charts are colored by experimental P-gp efflux ratio in an MDR1-overexpressing LLC-PK1 cell line for compounds demonstrating good Papp and a lack of endogenous transport in a control LLC-PK1 cell line.
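As a rough illustration of what a categorization accuracy such as the one summarized in Figure 2 measures, the sketch below bins predicted and measured efflux ratios with the same cutoffs and counts how often the bins agree. The cutoff values (3.0 and 10.0), the function names, and the example numbers are hypothetical and are not taken from the manuscript; the actual bin boundaries used by the authors are not stated in this caption.

```python
from collections import Counter


def categorize(value, cutoffs):
    """Return the bin index for value (0 = below the first cutoff)."""
    return sum(value >= c for c in cutoffs)


def confusion_counts(predicted, measured, cutoffs):
    """Count compounds per (predicted bin, measured bin) pair."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    return Counter(
        (categorize(p, cutoffs), categorize(m, cutoffs))
        for p, m in zip(predicted, measured)
    )


def categorization_accuracy(predicted, measured, cutoffs):
    """Fraction of compounds whose predicted and measured bins agree."""
    counts = confusion_counts(predicted, measured, cutoffs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no compounds supplied")
    agree = sum(n for (p_bin, m_bin), n in counts.items() if p_bin == m_bin)
    return agree / total


# Hypothetical cutoffs separating low / moderate / high efflux ratio.
CUTOFFS = (3.0, 10.0)

# Made-up efflux ratios, for illustration only.
predicted = [1.0, 4.0, 12.0, 2.0]
measured = [1.5, 2.0, 15.0, 6.0]

accuracy = categorization_accuracy(predicted, measured, CUTOFFS)
table = confusion_counts(predicted, measured, CUTOFFS)
```

The same two functions can be applied to a single descriptor such as PSA by replacing the predicted efflux ratios with descriptor-derived categories, which is one way the model and the descriptor could be compared on equal footing.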

Figure 3. Representative isADMET heat maps for single hit series displaying potentially desirable (A) and undesirable (B) predicted property profiles. Rows represent individual compounds. In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.
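One plausible reading of the "CLint predictions of 0" convention in the legend is that in vivo CLint is back-calculated from predicted plasma clearance with a liver model that has no valid solution once clearance reaches hepatic blood flow. The sketch below uses the well-stirred model as an assumed example; the caption does not state which model the authors applied. The function name, the use of 0.0 as a sentinel, the default unbound fraction, and the treatment of plasma clearance as hepatic blood clearance (blood-to-plasma ratio of 1, no extrahepatic clearance) are all assumptions made for illustration.

```python
def in_vivo_clint(cl_p, q_h, fu=1.0):
    """Back-calculate in vivo intrinsic clearance (assumed well-stirred model).

    Well-stirred model:  CLh = Qh * fu * CLint / (Qh + fu * CLint)
    Rearranged:          CLint = CLh * Qh / (fu * (Qh - CLh))

    cl_p : predicted plasma clearance, treated here as hepatic clearance
    q_h  : hepatic blood flow, in the same units as cl_p
    fu   : unbound fraction (assumed 1.0 when not supplied)

    Returns 0.0 as a sentinel when cl_p >= q_h, because the rearranged
    expression is undefined or negative in that regime.
    """
    if q_h <= 0 or fu <= 0:
        raise ValueError("q_h and fu must be positive")
    if cl_p >= q_h:
        return 0.0  # flagged value: no physically meaningful CLint
    return cl_p * q_h / (fu * (q_h - cl_p))


# Illustrative numbers only (units: mL/min/kg), not from the manuscript.
moderate = in_vivo_clint(cl_p=10.0, q_h=20.0)           # well below blood flow
bound = in_vivo_clint(cl_p=10.0, q_h=20.0, fu=0.5)      # lower unbound fraction
flagged = in_vivo_clint(cl_p=25.0, q_h=20.0)            # exceeds blood flow
```

Under these assumptions a value of 0 does not mean "no clearance"; it marks compounds whose predicted clearance is too high for the model to invert, which is why such compounds would be colored as a liability rather than as a favorable result.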

Figure 4. Retrospective comparison of different hit selection methods for Project 1. Data points shown are representative molecules from distinct chemical series that were biologically validated. Representative molecules were chosen to indicate the highest level of activity reached by a given series during Project 1. Data points are jittered for clarity.

Figure 5. Plot of human microsomal CLint vs. human P-gp efflux ratio for Project 2 hits. Each data point reflects a representative molecule chosen to characterize a unique chemical series. Blue data points represent compounds chosen by the chemistry team for chemical validation; squares represent compounds chosen for their isADMET properties. Several compounds were chosen by both approaches and appear as blue squares.

Figure 6. A, isADMET heat map for Series 1 from Project 2, which contained only 2 compounds from our sample collection. B, isADMET heat map for a virtual library enumerated around the core of Series 1 from Project 2. Rows represent individual compounds. The boxed regions indicate where one R-group was held constant across library members (phenyl and oxazole, as indicated). In the legend, the red coloring used for in vivo CLint predictions of 0 accounts for compounds with predicted CLp values that exceed hepatic blood flow.

Figure 7. Predicted vs. measured scatterplots for human microsomal CLint (A, Project 3; B, Project 4), Papp (C, Project 3; D, Project 4), and human P-gp efflux ratio (E, Project 3; F, Project 4). Data points are shaped and colored by series; the colors and shapes differ between Projects 3 and 4. CLint values are reported in mL·min⁻¹·kg⁻¹; Papp values are reported in 10⁻⁶ cm·s⁻¹.

Figure 8. isADMET heat maps from Program 5. A, Series 3, which was consistently predicted to bind to hERG. B, Series 4, which displayed less propensity for predicted hERG binding. Rows represent individual compounds.

TOC Graphic
