Emerging Approaches for the Identification of Protein Targets of Small


Perspective

Identification of Direct Protein Targets of Small Molecules Kenneth M. Comess, Shaun M McLoughlin, Jon A Oyer, Paul L Richardson, Henning Stockmann, Anil Vasudevan, and Scott E. Warder J. Med. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jmedchem.7b01921 • Publication Date (Web): 02 May 2018 Downloaded from http://pubs.acs.org on May 2, 2018



Journal of Medicinal Chemistry

Identification of Direct Protein Targets of Small Molecules

Kenneth M Comess, Shaun M McLoughlin, Jon A Oyer, Paul L Richardson, Henning Stoeckmann, Anil Vasudevan* and Scott E Warder

AbbVie Inc., 1 Waukegan Rd, N Chicago, IL 60064-1802

ACS Paragon Plus Environment


ABSTRACT

Small molecule (SM) leads in the early drug discovery pipeline are progressed primarily based on potency against the intended target(s) and selectivity against a very narrow slice of the proteome. So, why is there a tendency to wait until SMs are matured before probing for a deeper mechanistic understanding? For one, there is a concern about the interpretation of complex -omic data outputs and the resources needed to test the resulting hypotheses. However, with recent advances in broad-endpoint profiling assays that have companion reference databases and refined technology integration strategies, we argue that data complexity can be translated into meaningful decision-making. This same strategy can also prioritize phenotypic screening hits to increase the likelihood of accessing unprecedented target space. In this Perspective we highlight a cohesive process that supports SM hit prosecution, providing a data-driven rationale and a suite of methods for direct identification of SM targets driving relevant biological endpoints.


1.0 INTRODUCTION

The pharmaceutical industry strives to discover safe and efficacious molecules that modulate disease-dependent biology, with the ultimate goal of identifying and treating patients responsive to therapy. This mission presents many challenges, including finding sources of new therapeutic drug targets and selecting small molecules (SMs) with the highest likelihood of success. Although Phenotypic Drug Discovery (PDD) has contributed to the discovery of best-in-class medicines by connecting SMs to unique mechanisms of action (MOA), this approach presents significant complexity1-4. Ideally, PDD campaigns utilize physiologically relevant models that are linked to patient-derived biology, which in turn may self-select SM phenotypic hits affecting pathways and protein targets most relevant to the disease of interest. To enable this disease relevance, important considerations and assay design criteria have been encouraged, including the “Phenotypic Rule of 3”5 and the concept of a chain of translatability3. However, hit progression from PDD is often impeded by the resource-intensive and unpredictable timelines associated with SM-target deconvolution campaigns. As a result, the industry relies mainly on Target-based Drug Discovery (TDD), with a large number of the prosecuted targets originating from the literature. In both cases, SM prioritization would benefit from a deeper appreciation of the MOA and target engagement profile6. Recently, PDD has experienced a renewed interest from drug hunters and sparked a series of lively discussions and debates. One of the most common questions is: “PDD or TDD?” The pros and cons of each have been thoroughly summarized in a recent review3. Fortunately, the field is recognizing the value of a balanced approach with potential to uncover unique biology.
For example, a PDD campaign that leads to discovery of a new target often requires TDD follow-on because the initial hit active in the experimental model system is usually a cellular probe that lacks chemical properties required for an effective drug in humans. In cases where the PDD hit originates from an informer deck
and tracks with an annotated target, this may initiate a repurposing exercise and not require additional HTS. Lately, the PDD debate has focused on progressing SMs before a molecular target has been identified. Several commentaries have outlined the main scenarios and considerations for SM advancement, including impressive examples of perseverance7. This topic continues to be of interest, in part due to the complexity and high failure rate associated with target deconvolution.

In our experience, outside of repurposing SMs previously optimized for in vivo target modulation, PDD probes worthy of nominating for a ‘blind medicinal chemistry campaign’ are the exception and not the rule. In other words, we often lack the detailed cellular MOA and evidence for unique biology to confidently de-risk such an effort. In addition, prosecution in the absence of a target precludes the use of structure-based drug design and the opportunity to have back-up chemical series for consideration. Moving forward, we feel that our commitment to understanding MOA on a deeper cellular level will afford more opportunities not only to progress SMs that lack a defined interaction but also to increase the success rate of identifying the target(s) driving the desired phenotype. Approaches for target identification and elucidation of MOA have been described in detail8-12. Beyond capturing the technology options driving success in our laboratories, our aim is to lay out a rational strategy for technology integration that merges foundational assays with additional fit-for-purpose solutions. The operational framework and recommended themes that we expand upon in this Perspective are summarized in Figure 1. The process was initially implemented to progress PDD screening hits in a data-driven manner, but the added benefit of identifying higher quality TDD SMs for advancement was also quickly realized. The process is dynamic, with a constant influx of technology assessments. Ideally, for each new technology applied we test multiple compounds that cover broad target-class diversity, and generally avoid methods where value is restricted to a relatively small niche of
SMs. In fact, new methods are gauged by how their data complement existing approaches in addition to their individual impact, because data integration is the foundation of our target deconvolution strategy. We believe that with refinement of our stratagem, including a deeper appreciation of the limitations of technology offerings, a process will evolve that leads to better choices for progression of SMs from PDD and TDD. The aim of this Perspective is to advocate for more-informed decision making throughout the progression of SMs in the early and late drug discovery pipeline. Because a single technology is highly unlikely to consistently deliver identification of targets and salient mechanistic biology, we present a core set of assays to set the SM knowledge baseline. In addition, reference SM profiles derived from data-rich biological assays are described as a powerful approach to rule out common MOAs, biologically bin SM cellular perturbations, and focus target identification strategies. Approaches for SM enrichment of binding partners, using either chemically modified hits or label-free methods, are summarized, including their impact on PDD and TDD campaigns.


2.0 SM Profiling for Differentiation, Prioritization, and MOA-Target ID

2.1 In Silico Profiling

There are varied approaches for the computational identification of drug targets of SMs. These methods can be broadly classified into chemical similarity searching, data mining/machine learning, panel docking, and the analysis of bioactivity spectra. Recently, other classes, such as protein-structure-based methods, have been proposed. A summary of some of the readily accessible tools for target prediction can be found in Table 1, and a detailed description of the databases, web servers and computational models can be found elsewhere13. An elegant example that combines PDD with cheminformatic target identification demonstrates the potential utility of this approach to bypass the laborious target identification and validation process14. With the advent of large data sets of well-annotated biological activity such as PubChem, KEGG and ChEMBL, which can be combined with proprietary datasets such as those available within pharmaceutical companies, it is anticipated that this will continue to be an area of considerable value to complement experimental approaches to target identification. An equally important and useful component of target prediction engines is the information they provide on the mechanism of drug action associated with each predicted target protein. Artificial Intelligence (AI) and machine learning are two areas currently trending because of their potential to transform several facets of drug discovery, including target identification. While there could be natural efficiency gains with these methods across a broad spectrum of drug discovery challenges, the promise of substantially improving target identification approaches is perhaps most compelling. Given the astronomical magnitude of scientific data being generated across the world, in terms of both breadth and depth, it is naïve to assume scientists will be aware of new advances, let alone turn those diverse pieces of information into applicable knowledge, without the computationally assisted ability to correlate, assimilate and connect all these data.
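As a rough illustration of the chemical similarity searching class of methods mentioned above, the sketch below ranks candidate targets for a query compound by its Tanimoto similarity to annotated reference ligands. It is not taken from any of the tools in Table 1: fingerprints are represented simply as sets of on-bit indices, and the function names, target labels, similarity cutoff and toy data are all hypothetical.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_targets(query_fp, reference, min_similarity=0.3):
    """Score each annotated target by its single most similar reference ligand.

    `reference` is an iterable of (fingerprint, target_label) pairs.
    Ligands below `min_similarity` (an arbitrary, assumed cutoff) are ignored.
    """
    best = defaultdict(float)
    for fp, target in reference:
        sim = tanimoto(query_fp, fp)
        if sim >= min_similarity:
            best[target] = max(best[target], sim)
    # Highest-scoring target hypothesis first.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up fingerprints and target annotations (not real compounds).
reference = [
    (frozenset({1, 2, 3, 4, 5}), "kinase_A"),
    (frozenset({1, 2, 3, 9}), "kinase_A"),
    (frozenset({10, 11, 12, 13}), "gpcr_B"),
    (frozenset({2, 3, 4, 5, 6, 7}), "protease_C"),
]
query = frozenset({1, 2, 3, 4, 6})

predictions = rank_targets(query, reference)
for target, score in predictions:
    print(f"{target}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit and the annotations from a curated bioactivity database; a ranked list of this kind is a hypothesis generator that still requires experimental confirmation.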


A thorough discussion of the various AI algorithms and burgeoning companies developing this expertise is beyond the scope of this Perspective. Given the plurality of targets that most likely drive a phenotypic response, algorithms that connect this polypharmacology to a network of targets would significantly enhance the quest for highly validated biological pathways. Efforts on the automated design of ligands with pre-defined multi-target profiles represent an exciting step towards bridging the gap between pharmacology and the one-molecule one-target approach towards drug discovery.15 In addition, high-speed, high-fidelity structure-based approaches to mapping the modulator landscape of nearest neighbors, applied to phenotypic screening hits, could provide a robust amplification of some of the target identification methods referred to in the following sections.16

2.2 Target/Liability Assays

The most straightforward means of confirming the direct target of small molecules can be to screen purported targets in a high-throughput manner and directly measure the effect. Because a phenotypic screen generally provides a functional readout, a panel of candidate biochemical assays related to that function should theoretically provide direct information about the target. However, as can be seen in Figure 2, the practical number of targets that can be screened in this way is far less than the 100,000+ potential targets in an organism (including splice variants and post-translational modifications)17 or even the 500-1400 targets in the typical organelle18. Even the limited number of targets for which assays exist can quickly overwhelm the throughput of internal screening capacity, or the financial resources necessary for contracting external companies that provide screening services (e.g. DiscoveRx, CEREP, or Promega).
Also, although assays have been developed for a majority of the members of specific target classes, such as kinases or GPCRs, these best-characterized targets are less likely to yield discovery of functionally novel small molecules. Conversely, truly functionally novel targets (e.g. orphan receptors, pseudoenzymes, regulatory elements) are unlikely to be included in panels of existing optimized HTS assays
and would not be interrogated by this approach. Efforts toward efficient parallel screening of multiple targets using affinity-based techniques are discussed in Section 4. Because of the technical limitations imposed by various screening formats, the form of a target used in a high-throughput screen may not match the form of the target existing in the more biologically relevant tissue or organism. Unless the protein is directly obtained from a relevant biological source, it may have been obtained via heterologous overexpression, either in truncated form to provide the relevant catalytic domain or with extra tags/reporters to allow subsequent high-throughput screening (e.g. affinity tags for target isolation/antibody detection, or other reporters for visualization such as fluorophores and enzymes). In addition, the actual format of the screen can play a significant role in the success of a profiling effort due to the choice of the assay strategy (e.g. activity-based versus binding). Chemical matter can impact a target by binding the inactive or pre/pro form of the target, which may be missed if an activity assay on the already active protein is used for screening. Conversely, a binding assay format would work with inactive targets, but only if such a binding assay exists for the inactive form of the protein. The more novel the target, the less likely a binding or activity-based assay will be available for screening at all. Ultimately, the success of profiling methods for target deconvolution depends on three conditions: the target(s) of interest must be present in the screening panel; the assay must recapitulate the function that the phenotypic assay reports; and the target must be in a relevant form, including appropriate cofactors at appropriate concentrations/ratios, in an assay format that does not change the function of the target protein.
However, if a target hypothesis can be derived by other means, confirming activity with a screening panel can provide valuable confirmation of direct interaction with that target and subsequently allow efficient screening of other chemical matter to provide the most desirable compound profile for further lead development.


If the hypothetical targets fall within a target class well-covered by profiling screens, then direct assays can be quite useful for triaging screening hits for desired profiles. For example, with kinase-based projects, we use a combination of internal screening via TR-FRET probe displacement assays and judicious screening via commercial panels for those kinases not in our routine internal panel. In this way we are able to triangulate on-target activities of phenotypic screening hits and determine off-target liabilities for a subset of compounds produced by internal medicinal chemistry, in order to correlate function with target profile and avoid targets with known liabilities19. Likewise, targets in other classes known to correlate with toxicity or other unwanted effects are screened via proprietary internal bioprofiling screening assays coupled with outside screens using CEREP panels for targets of known toxicities20.

Promiscuous Interactions – Tales from Microtubules and the Electron Transport Chain

While discrete protein interactions are, perhaps, the most logical entities to examine, effects of compounds on larger cellular structures are equally important. In a recent cell viability phenotypic screen, the identification of several colchicine analogues coupled with the observation of key gene expression (GEx) markers raised the specter of microtubule disruption-induced mitotic catastrophe as a principal mechanism for several hit compounds. To explore the prevalence of this phenomenon, 204 hits were examined in replicate at a single dose in an in vitro porcine model of tubulin polymerization. The results, shown in Figure 3A, clearly demonstrate the pervasiveness of the anti-mitotic effect across the hit set. The majority of the compounds examined in the assay triggered a decrease in the rate of the microtubule growth phase.
In many of these cases, the effect may be considered minor, producing a less than 30% reduction in growth rate and still achieving maximal polymerization over the assay duration. It should be noted, however, that orthogonal studies have shown that tubulin-driven anti-mitotic effects observed at nanomolar compound levels in cells require micromolar compound quantities to
recapitulate the effect in vitro. Regardless, it is conspicuous that nearly one-third of the compounds assayed produced a significant reduction in the rate of microtubule assembly, and this effect was not confined to privileged chemotypes. Nearly every structural cluster shown in Figure 3A possessed analogues that were anti-mitotic. It should be emphasized that many of the original hits from our viability screen originated from non-oncology programs, which supports monitoring this activity during SAR campaigns. Perhaps the most striking observation is that relatively minor structural changes, such as those shown in the top two panels of Figure 3B, may result in dramatic differences in tubulin polymerization. In the case of hits 1 and 2 in Figure 3B, conversion of an internal cyclopropane ring to an alkene and/or halogen addition alters the tubulin polymerization profile from a colchicine-like destabilizing effect to one that looks more like a Taxol-like stabilizing effect (Figure 3C). This would suggest that during the process of optimizing a structure-activity relationship, in either conventional or phenotypic drug discovery, alteration in tubulin binding characteristics may obfuscate assay results, potentially providing misleading target validation outcomes. Indeed, this appears to be the case with TH287 and TH588, purported inhibitors of MutT homologue 1 (MTH1). In assessment of this target, we were unable to correlate enzyme knock-down/-out or enzyme inhibition with an anti-proliferative phenotype. However, we were able to establish that cell growth inhibition was only observed for SMs that had measurable inhibition of microtubule polymerization. This observation was supported by a recent publication, concluding that these compounds are likely affecting cancer cells through their tubulin-driven anti-mitotic effects21. The mitochondria represent another major source of proteins that bind and are functionally affected by numerous chemotypes.
One of the more notable examples originates from electron transport chain (ETC) proteins (Complex I-V). In fact, many compounds, including marketed drugs (e.g. troglitazone), inhibit more than one of the ETC complex proteins. In a study by Pfizer, it was noted that many pharmaceutical agents that resulted in mitochondrial dysfunction also had a propensity for
unfavorable physicochemical properties22. In the case of oncology programs where the desired endpoint is cell death, ETC inhibition can be confounding, especially if this activity is a component of a SM’s polypharmacology. For example, inhibitors of Complex I and V result in a benign phenotype across a broad range of primary cell types. Although one can test these molecules for the “Crabtree effect” using a “glucose-galactose” assay to inform on ETC impingement, it is usually only predictive when this is the main activity of the SM23. This assay involves culturing HepG2 cells with galactose as the energy source to make the cells sensitive to the ATP depletion that is induced by ETC inhibition. A more conclusive measurement involves direct assessment of mitochondrial respiration, but this assay usually has a much lower throughput. In addition to routine assay panels, hits and leads from TDD and PDD programs should be routinely assessed for activity against microtubule and mitochondrial ETC proteins. Beyond these mechanisms, broader activities leading to mitochondrial dysfunction also need to be measured during SAR campaigns.

2.3 Broad Profiling with a Companion Reference Database

In addition to the aforementioned panels of enzyme and direct mechanistic assays, methods that are more target agnostic can, through a broad survey at the gene, protein, or signaling level, offer valuable insight into either phenotypic screening hits or chemical series from a TDD campaign. Comprehensive readouts such as genome-wide transcription often suffer from the complexity of data-driven reduction of “long gene lists to short gene lists”. One approach we regularly employ to utilize these complex data sets is to compare profiles generated by our experimental SMs to reference databases of profiles associated with well-characterized mechanisms. L1000 GEx profiling (Genometry, Inc., Cambridge, MA) and the BioMAP platform (DiscoveRx, Fremont, CA) are two commercial offerings that we have utilized in our PDD and TDD programs, in part because of their strong reference database offerings. In the case of phenotypic screening hits, utility is provided by both positive and negative results. For example, SM
effects that show strong correlation to database profiles of ciclopirox, thapsigargin or bortezomib could indicate common MOAs such as metal chelation, endoplasmic reticulum (ER) stress or proteasome inhibition, respectively. Conversely, the lack of a correlation to reference molecules may strengthen the case for a SM possessing a unique MOA, particularly when complementing results from orthogonal methods. The following discussion will detail our recent advances utilizing these approaches and internal case studies of successful implementation.

The L1000 Platform

In a landmark study by Lamb et al., the concept of connecting human disease, genes, and SMs/drugs that modulate biology was elegantly demonstrated in a project termed the ‘Connectivity Map’ (CMap)24,25. A database of µArray gene expression profiles for 164 SM perturbagens across 3 cell lines was generated and mined using publicly available pattern-matching software. The success of initial case studies provided the impetus to greatly expand CMap as a community resource project, but genome-wide µArray profiling was cost-prohibitive at the necessary scale. The L1000 platform effectively solved this challenge by establishing a high-throughput reduced-representation GEx analysis26. This platform, matured by the Broad Institute and recently commercialized by Genometry (http://genometry.com), requires crude human cell lysates, which are subjected to a multiplex ligation-mediated amplification assay in 384-well plates. The L1000 assay directly measures 978 probes representing 1000 Landmark (LM) genes, and computational inference predicts correlated effects on the remaining portion of the full transcriptome27. Over a decade after the initial description of CMap and many published success stories, the L1000 technology platform has fueled the NIH Library of Integrated Network-based Cellular Signatures (LINCS) program to include >1.3 million publicly available L1000 profiles28.
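The reference-matching idea described in this section can be sketched in a few lines. The example below scores a query expression signature against a small set of reference signatures using Spearman rank correlation, which is a deliberately simple stand-in and not the pattern-matching statistic used by CMap or the LINCS tools. The gene-level values, profile names and function names are invented for illustration.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def match_reference(query, reference_profiles):
    """Rank reference profiles by rank correlation with the query signature."""
    scores = {name: spearman(query, prof) for name, prof in reference_profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up differential expression values for six genes, in the same gene order.
query = [2.1, -1.3, 0.4, 1.8, -0.9, 0.1]
reference_profiles = {
    "ref_profile_A": [1.9, -1.0, 0.2, 2.2, -1.2, 0.0],
    "ref_profile_B": [-0.5, 1.4, 0.3, -1.1, 0.9, 0.2],
    "ref_profile_C": [0.1, 0.2, -0.3, 0.0, 0.4, -0.1],
}

hits = match_reference(query, reference_profiles)
for name, rho in hits:
    print(f"{name}: {rho:+.2f}")
```

A strongly positive score would flag a shared mechanism with the annotated reference, whereas uniformly weak scores across the database would be consistent with, but not proof of, a distinct MOA.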
The LINCS profiles were generated from 42,553 perturbagens that include 19,811 small molecule compounds, 18,493 shRNAs, 3,627 cDNAs, and 622 biologics. The cell line coverage was also increased from the original 3 to
a core set of 9 cell lines, in addition to select data across a variable number of cell lines (3-77) for SMs with unknown MOAs. The enormity of this publicly available data set has prompted a number of studies to address study design, data quality control, and analysis pipelines29-31. In a recent example, deep learning techniques converted 978-dimensional continuous L1000 data for 3699 SMs into 100-dimensional binary barcode representations32. The barcode captures SM structure and target information, and predicts HTS promiscuity to a greater degree than the original data measurements. This model demonstrated high performance, captured underlying biology and predicted the targets of structurally diverse SMs with unappreciated MOA. This and other studies emphasize the need to enhance the quality of data outputs from noisy, large-scale data as a necessary step prior to integration with orthogonal technologies for hypothesis generation. It is intriguing to speculate how the available L1000 data may be uniquely exploited for probing biological complexity beyond the 2D cell cultures from which they were derived. In a recent report, Senkowski et al. investigated SM sensitivity for tumor cells in 3D culture33. Their study was based on a previous observation that quiescent cell spheroids have reduced sensitivity to cytotoxic SMs, but acquired sensitivity to inhibitors of oxidative phosphorylation (OXPHOS). HCT-116 tumor cells in 2D culture, proliferating spheroids and quiescent spheroids were treated with a dose escalation of OXPHOS inhibitors. Gene sets consisting of the top 30 regulated LM genes revealed dose-dependent and selective upregulation of mevalonate pathway genes for the quiescent spheroids. Moreover, cotreatment of the quiescent spheroids with OXPHOS and mevalonate pathway inhibitors (e.g., simvastatin) resulted in synergistic cell killing.
In addition to revealing actionable mechanistic insights, the authors also emphasized the value of monitoring GEx changes over a dose-response, both to capture salient changes and to increase the confidence of observed modulations. This concept was originally
reported using µArray data, and stressed by the authors as a viable approach to capture polypharmacology.34 The minimal use of this methodology has likely been due to the cost-prohibitive nature of µArray platforms. Dose-response (DR)-L1000 profiling provides an opportunity to rigorously assess this strategy, and we will describe both TDD and PDD examples of successful implementation. As described in Section 2.2, even the most subtle modification can transform a chemical series into a potent MTI. For oncology TDD programs, a large boost in cellular potency without a comparable enhancement in the in vitro binding or activity assay warrants direct assessment of the SM in an MTI assay. However, on numerous occasions we have observed activity disconnects that are not straightforward. Our general approach to inform on SM MOA is to expose cells in a 384-well plate to a short-term compound treatment (6 h) and then monitor DR-L1000 profiles (Figure 4A). As an example, two chemical series from a recent oncology medicinal chemistry TDD campaign were essentially indistinguishable when assessed by in vitro target-binding assays. While the compounds were
equipotent against most cell lines using an ATP quantitation assay (CellTiter-Glo, Promega Corporation) and cellular confluence, other lines were identified that exhibited significant potency differences (Figure 4B). If the enhanced activity could be attributed to an off-target that did not directly impact the biology of interest, then this information may be useful for series (de)prioritization. To start to address this disconnect, we wanted to understand if the cell growth inhibition was directly correlated with an ontarget cellular response. However, interpretation was hampered by the lack of a robust cellular PD marker. Using DR-L1000, a cluster of target-related genes were identified that strongly modulated compound 2 compared to 1 (Figure 4C, Cell Line A), concomitant with the observed tumor cell line growth inhibition (Figure 4B, top). Several of these genes have well established mechanistic connections to target biology while others have not been previously described. Although these data do not specifically identify the target/MOA of the cellular activity disconnect, they reveal that both SM activities are driven by a mechanism that modulates the on-target biology. Follow-on studies include confirming

cellular target engagement and correlating SM potency and GEx profiles to generate hypotheses for candidate targets driving the observed cellular activities. DR-L1000 has also provided significant value for our PDD campaigns. Ideally, a screen is designed with the “Phenotypic Rule of 3” as a guiding principle and focuses on the physiological relevance of the system, stimulus and readout5. For some biological questions of interest it is not always possible to fulfill all of these criteria, so strategies are needed to address critical gaps. For example, we recently completed a screen using a primary cell system, patient relevant stimulation, and reporter assay readout. Although individual reporter assay endpoints have been considered to be of lower relevance, a recent demonstration of molecule phenotyping (MP) using GEx signatures has demonstrated great promise35. We also rationalized that augmenting the reporter assay hits with DR-L1000 profiles would provide hit confirmation and possibly define a ‘GEx consensus signature’ of target pathway inhibition. For a series of screening hits that did not modulate any targets when assessed in binding and activity assay panels, we compared the reporter assay IC50 plots (Figure 5A left panel) to DR-L1000 data of genes affected by compound treatment (an example of one gene is shown in Figure 5A right panel). This comparison reveals an exceptionally strong correlation between inhibition of the pathway reporter and specific GEx effects, and thus demonstrates how GEx effects can reflect inhibitory potency of individual compounds. The broad endpoints provided by L1000 also have the potential to differentiate compounds through induced effects outside the common target response. 
For example, a heat map comparing a gene set for six SMs (3 medium and 3 high potency inhibitors) revealed common perturbations suggestive of target inhibition (Figure 5B, bottom), but also highlighted genes that were affected by only one inhibitor class, suggestive of two distinct MOAs (Figure 5B, top). Many of the monitored genes shared established connections to the pathway biology of interest, but unappreciated mechanisms were also uncovered. DR-L1000 also provided value in compound characterization by generating dose-response profiles that segregated into discrete potency bins. In Figure 5C, we plotted 3 genes related to

the phenotypic endpoint. The most potent response (Gene 1) was for a transcription factor coregulator, followed by an established genomic target of the transcription factor (Gene 2). Gene 3 is consistent with a general stress response with established association to the pathway, and either originates from a cellular feedback mechanism or represents polypharmacology of the SM. Although this latter interpretation has been emphasized as the main value for DR-GEx,34 we have been accruing evidence that the engagement of a single target may manifest multiple response profiles that inform on the complexity of SM MOAs. Without the broad dose range chosen, the actual gene regulation would not have been appreciated. For example, profiles from a single high dose would lead to the conclusion that the three genes were minimally regulated (Figure 5C).

The BioMAP Platform

BioMAP systems comprise human primary cell-based models and represent a diversity of tissue and disease models. Primary cells alone or in co-culture are treated with factors and stimulated to mimic the complexities of disease biology. Panels of protein markers, many with established clinical relevance, are measured to generate compound-specific signatures. The signatures are then compared to a reference database of SMs and biologics to reveal correlations and thereby inform on MOA. The process is summarized in Figure 6A. Establishment of the platform and its application to SM MOA have been described in detail.36-38 Our early use of this technology was mainly limited to differentiation of later-stage clinical candidates or comparison to reference molecules. The desire to profile hits from PDD at an earlier stage was predicated on the large reference database (>4,500 test articles) and the strength of this approach for identifying common MOAs. 
However, our initial access to this technology for phenotypic screening hits was restricted by the cost per data point and lack of strong evidence to support the quality of these

early leads. Ironically, we argued that broad-endpoint profiling with a companion reference database was part of the foundational data needed to understand and prioritize hits for target deconvolution. In 2016, we profiled 12 compounds originating from 3 PDD campaigns (Table 2). In the case of PDD-1, we generated BioMAP profiles for 2 molecules prioritized from a large Cancer Cell Line (CCL) profiling screen. Previous data for both molecules supported unprecedented mechanisms of tumor cell growth inhibition. The lack of a strong database correlation supported the unique MOA hypotheses and reinforced the decision to pursue additional target deconvolution approaches. BioMAP data for compounds derived from PDD-2 and PDD-3 identified direct targets for the lead SMs. Overall, data from this profiling platform significantly impacted the chemistry strategy for chemical probe generation. It is clear that PDD and TDD campaigns would benefit from characterizing and prioritizing leads using the BioMAP platform. To increase access to the technology, a process has been developed that provides a more cost-effective mechanism for prioritizing larger numbers of SMs for downstream analyses. Figure 6B outlines the introduction of an Early Triage Panel (ETP) and Diversity Screen (DS).39 The ETP reduces the breadth of experimental outputs measured in order to expand throughput and focuses on a central goal: deprioritizing compounds with overt cytotoxicity to primary human cell types. The DS stage then enables a program-specific analysis across 12 BioMAP cellular systems (148 endpoints) that captures the total number of activities, cytotoxicities, and scaffold-specific biology, and provides flags for key toxicity alerts. The ability to query the reference database or test for mechanism classification37 may improve the usefulness of DS analysis, although because the DS uses only one concentration of test article, the result could be less accurate. 
Importantly, individual assays and endpoints that prove informative from the BioMAP systems may be implemented as counter-screens during lead optimization. Overall, this approach should improve the quality of chemical matter that is progressed, and build institutional knowledge regarding chemical cores in screening libraries.
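
The comparison of a compound signature against a reference database, as described for the BioMAP platform above, can be illustrated with a minimal sketch. It assumes that each profile is a vector of log-ratio marker readouts and that similarity is scored by Pearson correlation with an arbitrary cutoff; the metric, the cutoff, the function names, and all values are illustrative assumptions and are not taken from the BioMAP analysis itself.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / (sx * sy)

def rank_reference_matches(query, reference_db, min_r=0.7):
    """Return (name, r) pairs with r >= min_r, best match first.

    min_r is an arbitrary illustrative cutoff, not a platform setting.
    """
    hits = [(name, pearson(query, profile)) for name, profile in reference_db.items()]
    return sorted((h for h in hits if h[1] >= min_r), key=lambda h: h[1], reverse=True)

# Hypothetical marker readouts (made-up numbers, not BioMAP data).
reference_db = {
    "reference_A": [-0.8, -0.5, 0.1, 0.0, -0.9, 0.2],
    "reference_B": [0.3, 0.1, -0.7, -0.6, 0.0, 0.4],
}
query_profile = [-0.7, -0.6, 0.0, 0.1, -1.0, 0.3]
matches = rank_reference_matches(query_profile, reference_db)
```

An empty result from such a query corresponds to the PDD-1 situation described above, where the lack of a strong database correlation supported a unique MOA hypothesis.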

2.4 CRISPR – Small Molecule Positive Selection Genome-Wide

The ability to enable protein recruitment to genomic sites of interest through application of CRISPR systems40 has greatly expanded our capabilities in target identification. Forward genetic screens that segregate cell lines or samples by an observed phenotype in order to contrast differences in genetic sequence or expression have an established track record for application in target identification.6,41,42 An example of the general utility of CRISPR-targeted knockouts was shown in the validation of a forward genetic screen that consistently identified HPRT mutations in cells that spontaneously acquired resistance to the cytotoxic effects of 6-thioguanine (6-TG).43 To confirm that these mutations were causative, CRISPR reagents targeting HPRT coding exons were utilized and shown to generate 6-TG resistance in the parental line through gene disruption. However, it is important to note that targeted genetic perturbations will not always parallel a chemically induced phenotype. Discrepancies could arise from genetic perturbations causing complete loss of a protein versus small molecule treatment inhibiting just one facet of that protein’s total activity, e.g. inhibiting enzymatic activity while preserving distinct scaffolding functions. Also, targeted knockouts via CRISPR will have specificity based on nucleotide sequence, whereas small molecule engagement will be driven by protein structure and conformation. Therefore, a small molecule-induced phenotype driven by polypharmacology or uncharacterized off-targets will not be fully recapitulated by genetic perturbation of a single gene. This aspect could prove a liability in instances where highly conserved genes express functionally redundant proteins, and genetic targeting of a single gene will not elicit a phenotype associated with a small molecule that engages all conserved protein targets. 
However, this single-gene specificity of CRISPR targeting can also be exploited to facilitate target identification, as elegantly demonstrated in a recent publication that established that chemical-induced cytotoxicity in basal breast cancer lines was not related to MELK activity or protein levels.44

We have expanded from the functional test at a single gene level to essentially generate a shortcut to acquired resistance by pairing available genome-wide guide RNA (gRNA) libraries with phenotypic selection (Figure 7). A key benefit of this approach is the unbiased nature of the pooled libraries. Therefore, results are not restricted to pre-existing hypotheses. This use of pooled lentiviral CRISPR screening presents a shortcut through two key features. First, genes are systematically disrupted via CRISPR targeting, as opposed to relying on mutations that have spontaneously occurred in the population. Second, the integrated targeting gRNA construct can be sequenced and serve as a surrogate identifier of the gene disruption that occurred in a cell that survived selection. This direct readout of individual targeted perturbations avoids the significant challenge of a global RNAseq or DNAseq search for the key mediators of resistance amongst a pool of downstream or passenger effects. The utility of this approach in target ID was shown in the initial demonstration of genome-wide pooled CRISPR screening in human cells, which revealed several components of the mismatch repair pathway as additional mediators of 6-TG sensitivity and genes required for etoposide cytotoxicity, TOP2A and CDK6.45 We have integrated pooled CRISPR screening into our target identification technology stack as a tool capable of profiling SMs of unknown MOA or lacking well-characterized targets. A recent example of the value provided by this technology occurred in our characterization of phenotypic screening hits that inhibit a disease-specific signaling pathway. This pathway is fairly typical with signaling activation transmitted through successive phosphorylation events, but the lead compounds showed no inhibitory activity when profiled against a kinome panel. The initial phenotypic screen was also performed in a primary human cell line with pathway activation mediated by exogenous stimulus. 
However, we capitalized on the range of mutations and disrupted signaling networks that occur in human cancers. We identified clonal cancer cell lines with constitutive activation of this same signaling network, but driven

by intrinsic mutations as opposed to external stimuli. Transcriptional profiling established commonalities between the primary cell model and these cancer lines, supporting their use as a surrogate model. The proliferation capacity and effectiveness of viral transduction in this cancer cell model allowed us to apply genome-wide screening. Importantly, these cancer cell lines also exhibited addiction to this pathway for maintained viability and proliferation. A clonal cell culture model with constitutive activation of the pathway was chosen for the screen because compound treatment induced cell death, presumably as a direct consequence of pathway inhibition. This compound-associated selection was applied and enriched cells with CRISPR-targeted gene disruptions that conferred resistance to pathway inhibition. We hypothesized that particular gene disruptions that alter the cellular response to compound treatment would inform our target hypotheses. This approach may not directly identify the precise target of a molecular inhibitor, because disruption of the target protein is expected to phenocopy the small molecule effect as opposed to creating resistance. However, the diversity of pathway inputs and regulation that exists in biological systems creates the potential to reveal nodes and regulators proximal to the direct target of a small molecule inhibitor. This particular instance revealed gene disruptions that strongly focused on mitochondrial functions and specifically components of the oxidative phosphorylation pathway. This prompted follow-up with immuno-purified enzyme assays that validated direct inhibitory activity against an enzyme complex that was not previously linked to the biology under investigation. Therefore, this case demonstrated how pooled CRISPR knockout screening can be used to reveal the MOA of uncharacterized SMs and establish novel biological connections. 
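
The readout of a positive-selection screen of this kind, in which sequenced gRNA constructs serve as surrogate identifiers of the disruptions that survived compound treatment, can be sketched as follows. This is a simplified illustration under stated assumptions: read counts are normalized to reads per million, a pseudocount of 1 is added, and genes are ranked by the median log2 fold change of their gRNAs. The gene names, guide names, and counts are hypothetical, and a real analysis would apply a statistical model with replicates rather than a simple ranking.

```python
import math
from collections import defaultdict
from statistics import median

def log2_fold_changes(control, selected, pseudocount=1.0):
    """Per-gRNA log2(selected / control) after reads-per-million normalization."""
    control_total = sum(control.values())
    selected_total = sum(selected.values())
    lfc = {}
    for guide, count in control.items():
        ctrl_rpm = count / control_total * 1e6 + pseudocount
        # A guide absent from the selected sample is treated as zero reads.
        sel_rpm = selected.get(guide, 0) / selected_total * 1e6 + pseudocount
        lfc[guide] = math.log2(sel_rpm / ctrl_rpm)
    return lfc

def rank_genes(lfc, guide_to_gene):
    """Rank genes by the median log2 fold change of their gRNAs, most enriched first."""
    per_gene = defaultdict(list)
    for guide, value in lfc.items():
        per_gene[guide_to_gene[guide]].append(value)
    scores = {gene: median(values) for gene, values in per_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical library of three genes with two gRNAs each (made-up counts).
guide_to_gene = {
    "gA_1": "GENE_A", "gA_2": "GENE_A",
    "gB_1": "GENE_B", "gB_2": "GENE_B",
    "gC_1": "GENE_C", "gC_2": "GENE_C",
}
control_counts = {"gA_1": 500, "gA_2": 480, "gB_1": 510, "gB_2": 490, "gC_1": 520, "gC_2": 500}
selected_counts = {"gA_1": 5000, "gA_2": 4200, "gB_1": 300, "gB_2": 350, "gC_1": 100, "gC_2": 50}

ranked = rank_genes(log2_fold_changes(control_counts, selected_counts), guide_to_gene)
```

Requiring agreement between independent gRNAs for the same gene, here through the median, is one simple way to reduce the influence of a single outlier guide.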
Additional studies have similarly applied pooled CRISPR screening beyond proof of concept and employed the technology to clarify the target mechanism of rigosertib46 or detail mechanisms of cytotoxicity associated with the antiviral compound GSK983.47 In the examination of GSK983 cytotoxicity, a genome-wide shRNA screen identified the direct inhibitory target, DHODH, whereas the parallel CRISPR knockout screen revealed distinct but related gene targets. This difference highlights how the particular

perturbation used can affect the experimental results. Targeted mutations generated by CRISPR are more likely to yield complete loss of function alleles, whereas shRNA transfection is more likely to result in a hypomorph phenotype or partial reduction in activity. More specifically, viral transduction of CRISPR reagents yields a pool of cells with heterogeneous indel mutations. Many of these indels completely disrupt protein production or activity, but a fraction of nucleotide mutations will retain functional expression and protein activity. In contrast, transduction of shRNA reagents may not completely eliminate protein production and subsequent activity, but a more uniform effect could be observed from cell to cell across the total population. In screens where selection is possible, the potential phenotypic variance associated with mixed CRISPR indels may be desirable. We initiate these screens with enough cells and virus so that each library gRNA is represented in at least 100 cells at the initial infection. At this level of representation several distinct mutational outcomes and allelic combinations should be generated for the downstream phenotypic selection. Analysis will identify particular gRNAs and gene targets prioritized in the screen, and validation efforts could ascertain whether particular mutation profiles were required for the phenotypic effect, e.g. loss of a single allele for partial knockdown. We also seek to maximize potential impact of the data generated by performing parallel phenotypic selections with different levels of stringency. Situations where strong viability selections (maximal growth inhibition) are applied may identify a small number of genes required for cytotoxic effects, but lower concentrations of compound (IC20 or IC80) may reveal additional genetic interactions, such as gene disruptions that sensitize to compound-induced cytotoxicity or confer partial tolerance. 
The total number and diversity of genetic perturbations that significantly alter cellular response to compound are unpredictable, especially when MOA is unknown, thus it can be beneficial to plan multiple selection concentrations for an initial screen. The example given above relies on compound cytotoxicity as the experimental means for positive selection of resistant cells. However, alternative means of cell isolation should also be feasible with pooled screening approaches, such as FACS

enrichment of live cell populations based on levels of fluorescent reporters or cell surface markers. This approach, two serial rounds of FACS enrichment based on GFP reporter activity or staining of cell surface markers, was elegantly used in genome-wide screens that identified host factors required for HIV infection or regulators of PD-L1 expression in tumors, respectively48,49. To further broaden the utility of the CRISPR system, a number of enzymatically inactive Cas9 variants, e.g. dCas9-KRAB or dCas9-VP64, have been developed, which enable modulation of transcription levels as an alternative to gene disruption.50,51 Which of these particular systems is best suited to uncovering unknown MOA is unlikely to be consistent across all scenarios, but each pooled screening approach has strong potential to guide toward the target biology due to the unbiased genome-wide interrogation.
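
The representation target mentioned above, at least 100 cells per library gRNA at the initial infection, translates into a simple scale calculation. The sketch below assumes Poisson-distributed viral integration at a chosen multiplicity of infection (MOI); the MOI value, the library size, and the function names are illustrative assumptions rather than parameters reported in the text.

```python
import math

def transduced_fraction(moi):
    """Fraction of cells receiving at least one integrant, assuming Poisson statistics."""
    return 1.0 - math.exp(-moi)

def single_integrant_fraction(moi):
    """Among transduced cells, the fraction carrying exactly one integrant."""
    return moi * math.exp(-moi) / transduced_fraction(moi)

def cells_needed(n_guides, coverage=100, moi=0.3):
    """Cells to infect so that each gRNA is carried by about `coverage` transduced cells."""
    return math.ceil(n_guides * coverage / transduced_fraction(moi))

# Hypothetical genome-wide library of 80,000 gRNAs at an assumed MOI of 0.3.
library_size = 80_000
cells_to_infect = cells_needed(library_size, coverage=100, moi=0.3)
fraction_single = single_integrant_fraction(0.3)
```

A lower MOI increases the share of cells with a single gRNA, which keeps the sequenced construct an unambiguous identifier of the perturbation, at the cost of infecting more cells.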

3.0 Target Enrichment

3.1 Chemical Probe-Based Target Enrichment – Intracellular Targets

Considerations for Probe-Based Target Identification

The resurgence of phenotypic screening as a viable and orthogonal drug discovery platform has been accompanied by renewed interest in target identification strategies. Across the academic and industrial research sectors, considerable time and investment have been poured into target identification technologies, both novel and established. Since many of these techniques leverage the use of chemical probes, it is prudent to first consider the challenges involved in their design and implementation. When contemplating the development of a chemical probe from a lead molecule, the predominant challenge entails the effective tethering of a reporter component while avoiding the introduction of unwanted negative characteristics. Common concerns include the disruption of the primary drug-target interaction and potential changes in physical-chemical properties, including changes in solubility, non-specific protein binding, cell permeability and probe localization. The judicious choice of linkers may mitigate these effects. For example, provided that the position of attachment is not critical for target binding, a short polyethylene glycol (PEG) linker terminated with an azide (N3) or Boc-protected amine (-NHBoc) can maintain the solubility of the probe and minimize non-specific protein binding, although with some loss of cell permeability (Figure 8), as evidenced in multiple cell-based assays (data not shown). When desired, the azide / Boc-protected amine is converted to a primary amine and reacted with an application-specific reporter (e.g. solid supports, fluorophores, biotin, expression tags, anionic permeability modifiers, etc.). While inclusion of a suitably long linker possessing desirable chemical character is important, it is also imperative that it is attached to a functionally benign vector on the compound of interest. Compounds

whose targets are well elaborated may benefit from the application of structural or modeling data to facilitate probe construction. In the absence of this information, available structure-activity relationship data can be used to conduct a hypothesis-driven approach to probe design. Alternatively, the chemistry of certain hits may afford the facile attachment of linkers, which may be subsequently examined in activity assays for their effect. Such attachments may be opportunistic, based on the nature of the hits, or engineered during synthesis of the compounds. Advances in C-H activation chemistry may provide additional routes to attachment not yet realized.52 A more comprehensive approach to ligand attachment has been developed by Kanoh et al.53 and extended by Nishiya et al.,54 whereby a heterobifunctional linker with a photoprobe on one terminus, and either an affinity resin or an amine-reactive functional group on the other terminus, is first photocrosslinked to the compound of interest. This creates a population of crosslinked compounds that is then immobilized on affinity resin and used for chemoprecipitation as described below. While this is very efficient from a workflow point of view relative to designing and making specific probes, and would be very useful for difficult-to-synthesize natural products, it suffers from several challenges. Most notably, reproducing and characterizing the crosslinked pool would be difficult because a variety of products are obtained, and testing whether the mixture maintains activity would prove difficult or impossible, so failure to identify a target would be difficult to interpret. In our experience, the ability to be sure the linked probe retains activity is crucial, and to date we have only rarely been unable to deliver a linked probe that maintains some level of the parent drug’s activity.

Affinity Capture “Chemo-Precipitation”

The most common application of chemical probes for the purposes of target identification entails their use as “bait” to capture and enrich cellular entities to which they have a reasonably strong affinity. Usually, the probes are conjugated to a solid bead-based support comprised of sepharose or agarose,

which is then incubated in a cell or tissue homogenate. However, several permutations of the technique exist, including the use of arrays or conjugation of a soluble affinity tag such as biotin or FLAG. After incubation in a biological matrix, the targets are recovered through the collection and washing of the beads. This process mirrors that of immuno-affinity precipitation (immuno-precipitation) and therefore may be considered chemical-affinity precipitation, or chemo-precipitation. Chemo-precipitation is a popular target identification technique owing to its ease, scalability and direct interaction with the cellular targets. Despite these advantages, chemo-precipitation suffers from many limitations, both technological and biological. Most obvious is the potential inability to create an affinity probe that retains its target binding capability, either due to the complexity of the binding mode or synthetic limitations. Lack of target binding may also be driven by the metabolism of the compounds into active yet unidentified forms, precluding the creation of a competent chemo-precipitation reagent. Biologically speaking, chemo-precipitation has been limited to cell and tissue homogenates due to a lack of probe permeability. The decompartmentalization, dilution and pH normalization experienced upon lysis may result in target denaturation, binding partner dissociation, and proteolysis, among other undesired events. Recently, ligation of a chloroalkane tag to compounds has shown promise for target enrichment using Promega’s HaloTag system. Since the tag permits the probe to retain cellular permeability comparable to that of the parent compound, in-cell chemo-precipitation is fast becoming a reality.55-57 
Given these considerations, it is expected that success rates will vary depending upon the nature of the screen, the types of targets being pursued, the design of the compound library and the differential allocation of resources to target identification efforts between research institutions. Implementation of additional techniques, such as quantitative mass spectrometry, has improved success rates.58 However, the full value of chemo-precipitation-based data sets is realized when they are paired with orthogonal target identification technologies.
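
One way quantitative mass spectrometry can help separate specific binders from background in a chemo-precipitation experiment is sketched below. The design is an assumption made for illustration and is not described in the text: the same pull-down is run with and without an excess of free parent compound, and proteins whose recovered intensity drops under competition are prioritized. The protein names, intensities, and the log2 ratio cutoff are hypothetical.

```python
import math

def competition_ratios(probe_only, probe_plus_competitor, floor=1.0):
    """log2(intensity without competitor / intensity with competitor) per protein.

    `floor` guards against division by zero for proteins not detected
    in the competed sample.
    """
    ratios = {}
    for protein, intensity in probe_only.items():
        competed = probe_plus_competitor.get(protein, 0.0)
        ratios[protein] = math.log2(max(intensity, floor) / max(competed, floor))
    return ratios

def candidate_targets(ratios, min_log2_ratio=2.0):
    """Proteins whose capture is reduced at least 4-fold by free compound (illustrative cutoff)."""
    kept = [p for p, r in ratios.items() if r >= min_log2_ratio]
    return sorted(kept, key=lambda p: ratios[p], reverse=True)

# Hypothetical label-free intensities (made-up numbers).
probe_only = {"PROTEIN_X": 8.0e6, "ABUNDANT_BACKGROUND": 5.0e7, "PROTEIN_Y": 2.0e6}
probe_plus_competitor = {"PROTEIN_X": 5.0e5, "ABUNDANT_BACKGROUND": 4.8e7, "PROTEIN_Y": 1.5e6}

ratios = competition_ratios(probe_only, probe_plus_competitor)
candidates = candidate_targets(ratios)
```

In this toy data set the most abundant protein is also the least competed, which illustrates why raw abundance on the beads is a poor indicator of specific binding.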

Photo-Affinity Labeling Enrichments

A key failing point of chemo-precipitation experiments is poor target affinity or denaturation of the target upon cell lysis. This limitation can be offset through the introduction of a covalent bond between the probe and its target while the cells remain in culture. A popular technique to achieve this objective while minimizing non-specific covalent labeling of the biological matrix is photo-affinity labeling (PAL). PAL uses an analog of a biologically active small molecule that bears both photoreactive and reporter functional groups.59 The photo-affinity probe is added to proliferating cells and irradiated with UV light. Irradiation of the photo-reactive group generates a reactive chemical species (e.g., carbene, nitrene, or radical) that covalently cross-links the photo-affinity probe to its binding partner(s) based upon the proximity of the two. Photo-cross-linked protein targets are then visualized by the reporter group (e.g., fluorophore, biotin, or radioactive label). There are several photo-reactive functional groups frequently used in PAL (e.g., benzophenones, diazirines, aryl azides and acyltetrazoles).59-61 Recent advances in small-molecule bio-imaging techniques have made it possible to further couple such an “in situ profiling” approach with imaging of drug uptake and subcellular distribution.62-65 In a recent advance coupling PAL with in situ screening and target discovery, Cravatt and colleagues described an approach that marries fragment-based ligand discovery with quantitative chemical proteomics to map thousands of reversible small molecule-protein interactions in human cells.66 This approach of fragment-based PAL was combined with phenotypic screening to identify small molecule ligands for a poorly characterized membrane protein, PGRMC2. 
While PAL has the potential to mitigate deficiencies in conventional chemo-precipitation approaches, it also presents several challenges that may reduce its effectiveness. In particular, the rapid quenching of the activated photochemical intermediate often results in low cross-linking efficiency. This necessitates the design of a probe that positions the photochemical warhead in close proximity to the protein binding partner(s). When the target is elaborated, photo-affinity probe design may be facilitated through existing structural data

and/or computational modeling. However, in the absence of prior target information, photo-affinity probe design may become particularly onerous. For target ID, therefore, probe design involves a significant level of empirical assessment, and PAL is advocated as a complement to other techniques rather than as a stand-alone approach.

Activity-Based Protein Profiling

For situations where molecular probes for chemo-precipitation or photo-affinity labeling cannot be designed, activity-based protein profiling (ABPP) offers an alternative. ABPP is a target identification technique that entails screening compounds against probes that have been developed to interact in a functionally dependent manner with broad protein classes, including proteases, kinases and others.67-69 Since ABPP requires a functional set of targets for enrichment, it can discriminate between inactive pro-enzymes and cleaved functional entities (e.g. hydrolases/proteases),70 as well as proteins activated by post-translational modification. ABPP reagents are comprised of a reporter, a short spacer and a warhead capable of covalently binding to a structural motif shared between active members of a protein class. Early forms of ABPP used biotin or fluorophore reporters (see Figure 8).71,72 Since these probes suffered reduced cell permeability, they were applied to cell and tissue homogenates after dosing the drug candidate of interest. The targets of the dosed compounds were subsequently inferred by loss of ABPP reporter signals (Figure 9). In more recent applications, the ABPP platform has been adapted into a two-stage process. After dosing a compound of interest, an ABPP probe with a bioorthogonal handle (e.g. alkyne, azide) that maintains cell permeability is applied to cells in culture.73 Following reaction with its targets in situ, the cells are lysed and treated with a reporter molecule attached to a complementary bioorthogonal handle that pairs with the ABPP probe. This process allows both compound and ABPP probe access to targets in live cells and even in vivo, making it unusual among target ID methods.73
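
The inference of targets from loss of ABPP reporter signal can be expressed as a simple calculation. The sketch below assumes that, for each probe-labeled enzyme, a reporter signal is measured in vehicle-treated and compound-treated samples, and that fractional target engagement is estimated as one minus their ratio. The enzyme names, signal values, and the 50% cutoff are hypothetical.

```python
def target_engagement(vehicle_signal, treated_signal):
    """Estimate fractional engagement as the loss of ABPP reporter signal.

    Returns a value clipped to [0, 1]; a treated signal above vehicle
    is reported as zero engagement.
    """
    if vehicle_signal <= 0:
        raise ValueError("vehicle signal must be positive")
    return min(1.0, max(0.0, 1.0 - treated_signal / vehicle_signal))

def engaged_enzymes(signals, min_engagement=0.5):
    """Return enzymes whose engagement meets the cutoff, highest first.

    `signals` maps enzyme name -> (vehicle_signal, treated_signal).
    """
    scores = {name: target_engagement(v, t) for name, (v, t) in signals.items()}
    kept = [name for name, score in scores.items() if score >= min_engagement]
    return sorted(kept, key=lambda name: scores[name], reverse=True), scores

# Hypothetical reporter intensities for three probe-labeled enzymes.
signals = {
    "HYDROLASE_1": (1000.0, 80.0),
    "HYDROLASE_2": (900.0, 870.0),
    "HYDROLASE_3": (400.0, 390.0),
}
engaged, engagement = engaged_enzymes(signals)
```

Because every enzyme labeled by the probe is scored in the same experiment, the same table reports both on-target engagement and selectivity across the probe-covered protein class.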

A common application of ABPP is profiling compounds for their off-targets and quantifying engagement of the desired target. As an example, Ahn et al. were able to use a fluorophosphonate ABPP probe to demonstrate that their FAAH inhibitor, PF-3845, was highly selective versus other serine hydrolases.74 By quantitating the level of target engagement required to raise anandamide levels in the brains of mice, ABPP was able to predict the expected levels of target engagement that would be required for human applications of FAAH inhibitors. Likewise, we used the same ABPP method to identify relevant off-targets of our own preclinical FAAH inhibitors.75 More recently, a FAAH inhibitor, BIA 10-2474, led to severe adverse reactions in the clinic, including one fatality. Subsequent ABPP analysis indicated that BIA 10-2474 also interacted with a number of serine lipases. It was hypothesized that alteration of the lipid networks in cortical neurons was responsible for the toxicity.76 Since ABPP probes are often carefully optimized, target profiling and identification may be streamlined without the concern of developing a new set of molecular probes for each project. The major limitation in using ABPP for target identification is a lack of probes for the majority of druggable protein space, let alone probes that allow for the discovery of alternative binding modes and allosteric modulators within the established ABPP-covered target space. In order to overcome the limited scope of true “activity”-based protein profiling, reactivity-based profiling (RBP) takes advantage of the characteristics of certain functional groups in biomolecules by expanding ABPP to nucleophilic side chains such as free sulfhydryls77 and amines78 or even specific post-translational modifications such as sulfenic acids.79 One unique application of RBP capitalizes on the propensity of metabolizing enzymes to oxidize the drug, creating a cross-reactive moiety in situ. 
By choosing a bioorthogonal handle that is resistant to metabolism, the in vivo targets of reactive drug metabolites may be assessed. While the majority of reactive metabolites react preferentially with free thiols, the enriched proteins are not necessarily responsible for the metabolism itself.76
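The competitive ABPP readout discussed earlier in this section, in which a pre-bound inhibitor blocks probe labeling of its target, reduces to a simple ratio of probe signals. The following is a minimal sketch of that arithmetic, not the analysis used in the cited studies; the function name, the enzyme labels, and the band intensities are hypothetical.

```python
def target_engagement(signal_treated, signal_vehicle):
    """Fraction of an enzyme blocked from probe labeling by the inhibitor.

    signal_treated: probe signal (e.g. gel band intensity) with inhibitor.
    signal_vehicle: probe signal for the same enzyme without inhibitor.
    """
    if signal_vehicle <= 0:
        raise ValueError("vehicle signal must be positive")
    fraction = 1.0 - signal_treated / signal_vehicle
    # Clamp so that measurement noise cannot give values outside 0..1.
    return min(1.0, max(0.0, fraction))


# Hypothetical band intensities as (treated, vehicle) pairs; not real data.
signals = {
    "intended_target": (50.0, 1000.0),
    "other_hydrolase": (950.0, 1000.0),
}

engagement = {
    name: target_engagement(treated, vehicle)
    for name, (treated, vehicle) in signals.items()
}

for name, value in engagement.items():
    print(f"{name}: {value:.0%} engaged")
```

In this made-up example, the intended target is almost fully engaged while the second hydrolase is largely unaffected, which is the pattern expected for a selective inhibitor.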

Truly stretching the concept of RBP, recent photoaffinity-modified fragment screening has been shown to be useful for creating selective chemical matter and identifying novel targets simultaneously, including targets at compound affinities too low for non-crosslinking methods.66 The key to effective ABPP/RBP is the identification of interesting targets that can be labeled by a probe, which can then be used competitively to test compounds; the absence of a probe for the target family of interest renders ABPP/RBP ineffective.

Target Display Techniques

While chemo-precipitation, photo-affinity labeling and activity-based protein profiling have evolved to enrich targets directly from lysates, cells and tissues, the immense heterogeneity in protein expression level may preclude a successful target identification campaign. In these cases, screening compounds against expression-enriched libraries of artificially generated proteins, such as those generated by display methods, may provide target insights that direct approaches are too insensitive to obtain. Several methods have been employed across academic and industrial research centers for pooled protein display-based target identification. These techniques are distinct from arrayed protein methods such as the Retrogenix (High Peak, UK) method discussed in Section 3.2. The pool-based techniques rely on foundational strategies of linking gene to protein and include bacteriophage, bacterial, yeast, ribosome, and mRNA display methods. Figure 10 outlines the methodology for bacteriophage display and an example from our laboratory. Critically, when display technologies are applied to target identification, the goal is to identify the biological protein partners for ligand binding, even if the physiological protein abundance or affinity is low.
In contrast, for other display applications (for example, antibody affinity maturation), where the goal is to identify or create a very high affinity partner, combinatorial optimization of amino acid sequences may be used to create the highest possible affinity target regardless of whether it exists in nature. Thus, a differentiating requirement in the application of display technologies for target
identification is capture of small molecule/target binding interactions over a wide range of affinities. Bacteriophage display using full length cDNA libraries as a source of genetic material is a proven method for unbiased small molecule target identification;80 multiple examples of success are collated in a review by Takakusagi et al.81 As a reverse chemical genomics technique, it allows low abundance and weakly binding targets to be detected through the amplification of phage DNA. It is aided by recent advances in DNA sequencing, which have afforded more sensitive quantitation of candidate targets (see Figure 10E for an example). Further, the cDNA library may be tailored to fit relevant biology from any DNA or RNA source. Massive oversampling of expressed proteins is afforded by phage amplification in E. coli, so that thousands of copies of every gene product are presented for binding. Despite these advantages, phage display technologies face important limitations. Perhaps the most troublesome challenge involves the bacterial expression of eukaryotic genes, which limits the scope of posttranslational modification. Targets that bind drug candidates in a modification-dependent manner will therefore be missed in these types of screens. Additionally, strategies must be employed to reduce the preferential amplification of smaller proteins or protein fragments over larger proteins. Nevertheless, we find bacteriophage display an important component of our suite of target identification methods due to its ability to amplify and detect weak but functionally relevant drug-target interactions.

Yeast-Chem Hybrid

Similar to display approaches, artificial cellular systems may be leveraged for target identification by positive selection screening.
Based on work by Licitra and Liu,82 a chemical yeast three-hybrid genetic system, or yeast-chem hybrid, can be employed, whereby cell growth on limited media is controlled by the presence of a molecular probe that brings two parts of a synthetic essential receptor into proximity. In this application, haploid yeast cells expressing an anchor protein fused to a DNA binding domain are mated with a library of haploid yeast cells, each containing a separate target “prey” fused to an
activation domain. The resulting diploids are treated with a molecular probe constructed from the compound of interest linked to an anchor protein binding element. Only in those diploids where the molecular probe binds both the anchor protein and a prey protein are the biosynthetic genes for histidine production activated, allowing cellular growth on histidine-depleted media. By creating libraries of genetically encoded “prey” targets from cDNA expression libraries, entire proteomes can be interrogated for their ability to bind the molecular probe.83 Several companies have commercialized this technology (e.g. Hybrigenics, Cambridge, MA) and specialize in the identification of direct targets, confirmation of targets identified by other methods and finding off-targets for lead candidates. As with display, these systems have the advantage of normalizing protein levels relative to native biological systems so that expression level differences do not impact the results. However, depending on the probe used and the artificial nature of the protein targets profiled, the technique is prone to high false positive and false negative rates, once again highlighting the importance of orthogonal methods.

3.2 Chemical Probe-Based Plasma Membrane Target Identification

Considerations for Plasma Membrane Protein Target Identification

While target identification from preparations of intracellular proteins poses formidable challenges, interrogation of proteins embedded in the plasma membrane requires separate and unique consideration. More than a third of the expressed proteome is embedded in the plasma membrane and greater than 50% of marketed small molecule drugs target these proteins. Despite this, methods to broadly and robustly measure compound engagement at the level of the membrane remain lackluster. This is largely driven by the hydrophobic character of these proteins, which confounds their extraction and recovery.
Given this hydrophobicity, together with the lower overall copy number of these proteins and their propensity to aggregate, generalized methods are needed to profile this critical group of candidate targets.84

Ligand-Directed Chemical Biology

Direct large scale interrogation of the cellular plasma membrane proteome is fraught with biological and technological complications, the greatest of which include poor solution-phase behavior and sample preparations contaminated with intracellular proteins. Chemical biology techniques that capitalize on covalent associations and unique cell surface features have been put forward to mitigate these complications and maximize sample recovery and purity. Every cell is coated in a layer of carbohydrate consisting of large branched sugar chains called glycans. Glycans are physically attached to plasma membrane proteins through asparagine (N-linked) and serine or threonine (O-linked) side chains. The majority of proteins embedded in the cell membrane possess glycans, in contrast to intracellular proteins, few of which do. Since the chemistry of cellular glycans can frequently be manipulated with little to no detriment to the cell, glycans provide a unique and convenient avenue towards the enrichment of plasma membrane proteins. Consequently, chemical biology applications have focused on the development of soluble trifunctional probes. One arm of the probe is the selectivity function, and consists of the compound whose targets are under investigation. The second arm consists of a reactive moiety designed to specifically and covalently bind to the cell surface glycan chains. The third and final arm is an enrichment function, which permits purification of the covalently bound membrane-associated glycoproteins from the cellular milieu. In one application of this paradigm, cells85 and even animals86 can be labeled with non-natural sugars under physiological conditions, introducing new functional groups for a wide range of chemical modifications. In a recent example, a glycophosphatidylinositol (GPI)-anchored receptor was labeled with sugars containing azide functionalities.
A trifunctional probe consisting of a receptor-binding ligand, biotin and an azide-reactive cyclooctyne was introduced. Since the covalent reaction between azides and cyclooctynes occurs only at high concentrations, enrichment of the receptor would occur only if
the probe were bound to it for an appropriate time. This was observed after avidin purification by Western blotting and mass spectrometry (Figure 11). Sensitivity and specificity of the process may be increased further by cleaving the sugar-peptide bond with a glycosidase enzyme (PNGase F), limiting matrix effects. The method can also be used to map crosslinking sites and to image receptors and their internalization.87 The metabolic labeling of cell surface glycans is an attractive method owing to high label incorporation, lack of deleterious cellular effects and overall cleanliness of the sample preparation. For primary and non-proliferating cells, however, metabolic labeling becomes difficult if not impossible. In these cases, a different trifunctional probe (TriCEPS, Dualsystems Biotech) may be employed, which similarly uses cell surface glycans as a mechanism for plasma membrane protein enrichment. When cells are treated with low concentrations of sodium meta-periodate, glycan alcohols are oxidized to aldehydes, which then decorate the cell surface. A trifunctional probe containing a hydrazine as its reactive functionality will rapidly form covalent hydrazone bonds with the aldehydes. Applications of this technique are fairly recent and have involved the use of biotherapeutics in lieu of small molecules. Figure 12 shows a study in our laboratory involving TriCEPS modification of three antibodies genetically engineered to display a spectrum of affinities for the surface receptor EGFR (0.0014, 0.3, 1 μM). These were tethered to the trifunctional hydrazine probe via an N-hydroxysuccinimide functionality (Figure 12A) and applied to cell lines containing as few as 2,000 copies of EGFR on their surface. In all three cases, EGFR was successfully enriched, demonstrating the applicability of the method to low affinity and low abundance surface receptors (Figure 12B).
As an alternative to glycans, certain amino acid residues on the protein can serve as anchoring points for ligand-directed chemistries. One of the early successful examples of so-called traceless (native function) affinity labeling88 is ligand-directed tosyl (LDT) chemistry (Figure 13A). By unifying covalent
bond formation and ligand cleavage, a phenyl sulfonate (tosyl) group can serve both as a linker between the ligand and the reactive group and as a leaving group. The SN2-type reaction between the tosyl group and a nucleophilic amino acid results in the release of the ligand during the labeling reaction. LDT chemistry was first demonstrated by modification of human carbonic anhydrase (hCA). LDT reagents comprising a benzenesulfonamide ligand and either a fluorophore (Dc, 7-dimethylaminocoumarin) or a biotin capture tag, connected through a tosylate group, were prepared. Human red blood cells were successfully labeled, with no apparent hemolysis, and SDS-PAGE analysis after cell lysis indicated a single fluorescent band that corresponded to endogenous hCA. In blocking studies with a competing inhibitor, hCA was not labeled, suggesting that selective labeling was achieved by an affinity-based reaction. hCA labeling was also conducted in live animals, where the biotin-type labeling reagent was intravenously injected into Slc:ICR mice. Analysis by Western blotting showed that hCA-selective labeling occurred, demonstrating the power of ligand-directed chemistry in the context of complex proteomes.88 Proteins that have been successfully captured to date include FK506 binding protein 12 (FKBP12), Src homology 2 (SH2) domain, and congerin II, whereby LDT reagents contained a range of ligands, such as the synthetic analog of FK506, a peptide, and carbohydrate ligands.89 LDT has also been used for selective labeling of proteins such as heat shock cognate 70 (Hsc70),90 a molecular chaperone, and 14-3-3 proteins.91 Despite its high target- and site-selectivity, the LDT method is limited by its slow rate and low labeling efficiency, and labeling of membrane-bound receptors in live cell systems has been problematic.
Ligand-directed acyl imidazole (LDAI) chemistry was developed to address this problem (Figure 13B).92 Due to the controlled reactivity of acyl imidazole and the selective binding driven by the ligand, acyl transfer from the LDAI reagent to a natural amino acid on the target protein surface was efficiently accelerated by a proximity effect to afford labeled receptors. Successful examples of receptor capture include GPI-anchored receptors,92 the bradykinin B2 receptor93 and the NMDA receptor.93

An alternative traceless affinity labeling reaction called affinity-guided DMAP (AGD) chemistry involves the use of a catalyst to facilitate the chemical modification of target proteins. In particular, the organocatalyst 4-dimethylaminopyridine (DMAP), a commonly used catalyst for acyl transfer reactions, has been used for membrane protein labeling. The strategy employs an affinity ligand tethered to the DMAP catalyst, which, in the presence of appropriate acyl donors, facilitates the acyl transfer reaction to a nucleophilic amino acid residue near the active site of target proteins. A series of ligand-tethered DMAP catalysts specific to a range of proteins has been designed. For instance, the SH2 domain and FKBP12 probes had high target specificity in bacterial cell lysates and animal tissue extracts.94 Recently, Ueda et al. reported a study evaluating LDT and epoxide-based ligand-directed chemistries to capture human FKBP12. They compared probes designed to label the target protein surface with probes whose reactive groups label the interior of the ligand binding pocket, and showed that only the ligand binding pocket-oriented reagents labeled the target protein, whereas the protein surface-oriented reagent was ineffective.95 Although powerful, the amino acid-based proximity-driven chemistries reported to date share the limitation that the reactive groups on the molecular probes are inherently reactive with functional groups already present in cellular systems. Moreover, they rely on the presence of specific amino acids in close proximity to the ligand-binding site, which is problematic for on- and off-target protein identification when no a priori knowledge about the target proteome is available.
As with intracellular target identification, cell-surface receptors have been shown to be amenable to photo-affinity labeling (PAL) in a method termed Capture Compound Mass Spectrometry (CCMS).96 In this case, trifunctional compounds are designed that comprise a drug molecule attached as the selectivity function, a photo-activatable reactivity function to induce a covalent cross-link with the target protein subsequent to equilibrium binding by the selectivity function, and a function for the isolation of
compound-protein conjugates. As a case study to demonstrate the targeting and capture capability of CCMS, a workflow to capture GPCRs directly on the surface of living cells has been reported.97 Here, capture compounds based on a GPCR ligand targeting the dopamine D2 receptor were employed as a proof of concept.97

Cellular Protein Expression Arrays

As described previously with display technologies, screening for targets against libraries of artificially produced proteins can be advantageous. Plasma membrane proteins, however, have been notoriously incompatible with most protein display and array techniques owing to their biophysical properties. Retrogenix (High Peak, UK) invented a method for cellular surface protein microarray that has been applied successfully to multiple target identification questions across various therapeutic areas, including antiviral,98 anti-parasitic99,100 and oncologic101 applications. Compared with more established arrayed protein techniques,102 the Retrogenix format confers several advantages, principally the enablement of membrane protein overexpression and presentation in a mammalian cellular context. This method comprises a reverse transfection array of cDNA spots encoding GFP and a single human membrane protein in a bicistronic message. These spots are prepared on a glass surface amenable to cell attachment. HEK293 cells are overlaid, and reverse transfection occurs via lipofection to express GFP and a unique membrane protein in the cell monolayer at the positions defined by the initial array. Similar to recombinant protein arrays,102 a fluorescently labeled small molecule, peptide, or antibody is allowed to bind to overexpressed protein targets, followed by washing and imaging. The array of GFP spots allows registration of the positions of all test proteins. A key advantage of this process over more conventional recombinant protein arrays is that it potentially affords well-folded membrane protein presentation in a physiologically relevant milieu, decreasing the false negative discovery rate. Similar to recombinant protein arrays, all proteins are expected to be presented at roughly similar concentrations,
in contrast to methods using physiologically correct cell or tissue lysate matrices, where critical target proteins may be expressed at levels too low to measure. Using the same three EGFR antibodies modified to display a spectrum of affinity (0.0014, 0.3, 1 μM) described in the previous section, we assessed the sensitivity of the platform. In Figure 14, the antibodies show clear replicate binding to live cells overexpressing EGFR on their surface. Since antibody binding is detected by application of a fluorescently labeled secondary antibody, application to small molecules requires tagging the molecule directly with biotin or a fluorophore. With alprenolol, mixed results were observed: binding to the expected beta-2 adrenergic receptor was confirmed, but other known targets were not seen (data not shown). An impressive feature we have observed so far with the Retrogenix approach is an absence of false positives. The fact that either a validated target is identified or no targets are identified at all contrasts with the broad swath of disproven targets found with every other target identification method. It remains to be seen whether false negative results offset the apparent absence of false positives.

PRESTO-Tango

Genetically engineered cell lines overexpressing membrane target proteins and pathway-dependent reporter genes have been used to identify targets. For example, Bassilana et al. have identified GPR39 as the target for a Hedgehog pathway inhibitor.103 Roth et al. have described a parallelized platform to interrogate a large number of druggable human GPCRs via a G protein-independent β-arrestin recruitment assay. GPCRs are known to induce arrestin translocation, which can be measured by the transcriptional activation that follows it (“Tango”).104
This assay was translated into a platform technology by parallel receptor expression and screening via transcriptional output through Tango (“PRESTO-Tango”).105 The required reagents and methods are available to the scientific community, so the platform can be considered open-source. In a modular design, a Tango gene construct was
furnished for each GPCR in the human genome. The 5′ end encodes a cleavable signal sequence for membrane localization and a Flag epitope tag to enable monitoring of cell-surface expression. The 3′ end includes the sequences for the tobacco etch virus (TEV) nuclear inclusion a endopeptidase cleavage site and the tetracycline transactivator (tTA) protein. Ligand binding to the target GPCR stimulates recruitment of the arrestin-TEV protease fusion, triggering the release of tTA. Free tTA then enters the nucleus and triggers a measurable reporter gene activity. PRESTO-Tango has been validated on 120 non-orphan human GPCR targets, but not all GPCRs are accessible, possibly because some receptors do not interact with arrestins in an agonist-dependent fashion.

3.3 Label-Free Target Enrichment

Considerations for Going Label-Free

The approaches described in the previous two sections share a common goal: the generation of target candidates and mechanistic hypotheses that can be taken forward for validation. However, none of the technologies is sufficiently robust when used in isolation, and all benefit from combination. A drawback of leveraging orthogonal techniques resides in the upfront material costs and labor required to properly execute a study, with the understanding that generation of molecular probes and labels may ultimately fail. Consequently, application and integration of techniques that do not require probe chemistry, but rather use the compound leads as they are, is an important part of a target identification platform. A few of these types of applications are described herein.

Nematic Protein Organization Technology (NPOT)

NPOT® technology utilizes a transitory pH gradient to mimic both the intra- and extracellular pH. This gradient allows ligand-protein interactions to occur at their physiological pH, followed by a precipitation event that is well described by Kirkwood-Buff theory.106 In practice, tissue or cell lysates are prepared at
4°C, in the absence of detergent, reducing agent, or protease/phosphatase inhibitors. All dilutions and washes utilize buffer with equal osmolality, trace elements, vitamins, and salts, in concentrations that recapitulate the interstitial medium or cytoplasm. Compound (≤10⁻⁶ M) is incubated with the lysate and the macromolecular assemblies are separated using a differential microdialysis system. The macromolecules (protein groups) migrate in a transitory continuous pH gradient (pH 5 to 10) to their mean zwitterionic point. Nematic crystals gradually form from the macromolecular clusters into heteroassemblies. This compound-protein mixture is subsequently trapped in mineral oil and isolated. The precipitate is solubilized and the proteins are identified by MS. To assess this approach, five compounds were studied, both to complement target identification efforts from PDD hits of interest and to investigate off-target effects. For three of the SMs, NPOT® data supported and/or led to the prioritization of specific hypotheses. In the absence of complementary data sets, it may be challenging to prioritize the information-rich protein lists to identify the direct binding partner(s). However, when these data are subjected to orthogonal methods for protein enrichment and pathway mapping, a potentially unique snapshot of the SM’s interactome may be revealed.

Unique Polymer Technology (UPT)

When the capability to generate a functional molecular probe for chemo-precipitation is lacking, a method by which compounds may be immobilized on a surface in a random orientation with no modification is desirable. Unique Polymer Technology (UPT) offers this advantage. Unmodified molecules of interest are non-covalently immobilized on a unique polymer surface, presumably via many weak interactions.9
The appropriate polymer and binding conditions are empirically determined by quantifying the amount of small molecule released during a series of washing steps.107 A biological lysate of interest is incubated with the affinity matrix and washed, and the retained proteins are identified by gel electrophoresis-MS. In a recent example, histone deacetylase 2 and prohibitin were implicated as the targets of a small
molecule targeting BrCa cells.108 It is claimed that 60% of the proteins enriched by the UPT method are either directly interacting or part of an enriched complex.9

CETSA-MS

The relatively new Thermal Shift Assay (TSA) method of CETSA-MS (Cellular Thermal Shift Assay-Mass Spectrometry) affords simultaneous determination of compound binding and target identification across a wide array of soluble and membrane-bound targets.109-112 The principle is as follows. Using either living cells or cell lysates, with or without added test compound, multiple replicate samples are heated, each at a separate temperature, for a preset period of time. The heating causes proteins to unfold and aggregate, creating a loss of signal. In a target engagement embodiment, the signal is measured by Western immunoblotting and gel quantitation.113,114 The midpoint temperature of the signal loss, designated the protein unfolding transition (Tm), is measured for each species. The measured Tm of individual proteins is affected by ligand binding; typically, binding leads to thermodynamic stabilization and a clearly measurable change in the Tm value of several °C (called the “∆Tm”). In the CETSA-MS target identification embodiment, as many as 4,000 to 6,000 proteins are measured at ten temperatures by mass spectrometric proteomics techniques, and the Tm and ∆Tm values of liganded proteins are assigned.109,110,112 By comparing the Tm values for all of the observed proteins in a cell or tissue sample with and without test compound incubation, proteins whose structure is thermodynamically affected by ligand binding are identified as potential targets. The analysis may be taken further by comparing changes in Tm values between intact cells and pre-lysed cells.
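The Tm and ∆Tm readout described above can be illustrated with a short sketch. It assumes that each protein's soluble signal has already been normalized to its value at the lowest temperature, and it estimates Tm by linear interpolation of the 50% crossing; published workflows typically fit a sigmoidal melting curve instead. The function name and the two ten-point curves are hypothetical, not data from the cited work.

```python
def melting_midpoint(temps, fractions, level=0.5):
    """Temperature at which the soluble fraction first falls to `level`.

    Uses linear interpolation between neighboring temperature points.
    Returns None if the curve never crosses `level`.
    """
    points = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= level > f1:
            return t0 + (f0 - level) * (t1 - t0) / (f0 - f1)
    return None


# Ten temperatures in degrees C, with made-up soluble fractions for one protein.
temps = [37, 41, 44, 47, 50, 53, 56, 59, 63, 67]
vehicle = [1.00, 0.98, 0.95, 0.80, 0.50, 0.20, 0.08, 0.03, 0.01, 0.00]
treated = [1.00, 0.99, 0.97, 0.93, 0.85, 0.65, 0.35, 0.12, 0.03, 0.00]

tm_vehicle = melting_midpoint(temps, vehicle)
tm_treated = melting_midpoint(temps, treated)
delta_tm = tm_treated - tm_vehicle  # positive value: ligand-induced stabilization

print(f"Tm vehicle {tm_vehicle:.1f}, Tm treated {tm_treated:.1f}, dTm {delta_tm:.1f}")
```

In a proteome-wide experiment the same calculation would be repeated for every quantified protein, and proteins with a reproducible ∆Tm would be carried forward as candidate targets.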
These observations permit inferences about the contribution of protein-protein interactions and endogenous cofactor binding.112 In an extension of this work, the dimensionality and efficiency of the process were increased by including a parallel set of experiments in which cells or lysate samples were pre-incubated with varying
concentrations of the test compound.115 Because both test compound concentration and denaturation temperature are varied simultaneously, the experiment comprises a two-dimensional profiling strategy. As above, proteins bound by a small molecule ligand will demonstrate altered thermodynamic stability and thus a shift in the midpoint temperature of the unfolding transition. In the one-dimensional version of CETSA, a binary binding measurement is made at a high compound concentration (optimally, >10x Ki for any target or off-target), and additional compound doses are used in follow-on experiments to deconvolute the relative affinities for selected proteins. The simultaneous temperature and compound concentration response profiling in the two-dimensional strategy affords an in situ confirmation of binding, as all real binders should show a dose response. Identification, confirmation, and affinity rank ordering of ligand-target interactions are thus determined in a single experiment. While fewer drug concentrations are typically used in two-dimensional CETSA than in a typical isothermal dose response experiment, and thus the quantitative resolution of the drug's affinities for different cellular targets is reduced, the efficiency of identifying and characterizing so many interactions in one experiment should more than compensate for the reduced precision.

Yeast Haploinsufficiency (HIP) and Homozygous (HOP) Profiling

The use of yeast deletion strains to provide genome-wide measurements for uncovering SM MOA is well established.116 A HIP assay involves treating a set of heterozygous deletion diploid strains with SMs that have measurable cytotoxicity. Reducing the gene dosage by half increases sensitivity to the drug’s cytotoxic effect, a phenomenon referred to as drug-induced haploinsufficiency.
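One way to picture the HIP readout is as a per-strain fitness defect score: the log2 ratio of a strain's abundance in the untreated pool to its abundance in the drug-treated pool. The sketch below is an assumption-laden illustration rather than the scoring used in the cited work; the strain names, counts, pseudocount, and function name are hypothetical, and the counts are assumed to be already normalized for sequencing depth.

```python
import math


def fitness_defect_scores(control_counts, treated_counts, pseudocount=1.0):
    """log2(control / treated) abundance for each heterozygous deletion strain.

    A large positive score means the strain was depleted under drug treatment,
    i.e. halving that gene's dosage sensitized the cells to the compound.
    """
    scores = {}
    for strain, control in control_counts.items():
        treated = treated_counts.get(strain, 0.0)
        scores[strain] = math.log2((control + pseudocount) / (treated + pseudocount))
    return scores


# Hypothetical barcode counts for three heterozygous deletion strains.
control = {"geneA/+": 1023, "geneB/+": 1023, "geneC/+": 511}
treated = {"geneA/+": 127, "geneB/+": 1023, "geneC/+": 255}

scores = fitness_defect_scores(control, treated)
ranked = sorted(scores, key=scores.get, reverse=True)

for strain in ranked:
    print(f"{strain}: fitness defect {scores[strain]:.2f}")
```

Strains at the top of such a ranking point either to the direct target or to genes functionally related to it, so the list is a source of hypotheses rather than a final answer.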

This may lead to identification of the direct target(s) of the SM or reveal proteins that are mechanistically related to function of the target(s). HOP profiling utilizes yeast haploid or diploid strains with complete deletion of
non-essential genes. Protein knockouts that confer resistance to SM treatment are identified and used in concert with HIP data to provide comprehensive analyses of SM MOAs117. As part of a target deconvolution effort for SMs derived from a CCL profiling campaign, we integrated HIP data with the suite of approaches described herein. In all cases, these data failed to provide hypotheses or add to existing biological themes. With the advent of additional data analysis options, and knowledge of the targets for some of these SMs, we are currently reevaluating these screens to assess the potential hidden mechanistic value of these data.

Other Target Engagement Technologies

As mentioned previously, CETSA is a method that relies on the thermal stabilization of the target protein to determine ligand engagement. In a similar manner, other methods such as Drug Affinity Responsive Target Stability (DARTS), Stability of Proteins from Rates of Oxidation (SPROX) and Protein-Ligand Interactions by Mass Spectrometry, Titration and H/D Exchange (PLIMSTEX) rely on enhanced target stabilization to inform on ligand engagement118-120. With DARTS, engagement is measured by increased resistance to proteolysis, typically by pronase. With SPROX, engagement is measured by a ligand-induced reduction in non-specific oxidation owing to the retention of a folded protein state. PLIMSTEX is similar to SPROX in this regard, except that it measures the propensity of the target to undergo hydrogen-deuterium exchange instead of oxidation. Each of these techniques possesses unique advantages and liabilities in its implementation, but all have been used to demonstrate ligand binding and, in some cases, ligand interaction strength.


4.0 TARGET VERIFICATION AND ENGAGEMENT

Considerations for confirmation of target hypotheses

Once a candidate target or list of target candidates for small molecule binding is generated through either direct binding or indirect mechanism of action studies, target engagement assays are typically required both for confirmation of the binding hypothesis and for evaluation of additional small molecule chemical matter. It is unlikely that a PDD hit will meet all criteria for advancement to in vivo studies without chemical optimization and confirmation. At a minimum, a dozen or more related compounds should be synthesized to understand the structure-activity relationships of the ligand-target interaction. More typically, the identification and confirmation of a disease-related target (or toxicity-related off-target) galvanizes development and implementation of a full high throughput screening campaign to identify both structurally related and unrelated chemical matter that might act more favorably or selectively on the target and the disease phenotype. Because the direct target engagement assay (or enzymatic activity-based assay, if relevant) is linked to the phenotypic assay from which the effort originated, it likely holds an advantage over a target-based assay both in potentially exploiting previously unidentified targets and in the foreknowledge that the target is druggable.

4.1 CETSA

As described above, CETSA is employed to characterize the temperature at which individual proteins within cells or lysates unfold. In the original target engagement example of CETSA,114 individual proteins were monitored by immunoblot analysis.
This comparatively straightforward technique allows one to confirm binding and to rank-order compounds by affinity in intact cells and lysates, in cell culture, in intact tissue, and even in whole animals.121 One can gain mechanistic insight into compound effects on protein-protein and protein-cofactor interactions by comparing intact cell and lysate affinities.122 The breadth of possibilities is rich, as long as an adequate antibody can be obtained,


and recent reviews and many original research publications document that the technique can be applied across a variety of target types and in many different laboratories122,123. It is worth noting that in some cases the lack of a CETSA stabilization or destabilization signal may be a biophysical consequence of the type of binding rather than an actual inability of the small molecule to bind. Such false negatives are reported to occur in rare cases.113

4.2 NanoBRET

The NanoBRET platform was introduced for the analysis of protein-protein interactions in 2015124 and was rapidly followed by publication of a technique for measuring intracellular SM-target interactions125 akin to in vitro time-resolved fluorescence resonance energy transfer (TR-FRET) probe displacement assays.126 Both the in vitro TR-FRET and intracellular NanoBRET versions of the assay comprise homogeneous (no wash) formats and provide a highly specific measurement of target engagement. In the in vitro method the protein of interest is the only target present; in the NanoBRET method specific tagging of the target of interest affords discrimination. The target specificity inherent in the NanoBRET cellular version is critical so that the presence of so many other potential targets in a cell does not interfere. While the compound and probe molecule may bind many proteins, a NanoBRET ratiometric signal is only measured when the probe engages the tagged protein. Further, tagged proteins can be introduced to cells by transient transfection under conditions where the expression level is as low as that of the endogenous protein,125 in order to minimize the cellular consequences of overexpression; assay detectability remains high owing to the highly sensitive engineered Nanoluciferase (Nluc) enzyme and novel cell-permeable substrate system127. In several recent instances, the sensitivity afforded by the detection system was exploited to remove the limitations of exogenous expression entirely. White et al.
fused DNA encoding full-length Nluc into the endogenous genomic loci of genes coding for β-arrestin2 and CXCR4 to measure β-arrestin recruitment and GPCR internalization and trafficking.128 Schwinn et al. describe a


general technique for using CRISPR/Cas9 technology and ribonucleoprotein electroporation to efficiently tag endogenous proteins in multiple cell types.129 Both approaches promise to enhance the quality and efficiency of the NanoBRET intracellular target engagement format and can afford low throughput ligand-target binding confirmation and high throughput screening for new ligands for confirmed targets. Finally, because the NanoBRET probe displacement assay is non-lytic, all measurements including kinetic displacement analyses can be carried out in intact cells. Thus compounds can be characterized not only for intracellular target engagement but also for their kinetic binding off-rate.125 While the technology holds significant advantages in efficiency over CETSA, CETSA requires no protein tagging, which makes it applicable to more cell types and affords translatability. Additionally, a small molecule that binds at a different site on the target of interest than the probe and does not modulate binding of the probe is effectively invisible in NanoBRET probe displacement assays but not in CETSA. A probe can be made from a small molecule of interest in order to expand the ability to discover new ligands to a novel binding site, but initial target engagement studies by NanoBRET are most easily performed when the target hypothesis falls into a well described enzyme class where probes exist for a range of targets within the class, for example kinases. This is nicely demonstrated by the publication of a 178 kinase NanoBRET array for assessing intracellular kinase selectivity130.

4.3 Bump-Hole

One challenge associated with matching a target with a compound identified from a phenotypic screen is definitively confirming that the target identified drives the phenotype.
Although many of the techniques described in this publication can show whether a compound interacts with a target, there is still a risk that off-targets are the real driver of said phenotype. One exceptionally powerful method for confirming a target hypothesis is the “bump-and-hole” strategy whereby the test compound is modified with a sterically demanding group (a “bump”) that matches a target with amino acid substitutions that


accommodate the bump (the “hole”) such that only the hole-modified target can bind the bump-modified drug, while neither the native on-target nor potential off-targets can bind it131,132. In cells expressing the mutated target, the bump-modified drug can then elicit the phenotype only via interaction with the modified target. This strategy has been used for a number of targets including kinases, BET-family proteins, and others.

4.4 Imaging and target localization

Target localization is the process of determining where the drug target is located in cells and even whole organisms. Knowledge of target localization can increase success in target identification in the drug discovery process. For example, cell-membrane proteins require a different target identification strategy than cytosolic or secreted proteins. Determining the subcellular localization may also be useful in prioritizing drug targets. Extracellular domains of plasma membrane proteins and secreted proteins are easily accessible by drug molecules and could be prioritized over cytosolic proteins in certain cases. Several techniques are available to determine target localization in cells and tissues. For instance, FRET leverages the proximity between a ligand and its target, resulting in a quantitative readout for target occupancy133. Ligand-directed protein labeling covalently tags a target protein with a reporter and can be used not only to visualize or isolate target proteins, but also to quantify target engagement134. CETSA can monitor binding of unlabeled drug compounds to target proteins. Recently, single cell, spatially resolved CETSA has been achieved with adherent cells135. The ability to visualize targets in live cells, or even in whole organisms, not only provides insights into target localization for the purpose of target identification, but could also improve testing of clinical efficacy of new small molecule inhibitors.
A number of small molecule target imaging agents have been developed that leverage bioorthogonal two-step labeling methods. An attractive strategy entails equipping the drug molecule with a reactive trans-cyclooctene (TCO), which leaves the drug cell-membrane permeable; upon binding its target, the drug can be visualized using a bioorthogonal tetrazine (Tz)-labeled fluorophore. Using this strategy, it was possible to image aurora kinase A (AKA) in live cells and to quantify cellular kinase expression136. Other successful examples leveraging Tz/TCO bioorthogonal pre-targeting include BTK137, MET138, EGFR139 and PARP140.

Molecular imaging, often via positron emission tomography (PET), is a noninvasive biomedical technique that enables the quantification and visualization of drug targets in human and pre-clinical subjects by monitoring the time-dependent distribution of minute quantities of drug labeled with positron-emitting isotopes. Although one of the major applications of PET is the study of target occupancy, it is also a prime technology to determine target localization in humans and preclinical species. By measuring the effect of different drug doses on radioligand binding, one can determine tissue penetration, target engagement, and the relationship between plasma exposure and target occupancy in vivo. Although PET is a powerful technology to study target localization both clinically and pre-clinically, it relies on radioligands that must be selective for the target of interest and possess a range of physicochemical and pharmacological characteristics that allow them to be radiolabeled with short-lived positron-emitting isotopes, safely administered to humans, and quantified on the target in vivo141.

4.5 SM-Protein Target Engagement Methods (ASMS and TSA)

Figure 2 and section 2.1 above discuss the scale of the challenge in agnostically screening PDD hits against many targets. However, it is noteworthy that modalities exist to screen multiple protein targets in parallel against a compound collection in affinity-based HTS.
For example, a collection of purified proteins of relevance to a particular pathway can be screened rapidly by affinity selection-mass spectrometry (ASMS), and then confirmed in cellular pathway activity assays142. In this type of study there is no need to tag the small molecules, and all proteins are controlled to the same concentration. However, the


library of proteins is restricted to those that are individually cloned, expressed, and purified. Several iterations of mass spectrometry-based direct ligand detection methods, based on either size-exclusion chromatography or ultrafiltration, are robust in our estimation and can be applied to this matrixed approach of identifying ligands for multiple protein targets in parallel.143-145 Thermal shift analysis (TSA), also known as differential scanning fluorimetry, can likewise be applied to detect ligands for an array of purified protein targets146,147. The ligand detection is less direct than with ASMS techniques, as binding is inferred from thermal stabilization or destabilization of the protein being measured, but the protein detection is more direct, in a sense, because one measures the kinetic process of thermally-induced protein unfolding. This protein detection-based method requires excess compound in order to saturate binding sites and provide a clear signal, akin to the requirements for lower throughput NMR-based methods148.
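As a rough illustration of how a thermal shift readout can be turned into a binding call, the sketch below estimates the unfolding midpoint (Tm) of a melt curve by linear interpolation at 50% unfolded and reports the ligand-induced shift. The function names, the normalization of the signal to a 0-1 unfolded fraction, and the toy curves are illustrative assumptions rather than the procedure used in the cited studies; practical analyses typically fit a sigmoidal model to the full curve instead.

```python
from typing import Sequence


def melting_midpoint(temps_c: Sequence[float], unfolded: Sequence[float]) -> float:
    """Return the temperature at which the unfolded fraction crosses 0.5.

    Assumes temps_c is ascending and unfolded is a normalized (0-1) signal.
    """
    points = list(zip(temps_c, unfolded))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if f_lo < 0.5 <= f_hi:
            # Linear interpolation between the two bracketing points.
            return t_lo + (0.5 - f_lo) * (t_hi - t_lo) / (f_hi - f_lo)
    raise ValueError("curve does not cross 50% unfolded")


def tm_shift(temps_c: Sequence[float],
             unfolded_vehicle: Sequence[float],
             unfolded_ligand: Sequence[float]) -> float:
    """Ligand-induced midpoint shift; a positive value means stabilization."""
    return (melting_midpoint(temps_c, unfolded_ligand)
            - melting_midpoint(temps_c, unfolded_vehicle))


# Toy, made-up curves for illustration only (not experimental data).
temps = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0]
vehicle = [0.0, 0.1, 0.5, 0.9, 1.0, 1.0]
ligand = [0.0, 0.0, 0.1, 0.5, 0.9, 1.0]
shift = tm_shift(temps, vehicle, ligand)
print(f"Tm shift: {shift:.1f} degrees C")
```

Because the shift is computed per protein from a vehicle and a ligand curve, the same calculation could in principle be repeated across an array of purified targets; what counts as a meaningful shift depends on the assay noise and is not specified here.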


5.0 Technology Integration – Case Studies

5.1 Strategic Considerations for Technology Integration

Since no single technology exists that will routinely identify the direct targets for SMs of interest, there is a need for a coherent integration strategy. Many of the enabling approaches are only available commercially, so routine utilization of multiple compound-technology combinations is generally cost-prohibitive. Our initial objective for SMs derived from both PDD and TDD is to identify or rule out common targets and mechanisms. The key to this prioritization step involves profiling technologies that broadly survey SM biological perturbations and coupling these data with a companion reference database. In cases where a project team has a question about a small set of compounds, either from TDD or PDD campaigns, we generally suggest starting with BioMAP profiling (Figure 6). We then initiate DR-L1000 (Figure 4) and include the initial compounds, key analogs, and BioMAP reference database match compounds in our profiling submission. Conversely, if we have a large number of phenotypic screening hits of interest (e.g. 10-50), we usually start with DR-L1000 profiling to inform on the number of MOA buckets that exist, prior to BioMAP DiversityPlus profiling. Moving forward, we envision profiling a larger set of TDD or PDD hits using the BioMAP ETP and DS as an additional level of prioritization, prior to implementing both DR-L1000 and BioMAP DiversityPlus. It is important to emphasize that a reference database match may not be due to direct engagement of the SM with the protein target. Although a direct binding hypothesis should be followed up, the data could also be consistent with an indirect perturbation. In the absence of a strong database hit, target enrichment approaches should be considered.
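The profiling order described above can be summarized as a simple decision rule. The sketch below is only a schematic restatement of that text: the function name, the `strong_database_match` flag, and the cutoff of 10 compounds separating a small set from a large number of hits are assumptions introduced for illustration (the text gives 10-50 only as an example of a large set).

```python
from typing import List


def profiling_plan(n_compounds: int,
                   strong_database_match: bool = True,
                   large_set_cutoff: int = 10) -> List[str]:
    """Return an ordered list of suggested steps for a set of SMs.

    large_set_cutoff is a hypothetical threshold, not a value from the text.
    """
    if n_compounds < 1:
        raise ValueError("need at least one compound")
    if n_compounds >= large_set_cutoff:
        # Many phenotypic hits: count MOA buckets first, then profile in depth.
        steps = ["DR-L1000", "BioMAP DiversityPlus"]
    else:
        # Small, focused set: BioMAP first, then DR-L1000 with key analogs
        # and BioMAP reference-database match compounds included.
        steps = ["BioMAP", "DR-L1000"]
    if strong_database_match:
        # A match may still reflect an indirect perturbation, so the
        # direct-binding hypothesis has to be tested rather than assumed.
        steps.append("follow up direct-binding hypothesis")
    else:
        steps.append("target enrichment")
    return steps


print(profiling_plan(3))
print(profiling_plan(25, strong_database_match=False))
```

In practice the database-match outcome is only known after the profiling steps have been run, so the flag here stands in for a decision made at that later point.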
If the SMs of interest are ‘mature’, either derived from a PDD informer library or part of an active TDD medicinal chemistry effort, they are usually more amenable to target enrichment strategies involving a modified parent compound. For example, a linker-modified probe for AC-MS may be synthesized from existing intermediates using established SAR.


Although cellular potency is not a direct predictor of successful target identification, it provides a theoretical advantage for non-covalent protein enrichment and identification. In addition, SMs optimized against a specific target may have a reduced off-target profile, thus minimizing the complexity of data deconvolution. Emerging technologies that facilitate label-free enrichment will likely play an important role in the target identification tool box and complement the linker-based techniques. To demonstrate successful implementation of the concepts described, we have highlighted two case studies. The first demonstrates successful use of phenotypic screening to prioritize a hit based on perceived unique biology, followed by AC-MS for target identification. In the second example, we lacked a clear target enrichment hypothesis but identified a gain-of-function target using SM-mediated acquired resistance (spontaneous and positive selection genome-wide CRISPR approach). Engagement of Compound CGS-0059 with the putative target was verified using CETSA.

5.2 High-Content PDD: AbbVie’s BETi Story

During 2006-2008, we initiated a campaign to elucidate the selectivity, MOA, and target engagement profiles for SMs derived from a variety of sources, including pipeline TDD projects, PDD screens, and SM cellular active libraries. One of the main technology platforms that we adopted was Protein-fragment Complementation Assays (PCAs) developed by Odyssey Thera, Inc. (ODT)149,150. PCAs facilitate the detection and localization of complexes formed by two proteins that interact as part of a normal cellular pathway. A PCA is created by expressing the genes of a pair of interacting proteins, each linked in-frame to a fragment of a rationally dissected reporter gene. The association of the two proteins of interest brings together complementary reporter fragments and enables productive folding into an active structure.
The resulting quantifiable signal can then be spatially localized in living cells by automated high-content microscopy. Using PCA, ODT developed a profiling platform to probe the activity of over 130 protein-protein interactions covering critical pathways in the context of living cells. Each assay was run at two or


three time-points in HEK-293 cells, with a subset receiving a pretreatment stimulation with pathway activators or inhibitors. A complementary panel of immunofluorescence and biochemical assays provided additional value for interpreting the effects of compound treatment. One of the more attractive features of this platform was access to a SM reference database. This allowed for rapid identification of common MOAs and ultimately provided a ‘SM uniqueness index’ that supported data-driven decision making. Using the ODT platform and an integrated technology approach, we were able to impact programs across multiple therapeutic areas. The most noteworthy example was the origination of our BETi program. Compound CMPD-604 was initially investigated as a lead from an anti-inflammatory program (WO 2009084693 A1)151. However, observations of growth inhibition and cytotoxicity against a small panel of cell lines prompted a closer analysis of the PCA perturbation profiles. With multiple activities correlating to stress/inflammation, cell cycle control, and apoptosis, we focused our attention on the potential of this molecule as an anti-cancer compound. Potent anti-proliferative effects were observed for the tumor cell lines profiled in the ODT panel (SKOV3; HCT116), and the PCA signature shared activities with pre-clinical and clinical oncology compounds included in the ODT reference library. The PCA signature for CMPD-604 did not correlate directly with a specific reference molecule, suggesting that the MOA was potentially unique or at least not a common MOA (Figure 15A). Prior to initiating a target identification campaign, CMPD-604 was profiled against panels of enzymes and receptors.

The compound was mostly inactive, but did inhibit the peripheral benzodiazepine receptor (p-BZD), with an IC50 of 400 nM. These data prompted profiling CMPD-604 analogs against p-BZD and correlating this activity with growth inhibition of a DoHH2 tumor cell line. The lack of correlation suggested that p-BZD was not a driver of the anti-proliferative activity (Figure 15B). Additional cancer cell line profiling established CMPD-604 as a broadly active compound with enhanced potency and maximal growth inhibition against hematological non-adherent tumor cell lines. Beyond an attractive cellular profile, the


decision to initiate chemistry to synthesize an AC-MS probe for a target deconvolution campaign is often influenced by the existence of chemical functionalities that can be directly modified. CMPD-604 had carboxylate and aryl halide functionalities, facilitating the synthesis of the resin-bound SM using multiple vectors. The resin was incubated with K562 tumor cell lysate, in the presence and absence of free CMPD-604 (10 µM). The bromodomain family of proteins (BRD2, BRD3, BRD4, and BRDT) was highly enriched and competed by free CMPD-604, nominating these proteins as putative mediators of the anti-proliferative activities (Figure 15C).

Since the parent compound was amenable to direct modification, it took