

Computational Chemistry

Covadots: In Silico Chemistry-Driven Tool to Design Covalent Inhibitors Using a Linking Strategy Laurent Hoffer, Magali Saez-Ayala, Dragos Horvath, Alexandre Varnek, Xavier Morelli, and Philippe Roche J. Chem. Inf. Model., Just Accepted Manuscript • Publication Date (Web): 25 Mar 2019 Downloaded from http://pubs.acs.org on March 26, 2019


Journal of Chemical Information and Modeling

Covadots: In Silico Chemistry-Driven Tool to Design Covalent Inhibitors Using a Linking Strategy

Laurent Hoffer (a), Magali Saez-Ayala (a), Dragos Horvath (b), Alexandre Varnek (b), Xavier Morelli (a,*) and Philippe Roche (a,*)

(a) Aix Marseille Univ, CNRS, Inserm, Institut Paoli Calmettes, CRCM, Marseille CEDEX 09 13273 France

(b) Laboratoire de Chemoinformatique, CNRS UMR7140, 1 rue Blaise Pascal, 67000, Strasbourg, France

(*) To whom correspondence should be addressed. E-mail: [email protected] [email protected]


Abstract

We recently reported an integrated fragment-based optimization strategy called DOTS (Diversity Oriented Target-focused Synthesis) that combines automated virtual screening (VS) with semirobotized organic synthesis coupled to in vitro evaluation. The molecular modeling part consists of hit-to-lead chemistry, based on the growing paradigm. Here, we have extended the applicability of the DOTS strategy by adding new functionalities, allowing a generic chemistry-driven linking approach with a particular emphasis on covalent drugs. Indeed, the covalent mode of action can be described as a specific case of linking, where suitable linkers are sought to fuse a bound organic compound with a nucleophilic protein side chain. The proof of concept is established using three retrospective study cases in which known noncovalent inhibitors have been converted to covalent inhibitors. Our method is able to automatically design reference covalent inhibitors (and/or analogs) from an initial activated substructure and predict their binding mode. More importantly, the reference compounds are ranked high among several hundred putative adducts, demonstrating the utility of the approach to design covalent inhibitors.


Introduction

Hit identification has been greatly facilitated by advances in screening technologies and by the development of fragment-based strategies 1. However, one major problem remains the hit-to-lead (H2L) optimization of validated hits to derive lead compounds with high affinity for the target 2. In this H2L optimization context, molecular modeling coupled to structural data can be employed to extract or design active ligands with reasonable physicochemical properties and good accessibility by organic synthesis 3. We previously described the DOTS methodology, which consists of an integrated strategy for the hit-to-lead optimization of fragments by combining automated virtual screening (VS) with robotic-based synthesis coupled to the experimental evaluation of compounds 4. While the original approach focused purely on the growing strategy, new functionalities were implemented in the present work to allow generic chemistry-driven linking. Covalent ligands received particular focus because covalent design can also be described as a specific case of linking, where suitable linkers are sought to fuse a small molecule with its protein target.


By definition, covalent drugs contain an electrophilic moiety (called “warhead” in this context) able to react with a nucleophilic group from protein residues. The result is a covalent protein-ligand complex (“covalent adduct”). Typical nucleophilic groups are thiol from cysteine or hydroxyl from serine, but additional residues such as lysine, histidine, threonine, tyrosine or methionine can also be targeted 5, 6. In practice, the inhibition of the target usually occurs in a two-step mechanism. First, the reactive compound binds noncovalently to the protein with its reactive warhead close to a nucleophilic residue of the target. This step is mainly driven by the binding affinity of the compound. Then, the covalent linkage is formed in situ, resulting in a covalent protein-ligand complex. This step is mainly governed by the reactivity of the warhead and nucleophilic group.
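The two-step mechanism described above is conventionally summarized by the kinetic scheme below. The symbols (E for the enzyme, I for the inhibitor, $K_I$, $k_{\mathrm{inact}}$, $k_{\mathrm{obs}}$) are standard textbook notation for irreversible inhibitors and are not taken from the present work; the block is given only as an illustration of how affinity and reactivity enter separately.

```latex
\begin{align}
\mathrm{E} + \mathrm{I}
  \;\underset{k_{\mathrm{off}}}{\overset{k_{\mathrm{on}}}{\rightleftharpoons}}\;
  \mathrm{E{\cdot}I}
  \;\xrightarrow{\;k_{\mathrm{inact}}\;}\;
  \mathrm{E{-}I}
\\[4pt]
k_{\mathrm{obs}} \;=\; \frac{k_{\mathrm{inact}}\,[\mathrm{I}]}{K_I + [\mathrm{I}]}
\end{align}
```

Here $\mathrm{E{\cdot}I}$ is the noncovalent complex and $\mathrm{E{-}I}$ the covalent adduct. $K_I$ is the inhibitor concentration giving half of the maximal inactivation rate and reflects the first (binding) step, whereas $k_{\mathrm{inact}}$ reflects the second (bond-forming) step.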

Covalent drugs have frequently been associated with putative toxicity and nonspecific binding due to their intrinsic reactivity. Nonetheless, several well-known drugs in the history of pharmacology are covalent ligands (e.g., aspirin, β-lactam antibiotics), although most of them were discovered by serendipity and, more remarkably, their covalent mode of action was identified only much later 7.

Recently, there has been a resurgence of covalent drugs 8, and several have been approved by the FDA over the past few years (Figure 1.A). For instance, neratinib (HKI-272) 9 is used to treat breast cancer and is reported as a dual inhibitor of human epidermal growth factor receptor 2 (HER2) and human epidermal growth factor receptor (EGFR) tyrosine kinases. Similarly, afatinib (BIBW-2992) 10, dacomitinib (PF00299804) 11 and osimertinib (AZD9291) 12 are second or third generations of irreversible small molecule inhibitors of the activity of the EGFR family. All of them are used to treat metastatic non-small cell lung cancer. Finally, ibrutinib 13 is used to treat B-cell cancers by covalently targeting Bruton’s tyrosine kinase.


Figure 1: A. Chemical structures of FDA-approved drugs that form covalent interactions with their targets. Reactive groups are highlighted in red and prodrugs that form disulfide bonds in green. B. List of warheads considered to target the cysteine residue. C. List of warheads considered to target the serine residue. Highlighted text represents the reference warhead for each study case. Warhead classes present in FDA-approved drugs are indicated by solid colored circles.

Covalent drugs can have great benefits in terms of potency or selectivity within a class of related targets. This feature is particularly interesting in the cancer therapy field to achieve good selectivity for a specific protein (or a mutant with constitutive activity) that is overexpressed in cancer cells. Finally, they usually possess extended duration of action because of the covalent linkage, permanently disabling the targeted protein 14.

In silico Methods to Design Covalent Inhibitors

Covalent ligands are particularly challenging in the molecular modeling field. Indeed, their reactive nature involves the formation of a covalent bond, a quantum mechanical phenomenon that is not properly handled by most energy-assessment methods such as nonreactive force fields (FFs) or empirical scoring functions 15. For instance, FFs do not take electrons into account explicitly and thus are unable to model bond breaking or creation. In other words, the topology of the system remains identical during the simulation: only the coordinates of atoms can change. In contrast, quantum mechanical theory handles electrons but is computationally expensive and cannot be applied in VS 16.
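To make the fixed-topology argument concrete, the bonded stretch term of a typical nonreactive force field can be written as below. The harmonic form and the symbols $k_b$, $r_b$ and $r_{0,b}$ are assumptions chosen for illustration; the exact functional form of the FF used in this work is not specified at this point of the text.

```latex
\begin{equation}
E_{\mathrm{bond}} \;=\; \sum_{b \,\in\, \mathrm{bonds}} k_b \,\bigl(r_b - r_{0,b}\bigr)^{2}
\end{equation}
```

The sum runs over a list of bonds that is fixed before the simulation starts. Because each term grows without bound as $r_b$ departs from its reference length $r_{0,b}$, a bond in the list can never dissociate, and a pair of atoms absent from the list can never become bonded: only the coordinates change.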

Various computational methods and their challenges in the context of covalent ligands were recently reviewed 17. The main approximation of all docking-based approaches concerns the total neglect of energetic contribution from the covalent bond formation because most docking methods were developed to handle typical noncovalent ligands and were subsequently adapted to address covalent ligands. This main drawback is, however, not critical when both the reactive residue and an appropriate electrophilic warhead are known from previous experiments. In such cases, the goal of molecular modeling is reduced to finding putative binding modes for a given compound that fits into the binding pocket while exhibiting reasonable geometry around the new covalent bond formed between the ligand and the protein residue.


Most programs that enable covalent docking perform conformational sampling of the ligand within the binding site with constraints around the targeted nucleophilic protein residue. The main question raised by any computer method is when to create the covalent bond. Some approaches generate the covalent adduct before the docking stage during the design of the virtual library or preparation of inputs, while others create it ‘on the fly’ during the conformational sampling according to both geometric and energetic criteria. For instance, popular docking programs such as GOLD 18, FlexX 19, ICM 20 and AutoDock 21, 22 rely on preformation of the covalent bond, while DOCKovalent 23, FITTED 24 and CovDock 25 belong to the second category. An alternative strategy relying on the pharmacophore concept was also reported for the design of covalent inhibitors 26. Similarly, the LigandScout tool 27, which uses pharmacophores for the VS stage, was employed to discover covalent inhibitors for cathepsin S 28 and enteroviral 3C protease 29 targets. Finally, DOCKTITE, a fast and versatile method implemented within the Molecular Operating Environment (MOE), also uses a pharmacophore-guided docking approach 30.


Despite their crude approximations regarding theory, in silico methods have demonstrated their ability to design potent covalent inhibitors in many prospective projects 7, 17, 31.

Here, we report a generic linking upgrade to the recently described DOTS strategy 4 that is able to tackle the specific case of designing putative covalent probes starting from one known noncovalent inhibitor or fragment-like hit. The proof of concept is demonstrated using three retrospective cases, including one with FDA-approved drugs, different nucleophilic residues and common warheads used in the covalent drug field (Figure 1.B and 1.C). A focus on warheads found in FDA-approved drugs was adopted during this retrospective study. For instance, acrylamide, nitrile, propargyl and epoxide functions are already present in drugs on the market (Figure 1.A). Additional warhead classes from the covalent inhibitor literature 32 were also considered (for example, maleimide, aldehyde and vinylsulfonyl functions). While the vinylsulfonyl moiety is not currently present in FDA-approved drugs, recent studies were reported in which it has been successfully used 33, 34; notably, NU6300 is the first covalent ATP-competitive inhibitor targeting the CDK2 protein 35.

Results and Discussion

Global Strategy

The goal of the approach is to transform a hit or known noncovalent ligand into a covalent inhibitor by identifying chemical moieties that may link a reference substructure with a predefined nucleophilic residue of the target while 1) maintaining its original binding mode according to the fragment-based drug design (FBDD) paradigm, 2) achieving reasonable geometry for the covalent adduct and 3) achieving expected good synthetic accessibility. In practice, compounds with the covalent warhead would be synthesized, and the covalent adduct would be created in situ. While DOTS was designed for computer-based chemistry-driven growing purposes, minor modifications of the pipeline allow the generic linking optimization of two activated nonoverlapping bound fragments. At this stage, covalent H2L can be described as a specific case of linking where the second fragment is systematically a protein residue that contains a nucleophilic function.

As in the growing strategy, the workflow consists of two main stages: the design of a focused chemical library and its subsequent VS under constraints with the S4MPLE tool 36, 37. In contrast to the previous approach, both linking and covalent strategies rely on a two-step design for the virtual library (Figure 2). All reactions were defined accordingly: a nucleophilic group from the targeted protein residue as first reactant and a reference warhead as second reactant. In the first step, similar to the growing approach, a focused library is generated by combining an activated fragment with a collection of functionalized building blocks (BBs) using common in silico-encoded chemical reactions that are relevant in medicinal chemistry 38. In the second step, this intermediate library is used as a new collection of putative reactive compounds to generate covalent adducts starting from a protein residue (here cysteine or serine) and a set of covalent reactions extracted from the covalent drug literature 32. Additional popular warheads to target cysteine were also considered: nitrile, acrylamide/Michael-acceptor and thiol (Figure S1). Nitrile and aldehyde were the warheads considered to target the serine residue 39 (Figure S2). The role of this second design step is to create putative covalent adducts. All these “covalent adducts” define the raw focused covalent library that is screened under restraints during the VS stage. It should be noted that the targeted protein residue is treated as part of the covalent adducts because this covalent mode is designed as a specific case of linking strategy. In other words, the protein residue is removed from the binding site during the preparation of inputs. It is simulated as part of the ligand (Figure 2) with the FF dedicated to small molecules (see Material and Methods).
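The two-step library design described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the DOTS implementation: compounds are represented by labels rather than structures, and the rule tables `CHEM_RULES` and `COVALENT_RULES`, the building blocks `bb1` to `bb4` and the function names are all hypothetical. The rule tables are deliberately small subsets chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class BuildingBlock:
    name: str
    reactive_function: str   # function used to attach the BB to the fragment
    warhead: Optional[str]   # electrophilic group carried by the BB, if any


# Hypothetical rule tables (illustrative subsets only).
# (fragment function, BB function) -> medicinal chemistry reaction
CHEM_RULES: Dict[Tuple[str, str], str] = {
    ("amine", "carboxylic_acid"): "amide_coupling",
}
# (residue nucleophile, warhead) -> covalent reaction
COVALENT_RULES: Dict[Tuple[str, str], str] = {
    ("thiol", "acrylamide"): "michael_addition",
    ("thiol", "nitrile"): "thioimidate_formation",
    ("thiol", "thiol"): "disulfide_formation",
    ("hydroxyl", "nitrile"): "imidate_formation",
    ("hydroxyl", "aldehyde"): "hemiacetal_formation",
}


def grow(fragment_function: str, bbs: List[BuildingBlock]) -> List[Tuple[str, str]]:
    """Step 1: keep BBs that can react with the fragment AND carry a warhead."""
    library = []
    for bb in bbs:
        if bb.warhead is None:
            continue  # a BB without warhead cannot give a covalent compound
        if (fragment_function, bb.reactive_function) not in CHEM_RULES:
            continue  # no encoded reaction couples this BB to the fragment
        library.append((f"frag-{bb.name}", bb.warhead))
    return library


def form_adducts(nucleophile: str, library: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Step 2: attach the residue to each compound whose warhead matches a rule."""
    adducts = []
    for compound, warhead in library:
        reaction = COVALENT_RULES.get((nucleophile, warhead))
        if reaction is not None:
            adducts.append((f"{compound}-residue", reaction))
    return adducts


bbs = [
    BuildingBlock("bb1", "carboxylic_acid", "nitrile"),
    BuildingBlock("bb2", "carboxylic_acid", None),       # rejected: no warhead
    BuildingBlock("bb3", "boronic_acid", "acrylamide"),  # rejected: no matching reaction
    BuildingBlock("bb4", "carboxylic_acid", "aldehyde"),
]
intermediate = grow("amine", bbs)                     # intermediate focused library
cys_adducts = form_adducts("thiol", intermediate)     # cysteine as targeted residue
ser_adducts = form_adducts("hydroxyl", intermediate)  # serine as targeted residue
print(intermediate, cys_adducts, ser_adducts)
```

The filter in `grow` mirrors the requirement that a BB must carry both a compatible reactive function and a warhead, which is why only two of the four toy BBs reach the intermediate library.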

The raw focused covalent library is then postprocessed to extract a representative subset of diverse compounds with reasonable physicochemical properties. In addition, atoms that must be constrained during the VS are automatically flagged using a maximum common substructure (MCS) search with two reference substructures (the fragment and the residue backbone). Thus, both ends of each generated compound are flagged.

[Figure 2 schematic; labels: 1) Organic Reactions (Reaction 1 ... Reaction N), Medicinal Chemistry, Activated Fragment, Collection of BBs, Intermediate Focused DB; 2) Covalent Reactions (Reaction 1’ ... Reaction N’), Protein Residue, Warheads, Raw Focused Covalent DB (Covalent Adducts).]

Figure 2: Building of the raw focused covalent library, composed of covalent adducts, with cysteine as an example residue. 1) DOTS-like growing stage to generate an intermediate focused DB starting from an activated fragment, common medicinal chemistry reaction schemes and a source of commercially available BBs. 2) Linking-like stage using a given protein residue, a set of covalent reaction schemes and the intermediate DB to design putative covalent adducts. The intermediate focused DB, usually composed of several thousand compounds, is used as a source of potential reactive covalent compounds. It should be noted that in the particular case of CovaDOTS, BBs must possess both a reactive function and a warhead. This considerably reduces the number of putative BBs that can be used as a linker in this covalent context.


The second main stage, namely VS, is also very similar for the three modes (growing/linking/covalent). It consists of a constrained conformational sampling to maintain the original position of the reference substructures, followed by a postprocessing protocol involving the constraint-free minimization of the best poses to both optimize the geometry and compute unbiased energies.

In the original growing approach in DOTS, a single extremity of the compounds was constrained during the conformational sampling because there was only one reference fragment. In contrast to the growing strategy, both the linking and covalent modes rely on two reference substructures, and the goal consists of finding an appropriate moiety (or linker) that could link the two original substructures while maintaining their initial binding mode. Atoms from both extremities of each ligand that were flagged during the virtual library preparation are constrained during the conformational sampling, and one bond at the substructure-linker interface is automatically defined as broken. This “broken bond” trick 37 allows sampling of the linker moiety despite the two fixed extremities in the ligand without deleting any harmonic force field contributions. In practice, only the backbone of the protein residue is constrained in the covalent mode, enabling its side chain to move freely during the simulation, especially if torsional rearrangements are required at the covalent adduct interface.

All constraints are removed during the postprocessing stage in both the growing and linking modes. In contrast, in the covalent mode, constraints are kept on the backbone of the targeted residue to strictly maintain its position in the protein binding site. This procedure is made possible by an energy trick: no clash is detected during energy computations because both the backbone atoms (included in the covalent adduct) and the surrounding residues from the binding site are fixed throughout the simulation. Table 1 summarizes the status of the different entities during any type of hit-to-lead simulation.

Mode        Conformational Sampling Stage          Postprocessing Stage
            Fragment 1   Fragment 2   Linker       Fragment 1   Fragment 2   Linker
Growing     fixed        na           free         free         na           free
Linking     fixed        fixed        free         free         free         free
Covalent    fixed        fixed        free         free         fixed        free

Table 1: Status of different entities during a simulation according to the optimization mode (growing, linking, covalent) and VS stage (conformational sampling, postprocessing). The term “fragment” represents the reference substructure. In growing mode, the term “linker” corresponds to newly added atoms in the given ligand. In covalent mode, “fragment 2” corresponds to the backbone from the targeted protein residue.

Retrospective Validation of the Approach

This computer-based covalent H2L protocol was challenged using three retrospective studies with distinct targets, several warheads (Figure 1.B and 1.C) and the two most commonly targeted nucleophilic residues (cysteine and serine).

A reasonable size of the virtual covalent library is required to challenge the VS approach. Therefore, for each targeted residue, different warheads have been selected to increase the number of compounds in the designed chemical libraries. Indeed, only BBs that possess both a warhead and an activated function compatible with the selected chemical reaction(s) are automatically considered during the two-step design stage. This drastically reduces the number of available BBs for several warhead classes. For example, a disulfide rule was added to generate covalent adducts between a thiol group and cysteine. It should be noted that disulfide bonds between cysteine and covalent ligands are mainly created using a tethering strategy 40 in which each ligand from the screened library already contains a disulfide bond in its structure. However, it is still possible to transform the thiol group into a tethering-compliant group with additional synthesis steps for concrete use, although this kind of warhead should not be prioritized in the development of covalent drugs.

Three end points or criteria were used to validate the covalent H2L approach: i) the ability to automatically generate the reference covalent inhibitor (and/or close analogs) and their related covalent adducts starting from an original substructure with a reactive group, a collection of BBs and encoded chemistry rules, ii) the ability to accurately reproduce the known binding mode of the reference covalent adduct in the binding site and iii) the ability to achieve a good ranking in the VS for the reference compounds within the collection of putative adducts. Two distinct rankings were analyzed. First, the global rank was considered to estimate the efficiency of the method to prioritize the known reference covalent compound with respect to several hundred compounds. The second ranking scheme consists of making a warhead-specific ranking for each class to highlight the ability of the approach to retrieve the expected linker when a suitable warhead class is known.
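The two ranking schemes can be illustrated with a short Python sketch. It assumes that each adduct receives a single energy-like score for which lower is better; the compound names, scores and warhead labels below are invented toy data, and the function names are hypothetical.

```python
from typing import Dict, Tuple


def rank_of(name: str, scores: Dict[str, float]) -> int:
    """1-based rank of `name` when compounds are sorted by increasing score."""
    ordered = sorted(scores, key=lambda n: scores[n])
    return ordered.index(name) + 1


def warhead_rank(name: str, scores: Dict[str, float],
                 warheads: Dict[str, str]) -> Tuple[int, int]:
    """Rank of `name` among compounds sharing its warhead class, and class size."""
    cls = warheads[name]
    subset = {n: s for n, s in scores.items() if warheads[n] == cls}
    return rank_of(name, subset), len(subset)


def top_percent(rank: int, n: int) -> float:
    """Rank expressed as a percentage of the library size, one decimal."""
    return round(100.0 * rank / n, 1)


# Toy data (assumption: lower score = better pose energy).
scores = {"ref": -41.0, "a": -45.0, "b": -39.5, "c": -43.2, "d": -30.1}
warheads = {"ref": "acrylamide", "a": "nitrile", "b": "acrylamide",
            "c": "nitrile", "d": "acrylamide"}

global_rank = rank_of("ref", scores)
class_rank, class_size = warhead_rank("ref", scores, warheads)
print(global_rank, top_percent(global_rank, len(scores)), class_rank, class_size)
```

In this toy library the reference compound is third overall but first within its warhead class, which is the kind of difference the warhead-specific ranking is meant to expose. The same percentage arithmetic gives 13.8% for a rank of 37 among 269 compounds.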

There are strong requirements to define the retrospective datasets of interest since the goal is to transform a known noncovalent inhibitor into a covalent inhibitor using known structural data. First, two similar ligands (one noncovalent compound and one covalent inhibitor) are needed. In addition, the reactive group must be present at one extremity of the covalent compound to mimic a putative covalent H2L project starting from a reference fragment. Finally, 3D structures of the target in complex with these reference compounds that do not exhibit large conformational changes within the binding site are required. In the end, three test cases, EGFR, extracellular signal-regulated kinase 2 (ERK2) and prolyl endopeptidase (PREP), were selected to challenge our covalent H2L protocol.

EGFR Study Case

EGFR is a transmembrane receptor tyrosine kinase that plays a central role in the regulation of cellular growth. Dysregulation of EGFR, such as constitutive activity due to mutations, can induce different cancer types 41. Many drugs targeting this protein and various EGFR mutants are currently either on the market or in clinical trials 42.

Gajiwala et al. published a study with X-ray structures of both covalent (dacomitinib/Vizimpro, cov1/adduct1, PDB 4I24) and noncovalent (gefitinib/Iressa, lig1, PDB 4I22) inhibitors in complex with EGFR 43 (Table 2). Both gefitinib and dacomitinib are FDA-approved drugs for the treatment of metastatic non-small cell lung cancer. The targeted nucleophilic residue is a cysteine, and the reference warhead is a nonterminal acrylamide moiety that involves the introduction of one stereocenter in the covalent adduct once the covalent bond with the cysteine has been formed. The correct configuration is “S” according to the 4I24 PDB structure. It should be noted that there is no electron density around the terminal piperidine ring from adduct1 (only the nitrogen atom from the ring is present in the PDB model).

As illustrated in Table 2, the reference inhibitor cov1 (dacomitinib) and close analogs (analog1a, analog1b) were generated starting from the frag1 substructure, the collection of BBs and the common chemical reactions (see Material and Methods). The same set of common organic reactions was used for all test cases. Of note, analog1b is a substructure of the FDA-approved drug afatinib, with the terminal tetrahydrofuranyl moiety replaced by a methyl group. Two 3D substructures (substruct1 and backbone from cysteine), shared by all compounds, were employed for the MCS search, and matching atoms were constrained to their original location from PDB 4I22 during the conformational sampling stage of the VS. The expected covalent adduct (adduct1) between the ligand and the targeted cysteine residue was also formed using the set of cysteine-related covalent reactions from Figures 1.B and S1.


Name               2D Structure   PDB    Warhead Rank (a)   Global Rank (c)
                                         (N=19) (b)         (N=269) (b)
lig1 gefitinib     [image]        4I22   -                  -
adduct1            [image]        4I24   -                  -
frag1              [image]        -      -                  -
substruct1         [image]        -      -                  -
cov1 dacomitinib   [image]        -      S: 5, R: 2         S: 37 (13.8%), R: 23 (8.6%)
analog1a           [image]        -      S: 3, R: 1         S: 30 (11.2%), R: 22 (8.2%)
analog1b           [image]        -      S: 8, R: 10        S: 43 (16.0%), R: 66 (24.5%)

(a) The warhead-specific rank is obtained by considering only acrylamide warheads.
(b) N refers to the total number of compounds in the given dataset.
(c) The global rank is obtained by considering all warheads.

Table 2: Description of 2D structures for the EGFR study case. It includes noncovalent and covalent reference compounds (with their respective PDB codes), the original fragment for virtual synthesis, the substructure considered during the constrained conformational sampling stage and close analogs of the reference covalent compound. Moieties corresponding to the BBs are highlighted in green. The warhead-specific and global ranks obtained in the virtual screening are given, and the reference covalent compound is highlighted (gray background).

A total of 269 compounds, including cov1 and two close analogs, were finally screened. The binding mode of the covalent adduct was accurately reproduced, as highlighted in Figure 3.

Figure 3: A. Overlay of the best pose from the covalent adduct of analog1b in the experimental electron density from 4I24 (pink). It should be noted that there is no electron density around the terminal piperidine moiety of the reference compound. B. Superposition of the best pose from the covalent adduct of analog1b (white) over the PDB model 4I24 (adduct1, yellow). Additionally, an RMSD value of 0.5 Å was measured between the covalent adduct of cov1 and the experimental model from 4I24 for the atoms present in the electron density.


Surprisingly, the alternative R configuration was better ranked for both cov1 (rank 23, top 8.6%) and analog1a (rank 22, top 8.2%). Indeed, the R configuration allowed the piperidine or pyrrolidine rings to be placed deeper in the binding site with an alternative rotamer for the cysteine side chain and overall reasonable geometry. The S stereoisomer was still fairly highly ranked for both cov1 (rank 37, top 13.8%) and analog1a (rank 30, top 11.2%). In contrast, the correct S stereoisomer of analog1b had a better rank (rank 43, top 16.0%) than the R stereoisomer (rank 66, top 24.5%). This behavior seems to be correlated with the hindrance of the tertiary amino group. Unexpected enantiomers from compounds with a larger hydrophobic side chain (piperidine in cov1 and pyrrolidine in analog1a) adopt a conformation with their hydrophobic moiety deeply bound in the cavity, but at the cost of a rearrangement of the Cys797 side chain. In contrast, the smaller dimethylamine from analog1b establishes fewer pairwise interactions in this area, allowing the proper stereoisomer to be prioritized. If Cys797 is constrained, enantiomer S is correctly predicted for all compounds. The top 30 compounds and their 2D structures are available in the supporting information (Figure S3).

More importantly, when considering only the acrylamide warhead-specific ranking, the reference compounds were ranked among the first places: cov1 was ranked 2 and 5, respectively, for the R and S stereoisomers. Similarly, the closest analog (analog1a) achieved ranks 1 and 3 for the R and S stereoisomers, respectively. Although there are only 19 acrylamide-containing linkers in the screened database, this warhead is well represented among covalent drugs currently on the market. When only the correct S stereoisomer was considered, the reference compound cov1 and its highly similar analog (analog1a) were ranked in the top 14% and 11%, respectively. Although these ranks fell slightly outside the typical top 10%, the approach was still able to enrich the top hit list with the reference covalent inhibitor and close analogs while accurately predicting the binding mode.
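The two rankings discussed above (global, and restricted to one warhead class) can be illustrated with a minimal Python sketch. It assumes that each adduct carries a single energy-like score where lower is better; the function names, compound names and scores are hypothetical and are not taken from the CovaDOTS implementation.

```python
from collections import defaultdict


def top_percent(rank, total):
    """Express a rank as a 'top x%' value, rounded to one decimal."""
    return round(100.0 * rank / total, 1)


def rank_adducts(scores):
    """Return {name: (global_rank, top_percent, warhead_rank)}.

    `scores` maps a compound name to (warhead_class, energy).
    Assumption: a lower energy means a better adduct.
    """
    ordered = sorted(scores, key=lambda name: scores[name][1])
    total = len(ordered)
    seen_per_warhead = defaultdict(int)  # running rank inside each class
    result = {}
    for global_rank, name in enumerate(ordered, start=1):
        warhead = scores[name][0]
        seen_per_warhead[warhead] += 1
        result[name] = (
            global_rank,
            top_percent(global_rank, total),
            seen_per_warhead[warhead],
        )
    return result
```

With this convention, a global rank of 37 among 269 compounds corresponds to the top 13.8%, which is how the percentages quoted in the text relate to the ranks.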

ERK2 Study Case


Mitogen-activated protein kinase 1 (MAPK1), also known as ERK2, is a protein kinase that is specific to serine and threonine residues. It regulates numerous cell functions, including proliferation, gene expression, differentiation, mitosis, cell survival and apoptosis. Mutations affecting the expression or activity of ERK2 can result in the development of cancer, and this protein is targeted by many drugs 44, 45. Blake et al. discovered noncovalent tetrahydro-pyrido-pyrimidine ERK2 inhibitors with low nanomolar IC50 values 46. One year later, Ward et al. reported covalent ERK2 inhibitors with the same shared scaffold 33. The associated PDB structures are 4O6E and 4ZZM, respectively, for the noncovalent (lig2) and covalent inhibitors (cov2/adduct2). The targeted nucleophile residue is a cysteine, and the covalent warhead is a vinylsulfonamide group. As shown in Table 3, the reference inhibitor cov2 was correctly generated using frag2 as the reference fragment, in addition to the collection of BBs and the relevant reactions. Two 3D substructures (substruct2 and the backbone from cysteine) were employed for the MCS search, and matching atoms were constrained to their original locations from PDB 4O6E during the conformational stage of the VS. The covalent adduct (adduct2) between the ligand and the cysteine residue was also generated using the same set of covalent reactions (Figures 1.B and S1).


| Name | PDB | 2D Structure | Warhead Rank^a (N=3)^b | Global Rank^c (N=407)^b |
|---|---|---|---|---|
| lig2 | 4O6E | (structure image) | - | - |
| adduct2 | 4ZZM | (structure image) | - | - |
| frag2 | - | (structure image) | - | - |
| substruct2 | - | (structure image) | - | - |
| cov2 | - | (structure image) | 1 | 18 (4.4%) |

^a The warhead-specific rank is obtained by considering only vinylsulfonamide warheads.
^b N refers to the total number of compounds in the given dataset.
^c The global rank is obtained by considering all warheads.

Table 3: Description of structures considered for the ERK2 study case. It includes noncovalent and covalent reference compounds (with their respective PDB codes), the original fragment for virtual synthesis and the substructure considered during the constrained conformational sampling stage. Moieties corresponding to the BBs are highlighted in green. The warhead-specific and global ranks obtained in the virtual screening are given, and the reference covalent compound is highlighted (gray background).

The binding mode of adduct2 is suitably reproduced as depicted in Figure 4. While the thioether moiety from the flexible cysteine side chain is well predicted, the linear alkyl chain is not accurately modeled due to the slight rearrangement of the tetrahydro-pyrido-pyrimidine cycle between noncovalent and covalent ligands. However, despite nonnegligible RMSD variations (approximately 1 Å) between the binding mode of the noncovalent and covalent ligands, the method was still able to address these modifications.
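The RMSD values quoted in this section compare a predicted pose with the crystallographic model. A minimal sketch of such a measurement is given below, under the assumption that both poses are expressed in the same protein frame and list their atoms in the same order, so that no superposition or atom matching is needed; the function name and the coordinate format are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same unit as the input, e.g. Å).

    Both arguments are lists of (x, y, z) tuples with identical atom order.
    Assumption: the poses already share a reference frame (no fitting).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Restricting both lists to the atoms that are visible in the electron density gives the kind of partial RMSD reported for the EGFR case.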

Figure 4: A. Overlay of the best pose of adduct2 in the experimental electron density from 4ZZM (pink). B. Superposition of the best pose from adduct2 (white) over the PDB model 4ZZM (yellow). An RMSD value of 0.5 Å was measured between these two conformations.

A total of 407 compounds, including cov2, were screened under restraints in the binding site of ERK2 (4O6E). The reference lead compound cov2 achieved a good overall rank (rank 18, top 4.4%) using the energetic criterion. The top 30 compounds and their 2D structures are available in the supporting information (Figure S4). Remarkably, cov2 was ranked first in the vinylsulfonamide warhead-specific ranking. However, only three compounds bearing the vinylsulfonyl reactive function were present in the screened database. In conclusion, in the ERK2 study case, the covalent H2L strategy was able to automatically design the reference compounds cov2 and adduct2, to rank them well in the VS with both ranking criteria and to predict an acceptable binding pose.


PREP Study Case

Prolyl endopeptidase (PREP) is a serine peptidase that cleaves peptide bonds at the C-terminal side of proline residues. It is involved in the maturation and degradation of peptide hormones and neuropeptides. Alterations in PREP enzyme activity have been measured in several diseases, including Parkinson's disease 47, 48. Van der Veken et al. described a new series of noncovalent nM inhibitors of PREP 49, while Kaszuba et al. reported a study including molecular dynamics and crystallography about similar compounds but with a covalent mechanism of action 50. The associated PDB structures for the noncovalent (lig3) and covalent inhibitors (cov3/adduct3) are 4BCD and 4AN0, respectively. In this PREP test case, the targeted nucleophile residue is a serine, and the covalent warhead is a nitrile group. In contrast to previous cases, the goal here is to transform the shared substructure (lig3sub) between the two reference compounds (lig3 and cov3) into the known covalent inhibitor cov3. The compound frag3 is used as an activated fragment during the library design stage with a different set of rules because the targeted residue is a serine: nitrile (carboximidate adduct) and aldehyde (hemiacetal adduct) warheads are considered (Figure 1.C and S2). Two 3D substructures (substruct3 and the backbone from serine) were employed for the MCS search, and matching atoms were constrained to their original location from PDB 4BCD during the conformational stage of the VS. As shown in Table 4, the reference covalent inhibitor cov3 was successfully built starting from the frag3 compound, the collection of BBs and the considered in silico encoded reactions. One close analog (analog3), with the pyrrolidine cycle replaced by a piperidine, was also automatically generated. The covalent adduct (adduct3) between the reference ligand and the residue of interest was automatically created using the serine-focused set of covalent reactions.

| Name | PDB | 2D Structure | Warhead Rank^a (N=199)^b | Global Rank^c (N=303)^b |
|---|---|---|---|---|
| lig3 | 4BCD | (structure image) | - | - |
| lig3sub | 4BCD | (structure image) | - | - |
| adduct3 | 4AN0 | (structure image) | - | - |
| frag3 | - | (structure image) | - | - |
| substruct3 | - | (structure image) | - | - |
| cov3 | - | (structure image) | 1 | 1 (0.3%) |
| analog3 | - | (structure image) | 2 | 2 (0.7%) |

^a The warhead-specific rank is obtained by considering only nitrile warheads.
^b N refers to the total number of compounds in the given dataset.
^c The global rank is obtained by considering all warheads.

Table 4: Description of structures considered for the PREP study case. It includes noncovalent and covalent reference compounds (with their respective PDB codes), the original fragment for virtual synthesis, the substructure considered during the constrained conformational sampling stage, and close analogs of the reference covalent compound. Moieties corresponding to the BBs are highlighted in green. The warhead-specific and global ranks obtained in the virtual screening are given, and the reference covalent compound is highlighted (gray background).

Adduct3 was correctly positioned in the binding site (Figure 5); however, the carboximidate adduct was slightly displaced with respect to the electron density from 4AN0. It should be noted that there are some concerns regarding the geometry of this carboximidate group in the model from the 4AN0 PDB structure, where the CCN angle is close to 180° (167°) instead of its expected value of approximately 125° [PDB 5YP2].
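The CCN angle mentioned above is a three-atom measurement that can be checked directly from deposited coordinates. The following minimal sketch shows the computation; the function name is illustrative, and any coordinates passed to it would have to be read from the PDB entry by the user.

```python
import math


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom.

    Each argument is an (x, y, z) tuple in any consistent length unit.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(q * q for q in v2))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

For the carboximidate group, a would be the carbon bound to the former nitrile carbon, b the former nitrile carbon itself, and c the nitrogen.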

Figure 5: A. Overlay of the best pose from adduct3 in the experimental electron density from 4AN0 (pink). B. Superposition of the best pose from adduct3 (white) over the PDB model 4AN0 (yellow). An RMSD value of 0.7 Å was measured between these two conformations.

A total of 303 molecules, including both cov3 and analog3, were screened under restraints. The reference covalent inhibitor cov3 achieved a perfect rank (rank 1, top 0.3%) using the energetic criterion, while the similar analog3 achieved second place (rank 2, top 0.7%). The top 30 compounds and their 2D structures are available in the supporting information (Figure S5). Not surprisingly, high ranks were also obtained for cov3 and analog3 using the nitrile warhead-specific ranking. In contrast to the EGFR and ERK2 test cases, the majority of putative covalent compounds that were generated included the reference nitrile warhead. In conclusion, in the PREP study case, the covalent H2L strategy was able to automatically generate cov3 and adduct3 compounds. The best predicted pose fit most of the experimental electron density from the targeted compound and was later ranked in the first position by both ranking criteria.

Distribution of Warheads


The distribution of warheads considered in the final virtual libraries was computed for each study case to identify any putative bias (Table 5). The number of each warhead in the screened library depends on the commercially available BBs. In our protocol, the same generic database of BBs, regularly updated using a fully automated workflow, is used for all growing and linking projects. For the specific case of CovaDOTS, only BBs that possess both a warhead and an activated function compatible with the selected chemical reaction(s) are considered. This drastically reduces the number of available BBs. The distribution analysis clearly shows that thiol, nitrile and aldehyde functional groups are more frequent in commercial BBs that were suitable for the first growing-like stage relying on common medicinal chemistry reactions (Figure 2.1). However, it is important to point out that these warheads are not prioritized in drug discovery projects. In practice, BBs containing other warheads would therefore be selected in priority. Interestingly, the reference warheads represent a minority in both the EGFR and ERK2 test cases, whereas the reference warhead corresponds to the major species for the PREP target. Despite these great frequency differences, the approach was able to enrich the top hit list with the expected warhead class for all test cases.
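A distribution of this kind is a simple frequency count over the warhead label of each library member. The sketch below assumes that every adduct has already been annotated with one warhead class; the labels used in any call are the user's own, and nothing here reproduces the actual annotation step of the protocol.

```python
from collections import Counter


def warhead_distribution(warheads):
    """Return {warhead_class: percentage of the library}, one decimal.

    `warheads` is an iterable with one class label per library compound.
    """
    counts = Counter(warheads)
    total = sum(counts.values())
    if total == 0:
        return {}  # empty library: nothing to report
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}
```

Comparing the share of a warhead class in the whole library with its share in the top hit list is one way to quantify the enrichment described above.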


Study Case EGFR (Cys) ERK2 (Cys) PREP (Ser)


Acrylamide + Epoxide Michael Acceptor

Alkyne

Thiol

Maleimide

Vinylsulfonyl

Nitrile

Aldehyde

2.5%

20.4%

0.7%

0.7%

6.8%

5.7%

63.2%

-

2.2%

22.8%

3.4%

0.7%

4.9%

4.4%

61.5%

-

-

-

-

-

-

-

66.9%

33.1%

Table 5: Distribution of warheads in the virtual libraries designed for the three test cases. The reference warhead of the known covalent inhibitor is highlighted in bold for each case.

General Discussion

Three criteria were monitored to validate the ability of the approach to design covalent inhibitors: i) the ability to retrospectively design reference covalent compounds starting from a noncovalent substructure, ii) the ability to reproduce the experimental binding mode of a covalent adduct and iii) the ability to rank reference covalent ligands highly in VS among hundreds of other putative adducts. For all three test cases, the protocol was able to automatically generate the reference covalent inhibitor and its related covalent adduct using a library of BBs and the two sets of encoded chemical reactions (common and covalent, see Figures S1 and S2 for the latter set). Interestingly, very close analogs were also generated for the EGFR and PREP reference ligands. Concerning the ability to reproduce the expected binding mode of covalent adducts, the simulations converged towards the correct solution with high accuracy for EGFR and reasonably good accuracy for the ERK2 and PREP cases. While the flexible residue side chain adopted the expected conformation, the flexible linker only partially fitted the electron density for the ERK2 adduct. A similar output was obtained around the newly formed covalent bond for the PREP study case. Finally, moderate to excellent results were obtained for the ranking after VS: moderate ranks were achieved for the EGFR study case (approximately the top 10-15%) and excellent ranks for both ERK2 and PREP targets, within the top 5% and the highest rank, respectively. More importantly, when considering only the same warhead as in the targeted covalent compound, the reference compound or a very close analog was ranked among the top solutions in all three test cases presented above (Tables 2, 3 and 4).


It is important to point out that some warheads, such as thiol, nitrile and aldehyde, are not prioritized during the drug discovery process. They were considered here to increase the size of the covalent chemical libraries. Therefore, a sound strategy for any prospective project would select compounds from the top hit lists of different FDA-approved warhead classes. Additional criteria could be used for the selection of the best covalent candidates in prospective projects. For example, docking the reactive, warhead-bearing compounds whose covalent adducts were ranked in the top hit list of the VS could be used to guide the selection. These reactive species were also automatically generated during the chemical library preparation. A modified binding site in which the targeted residue has been mutated into glycine or alanine is required for this docking stage, as described for the SCAR method 51. Indeed, the original binding site would prevent binding of the reactive ligands in a noncovalent docking context for steric reasons. In the end, good candidates would adopt the expected binding mode without any constraint, while having their warhead located in the mutated area. This additional check has been performed with success on the three test cases. Finally, rigid linkers that naturally pre-orient the warhead toward the nucleophilic residue are of high interest to facilitate the creation of the covalent bond.

In conclusion, the results from three independent retrospective studies, which involved various covalent warheads and different nucleophilic residues, show valuable perspectives for the computer-based chemistry-driven design of covalent ligands using this modified DOTS methodology.

Despite these interesting features, the approach has several limitations, as does any in silico covalent method. For example, the same set of covalent reactions with their associated warheads was considered for a given nucleophilic residue to increase the size of the generated library to be screened and to show the ability of the approach to deal with varied chemistry from the covalent inhibitor literature. However, warheads possess different reactivity, and this method is not able to choose the most suitable one from a series of different warheads because it neglects the energy contribution from the formation of covalent bonds. This drawback is, however, shared by all in silico approaches, although some recent efforts have been devoted to the prediction of the intrinsic reactivity of the warheads using quantum mechanics modeling or experimental approaches 52. Selecting compounds from the top hit lists for different warhead classes would maximize the chance of identifying compounds with the proper reactivity range. In addition, the keystone of the approach is the FBDD paradigm, where the conservation of binding mode during the H2L process is achieved. This conservation is modeled through the inclusion of constraints on both ends (reference substructure and backbone of targeted residue) of each covalent adduct. Thus, this approach could fail if the core of the compound requires nonnegligible adjustments. However, the ERK2 study case highlighted that the methodology was able to address slight rearrangements of the reference substructure. Finally, this procedure requires that the previous (n-1) and next (n+1) protein residues around the targeted one be fixed during the simulation; otherwise, the minimization stages would either eject the adduct compound from its original location because of violent clashes with the binding site or result in large energy penalties. However, enabling partial flexibility of the binding site for other residues is theoretically possible, as shown in previous work 36.
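The suggestion to pick candidates from the top hit list of each warhead class, rather than from a single global list, can be sketched as follows. This is only an illustration under the assumption that every hit carries a warhead label and an energy-like score where lower is better; the function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def top_per_warhead(hits, n):
    """Keep the n best-scored compounds of every warhead class.

    `hits` is a list of (name, warhead_class, energy) tuples.
    Assumption: a lower energy means a better adduct.
    """
    selected = defaultdict(list)
    for name, warhead, _energy in sorted(hits, key=lambda hit: hit[2]):
        if len(selected[warhead]) < n:
            selected[warhead].append(name)
    return dict(selected)
```

Because the method neglects the energy of covalent bond formation, such a per-class selection avoids comparing scores across warheads of different intrinsic reactivity.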

Conclusions

We report an original approach for generic covalent hit-to-lead optimization that relies on both chemistry knowledge and structure-based molecular modeling. This new tool is an upgrade of the previously reported DOTS strategy for in silico growing-based optimization 4. New features have been added to the DOTS workflow to perform generic linking-based optimization. The covalent strategy highlighted in this work is a specific case of linking where the second reference substructure is systematically a nucleophile residue. The covalent approach has been validated using three retrospective studies. In each case, ligands containing the expected warhead were correctly generated in the first step, and the related covalent adducts were built in the second step. In all retrospective cases, the reference adducts were ranked relatively high among the set of generated compounds and even first for one test case. In addition, the expected linker was prioritized effectively given the warhead class in all test cases. Finally, the binding mode of covalent adducts was reproduced with reasonable geometry considering available electron density maps. Although the strategy has been validated only with cysteine and serine as the targeted residues, due to available data, it could easily be extended beyond these residues to target other selective residues such as histidine, lysine or methionine 5, 34. Similarly, the number of alternative warheads with different reactivity profiles and physicochemical properties is in constant progression 53-55. These new warheads can be introduced into the CovaDOTS protocol as long as they are represented in commercial libraries. The choice of warhead should be guided by the nature and reactivity of the targeted residue and the type of project. Typically, in a real case scenario, different warheads with various profiles would be selected for a given target. Warheads such as thiol, nitrile and aldehyde would most probably not be considered in the development of covalent drugs; however, there are many examples of chemical probes that use these warheads. This novel in silico chemistry-driven approach, implemented in the DOTS pipeline, is well suited for highly automated design and prioritization of covalent inhibitors starting from a hit or noncovalent ligand with reasonable physicochemical properties and good synthetic accessibility.


Material & Methods

Preparation of Building Block Database

The preparation of the generic database of functionalized BBs relies on an automated pipeline described in a previous work 4. Briefly, the collection of BBs is regularly updated automatically from the MolPort supplier (http://www.molport.com) and is postprocessed using tools from ChemAxon (http://www.chemaxon.com). The main steps consist in standardizing compounds, removing salts and duplicates, validating them using a structure checker and generating unambiguous stereoisomers and tautomers. The generic collection of BBs used for this work contained about 220K reactive compounds.

Building of Focused Covalent Library

The creation of the focused covalent library relies on an activated substructure (for instance, a fragment containing one reactive function), the generic collection of functionalized BBs and two sets of in silico encoded chemical reactions (one for each stage of the covalent library design pipeline, Figure 2). The first set contains common medicinal chemistry relevant reactions described previously by Hartenfeller and colleagues 38 to generate compounds that should be accessible during the wet-lab organic synthesis stage. The reactions considered for all study cases are rules #30, #47, #48 and #51 (Figure S6) according to the previously reported study 38. The reductive amination (#30) and Buchwald-Hartwig (#51) reactions belong to the amination class of reactions and use an aldehyde and an aryl halide as the second reactant, respectively. Rule #47 corresponds to the formation of an amide function starting from carboxylic acid and amine reactants. Similarly, rule #48 creates a sulfonamide function from amine and sulfonyl chloride reactants. An additional reaction, where the product is an ester group, was implemented in house (Figure S6). The first stage of the library design relies on a previously described pipeline 4.
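The way the reactive function of the starting fragment determines which rules are eligible, and which function the BB must then carry, can be sketched with a small lookup table. This is a simplified illustration, not the encoded reaction set itself: the rule labels follow the text, each rule is reduced to a pair of functional group names, and the alcohol partner of the in-house ester rule is an assumption (the text only states that the product is an ester).

```python
# Pairs of reactant functions per rule, as described in the text.
RULES = {
    "#30 reductive amination": ("amine", "aldehyde"),
    "#47 amide formation": ("amine", "carboxylic acid"),
    "#48 sulfonamide formation": ("amine", "sulfonyl chloride"),
    "#51 Buchwald-Hartwig": ("amine", "aryl halide"),
    # Assumption: the second reactant of the in-house ester rule is an alcohol.
    "ester formation (in house)": ("carboxylic acid", "alcohol"),
}


def partner_functions(fragment_function):
    """Map each eligible rule to the function required on the BB."""
    partners = {}
    for rule, (first, second) in RULES.items():
        if fragment_function == first:
            partners[rule] = second
        elif fragment_function == second:
            partners[rule] = first
    return partners
```

For an amine-bearing fragment this returns the four rules #30, #47, #48 and #51, and for a carboxylic acid it returns #47 and the ester rule, in line with the study cases described in this section.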

EGFR case study

Eligible reactions from the first set of reactions are #30, #47, #48 and #51 for this study case because frag1 contains an amine group (Figure S6). A collection of about 130K compounds was generated after the first stage of the library design. The raw covalent library contained about 3000 adduct compounds. The post-processing stage of the library is described in detail in a paragraph below. For the EGFR case, the physicochemical parameters used to filter the generated adducts were “max MW=600 Da”, “max logP=5”, “max TPSA=150”, “max rotatable bond count=14” and “number of rings = 3 or 4”. The final library used for VS contained 269 adduct compounds. Finally, the reference covalent inhibitor (cov1) from the EGFR case was automatically generated using the amide rule (#47) by coupling the amino group from frag1 with the carboxylic acid function from a compound (“4-(piperidinyl)but-2-enoic acid”) present in the commercial database of BBs (Figure S7).

ERK2 case study

The same reactions as for the EGFR study case were used here because frag2 also contains an amine reactive function. Collections of about 130K and 3K compounds were generated after the first and second stages of the library design, respectively. The physicochemical parameters used to filter the generated adducts were “max MW=550 Da”, “max logP=5”, “max TPSA=150”, “max rotatable bond count=12” and “number of rings = 3 or 4”. The final library used for VS contained 407 adduct compounds. Finally, the known ERK2 covalent inhibitor (cov2) was designed by merging the secondary aliphatic amine from frag2 with the “ethenesulfonyl chloride” BB using rule #48 (Figure S7).

PREP case study

This study case relies on reactions #47 and “ester” because frag3 contains a carboxylic acid function (Figure S6). A collection of about 98K compounds was generated after the first stage of the library design. The raw covalent library contained about 5800 adduct compounds. The physicochemical parameters used to filter the generated adducts were “max MW=550 Da”, “max logP=5”, “max TPSA=170”, “max rotatable bond count=12” and “number of rings = 2 or 3”. The final library used for VS contained 303 adduct compounds. Finally, the reference covalent inhibitor (cov3) from the PREP target was created by coupling the carboxylic acid function from frag3 with the “pyrrolidine-2-carbonitrile” BB using rule #47 (Figure S7).
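The physicochemical filters quoted for the three study cases can be summarized as a small threshold table. The sketch below assumes that the descriptors (MW, logP, TPSA, rotatable bond count, ring count) have already been computed on each adduct by a cheminformatics toolkit; the class, dictionary and function names are hypothetical, and only the numerical limits come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    max_mw: float          # Da
    max_logp: float
    max_tpsa: float
    max_rot_bonds: int
    ring_counts: tuple     # allowed numbers of rings

# Thresholds as quoted for each study case.
LIMITS = {
    "EGFR": Limits(600.0, 5.0, 150.0, 14, (3, 4)),
    "ERK2": Limits(550.0, 5.0, 150.0, 12, (3, 4)),
    "PREP": Limits(550.0, 5.0, 170.0, 12, (2, 3)),
}


def passes(case, mw, logp, tpsa, rot_bonds, rings):
    """True if an adduct with these precomputed descriptors is kept."""
    lim = LIMITS[case]
    return (
        mw <= lim.max_mw
        and logp <= lim.max_logp
        and tpsa <= lim.max_tpsa
        and rot_bonds <= lim.max_rot_bonds
        and rings in lim.ring_counts
    )
```

Keeping the limits in one table makes the differences between cases explicit, for example the higher TPSA ceiling and lower ring counts used for PREP.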

Postprocessing

As previously reported for the DOTS workflow, several fully automated postprocessing stages were applied to the raw covalent focused library to extract a diverse, duplicate-free subset of representative compounds, with a focus on structures with reasonable physicochemical properties. Looser thresholds were used in this study because physicochemical properties were automatically computed on the adducts, which include the protein residue. A maximum of one additional ring was permitted compared to the initial fragment. The main difference from the DOTS protocol is that CovaDOTS requires two substructures for the MCS search, to add constraints on both ends of each compound and to define their constrained initial locations. An alanine-like substructure (with a dummy atom instead of the methyl group), prepositioned in the binding site, was used to flag backbone atoms unambiguously during the MCS search. Thus, the full side chain of the residue involved in the covalent bond is unlocked during the subsequent conformational sampling stage with S4MPLE.

Preparation of Binding Sites

MOE version 2016 (Chemical Computing Group Inc., Montreal, QC, Canada) was used to prepare the binding sites, starting from the reference PDB files listed in Tables 2, 3 and 4. All residues with at least one atom within 10 Å of the reference inhibitor were selected to define the binding site. A large binding site was defined because S4MPLE relies on a force field (FF)-based energy function.
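The residue selection rule can be illustrated outside of MOE. The sketch below is not the MOE procedure used by the authors; it is a plain-Python illustration that assumes “within 10 Å of the reference inhibitor” means within 10 Å of any inhibitor atom. The function name `select_binding_site` and the tuple layout of the atoms are hypothetical.

```python
import math
from typing import Iterable, List, Set, Tuple

# A protein atom is (residue_id, x, y, z); residue_id is any hashable label.
Atom = Tuple[str, float, float, float]
Point = Tuple[float, float, float]


def select_binding_site(protein_atoms: Iterable[Atom],
                        ligand_coords: List[Point],
                        cutoff: float = 10.0) -> Set[str]:
    """Return residues with at least one atom within `cutoff` angstroms
    of any atom of the reference inhibitor."""
    selected: Set[str] = set()
    for res_id, x, y, z in protein_atoms:
        if res_id in selected:
            continue  # residue already kept; skip its remaining atoms
        for lx, ly, lz in ligand_coords:
            if math.dist((x, y, z), (lx, ly, lz)) <= cutoff:
                selected.add(res_id)
                break
    return selected
```

A whole residue is kept as soon as one of its atoms falls inside the cutoff, which is why the selection is made per residue rather than per atom.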

VS Protocol and ranking
S4MPLE was used to screen the compounds within the binding site, with constraints applied during the conformational sampling stage to the reference substructure and to the backbone of the residue involved in the covalent bond (36, 37). These constraints are used because the goal is to transform an initial hit into a covalent inhibitor while maintaining its original binding mode. S4MPLE is based on a hybrid genetic algorithm. The sampling stage mainly consisted of three independent simulations of 150 generations with a population of 30 individuals. All saved poses were merged into one file before switching to the postprocessing stage, which involved a minimization of all nonredundant poses while unlocking all ligand atoms except those of the backbone of the residue involved in the covalent bond.

The energy used to rank all screened compounds is equal to the best energy of the complex minus the energy of the best conformer of the free ligand. The latter is obtained by performing three independent simulations with the same parameters on the ligand alone. The goal is to compute a potential energy difference between the bound and free forms to estimate the interaction energy while accounting for the strain energy of the ligand. In contrast to common growing or linking strategies, the ligand considered is not exactly the same in the two sampling stages in the covalent mode. Indeed, the covalent adduct could be freely sampled to identify the lowest energy conformer, as described previously with DOTS. However, this approach could lead to inappropriate minima, especially if flexible residue side chains such as lysine are considered. Conversely, sampling the reactive ligand bearing the warhead would lead to wrong energy differences because the properties of the atoms around the reactive center could change: for example, an epoxide ring is opened after a covalent bond is formed. This modification of the system prevents computing relevant energy differences between the bound covalent adduct and the free reactive ligand. To address these issues, the ligand considered in the conformational sampling stage is the covalent adduct in which the residue has been mutated into hydrogen. Thus, hybridization types and their associated force field atom types are maintained, while the lowest energy conformer represents a reliable geometry whatever the residue side chain considered (it should be noted that there is a major overlap between both ranking schemes). In the end, a top hit list is extracted for further analysis.
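The ranking rule described above (best energy of the complex minus best energy of the free ligand) can be written in a few lines. This is a minimal sketch, not S4MPLE code: it assumes that “best” means lowest potential energy, and that the energies of the poses saved by the three independent runs are available as plain lists. The names `ranking_energy` and `rank_compounds` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def ranking_energy(complex_energies: Sequence[float],
                   free_ligand_energies: Sequence[float]) -> float:
    """Lowest complex energy minus lowest free-ligand energy.

    Each sequence pools the energies from all independent simulations
    (assumption: lower energy is better).
    """
    return min(complex_energies) - min(free_ligand_energies)


def rank_compounds(results: Dict[str, Tuple[Sequence[float], Sequence[float]]]
                   ) -> List[Tuple[str, float]]:
    """Sort compounds from the most to the least favorable energy difference."""
    scored = [(name, ranking_energy(cplx, free))
              for name, (cplx, free) in results.items()]
    return sorted(scored, key=lambda item: item[1])
```

Because the free-ligand term is the energy of the relaxed hydrogen-capped adduct, a compound that must adopt a strained conformation to bind is penalized relative to one that binds close to its free minimum.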

Author information
*Correspondence should be addressed to [email protected] and [email protected]

Funding Sources: Agence Nationale de la Recherche (ANR-15-CE18-0023); Canceropôle PACA, in the context of the Canceropôle pre-maturation call for projects; Fondation ARC pour la Recherche sur le Cancer (PJA20171206125).

ORCID
[email protected], 0000-0003-1906-7128
[email protected], 0000-0002-8726-0053
[email protected], 0000-0003-0173-5714
[email protected], 0000-0003-1886-925X
[email protected], 0000-0001-8101-7901
[email protected], 0000-0002-5580-0588

Acknowledgments
This study was partly supported by research funding from the ANR (grant ANR-15-CE18-0023), Fondation ARC pour la Recherche sur le Cancer (PJA20171206125), Canceropôle PACA, the French National Cancer Institute (INCa), and the Provence-Alpes-Côte d’Azur Region. M.S.-A. was supported by a fellowship from Institut Paoli-Calmettes. We would like to thank Bernard Chetrit from the Datacentre IT and Scientific Computing facility of the CRCM.

Abbreviations used
BB, Building Blocks; DOTS, Diversity-Oriented Target-Focused Synthesis; EGFR, Human Epidermal Growth Factor Receptor; ERK2, Extracellular Signal-Regulated Kinase 2; FDA, Food and Drug Administration; H2L, Hit-to-Lead; HER2, Human Epidermal Growth Factor Receptor 2; PREP, Prolyl-Endopeptidase; FBDD, Fragment-Based Drug Design; VS, Virtual Screening.

Supporting Information
List of covalent reaction schemes for cysteine (Figure S1). List of covalent reaction schemes for serine (Figure S2). Chemical structure of the top 30 compounds for the EGFR test case (Figure S3). Chemical structure of the top 30 compounds for the ERK2 test case (Figure S4). Chemical structure of the top 30 compounds for the PREP test case (Figure S5). SMARTS-encoded reactions used during the first stage of library design (Figure S6). First step of the chemical library design for the three test cases (Figure S7). This information is available free of charge via the Internet at http://pubs.acs.org

References

1. Romasanta, A. K. S.; van der Sijde, P.; Hellsten, I.; Hubbard, R. E.; Keseru, G. M.; van Muijlwijk-Koezen, J.; de Esch, I. J. P., When fragments link: a bibliometric perspective on the development of fragment-based drug discovery. Drug Discov Today 2018, 23, 1596-1609.
2. Bleicher, K. H.; Böhm, H. J.; Müller, K.; Alanine, A. I., Hit and lead generation: beyond high-throughput screening. Nat Rev Drug Discov 2003, 2, 369-78.
3. Hoffer, L.; Muller, C.; Roche, P.; Morelli, X., Chemistry-driven Hit-to-lead Optimization Guided by Structure-based Approaches. Mol Inform 2018, 37, e1800059.
4. Hoffer, L.; Voitovich, Y. V.; Raux, B.; Carrasco, K.; Muller, C.; Fedorov, A. Y.; Derviaux, C.; Amouric, A.; Betzi, S.; Horvath, D.; Varnek, A.; Collette, Y.; Combes, S.; Roche, P.; Morelli, X., Integrated Strategy for Lead Optimization Based on Fragment Growing: The Diversity-Oriented-Target-Focused-Synthesis Approach. J Med Chem 2018, 61, 5719-5732.
5. Mukherjee, H.; Grimster, N. P., Beyond cysteine: recent developments in the area of targeted covalent inhibition. Curr Opin Chem Biol 2018, 44, 30-38.
6. Kharenko, O. A.; Patel, R. G.; Brown, S. D.; Calosing, C.; White, A.; Lakshminarasimhan, D.; Suto, R. K.; Duffy, B. C.; Kitchen, D. B.; McLure, K. G.; Hansen, H. C.; van der Horst, E. H.; Young, P. R., Design and Characterization of Novel Covalent Bromodomain and Extra-Terminal Domain (BET) Inhibitors Targeting a Methionine. J Med Chem 2018, 61, 8202-8211.
7. Lonsdale, R.; Ward, R. A., Structure-based design of targeted covalent inhibitors. Chem Soc Rev 2018, 47, 3816-3830.
8. Singh, J.; Petter, R. C.; Baillie, T. A.; Whitty, A., The resurgence of covalent drugs. Nat Rev Drug Discov 2011, 10, 307-17.
9. Rabindran, S. K.; Discafani, C. M.; Rosfjord, E. C.; Baxter, M.; Floyd, M. B.; Golas, J.; Hallett, W. A.; Johnson, B. D.; Nilakantan, R.; Overbeek, E.; Reich, M. F.; Shen, R.; Shi, X.; Tsou, H. R.; Wang, Y. F.; Wissner, A., Antitumor activity of HKI-272, an orally active, irreversible inhibitor of the HER-2 tyrosine kinase. Cancer Res 2004, 64, 3958-65.
10. Lin, N. U.; Winer, E. P.; Wheatley, D.; Carey, L. A.; Houston, S.; Mendelson, D.; Munster, P.; Frakes, L.; Kelly, S.; Garcia, A. A.; Cleator, S.; Uttenreuther-Fischer, M.; Jones, H.; Wind, S.; Vinisko, R.; Hickish, T., A phase II study of afatinib (BIBW 2992), an irreversible ErbB family blocker, in patients with HER2-positive metastatic breast cancer progressing after trastuzumab. Breast Cancer Res Treat 2012, 133, 1057-65.
11. Engelman, J. A.; Zejnullahu, K.; Gale, C. M.; Lifshits, E.; Gonzales, A. J.; Shimamura, T.; Zhao, F.; Vincent, P. W.; Naumov, G. N.; Bradner, J. E.; Althaus, I. W.; Gandhi, L.; Shapiro, G. I.; Nelson, J. M.; Heymach, J. V.; Meyerson, M.; Wong, K. K.; Jänne, P. A., PF00299804, an irreversible pan-ERBB inhibitor, is effective in lung cancer models with EGFR and ERBB2 mutations that are resistant to gefitinib. Cancer Res 2007, 67, 11924-32.
12. Greig, S. L., Osimertinib: First Global Approval. Drugs 2016, 76, 263-73.

13. Pan, Z.; Scheerens, H.; Li, S. J.; Schultz, B. E.; Sprengeler, P. A.; Burrill, L. C.; Mendonca, R. V.; Sweeney, M. D.; Scott, K. C.; Grothaus, P. G.; Jeffery, D. A.; Spoerke, J. M.; Honigberg, L. A.; Young, P. R.; Dalrymple, S. A.; Palmer, J. T., Discovery of selective irreversible inhibitors for Bruton's tyrosine kinase. ChemMedChem 2007, 2, 58-61.
14. Bandyopadhyay, A.; Gao, J., Targeting biomolecules with reversible covalent chemistry. Curr Opin Chem Biol 2016, 34, 110-116.
15. Jaillet, L.; Artemova, S.; Redon, S., IM-UFF: Extending the universal force field for interactive molecular modeling. J Mol Graph Model 2017, 77, 350-362.
16. Cavasotto, C. N.; Adler, N. S.; Aucar, M. G., Quantum Chemical Approaches in Structure-Based Virtual Screening and Lead Optimization. Front Chem 2018, 6, 188.
17. Sotriffer, C., Docking of Covalent Ligands: Challenges and Approaches. Mol Inform 2018, 37, e1800062.

18. Jones, G.; Willett, P.; Glen, R. C.; Leach, A. R.; Taylor, R., Development and validation of a genetic algorithm for flexible docking. J Mol Biol 1997, 267, 727-48.
19. Rarey, M.; Kramer, B.; Lengauer, T., Multiple automatic base selection: protein-ligand docking based on incremental construction without manual intervention. J Comput Aided Mol Des 1997, 11, 369-84.
20. Abagyan, R.; Totrov, M.; Kuznetsov, D., ICM—A new method for protein modeling and design: Applications to docking and structure prediction from the distorted native conformation. J Comput Chem 1994, 15, 488-506.
21. Bianco, G.; Forli, S.; Goodsell, D. S.; Olson, A. J., Covalent docking using autodock: Two-point attractor and flexible side chain methods. Protein Sci 2016, 25, 295-301.
22. Morris, G. M.; Goodsell, D. S.; Huey, R.; Olson, A. J., Distributed automated docking of flexible ligands to proteins: parallel applications of AutoDock 2.4. J Comput Aided Mol Des 1996, 10, 293-304.
23. London, N.; Miller, R. M.; Krishnan, S.; Uchida, K.; Irwin, J. J.; Eidam, O.; Gibold, L.; Cimermančič, P.; Bonnet, R.; Shoichet, B. K.; Taunton, J., Covalent docking of large libraries for the discovery of chemical probes. Nat Chem Biol 2014, 10, 1066-72.
24. Lawandi, J.; Toumieux, S.; Seyer, V.; Campbell, P.; Thielges, S.; Juillerat-Jeanneret, L.; Moitessier, N., Constrained peptidomimetics reveal detailed geometric requirements of covalent prolyl oligopeptidase inhibitors. J Med Chem 2009, 52, 6672-84.
25. Toledo Warshaviak, D.; Golan, G.; Borrelli, K. W.; Zhu, K.; Kalid, O., Structure-based virtual screening approach for discovery of covalently bound ligands. J Chem Inf Model 2014, 54, 1941-50.
26. Steindl, T.; Laggner, C.; Langer, T., Human rhinovirus 3C protease: generation of pharmacophore models for peptidic and nonpeptidic inhibitors and their application in virtual screening. J Chem Inf Model 2005, 45, 716-24.


27. Wolber, G.; Langer, T., LigandScout: 3-D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J Chem Inf Model 2005, 45, 160-169.
28. Markt, P.; McGoohan, C.; Walker, B.; Kirchmair, J.; Feldmann, C.; De Martino, G.; Spitzer, G.; Distinto, S.; Schuster, D.; Wolber, G.; Laggner, C.; Langer, T., Discovery of novel cathepsin S inhibitors by pharmacophore-based virtual high-throughput screening. J Chem Inf Model 2008, 48, 1693-705.
29. Schulz, R.; Atef, A.; Becker, D.; Gottschalk, F.; Tauber, C.; Wagner, S.; Arkona, C.; Abdel-Hafez, A. A.; Farag, H. H.; Rademann, J.; Wolber, G., Phenylthiomethyl Ketone-Based Fragments Show Selective and Irreversible Inhibition of Enteroviral 3C Proteases. J Med Chem 2018, 61, 1218-1230.
30. Scholz, C.; Knorr, S.; Hamacher, K.; Schmidt, B., DOCKTITE-a highly versatile step-by-step workflow for covalent docking and virtual screening in the molecular operating environment. J Chem Inf Model 2015, 55, 398-406.
31. Scarpino, A.; Ferenczy, G. G.; Keserű, G. M., Comparative Evaluation of Covalent Docking Tools. J Chem Inf Model 2018, 58, 1441-1458.
32. Shannon, D. A.; Weerapana, E., Covalent protein modification: the current landscape of residue-specific electrophiles. Curr Opin Chem Biol 2015, 24, 18-26.
33. Ward, R. A.; Colclough, N.; Challinor, M.; Debreczeni, J. E.; Eckersley, K.; Fairley, G.; Feron, L.; Flemington, V.; Graham, M. A.; Greenwood, R.; Hopcroft, P.; Howard, T. D.; James, M.; Jones, C. D.; Jones, C. R.; Renshaw, J.; Roberts, K.; Snow, L.; Tonge, M.; Yeung, K., Structure-Guided Design of Highly Selective and Potent Covalent Inhibitors of ERK1/2. J Med Chem 2015, 58, 4790-801.
34. Jones, L. H., Reactive Chemical Probes: Beyond the Kinase Cysteinome. Angew Chem Int Ed Engl 2018, 57, 9220-9223.
35. Anscombe, E.; Meschini, E.; Mora-Vidal, R.; Martin, M. P.; Staunton, D.; Geitmann, M.; Danielson, U. H.; Stanley, W. A.; Wang, L. Z.; Reuillon, T.; Golding, B. T.; Cano, C.; Newell, D. R.; Noble, M. E.; Wedge, S. R.; Endicott, J. A.; Griffin, R. J., Identification and Characterization of an Irreversible Inhibitor of CDK2. Chem Biol 2015, 22, 1159-64.

36. Hoffer, L.; Renaud, J. P.; Horvath, D., In silico fragment-based drug discovery: setup and validation of a fragment-to-lead computational protocol using S4MPLE. J Chem Inf Model 2013, 53, 836-51.
37. Hoffer, L.; Chira, C.; Marcou, G.; Varnek, A.; Horvath, D., S4MPLE--Sampler for Multiple Protein-Ligand Entities: Methodology and Rigid-Site Docking Benchmarking. Molecules 2015, 20, 8997-9028.
38. Hartenfeller, M.; Eberle, M.; Meier, P.; Nieto-Oberhuber, C.; Altmann, K. H.; Schneider, G.; Jacoby, E.; Renner, S., A collection of robust organic synthesis reactions for in silico molecule design. J Chem Inf Model 2011, 51, 3093-8.
39. Micale, N.; Scarbaci, K.; Troiano, V.; Ettari, R.; Grasso, S.; Zappalà, M., Peptide-based proteasome inhibitors in anticancer drug design. Med Res Rev 2014, 34, 1001-69.
40. Erlanson, D. A.; Lam, J. W.; Wiesmann, C.; Luong, T. N.; Simmons, R. L.; DeLano, W. L.; Choong, I. C.; Burdett, M. T.; Flanagan, W. M.; Lee, D.; Gordon, E. M.; O'Brien, T., In situ assembly of enzyme inhibitors using extended tethering. Nat Biotechnol 2003, 21, 308-14.
41. da Cunha Santos, G.; Shepherd, F. A.; Tsao, M. S., EGFR mutations and lung cancer. Annu Rev Pathol 2011, 6, 49-69.
42. Carles, F.; Bourg, S.; Meyer, C.; Bonnet, P., PKIDB: A Curated, Annotated and Updated Database of Protein Kinase Inhibitors in Clinical Trials. Molecules 2018, 23.
43. Gajiwala, K. S.; Feng, J.; Ferre, R.; Ryan, K.; Brodsky, O.; Weinrich, S.; Kath, J. C.; Stewart, A., Insights into the aberrant activity of mutant EGFR kinase domain and drug recognition. Structure 2013, 21, 209-19.
44. Mehdizadeh, A.; Somi, M. H.; Darabi, M.; Jabbarpour-Bonyadi, M., Extracellular signal-regulated kinase 1 and 2 in cancer therapy: a focus on hepatocellular carcinoma. Mol Biol Rep 2016, 43, 107-16.
45. Sebolt-Leopold, J. S.; Herrera, R., Targeting the mitogen-activated protein kinase cascade to treat cancer. Nat Rev Cancer 2004, 4, 937-47.


46. Blake, J. F.; Gaudino, J. J.; De Meese, J.; Mohr, P.; Chicarelli, M.; Tian, H.; Garrey, R.; Thomas, A.; Siedem, C. S.; Welch, M. B.; Kolakowski, G.; Kaus, R.; Burkard, M.; Martinson, M.; Chen, H.; Dean, B.; Dudley, D. A.; Gould, S. E.; Pacheco, P.; Shahidi-Latham, S.; Wang, W.; West, K.; Yin, J.; Moffat, J.; Schwarz, J. B., Discovery of 5,6,7,8-tetrahydropyrido[3,4-d]pyrimidine inhibitors of Erk2. Bioorg Med Chem Lett 2014, 24, 2635-9.
47. Mantle, D.; Falkous, G.; Ishiura, S.; Blanchard, P. J.; Perry, E. K., Comparison of proline endopeptidase activity in brain tissue from normal cases and cases with Alzheimer's disease, Lewy body dementia, Parkinson's disease and Huntington's disease. Clin Chim Acta 1996, 249, 129-39.
48. Svarcbahs, R.; Julku, U. H.; Norrbacka, S.; Myöhänen, T. T., Removal of prolyl oligopeptidase reduces alpha-synuclein toxicity in cells and in vivo. Sci Rep 2018, 8, 1552.
49. Van der Veken, P.; Fülöp, V.; Rea, D.; Gerard, M.; Van Elzen, R.; Joossens, J.; Cheng, J. D.; Baekelandt, V.; De Meester, I.; Lambeir, A. M.; Augustyns, K., P2-substituted N-acylprolylpyrrolidine inhibitors of prolyl oligopeptidase: biochemical evaluation, binding mode determination, and assessment in a cellular model of synucleinopathy. J Med Chem 2012, 55, 9856-67.
50. Kaszuba, K.; Róg, T.; Danne, R.; Canning, P.; Fülöp, V.; Juhász, T.; Szeltner, Z.; St Pierre, J. F.; García-Horsman, A.; Männistö, P. T.; Karttunen, M.; Hokkanen, J.; Bunker, A., Molecular dynamics, crystallography and mutagenesis studies on the substrate gating mechanism of prolyl oligopeptidase. Biochimie 2012, 94, 1398-411.
51. Ai, Y.; Yu, L.; Tan, X.; Chai, X.; Liu, S., Discovery of Covalent Ligands via Noncovalent Docking by Dissecting Covalent Docking Based on a "Steric-Clashes Alleviating Receptor (SCAR)" Strategy. J Chem Inf Model 2016, 56, 1563-75.
52. Lonsdale, R.; Burgess, J.; Colclough, N.; Davies, N. L.; Lenz, E. M.; Orton, A. L.; Ward, R. A., Expanding the Armory: Predicting and Tuning Covalent Warhead Reactivity. J Chem Inf Model 2017, 57, 3124-3137.
53. Flanagan, M. E.; Abramite, J. A.; Anderson, D. P.; Aulabaugh, A.; Dahal, U. P.; Gilbert, A. M.; Li, C.; Montgomery, J.; Oppenheimer, S. R.; Ryder, T.; Schuff, B. P.; Uccello, D. P.; Walker, G. S.; Wu, Y.; Brown, M. F.; Chen, J. M.; Hayward, M. M.; Noe, M. C.; Obach, R. S.; Philippe, L.; Shanmugasundaram, V.; Shapiro, M. J.; Starr, J.; Stroh, J.; Che, Y., Chemical and computational methods for the characterization of covalent reactive groups for the prospective design of irreversible inhibitors. J Med Chem 2014, 57, 10072-9.
54. Ábrányi-Balogh, P.; Petri, L.; Imre, T.; Szijj, P.; Scarpino, A.; Hrast, M.; Mitrović, A.; Fonovič, U. P.; Németh, K.; Barreteau, H.; Roper, D. I.; Horváti, K.; Ferenczy, G. G.; Kos, J.; Ilaš, J.; Gobec, S.; Keserű, G. M., A road map for prioritizing warheads for cysteine targeting covalent inhibitors. Eur J Med Chem 2018, 160, 94-107.
55. Gehringer, M.; Laufer, S. A., Emerging and Re-Emerging Warheads for Targeted Covalent Inhibitors: Applications in Medicinal Chemistry and Chemical Biology. J Med Chem 2019.


Table of Contents graphic
[Graphic: CovaDOTS workflow. Recoverable labels: Target (Cysteine, Serine); Reference fragment; Warheads; Chemistry-Driven Library Design; Virtual Screening; Covalent inhibitor.]