
Article

Identification of Covalent Binding Sites Targeting Cysteines based on Computational Approaches
Yanmin Zhang, Danfeng Zhang, Haozhong Tian, Yu Jiao, Zhihao Shi, Ting Ran, Haichun Liu, Shuai Lu, Anyang Xu, Xin Qiao, Jing Pan, Lingfeng Yin, Weineng Zhou, Tao Lu, and Yadong Chen
Mol. Pharmaceutics, Just Accepted Manuscript • DOI: 10.1021/acs.molpharmaceut.6b00302 • Publication Date (Web): 02 Aug 2016




Molecular Pharmaceutics

Identification of Covalent Binding Sites Targeting Cysteines based on Computational Approaches

Yanmin Zhang (a), Danfeng Zhang (b), Haozhong Tian (b), Yu Jiao (b), Zhihao Shi (b), Ting Ran (a), Haichun Liu (a), Shuai Lu (a, b), Anyang Xu (a), Xin Qiao (a), Jing Pan (a), Lingfeng Yin (a), Weineng Zhou (a), Tao Lu (a, b)*, Yadong Chen (a)*

(a) Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China.

(b) State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing 210009, China.

ABSTRACT: Covalent drugs have attracted increasing attention in recent years owing to their good inhibitory activity and selectivity. Targeting noncatalytic cysteines with irreversible inhibitors is a powerful approach for enhancing pharmacological potency and selectivity because cysteines can form covalent bonds with inhibitors through their nucleophilic thiol groups. However, most human kinases have multiple noncatalytic cysteines within the active site; accurately predicting which cysteine is most likely to form a covalent bond is of great importance but remains a challenge in the design of irreversible inhibitors. In this work, FTMap was first applied to assess its ability to predict the covalent binding site, defined as the region where covalent bonds are formed between cysteines and irreversible inhibitors. The results show that it performs very well in detecting the hot spots within the binding pocket and that its hydrogen bond interaction frequency analysis provides useful guidance for identifying covalent binding cysteines. Furthermore, we propose a simple but useful covalent fragment probing approach and show that it successfully predicted the covalent binding sites of seven targets. Using a distance-based method, we observed that the closer the electrophiles of the covalent warheads are to the thiol group of a cysteine, the more likely that cysteine is to form a covalent bond. We believe that the combination of FTMap and our distance-based covalent fragment probing method can become a useful tool for detecting the covalent binding sites of these targets.

Keywords: covalent binding sites; cysteines; fragment probing; irreversible inhibitors

1. INTRODUCTION

The protein kinome, comprising about 518 human kinases, has long been recognized as an attractive but challenging drug target family.1 Many kinase inhibitors have been approved by the U.S. Food and Drug Administration (FDA) for therapeutic use in oncology.2-4 However, the search for clinically applicable kinase inhibitors that target the highly conserved ATP pocket has been thwarted by several hurdles, such as selectivity, cellular potency, and intellectual property issues.5 Because kinase active sites are structurally highly conserved and the concentration of ATP is high, reversible inhibitors are greatly limited in clinical use by their lack of high inhibitory activity and specificity.6, 7

Recent years have witnessed increasing attention to covalent drugs, and their significance is supported by a large body of literature.7, 8 Covalent binding agents are drug molecules that bind to their targets by forming covalent bonds with part of the target.8, 9 The covalent bond makes these ligands bind to their targets in an inherently irreversible manner, resulting in enhanced inhibitory activity.9 In addition, covalent compounds possess high selectivity because of the special structural requirements they place on their targets.7 Moreover, it has been demonstrated that only small amounts of covalent compounds are required in comparison with their non-covalent counterparts, which suggests a degree of safety.7, 10 It has been anticipated that the next decade will witness a resurgence of interest in this important class of therapeutics.7

Targeting noncatalytic cysteine residues with irreversible inhibitors has been demonstrated to be a powerful approach for strengthening pharmacological potency and selectivity.11, 12 Cysteine shows rich chemical reactivity through its nucleophilic thiol group and is also one of the least abundant amino acids in proteins.11, 13 These properties make cysteine ideal for designing covalent drugs.10, 12, 14 Currently, at least four cysteine-targeted kinase inhibitors are in clinical trials for advanced cancer, and they all rely on an acrylamide electrophile to form a covalent bond with cysteines in kinases.7, 15 However, most human kinases have multiple noncatalytic cysteines within the active site that may be suitable for covalent binding.2, 16 Some cysteines are more readily targeted than others, so predicting those cysteines would be very useful in the design of irreversible inhibitors.

Generally, there are two strategies for developing covalent inhibitors. The first is structure-guided design based on existing non-covalent inhibitors.
Examples include the first-generation covalent inhibitors of epidermal growth factor receptor (EGFR) derived from PD168393, the fibroblast growth factor receptor 4 (FGFR4) inhibitor FIIN-1, the serine/threonine-protein kinase NEK2 inhibitor JH295, and the ribosomal S6 kinase (RSK) inhibitor FMK.13 The second strategy is to create libraries of potentially covalent kinase inhibitors in combination with broad-based kinase profiling. The development of JNK-IN-8 as an excellent covalent inhibitor of c-Jun N-terminal kinase 3 (JNK3)17 and of ibrutinib against Bruton’s tyrosine kinase (BTK)18 are good examples. Both strategies involve a detailed and explicit bioinformatics analysis of the binding sites, such as multiple sequence alignment.13, 19

FTMap, a direct computational analog of experimental screening approaches, globally samples small organic molecules as probes on the surface of a target protein, finds favorable positions, clusters the conformations and ranks the clusters (also called “hot spots”) on the basis of average energy.20 It is particularly useful and accurate in predicting the bound poses of user-selected molecules, in detecting whether a compound is likely to bind in the hot spot region, and finally in providing input for the design of larger ligands.21

In this work, we first conducted a multiple sequence alignment of the binding pockets of various kinase targets of covalent inhibitors. Then, FTMap was applied to assess its ability to predict the covalent binding site, defined as the region where covalent bonds could be formed between cysteines (called “covalent binding cysteines”) and irreversible inhibitors. The results show that it performs very well in detecting the hot spots within the binding pocket and that its hydrogen bond interaction frequency analysis provides useful guidance for identifying covalent binding cysteines. Furthermore, we propose a simple but useful distance-based method that employs a series of covalent warheads to detect covalent binding sites, and the results show that it successfully predicted the covalent binding sites of the seven targets. We believe that the combination of FTMap and our distance-based covalent fragment probing method can become a useful tool for detecting the covalent binding sites of these targets.
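
As a rough illustration of the distance-based idea described above, the following Python sketch ranks candidate cysteines by the mean and median distance between the electrophilic atoms of docked warhead poses and each cysteine's thiol sulfur. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `rank_cysteines`, the residue labels and all coordinates are hypothetical, and ranking by mean distance is only one possible choice.

```python
import math
from statistics import mean, median


def rank_cysteines(electrophile_atoms, cysteine_sulfurs):
    """Rank cysteines by how close the docked warhead electrophiles sit.

    electrophile_atoms: (x, y, z) of the electrophilic atom in each docked pose.
    cysteine_sulfurs:   residue label -> (x, y, z) of the thiol sulfur.
    Returns (label, mean distance, median distance), nearest cysteine first.
    """
    rows = []
    for label, sulfur in cysteine_sulfurs.items():
        dists = [math.dist(atom, sulfur) for atom in electrophile_atoms]
        rows.append((label, mean(dists), median(dists)))
    # Assumption: a smaller mean distance marks the more likely covalent site.
    return sorted(rows, key=lambda row: row[1])


# Made-up coordinates in angstroms, for illustration only (not from any structure).
poses = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
cysteines = {
    "CysA": (0.0, 0.0, 3.0),
    "CysB": (0.0, 0.0, 10.0),
    "CysC": (6.0, 0.0, 0.0),
}

ranking = rank_cysteines(poses, cysteines)
for label, mean_d, median_d in ranking:
    print(f"{label}: mean {mean_d:.2f} A, median {median_d:.2f} A")
```

In practice the pose coordinates would come from the docking output and the sulfur coordinates from the prepared crystal structure; both are replaced here by toy values so that the sketch runs on its own.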

2. MATERIALS AND METHODS

2.1. Data preparation. Seven kinase targets (Figure 2a) belonging to the tyrosine kinase (TK) and CMGC groups were investigated in this study: BTK, proto-oncogene tyrosine-protein kinase (cSrc), EGFR, ERK2, FGFR4, JNK3 and VEGFR-2. Twenty-two covalent complexes of these seven targets18, 22, 23 were obtained from the Protein Data Bank (PDB) (www.rcsb.org)24. These structures were experimentally resolved and show a covalent bond between the ligand and the receptor formed by a Michael addition reaction. Two of them are special cases. One is PDB 3GEN25 of BTK, whose cognate ligand is quite similar to the approved irreversible BTK inhibitor ibrutinib. The other is PDB 1YWN26 of vascular endothelial growth factor receptor 2 (VEGFR-2, also known as kinase insert domain receptor [KDR]), which was used because no crystal structure of VEGFR-2 in complex with an irreversible inhibitor is currently available. For each crystal structure, only the one chain with the fewest missing atoms or chain breaks was extracted. The Protein Preparation Wizard module in Maestro (Schrödinger Inc.)27 was then used to add missing hydrogens to side chains and to remove bound waters from these crystal structures.

2.2. Multiple Sequence Alignment. The binding-site sequences of the seven targets were aligned with the Multiple Sequence Alignment module in Schrödinger28. The sequence of the active binding site of each target, defined as the cavity extending 10 Å from the cognate ligand, was aligned for
comparison.

2.3. FTMap. Based on the extremely efficient fast Fourier transform (FFT) correlation approach, FTMap samples billions of probe positions on dense translational and rotational grids and scores them with a sum of correlation functions.21 FTMap was run through its online server (http://ftmap.bu.edu), which scans the entire surface of the protein with a library of 16 small organic probe molecules of varying hydrophobicity and hydrogen bonding capability. The 16 probes are ethanol, isopropanol, isobutanol, acetone, acetaldehyde, dimethyl ether, cyclohexane, ethane, acetonitrile, urea, methylamine, phenol, benzaldehyde, benzene, acetamide and N,N-dimethylformamide.20 For each probe, the six bound clusters with the lowest mean interaction energies were retained. The clusters from different probe types were then clustered into consensus sites (CSs), which define hot spots where multiple probes congregate with high affinity. The CSs were ranked by the number of probe clusters they contained, with the largest CSs representing the most important sites and the nearby smaller CSs revealing other important subsites. Moreover, the hydrogen bond (hbonded) interactions and the nonbonded interactions (including van der Waals forces, hydrophobic interactions, cation-π interactions, electrostatic interactions and ionic interactions) between each residue and the probes were saved for further analysis. The interaction frequency reflects how often the probes contact a given residue.

2.4. Covalent Fragment Probing. Hubert Li et al. constructed a docking-based fragment probing method to detect potential
druggable sites in protein-protein interaction interfaces.29 Their results showed that regions of high ligand density correlate strongly with known protein-protein interacting surfaces. We extended this idea to locate covalent binding sites. The procedure comprises several steps. First, an energy grid for docking was generated around the protein-ligand active cavity, with the cognate ligand of each complex, stripped of its covalent warhead, treated as part of the protein. Second, a database of 79 representative covalent warheads collected from known irreversible inhibitors and the literature30-32 (Figure S1 in the Supporting Information) was sampled across the entire active cavity. Glide (Schrödinger Inc.) in standard precision (SP) mode33 was adopted for docking the covalent fragment probes because a self-docking analysis had identified it as the most favorable software.34 The top-scoring pose of each ligand was retained for ranking the binding sites. In addition, a distance-based analysis was conducted, as illustrated in Figure 1. We evaluated the possibility of a covalent warhead binding to a cysteine by calculating the distance between the electrophile of the warhead and the nucleophilic thiol group of the cysteine.

Generally, the most frequently used functional groups include acrylamides and other α,β-unsaturated groups, boronic acids, and α-halo ketones.35 As shown in Figure 1, the 79 covalent fragments were categorized into six groups according to their reactive groups.32, 35 In Figure 1a, 1b, 1e and 1f, the covalent warheads contain an acrylamide group (e.g., covalent fragments 1-1, 1-4, 2-24, 2-4, 2-21 and 2-38), a propynylamide group (e.g., covalent fragments 2-29 and 2-30), a cyano group (e.g., covalent fragments 2-32, 2-33 and 2-34) and a cyanoacrylamide group (e.g., covalent fragment 3-1), respectively. These groups primarily target cysteines through a
Michael addition reaction.35 In Figure 1c and 1d, the covalent warheads have a chloroacetamide group (e.g., covalent fragments 1-3, 2-27 and 2-28) and a chlorothiadiazole group (e.g., covalent fragments 2-41 and 2-42), respectively. Their reaction with protein side chains involves nucleophilic displacement of the halogen.35 For comparison, both covalent and non-covalent crystal structures of each target were used to detect the covalent binding site, and the corresponding distances between the electrophilic parts of the covalent warheads and the nucleophiles of the protein cysteines were measured. For instance, Figure 1 shows how the distances were measured for some representative electrophilic groups, including covalent fragments 2-4, 2-30, 1-3, 2-41, 2-32 and 3-1, with the EGFR cysteines within the active site (i.e., Cys797, Cys775 and Cys781). The mean and median distances from the covalent fragments to each cysteine were calculated.

[Figure 1 comes about here]

3. RESULTS AND DISCUSSION

3.1. Multiple Sequence Alignment. As shown in Figure 2b and Figure 3, each target had multiple cysteines (at least two) in the active site, and some of these cysteines were quite conserved: for example, the cysteines at position 39 in BTK, EGFR, FGFR4 and JNK3, at position 54 in BTK, cSrc and EGFR, and at position 59 in ERK2, FGFR4 and VEGFR-2. Notably, most of the covalent binding cysteines were situated in these three conserved regions of the active binding pocket: Cys154 of JNK3 was at position 39; Cys481 of BTK, Cys345 of cSrc and Cys797 of EGFR were at position 54; and Cys164 of ERK2 was at position 59 (Figure 3). However, the distributions of cysteines within each target were
quite different: the cysteines were widely distributed in the sequences (Figure 3) and crystal structures (Figure 2b), and their distributions were not related to sequence identity. For example, although FGFR4 and VEGFR-2 had the highest similarity (about 42% sequence identity in the binding site), their cysteine distributions were quite different: the covalent binding Cys477 of FGFR4 was located in the glycine-rich region (G-loop), whereas the covalent binding Cys1043 of VEGFR-2 was situated in the active catalytic loop (A-loop) (Figure 2b and Figure 3). Because the distribution of cysteines was scattered and varied from protein to protein, we could not determine which cysteine was the covalent binding one by sequence alignment alone.

[Figure 2 comes about here]

[Figure 3 comes about here]

3.2. FTMap Analysis. FTMap was applied to analyze the hot spots of the seven targets, which comprise 22 crystal structures in total. Both the ligand-bound and ligand-unbound crystal complexes were investigated to avoid bias. The FTMap outcomes (Figure 4) indicated that the bound ligands of the seven targets form conserved hydrogen bonds with methionine, alanine or cysteine in the hinge region (i.e., Met477 and Glu475 for BTK, Met341 for cSrc, Met793 for EGFR, Met106 for ERK2, Ala553 for FGFR4, Met149 for JNK3, and Cys917 for VEGFR-2). Type II inhibitors, such as the cognate ligands of PDB 4R6V and 1YWN, also form hydrogen bonds with the aspartate and glutamate residues in the DFG motif (i.e., Asp630 for FGFR4, and Asp1044 and Glu883 for VEGFR-2). Moreover, their covalent warheads point toward the corresponding covalent binding cysteines (i.e., Cys481, Cys345, Cys797, Cys164,
Cys477, Cys154 and Cys1043 for BTK, cSrc, EGFR, ERK2, FGFR4, JNK3 and VEGFR-2, respectively), which facilitates the Michael addition reaction. Table S1 in the Supporting Information and Figure 4 display the distribution of the 16 different probe clusters from FTMap. In druggable targets, it has been shown that the main hot spot binds at least 16 probe clusters and, in combination with its nearby hot spots, predicts the binding site that can potentially accommodate drug-size ligands.36-38 According to Table S1 in the Supporting Information, FTMap recognized at least one druggable site in all targets except PDB 2J5F of EGFR and PDB 4QQC of FGFR4 in the ligand-bound group and PDB 3SVV of cSrc, PDB 2J5E of EGFR and PDB 4QQC of FGFR4 in the ligand-unbound group. Because the number of CSs with at least 16 probe clusters did not differ significantly between the ligand-bound and ligand-unbound groups, we analyzed the results generated from the ligand-bound complexes in detail. To make the CSs easier to recognize, they are colored according to the number of probe clusters bound to them. In addition, to better explore the protein-ligand binding interactions, the frequencies of both hbonded and nonbonded interactions were calculated for each residue. The FTMap results for each target are analyzed below.

3.2.1. BTK. As a key target in B-cell activation and development, BTK has emerged as a new molecular target for treating B-cell malignancies and autoimmune diseases.19 Although a variety of BTK inhibitors such as dasatinib, LFM-A13, CGI-1746, and RN-486 have been developed, most of them failed in clinical trials, whereas the irreversible inhibitor ibrutinib was approved in 2013 to treat patients with mantle cell lymphoma.39 As shown in Table S1 in
the Supporting Information, FTMap detected seven CSs for BTK, four of which (CS1 to CS4) were druggable. As shown in Figure 4a, CS3 (16 probe clusters) was located in the hinge region, while CS2 (17 probe clusters) was near the oxydibenzene part of ibrutinib. Although CS4 (16 probe clusters) and CS1 (19 probe clusters) were far from the cognate ligand, they lay in the vicinity of the DFG motif and the hydrophobic back pocket, respectively. Met477 and Glu475 accounted for 2.8 % and 0.2 % of the hbonded interactions and 2.7 % and 0.2 % of the nonbonded interactions, respectively. Moreover, the covalent binding Cys481 showed 2.2 % of the hbonded interactions and 1.6 % of the nonbonded interactions, whereas no hbonded or nonbonded interactions were detected for Cys527 and Cys464.

3.2.2. cSrc. cSrc is a major cytosolic tyrosine kinase in vascular tissue and is linked to the progression of cancers (e.g., colon, liver, lung, breast and pancreas) by promoting other signals.40 A number of inhibitors that target cSrc tyrosine kinase have been developed for therapeutic use, including bosutinib, bafetinib, AZD-530, XL-999, KX01 and XL228.41 As shown in Figure 4b, three of the top five CSs (CS1 to CS3) were located in the binding pocket. The druggable CS2 (18 probe clusters) overlapped the quinazoline part in the hinge region, and CS1 (22 probe clusters) was situated around the 3-bromoaniline substituent. In addition, Met341 in the hinge region had the second highest percentages of hbonded and nonbonded interactions (12.8 % and 6.6 %, respectively), after Asp404 of the DFG motif (17.8 % and 8.3 %, respectively). The percentages of hbonded and nonbonded interactions for the covalent binding Cys345 were 5.0 % and 2.9 %, respectively, higher than those for Cys277, which were both below 0.8 %.
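
The comparison of cysteines by interaction frequency can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source reports percentages without giving the normalization, so `to_percent` assumes a residue's share of all probe-residue contacts, and ordering by hbonded frequency with nonbonded frequency as a tie-breaker is a hypothetical choice. The BTK values are the ones quoted in Section 3.2.1; the function names are invented for this example.

```python
def to_percent(contact_counts):
    """Convert raw probe-contact counts per residue into percentages.

    Assumed normalization: each residue's share of all contacts.
    """
    total = sum(contact_counts.values())
    if total == 0:
        return {res: 0.0 for res in contact_counts}
    return {res: 100.0 * n / total for res, n in contact_counts.items()}


def rank_cysteines(frequencies):
    """Order cysteines by (hbonded %, nonbonded %), highest first."""
    cysteines = {r: f for r, f in frequencies.items() if r.startswith("Cys")}
    # Tuples compare element-wise, so nonbonded % only breaks hbonded ties.
    return sorted(cysteines, key=cysteines.get, reverse=True)


# (hbonded %, nonbonded %) for BTK as quoted in Section 3.2.1.
btk = {
    "Met477": (2.8, 2.7),
    "Glu475": (0.2, 0.2),
    "Cys481": (2.2, 1.6),
    "Cys527": (0.0, 0.0),
    "Cys464": (0.0, 0.0),
}

print(rank_cysteines(btk))
print(to_percent({"ResA": 1, "ResB": 3}))  # hypothetical raw counts
```

With the BTK values, the covalent binding Cys481 comes first because it is the only cysteine with nonzero frequencies.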

3.2.3. EGFR. EGFR already has several reversible (e.g., gefitinib and erlotinib) and irreversible (e.g., afatinib (FDA approved), canertinib, dacomitinib and WZ-4002) inhibitors in clinical trials.42 Although the top five CSs were situated within the binding pocket, only CS1, around the DFG motif, contained 16 probe clusters (Table S1 in the Supporting Information). CS2 and CS3, both with 15 probe clusters, overlapped the quinoline scaffold in the hinge region and the hydrophobic back pocket, respectively. CS4 (12 probe clusters) and CS5 (11 probe clusters) matched the 3-chloro-4-(pyridin-2-ylmethoxy)aniline substituent of the cognate ligand. The interaction percentages indicated that Met793 had the second highest percentages of hbonded and nonbonded interactions, 8.0 % and 4.7 %, respectively (Figure 4c). The covalent binding Cys797 had an hbonded frequency of 3.7 % and a nonbonded frequency of 1.3 %, followed by Cys775 with hbonded and nonbonded frequencies of 2.7 % and 2.7 %, respectively. In the other crystal structures of EGFR, except PDB 2J5E, in which no interactions were identified, the interaction frequencies were higher for Cys775 (hbonded: 7.7 %; nonbonded: 1.7 %) than for Cys797 (hbonded: 0.6 %; nonbonded: 1.1 %). Additionally, no interaction with Cys781 was detected in any of the eight crystal structures.

3.2.4. ERK2. ERK2, a member of the MAP kinase family, acts as an integration point for multiple biochemical signals and is involved in a wide variety of cellular processes such as proliferation, differentiation, transcription regulation, and development.43 Various ERK1/2 kinase inhibitors have been developed, especially those with antitumor activity in MAPK
inhibitor-naïve and MAPK inhibitor-resistant cells containing BRAF or RAS mutations.44 As shown in Table S1 in the Supporting Information and Figure 4d, three hot spots with at least 16 probe clusters were identified; among them, CS2 (22 probe clusters) was situated in the hinge region. CS1 (27 probe clusters) and CS3 (16 probe clusters) were located in a relatively large cavity, where CS1 was surrounded by the G-loop, the DFG motif and the αC helix. In terms of interaction frequency, Met106 had the highest hbonded interaction frequency (14.0 %) and the third highest nonbonded interaction frequency (7.3 %). The covalent binding Cys164 had the second highest hbonded interaction frequency (12.8 %) and a relatively high nonbonded interaction frequency (2.3 %). No interactions were detected for Cys38, which is also located in the binding pocket.

3.2.5. FGFR4. FGFRs are receptors that bind members of the fibroblast growth factor (FGF) family. The past decades have witnessed the emergence of several irreversible inhibitors of FGFR4 and their corresponding crystal structures.23, 45 FTMap recognized CS1 as having at least 16 probe clusters, and it mapped exactly onto the hinge region (Table S1 in the Supporting Information and Figure 4e). CS2 (15 probe clusters) overlapped the phenyl part of the covalent warhead at the location of the covalent binding Cys477. The hbonded interaction frequencies of Ala553, Asp630 and Glu475, which formed hydrogen bonds with the cognate ligand, were 4.0 %, 12.6 % and 0.8 %, respectively, and the corresponding nonbonded interaction frequencies were 1.8 %, 4.2 % and 2.2 %. Moreover, the three cysteines (Cys477, Cys552 and Cys608) within the binding pocket had hbonded interaction frequencies of 6.6 %, 6.9 % and 2.9 %, respectively, and nonbonded interaction frequencies of 2.1 %, 1.7 %,
and 1.2 %, respectively. Although the hbonded interaction frequencies of Cys477 and Cys552 were comparable, the nonbonded interaction frequency of Cys477 was higher than that of Cys552, indicating a higher covalent binding capability for Cys477.

[Figure 4 comes about here]

3.2.6. JNK3. JNKs, first characterized as stress-activated members of the mitogen-activated protein kinase (MAPK) family, have become a focus of inhibitor screening because of their critical roles in the development of a number of diseases, such as diabetes, neurodegeneration and liver disease.46 Three CSs with at least 16 probe clusters were found in the binding pocket: CS1 (25 probe clusters) was located in the hinge region, and CS2 (17 probe clusters) was situated in a surface groove far from the cognate ligand (Table S1 in the Supporting Information and Figure 4f). Small hot spots such as CS4 and CS5 lay around the pyridine part of the ligand, supporting the suitability of this area for accommodating ligands. The hinge region Met149 had frequencies of 4.9 % and 3.7 % for hbonded and nonbonded interactions, respectively. It should be noted that no interactions were detected for the three cysteines (Cys154, Cys79 and Cys251) in the binding cavity; however, Cys283, located near CS2, showed low hbonded and nonbonded interaction frequencies of 0.5 % and 0.1 %, respectively.

3.2.7. VEGFR-2. VEGFR-2 kinase is a critical target for the discovery of small-molecule inhibitors against tumor-associated angiogenesis.47 Various types of VEGFR-2 kinase inhibitors have been approved by the FDA for clinical use, including pyrimidines (e.g., pazopanib), diaryl
ureas (e.g., sorafenib and regorafenib), indolinones (e.g., sunitinib), quinazolines (e.g., vatalanib), and some other types (e.g., axitinib).34, 48 According to Table S1 in the Supporting Information and Figure 4g, FTMap recognized three hot spots with at least 16 probe clusters in the VEGFR-2 binding pocket. CS2 (21 probe clusters) overlapped the hinge region, and CS1 (23 probe clusters) was located in the hydrophobic back pocket. CS3 (18 probe clusters) and CS5 (7 probe clusters) were situated around the other parts of the cognate ligand. High hbonded and nonbonded interaction frequencies were observed for the residues that formed hydrogen bonds with the ligand: Cys917 (26.9 % and 5.8 %), Asp1044 (9.7 % and 4.2 %) and Glu885 (4.9 % and 5.7 %). As for the two remaining cysteines, Cys1043 had the second highest hbonded interaction frequency (15.4 %) and the fifth highest nonbonded interaction frequency (4.0 %), while Cys1022 had very low hbonded and nonbonded interaction frequencies of 0.02 % and 1.2 %, respectively. The hbonded interaction frequency of the hinge region Cys917 was the highest.

Overall, the results showed that FTMap performed well in detecting hot spots in the hinge region, the DFG motif and the hydrophobic back pocket, which are the three most important binding sites in kinases. In addition, the analysis of hbonded and nonbonded interaction frequencies indicated that, for five of the seven targets (JNK3 and VEGFR-2 being the exceptions), the covalent binding cysteines had the highest interaction frequencies. Because most known covalent inhibitors are developed from known reversible inhibitors by introducing a Michael acceptor that forms a covalent bond with a cysteine, and because it is generally very difficult to determine which cysteine should be targeted, we further propose a covalent binding site detection strategy that samples covalent warheads into the binding
pockets; the details are discussed below.

3.3. Covalent Fragment Probing Analysis.

The FTMap results demonstrated that small molecular probes preferentially bind to the active site (i.e., the ATP binding site). Moreover, a kinetic analysis that separates the components of the overall potency of covalent drugs (reversible binding and chemical reactivity) has shown that the non-covalent part of a covalent inhibitor is essential to both biochemical and cellular potency.15 Our main purpose here was therefore to detect covalent binding sites around reversible inhibitors that already bind to the active site. Thus, for the crystal structures with covalently bound ligands, the covalent warheads were deleted, whereas for the non-covalent crystal structures the original cognate ligands were left unchanged. The molecular surface area (MSA) and solvent accessible surface area (SASA) of the cysteines (Figure 5a) and of their sulfur atoms (Figure 5b), calculated with PyMOL,49 were compared between the covalent and non-covalent crystal structures to ensure that there was no significant difference between the two. As shown in Figure 5a, the MSA of the cysteines all fell within 90–107 Å², and the SASA generally accounted for less than 50% of the MSA. However, there were some exceptions, for which the percentage of SASA over MSA in the covalent and non-covalent crystal structures was, respectively, 167% and 111% for Cys277 in cSrc, 115.4% and 52.8% for Cys477 in FGFR4, and 70.5% and 50.7% for Cys154 in JNK3. Meanwhile, the MSA of the sulfur atoms of these cysteines were all around 25 Å², and the SASA were also less than 50% of the corresponding MSA (Figure 5b). Outliers in the percentage of SASA over MSA included Cys345 (56.6% and 51.8%) and Cys277 (187.4% and 151.7%) in cSrc for
covalent and non-covalent crystal structures, respectively, as well as Cys477 (119.7%) in FGFR4 and Cys154 (100.7%) in JNK3 for the covalent crystal structures only. Despite these outliers, the overall trend of both MSA and SASA for the cysteines and their reactive sulfur atoms was the same in the covalent and non-covalent crystal structures. Moreover, the intrinsic reactivity of the cysteines was investigated by calculating their acid dissociation constant (pKa) values with PROPKA3 (http://propka.org/), which is very fast yet still provides acceptable results.50,51 However, as shown in Table S2 in the Supporting Information, only the covalent binding cysteines in 3GEN of BTK (i.e., Cys481), 4QQC and 4R6V of FGFR4 (i.e., Cys477), and 3V6S of JNK3 (i.e., Cys154) had the lowest pKa values, which may indicate the highest reactivity. The covalent binding cysteines in the remaining crystal structures were not the ones with the highest predicted reactivity. This may be because reactivity depends on several factors, such as pKa, solvent exposure (e.g., buried area), and steric conflicts.51 It has been demonstrated that heightened reactivity is not necessarily a defining feature of all functional cysteines. For instance, catalytic cysteines in some enzymes may show decreased reactivity until they bind their physiological substrates, or may depend more on substrate recognition than on inherent catalytic power. The activity of other cysteines may also not rely on their nucleophilicity.52 Nevertheless, the pKa values of the cysteines in the selected covalent and non-covalent crystal structures were almost the same (Table S3 in the Supporting Information), further demonstrating the consistency of the covalent and non-covalent structures.

[Figure 5 about here]

As shown in the left panel of Figure 6, fragment probes sampled to the covalent crystal
structures are depicted in magenta and those sampled to the non-covalent crystal structures are colored blue. For all the targets, the fragments probed the known covalent consensus sites (called “covalent CS”) in the covalent crystal structures (the magenta fragments in Figure 6). For BTK (Figure 6a) and JNK3 (Figure 6f), about two-thirds of the covalent warheads were detected in the covalent binding sites. The remaining covalent fragments were located in the DFG-out hydrophobic region, which can accommodate many types of fragments. This was similar to the FTMap results, in which CS1 (blue) and CS4 (salmon) (left panels of Figure 4a and Figure 4f) were also situated in this area, because ibrutinib itself is a type I inhibitor that does not occupy this pocket. For JNK3, this deviation may be attributed to its long and wide binding site, which can accommodate large fragments. For VEGFR-2, almost all the covalent probes appeared in the vicinity of Cys1043, which has been shown to be the covalent binding cysteine.53,54 The probes in the non-covalent crystal structures (colored blue in the left panels of Figures 6a–6f) behaved very similarly to their covalent counterparts. For FGFR4 (Figure 6e), although the fragment probes occupied two different regions, they were all close to the covalent binding cysteine (Cys477). This may result from the different initial positions of Cys477 in the covalent PDB 4R6V and the non-covalent PDB 4UXQ. Furthermore, we focused on the distance between the fragment probes and the covalent binding cysteine. To make the analysis more quantitative, we assumed that the closer the electrophiles of the covalent warheads are to the thiol group of a cysteine, the more likely that cysteine is to form a covalent bond. As shown in the right panels of Figures 6a–6f and Table S3 in the Supporting Information, all the covalent binding cysteines clearly achieved
the smallest mean and median distances between the electrophilic group of the fragments and the thiol group of the cysteines. From Figure 6 and Table S3 in the Supporting Information, the mean distances were all less than 10 Å (i.e., BTK [9.9 Å], cSrc [7.2 Å], EGFR [6.6 Å], ERK2 [8.4 Å], FGFR4 [6.8 Å], JNK3 [8.8 Å], and VEGFR-2 [6.7 Å]). The corresponding median values were also less than 10 Å, except for cSrc (11.5 Å) and JNK3 (10.4 Å). In contrast, for the other cysteines within the active sites, the corresponding distances were generally two to three times longer than those for the covalent binding cysteines. For instance, the mean distances for the covalent binding Cys797 of EGFR were 6.6 Å and 5.7 Å for the covalent PDB 2JIV and non-covalent PDB 3POZ, respectively (Figure 6c and Table S3 in the Supporting Information), whereas for the other two cysteines, Cys775 and Cys781, the corresponding distances were about twice (11.4 Å and 13.3 Å) and three times (21.3 Å and 22.5 Å) as long as those to Cys797, respectively. Another example was FGFR4, for which the mean distances to Cys477 were 6.8 Å and 7.1 Å for the covalent PDB 4R6V and non-covalent PDB 4UXQ, respectively. By comparison, the corresponding distances were about twice as long for Cys552 (11.4 Å and 19.4 Å) and about three times as long for Cys540 (21.5 Å and 19.8 Å) and Cys608 (18.0 Å and 10.6 Å). It should be noted that BTK and VEGFR-2 are the two targets for which assumed crystal structures, without a covalently bound ligand, were used. For BTK, the mean distances for the covalent binding Cys481 were 9.9 Å and 6.0 Å for the covalent 3GEN and non-covalent 3K54, respectively, shorter than those for Cys527 (12.7 Å and 10.5 Å) and Cys464 (18.7 Å and 20.7 Å). For VEGFR-2, the mean distances for the covalent binding Cys1043 were 6.7 Å and 8.7 Å for the assumed covalent 1YWN and non-covalent 4ASD, respectively, shorter than those for Cys917 (8.8 Å and 10.8 Å) and Cys1022 (20.6 Å and 20.3
Å). This indicates that, for both covalent and non-covalent crystal structures, using covalent fragments to probe the covalent binding site is reasonably reliable.

[Figure 6 about here]

4. CONCLUSION

In this work, multiple sequence alignment of seven kinase targets (BTK, cSrc, EGFR, ERK2, FGFR4, JNK3, and VEGFR-2) was conducted to investigate the conservation of cysteines within the active binding site. FTMap was employed to explore the binding site and the hydrogen bond frequency. The results show that FTMap can favorably detect the hot spots in the active site, and its hydrogen bond frequency analysis indicates that a cysteine with a higher frequency of hbonded and nonbonded interactions is more likely to form a covalent bond. Furthermore, we proposed a covalent fragment probing approach that uses a series of covalent warheads to detect the covalent binding site. Although covalent fragments appear at multiple sites for some targets (JNK3 and BTK), they are mainly concentrated in the vicinity of the known covalent binding cysteines. Additionally, the results showed that the closer the electrophiles of the covalent warheads are to the thiol group of a cysteine, the more likely that cysteine is to form a covalent bond. The consistent results for covalent and non-covalent crystal structures indicate that this approach is reasonably reliable. In summary, we believe that the combination of FTMap, particularly the hydrogen bond frequency analysis, and our distance-based covalent fragment probing method can become a useful tool for detecting the covalent binding sites where cysteines form covalent bonds with ligands of these targets.
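
As an illustration of the distance criterion summarized above, the following minimal sketch ranks the cysteines of a binding site by the mean distance between the electrophilic atoms of the docked fragments and each cysteine sulfur atom, and also reports the median. It is not the workflow used in this study: the function name `rank_cysteines`, the input layout, and the toy coordinates are assumptions made for illustration, and real coordinates would come from the docked poses and the crystal structure.

```python
import math
from statistics import mean, median


def rank_cysteines(electrophile_xyz, thiol_xyz):
    """Rank cysteines by mean electrophile-to-sulfur distance (smallest first).

    electrophile_xyz: list of (x, y, z) coordinates, one electrophilic atom
                      per docked fragment pose (hypothetical input layout).
    thiol_xyz: dict mapping a cysteine label to the (x, y, z) of its sulfur atom.
    Returns a list of (label, (mean_distance, median_distance)) tuples.
    """
    stats = {}
    for label, sulfur in thiol_xyz.items():
        distances = [math.dist(atom, sulfur) for atom in electrophile_xyz]
        stats[label] = (mean(distances), median(distances))
    # Sort on the mean distance; the median is kept for inspection.
    return sorted(stats.items(), key=lambda item: item[1][0])


# Toy coordinates in Å (hypothetical, not taken from any crystal structure).
poses = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
cysteines = {"CysA": (3.0, 0.0, 0.0), "CysB": (15.0, 0.0, 0.0)}

ranking = rank_cysteines(poses, cysteines)
for label, (mean_d, median_d) in ranking:
    print(f"{label}: mean {mean_d:.1f} Å, median {median_d:.1f} Å")
```

Under the assumption stated in Section 3.3, the first entry of the ranking is the cysteine most likely to be targeted. The mean and the median are best inspected together, since they can disagree, as observed for cSrc and JNK3.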

FIGURES

Figure 1. Six types of covalent warheads and their mechanisms of reaction with cysteines. For each type, an example is given in which the distance (unit: Å) between the electrophile of the covalent warhead and the nucleophilic thiol group of the cysteine in EGFR is measured. The distances are shown as purple and orange dotted lines, with their corresponding values, for the covalent and non-covalent crystal structures, respectively (magenta and cyan ball and sticks for the covalent PDB 2JIV and the non-covalent PDB 3POZ, respectively).

Figure 2. Distribution of the seven targets and their cysteines. (a) Distribution of the targets in the kinome tree. (b) Distribution of the cysteines in the crystal structures. The crystal structure depicted as a green cartoon in the right panel is the VEGFR-2 protein (PDB ID: 1YWN). The cysteines are colored by target. For the covalent binding cysteines (labeled with the target name and cysteine number), the sulfur, oxygen, and nitrogen atoms are colored gold, red, and dark blue, respectively. The carbon atoms of the covalent binding cysteines and all atoms of the other cysteines are colored as follows: BTK (grey), cSrc (salmon), EGFR (indigo), ERK2 (orange), FGFR4 (blue), JNK3 (magenta), and VEGFR-2 (green).

Figure 3. Multiple sequence alignment of the binding site, defined as the pocket within 10 Å of the bound ligand. Each line represents the binding site sequence of one target. Only the matched residues are colored, and the colors are weighted by alignment quality: the more residues matched, the darker the color. The amino acids circled in pink are cysteines within the binding site. The covalent binding cysteines, outlined with pink rectangles, are labeled.
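
The binding site definition used for the alignment (all residues within 10 Å of the bound ligand) can be sketched as a simple distance filter. The snippet below is a minimal illustration, not the tool used to prepare Figure 3. The function `binding_site_residues`, the residue labels, and the coordinates are hypothetical, and a residue is assumed to belong to the pocket if any of its atoms lies within the cutoff of any ligand atom.

```python
import math


def binding_site_residues(protein_atoms, ligand_atoms, cutoff=10.0):
    """Return residue labels with at least one atom within `cutoff` Å of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs, one per atom.
    ligand_atoms: list of (x, y, z) ligand atom coordinates.
    Labels are returned in order of first appearance, without duplicates.
    """
    selected = []
    seen = set()
    for label, xyz in protein_atoms:
        if label in seen:
            continue  # residue already selected through another atom
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            seen.add(label)
            selected.append(label)
    return selected


# Toy input in Å (hypothetical labels and coordinates).
protein = [
    ("CysA", (5.0, 0.0, 0.0)),
    ("CysA", (6.0, 0.0, 0.0)),
    ("CysB", (12.0, 0.0, 0.0)),
    ("ResC", (0.0, 9.0, 0.0)),
]
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

site = binding_site_residues(protein, ligand)
print(site)
```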

Figure 4. FTMap and hydrogen bond frequency maps for the seven proteins. The cognate ligands are depicted as green ball and sticks. Important residues (light gray sticks) in the hinge region and DFG motif, as well as other residues that formed hydrogen bonds with the bound ligand, are labeled. The cysteines within the binding pocket are also labeled, with the covalent binding ones labeled in blue. CSs are shown as lines, with CS1 (cyan), CS2 (magenta), CS3 (yellow), CS4 (salmon), and CS5 (grey) in descending order of the number of probe clusters. In the right panel, the red and blue bars represent the hbonded interactions (hbonded%) and the nonbonded interactions (nonbonded%), respectively. Purple stars mark the residues that form hydrogen bonds with the cognate ligands, and green stars mark the cysteines in the vicinity of the binding pockets detected by FTMap. The residues that formed hydrogen bonds with the cognate ligands and the cysteines within the binding pocket are labeled, with the corresponding hbonded (first value) and nonbonded (second value) interaction frequencies given in parentheses. The covalent
binding cysteines are also colored blue. The corresponding targets are as follows: (a) BTK (PDB ID: 3GEN); (b) cSrc (PDB ID: 2QQ7); (c) EGFR (PDB ID: 2JIV); (d) ERK2 (PDB ID: 3C9W); (e) FGFR4 (PDB ID: 4R6V); (f) JNK3 (PDB ID: 3V6S); (g) VEGFR-2 (PDB ID: 1YWN).
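
The hbonded% and nonbonded% values shown in the right panels can be read as each residue's share of all probe contacts of that type. The sketch below illustrates this normalization and the resulting ranking of cysteines. It assumes that per-residue contact counts are available and that the percentage is the count divided by the total over all residues, which is an interpretation of the bar charts rather than a documented FTMap formula. The function names, residue labels, and counts are invented for illustration.

```python
def interaction_frequency(counts):
    """Convert per-residue contact counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {residue: 0.0 for residue in counts}
    return {residue: 100.0 * n / total for residue, n in counts.items()}


def rank_cysteines_by_contacts(hbonded, nonbonded):
    """Order cysteines by hbonded%, using nonbonded% to break ties."""
    hb = interaction_frequency(hbonded)
    nb = interaction_frequency(nonbonded)
    cysteines = [residue for residue in hb if residue.startswith("Cys")]
    return sorted(cysteines, key=lambda r: (hb[r], nb.get(r, 0.0)), reverse=True)


# Hypothetical contact counts between probes and binding site residues.
hbonded_counts = {"Cys1": 30, "Cys2": 5, "Asp3": 15}
nonbonded_counts = {"Cys1": 10, "Cys2": 20, "Asp3": 70}

hb_percent = interaction_frequency(hbonded_counts)
cys_ranking = rank_cysteines_by_contacts(hbonded_counts, nonbonded_counts)
print(hb_percent)
print(cys_ranking)
```

In this reading, the cysteine at the top of the ranking would be the first candidate for covalent targeting, subject to the exceptions noted in the text for JNK3 and VEGFR-2.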

Figure 5. Comparison of MSA and SASA (unit: Å²) of the binding site cysteines and their corresponding sulfur atoms in both covalent and non-covalent crystal structures. (a) MSA and SASA of the binding site cysteines. (b) MSA and SASA of the corresponding sulfur atoms. MSA_Covalent and MSA_non-Covalent represent the molecular surface area of the cysteines or sulfur atoms in the covalent and non-covalent crystal structures, respectively. SASA_Covalent and SASA_non-Covalent indicate the solvent accessible surface area of the cysteines or sulfur atoms in the covalent and non-covalent crystal structures, respectively.
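
The percentage of SASA over MSA discussed in Section 3.3 is a simple ratio, and the 50% level mentioned there can serve as a flag for unusually exposed cysteines. The following sketch assumes that MSA and SASA values (in Å²) have already been computed, for example with PyMOL. The function names and the numerical values are hypothetical and are not taken from Figure 5.

```python
def sasa_percent(msa, sasa):
    """Return SASA as a percentage of MSA (both in Å²)."""
    if msa <= 0:
        raise ValueError("MSA must be positive")
    return 100.0 * sasa / msa


def flag_exposed(areas, threshold=50.0):
    """Return {label: percentage} for entries whose SASA/MSA exceeds `threshold`.

    areas: dict mapping a cysteine label to a (msa, sasa) tuple.
    Percentages are rounded to one decimal place for reporting.
    """
    flagged = {}
    for label, (msa, sasa) in areas.items():
        percent = sasa_percent(msa, sasa)
        if percent > threshold:
            flagged[label] = round(percent, 1)
    return flagged


# Hypothetical surface areas in Å² (not values from this study).
example_areas = {"CysX": (100.0, 30.0), "CysY": (95.0, 60.0)}
exposed = flag_exposed(example_areas)
print(exposed)
```

Applying the same function to the covalent and non-covalent structures of a target gives a quick check of whether the two structures agree on which cysteines are exposed.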

Figure 6. Covalent fragment probing maps. In the left panel, the covalent fragment probes are depicted as magenta lines in the covalent crystal structures and cyan lines in the non-covalent crystal structures. The cognate ligands are shown as green ball and sticks and pink sticks in the covalent and non-covalent crystal structures, respectively. Important residues (light gray and pink sticks in the covalent and non-covalent crystal structures, respectively) and cysteines within the binding pocket are labeled, with the covalent binding ones labeled in blue. The region near the covalent binding cysteines is denoted as the covalent consensus site (covalent CS). In the bar chart in the right panel, the distance (unit: Å) between the ligand centroids and the cysteine thiol
groups is colored green, and the mean distance between the covalent fragment electrophiles and the thiol group of the covalent cysteines is colored orange and blue for the covalent and non-covalent crystal structures, respectively. The corresponding targets and PDBs are as follows: (a) BTK (covalent PDB: 3GEN and non-covalent PDB: 3K54); (b) cSrc (covalent PDB: 2QQ7 and non-covalent PDB: 3TZ9); (c) EGFR (covalent PDB: 2JIV and non-covalent PDB: 3POZ); (d) ERK2 (covalent PDB: 3C9W and non-covalent PDB: 4Q4P); (e) FGFR4 (covalent PDB: 4R6V and non-covalent PDB: 4UXQ); (f) JNK3 (covalent PDB: 3V6S and non-covalent PDB: 2OK1); (g) VEGFR-2 (covalent PDB: 1YWN and non-covalent PDB: 4ASD).

ASSOCIATED CONTENT

Supporting Information

Figure S1. Fragments used to probe the covalent binding sites. Table S1. Probe summary for the ligand-bound and ligand-unbound crystal structures. Table S2. Calculated pKa values and percentage of buried area for cysteines in the seven targets. Table S3. Summary of distances, MSA, SASA, pKa, and percentage of buried area for cysteines in covalent and non-covalent crystal structures.

AUTHOR INFORMATION

Corresponding Authors

For Yadong Chen: Tel.: +86-25-86185163. Fax: +86-25-86185182. E-mail: [email protected].
For Tao Lu: Tel.: +86-25-86185180. Fax: +86-25-86185179. E-mail: [email protected].

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript. All authors contributed equally.

Conflict of interest

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

We sincerely thank Dr. Huifang Li, Dr. Amy I Gilson and Dr. Jin Yan for improving the language of the manuscript. The work was funded by the National Natural Science Foundation of China (81302634, 21302225), the Natural Science Foundation of Jiangsu Province (BK20130662), the Fundamental Research Funds for the Central Universities (PT2014LX0072), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Jiangsu Qinglan Project. We also thank Cell Signaling Technology, Inc. (www.cellsignal.com) for providing the Kinome Render tools used to generate the kinome tree (illustration reproduced courtesy of Cell Signaling Technology, Inc.).

REFERENCES

(1) Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. The protein kinase complement of the human genome. Science 2002, 298 (5600), 1912-1934.
(2) Zhang, J.; Yang, P. L.; Gray, N. S. Targeting cancer with small molecule kinase inhibitors. Nat. Rev. Cancer 2009, 9 (1), 28-39.
(3) Wu, P.; Nielsen, T. E.; Clausen, M. H. FDA-approved small-molecule kinase inhibitors. Trends Pharmacol. Sci. 2015, 36 (7), 422-439.
(4) Wu, P.; Nielsen, T. E.; Clausen, M. H. Small-molecule kinase inhibitors: an analysis of FDA-approved drugs. Drug Discov. Today 2016, 21 (1), 5-10.
(5) Barf, T.; Kaptein, A. Irreversible protein kinase inhibitors: balancing the benefits and risks. J. Med. Chem. 2012, 55 (14), 6243-6262.
(6) Carmi, C.; Mor, M.; Petronini, P. G.; Alfieri, R. R. Clinical perspectives for irreversible tyrosine kinase inhibitors in cancer. Biochem. Pharmacol. 2012, 84 (11), 1388-1399.
(7) Singh, J.; Petter, R. C.; Baillie, T. A.; Whitty, A. The resurgence of covalent drugs. Nat. Rev. Drug Discov. 2011, 10 (4), 307-317.
(8) Mah, R.; Thomas, J. R.; Shafer, C. M. Drug discovery considerations in the development of covalent inhibitors. Bioorg. Med. Chem. Lett. 2014, 24 (1), 33-39.
(9) Kumalo, H. M.; Bhakat, S.; Soliman, M. E. Theory and applications of covalent docking in drug discovery: merits and pitfalls. Molecules 2015, 20 (2), 1984-2000.
(10) Potashman, M. H.; Duggan, M. E. Covalent modifiers: an orthogonal approach to drug design. J. Med. Chem. 2009, 52 (5), 1231-1246.
(11) Serafimova, I. M.; Pufall, M. A.; Krishnan, S.; Duda, K.; Cohen, M. S.; Maglathlin, R. L.; McFarland, J. M.; Miller, R. M.; Frödin, M.; Taunton, J. Reversible targeting of noncatalytic cysteines with chemically tuned electrophiles. Nat. Chem. Biol. 2012, 8 (5), 471-476.
(12) Smith, A. J.; Zhang, X.; Leach, A. G.; Houk, K. Beyond picomolar affinities: quantitative aspects of noncovalent and covalent binding of drugs to proteins. J. Med. Chem. 2008, 52 (2), 225-233.
(13) Liu, Q.; Sabnis, Y.; Zhao, Z.; Zhang, T.; Buhrlage, S. J.; Jones, L. H.; Gray, N. S. Developing irreversible inhibitors of the protein kinase cysteinome. Chem. Biol. 2013, 20 (2), 146-159.
(14) Copeland, R. A.; Pompliano, D. L.; Meek, T. D. Drug–target residence time and its implications for lead optimization. Nat. Rev. Drug Discov. 2006, 5 (9), 730-739.
(15) Schwartz, P. A.; Kuzmic, P.; Solowiej, J.; Bergqvist, S.; Bolanos, B.; Almaden, C.; Nagata, A.; Ryan, K.; Feng, J.; Dalvie, D. Covalent EGFR inhibitor analysis reveals importance of reversible interactions to potency and mechanisms of drug resistance. Proc. Natl. Acad. Sci. U.S.A. 2014, 111 (1), 173-178.
(16) Leproult, E.; Barluenga, S.; Moras, D.; Wurtz, J.-M.; Winssinger, N. Cysteine mapping in conformationally distinct kinase nucleotide binding sites: application to the design of selective covalent inhibitors. J. Med. Chem. 2011, 54 (5), 1347-1355.
(17) Zhang, T.; Inesta-Vaquera, F.; Niepel, M.; Zhang, J.; Ficarro, S. B.; Machleidt, T.; Xie, T.; Marto, J. A.; Kim, N.; Sim, T. Discovery of potent and selective covalent inhibitors of JNK. Chem. Biol. 2012, 19 (1), 140-154.
(18) Pan, Z.; Scheerens, H.; Li, S. J.; Schultz, B. E.; Sprengeler, P. A.; Burrill, L. C.; Mendonca, R. V.; Sweeney, M. D.; Scott, K. C.; Grothaus, P. G. Discovery of selective irreversible inhibitors for Bruton’s tyrosine kinase. ChemMedChem 2007, 2 (1), 58-61.
(19) Vargas, L.; Hamasy, A.; Nore, B. F.; Smith, C. I. E. Inhibitors of BTK and ITK: state of the new drugs for cancer, autoimmunity and inflammatory diseases. Scand. J. Immunol. 2013, 78 (2), 130-139.
(20) Ngan, C. H.; Bohnuud, T.; Mottarella, S. E.; Beglov, D.; Villar, E. A.; Hall, D. R.; Kozakov, D.; Vajda, S. FTMAP: extended protein mapping with user-selected probe molecules. Nucleic Acids Res. 2012, gks441.
(21) Brenke, R.; Kozakov, D.; Chuang, G.-Y.; Beglov, D.; Hall, D.; Landon, M. R.; Mattos, C.; Vajda, S. Fragment-based identification of druggable ‘hot spots’ of proteins using Fourier domain correlation techniques. Bioinformatics 2009, 25 (5), 621-627.
(22) Zhu, K.; Borrelli, K. W.; Greenwood, J. R.; Day, T.; Abel, R.; Farid, R. S.; Harder, E. Docking covalent inhibitors: a parameter free approach to pose prediction and scoring. J. Chem. Inf. Model. 2014, 54 (7), 1932-1940.
(23) Tan, L.; Wang, J.; Tanizaki, J.; Huang, Z.; Aref, A. R.; Rusan, M.; Zhu, S.-J.; Zhang, Y.; Ercan, D.; Liao, R. G. Development of covalent inhibitors that can overcome resistance to first-generation FGFR kinase inhibitors. Proc. Natl. Acad. Sci. U.S.A. 2014, 111 (45), E4869-E4877.
(24) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The protein data bank. Nucleic Acids Res. 2000, 28 (1), 235-242.
(25) Marcotte, D. J.; Liu, Y. T.; Arduini, R. M.; Hession, C. A.; Miatkowski, K.; Wildes, C. P.; Cullen, P. F.; Hong, V.; Hopkins, B. T.; Mertsching, E. Structures of human Bruton's tyrosine kinase in active and inactive conformations suggest a mechanism of activation for TEC family kinases. Protein Sci. 2010, 19 (3), 429-439.
(26) Miyazaki, Y.; Matsunaga, S.; Tang, J.; Maeda, Y.; Nakano, M.; Philippe, R. J.; Shibahara, M.; Liu, W.; Sato, H.; Wang, L. Novel 4-amino-furo[2,3-d]pyrimidines as Tie-2 and VEGFR2 dual inhibitors. Bioorg. Med. Chem. Lett. 2005, 15 (9), 2203-2207.
(27) Sastry, G. M.; Adzhigirey, M.; Day, T.; Annabhimoju, R.; Sherman, W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 2013, 27 (3), 221-234.
(28) Schrödinger, M., LLC New York. In NY: 2009.
(29) Li, H.; Kasam, V.; Tautermann, C. S.; Seeliger, D.; Vaidehi, N. Computational Method To Identify Druggable Binding Sites That Target Protein–Protein Interactions. J. Chem. Inf. Model. 2014, 54 (5), 1391-1400.
(30) Whang, J. A.; Chang, B. Y. Bruton's tyrosine kinase inhibitors for the treatment of rheumatoid arthritis. Drug Discov. Today 2014, 19 (8), 1200-1204.
(31) Flanagan, M. E.; Abramite, J. A.; Anderson, D. P.; Aulabaugh, A.; Dahal, U. P.; Gilbert, A. M.; Li, C.; Montgomery, J.; Oppenheimer, S. R.; Ryder, T. Chemical and computational methods for the characterization of covalent reactive groups for the prospective design of irreversible inhibitors. J. Med. Chem. 2014, 57 (23), 10072-10079.
(32) London, N.; Miller, R. M.; Krishnan, S.; Uchida, K.; Irwin, J. J.; Eidam, O.; Gibold, L.; Cimermančič, P.; Bonnet, R.; Shoichet, B. K. Covalent docking of large libraries for the
discovery of chemical probes. Nat. Chem. Biol. 2014, 10 (12), 1066-1072.
(33) Cross, J. B.; Thompson, D. C.; Rai, B. K.; Baber, J. C.; Fan, K. Y.; Hu, Y.; Humblet, C. Comparison of several molecular docking programs: pose prediction and virtual screening accuracy. J. Chem. Inf. Model. 2009, 49 (6), 1455-1474.
(34) Zhang, Y.; Yang, S.; Jiao, Y.; Liu, H.; Yuan, H.; Lu, S.; Ran, T.; Yao, S.; Ke, Z.; Xu, J. An integrated virtual screening approach for VEGFR-2 inhibitors. J. Chem. Inf. Model. 2013, 53 (12), 3163-3177.
(35) Jöst, C.; Nitsche, C.; Scholz, T.; Roux, L.; Klein, C. D. Promiscuity and selectivity in covalent enzyme inhibition: a systematic study of electrophilic fragments. J. Med. Chem. 2014, 57 (18), 7590-7599.
(36) Dennis, S.; Kortvelyesi, T.; Vajda, S. Computational mapping identifies the binding sites of organic solvents on proteins. Proc. Natl. Acad. Sci. U.S.A. 2002, 99 (7), 4290-4295.
(37) Landon, M. R.; Lancia, D. R.; Yu, J.; Thiel, S. C.; Vajda, S. Identification of hot spots within druggable binding regions by computational solvent mapping of proteins. J. Med. Chem. 2007, 50 (6), 1231-1240.
(38) Kozakov, D.; Hall, D. R.; Chuang, G.-Y.; Cencic, R.; Brenke, R.; Grove, L. E.; Beglov, D.; Pelletier, J.; Whitty, A.; Vajda, S. Structural conservation of druggable hot spots in protein–protein interfaces. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (33), 13528-13533.
(39) Li, X.; Zuo, Y.; Tang, G.; Wang, Y.; Zhou, Y.; Wang, X.; Guo, T.; Xia, M.; Ding, N.; Pan, Z. Discovery of a series of 2,5-diaminopyrimidine covalent irreversible inhibitors of Bruton’s tyrosine kinase with in vivo antitumor activity. J. Med. Chem. 2014, 57 (12), 5112-5128.
(40) Oda, Y.; Renaux, B.; Bjorge, J.; Saifeddine, M.; Fujita, D. J.; Hollenberg, M. D. cSrc is a major cytosolic tyrosine kinase in vascular tissue. Can. J. Physiol. Pharmacol. 1999, 77 (8), 606-617.
(41) Musumeci, F.; Schenone, S.; Brullo, C.; Botta, M. An update on dual Src/Abl inhibitors. Future Med. Chem. 2012, 4 (6), 799-822.
(42) Singh, J.; Petter, R. C.; Kluge, A. F. Targeted covalent drugs of the kinase family. Curr. Opin. Chem. Biol. 2010, 14 (4), 475-480.
(43) Vantaggiato, C.; Formentini, I.; Bondanza, A.; Bonini, C.; Naldini, L.; Brambilla, R. ERK1 and ERK2 mitogen-activated protein kinases affect Ras-dependent cell signaling differentially. J. Biol. 2006, 5 (5), 14.
(44) Morris, E. J.; Jha, S.; Restaino, C. R.; Dayananth, P.; Zhu, H.; Cooper, A.; Carr, D.; Deng, Y.; Jin, W.; Black, S. Discovery of a novel ERK inhibitor with activity in models of acquired resistance to BRAF and MEK inhibitors. Cancer Discov. 2013, 3 (7), 742-750.
(45) Hagel, M.; Miduturu, C.; Sheets, M.; Rubin, N.; Weng, W.; Stransky, N.; Bifulco, N.; Kim, J. L.; Hodous, B.; Brooijmans, N. First selective small molecule inhibitor of FGFR4 for the treatment of hepatocellular carcinomas with an activated FGFR4 signaling pathway. Cancer Discov. 2015, 5 (4), 424-437.
(46) Bogoyevitch, M. A.; Ngoei, K. R.; Zhao, T. T.; Yeap, Y. Y.; Ng, D. C. c-Jun N-terminal kinase (JNK) signaling: recent advances and challenges. BBA-Proteins Proteom. 2010, 1804 (3), 463-475.
(47) Huang, L.; Huang, Z.; Bai, Z.; Xie, R.; Sun, L.; Lin, K. Development and strategies of VEGFR-2/KDR inhibitors. Future Med. Chem. 2012, 4 (14), 1839-1852.
(48) Zhang, Y.; Jiao, Y.; Xiong, X.; Liu, H.; Ran, T.; Xu, J.; Lu, S.; Xu, A.; Pan, J.; Qiao, X. Fragment virtual screening based on Bayesian categorization for discovering novel VEGFR-2 scaffolds. Mol. Divers. 2015, 19 (4), 895-913.
(49) DeLano, W. The PyMOL Molecular Graphics System; DeLano Scientific: San Carlos, CA. 2009.
(50) Olsson, M. H.; Søndergaard, C. R.; Rostkowski, M.; Jensen, J. H. PROPKA3: consistent treatment of internal and surface residues in empirical pKa predictions. J. Chem. Theory Comput. 2011, 7 (2), 525-537.
(51) Marino, S. M.; Gladyshev, V. N. Analysis and functional prediction of reactive cysteine residues. J. Biol. Chem. 2012, 287 (7), 4419-4425.
(52) Weerapana, E.; Wang, C.; Simon, G. M.; Richter, F.; Khare, S.; Dillon, M. B.; Bachovchin, D. A.; Mowen, K.; Baker, D.; Cravatt, B. F. Quantitative reactivity profiling predicts functional cysteines in proteomes. Nature 2010, 468 (7325), 790-795.
(53) Wissner, A.; Floyd, M. B.; Johnson, B. D.; Fraser, H.; Ingalls, C.; Nittoli, T.; Dushin, R. G.; Discafani, C.; Nilakantan, R.; Marini, J. 2-(Quinazolin-4-ylamino)-[1,4]benzoquinones as covalent-binding, irreversible inhibitors of the kinase domain of vascular endothelial growth factor receptor-2. J. Med. Chem. 2005, 48 (24), 7560-7581.
(54) Wissner, A.; Fraser, H. L.; Ingalls, C. L.; Dushin, R. G.; Floyd, M. B.; Cheung, K.; Nittoli, T.; Ravi, M. R.; Tan, X.; Loganzo, F. Dual irreversible kinase inhibitors: quinazoline-based inhibitors incorporating two independent reactive centers with each targeting different cysteine residues in the kinase domains of EGFR and VEGFR-2. Bioorg. Med. Chem. 2007, 15 (11), 3635-3648.

For Table of Contents Use Only

Identification of Covalent Binding Sites Targeting Cysteines based on Computational Approaches
Yanmin Zhang,a Danfeng Zhang,b Haozhong Tian,b Yu Jiao,b Zhihao Shi,b Ting Ran,a Haichun Liu,a Shuai Lu,a,b Anyang Xu,a Jing Pan,a Xin Qiao,a Lingfeng Yin,a Weineng Zhou,a Tao Lu,a,b,* Yadong Chena,*
