Structural Isosteres of Phosphate Groups in the Protein Data Bank

Feb 24, 2017 - We developed a computational workflow to mine the Protein Data Bank for isosteric replacements that exist in different binding site ...
Article pubs.acs.org/jcim

Yuezhou Zhang,†,⊥,▽ Alexandre Borrel,‡,§,⊥,○ Leo Ghemtio,† Leslie Regad,†,§ Gustav Boije af Gennäs,‡ Anne-Claude Camproux,§ Jari Yli-Kauhaluoma,‡ and Henri Xhaard*,†,‡

†Division of Pharmaceutical Biosciences, ‡Division of Pharmaceutical Chemistry and Technology, Faculty of Pharmacy, University of Helsinki, P.O. Box 56, FI-00014 Helsinki, Finland
§Laboratoire Molécules Thérapeutiques in silico (MTi), UMRS-973, Université Paris Diderot, Sorbonne Paris Cité, INSERM, F-75013 Paris, France

ABSTRACT: We developed a computational workflow to mine the Protein Data Bank for isosteric replacements that exist in different binding site environments but have not necessarily been identified and exploited in compound design. Taking phosphate groups as examples, the workflow was used to construct 157 data sets, each composed of a reference protein complexed with AMP, ADP, ATP, or pyrophosphate as well as other ligands. Phosphate binding sites appear to have a high hydration content and large size, resulting in U-shaped bioactive conformations recurrently found across unrelated protein families. A total of 16 413 replacements were extracted, filtered for a significant structural overlap on phosphate groups, and sorted according to their SMILES codes. In addition to the classical isosteres of phosphate, such as carboxylate, sulfone, or sulfonamide, unexpected replacements that do not conserve charge or polarity, such as aryl, aliphatic, or positively charged groups, were found.



INTRODUCTION

Synthesis of analogs or congeneric series is a fundamental process in medicinal chemistry.1 Novel compounds are sought for improved potency, enhanced selectivity, attenuated toxicity, or altered physical properties, or simply as new drug candidates.2−4 These analogs bear replacements that are often designed to be isosteric, i.e., to retain the physicochemical or topological properties of the reference compound(s). At the same time, the replacements are desired to be bioisosteric, i.e., capable of preserving an initial biological activity.5 Many factors are at play in bioisosteric replacements: the nature of the replacement, the atomistic details of the interactions between the replaced fragment and the target protein, and biological activity, the latter itself being the outcome of complex molecular recognition processes. Studying these factors simultaneously and at a large scale is to a great extent beyond our reach, since it would require enormous amounts of data for many analog series. Classical bioisosteric replacement strategies are based on similarity in physicochemical properties. Bioisosteres have been studied from the beginning of the last century, and consequently, a vast amount of knowledge is available to medicinal chemists.6

© 2017 American Chemical Society

For example, SwissBioisostere is a freely available database, currently containing 4.5 million replacements, which are automatically identified from ChEMBL as matched molecular pairs.7,8 BIOSTER is a manually curated commercial database that contains 26 000 pairs of potentially bioisosteric replacements.9 These databases reflect our current knowledge, and their contents may be affected by specific constraints. For example, the SwissBioisostere database contains only a handful of phosphate isosteres, most likely because congeneric series most often do not include phosphate substituents (regarded as a liability by medicinal chemists).

Received: August 31, 2016 Published: February 24, 2017
Journal of Chemical Information and Modeling
DOI: 10.1021/acs.jcim.6b00519 J. Chem. Inf. Model. 2017, 57, 499−516

Computational analyses play an important role in the study of bioisosteres. The isosteric nature of chemical replacements can be predicted based on electronic and steric properties, molecular topology, molecular shape, and molecular interaction fields.10,11 Preferred molecular interactions can be studied statistically, for example using the X-ray structures of protein−ligand complexes or the Cambridge Structural Database.12 In these analyses, atoms are considered either as isolated, for example in knowledge-based scoring functions,13 or as groups of several atoms.14 In other studies, ligands’ local environments have been studied at the level of complete functional groups following ligand fragmentation, for example using interaction fingerprints in the Sc-PDB-Frag protein−ligand interaction patterns database15 or pharmacophore fingerprints in the KRIPO database.16

There are several platforms that allow mining the PDB by superimposing ligand substructures and studying the molecular environments obtained, among them PSILO from the MOE environment,17 the Proasis3 environment18 from DesertSci, and Relibase19 from the Cambridge Crystallographic Data Centre. Most of these are restricted to commercial users, and they are not designed to study structural isomerism.

In an alternative strategy introduced by Kennewell and co-workers,20 the three-dimensional structures of protein−ligand complexes are structurally aligned, and the ligand substructures spatially occupying the binding site of the functional group under scrutiny are extracted. This approach enables the detection of replacements that may be variable in terms of molecular interactions (exemplified in Supporting Information 1, Figure S1). Common to all the methods based on structural data is that molecular interactions are considered without accounting for affinity changes.

Figure 1. (A) Computational workflow. (B) Organization of the data produced. The data is composed of folders, structure files (pdb format), and text files (smi, SMILES notation). Abbreviations introduced in the figure: ligand structural replacement, LSR; binding site, BS; ligand, LGD.

In this study, we continued the work of Kennewell et al. to build a fully automated data mining workflow for extracting ligand structural replacements. As a main application, we investigate the structural replacements of phosphate groups in AMP, ADP, ATP, and pyrophosphate (POP). Several factors motivated us to work with phosphate replacements: a data set of significant size can be envisioned, e.g. ATP-binding proteins;21,22 bioisosteric replacements of phosphate groups are highly desirable for medicinal chemistry programs23,24 because the hydrophobic cell membrane readily blocks the diffusion of negatively charged phosphates;25 removal by phosphatases leads to a shorter lifetime of compounds; and the synthesis protocols are laborious, with low yields and inefficient purification of compounds.26,27
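The extraction idea described above (superimpose the complexes, then keep the ligand atoms that fall in the space occupied by the reference functional group) can be illustrated with a minimal Python sketch. It is not taken from the published scripts: the function name `atoms_in_sphere`, the tuple layout for atoms, the coordinates, and the 3.0 Å radius are all hypothetical, and the structures are assumed to be already superimposed.

```python
import math
from typing import List, Tuple

Atom = Tuple[str, float, float, float]  # (atom name, x, y, z) in Å


def atoms_in_sphere(atoms: List[Atom],
                    center: Tuple[float, float, float],
                    radius: float) -> List[Atom]:
    """Return the atoms whose coordinates lie within `radius` Å of `center`."""
    return [a for a in atoms if math.dist(a[1:], center) <= radius]


# Hypothetical coordinates, already expressed in the reference frame
# (i.e. after the two protein structures have been superimposed).
phosphorus = (0.0, 0.0, 0.0)  # P atom of the reference phosphate
ligand = [("C1", 1.0, 0.5, 0.0), ("O2", 2.5, 0.0, 0.0), ("N3", 6.0, 0.0, 0.0)]

# Hypothetical radius; the actual selection criterion is a workflow parameter.
replacement = atoms_in_sphere(ligand, phosphorus, radius=3.0)
print([name for name, *_ in replacement])
```

With these made-up coordinates, the two atoms close to the phosphorus position are retained and the distant one is discarded; the retained atoms are what the text calls a ligand structural replacement.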



RESULTS AND DISCUSSION

We refer to Pi1, Pi2, and Pi3, following the annotation in the PDB files, as the α-phosphate of AMP, ADP, and ATP; the β-phosphate of ADP and ATP; and the γ-phosphate of ATP, respectively. Pi1 and Pi2 are also used for the phosphate groups of POP.

Data Collection. Data Organization. We report a streamlined computational workflow that extracts ligand fragments occupying the binding sites of reference functional groups in protein X-ray structures. All the scripts, written in the Python 2.7 programming language, are accessible through the GitHub collaborative code sharing platform at https://github.com/ABorrel/LSRs. The workflow provides the user with analysis graphs generated on-the-fly through interfacing with the R package.28 The workflow reads a set of parameters, which should allow relatively easy customization of the search criteria, running against an updated version of the PDB, or changing the reference ligands used. The workflow (Figure 1A) functions by identifying proteins bound to a reference ligand and by superimposing onto them different crystal structures of the same protein (or alternatively a close homologue or a mutated form of the protein) in complex with other ligands. Here nucleotide-binding proteins, i.e.,


Table 1. Data Curation: Number of Complexes and Ligands Collected in Each of the Datasets

| reference ligand | reference complexes in PDB | reference complexes | collected replacement-containing proteins: total (isostere-containing + empty) | collected replacement-containing proteins: total (isostere-containing) | collected replacement-containing proteins: with binding sites identical to reference protein | structural isosteres after ShaEP filter: Pi1 | structural isosteres after ShaEP filter: Pi2 | structural isosteres after ShaEP filter: Pi3 |
|---|---|---|---|---|---|---|---|---|
| AMP | 455 | 194 | 4236 | 1154 | 459 | 741 | | |
| ADP | 1430 | 510 | 11925 | 4598 | 1181 | 3210 | 2771 | |
| ATP | 851 | 390 | 11696 | 4815 | 894 | 3633 | 3150 | 2177 |
| POP | 204 | 92 | 1577 | 424 | 133 | 370 | 361 | |
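As a consistency check on Table 1, the per-site counts can be summed and compared with the 16 413 replacements reported in the text. The short Python sketch below assumes that the three Pi2 values belong to ADP, ATP, and POP and the single Pi3 value to ATP, which is how the flattened table is read here; the variable names are illustrative only.

```python
# Number of structural isosteres per phosphate site (Table 1, after ShaEP filter).
# None marks a phosphate position that does not exist in the reference ligand.
isosteres = {
    "AMP": {"Pi1": 741, "Pi2": None, "Pi3": None},
    "ADP": {"Pi1": 3210, "Pi2": 2771, "Pi3": None},
    "ATP": {"Pi1": 3633, "Pi2": 3150, "Pi3": 2177},
    "POP": {"Pi1": 370, "Pi2": 361, "Pi3": None},
}
total = sum(n for row in isosteres.values() for n in row.values() if n is not None)
print(total)  # 16413

# Isostere-containing proteins: (with binding site identical to reference, total).
identical = {"AMP": (459, 1154), "ADP": (1181, 4598),
             "ATP": (894, 4815), "POP": (133, 424)}
for ligand, (same_site, all_sites) in identical.items():
    print(f"{ligand}: {100 * same_site / all_sites:.0f}% identical binding sites")
```

The total reproduces the number of local structural replacements quoted in the text, and the per-ligand percentages give the fraction of the data for which binding site flexibility could be assessed.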

proteins bound to AMP, ADP, ATP, and POP are used as examples. The workflow is applied to extract the replacements of Pi1, Pi2, and Pi3 of these reference proteins, in total 16 413 local structural replacements for 929 different ligands (Table 1). The replacements are then sorted based on a decomposition into SMILES codes (Figure 1B), and the associated structural data, such as binding site amino acids and complete ligand atoms, are extracted. The result is a downloadable archive, composed of structure files in pdb format and organized hierarchically into folders (Supporting Information 2). These files can be easily visualized with standard software, for example PyMOL.29 The hierarchy is provided as a text file (Supporting Information 3), allowing easy retrieval of specific information. The parameters of the run presented are accessible at the end of the “main.py” script and can be found in the Experimental Section (below). Altogether, 40 examples are discussed in this manuscript. For each of them we provide a ligand interaction diagram computed using LigPlot+,30 which should allow rapid browsing through the examples of structural isosteric replacement given (Supporting Information 4). Of these examples, 32 are of high resolution, better than 2.3 Å, among which 21 have been solved at a resolution better than 2.0 Å (resolution of the 40 examples as well as characteristics of the ligands, Supporting Information 5, Table S4).

SMILES-Based Folders. We set out to build a classification system for the substructures extracted. The purpose of the classification was to facilitate the data analysis, i.e., to extract the structural replacements of most interest, and conversely to avoid parsing through many uninteresting or nearly identical examples. Examples of replacements of most interest are the ring structures with a structural fit on phosphate groups, the carbon-only replacements, the sulfur-containing replacements, and halogen-containing replacements.

In the final version of the manuscript, in order to simplify the analysis, we first set aside the very small replacements, i.e., those containing fewer than three atoms. We then extracted the structural isosteres in a mutually exclusive sequential order and placed the related data into folders (Figures 1B and 2). The folders are based on the presence in the SMILES code of (1) a ring (cycle folder, a numeral in the SMILES code); (2) a phosphorus atom (letter P); (3 and 4) the halogen atoms fluorine and chlorine as well as (5 and 6) boron and beryllium (corresponding element symbols); (7) nitro (combinations of SMILES leading to NO2); (8) sulfone (combinations leading to SO2); (9) sulfur (S); (10) amide or carbamoyl (combinations leading to CON); (11) carboxyl or ester (combinations leading to COO); and certain atoms, (12) only C; (13) only C and O; (14) only C and N; (15) only C, O, and N. The cycle folder has another level of subdivisions, i.e., the same level of subdivision as the main categories. The extraction of rings (cycle folder) is complex, and the SMILES-based identification works only if intact cycles are present within the sphere defined for the structural replacement.

During the preparation of the manuscript, we tried several methods to classify the structural replacements. This is not a trivial task since we aim to classify very short character strings, usually five to seven characters long. These strings are furthermore not robust to even small shifts of the ligands in the binding site, i.e., they can vary significantly even if the same region of the ligand is present. This made clustering methods unsuccessful for organizing the data, and we devised the data sorting method based on SMILES codes described above. During the revision of the manuscript, we furthermore attempted to classify the fragments according to the shape and electrostatic potential scores computed by ShaEP, which are readily obtained by the workflow. The final data, however, showed little possibility to segregate the molecular fragments according to these scores (Figure 3). Only phosphate replacements and carboxylate groups are identified as similar to the phosphate groups on the reference nucleotides. We thus retain the classification based on SMILES, which has the advantage of aiding the analysis and being easy to interpret.

Contents of the Run Presented. The data set is comprehensively described as Supporting Information in order to focus the study on examples of interest to medicinal chemists. The workflow identified 1143 reference proteins: 189 bound to AMP, 488 to ADP, 374 to ATP, and 92 to POP. These reference proteins were divided into clusters that share less than 30% sequence identity, of which 70 clusters have three or more members (Supporting Information 6, Figure S2). The names of the folders containing the data associated with reference proteins were appended with both the cluster number and an identifier relating to cellular function, based on the keywords found in the PDB files, or, for small clusters, with the label “out”.

The most populated clusters are for kinases and heat shock proteins, reflecting the massive crystallographic and compound discovery work for these targets. The whole-body superimpositions of reference proteins with their close homologues resulted in an average root-mean-square deviation (RMSD) of the Cα carbon coordinates of about 1.5 Å, most of the data being below 3 Å (Supporting Information 6, Figure S3), indicating success in the retrieval and processing of homologous protein structures. The flexibility of the binding sites was assessed using either all-atom RMSDs or the longest Cα−Cα carbon movement (Supporting Information 7, Figure S4). In order to calculate an RMSD, equivalence between pairs of amino acids was defined based on their sequence numbering, and binding site flexibility could be studied only when exact matches occurred for both amino acid types and numbers (i.e., only for a fraction of the data set, ∼20−50%, Table 1). While large loop movements were found to take place, the binding sites typically do not undergo extensive changes. For kinases, movements are well identified at the glycine-rich P-loop.31 Metals appear, not


surprisingly, to be prevalent at phosphate binding sites (Supporting Information 8, Figure S5, Table S5). Approximately 55−71% of the reference proteins bound to ADP, ATP, and POP also bind at least one metal atom at the same site; magnesium is the metal most often found, in many cases coordinated by the phosphate oxygen atoms. For AMP, the neighboring metals are different (zinc and sodium being the most prevalent), and there are fewer metals overall, as only ∼25% of the reference complexes contain at least one metal at the binding site. A brief analysis of the conformation of AMP, ADP, and ATP in the reference complexes suggests the β-furanose form as the predominant ribose anomer (Supporting Information 9, Figure S6). In a few complexes, atomic distances below 3.3 Å suggest an intramolecular hydrogen bond between the ribose O-3′ atom and the closest phosphate oxygen atom (Supporting Information 9, Figure S7).

Considerations about Structural Isosteres. The data relative to the examples presented in this section, replacements that are not bioisosteric, is given in Table 2. The examples of SARs discussed are presented in Table 3.

Characteristics of Phosphate Binding Regions. In most cases, ligands are found to be anchored at the adenine-ribose sites, as can be inferred from the higher number of replacements for Pi1 compared to Pi2 or Pi3 (Figure 2). Phosphate groups generally bind in a solvent-exposed region, which reflects their involvement in catalytic reactions. For example, the transfer of a phosphoryl group in kinases requires, in addition to the substrate, binding of a cosubstrate, cofactors, as well as metal cations. Furthermore, phosphate groups are surrounded by a hydration cage that is not likely to be fully desolvated upon binding, due to an associated enthalpic penalty. Considering these two factors, it is not surprising that binding data related to congeneric series (as discussed for several examples below) suggest congeners to be relatively susceptible to substitutions in the region exposed to solvent. Optimization of potency through the solvent-exposed parts of small molecules has become an interesting challenge in the discovery of active compounds.32−34

Structural Replacement of a Phosphate That Displaces Metals. It is often thought that isosteric structural replacements conserve molecular interactions where possible. While this is generally true, we found many examples where, for instance, water molecules are displaced by ligand fragments or intervene as bridges to compensate for the lost hydrogen bonds (see below). Displacing unfavorable water molecules has been recognized as a strategy to gain potency and is profoundly linked to an entropic gain upon ligand binding.35 This study reveals cases that have received less attention, in which a metal cation is displaced by a fragment of the ligand. In human cyclin dependent kinase 2 (CDK2) (example 1, Figure 4A−C), the ligand’s primary amino group occupies the binding site of Mg2+ (N to Mg2+ distance of 0.6 Å across the complexes after the whole-body superimposition). The compound capable of displacing the metal with a −CH2NH3+ tail is about 20-fold more potent toward CDK2 (lower IC50): compare from Table 4 in ref 36 the compound 4g (IC50 of 0.07 μM, −CH2NH3+) with the compound 4f (IC50 of 1.37 μM, −H). The favorable replacement of a metal cation could be driven by formation of a salt bridge between the primary amino tail and Asp145 (carboxylate O···N shortest distance of 2.6 Å).

Structural Replacements of a Phosphate by Protein Amino Acids. The design of phosphate mimics is usually driven by isosteric considerations, which do not take into account binding site rearrangements. In E. coli biotin carboxylase B (example 2, Figure 4D−F), upon binding of a benzimidazole inhibitor to the ATP site, a large conformational change takes place in its glycine-rich loop (amino acids 163−168, sequence “GGGGRG”). The amide backbone of the loop at Gly164 occupies the Pi2 site of ATP and engages in π−π stacking on the ortho-chlorobenzyl group of the bound inhibitor. The

Figure 2. Distribution of the data set into categories assigned based on the SMILES codes of the structural isosteres. (A) Pi1 of AMP, ADP, or ATP or any Pi of POP. (B) Pi2 of AMP or ADP or any Pi of POP. (C) Pi3 of ATP.
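The mutually exclusive, sequential sorting into SMILES-based folders that underlies Figure 2 can be sketched in Python as follows. This is a simplified illustration, not the published implementation: the function name `classify`, the labels "small", "other", "C", and "C+O", and the substring patterns used to recognize nitro, sulfone, amide, and carboxyl groups are assumptions, and real SMILES strings can encode the same group in ways these patterns miss.

```python
import re

# Element symbols recognized by this sketch (aromatic lowercase forms included).
ATOM = re.compile(r"Cl|Br|Be|[BCNOPSFI]|[cnops]")

# Illustrative substring patterns only.
NITRO = ("N(=O)=O", "[N+](=O)[O-]")
SULFONE = ("S(=O)(=O)", "O=S(=O)")
AMIDE = ("C(=O)N", "NC(=O)", "NC=O", "O=CN")
CARBOXYL = ("C(=O)O", "OC(=O)", "OC=O", "O=CO")


def classify(smiles: str) -> str:
    """Assign a fragment SMILES to the first folder whose rule it satisfies."""
    atoms = [a.capitalize() for a in ATOM.findall(smiles)]
    if len(atoms) < 3:                      # very small replacements are set aside
        return "small"
    if re.search(r"\d", smiles):            # a numeral signals a ring closure
        return "cycle"
    for element in ("P", "F", "Cl", "B", "Be"):
        if element in atoms:
            return element
    if any(p in smiles for p in NITRO):
        return "NO2"
    if any(p in smiles for p in SULFONE):
        return "SO2"
    if "S" in atoms:
        return "S"
    if any(p in smiles for p in AMIDE):
        return "CON"
    if any(p in smiles for p in CARBOXYL):
        return "COO"
    kinds = set(atoms)
    for folder, allowed in (("C", {"C"}), ("C+O", {"C", "O"}),
                            ("C+N", {"C", "N"}), ("C+O+N", {"C", "O", "N"})):
        if kinds == allowed:
            return folder
    return "other"


for fragment in ("OP(=O)(O)O", "CC(=O)O", "CS(=O)(=O)C", "c1ccccc1", "CCN"):
    print(fragment, "->", classify(fragment))
```

Because the rules are tested in a fixed order and the first match wins, each fragment lands in exactly one folder, which is the property the text relies on when counting categories.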


Figure 3. Segregation of the structural replacements according to their shape and electrostatic similarity scores compared to the reference phosphate group computed by ShaEP. (A, B) Boxplot of the electrostatic potential (ESP) overlap score for (A) noncyclic and (B) cyclic fragments. (C, D) Boxplots of shape overlap score for (C) noncyclic or (D) cyclic fragments. (E, F) Scatterplots of these scores.

ortho-chlorobenzyl docks to a pocket lined by the side chains of Met169, Tyr199 (phenolic O···Cl distance of 3.6 Å), and Lys159 (N···Cl distance 4.0 Å) and occupies the site of the Lys116 side chain (primary ε-amino group displaced by about 6.0 Å).

Replacements of a Phosphate That Take Advantage of Intramolecular Stabilization. Nucleotide binding sites are large, enabling in turn binding of large ligands by intramolecular stacking, which leads to the U-shaped conformations. In this


Table 2. Phosphate Replacements That Are Not True Bioisosteric Replacements: Examples 1−10^a

| no. | figure | target protein | target PDB code | target ligand | folder | cluster | reference protein | reference ligand | replacement and comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4A−C | Homo sapiens, CDK2 | 3ULI | 1N3 | cycle/CON | KS-5 | 4EOM | ATP | replacement of metal (Mg2+) |
| 2 | 4D−F | Escherichia coli, biotin carboxylase | 3JZI | JZL | C+N | LG-22 | 1DV2 | ATP | ligand is replaced by binding site; mutated reference protein E288K |
| 3 | 5A | Homo sapiens, ERK2 | 4G6O | E28 | cycle/F | KS-5 | 4GVA | ADP | intramolecular stabilization U-shape (fluorine···CH) |
| 4 | 5B | Homo sapiens, JNK3 | 3FV8 | JK3 | Br | KS-5 | 4KKE | AMP | intramolecular stabilization U-shape (bromine···CH) |
| 5 | 5C | Bos taurus, PKA | 2UZW | SS4 | cycle/C+O+N | KS-14 | 1JBP | ADP | intramolecular stabilization U-shape (π−π); mutated LSR protein N286D |
| 6 | 5D | Bos taurus, PKA | 2F7E | 2EA | cycle/C+O+N | KS-14 | 1JBP | ADP | intramolecular stabilization U-shape (π−π) |
| 7 | 5E | Homo sapiens, PDK1 | 3RWP | ABQ | F | KS-14 | 1H1W | ATP | intramolecular stabilization U-shape (π−π) |
| 8 | 5F | Homo sapiens, PKAB3 | 3L9M | L9M | S | KS-3 | 1L3R | ADP | intramolecular stabilization U-shape (π-sulfur); mutated LSR protein V123A, L173M, Q181K |
| 9 | 5G | Burkholderia cenocepacia, phenylacetate-CoA ligase | 2Y4O | DLL | COO | SYout | 4RVO | ADP | ester; intramolecular stabilization U-shape (π−π, T-shaped) |
| 10 | 5H | Saccharomyces cerevisiae, HSP90 | 2WEQ | GDM | C+O+N | HP16 | 1AMW | ADP | intramolecular stabilization U-shape (covalent); mutated LSR protein L34I, I35V |

^a Information about examples 1−10 and a pathway to locate them in the dataset hierarchy.

Table 3. Specific Examples of SARs that Illustrate Replacements That Are Not Bioisosteric
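The potency differences quoted in the text for these SAR examples are simple IC50 ratios and can be reproduced with a few lines of Python. The sketch below uses only IC50 values that appear in the text; the helper name `fold_change` and the dictionary layout are illustrative choices.

```python
def fold_change(ic50_reference: float, ic50_analog: float) -> float:
    """Ratio of the analog's IC50 to the reference IC50 (same units).

    A value above 1 means the analog has a higher IC50, i.e. is less potent.
    """
    return ic50_analog / ic50_reference


# IC50 values as quoted in the text (units must match within each pair).
comparisons = {
    "CDK2, 4g vs 4f (uM)": (0.07, 1.37),
    "JNK3, 4g vs 4a (uM)": (0.16, 9.9),
    "PKA, 1b vs 1a (nM)": (14, 690),
    "PKAB3, 1 vs 22 (nM)": (3.2, 1600),
}
for label, (reference, analog) in comparisons.items():
    print(f"{label}: {fold_change(reference, analog):.1f}-fold")
```

Writing the ratio as analog over reference keeps the direction explicit: every pair above gives a value greater than 1, meaning the analog lacking the stabilizing group has the higher IC50 and is therefore the weaker binder.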

study, intramolecularly stacked bioactive conformations were found to transcend protein families (Figure 5) as a general mechanism to accommodate phosphate structural replacements. There is nonetheless a surprising diversity in how the functional mimicry of nucleotides takes place, driven by energetic mechanisms that combine hydrophobic stacking and other molecular interactions: fluorine and bromine to CH contacts (examples 3 and 4, Figure 5A and B), parallel π−π stacking


Figure 4. Example of a ligand structurally replacing a metal ion (A−C) and an example where the binding site loop occupies the position of the phosphate groups (D−F). (A−C) Homo sapiens CDK2, ligand 1N3 (example 1, PDB codes 3ULI and 4EOM). (D−F) Escherichia coli biotin carboxylase, ligand JZL (example 2, 3JZI and 1DV2). (A, D) Reference; (B, E) ligands containing the structural isosteres (green); (C, F) close-up view of the superimposition. In this and following figures, the ligands are named according to their PDB 3-letter codes.

(examples 5 and 6, Figure 5C and D), π to sulfur (example 8, Figure 5F), T-shaped π−π stacking (example 9, Figure 5G), and covalent constraints (examples 7 and 10, Figure 5E and H). The position of the U-shaped ligands relative to the reference nucleotides differs, with the “gap” located for example on phosphate groups (example 3, Figure 5A) or on the ribose groups (example 8, Figure 5F). The ability of bioactive conformations to adopt a U-shape may be a general property of large binding sites; for example, a U-shape is the known bioactive conformation of suvorexant bound to the orexin 2 receptor, a protein that endogenously binds peptides.37

Corroborating the importance of intramolecular stacking for binding potency, congeners bearing destabilized (or not well stabilized) U-shapes have lower binding affinities. For example, in a series of piperazine-derived inhibitors of c-Jun N-terminal kinase 3 (JNK3; example 4, Figure 5B), destabilization of the interaction between a bromine group and substituents at a furan ring led to an up to 100-fold loss of potency (increase in IC50); compare, Table 1 in ref 38, compound 4g (IC50 of 0.16 μM, propargyl) with compound 4e (0.33 μM, allyl), compound 1 (1.1 μM, ethyl), and compound 4a (9.9 μM, no substituent). Similarly, in protein kinase A (example 6, Figure 5D), replacing an indole moiety with, e.g., phenyl, which decreases the stacking contact surface, led to a 50-fold loss of potency; compare, Table 1 in ref 39, compound 1b (IC50 of 14 nM, indole) with compound 1a (IC50 of 690 nM, phenyl) (see also refs 39 and 40). For another set of compounds targeted at the protein kinase AB3 (PKAB3) (example 8, Figure 5F), changing the thiadiazole ring to a nonsulfur-containing five-membered ring increased the IC50s of congeners by 4- to 500-fold; compare, Table 1 in ref 41, compound 1 (IC50 of 3.2 nM, thiadiazole) with compounds 15 (17 nM), 22 (1600 nM), and 26 (85 nM). This is consistent with recent work highlighting the role of sulfur interactions in compound design.42

Phosphate, Carboxyl, Esters, Sulfones, and Sulfonamides. From this point on, we present structural isosteres according to their chemical nature, which to a large extent, but not completely, reflects the data folders to which they have been automatically assigned (see above and Figure 1B). This is mainly due to the choice made to allow only a single instance of each replacement in order not to overload the data set. In addition, there is ambiguity introduced by the use of fragments: for example, when only some atoms of an ester functional group are retrieved, the ester −O-alkyl group may not be included in the selection sphere, and a carboxyl group may be detected instead. We thus decided to group both in the same folder. Bioisosteres of phosphate are often similar to phosphate in terms of size, charge, hydrogen bonding, and geometry. Typical phosphate bioisosteric replacements are phosphonates, carboxylates and malonates,43 sulfamates,44 squaramides and squaric acids,45 and boron-containing fragments46 as well as phosphorothioates.47 The data relative to the examples presented in this section is given in Table 4.

Phosphorus-containing functionalities (P and cycle/P folders) are the most abundant. A majority of them cannot be regarded as actual structural replacements but are simply phosphate groups. Phosphate isosteres are easily obtained by replacing the bridging oxygen by a methylene group, resulting in the phosphonate moiety; see for example bacterial biotin carboxylase (example 11). Another modified phosphate group is the endogenously present cyclic AMP; see for example human phosphodiesterase 4D (example 12).

Carboxylate functionalities (COO and cycle/COO folders) are widely used phosphate isosteres (Figure 6A−F) that take advantage of the conservation of the negative charge. In Mycobacterium tuberculosis pantothenate synthetase (example 13, Figure 6A−C), a carboxyl group binds to the Pi2 site of ATP. Nearby, on the same side of the molecule, a sulfonamide group binds to the Pi1 site, composing a larger fragment that mimics both phosphate groups simultaneously. In Bradyrhizobium japonicum malonamidase E2 (example 14, Figure 6D−F), a very good match of the carboxylate oxygen atoms to the phosphate oxygen atoms is seen. The ligand in example 14 also bears an amide group that seemingly matches the phosphate oxygen atoms well; nonetheless, the latter does not qualify as a genuine amide replacement, since the binding site is mutated (S131A), leading to different interactions than the parent phosphate. Ester functionalities cannot be separated from carboxyl groups on


of the structures). Local hydrogen-bonding geometries appear different, with Pi1 phosphate oxygen potentially hydrogen bonding to the Nε of His311 (2.6 Å) as well as to Ser118 (2.7 Å), but with longer distances (3.4 and 3.7 Å, respectively) present in the N-acylsulfonamide containing complex. Sulfonate is found for example in Escherichia coli glutamine PRPP amidotransferase (example 18), with an overall good overlay to the reference phosphate (longest O···O distance 1.8 Å). Sulfonate can, like phosphate, both accept hydrogen bonds and act as a negatively charged replacement. Boron functionalities were also identified (cycle/B folders), but in small numbers. In phosphodiesterase PDE4 (example 19, Figure 6G−I), benzoxaborole binds at the AMP phosphate site, participating in the hexadentate coordination of two metals (Zn2+ and Mg2+ cations in the benzoxaborole-bound protein and two Mg2+ cations in the AMP-bound protein). The two oxygen atoms of benzoxaborole and an activated water molecule are replacing the three oxygen atoms of the AMP phosphate group (in the superimposed complexes benzoxaborole to phosphate O···O closest approach distances are 0.4, 1.0, and 1.1 Å away; the boron atom is closest to the phosphorus atom, 0.7 Å away). Some classical phosphate isosteres are not found in the data set, e.g. 1H-tetrazole or squaric acid. This is due to the fact that in order to be identified, protein complexes need to have been crystallized both with the considered functional groups and in complex with one of the references AMP/ADP/ATP or POP. There are less than 20 compounds with a tetrazole functionality in the PDB to date, and similarly very few examples of squaramides (see e.g. PDB code 4YFK for one example). Previously Poorly Recognized Phosphate Isosteres. The data relative to the examples presented in this section is given in Table 5. The examples of SARs discussed are presented in Table 6. Uncharged and Relatively Apolar Rings. 
Rings are of key interest to medicinal chemists since they can be used as scaffolding groups, and we have placed them into a special folder (cycle/...) itself subdivided into the main categories. The rings stabilized through intramolecular interactions have been presented above (Figure 5, examples 3−10). Well-positioned rings with respect to protein amino acids are also found binding at phosphate sites (Figure 7, examples 20−21 and 23−24). These four examples are from unrelated proteins (kinases PDK1 and CDK2, HSP90, and phosphodiesterase 4D), illustrating a novel type of phosphate structural replacement that transcend protein families. In human PDK1 ATP-binding site (example 20, Figure 7A), the inhibitor’s ring fits almost perfectly on top of the phosphate. The benzyl group is well-positioned in a pocket formed by Val96, Gly89, Val96, Lys117 (side chain), Gly91, and Ser94. Comparing the binding mode of the PDK1 inhibitor’s benzyl moiety with the binding mode of the Pi1 of ATP, two potential hydrogen bonds are lost: phosphate oxygens to the side-chain hydroxyl of Ser94 (2.6 Å) and to the side chain nitrogen of Lys111 (2.9 Å), this latter shifting away from the benzyl group of the inhibitor (closest approach 3.9 Å). Congeners having linkers of different lengths have decreased activities: A two-carbon linker leads to a 100-fold decrease in IC50, whereas a shorter linker leads to a 6fold drop in IC50; compare, Table 1 in ref 48, compound 8f (IC50 of 0.8 μM, benzyl, optimal linker) with compound 8r (IC50 > 10 μM, two-carbon linker) and compound 8g (IC50 of 5.2 μM, phenyl). In human CDK2 (example 21, Figure 7B), the ligand’s difluorophenyl group is near the Pi1 of ADP, whereby the oxygen

Figure 5. Selected examples where the ligand is adopting a U-shape conformation. (A) Homo sapiens ERK2 kinase, ligand E28 (example 3, 4G6O and 4GVA); (B) Homo sapiens JNK3 kinase, ligand JK3 (example 4, 3FV8 and 4KKE); (C) Bos taurus PKA kinase, ligand SS4 (example 5, 2UZW and 1JBP); (D) Bos taurus|Homo sapiens PKA, ligand 2EA (example 6, 2F7E and 1JBP); (E) Homo sapiens PDK1, ligand ABQ (example 7, 3RWP and 1H1W); (F) Homo sapiens PKAB3, ligand L9M (example 8, 3L9M and 1L3R); (G) Burkholderia cenocepacia PAAK2, ligand DLL (example 9, 2Y4O and 4RVO); (H) Saccharomyces cerevisiae Hsp90, ligand GDM (example 10, 2WEQ and 1AMW).

the basis of their SMILES definitions (and thus they are assigned to the COO folder). In bacterial phenylacetate-CoA ligase (example 15) an ester carbonyl participates in the coordination of a K+ ion, closest approach 2.8 Å (equivalent distance in the reference protein, Pi2 phosphate O···K+, 2.5 Å). The data set also contains, for example, a lactone that is found in human HSP90 (example 16). Sulfonate, sulfonamides, and acylsulfonamides (SO2 and cycle/SO2 folders) are phosphate isosteres that take advantage of the well-positioned oxygen atoms along a tetrahedral shape. Sulfonamide in Mycobacterium tuberculosis pantothenate synthetase (example 13, Figure 6A−C) displays an excellent structural overlay of the Pi1 oxygen of ATP. In human serine carboxypeptidase (example 17), the N-acylsulfonamide nitrogen atom is located at a site equivalent to that of the Pi1 phosphate oxygen of AMP (1.1 Å away upon whole-body superimposition

DOI: 10.1021/acs.jcim.6b00519 J. Chem. Inf. Model. 2017, 57, 499−516

Article

Journal of Chemical Information and Modeling

Table 4. Classical Phosphate Isosteres: Examples 11−20

| no. | figure | target protein | target PDB code | target ligand | folder | cluster | reference protein | reference ligand | replacement and comments |
|---|---|---|---|---|---|---|---|---|---|
| 11 | | Pseudomonas aeruginosa, biotin carboxylase | 2VQD | AP2 | P | LG-22 | 1DV2 | ATP | phosphonate |
| 12 | | Homo sapiens, PDE4D | 2PW3 | CMP | P | HD-55 | 1TB7 | AMP | cyclic AMP; mutated LSR protein D201N |
| 13 | 6A−C | Mycobacterium tuberculosis, pantothenate synthetase | 4MUF | 2DJ | COO | SY-43 | 4G5Y | ATP | carboxylate; mutated LSR protein T2A, E77G |
| 14 | 6D−F | Bradyrhizobium japonicum, malonamidase E2 | 1O9O | MLM | CON | OT-out | 1OCM | POP | carboxylate; mutated LSR protein S131A |
| 15 | | Burkholderia cenocepacia, PAAK2 | 2Y4O | DLL | COO | SY-out | 4RVO | ADP | ester |
| 16 | | Homo sapiens, HSP90 | 2QFO | A51 | COO | HP-16 | 1BYQ | ADP | lactone |
| 17 | | Homo sapiens, serine carboxypeptidase | 3TLB | DSZ | SO2 | HD-62 | 3TLZ | AMP | N-acylsulfonamide |
| 18 | | Escherichia coli, glutamine PRPP amidotransferase | 1ECF | PIN | SO2 | TF-out | 1ECJ | AMP | sulfonate |
| 19 | 6H−I | Homo sapiens, PDE4B | 3O0J | 3OJ | cycle/B | HD-55 | 1ROR | AMP | 2,1-benzoxaborole |

Figure 6. Classical phosphate isosteres. (A−C) Mycobacterium tuberculosis pantothenate synthetase bound to 2DJ (example 13, 4MUF and 4G5Y). (D−F) Bradyrhizobium japonicum malonamidase E2 mutant S131A bound to MLM (example 14, 1O9O and 1OCM). (G−I) Homo sapiens PDE4B bound to 3OJ (example 19, 3O0J and 1ROR). (A, D, G) Structural isostere-containing proteins; (B, E, H) reference proteins; (C, F, I) superimposed ligands.

interaction network coordinating the Mg2+ cation intact (see example 22); compare, Table 1e in ref 49, compound 90 with compounds 51−64. In particular, an analog bearing an acyl linker and a phenyl group instead of difluorophenyl leads to about 1000-fold improved activity; compare, Table 1e in ref 49, compound 90 with compound 57 (IC50 of 0.13 μM, acyl linker and phenyl R-group); or a shorter R group, such as the one in compound 61 (IC50 of 4.8 μM, −NH2 following the acyl linker, i.e. amide). In phosphodiesterase 4B (example 23, Figure 7C), the ligand’s dichloropyridine group binds close to the Pi1 site of AMP

atoms of the phosphate group lie almost within the plane of the difluorophenyl ring. In the ADP-bound reference protein, the phosphate groups and the side-chain carboxylate of Asp145 coordinate Mg2+; furthermore, Lys51 is optimally placed to interact simultaneously with both phosphate groups. This is a good example where crystallographic data may mislead structural studies, since the inhibitor’s IC50 is >100 μM (see compound 90 in Supporting Information Table S6 in ref 49). The low affinity may be due to the ligand disturbing the Asp145-mediated interaction network. Potent analogs with a shortened ligand (carbamoyl linker to acyl) were found, and they leave the


Table 5. Unrecognized Phosphate Isosteres: Examples 21−40

| no. | figure | target protein | target PDB code | target ligand | folder | cluster | reference protein | reference ligand | replacement and comments |
|---|---|---|---|---|---|---|---|---|---|
| 20 | 7A | Homo sapiens, PDK1 | 3RCJ | 3RC | cycle/only C | KS-14 | 1H1W | ATP | benzyl; mutated LSR protein Y288G, Q292A |
| 21 | 7B | Homo sapiens, CDK2 | 3R9H | Z67 | cycle/only C | OT-5 | 3NIZ | ADP | difluorophenyl |
| 22 | | Homo sapiens, CDK2 | 3RAK | 03Z | only C | KS-5 | 4I3Z | ADP | partial phenyl replacement |
| 23 | 7C | Homo sapiens, phosphodiesterase 4B | 1XMU | ROF | cycle/Cl | HD-55 | 1TB7 | AMP | dichloropyridine |
| 24 | 7D | Homo sapiens, HSP90 | 2CCU | 2D9 | only C | HP-16 | 1BYQ | ADP | aryl, partial |
| 25 | 8A | Homo sapiens, CDK2 | 2UZO | C62 | CON | KS-5 | 4I3Z | ADP | 2,4-dioxo-1,3-thiazolidine |
| 26 | | Homo sapiens, CDK2 | 2UZN | C96 | C+O+N | KS-5 | 4I3Z | ADP | 2-imino-4-oxo-1,3-thiazolidine |
| 27 | 8B | Oryctolagus cuniculus, glycogen phosphorylase b | 3BD6 | RDD | cycle/C+O+N | PL-out | 1FA9 | AMP | cyanuric acid (1,3,5-triazinane-2,4,6-trione) |
| 28 | 8C | Influenza A virus, endonuclease | 4MK5 | 28A | cycle/C+O+N | OT-out | 3HW5 | AMP | 3-hydroxypyridin-2(5H)-one; mutated LSR protein I201V |
| 29 | 8D | Homo sapiens, CDK2 | 3FZ1 | B98 | CON | KS-5 | 1FIN | ATP | 3-(aminomethyl)-[1,4]diazepin-5-one |
| 30 | | Saccharomyces cerevisiae, HSP90 | 2VWC | BC2 | CON | AT-16 | 2WEP | ADP | amide |
| 31 | | Homo sapiens, HSP90 | 3HHU | 819 | CON | HP-16 | 2XK2 | ADP | carbamate |
| 32 | | Homo sapiens, CDK2 | 2B55 | D31 | CON | KS-out | 4EOO | ATP | amide; intramolecular stabilization, U-shape |
| 33 | | Homo sapiens, CDK2 | 3QTX | X43 | C+O+N | KS-5 | 4I3Z | ADP | nitro |
| 34 | | Homo sapiens, HSP90 | 4BQJ | XKL | C+O+N | HP-16 | 2XK2 | ADP | nitro |
| 35 | 9A | Ovis aries, 6-phosphogluconate dehydrogenase | 1PGO | NDP | cycle/C+N | OT-out | 1PGN | POP | dihydronicotinamide ring (NADH) |
| 36 | 9B | Homo sapiens, HSC70 | 4H5W | BET | COO | HD-2 | 1BA0 | POP | quaternary amine |
| 37 | 9C | Homo sapiens, farnesyl diphosphate synthase | 4P0W | 1XH | cycle/only C | TF-out | 4H5D | POP | sesquiterpenoid with a rearranged drimane skeleton |
| 38 | 9D | Homo sapiens, PFKFB3 | 3QPV | FDP | cycle/C+O | PT-out | 3QPU | POP | D-fructofuranose |
| 39 | 9E | Staphylococcus aureus, sensor domain of BlaR1 | 3Q82 | MER | cycle/S | OT-out | 1XA1 | POP | carboxylate, pyrrolidine scaffolding unit |
| 40 | 9F | Dictyostelium discoideum, myosin motor domain | 1D1A | DAE | F | OT-4 | 1FMW | ATP | beryllium trifluoride; mutated LSR protein Y312C, R761N, Q760P |

Polar, Potentially Charged, Rings. Rings that function as phosphate mimics are often polar (examples 25−29, Figure 8). Three of the examples (examples 25, 27, and 28) can be regarded as acidic, thus mimicking phosphate groups through their negative charges. In human CDK2 (example 25, Figure 8A), a 2,4-dioxo-1,3-thiazolidine group is found to bind at ADP’s Pi1 site. The ring is likely to be deprotonated, i.e. in its ionic form, since it is in the vicinity of the carboxylate group of Asp145 (2.6 Å). The interaction network is actually complex as it also involves the side chain Nε of Lys33 (1.9 Å from Asp145 and 2.6 Å away from the ligand’s dioxothiazolidine group). Modifying the dioxothiazolidine moiety by replacing the trans-oxygen with NH (example 26) leads to a 100-fold increase in affinity; compare, Table 1 in ref 51, compound 14 (IC50 of 27 μM, dioxothiazolidine, H at R2) as well as compound 13 (IC50 of 0.03 μM, iminooxothiazolidine, methyl at R2) with compound 1 (IC50 of 0.18 μM, iminooxothiazolidine, H at R2). An explanation for the higher activity could lie in the iminooxothiazolidine group, which forms parallel hydrogen bonds with both Asp145 oxygen atoms and has one additional interaction with the Asn132 side chain (3.5 Å); Lys33 is pushed away in example 26. In rabbit glycogen phosphorylase b (example 27, Figure 8B), the 1-β-D-ribofuranosyl derivative contains a cyanuric acid

(suboptimal overlay with a 2.5 Å shift of the plane of the ring from the phosphorus atom). One of AMP’s phosphate oxygen atoms interacts with Mg2+ (2.1 Å) and with Zn2+ (2.2 Å), mediating an interaction with Asp201; otherwise the phosphate is surrounded by water molecules. The dichloropyridine ring is well accommodated, forming polar interactions with two water molecules through its nitrogen atom (2.9 and 3.1 Å away) and participating in a hydrophobic cluster together with Leu393 and Met347. Two water molecules occupy a location equivalent to that of the phosphate oxygen atoms. In human HSP90 protein (example 24, Figure 7D), a benzylic group substituted with a sulfone group binds to a site close to the Pi2 phosphate binding site of ADP (closest approach after protein superimposition 0.8 Å). In a congeneric series, addition of the benzylic group leads to a 3-fold improvement in affinity; compare in ref 50 compound 6 (IC50 of 0.74 μM, benzylic sulfone) and compound 1 (IC50 of 2.0 μM, no substituent). In particular, the benzylic group displaces Asp54, introducing the formation of salt bridges with the ligand’s 4′N atom50 and possibly with Lys58. In addition, the Mg2+ cation coordinated by the phosphate groups appears to be displaced by the ligand’s benzylic group. The sulfone moiety forms a new hydrogen bond outside of the replacement area.

Table 6. Selected SARs for Unrecognized Phosphate Isosteres

moiety that binds at AMP’s Pi1 site. In the AMP-bound form, there are no metals, and its two phosphate oxygen atoms face the well-hydrated binding pocket, where they bind to Arg309 and Arg310 (Arg309 P···O distance of 2.7 Å and Arg310 P···O distance of 3.3 Å) as well as through a water-mediated contact to Arg242. Cyanuric acid has an equivalent role in example 27. One of its nitrogen atoms hydrogen bonds simultaneously to both Arg309 and Arg310 guanidinium nitrogen atoms (N···N 3.1 and 3.2 Å, slightly out of plane; this interaction is charge-reinforced through deprotonation). The acyl group is at a hydrogen-bonding distance from the guanidinium nitrogen atoms of Arg310 (2.9 Å) and Arg242 (3.1 Å).

In influenza A viral endonuclease (example 28, Figure 8C) the catechol moiety of the hydroxypyridinone ligand acts through coordination of a metal cation, i.e. Mn2+. The ligand is likely in its ionized form, in analogy to catechol.52 This is also the case for eight other structures in the same study.53 See, e.g., Figure 4 in ref 53, compound 5 (IC50 of 0.45 μM). In human CDK2 (example 29, Figure 8D), the fused (3R)-3-(aminomethyl)-1,4-diazepin-5-one moiety of ligand B98 acts as a scaffold. When compared to the ATP-bound complex, a large conformational difference can be found, placing Glu12−Thr14 on top of the α- and β-phosphate groups, akin to example 2 above. The acyl group of B98 binds near the Pi1 site, at a close distance from Asp145 (carboxyl to carbonyl O···O distance 2.7


oxygen atoms, this is not the case for the nitrogen atom, which is most often (but not always) facing the solvent. Many examples are found for HSP90 inhibitors, as can be seen among the examples already shown, Mycobacterium tuberculosis pantothenate synthetase (example 13, Figure 6A) and human HSP90 (example 24, Figure 7D). In yeast HSP90 bound to a cyclic ansamycin inhibitor (example 30), the amide structural isostere binds to the Pi1 site of ATP, next to the 1,4-benzoquinone moiety occupying Pi2. The structural isostere retains the interaction between the amide carbonyl and Phe124 main chain (N···O distance of 2.8 Å, NH main chain of Phe124 to carbonyl oxygen atom, 3.2 Å P···O phosphate oxygen). The CON folder also contains other functional groups, such as carbamate in the case of HSP90 (example 31). In human CDK2 (example 32), the amide linker acts as a structural isostere likely by another mechanism, i.e. through intramolecular stabilization of the ligand core (stabilizing a U-shape), since the intramolecular distance between the amide N atom and an acyl group is 2.9 Å (spaced by four intramolecular bonds). The ligand in this example is shorter and the binding site in the inhibitor-bound form is constricted: the Pi1 and Pi2 sites of ATP are in particular occupied by Gly13. Nitro-containing Compounds. Nitro compounds are identified by their own SMILES category (NO2). Akin to amides they often face the solvent, but there are only a few instances of nitro groups in the vicinity of phosphate groups. In human CDK2 (example 33), the nitro moiety of an inhibitor binds in between the two ADP phosphate groups. In the ADP-bound complex the Lys33 side chain atom N of NH3+ is approximately equidistant from both Pi1 and Pi2 phosphate oxygen atoms (2.8 and 3.3 Å). The inhibitor bound at the Pi1 site shows a longer distance of O atoms to N (NH3+), about 4−4.5 Å.
In terms of affinity, the nitro compounds are among the most active ones in the SAR series of Table 1 in ref 49; compare compound 51 (IC50 of 0.02 μM, o-NO2) with compound 54 (IC50 of 0.07 μM, m-NO2) and compound 57 (IC50 of 0.13 μM, phenyl). In other series NO2 activities are also in the same range as the other polar groups, such as NH2 or F, in comparison to the nonsubstituted phenyl group; for example see Table 1 in ref 49 and compare compound 33 (IC50 of 2.2 μM, p-NO2) with compound 32 (IC50 of 1.6 μM, p-F) and with compound 34 (IC50 of 3.1 μM, p-OMe). In human HSP90 protein (example 34), the nitro group binds to the Pi1 site of ADP, toward the solvated part of the cavity and with an excellent superimposition of one of the oxygen atoms (0.6 Å). The distance from this phosphate O to the main chain N atom of Gly137 is 3.3 Å, and an equivalent distance to the mimicking oxygen of NO2 is found. The chloro and nitro analogs have essentially the same affinity; see ref 55 and compare compound 2 (IC50 of 52 nM, X = Cl) with compound 3 (IC50 of 48 nM, X = NO2). The analog bearing a nitro was 4-fold more potent in cells. Therefore, it was chosen as a phosphate isostere to expand the series; replacements of the nitro group at a later stage were generally well tolerated; see the SAR series in ref 55 for compounds 7−29 and compounds 30−32. Miscellaneous Structural Isosteres. Miscellaneous structural isosteres are unexpected and have not been studied in congeneric series. They open new perspectives for developing phosphate replacements in compound design. Positively Charged Replacements. The positively charged replacements are somewhat counterintuitive given that phosphate is a negatively charged moiety. Often, the positively charged replacements are associated with the presence of a negatively charged aspartate at the binding site that allows

Figure 7. Selected examples of cyclic replacements of phosphate in phylogenetically unrelated proteins. (A) Homo sapiens PDK1, ligand 3RC (example 20, 3RCJ and 1H1W); (B) Homo sapiens CDK2, ligand Z67 (example 21, 3R9H and 3NIZ); (C) Homo sapiens phosphodiesterase 4D ligand ROF (example 23, 1XMU and 1TB7); (D) Homo sapiens HSP90 alpha, ligand 2D9 (example 24, 2CCU and 1BYQ).

Figure 8. Selected examples of polar cycles. (A) Homo sapiens CDK2, ligand C62 (example 25, 2UZO and 4I3Z); (B) Oryctolagus cuniculus glycogen phosphorylase, ligand RDD (example 27, 3BD6 and 1FA9); (C) Influenza A virus endonuclease domain, ligand 28A (example 28, 4MK5 and 3HW5); (D) Homo sapiens CDK2, ligand B98 (example 29, 3FZ1 and 1FIN).

Å) and Lys33 (O···Nε 3.0 Å). The primary amino group of B98 binds at the Pi2 site (1.4 Å in the superimposed complexes from the closest oxygen atom of the β-phosphate), at a hydrogen-bonding distance from Asn132 (N···O distance 2.6 Å), forming an interaction similar to that of a phosphate oxygen atom (phosphate to carbonyl O···O distance 3.1 Å). The primary amino group is nonetheless not necessary for activity: replacement of CH2NH2 with CH2OH led to a roughly 5-fold improvement in potency, and a compound with only one hydrogen atom in its R1 group retains the activity; compare, Table 1 in ref 54, compound 18 (IC50 of 0.146 μM, R1 = CH2NH2) with compound 13 (IC50 of 25 nM, R1 = CH2OH) and with compound 9 (IC50 of 0.16 μM, R1 = H). Amides. Amides and carbamoyl groups (CON folder) were found as structural isosteres of phosphate, although they were probably intended as linking groups. While the acyl group appears in the majority of cases to mimic phosphate


CH3 atom 3.2 Å; the equivalent O···O distance is 4.3 Å considering any phosphate oxygen atom); and binding is accompanied by repositioning of the Asp366−Arg432 salt bridge (guanidinium N···O 2.9 Å closest approach). Generally, betaine is a small solvent molecule used in crystallization buffers due to its stabilizing properties;56 it is also an endogenous osmolyte used for regulation across the kingdoms of life, from bacteria, fungi, and plants to animal cells, and it targets endogenous proteins, such as transporters.57 To the best of our knowledge, the binding affinity of betaine for the nucleotide binding domain of HSC70 has not been reported. Taken together, these data demonstrate that placing a positively charged group as a phosphate structural isostere might in some cases be a valid and interesting strategy. Aliphatic Apolar. In farnesyl diphosphate synthase58 (example 37, Figure 9C) the binding site is quite elongated and POP binds in a well-hydrated region (eight water molecules in its close hydration shell). The POP binding site can be occupied by the fused two-ring skeleton of the terpene arenarone, whose aliphatic ring presents a very different chemistry from POP but leads only to minor side chain conformational rearrangements of the protein upon binding. In the POP-bound complex, one phosphate oxygen atom forms an ion pair with Arg60 (2.7 Å closest approach) and simultaneously a water-mediated salt bridge with Glu93 (closest polar atom distance 5.4 Å, slightly offset below the POP binding site). Binding the terpene displaces the intervening water molecule and pulls apart the Arg60−Glu93 salt bridge (closest polar distance 11.1 Å). The binding site remains otherwise unchanged: the amide side chain of Gln96 is staggered between two phosphates of POP (3.1 Å from both phosphate oxygen atoms) in both complexes (Lys57, 2.7 and 2.9 Å). Natural Products, Ribose, and Lactam.
In 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase (PFKFB) (example 38, Figure 9D), fructose-2,6-diphosphate is a feedback inhibitor.59 Its α-D-ribofuranose moiety binds at the phosphate binding site of pyrophosphate. POP itself interacts with Arg74 (2.8 Å closest approach guanidinium nitrogen to phosphate oxygen) and Arg98 (2.9 Å) in a solvated binding cleft (eight water molecules). The oxygen atoms of α-D-ribofuranose mimic one phosphate of POP by forming hydrogen bonds to the Arg74 guanidinium group. In the β-lactam sensor domain of BlaR1 protein (example 39, Figure 9E), a pyrophosphate group is structurally replaced by a covalently bound penicillin G molecule. The molecule uses a pyrrolidine ring to form a scaffolding unit60 that carries substituents isosteric to phosphate groups (especially carboxylate), providing a promising scaffold for the design of biphosphate isosteres. Penicillin G forms a covalent interaction with Ser59, which replaces the hydrogen bond formed by the pyrophosphate group and Ser59 (distance of 2.4 Å). Furthermore, penicillin G binds near a Lys62−Ser107 hydrogen bond (distance of 2.9 Å). Upon binding of POP, the side chain of Lys62 is conformationally displaced and the hydrogen bond is broken (N···O distance 4.1 Å). The binding of penicillin G influences the carboxylation of Lys62, which induces a mechanism of penicillin G resistance.61,62 Ligands Used to Trap Conformational Analogs. In the Mg−ATP complex of the motor domain of myosin II from Dictyostelium discoideum (example 40, Figure 9F), a ligand with beryllium trifluoride is found. In the complex, BeF3 is an exact mimic of Pi3. The modified nucleotides, i.e. ATP analogs with vanadate, aluminum fluoride, and beryllium, were made to study the possible contractile cycle of myosin (8−12 states), each

formation of a salt bridge (see examples 1 and 24). The positively charged replacements can position themselves in the environment of cationic metals, slightly shifted from the phosphate groups (example 1, Figure 4A−C) or as scaffolding units (example 29, Figure 8D). In addition, they are favorably accommodated in the solvent-exposed region of a binding site. While this study describes ligand replacements, replacements of phosphate groups by the binding site itself, akin to example 2, are also found: example 28 provides a case where the protonated amino group of Lys33 is actually binding to the site of the Pi2 of ADP. In sheep 6-phosphogluconate dehydrogenase (example 35, Figure 9A), an oxidized nicotinamide (NAD+, carries a positive

Figure 9. Selected examples of phosphate structural replacements. (A) Ovis aries, 6-phosphogluconate dehydrogenase, ligand NDP (example 35, 1PGO and 1PGN). For clarity, the entire ligand is not shown. (B) Homo sapiens heat shock cognate protein 70, ligand BET (example 36, 4H5W and 1BA0); (C) Homo sapiens farnesyl diphosphate synthase, ligand 1XH (example 37, 4P0W and 4H5D); (D) Homo sapiens, PFKFB3, ligand FDP (example 38, 3QPV and 3QPU); (E) Staphylococcus aureus, sensor domain of BlaR1, ligand MER (example 39, 3Q82 and 1XA1); (F) Dictyostelium discoideum, myosin motor domain, ligand DAE (example 40, 1D1A and 1FMW).

charge) is located at the binding site formed by the POP in the reduced NADH-bound form of the enzyme (reference protein, PDB code 1PGN). Comparing the nicotinamide conformations in the NAD+ and NADH forms shows that in the POP-bound form the NADH ring binds ∼3.5 Å “lower” and some hydrogen bond interactions of the reduced coenzyme are not made. In human chaperone protein HSC70 (example 36, Figure 9B), trimethylglycine (betaine) bears a positive charge at its quaternary amino group, yet it matches exactly the binding mode of ADP’s Pi1 phosphate group. The binding of betaine is accompanied by a conformational change of the Asp366 side chain (closest approach C···O to carboxyl oxygen from a betaine


Figure 10. Sample of matrix of ligand similarity for reference protein 1LO5 (cluster KS-3) complexed to ADP. x-axis and y-axis, “type of replacement_ligand ID_PDB ID”. The reference ligand is indicated by “REF”. Color coding according to Tanimoto similarity coefficients (darker, closer to 1). Numbers indicate the size of the MCS and the number of different atoms. (top) IC50 (μM) extracted from BindingDB when available. (left) Local structural replacements in SMILES notation.
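The two numbers reported in each cell of such a matrix can be illustrated with a short Python sketch. It assumes the usual atom-count form of the MCS-based Tanimoto coefficient, T = n_MCS / (n_A + n_B − n_MCS); the NAMS scoring actually used in the study is not reproduced here, and the function names are hypothetical. The heavy-atom counts of ADP (27) and ATP (31) are used as a toy input, with ADP taken as the common substructure.

```python
def mcs_tanimoto(n_a, n_b, n_mcs):
    """Atom-count Tanimoto coefficient of two ligands sharing an MCS of n_mcs atoms."""
    if n_a <= 0 or n_b <= 0 or not 0 <= n_mcs <= min(n_a, n_b):
        raise ValueError("atom counts must be positive and the MCS no larger than either ligand")
    return n_mcs / (n_a + n_b - n_mcs)


def atoms_outside_mcs(n_a, n_b, n_mcs):
    """Number of atoms of both ligands that are not part of the MCS."""
    return (n_a - n_mcs) + (n_b - n_mcs)


# Toy input: ADP (27 heavy atoms) vs ATP (31 heavy atoms), MCS assumed to be ADP itself.
similarity = mcs_tanimoto(27, 31, 27)
n_different = atoms_outside_mcs(27, 31, 27)
print(f"Tanimoto = {similarity:.3f}, atoms outside the MCS = {n_different}")
```

With this definition, identical ligands score 1 and the score drops as either ligand gains atoms outside the shared substructure.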

kinetic state influenced by the binding of nucleotides. The interest of beryllium for medicinal chemists is not clear. Combining Structural Replacements with Biological Activities. Important questions that arise during the review of this manuscript are the automated classification of replacements to avoid the tedious manual analysis conducted here (presented above), and whether we could use activity data to perform a rigorous analysis of the structural replacements. To apprehend the complexity of this second problem, it is crucial to remember that biological activity is a property of a whole compound (a compound−target pair actually) and that a compound may be substituted at several sites. Additionally, local structural replacements are defined in spatially defined regions, and bound compounds may shift relative to the reference bound phosphate

group, leading to extraction of different structural replacements for the same chemical moiety. Furthermore, SMILES codes extracted for each replacement are short strings, which makes efforts to classify them challenging. In an attempt to tackle the problem of combining structure replacements and biological activities, we gathered and combined the data at the level of each reference protein (see Figure 10 for a sample case; all the data is available as a zip archive as Supporting Information S10). The data analyzed are the similarity matrices between compounds, color-coded as a maximum common substructure-based (MCS) Tanimoto similarity coefficient (the latter computed using the noncontiguous atom matching structural similarity function (NAMS)63 method), the number of atoms in the maximum


First, the workflow retrieves from PDB70 the three-dimensional structures of proteins that are homologous to proteins with a bound POP, AMP, ADP, and ATP (named “reference ligands”; reference ligands are each bound to several “reference proteins”). Protein chains are often problematic for automated extraction protocols and a single reference ligand is taken for any reference PDB file. Proximity (at least one atom within 4.5 Å) is used to select the protein chain(s) of interest, and a global sequence alignment is used to eliminate redundancy in the protein chains of a given PDB file. The protein chain(s) considered is(are) appended to the PDB file name in the output of the workflow. Second, for each reference protein found in the previous step, the workflow retrieves and superimposes homologues and keeps those with a non-nucleotide ligand bound at a position equivalent to that of the reference ligand (these homologues are referred to as LSR-containing ligand and LSR-containing protein). Chains of the LSR-containing proteins are identified by proximity (4.5 Å) as above and appended to the file names. Metrics (RMSD, length of superimposed region and percent identity) are output from the whole-body protein superimposition. At that stage, data on the binding site flexibility (Cα and all atoms RMSD; longest deviation) is extracted for the subset of proteins with exactly identical amino acid type and numbers in the reference and in the LSR-containing sets. A set of overrepresented ligands are automatically removed to speed up the analysis, based on a manually constructed list; this list is a parameter in the workflow. Third, structural isosteres are extracted at the level of individual groups; for example, ADP is composed of two “replaceable” groups that we study here, i.e., Pi1 and Pi2. Extraction of atoms belonging to the structural isostere is based on spheres centered on the P, O1, O2, and O3 atoms. Target proteins with no ligand atom at the structural isostere site are removed.
Redundant groups are then merged and some complexes filtered out when they do not conform to set criteria, for example a too small overlap of the LSR on top of the considered group of the reference protein (α-, β-, and γ-phosphate). Fourth, the fragments are classified according to their SMILES codes. The SMILES are first broken and only SMILES containing six or more atoms are kept for analysis under the name “Pi”; the others are also placed in the output under the name “Pi_small”. Cyclic fragments are then identified based on the presence of “1” in SMILES. The extracted local structural replacements are then used to sequentially assign to the “replacing” groups exclusively one of 16 nonoverlapping categories containing: phosphorus; boron; fluorine; chlorine; bromine; beryllium; NO2 (generally nitro); SO2; S; CON (generally carbamoyl or amide); COO (ester or carboxylic acid); exclusively C; only C and O; only C and N; only C, O, and N; and Other. The same categories are also used to classify the cycles. For the very rare cases when the search retrieved several groups of unconnected atoms inside an LSR (identified by a “.” in the SMILES), a cycle is erroneously created (known bug). The empirically optimized filters used for the run which is presented are as follows: (1) PDB October 2015 release (110 466); NMR entries are not considered; (2) resolutions are better than 2.7 Å; (3) homologues were found by a BLAST E-value filter of 10^−100; (4) manually built list of prefiltered compounds: ANP, TTP, DCP, DGT, DTP, DUP, ACP, AD9, NAD, AGS, APC, AOV, UDP, GDP, GTP and CTP, in addition to AMP, ADP, ATP, and POP; (5) definition of the binding site as any amino acid with at least one atom within 4.5 Å of the
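The proximity and sphere criteria of these first three steps can be illustrated with a minimal Python sketch. It is not the authors' code: the atom records (dictionaries with `name`, `chain`, and `xyz` keys), the function names, and the toy coordinates are assumptions made for illustration. Only the 4.5 Å chain-selection cutoff, the 2.5 Å sphere radius, and the P, O1, O2, and O3 sphere centers are taken from the text, and the ligand coordinates are assumed to be already expressed in the frame of the reference protein after superimposition.

```python
from math import dist


def atoms_within(candidates, centers, radius):
    """Return the candidate atoms lying within `radius` (in Å) of any center atom."""
    return [a for a in candidates
            if any(dist(a["xyz"], c["xyz"]) <= radius for c in centers)]


def chains_near_ligand(protein_atoms, ligand_atoms, cutoff=4.5):
    """Chains having at least one atom within `cutoff` Å of the ligand."""
    near = atoms_within(protein_atoms, ligand_atoms, cutoff)
    return sorted({a["chain"] for a in near})


def extract_lsr(target_ligand_atoms, phosphate_atoms, radius=2.5):
    """Target-ligand atoms inside spheres centered on the P, O1, O2, and O3 atoms."""
    centers = [a for a in phosphate_atoms if a["name"] in {"P", "O1", "O2", "O3"}]
    return atoms_within(target_ligand_atoms, centers, radius)


# Hypothetical toy coordinates (Å).
phosphate = [
    {"name": "P", "xyz": (0.0, 0.0, 0.0)},
    {"name": "O1", "xyz": (1.5, 0.0, 0.0)},
    {"name": "O2", "xyz": (-0.5, 1.4, 0.0)},
    {"name": "O3", "xyz": (-0.5, -0.7, 1.2)},
]
ligand = [
    {"name": "C1", "xyz": (0.3, 0.2, 0.1)},   # sits on top of P
    {"name": "C2", "xyz": (3.0, 0.0, 0.0)},   # 1.5 Å from O1
    {"name": "C3", "xyz": (9.0, 9.0, 9.0)},   # far from the phosphate site
]
protein = [
    {"name": "NZ", "chain": "A", "xyz": (0.0, 3.0, 0.0)},
    {"name": "CA", "chain": "B", "xyz": (20.0, 0.0, 0.0)},
]

lsr_names = [a["name"] for a in extract_lsr(ligand, phosphate)]
near_chains = chains_near_ligand(protein, phosphate)
print(lsr_names, near_chains)
```

A target ligand for which `extract_lsr` returns an empty list would correspond to a complex with no ligand atom at the structural isostere site, which the workflow discards.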

common substructure shared by any two compounds, and the number of atoms not in the MCS (i.e., the size in atoms of the full ligands can be found on the diagonal). Additionally, for each compound we extracted the biological activity from BindingDB.64 Overall, this analysis allows a solid grasp of the data collected, and would have been very useful to ease the manual analysis. Nonetheless, the data also make clear that there is no rationale to perform further automated analysis, such as a matched molecular pair analysis, due to lack of data. We found only 74 pairs of complexes sharing their complete MCS (0 difference, i.e., stereoisomers), 68 pairs of complexes with one difference, and 18 pairs of complexes sharing two differences. Most were lacking associated biological activity data, and a large fraction of the replacements were phosphate to phosphate structural replacements (as anticipated from Figure 2). Furthermore, only a fraction of these pairs of complexes have replacements that occur within a region of interest (i.e., at the binding site of a phosphate group). Concluding Remarks. In this study, we present a fully automated workflow useful to extract structural isosteres by superimposing homologous proteins. We discuss 40 examples of structural isosteric replacements of phosphate groups that have been visually selected among a large pool of over 16 300 replacements. The analysis was made possible by the elimination of the smallest replacements, by dissecting the structural isosteres into different chemotypes based on their SMILES codes, as well as by categorizing the proteins into clusters, making it possible to scout rapidly across phylogenetically related or distant proteins. The studied complexes share the apparent common features of phosphate sites, i.e., a high hydration content and large size.
This study opens unexpected perspectives about the bioisosteric replacement of phosphate groups, in particular through the U-shape stabilized conformations, aromatic rings, aliphatic groups, and positively charged groups. In this study, biological activities are not directly accounted for but are rather discussed for some of the selected examples. Apparently, changes are more tolerated in SAR series in the solvent-exposed region of compounds, but this aspect would deserve a more specific study protocol. The data mining workflow presented herein could be generalized to exploit structural isosteres of other chemical fragments. Ideally, in further developments, the query group could be defined by the user and the workflow could automatically select the proper reference compounds for data extraction. This will raise a different set of issues, namely the need to have many relevant complexes in the PDB.



EXPERIMENTAL SECTION

An automated computational workflow written in Python was built. The workflow uses and connects different external tools: (1) Blastp for sequence-based retrieval of protein homologues, (2) TM-align for protein structure superimposition,65 (3) ShaEP for assessing a score for the ligand-based superimposition of molecular fragments,66 (4) Babel to convert extracted 3D structure files to SMILES code,67 and (5) Needle from the Emboss package to conduct global sequence alignment.68 The workflow can be parametrized in many aspects, and the individual parameters used for the run presented are given at the end of this section. The software R28 was used for all statistical analysis and presentation of data; PyMOL69 was used for 3D structure presentation.




bound reference ligand; sphere extraction of atoms belonging to the LSR using a radius of 2.5 Å; a shape component of the score computed by ShaEP larger than 0.2 required.
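The fourth step of the workflow (size split, cycle detection, and sequential category assignment) can be approximated with the following Python sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the regular expression used to tokenize atoms and the function names are hypothetical, and the group-based categories (NO2, SO2, CON, COO) are deliberately left out because they require substructure matching. Fragments bearing such groups therefore fall through to the element-composition categories here. The six-atom threshold, the test for "1" in the SMILES, and the order of the element-based categories follow the text.

```python
import re

# One atom token: a bracket atom such as [NH3+] or [Be], or an organic-subset
# atom written without brackets (two-letter Cl and Br are tried first).
ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]|(Cl|Br|[BCNOPSFI]|[bcnops])")

# Element-based categories, tested sequentially in the order given in the text.
ELEMENT_RULES = [("P", "phosphorus"), ("B", "boron"), ("F", "fluorine"),
                 ("Cl", "chlorine"), ("Br", "bromine"), ("Be", "beryllium"),
                 ("S", "S")]
COMPOSITION = {frozenset("C"): "only C", frozenset("CO"): "only C and O",
               frozenset("CN"): "only C and N", frozenset("CON"): "only C, O, and N"}


def heavy_atoms(smiles):
    """Element symbols of all non-hydrogen atoms written in a SMILES string."""
    symbols = [(m.group(1) or m.group(2)).capitalize() for m in ATOM.finditer(smiles)]
    return [s for s in symbols if s != "H"]


def classify(smiles):
    """Return (data set, is_cyclic, category) for one fragment SMILES."""
    atoms = heavy_atoms(smiles)
    data_set = "Pi" if len(atoms) >= 6 else "Pi_small"
    cyclic = "1" in smiles                      # heuristic described in the text
    present = set(atoms)
    for element, name in ELEMENT_RULES:         # first matching rule wins
        if element in present:
            return data_set, cyclic, name
    return data_set, cyclic, COMPOSITION.get(frozenset(present), "Other")


for fragment in ["c1ccccc1", "CC(=O)N", "F[Be](F)F", "OP(=O)(O)OCC"]:
    print(fragment, classify(fragment))
```

Because the rules are applied in sequence, a beryllium trifluoride fragment is assigned to the fluorine category before the beryllium rule is reached, which is consistent with example 40 being listed in folder F.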



ACKNOWLEDGMENTS

The Drug Discovery and Chemical Biology − Biocenter Finland network and the Center for Scientific Computing (CSC-IT) are thanked for organizing computational resources. The Integrative Life Science−Informational and Structural Biology Doctoral Program is thanked for organizing graduate studies.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00519.

SI 1 (Figure S1): Conservation of molecular interaction in structure replacement
SI 4 (Tables S1−S3): Protein−ligand interaction diagrams
SI 5 (Table S4): Crystallographic quality metrics for the selected examples
SI 6 (Figure S2, Figure S3): Data set construction
SI 7 (Figure S4): Statistics about the flexibility of the LSR-containing proteins binding sites
SI 8 (Figure S5, Table S5): Statistics about metals included in the reference complexes
SI 9 (Figure S6, Figure S7): Conformation of the reference nucleotide ligands in complexes (PDF)
SI 3: Hierarchy of the files included in the Pi data set (TXT)
SI 3: Hierarchy of the files included in the Pi_small data set (TXT)
SI 2: Data set Pi hierarchically organized (ZIP)
SI 2: Data set Pi_small hierarchically organized (ZIP)
SI 10: Each reference protein ligand similarity matrix and associated data (ZIP)




ABBREVIATIONS

3D, three dimensions; ADP, adenosine diphosphate; AMP, adenosine monophosphate; ATP, adenosine triphosphate; CDK2, cyclin dependent kinase 2; ERK2, extracellular signal-regulated kinase 2; HSP90, heat shock protein 90; HSC70, heat shock cognate protein 70; JNK3, c-Jun N-terminal kinase 3; LSR, local structural replacement; PAAK2, phenylacetate-CoA ligase PaaK 2; PDB, Protein Data Bank; PDE, phosphodiesterase; PDE4D, phosphodiesterase 4D; PDE4B, phosphodiesterase 4B; PDK1, pyruvate dehydrogenase kinase 1; PFKFB3, 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase 3; Pi1, α-phosphate; Pi2, β-phosphate; Pi3, γ-phosphate; PKA, protein kinase A; PKAB3, protein kinase A B3; POP, pyrophosphate; RMSD, root-mean-square deviation; SAR, structure-activity relationships; SMILES, simplified molecular input line entry specification



REFERENCES

(1) Papadatos, G.; Brown, N. In Silico Applications of Bioisosterism in Contemporary Medicinal Chemistry Practice. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2013, 3 (4), 339−354.
(2) Lima, L. M. L.; Barreiro, E. J. Bioisosterism: A Useful Strategy for Molecular Modification and Drug Design. Curr. Med. Chem. 2005, 12 (1), 23−49.
(3) Chen, D.; Vollmar, M.; Rossi, M. N.; Phillips, C.; Kraehenbuehl, R.; Slade, D.; Mehrotra, P. V.; von Delft, F.; Crosthwaite, S. K.; Gileadi, O.; Denu, J. M.; Ahel, I. Identification of Macrodomain Proteins as Novel O-Acetyl-ADP-Ribose Deacetylases. J. Biol. Chem. 2011, 286 (15), 13261−13271.
(4) Anand, S. B.; Kodumudi, K. N.; Reddy, M. V.; Kaliraj, P. A Combination of Two Brugia Malayi Filarial Vaccine Candidate Antigens (BmALT-2 and BmVAH) Enhances Immune Responses and Protection in Jirds. J. Helminthol. 2011, 85 (4), 442−452.
(5) Meanwell, N. A. Synopsis of Some Recent Tactical Application of Bioisosteres in Drug Design. J. Med. Chem. 2011, 54 (8), 2529−2591.
(6) Mills, J. E. Protein Structure. In Bioisosteres in Medicinal Chemistry; Brown, N., Ed.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2012; Chapter 10; pp 167−181.
(7) Wirth, M.; Zoete, V.; Michielin, O.; Sauer, W. H. B. SwissBioisostere: A Database of Molecular Replacements for Ligand Design. Nucleic Acids Res. 2013, 41 (D1), D1137−D1143.
(8) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40 (D1), D1100−D1107.
(9) Hayward, J. BIOSTER: A Database of Bioisosteres and Bioanalogues, Part 4. In Bioisosteres in Medicinal Chemistry; Brown, N., Ed.; Wiley-VCH Verlag GmbH & Co. KGaA, 2012; pp 53−74.
(10) Schuffenhauer, A.; Floersheim, P.; Acklin, P.; Jacoby, E. Similarity Metrics for Ligands Reflecting the Similarity of the Target Proteins. J. Chem. Inf. Comput. Sci. 2003, 43 (2), 391−405.
(11) Nicholls, A.; McGaughey, G. B.; Sheridan, R. P.; Good, A. C.; Warren, G.; Mathieu, M.; Muchmore, S. W.; Brown, S. P.; Grant, J. A.; Haigh, J. A.; Nevins, N.; Jain, A. N.; Kelley, B. Molecular Shape and Medicinal Chemistry: A Perspective. J. Med. Chem. 2010, 53 (10), 3862−3886.

AUTHOR INFORMATION

Corresponding Author

*E-mail: henri.xhaard@helsinki.fi.

ORCID

Alexandre Borrel: 0000-0001-6499-4540
Jari Yli-Kauhaluoma: 0000-0003-0370-7653

Present Addresses

▽ Y.Z.: Pharmaceutical Sciences Laboratory, Faculty of Sciences and Engineering, Åbo Akademi University, FI-20520 Turku, Finland.
○ A.B.: Department of Chemistry, Bioinformatics Research Center, North Carolina State University, 322 Ricks Hall, Raleigh, NC 2769, USA.

Author Contributions

Y.Z. and L.R. conducted preliminary studies, and A.B. developed the final version of the computational workflow. Y.Z., A.B., and H.X. conducted the data analysis. A.B. made the final version of the figures.

⊥ Y.Z. and A.B. contributed equally and are considered as co-first authors.

Funding

This study was funded by a grant from the French research ministry (A.B.), by the China Scholarship council (Y.Z., Grant no. 2009629110), by the Erkko and Jane Foundation, and by the Drug Discovery and Chemical Biology−Biocenter Finland network (L.G.). The Magnus Ehrnrooth foundation and the KAKSIN program of the French Embassy in Finland are thanked for additional resources.

Notes

The authors declare no competing financial interest.


(12) Allen, F. H. The Cambridge Structural Database: A Quarter of a Million Crystal Structures and Rising. Acta Crystallogr., Sect. B: Struct. Sci. 2002, 58 (3), 380−388.
(13) Neudert, G.; Klebe, G. DSX: A Knowledge-Based Scoring Function for the Assessment of Protein-Ligand Complexes. J. Chem. Inf. Model. 2011, 51 (10), 2731−2745.
(14) Rantanen, V.-V.; Gyllenberg, M.; Koski, T.; Johnson, M. S. A Priori Contact Preferences in Molecular Recognition. J. Bioinf. Comput. Biol. 2005, 3 (4), 861−890.
(15) Desaphy, J.; Rognan, D. Sc-PDB-Frag: A Database of Protein-Ligand Interaction Patterns for Bioisosteric Replacements. J. Chem. Inf. Model. 2014, 54 (7), 1908−1918.
(16) Wood, D. J.; de Vlieg, J.; Wagener, M.; Ritschel, T. Pharmacophore Fingerprint-Based Approach to Binding Site Subpocket Similarity and Its Application to Bioisostere Replacement. J. Chem. Inf. Model. 2012, 52 (8), 2031−2043.
(17) Chemical Computing Group Inc. Molecular Operating Environment (MOE); Montreal, QC, Canada, 2017.
(18) Taylor, N. Proasis2−A Web-Based Protein Structure Database and Visualization System Linking Crystallography and Medicinal Chemistry Research. In CHI Crystallography Conference, Boston; 2004.
(19) Hendlich, M.; Bergner, A.; Günther, J.; Klebe, G. Relibase: Design and Development of a Database for Comprehensive Analysis of Protein-Ligand Interactions. J. Mol. Biol. 2003, 326 (2), 607−620.
(20) Kennewell, E. A.; Willett, P.; Ducrot, P.; Luttmann, C. Identification of Target-Specific Bioisosteric Fragments from Ligand-Protein Crystallographic Data. J. Comput.-Aided Mol. Des. 2006, 20 (6), 385−394.
(21) Chalk, A. J.; Worth, C. L.; Overington, J. P.; Chan, A. W. E. PDBLIG: Classification of Small Molecular Protein Binding in the Protein Data Bank. J. Med. Chem. 2004, 47 (15), 3807−3816.
(22) Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. The Protein Kinase Complement of the Human Genome. Science (Washington, DC, U. S.) 2002, 298 (5600), 1912−1934.
(23) Elliott, T. S.; Slowey, A.; Ye, Y.; Conway, S. J. The Use of Phosphate Bioisosteres in Medicinal Chemistry and Chemical Biology. MedChemComm 2012, 3 (7), 735−751.
(24) Rye, C. S.; Baell, J. B. Phosphate Isosteres in Medicinal Chemistry. Curr. Med. Chem. 2005, 12 (26), 3127−3141.
(25) Smith, F. W.; Mudge, S. R.; Rae, A. L.; Glassop, D. Phosphate Transport in Plants. Plant Soil 2003, 248, 71−83.
(26) Zhao, R. Y.; Erickson, H. K.; Leece, B. A.; Reid, E. E.; Goldmacher, V. S.; Lambert, J. M.; Chari, R. V. J. Synthesis and Biological Evaluation of Antibody Conjugates of Phosphate Prodrugs of Cytotoxic DNA Alkylators for the Targeted Treatment of Cancer. J. Med. Chem. 2012, 55 (2), 766−782.
(27) Ballatore, C.; Huryn, D. M.; Smith, A. B. Carboxylic Acid (Bio)Isosteres in Drug Design. ChemMedChem 2013, 8 (3), 385−395.
(28) R Core Team (R Foundation for Statistical Computing). R: A Language and Environment for Statistical Computing; Vienna, Austria, 2015.
(29) Schrödinger, L. The PyMOL Molecular Graphics System, version 1.8; November 2015.
(30) Laskowski, R. A.; Swindells, M. B. LigPlot+: Multiple Ligand−Protein Interaction Diagrams for Drug Discovery. J. Chem. Inf. Model. 2011, 51 (10), 2778−2786.
(31) Gouron, A.; Milet, A.; Jamet, H. Conformational Flexibility of Human Casein Kinase Catalytic Subunit Explored by Metadynamics. Biophys. J. 2014, 106 (5), 1134−1141.
(32) Kuhn, B.; Guba, W.; Hert, J.; Banner, D.; Bissantz, C.; Ceccarelli, S.; Haap, W.; Körner, M.; Kuglstatter, A.; Lerner, C.; Mattei, P.; Neidhart, W.; Pinard, E.; Rudolph, M. G.; Schulz-Gasch, T.; Woltering, T.; Stahl, M. A Real-World Perspective on Molecular Design. J. Med. Chem. 2016, 59 (9), 4087−4102.
(33) Staben, S. T.; Heffron, T. P.; Sutherlin, D. P.; Bhat, S. R.; Castanedo, G. M.; Chuckowree, I. S.; Dotson, J.; Folkes, A. J.; Friedman, L. S.; Lee, L.; Lesnick, J.; Lewis, C.; Murray, J. M.; Nonomiya, J.; Olivero, A. G.; Plise, E.; Pang, J.; Prior, W. W.; Salphati, L.; Rouge, L.; Sampath, D.; Tsui, V.; Wan, N. C.; Wang, S.; Weismann, C.; Wu, P.; Zhu, B. Y. Structure-Based Optimization of Pyrazolo-Pyrimidine and -Pyridine Inhibitors of PI3-Kinase. Bioorg. Med. Chem. Lett. 2010, 20 (20), 6048−6051.
(34) Yu, T.; Tagat, J. R.; Kerekes, A. D.; Doll, R. J.; Zhang, Y.; Xiao, Y.; Esposite, S.; Belanger, D. B.; Curran, P. J.; Mandal, A. K.; Siddiqui, M. A.; Shih, N. Y.; Basso, A. D.; Liu, M.; Gray, K.; Tevar, S.; Jones, J.; Lee, S.; Liang, L.; Ponery, S.; Smith, E. B.; Hruza, A.; Voigt, J.; Ramanathan, L.; Prosise, W.; Hu, M. Discovery of a Potent, Injectable Inhibitor of Aurora Kinases Based on the Imidazo-[1,2-a]-Pyrazine Core. ACS Med. Chem. Lett. 2010, 1 (5), 214−218.
(35) Beuming, T.; Che, Y.; Abel, R.; Kim, B.; Shanmugasundaram, V.; Sherman, W. Thermodynamic Analysis of Water Molecules at the Surface of Proteins and Applications to Binding Site Prediction and Characterization. Proteins: Struct., Funct., Genet. 2012, 80 (3), 871−883.
(36) Wang, T.; Block, M. A.; Cowen, S.; Davies, A. M.; Devereaux, E.; Gingipalli, L.; Johannes, J.; Larsen, N. A.; Su, Q.; Tucker, J. A.; Whitston, D.; Wu, J.; Zhang, H. J.; Zinda, M.; Chuaqui, C. Discovery of Azabenzimidazole Derivatives as Potent, Selective Inhibitors of TBK1/IKK Kinases. Bioorg. Med. Chem. Lett. 2012, 22 (5), 2063−2069.
(37) Yin, J.; Mobarec, J. C.; Kolb, P.; Rosenbaum, D. M. Crystal Structure of the Human OX2 Orexin Receptor Bound to the Insomnia Drug Suvorexant. Nature 2014, 519 (7542), 247−250.
(38) Shin, Y.; Chen, W.; Habel, J.; Duckett, D.; Ling, Y. Y.; Koenig, M.; He, Y.; Vojkovsky, T.; LoGrasso, P.; Kamenecka, T. M. Synthesis and SAR of Piperazine Amides as Novel c-Jun N-Terminal Kinase (JNK) Inhibitors. Bioorg. Med. Chem. Lett. 2009, 19 (12), 3344−3347.
(39) Li, Q.; Li, T.; Zhu, G. D.; Gong, J.; Claibone, A.; Dalton, C.; Luo, Y.; Johnson, E. F.; Shi, Y.; Liu, X.; Klinghofer, V.; Bauch, J. L.; Marsh, K. C.; Bouska, J. J.; Arries, S.; De Jong, R.; Oltersdorf, T.; Stoll, V. S.; Jakob, C. G.; Rosenberg, S. H.; Giranda, V. L. Discovery of Trans-3,4′-bispyridinylethylenes as Potent and Novel Inhibitors of Protein Kinase B (PKB/Akt) for the Treatment of Cancer: Synthesis and Biological Evaluation. Bioorg. Med. Chem. Lett. 2006, 16 (6), 1679−1685.
(40) Li, Q.; Woods, K. W.; Thomas, S.; Zhu, G.-D.; Packard, G.; Fisher, J.; Li, T.; Gong, J.; Dinges, J.; Song, X.; Abrams, J.; Luo, Y.; Johnson, E. F.; Shi, Y.; Liu, X.; Klinghofer, V.; De Jong, R.; Oltersdorf, T.; Stoll, V. S.; Jakob, C. G.; Rosenberg, S. H.; Giranda, V. L. Synthesis and Structure−activity Relationship of 3,4′-Bispyridinylethylenes: Discovery of a Potent 3-Isoquinolinylpyridine Inhibitor of Protein Kinase B (PKB/Akt) for the Treatment of Cancer. Bioorg. Med. Chem. Lett. 2006, 16 (7), 2000−2007.
(41) Zeng, Q.; Allen, J. G.; Bourbeau, M. P.; Wang, X.; Yao, G.; Tadesse, S.; Rider, J. T.; Yuan, C. C.; Hong, F. T.; Lee, M. R.; Zhang, S.; Lofgren, J. A.; Freeman, D. J.; Yang, S.; Li, C.; Tominey, E.; Huang, X.; Hoffman, D.; Yamane, H. K.; Fotsch, C.; Dominguez, C.; Hungate, R.; Zhang, X. Azole-Based Inhibitors of AKT/PKB for the Treatment of Cancer. Bioorg. Med. Chem. Lett. 2010, 20 (5), 1559−1564.
(42) Beno, B. R.; Yeung, K.-S.; Bartberger, M. D.; Pennington, L. D.; Meanwell, N. A. A Survey of the Role of Noncovalent Sulfur Interactions in Drug Design. J. Med. Chem. 2015, 58 (11), 4383−4438.
(43) Desvergnes, S.; Courtiol-Legourd, S.; Daher, R.; Dabrowski, M.; Salmon, L.; Therisod, M. Synthesis and Evaluation of Malonate-Based Inhibitors of Phosphosugar-Metabolizing Enzymes: Class II Fructose-1,6-Bis-Phosphate Aldolases, Type I Phosphomannose Isomerase, and Phosphoglucose Isomerase. Bioorg. Med. Chem. 2012, 20 (4), 1511−1520.
(44) Moreau, C.; Kirchberger, T.; Swarbrick, J. M.; Bartlett, S. J.; Fliegert, R.; Yorgan, T.; Bauche, A.; Harneit, A.; Guse, A. H.; Potter, B. V. L. Structure−Activity Relationship of Adenosine 5′-Diphosphoribose at the Transient Receptor Potential Melastatin 2 (TRPM2) Channel: Rational Design of Antagonists. J. Med. Chem. 2013, 56 (24), 10079−10102.
(45) Niewiadomski, S.; Beebeejaun, Z.; Denton, H.; Smith, T. K.; Morris, R. J.; Wagner, G. K. Rationally Designed Squaryldiamides - a Novel Class of Sugar-Nucleotide Mimics? Org. Biomol. Chem. 2010, 8 (15), 3488−3499.
(46) Albers, H. M. H. G.; Van Meeteren, L. A.; Egan, D. A.; Van Tilburg, E. W.; Moolenaar, W. H.; Ovaa, H. Discovery and Optimization of Boronic Acid Based Inhibitors of Autotaxin. J. Med. Chem. 2010, 53 (13), 4958−4967.
(47) Liu, X.; Moody, E. C.; Hecht, S. S.; Sturla, S. J. Deoxygenated Phosphorothioate Inositol Phosphate Analogs: Synthesis, Phosphatase Stability, and Binding Affinity. Bioorg. Med. Chem. 2008, 16 (6), 3419−3427.
(48) Merkul, E.; Klukas, F.; Dorsch, D.; Grädler, U.; Greiner, H. E.; Müller, T. J. J. Rapid Preparation of Triazolyl Substituted NH-Heterocyclic Kinase Inhibitors via One-Pot Sonogashira Coupling-TMS-Deprotection-CuAAC Sequence. Org. Biomol. Chem. 2011, 9 (14), 5129−5136.
(49) Schonbrunn, E.; Betzi, S.; Alam, R.; Martin, M. P.; Becker, A.; Han, H.; Francis, R.; Chakrasali, R.; Jakkaraj, S.; Kazi, A.; Sebti, S. M.; Cubitt, C. L.; Gebhard, A. W.; Hazlehurst, L. A.; Tash, J. S.; Georg, G. I. Development of Highly Potent and Selective Diaminothiazole Inhibitors of Cyclin-Dependent Kinases. J. Med. Chem. 2013, 56 (10), 3768−3782.
(50) Barril, X.; Beswick, M. C.; Collier, A.; Drysdale, M. J.; Dymock, B. W.; Fink, A.; Grant, K.; Howes, R.; Jordan, A. M.; Massey, A.; Surgenor, A.; Wayne, J.; Workman, P.; Wright, L. 4-Amino Derivatives of the Hsp90 Inhibitor CCT018159. Bioorg. Med. Chem. Lett. 2006, 16 (9), 2543−2548.
(51) Richardson, C. M.; Nunns, C. L.; Williamson, D. S.; Parratt, M. J.; Dokurno, P.; Howes, R.; Borgognoni, J.; Drysdale, M. J.; Finch, H.; Hubbard, R. E.; Jackson, P. S.; Kierstan, P.; Lentzen, G.; Moore, J. D.; Murray, J. B.; Simmonite, H.; Surgenor, A. E.; Torrance, C. J. Discovery of a Potent CDK2 Inhibitor with a Novel Binding Mode, Using Virtual Screening and Initial, Structure-Guided Lead Scoping. Bioorg. Med. Chem. Lett. 2007, 17 (14), 3880−3885.
(52) Xhaard, H.; Backström, V.; Denessiouk, K.; Johnson, M. S. Coordination of Na+ by Monoamine Ligands in Dopamine, Norepinephrine, and Serotonin Transporters. J. Chem. Inf. Model. 2008, 48 (7), 1423−1437.
(53) Bauman, J. D.; Patel, D.; Baker, S. F.; Vijayan, R. S. K.; Xiang, A.; Parhi, A. K.; Martínez-Sobrido, L.; LaVoie, E. J.; Das, K.; Arnold, E. Crystallographic Fragment Screening and Structure-Based Optimization Yields a New Class of Influenza Endonuclease Inhibitors. ACS Chem. Biol. 2013, 8 (11), 2501−2508.
(54) Anderson, D. R.; Meyers, M. J.; Kurumbail, R. G.; Caspers, N.; Poda, G. I.; Long, S. A.; Pierce, B. S.; Mahoney, M. W.; Mourey, R. J. Benzothiophene Inhibitors of MK2. Part 1: Structure-Activity Relationships, Assessments of Selectivity and Cellular Potency. Bioorg. Med. Chem. Lett. 2009, 19 (16), 4878−4881.
(55) Brasca, M. G.; Mantegani, S.; Amboldi, N.; Bindi, S.; Caronni, D.; Casale, E.; Ceccarelli, W.; Colombo, N.; De Ponti, A.; Donati, D.; Ermoli, A.; Fachin, G.; Felder, E. R.; Ferguson, R. D.; Fiorelli, C.; Guanci, M.; Isacchi, A.; Pesenti, E.; Polucci, P.; Riceputi, L.; Sola, F.; Visco, C.; Zuccotto, F.; Fogliatto, G. Discovery of NMS-E973 as Novel, Selective and Potent Inhibitor of Heat Shock Protein 90 (Hsp90). Bioorg. Med. Chem. 2013, 21 (22), 7047−7063.
(56) Zhang, Z.; Cellitti, J.; Teriete, P.; Pellecchia, M.; Stec, B. New Crystal Structures of HSC-70 ATP Binding Domain Confirm the Role of Individual Binding Pockets and Suggest a New Method of Inhibition. Biochimie 2015, 108, 186−192.
(57) Ressl, S.; Terwisscha van Scheltinga, A. C.; Vonrhein, C.; Ott, V.; Ziegler, C. Molecular Basis of Transport and Regulation in the Na(+)/betaine Symporter BetP. Nature 2009, 458 (7234), 47−52.
(58) Liu, Y.-L.; Lindert, S.; Zhu, W.; Wang, K.; McCammon, J. A.; Oldfield, E. Taxodione and Arenarone Inhibit Farnesyl Diphosphate Synthase by Binding to the Isopentenyl Diphosphate Site. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (25), E2530−E2539.
(59) Cavalier, M. C.; Kim, S. G.; Neau, D.; Lee, Y. H. Molecular Basis of the Fructose-2,6-Bisphosphatase Reaction of PFKFB3: Transition State and the C-Terminal Function. Proteins: Struct., Funct., Genet. 2012, 80 (4), 1143−1153.
(60) Wilke, M. S.; Hills, T. L.; Zhang, H. Z.; Chambers, H. F.; Strynadka, N. C. J. Crystal Structures of the Apo and Penicillin-Acylated Forms of the BlaR1 β-Lactam Sensor of Staphylococcus Aureus. J. Biol. Chem. 2004, 279 (45), 47278−47287.

(61) Jimenez-Morales, D.; Adamian, L.; Shi, D.; Liang, J. Lysine Carboxylation: Unveiling a Spontaneous Post-Translational Modification. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2014, 70 (1), 48−57.
(62) Borbulevych, O.; Kumarasiri, M.; Wilson, B.; Llarrull, L. I.; Lee, M.; Hesek, D.; Shi, Q.; Peng, J.; Baker, B. M.; Mobashery, S. Lysine N -Decarboxylation Switch and Activation of the β-Lactam Sensor Domain of BlaR1 Protein of Methicillin-Resistant Staphylococcus Aureus. J. Biol. Chem. 2011, 286 (36), 31466−31472.
(63) Teixeira, A. L.; Falcao, A. O. Noncontiguous Atom Matching Structural Similarity Function. J. Chem. Inf. Model. 2013, 53, 2511−2524.
(64) Gilson, M. K.; Liu, T.; Baitaluk, M.; Nicola, G.; Hwang, L.; Chong, J. BindingDB in 2015: A Public Database for Medicinal Chemistry, Computational Chemistry and Systems Pharmacology. Nucleic Acids Res. 2016, 44 (D1), D1045−D1053.
(65) Zhang, Y. TM-Align: A Protein Structure Alignment Algorithm Based on the TM-Score. Nucleic Acids Res. 2005, 33 (7), 2302−2309.
(66) Vainio, M. J.; Puranen, J. S.; Johnson, M. S. ShaEP: Molecular Overlay Based on Shape and Electrostatic Potential. J. Chem. Inf. Model. 2009, 49 (2), 492−502.
(67) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. Open Babel: An Open Chemical Toolbox. J. Cheminf. 2011, 3 (1), 33.
(68) Rice, P.; Longden, I.; Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 2000, 16 (1), 276−277.
(69) DeLano, W. The PyMOL Molecular Graphics System. http://www.pymol.org/. Schrödinger LLC 2002.
(70) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235−242.
