Structural Insights on Fragment Binding Mode ... - ACS Publications

Article Cite This: J. Med. Chem. 2018, 61, 5963−5973

pubs.acs.org/jmc

Structural Insights on Fragment Binding Mode Conservation

Malgorzata N. Drwal,† Guillaume Bret,† Carlos Perez,‡ Célien Jacquemard,† Jérémy Desaphy,§ and Esther Kellenberger*,†

†Laboratoire d’Innovation Thérapeutique, UMR7200, Université de Strasbourg, 74 Route du Rhin, 67401 Illkirch, France
‡Eli Lilly Research Laboratories, Avenida de la Industria, 30, 28108 Alcobendas, Madrid, Spain
§Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285, United States


Supporting Information

ABSTRACT: Aiming at a deep understanding of fragment binding to ligandable targets, we performed a large-scale analysis of the Protein Data Bank. Binding modes of 1832 drug-like ligands and 1079 fragments to 235 proteins were compared. We observed that the binding modes of fragments and their drug-like superstructures binding to the same protein are mostly conserved, thereby providing experimental evidence for the preservation of fragment binding modes during molecular growing. Furthermore, small chemical changes in the fragment are tolerated without alteration of the fragment binding mode. The exceptions to this observation generally involve conformational variability of the molecules. Our data analysis also suggests that, provided enough fragments have been crystallized within a protein, good interaction coverage of the binding pocket is achieved. Last, we extended our study to 126 crystallization additives and discuss in which cases they provide information relevant to structure-based drug design.



INTRODUCTION

Fragment-based drug design (FBDD) is a well-established method to find new drug candidates by optimizing small chemical fragments into larger molecules.1,2 As compared to larger ligands, fragments have in principle a higher proportion of functional groups involved in protein binding and many of them precisely fit the target subpockets. Moreover, due to their reduced size and complexity, fragments allow an efficient exploration of protein binding sites.3 Thus, FBDD usually results in higher hit rates than high-throughput screening (HTS) with large molecules, providing good starting points for drug discovery programs.4,5 Examples have shown that FBDD can be successful where other drug discovery programs have failed, e.g., for difficult protein targets or protein−protein interfaces.6

FBDD generally begins with the experimental screening of fragment libraries to determine possible hits. Once their 3D structure with the protein has been determined, they are optimized into larger molecules by growing them into molecules occupying the entire binding pocket or linking fragments binding to different subpockets. At this stage, computational chemists can support the FBDD project by predicting how to grow or link fragments to develop a high-affinity drug-like ligand. Usually, the predictions rely on the assumption that the binding mode of a fragment is unique and that the binding mode of a fragment and its drug-like counterpart will be conserved. However, several studies on ligand deconstruction have shown that this is not always the case. On the basis of eight examples, Kozakov and colleagues showed that fragments coinciding with low-energy hot spots tend to have conserved binding modes.7 The detection of protein hot spots has been proposed to assist fragment selection and elaboration by prioritizing protein subpockets.8

In the current study, we have performed a large-scale analysis of fragment binding modes in crystallographic complexes obtained from the RCSB Protein Data Bank (PDB).9 The first question we asked is how often and why fragments crystallized multiple times in the same protein cavity have variable binding modes. We have evaluated the degree of binding mode conservation between drug-like ligands and their root fragments bound to the same protein target and the influencing parameters. In particular, we have systematically compared interactions made by fragments and their drug-like superstructures bound to the same proteins. Considering all the fragments and drug-like ligands bound to the same protein binding site, we have investigated if fragments cover all the interactions made by the drug-like ligands or if, by contrast, they have specific recognition subsites. We have extended our analyses to small crystallization additives and have characterized which information is relevant for the design of drug-like ligands. Overall, our results provide guidelines for the computational support of an FBDD project where one or more fragment−protein complex structures have been solved.

Received: February 15, 2018
Published: June 15, 2018
© 2018 American Chemical Society

DOI: 10.1021/acs.jmedchem.8b00256 J. Med. Chem. 2018, 61, 5963−5973

Journal of Medicinal Chemistry




RESULTS AND DISCUSSION

In the current study, we explore and compare the binding modes of fragments and drug-like ligands that have been crystallized with the same protein target and within the same ligandable binding site (Figure 1A). For this purpose, we have processed protein complexes from the PDB as described in detail in the Experimental Section. Special care was taken to ensure relevant description of the binding mode by removing complexes with protein pockets containing mutated or missing residues as well as removing small molecules with missing atoms. The quality of 3D-structures was assessed using the EDIA approach recently proposed by Meyder et al.10 On average, the 3D-structure is well covered by the electron density in more than 80% of the studied complexes. Furthermore, in none of them does the structure of the ligand or protein residues fit the corresponding electron density poorly (more statistics are given in the Supporting Information Table S1). The studied data set contains 1079 fragments and 1832 drug-like ligands found in 1404 and 2268 PDB files, respectively. In total, 240 binding sites in 235 different protein targets are considered (Figure 1A). The average molecular weight of fragments and drug-like ligands is 204.5 ± 45.6 and 398.0 ± 79.2 Da, respectively.

Figure 1. Overview of the study. (A) Selection of 3D-structures in the PDB. (B) Numerical representation of binding mode used in this study. For the sake of illustration, depicted is the fragment X76 bound to human CDK2 ATP-binding pocket in the PDB structure 3R1Y. (C) Similarity between the binding modes of two small molecules bound to the same pocket. For the sake of illustration, displayed are the fragment X76 and drug-like ligand Z04 bound to human CDK2 ATP-binding pocket in the PDB structures 3R1Y and 3R7Y (IFP sim = 1). (D) Similarity between the interactions coverage by all fragments and by all drug-like ligands bound to the same pocket. For the sake of illustration, displayed are all the ligands of human CDK2 ATP-binding pocket (cIFP sim = 0.895).

Binding modes are compared by investigating the noncovalent interactions between the small molecule and the protein pocket residues. These include nondirectional (hydrophobic) as well as directional (polar) interactions like hydrogen bonds (H-bonds), aromatic and ionic interactions, and also metal interactions with divalent cations. The numerical representation of interactions involves the definition of a binding site which is common to all the ligands bound to a pocket (Figure 1B). We present here the study of two different aspects of binding modes: binding mode conservation and interaction pattern similarity. Focusing on binding mode conservation, we investigate the binding mode similarity between a single fragment and a single drug-like ligand (one-to-one comparisons scored using IFP similarity, Figure 1C). Focusing on interaction pattern similarity, we study the interactions of all fragments and compare them to the interactions of all drug-like ligands to answer the question of whether the two types of ligands display the same coverage of the pocket interactions (comparisons of consensus binding mode score using cIFP similarity, Figure 1D).

In the last part of the manuscript, we extend the analysis to small crystallization additives and ask the question of whether they contain useful information for FBDD studies. The parsing of the PDB selected 3287 files allowing the comparison of the binding mode of additives with the binding mode of drug-like ligands bound to the same protein pocket (Figure 1). We investigate here 126 additives found in 319 binding sites of 309 proteins. We especially focus on additives bound to a free protein (or apo additives). On average, apo additives have more accurate structures than other additives. About three-quarters of them show good local fit to the electron density (Supporting Information Table S1).

Part I: Fragment Binding Mode Conservation. The general assumption in FBDD when a fragment is extended into a drug-like ligand is that the binding mode of the shared substructure will be conserved. In this section, we focus on the binding mode conservation between a single fragment and a single drug-like ligand on the PDB scale. In particular, we aim at finding answers to these four different questions: (1) Is the binding mode of a single fragment conserved in multiple complexes with the same protein? (2) Is the binding mode conserved when extending the fragment into a drug-like ligand superstructure? (3) Is the binding mode conserved when extending the fragment into a structurally similar drug-like ligand? (4) Is there a correlation between fragment size and binding mode conservation?

Binding Mode Conservation of Fragments within the Same Protein Pocket. We investigated the binding mode of 453 fragments that have been crystallized multiple times with the same protein and within the same, ligandable, binding cavity. These fragments are found in 501 complexes, involving 152 binding sites in 149 proteins. In total, 1502 3D-structures are considered. Of note, a single PDB file can contain several biounits, generally distinguished by different chain names, and therefore can provide more than one 3D-structure for the same complex. More than two-thirds of the 501 complexes have only two 3D-structures, while a few of them have more than 10 copies in our data set (Supporting Information Figure S1). For each complex, we evaluate the degree of binding mode conservation by considering the minimum IFP similarity value obtained for the comparisons of all pairs of their 3D-structures (Figure 1C). IFP similarity values range from zero (no common interactions) to 1 (exactly the same interactions). IFP


defines the position and orientation of the fragment in its binding pocket (Supporting Information Table S2). Considering all the studied complexes, many nonconserved binding modes correspond to cases where the bound conformation of a fragment is different in the multiple crystallographic structures of the same complex or where a structural change occurs in the protein site. The removal of fragments with variable conformations in the same protein site from the data set results in an increased binding mode conservation (Supporting Information Figure S2B). The filtering of fragment structures where atoms are poorly resolved also discards many cases of low binding mode conservation. We found only 12 cases where very well-defined fragments have low binding mode conservation while no structural changes occur in the binding site and the fragments exhibit the same conformation and orientation in the site. In these cases binding mode similarity is underestimated due to missed interactions caused by differential protonation or a threshold effect in the detection of interactions. Overall, PDB fragments tend to exhibit the same position and orientation when crystallized multiple times within the same pocket (rmsd max […] 300 Da, because the larger molecule does not comply with the rule of five, or because of mutations in the binding site).

Figure 3. Analysis of 359 fragment−ligand substructure pairs. (A) Overall binding mode similarity considering all interactions (left) and polar interactions (right). The median for each substructure pair is shown. (B) Effect of fragment binding mode variability on the binding mode similarity of fragment−ligand substructure pairs. Fragment CK2 (PDB structures 1PXJ and 2C5O) and its drug-like superstructure CK8 (PDB structure 2C5N) bound to human cyclin dependent kinase 2. The protein in its active form is displayed as ribbon, the fragment (orange) and ligand (green) are displayed as sticks, and hydrogen bonds in the hinge region of the protein are in yellow. For the sake of comparison, the inactive form of the protein is shown as a light orange ribbon on the right panel. The proportion of interactions conserved is high (0.89) when considering fragment and ligand binding to the active form of the protein (left) and decreases (0.57) when considering fragment bound to the inactive form and ligand bound to the active one (right). (C) Overall pose similarity considering shape overlap (ROCS shape, left) and chemistry match (ROCS color, right). The median for each substructure pair is shown.
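As a rough illustration of the quantities reported in Figure 3, the sketch below scores one fragment structure against one ligand structure as the proportion of the fragment's interactions that the ligand also makes, then takes the median over all structure pairs. The set-of-tuples fingerprint encoding, the function names, and the residue labels are assumptions made for illustration only; the exact IFP encoding and scoring used in the study may differ.

```python
from itertools import product
from statistics import median

# An interaction fingerprint is modelled here as a set of
# (residue, interaction type) pairs. This encoding is an assumption.
IFP = frozenset


def conserved_fraction(fragment_ifp, ligand_ifp):
    """Proportion of fragment interactions also present in the ligand."""
    if not fragment_ifp:
        return 0.0  # no fragment interactions to conserve
    return len(fragment_ifp & ligand_ifp) / len(fragment_ifp)


def median_pair_similarity(fragment_ifps, ligand_ifps):
    """Median score over every fragment structure x ligand structure pair."""
    scores = [
        conserved_fraction(f, l) for f, l in product(fragment_ifps, ligand_ifps)
    ]
    return median(scores)


# Hypothetical residue labels, for illustration only.
ligand = IFP({("L83", "HB"), ("E81", "HB"), ("F80", "HYD"), ("K33", "IONIC")})
fragment_copy_1 = IFP({("L83", "HB"), ("E81", "HB"), ("F80", "HYD")})
fragment_copy_2 = IFP({("L83", "HB"), ("F80", "HYD"), ("D86", "HB")})

print(median_pair_similarity([fragment_copy_1, fragment_copy_2], [ligand]))
```

Taking the median rather than the minimum makes the pair score tolerant of a single divergent 3D-structure, which is the effect the CK2/CK8 example in panel B illustrates.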

adenine has versatile binding modes and is indeed able to use different functional groups to recognize different proteins, including close homologs.14 A variable binding mode of a fragment in its target site, although not frequent in our data set, is likely to decrease the median similarity computed from all 3D-structures of a pair. As shown in the previous part, variations in the protein structure often explain inconsistent binding modes. Classifying pairs in the data set according to the rmsd of the site Cα atomic coordinates indeed reveals that the ability of fragments to maintain their binding modes is inversely correlated to protein site flexibility (Supporting Information Figure S3). As an example, the fragment CK2 was crystallized in the active and inactive forms of human cyclin-dependent kinase 2 where it adopts two different poses to fit to changes in the ATP-binding pocket.15 CK2 is a substructure of the drug-like ligand CK8, which was crystallized in the active form of the protein. Comparing the binding modes of CK2 and CK8 thus yields inconsistent similarity values (Figure 3B). This result stresses again the importance of considering protein flexibility during fragment elaboration. It also suggests that interactions are better preserved than binding poses. Therefore, we also determined for all the 359 fragment−ligand substructure pairs whether the fragment pose is conserved (Figure 3C). Pose similarity is here computed as shape overlap and


In our previous study on four targets which are overrepresented in the PDB,13 we also observed that there is a relationship between the fragment size and the binding mode conservation. In particular, in 90% of the studied complexes with conserved binding modes, the number of additive/fragment heavy atoms was higher than 8 and the MW was higher than 110 Da. We here extend our analysis to the 359 substructure pairs and the 1533 chemically similar pairs in the PDB (Supporting Information Figure S5). We confirm that binding mode is overall conserved if the fragment MW is high enough, with a threshold around 150 Da. The same trend is observed for the fragment binding pose, although the level of conservation is generally lower, as previously mentioned in the analysis of substructure pairs (Figure 3). Are there other reasons why a fragment binding pose varies? Case studies are more suitable to unravel the cause of change in binding mode than a global analysis. For example, Schauperl et al. used molecular dynamics simulations to provide a thermodynamical understanding of the variable binding mode of fragments in complex with the TGFBR1 kinase.20

Part II: Similarity between Fragment and Drug-like Interaction Patterns. In this section, we examine the coverage of protein binding site by fragments and drug-like ligands. Therefore, all available fragment- and drug-like ligand−protein complexes of the same target are considered and global interaction patterns of the fragment and drug-like ligand sets are compared.

Only a Few Ligandable Proteins in the PDB Have High Interaction Coverage Data Sets. In the entire PDB we found 240 ligandable sites, in 235 proteins, in complex with both fragment and drug-like ligands. When only one fragment and one drug-like ligand complex are available, different levels of interaction pattern similarity are observed. Therefore, we hypothesized that the more fragments and drug-like ligands are available for a given target, the greater the chance of mapping the entire pocket and thus of observing high interaction coverage. Indeed, our hypothesis was confirmed, as indicated in Figure 4. As a rule of thumb, data suggest that at least nine different fragments and nine different drug-like ligands in complex with the same target are necessary to observe good interaction coverage between the two sets. In our PDB data set, there are only 11 proteins which fulfill this rule (“ligand- and fragment-rich targets”; Table 1). They however belong to diverse functional classes.

Conditions for Similar Coverage of the Pocket Interactions by Fragment and Drug-like Ligands. When investigating the pocket properties of the 11 ligand- and fragment-rich targets, we observe large variability in size, polarity, and flexibility (Supporting Information Table S3). The size of the pocket, for instance, varies between 25 and 60 residues and the average volume between 327 and 739 Å³. It is interesting to note that the two largest pockets when considering the residue count (BACE1 and LTA4H) have the lowest interaction coverage between ligands and fragments, while the smallest pocket (BRD4) has the highest interaction pattern similarity. For the other ligand- and fragment-rich target properties, however, no correlation with the interaction pattern similarity is apparent. When extending the analysis to all targets with good interaction coverage (≥0.6; 81 targets), the pocket properties are even more variable. The count of pocket residues in those targets lies between 20 and 78 residues, the average volume between 150 and 1037 Å³, the volume variation between 155 and 1121 Å³, the average Cα rmsd to the reference structure between 0.1 and 4.3 Å, and the average percentage of polar points between 27% and 94%. Taken together, the results suggest that high coverage of fragment and drug-like ligand interactions can be observed independent of the pocket properties.

We then verified whether the similarity of the interaction patterns is not overestimated due to low chemical diversity of the ligands. The structural diversity within the fragment and drug-like ligand sets of the ligand- and fragment-rich targets ranges between medium and high (Table 1, Supporting Information Table S3). In most of the cases, the diversity within the fragment set is higher than the diversity of the drug-like ligands. By comparison of the fragments and the drug-like ligands, interset similarity is relatively low, although there are, for 10 of the 11 targets, a few pairs where the fragment is an exact or a close substructure of the drug-like ligand (Supporting Information Figure S6). The exception is LTA4H, where the maximal similarity of fragments and ligands is 0.63 and the interaction pattern similarity is relatively low (0.425). The low similarity is however not due to fragments and drug-like ligands interacting mainly with specific residues, but to an unbalanced distribution of fragments and ligands within the binding site (Figure 5). Whereas drug-like ligands (green) are evenly distributed over the entire LTA4H pocket, all but one fragment (magenta) are found on the left side of the pocket. Correspondingly, the distribution of fragments and drug-like ligands in the BACE1 pocket (Supporting Information Figure S7) also explains the rather low interaction pattern similarity observed for this target, stressing that our computing approach can underestimate interaction coverage in the cases of poor spatial coverage.

Lastly, we asked ourselves whether some polar interactions are specific to fragment or drug-like ligands. We observe that on average 20% of polar interactions found in fragment complexes are unique and thus not found in drug-like ligand complexes. For example, fragments form two specific H-bonds with LTA4H (Figure 5A). Considering the 11 ligand- and fragment-rich proteins, there are target-dependent differences in the percentage of unique polar interactions (Supporting Information Table S4). The values can vary between 0% and

Figure 4. Similarity between fragment and drug-like interaction patterns. Dependence of fragment−ligand interaction pattern similarity on the number of different molecules considered. In each step, all targets with at least X (1, 2, 3, etc.) HET codes of both fragments and drug-like ligands are considered. Number of distinct binding sites is shown in red.
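The stepwise filtering described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, assuming each target is summarized by its number of distinct fragment HET codes, its number of distinct drug-like ligand HET codes, and one cIFP similarity value; the function name, the choice of the median as summary statistic, and the toy numbers are all assumptions, not data from the study.

```python
from statistics import median


def similarity_by_min_count(targets, max_x):
    """For X = 1..max_x, keep targets with at least X fragments AND at least
    X drug-like ligands, then report (X, number of targets, median similarity)."""
    rows = []
    for x in range(1, max_x + 1):
        kept = [sim for n_frag, n_lig, sim in targets if n_frag >= x and n_lig >= x]
        if not kept:
            break  # no target left at this threshold
        rows.append((x, len(kept), median(kept)))
    return rows


# Toy input: (n_fragment_HET_codes, n_ligand_HET_codes, cIFP similarity).
toy_targets = [(1, 1, 0.2), (3, 2, 0.5), (9, 12, 0.8)]

for x, n_targets, med in similarity_by_min_count(toy_targets, max_x=3):
    print(x, n_targets, med)
```

Because both counts must reach X, a target with many ligands but a single fragment drops out at X = 2, which is why the number of binding sites shrinks quickly as X grows.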


Table 1. Ligand- and Fragment-Rich Targets (a)

(a) The colors indicate the diversity within the fragment or ligand sets. Green: high diversity (average(1 − similarity) > 0.75). Light green: medium diversity (average(1 − similarity) > 0.6). Tanimoto similarity between molecules is calculated using ECFP2 fingerprints.
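The diversity measure in the Table 1 footnote, average(1 − similarity), can be sketched as below. Fingerprints are represented as plain sets of on-bit indices so the example stays self-contained; in practice they would come from a cheminformatics toolkit. Averaging over all unordered pairs within a set and the "low" label for values at or below 0.6 are assumptions, since the footnote only defines the high and medium classes.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def set_diversity(fingerprints):
    """Average of (1 - Tanimoto) over all unordered pairs in the set."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("need at least two molecules")
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


def diversity_class(diversity):
    """Thresholds from the footnote; the 'low' label is an assumption."""
    if diversity > 0.75:
        return "high"
    if diversity > 0.6:
        return "medium"
    return "low"


# Made-up on-bit sets standing in for ECFP2 fingerprints.
toy_fingerprints = [{1, 2, 3}, {3, 4, 5}, {6, 7}]
d = set_diversity(toy_fingerprints)
print(round(d, 3), diversity_class(d))
```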

39% for unique H-bonds and between 0% and 100% for unique aromatic bonds. It could be assumed that a factor that influences the results is the nature of the fragments and drug-like ligands in the data set. However, this is not confirmed for the ligand- and fragment-rich targets, where neither the overall fragment−ligand set similarity (Supporting Information Table S3 and Figure S6) nor the count of H-bond donors or acceptors (data not shown) correlates with the observed unique interaction ratio. The differences cannot be simply explained by the diverse polarities of the corresponding binding sites (Supporting Information Table S4).

Part III: Can Crystallization Additives Be Regarded as Fragments? In contrast to fragments that are crystallized with the target protein on purpose, other small molecules identified in PDB structures had been unintentionally incorporated into the protein crystals.21 These compounds are buffers, reducing agents, ions, detergents, or cryoprotectants added to the experimental sample for solubilizing and stabilizing the protein. They can also be precipitants and additives of various chemical nature and size added to the experimental sample for promoting crystal formation. A study of additive interactions in crystals grown in different experimental conditions has emphasized the important role of additives in the crystal formation, showing that additives directly mediate intermolecular crystal contacts or induce conformational changes at the protein surface.22,23 An interesting case is the additive benzamidine, which enhances the crystallization of trypsin when it is bound to the active site.23 Benzamidine is actually also a substructure of drug-like competitive inhibitors of trypsin, and its binding mode to the enzyme active site is indeed preserved in the drug-like superstructures;13 thereby benzamidine can also be regarded as a fragment. We wondered here whether we can find in the PDB other examples of protein−additive complexes containing useful information for computational drug design.

Focusing on small additives with a MW below 300 Da, we repeated the analyses made for fragments (Figure 1): we first checked whether additives have a consistent binding mode in a protein site, next we compared additives and drug-like ligands binding modes, pair-by-pair for each protein, and last, we considered the coverage of binding site interactions by additives and by drug-like ligands. We distinguished between additives bound to a free protein (apo additives) and those present in a protein in complex with another molecule, e.g., a fragment or drug-like ligand. This distinction is important, since additives can hardly compete with a larger and stronger ligand for targeting protein hot spots. When considering the additives as a whole, we found that additives and fragments behave differently, while when focusing on apo additives, we observed common trends (see Supporting Information Figures S8, S9, and S10).

An example of a target with many fragment− and additive−drug-like ligand substructure pairs is human macrophage metalloelastase (MMP12). As shown in Figure 6, both structures of the additive HAE, one being an apo additive structure, show a perfect conservation of the binding mode to the drug-like ligand superstructure CGS. Apart from hydrophobic interactions, the metal interactions with the zinc ion as well as an H-bond to Ala182 are conserved. Another example of well-conserved binding modes is displayed in Figure 6. Bacillus thermoproteolyticus thermolysin is one of the targets with multiple apo additive complexes. In a previous study, the benzene ring of the fragment N-(phenylcarbonyl)-β-alanine (BYA) has been identified as a hot spot because several ligands place this ring at the given position.8 Interestingly, several apo additives map the directional interactions of BYA (Figure 6) and show good binding mode conservation. In this case, all of the additives are very small (four or five heavy atoms), indicating that a certain molecular complexity is not necessary to observe binding mode conservation.


Figure 6. Examples of additive binding mode conservation. (Left panel) Example of the binding mode conservation of an additive/drug-like ligand substructure pair. Example of the drug-like ligand CGS (green, PDB code 1JIZ) and its additive substructure HAE bound alone (white, PDB code 1OS2) and in the presence of the ligand EEG (light pink, PDB code 3LIK) in the human macrophage metalloelastase (MMP12) pocket. The protein is shown as green cartoon, and important protein residues are shown as sticks. Polar interactions are indicated as dotted lines. The zinc ion is shown as a small sphere. (Right panel) Example of apo additives matching the polar interactions of a fragment. Bacillus thermoproteolyticus thermolysin bound to the fragment BYA (green, PDB code 3FGD) and three examples of apo additives: 2-bromoacetate (cyan; PDB code 3NN7), S-1,2-propanediol (blue, PDB code 3N21), and acetic acid (white, PDB code 2A7G). The small molecules are shown as sticks, the protein is shown as cartoon, the interacting protein residues are shown as lines, and the zinc ion is shown as a small sphere. Polar interactions are indicated as dotted lines: hydrogen bonds in green and metal interaction in yellow. All electron densities have been deposited and correspond with the shown structures.

(e.g., phosphate and Tris buffer), long straight-chain compounds (e.g., fatty acid and polyethylene glycol), small polyols (e.g., 1,2-ethanediol), small aromatic compounds (e.g., benzoic acid), and other polar cyclic or linear compounds of various sizes (e.g., DMSO, panthothenic acid, or MES buffer). We thus questioned whether binding mode conservation depends on the chemical nature of additives. When we investigated the variability of additive binding modes within the same protein, we observed that although there are high discrepancies in the classes containing the smallest compounds (small anions, small aromatic compounds, small inorganic compounds, and small polyols), more than half of the additives cluster in three or less spots of a protein site (Supporting Information Figure S11). For example, the 10 poses of malonate bound to E. coli aminopeptidase N pocket cluster into three groups in which the orientation and interactions are well preserved (Figure 7, left). By contrast, the 18 poses of glycerol bound to the same protein pocket define a continuous shape overlapping the hydrophobic parts of the drug-like ligand actinonin (Figure 7, right). Exploring the interactions made by additives revealed that the small anions can act as molecular probes to reveal charged regions in the protein pocket and that the long straight-chain compounds can act as molecular probes to reveal hydrophobic regions. More surprisingly, we found that polar additives such as small polyols or DMSO are largely engaged in hydrophobic contacts (Supporting Information Figure S12). The abovedescribed example of glycerol bound to E. coli aminopeptidase N pocket well illustrates that small polyols tend to bind efficiently to hydrophobic regions of the protein. In the 18 structures available for glycerol bound to E. 
coli aminopeptidase N, we identified 86 interactions between the additive and protein, including 56 hydrophobic contacts with 10 protein residues and 30 H-bonds with 14 protein residues. Two additional examples of polar additives engaged in an H-

Figure 5. Coverage of human leukotriene A-4 hydrolase pocket. (A) Drug-like ligand (left) and fragment (right) interaction heatmaps. Different types of interactions (hydrophobic, HYD; aromatic, AROM; hydrogen bonding, HB; ionic, IONIC; metal, METAL) are displayed on the X-axis. The binding site residues (one-letter code, residue number, chain) are displayed on the y-axis. The color intensity describes the frequency of the observed interaction in all complexes for this set (e.g., fragments). Residue number 1 is zinc. (B) Overlay of all fragments and drug-like ligands in the reference 3D-structure. Drug-like ligands (green) and fragments (magenta) are shown as sticks, the binding site is shown as surface, and the residues of the binding site are shown as lines.

When considering chemical structure, the set of additives is very heterogeneous. There are small inorganic and organic ions 5969

DOI: 10.1021/acs.jmedchem.8b00256 J. Med. Chem. 2018, 61, 5963−5973

Journal of Medicinal Chemistry

Article

protein 2. Of note, the additive was crystallized alone in the protein in the two examples (apo additive).



CONCLUSIONS The PDB-wide analysis of fragment binding modes has answered several questions related to FBDD projects (see Table 2). The study revealed that a fragment crystallized multiple times in the same protein cavity tends to occupy the same position and form the same interactions in the binding pocket. In addition, interactions made by fragments are well conserved in the related drug-like ligands, provided fragment MW is larger than 150 Da. Directional interactions, especially H-bonds, are highly conserved. Examples of variable binding modes drew our attention to protein flexibility and to the ability of some ligands to adopt multiple bound conformations. Growing fragments into structurally similar ligands results in good conservation of polar interactions provided the chemical modification does not induce a change in the protein structure. When comparing interaction coverage of fragments and drug-like ligands, we observed that the more fragments and drug-like ligands have been crystallized with a protein, the better is the sampling of interactions in its binding site. We evaluated that nine different fragments are enough to achieve good interaction coverage for diverse pockets. Besides, fragments tend to have unique polar interactions (not seen in complexes with drug-like ligands) in polar protein pockets. The study of small additives binding to ligandable PDB proteins showed that clusters of apo additives usually reveal interactions made by drug-like ligands. Interestingly, binding of small polyols and small polar compounds (e.g., glycerol or DMSO) involves both H-bonds and hydrophobic contacts. In conclusion, from a purely structural perspective, small additives cannot be regarded as fragments because their binding modes are too variable; nevertheless they provide key information on the protein binding properties, especially revealing pharmacophoric anchor points in the protein and supporting target ligandability. 
The main conclusions of the study are summarized in Table 2.

Figure 7. Examples of multiple additive poses in a protein site. (Left panel) Malonate (MLI, pink) bound to E. coli aminopeptidase N pocket (UniProt code P04825). The protein is shown as green cartoon and zinc ion as sphere. (Right panel) Glycerol (GOL, pink) and actinonin (BB2, green, PDB code 4Q4E) bound to E. coli aminopeptidase N pocket. The protein is shown as green cartoon, and hydrophobic protein residues in site are shown as lines.

bond on one side and in hydrophobic contact on the other side are shown in Figure 8. Glycerol makes H-bonds with the hinge

Figure 8. Examples of additive interactions. (Left panel) Glycerol (GOL, pink) and drug-like inhibitor (OFG, green, PDB code 4CCB) bound to the hinge region of the ALK tyrosine kinase receptor ATP binding site. (Right panel) 1,2-Ethanediol (EDO, pink) and the inhibitor GSK525762 (EAM, green, PDB code 2YEK) bound to bromodomain-containing protein 2. The protein is shown as green cartoon, and hydrophobic protein residues in site are shown as lines. Hydrogen bonds are indicated as yellow dotted lines.

region of the ALK tyrosine kinase receptor, like most inhibitors of kinase ATP binding sites. The carbon atoms of glycerol face hydrophobic side chains in the protein pocket. Another example is 1,2-ethanediol which is engaged in two H-bonds with a hydrophobic site of the bromodomain-containing



EXPERIMENTAL SECTION

Data Set and General Procedure. The analysis was performed on publicly available data from the RCSB PDB Web site (ref 9). Crystal

Table 2. Main Conclusions for FBDD

| question | answer | cases | comments |
| --- | --- | --- | --- |
| Does a fragment always bind a protein pocket in a similar way? | Yes | 74% (IFP sim ≥ 0.6) | Exceptions: protein flexibility, multiple conformations of fragment, multiple tautomeric states, multiple molecules within the binding pocket. |
| Is the binding mode between a fragment and its drug-like superstructure conserved? | Yes | 75% (IFP sim ≥ 0.8) | Exceptions: protein flexibility. |
| Is the binding mode between a fragment and a similar drug-like ligand conserved? | Yes | 62.5% (ECFP2 sim ≥ 0.7, IFP sim ≥ 0.8) | Interactions are better conserved than pose. Exceptions: protein flexibility. |
| Is there an influence of fragment size on binding mode conservation? | Yes | Inferred from 623 similar fragment−ligand pairs | Polar interactions are better conserved. Binding mode is generally conserved if MW > 150 Da. Same trend is observed for binding pose. |
| Is there a minimal data set size to observe interaction pattern similarity between fragment and ligand complexes? | Yes | Inferred from 235 proteins binding 1079 fragments and 1832 drug-like ligands | >9 different fragments and ligands should be crystallized in the protein pocket (fulfilled by 11 proteins only). |
| Can crystallization additives be regarded as fragments in drug design? | No | Inferred from 309 proteins binding 1079 fragments and 1832 drug-like ligands | Additive binding modes are more variable than those of fragments and rarely bound alone in the pocket. apo additives however reveal interactions made by drug-like ligands. |


structures of additive, fragment, or ligand−protein complexes were filtered as described previously (ref 13). Briefly, structures with a resolution of ≤3 Å that were deposited between January 2000 and August 2016 and that contain at least one protein chain were analyzed. Fragments were defined as small molecules with a molecular weight (MW) below 300 Da and between 2 and 18 heavy atoms, whereas ligands were defined as nonfragments following the scPDB (ref 24) ligand rules as well as the rule of 5 (ref 25) with up to one exception. Crystallization additives were removed from the list of fragments and ligands and treated separately if they were found to agree with the fragment rules (small additives).

Preparation of PDB Files. All PDB files were downloaded from the RCSB PDB Web site and prepared in a number of steps, similar to the previously described procedure (ref 13). First, protein complexes were protonated using Protoss (ref 26) and their binding pockets were identified as all residues having at least one atom within 6.5 Å of the ligand. Protein residues were renumbered to match the Universal Protein Resource (UniProt) (ref 27) residue numbering. This was achieved using a protein chain sequence alignment with the EMBOSS needle package (ref 28) or, in cases with many gaps and mismatches, a structure-based sequence alignment with MOE (Molecular Operating Environment, 2016.08; Chemical Computing Group ULC, Montreal, Canada).

For each target with drug-like ligands, one or multiple “common” binding sites were defined as follows. A first distinction was made between pockets containing one (“monomeric” pockets) or multiple protein chains (“dimeric” pockets, etc.). Because the study focuses on the comparison of binding modes within one protein target, heteromultimeric pockets were not considered in the analysis. Second, drug-like ligand binding sites were clustered based on their residue overlap to distinguish between spatially separated sites of a target. This was achieved using a hierarchical clustering with average linkage. When two site clusters show a residue overlap of at least 5%, the clusters are merged. For each binding site cluster, residues occurring in more than 10% of sites were defined as the “common” site. All PDB files containing mutations or missing residues in the common site were removed from the analysis. The remaining PDB files for each target and common site were superimposed to a reference structure using CE (ref 29). The reference was chosen as the centroid according to binding site similarity calculated with the Shaper software (ref 30).

Finally, binding modes of all molecules overlapping with the drug-like ligand cavity were investigated in terms of interaction fingerprints (IFPs) (ref 31) calculated using the in-house IChem software (ref 32). Interaction fingerprints represent the presence and absence of specific interaction types with the binding site residues. Default IChem settings were used to detect H-bonding (distance between H-bond donor atom and H-bond acceptor atom is lower than 3.5 Å, and H-bond angle is 180° ± 60°), ionic interaction (distance between positively and negatively charged atoms is lower than 4.0 Å), and interaction between a H-bond acceptor and a metallic cation (distance is lower than 2.8 Å). Hydrophobic contacts were detected between carbon atoms if the interatomic distance is lower than 5 Å. Aromatic interactions were detected as follows: in π−π interactions, the distance between π ring centers is lower than 5.0 Å and the angle between π ring planes is 180° ± 30° or 90° ± 60°; in π−cation interactions, the distance between π ring center and aromatic cation is lower than 5.0 Å and the angle between planes is 180° ± 30°. Water molecules, which are not built in all the studied PDB structures, were not considered in the analysis.

Comparison of Binding Modes and Interaction Patterns. To compare the binding mode conservation of the same fragment within a protein pocket, the IFP similarity was calculated using the Tanimoto metric (Tc) and the minimal observed similarity for a given molecule and target was investigated. It was also determined whether the fragment in different PDB structures binds to the protein pocket with a similar pose and with a similar conformation. To that end, the rmsd of non-hydrogen atom coordinates of the fragment was computed using a script in Python 2.7.14, respectively, before and after optimal superposition of the compared fragment poses using the maximum common substructure alignment from the oechem library (OpenEye Scientific, USA).
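The same-fragment comparison described above can be illustrated with a minimal sketch. It is not the authors' script: it assumes Python 3, binary IFPs stored as equal-length lists of 0/1, and heavy atoms that have already been matched between poses (the maximum common substructure step is not reproduced). The function names and the example fingerprints are hypothetical, and the separate treatment of polar and hydrophobic interactions is left out for brevity.

```python
import math
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient (Tc) between two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    on_a = sum(1 for a in fp_a if a)
    on_b = sum(1 for b in fp_b if b)
    common = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    union = on_a + on_b - common
    # Assumption: two fingerprints without any on-bit count as identical.
    return 1.0 if union == 0 else common / union


def minimal_similarity(fingerprints):
    """Lowest pairwise Tc among the IFPs of one fragment in one pocket.

    Requires at least two fingerprints.
    """
    return min(tanimoto(a, b) for a, b in combinations(fingerprints, 2))


def rmsd(coords_a, coords_b):
    """rmsd between matched heavy-atom coordinates, without superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally sized coordinate lists")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical IFPs of the same fragment in three structures of one pocket.
ifps = [
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
]
print(minimal_similarity(ifps))  # 0.5
```

Reporting the minimum rather than the mean makes a single divergent binding mode visible even when most structures of the fragment agree.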

To compare the binding mode conservation of a fragment (or an additive) with the binding mode of a drug-like ligand within the same protein pocket, IFP similarity was calculated as a proportion (Pr) of the common interactions divided by the smaller molecule’s interactions, as described by the following formula:

$$\mathrm{Pr}_{AB} = \frac{o_{AB}}{u_{A} + o_{AB}}$$

where molecule A is smaller than molecule B (i.e., A is a fragment or an additive and B is a drug-like ligand), $o_{AB}$ is the count of overlapping interactions, and $u_{A}$ is the count of interactions unique to molecule A.

For the one-to-one comparisons between a fragment and its drug-like superstructure within the same protein pocket (i.e., the substructure pairs), it was also determined whether the fragment alone and included in a larger ligand binds to the protein pocket with a similar pose. To that end, shape overlap and chemistry match were computed using ROCS v3.2.0.4 (OpenEye Scientific, USA). Importantly, only the scoring calculation was performed, since all the structures had been 3D-aligned onto the reference structure of the protein pocket beforehand. The shape overlap (ROCS shape) and chemistry match (ROCS color) were evaluated using the Tversky and ColorTversky scores, respectively.

To compare the interaction patterns of all the fragments and all the ligands bound to the same protein pocket, consensus interaction fingerprints (cIFPs) were generated. These are numerical fingerprints describing the frequency of each interaction within the given set (fragments or drug-like ligands). To describe the similarity between two cIFPs, the Tanimoto coefficient, defined for continuous variables (ref 33), was used. In addition, numerical cIFPs were converted into binary fingerprints to study the ratios of unique and shared interactions between two molecule sets. Different thresholds to convert the fingerprints were tested, but the clearest results were generally obtained when a threshold of >0 was set. Using this threshold, every interaction observed at least once is converted into an on-bit in the binary fingerprint. Importantly, all interaction similarities (as expressed by IFP Tc, IFP Pr, or cIFP similarity) were calculated separately for hydrophobic and polar interactions and both weighted with 0.5.

Molecular Properties. To investigate the similarity/diversity within a molecule set as well as between sets, extended-connectivity fingerprints with the radius 2 (ECFP2) were calculated using Pipeline Pilot 2016 (BIOVIA, Dassault Systèmes, France). Furthermore, two molecular properties were tested for their correlation with binding mode similarity: molecular weight and the count of heavy atoms. Both were also calculated with Pipeline Pilot 2016.

Pocket Properties. Properties of binding sites were calculated using the in-house IChem software (Volsite, ref 30). For each common binding site, the average descriptors of all pockets of this target were determined. Apart from calculating the pocket volume, the software determines all possible interaction points describing different molecular interactions (e.g., hydrophobic points, hydrogen bonding points, etc.). For the pocket volume, the pocket volume variation was also determined by subtracting the minimum from the maximum volume.

Box Plot. Box plots were generated by matplotlib 2.1.0 in Python 2.7.14. Only Figure S12 uses the display obtained using the default settings. In the other figures, the box gives the first and third quartiles, the median is shown in red, the whiskers indicate the first and ninth deciles, and outliers are not represented.

A Web interface has been designed to query the data presented in this article. It is freely available at bioinfo-pharma.u-strasbg.fr/PDBmob.
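The proportion Pr, the 0.5/0.5 weighting of polar and hydrophobic similarities, and the consensus fingerprints described under Comparison of Binding Modes and Interaction Patterns can be sketched as follows. This is an illustration under stated assumptions, not the IChem implementation: it assumes Python 3 and binary IFPs as equal-length lists of 0/1, all function names and example data are hypothetical, and the continuous Tanimoto is written in its common form, sum(xy) / (sum(x²) + sum(y²) − sum(xy)), which is assumed to match the definition cited in the text.

```python
def proportion(ifp_small, ifp_large):
    """Pr_AB = o_AB / (u_A + o_AB), with A the smaller molecule."""
    overlap = sum(1 for a, b in zip(ifp_small, ifp_large) if a and b)
    unique_small = sum(1 for a, b in zip(ifp_small, ifp_large) if a and not b)
    total = unique_small + overlap
    # Assumption: a smaller molecule with no interaction at all scores 0.0.
    return 0.0 if total == 0 else overlap / total


def weighted_similarity(polar_score, hydrophobic_score):
    """Combine polar and hydrophobic similarities, each weighted with 0.5."""
    return 0.5 * polar_score + 0.5 * hydrophobic_score


def consensus_ifp(ifps):
    """Frequency of each interaction over a set of binary IFPs (cIFP)."""
    count = len(ifps)
    return [sum(column) / count for column in zip(*ifps)]


def continuous_tanimoto(x, y):
    """Tanimoto coefficient for continuous-valued fingerprints."""
    dot = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Assumption: two all-zero fingerprints count as identical.
    return 1.0 if denominator == 0 else dot / denominator


def binarize(cifp, threshold=0.0):
    """Set a bit on when the interaction frequency exceeds the threshold."""
    return [1 if frequency > threshold else 0 for frequency in cifp]


# Hypothetical fragment (A) and drug-like ligand (B) in the same pocket.
pr_polar = proportion([1, 1, 0, 0], [1, 0, 1, 1])       # 1 / (1 + 1)
pr_hydrophobic = proportion([1, 1, 0], [1, 1, 1])       # 2 / (0 + 2)
print(weighted_similarity(pr_polar, pr_hydrophobic))    # 0.75

# Hypothetical cIFP built from two fragment IFPs.
fragment_cifp = consensus_ifp([[1, 0, 0], [1, 1, 0]])
print(fragment_cifp)                                    # [1.0, 0.5, 0.0]
print(binarize(fragment_cifp))                          # [1, 1, 0]
```

Because Pr is normalized by the smaller molecule's interactions only, the extra contacts made by the larger ligand do not lower the score, which suits a fragment-to-ligand comparison better than a symmetric Tanimoto coefficient.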



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jmedchem.8b00256.


deviation; TGFBR1, transforming growth factor β receptor type 1; Tris, tris(hydroxymethyl)aminomethane

Molecular formula strings and PDB and HET codes (XLSX)

Number of structures available for fragment−protein complexes (Figure S1), effect of conformational change and structure quality on the binding mode conservation of the same fragment within the same binding pocket (Figure S2), effect of protein flexibility on binding mode conservation (Figure S3), dependence of fragment−ligand binding mode conservation on chemical similarity (Figure S4), fragment size and conservation of binding mode and binding pose (Figure S5), chemical similarity between fragment and ligand sets for the fragment- and ligand-rich targets (Figure S6), coverage of human β secretase pocket (Figure S7), binding mode conservation of the same additive bound to multiple structures of the same protein site (Figure S8), binding mode conservation of additive−ligand substructure pairs (Figure S9), dependence of additive−drug-like ligand interaction pattern similarity on the number of different molecules (Figure S10), variability of additive binding mode in multiple structures of the same protein site (Figure S11), nature of interaction made by additives (Figure S12), 3D-structure quality evaluated using the electron density support for individual atoms (Table S1), EDIA score of the complexes described in the figures (Table S2), properties of ligand- and fragment-rich targets (Table S3), and unique polar interactions of targets with many fragments and ligands (Table S4) (PDF)



REFERENCES

(1) Doak, B. C.; Norton, R. S.; Scanlon, M. J. The Ways and Means of Fragment-Based Drug Design. Pharmacol. Ther. 2016, 167, 28−37.
(2) Rees, D. C.; Congreve, M.; Murray, C. W.; Carr, R. Fragment-Based Lead Discovery. Nat. Rev. Drug Discovery 2004, 3 (8), 660−672.
(3) Kalliokoski, T.; Olsson, T. S. G.; Vulpetti, A. Subpocket Analysis Method for Fragment-Based Drug Discovery. J. Chem. Inf. Model. 2013, 53 (1), 131−141.
(4) Carr, R. A. E.; Congreve, M.; Murray, C. W.; Rees, D. C. Fragment-Based Lead Discovery: Leads by Design. Drug Discovery Today 2005, 10 (14), 987−992.
(5) Hajduk, P. J.; Greer, J. A Decade of Fragment-Based Drug Design: Strategic Advances and Lessons Learned. Nat. Rev. Drug Discovery 2007, 6 (3), 211−219.
(6) Price, A. J.; Howard, S.; Cons, B. D. Fragment-Based Drug Discovery and Its Application to Challenging Drug Targets. Essays Biochem. 2017, 61 (5), 475−484.
(7) Kozakov, D.; Hall, D. R.; Jehle, S.; Luo, L.; Ochiana, S. O.; Jones, E. V.; Pollastri, M.; Allen, K. N.; Whitty, A.; Vajda, S. Ligand Deconstruction: Why Some Fragment Binding Positions Are Conserved and Others Are Not. Proc. Natl. Acad. Sci. U. S. A. 2015, 112 (20), E2585−E2594.
(8) Rathi, P. C.; Ludlow, R. F.; Hall, R. J.; Murray, C. W.; Mortenson, P. N.; Verdonk, M. L. Predicting “Hot” and “Warm” Spots for Fragment Binding. J. Med. Chem. 2017, 60 (9), 4036−4046.
(9) Berman, H. M. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235−242.
(10) Meyder, A.; Nittinger, E.; Lange, G.; Klein, R.; Rarey, M. Estimating Electron Density Support for Individual Atoms and Molecular Fragments in X-ray Structures. J. Chem. Inf. Model. 2017, 57 (10), 2437−2447.
(11) Lingel, A.; Sendzik, M.; Huang, Y.; Shultz, M. D.; Cantwell, J.; Dillon, M. P.; Fu, X.; Fuller, J.; Gabriel, T.; Gu, J.; Jiang, X.; Li, L.; Liang, F.; McKenna, M.; Qi, W.; Rao, W.; Sheng, X.; Shu, W.; Sutton, J.; Taft, B.; Wang, L.; Zeng, J.; Zhang, H.; Zhang, M.; Zhao, K.; Lindvall, M.; Bussiere, D. E. Structure-Guided Design of EED Binders Allosterically Inhibiting the Epigenetic Polycomb Repressive Complex 2 (PRC2) Methyltransferase. J. Med. Chem. 2017, 60 (1), 415−427.
(12) Czodrowski, P.; Hölzemann, G.; Barnickel, G.; Greiner, H.; Musil, D. Selection of Fragments for Kinase Inhibitor Design: Decoration Is Key. J. Med. Chem. 2015, 58 (1), 457−465.
(13) Drwal, M. N.; Jacquemard, C.; Perez, C.; Desaphy, J.; Kellenberger, E. Do Fragments and Crystallization Additives Bind Similarly to Drug-like Ligands? J. Chem. Inf. Model. 2017, 57 (5), 1197−1209.
(14) Drwal, M. N.; Bret, G.; Kellenberger, E. Multi-Target Fragments Display Versatile Binding Modes. Mol. Inf. 2017, 36 (10), 1700042−1700042.
(15) Kontopidis, G.; McInnes, C.; Pandalaneni, S. R.; McNae, I.; Gibson, D.; Mezna, M.; Thomas, M.; Wood, G.; Wang, S.; Walkinshaw, M. D.; Fischer, P. M. Differential Binding of Inhibitors to Active and Inactive CDK2 Provides Insights for Drug Design. Chem. Biol. 2006, 13 (2), 201−211.
(16) Tarcsay, Á.; Nyíri, K.; Keserű, G. M. Impact of Lipophilic Efficiency on Compound Quality. J. Med. Chem. 2012, 55 (3), 1252−1260.
(17) Smith, C. R.; Dougan, D. R.; Komandla, M.; Kanouni, T.; Knight, B.; Lawson, J. D.; Sabat, M.; Taylor, E. R.; Vu, P.; Wyrick, C. Fragment-Based Discovery of a Small Molecule Inhibitor of Bruton’s Tyrosine Kinase. J. Med. Chem. 2015, 58 (14), 5437−5444.
(18) Malhotra, S.; Karanicolas, J. When Does Chemical Elaboration Induce a Ligand To Change Its Binding Mode? J. Med. Chem. 2017, 60 (1), 128−145.

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +33 368854221.

ORCID

Esther Kellenberger: 0000-0002-9320-4840

Author Contributions

Project coordination: E.K. and J.D. Design of protocol: E.K. and M.N.D. Molecule classification: M.N.D., C.P., and E.K. Implementation of protocol: M.N.D. Data preparation: M.N.D. and G.B. Data analysis: M.N.D., E.K., and C.J. Preparation of manuscript: M.N.D., E.K., C.J., C.P., and J.D. All authors have given approval to the final version of the manuscript.

Notes

The authors declare no competing financial interest.

Other protein name abbreviations not in the section Abbreviations Used are given in Table 1.



ACKNOWLEDGMENTS

The authors thank the LRAP funding program. This work was supported by Eli Lilly and Company through the Lilly Research Award Program (LRAP).



ABBREVIATIONS USED

CAM, camphor; DMSO, dimethyl sulfoxide; FBDD, fragment-based drug design; IBM, 3-isobutyl-1-methylxanthine; IFP, interaction fingerprint; MES, 2-(N-morpholino)ethanesulfonic acid; MW, molecular weight; MYI, 5-methoxyindole acetate; PDB, Protein Data Bank; PPARγ, human peroxisome proliferator-activated protein γ; rmsd, root-mean-square deviation


(19) Congreve, M.; Carr, R.; Murray, C.; Jhoti, H. A “Rule of Three” for Fragment-Based Lead Discovery? Drug Discovery Today 2003, 8 (19), 876−877.
(20) Schauperl, M.; Czodrowski, P.; Fuchs, J. E.; Huber, R. G.; Waldner, B. J.; Podewitz, M.; Kramer, C.; Liedl, K. R. Binding Pose Flip Explained via Enthalpic and Entropic Contributions. J. Chem. Inf. Model. 2017, 57 (2), 345−354.
(21) Kirkwood, J.; Hargreaves, D.; O’Keefe, S.; Wilson, J. Analysis of Crystallization Data in the Protein Data Bank. Acta Crystallogr., Sect. F: Struct. Biol. Commun. 2015, 71 (10), 1228−1234.
(22) McPherson, A.; Cudney, B. Searching for Silver Bullets: An Alternative Strategy for Crystallizing Macromolecules. J. Struct. Biol. 2006, 156 (3), 387−406.
(23) Larson, S. B.; Day, J. S.; Cudney, R.; McPherson, A. A Novel Strategy for the Crystallization of Proteins: X-Ray Diffraction Validation. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2007, 63 (3), 310−318.
(24) Desaphy, J.; Bret, G.; Rognan, D.; Kellenberger, E. Sc-PDB: A 3D-Database of Ligandable Binding Sites–10 Years On. Nucleic Acids Res. 2015, 43 (D1), D399−D404.
(25) Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and Computational Approaches to Estimate Solubility and Permeability in Drug Discovery and Development Settings. Adv. Drug Delivery Rev. 2001, 46 (1−3), 3−26.
(26) Bietz, S.; Urbaczek, S.; Schulz, B.; Rarey, M. Protoss: A Holistic Approach to Predict Tautomers and Protonation States in Protein-Ligand Complexes. J. Cheminf. 2014, 6, 12.
(27) The UniProt Consortium. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 2017, 45 (D1), D158−D169.
(28) Rice, P.; Longden, I.; Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 2000, 16 (6), 276−277.
(29) Shindyalov, I. N.; Bourne, P. E. Protein Structure Alignment by Incremental Combinatorial Extension (CE) of the Optimal Path. Protein Eng., Des. Sel. 1998, 11 (9), 739−747.
(30) Desaphy, J.; Azdimousa, K.; Kellenberger, E.; Rognan, D. Comparison and Druggability Prediction of Protein−Ligand Binding Sites from Pharmacophore-Annotated Cavity Shapes. J. Chem. Inf. Model. 2012, 52 (8), 2287−2299.
(31) Marcou, G.; Rognan, D. Optimizing Fragment and Scaffold Docking by Use of Molecular Interaction Fingerprints. J. Chem. Inf. Model. 2007, 47 (1), 195−207.
(32) Desaphy, J.; Raimbaud, E.; Ducrot, P.; Rognan, D. Encoding Protein-Ligand Interaction Patterns in Fingerprints and Graphs. J. Chem. Inf. Model. 2013, 53 (3), 623−637.
(33) Bajusz, D.; Rácz, A.; Héberger, K. Why Is Tanimoto Index an Appropriate Choice for Fingerprint-Based Similarity Calculations? J. Cheminf. 2015, 7 (1), 20.
