Interacting with GPCRs: Using Interaction Fingerprints for Virtual


Article

Interacting with GPCRs; on the use of Interaction Fingerprints for Virtual Screening Eelke B. Lenselink, Willem Jespers, Herman W. T. Van Vlijmen, Adriaan P. IJzerman, and Gerard J.P. van Westen J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.6b00314 • Publication Date (Web): 14 Sep 2016 Downloaded from http://pubs.acs.org on September 15, 2016


Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Information and Modeling

Interacting with GPCRs; on the use of Interaction Fingerprints for Virtual Screening. Authors: Eelke B. Lenselink, Willem Jespers, Herman W. T. van Vlijmen, Adriaan P. IJzerman, Gerard J. P. van Westen.* Division of Medicinal Chemistry, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands


Abstract

The expanding number of crystal structures of G protein-coupled receptors (GPCRs) has increased the knowledge of receptor function and of the ability of these receptors to recognize ligands. Although structure-based virtual screening has been quite successful on GPCRs, scores obtained by docking are typically not indicative of ligand affinity. Methods capturing interactions between protein and ligand in a more explicit manner, such as Interaction Fingerprints (IFPs), have been applied as an addition or alternative to docking. Originally IFPs captured the interactions of amino acid residues with ligands, with specific definitions for the various interaction types. More complex IFPs now capture atom-atom interactions, such as in SYBYL, or fragment-fragment co-occurrences, such as in SPLIF. Overall, most of the IFPs have been studied in comparison with docking in retrospective studies. For GPCRs it remains unclear which IFP should be used, if at all, and in what manner. Thus the performance of five different IFPs was compared on five different representative GPCRs, including several extensions of the original implementations. Results show that the more detailed IFPs, SYBYL and SPLIF, perform better than the other IFPs (Deng, Credo, and Elements). SPLIF was further tuned based on the number of poses, the fingerprint similarity coefficient, and the use of an ensemble of structures. Enrichments were obtained that were significantly higher than initial enrichments and those obtained by 2D similarity. With the increase in available crystal structures for GPCRs, and given that IFPs such as SPLIF enhance enrichment in Virtual Screens, it is anticipated that IFPs will be used in conjunction with docking, especially for GPCRs with a large binding pocket.


Introduction

GPCRs have entered the age of structure-based drug discovery, with the major driving force being the explosion in the number of crystal structures.1, 2 Starting from the first solved GPCR crystal structure about fifteen years ago,3 the Protein Data Bank (PDB)4 now comprises over 140 GPCR crystal structures (March 2016). Most importantly, these crystal structures have been solved following a rational scheme, resulting in the availability of a diverse palette of GPCR crystal structures. Examples include GPCRs from the three main classes (A/B/C), with ligands as diverse as nucleotides, peptides, proteins, and ions.1 Application of Virtual Screening (VS) on these structures has led to a number of success stories with hit rates typically exceeding 30%.2, 5 Although docking algorithms perform quite well in distinguishing between ligands and decoys, these algorithms typically fail at predicting binding affinity.6, 7 To prevent the occurrence of unfavorable docking poses/interactions, most prospective VS methods additionally include a step of visual inspection.8, 9

To circumvent these shortcomings, methods can be used that leverage the information present in crystal structures, namely the interactions between ligand and protein. Interaction fingerprints (IFPs) can be used to capture the information present in these interactions in the form of a fingerprint. First introduced by Deng et al., an IFP consists of 1) the identification of residues interacting with a ligand and 2) classification of this interaction into predetermined interaction types (such as ‘hydrogen-bonds’ or ‘hydrophobic interactions’).10 A drawback of the explicit description of interaction types is the possibility of missing interaction types by not defining them.11 To bypass this problem, subsequent methods were introduced to cover more interaction types, e.g. to distinguish between weak and strong hydrogen bond donors and acceptors.12

Building on these results, subsequent methods further divided ligand-residue interactions into atom-atom interactions (such as in Elements and SYBYL) and were applied to the PDBbind dataset using machine learning.13-15 Recently IFPs were introduced that use 2D fingerprints (such as Pipeline Pilot’s ECFP)16 to capture the local neighborhood of interactions; in this way different types of interactions (such as pi-pi) are encoded implicitly.11 In the context of GPCRs, IFPs have been applied quite successfully in both retrospective work and prospective VS,17, 18 with most of the studies on GPCRs using residue-based IFPs. However, the head-to-head performance of different IFPs has not been tested, making it unclear what distinguishes the different IFPs. Moreover, no clear idea exists which fingerprint should be used in which situation. Therefore, the aim here was to characterize a variety of published IFPs and benchmark their performance on GPCRs. This target class was chosen as a representative benchmark given that IFPs have previously been applied successfully to it. Furthermore, the recent growth in GPCR crystal structures makes this class an ideal test case. To identify the best performing IFP in terms of retrieving ligands, five diverse GPCRs were selected, each with a set of approximately 100 ligands, and the ligands were matched with a 50-fold excess of decoys using the DUD-e web service.19 The results indicate that more complex IFPs like SPLIF perform best, and further optimization of the SPLIF IFP led to enrichments that in most cases were significantly higher than those of either docking or 2D fingerprint based models.


Methods

Receptors

Structures were prepared using Schrödinger’s Maestro Suite (v9.9).20 In order to obtain a diverse selection of class A GPCRs, crystal structures of the following receptors were retrieved from the Protein Data Bank (PDB):4 adenosine A2A (4EIY),21 histamine H1 (3RZE),22 chemokine CCR5 (4MBS),23 β2-adrenergic (2RH1),24 and muscarinic M2 receptor (3UON).25 These structures were further processed using the Protein Preparation Wizard and protonation states were assigned using PROPKA.26 In the case of the adenosine A2A and β2-adrenergic receptors, multiple crystal structures were available. Ensembles were created of receptors co-crystallized with an antagonist or inverse agonist. For the adenosine A2A receptor the following structures were used: 4EIY, 3EML, 3UZA, 3UZC, 3RFM, 3PWH, and 3REY. Similarly, for the β2-adrenergic receptor: 2RH1, 3D4S, 3NY8, 3NY9, 3NYA, and 4GBR. In addition, the following crystal structures were selected for the β1-adrenergic receptor ensemble of the Directory of Useful Decoys, Enhanced (DUD-e): 3ZPQ, 3ZPR, 2YCZ, 2YCW, and 2VT4. Enrichment studies were also performed on the original DUD-e dataset. For this benchmark the same ligands, decoys, and crystal structures were used as in the original DUD-e paper.19 A generalized workflow used in this study can be found in the Supporting Information (SI Figure S1).


Ligand set

For each receptor, a set of compounds was retrieved from ChEMBL (release 19).27 Antagonists with high affinity (pChEMBL > 7) and a molecular weight < 500 Da were selected. Retrieved data was further curated by 1) checking the original papers to make sure no errors had been made during data deposition and 2) manual removal of presumed agonists. To ensure chemical diversity of the sets, 100 compounds were selected based on a clustering analysis using FCFP_4 fingerprints (Pipeline Pilot 9.0).16 For the β2 receptor fewer than 100 compounds were found after manual curation. To ensure chemical diversity, the set was then recreated using an affinity cut-off of pChEMBL > 6 (rather than 7) and a molecular weight < 600 Da (rather than 500). Decoys were generated for every compound using the web service of DUD-e.19 In short, this web service first generates ligand protonation states (pH range 6–8, predicted with Epik28, 29). Around 50 decoys per ligand are matched with the different ligand protonation states using physicochemical properties (miLogP, rotatable bonds, hydrogen bond acceptors, hydrogen bond donors, and net charge) and are selected based on dissimilarity (25%, using ECFP416). This yielded 110 (A2A), 133 (H1), 139 (M2), 123 (β2), and 120 (CCR5) states for the ligands. These ligand states were matched with 5500 (A2A), 6650 (H1), 6905 (M2), 6150 (β2), and 6000 (CCR5) decoys. Subsequently 3D coordinates and tautomeric states were generated with LigPrep52 using default settings.


Fingerprint generation

Grid generation and docking were done using Glide in Standard Precision (SP) mode. The 50 best poses were kept for further evaluation and five different types of fingerprints were generated for each pose. Initial exploration was done with a single pose (see: ‘More complex IFPs outperform simpler IFPs’) and subsequently the effect of sampling was investigated (see: ‘Influence of number of poses’).

Basic Interaction Fingerprints

The first method used in this study is an extension of the original fingerprints by Deng et al.,10 which contain seven interaction types: any, main chain contacts, side chain contacts, polar contacts, non-polar contacts, hydrogen bond donors, and hydrogen bond acceptors. The script is available in the Schrödinger software suite and adds interactions with aromatic residues and charged residues on top of the original interaction types (Table 1: Deng).30 The other basic methods used were modified versions of the three approaches that were implemented by Ballester et al.14 using the open source chemistry toolkit Open Babel v2.3.031 and SciPy.32 These included Credo, which further extends the number of interaction types by adding weak hydrogen bonds (Table 1: Credo).33 Additionally, Elements was used, which considers all commonly observed heavy elements in the protein and ligand (i.e. C, N, O, F, P, S, Cl, Br, I), paired according to a distance cut-off of 4.5 Å (Table 1: Elements). A further extension (fourth method) is the SYBYL IFP, which includes Sybyl atom types (Table 1: Sybyl).34 Considering for example two carbon atoms in a protein-ligand complex, the interaction is covered by only one entry (C-C) in the Elements method, but can fall into up to 36 different entries using Sybyl atom types (C1, C2, C3, C+, Cac, and Car on both the ligand and the receptor side). In this study, counts per residue were generated for these four fingerprint methods, instead of the sum of all counts for a given ligand-protein complex.
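To make the per-residue counting concrete, the following is a minimal sketch of an Elements-style fingerprint. It is not the Open Babel/SciPy implementation used in the study: the function name `elements_ifp`, the tuple-based input format, the residue labels, and the inclusive treatment of the 4.5 Å cut-off are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Heavy elements and contact distance as listed in the text.
ELEMENTS = {"C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}
CUTOFF = 4.5  # Angstrom

def elements_ifp(ligand_atoms, protein_atoms, cutoff=CUTOFF):
    """Count protein-ligand element pairs in contact, keyed per residue.

    ligand_atoms:  iterable of (element, (x, y, z))
    protein_atoms: iterable of (residue_id, element, (x, y, z))
    Returns a Counter mapping (residue_id, protein_element, ligand_element)
    to the number of atom pairs within `cutoff` (inclusive, an assumption).
    """
    protein_atoms = list(protein_atoms)  # allow generators to be reused
    counts = Counter()
    for lig_el, lig_xyz in ligand_atoms:
        if lig_el not in ELEMENTS:
            continue
        for res_id, prot_el, prot_xyz in protein_atoms:
            if prot_el in ELEMENTS and dist(lig_xyz, prot_xyz) <= cutoff:
                counts[(res_id, prot_el, lig_el)] += 1
    return counts

# Hypothetical coordinates, placed on a line to keep distances easy to check.
ligand = [("C", (0.0, 0.0, 0.0)), ("N", (1.4, 0.0, 0.0))]
protein = [
    ("RES1", "O", (3.0, 0.0, 0.0)),
    ("RES1", "C", (4.2, 0.0, 0.0)),
    ("RES2", "C", (9.0, 0.0, 0.0)),  # too far from both ligand atoms
]
fingerprint = elements_ifp(ligand, protein)
for key, value in sorted(fingerprint.items()):
    print(key, value)
```

Keeping the residue identifier in the key is what distinguishes the per-residue counts described above from a single sum over the whole complex; dropping `res_id` from the key would collapse the fingerprint to the summed form. Replacing the element symbols with Sybyl atom types in the same key would give the SYBYL-style variant.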

Structural Protein-Ligand Interaction Fingerprints (SPLIF)

The final method used here, Structural Protein-Ligand Interaction Fingerprints (SPLIF), was obtained from Kireev et al. and was used as in the original article.11 Originally, Extended Connectivity Fingerprints up to the first close neighbor (ECFC2) and 3D coordinates were generated for atoms that are considered ‘in contact’ (within 4.5 Å of each other) using Pipeline Pilot version 9.0. Building on this, fingerprints were generated using combinations of: 1) different atom abstractions (Functional class, Atom Type, AlogP types, and SYBYL atom types); 2) fingerprint type (Extended-connectivity and Path-based); 3) usage of fingerprints or counts; 4) the maximum distance diameter (2, 4, and 6); and 5) using either the number of overlapping atoms or the number of overlapping fingerprints (the original SPLIF score is based on the number of overlapping atoms).


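SPLIF similarity modifications

Finally the original formula of SPLIF was modified to calculate a Tanimoto-based similarity:

$$S = \frac{SA_{lig} + SA_{prot}}{SA_{lig} + SA_{prot} + \alpha\,(SB_{lig} + SB_{prot}) + \beta\,(SC_{lig} + SC_{prot})}$$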

where SA is the number of shared fingerprint bits (either between docked ligand and reference ligand or between docked protein structure and reference structure) and SB the number of unique bits of the docked pose (both ligand and protein), excluding the matched bits (SA). SC represents the number of unique bits of the reference (crystal structure) ligand and protein. For Tversky similarity, α was varied between 0 and 1 and β was set to 1 - α. For Tanimoto similarity, both α and β were set to 1. For reasons of speed and throughput SBprot was set to 0, as evaluating the unique protein atoms for all the docked poses slowed down the calculation about 100-fold (1 pose every 10 seconds instead of 10 every second). For 2D similarity scoring (e.g. Table 1, 2D) with respect to the co-crystallized ligand, ECFP_4 was used, as implemented in Pipeline Pilot 9.0.16
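The roles of SA, SB, SC, α, and β can be illustrated with a small set-based sketch. It assumes that ligand and protein bit counts are simply summed before entering a standard Tversky ratio, which may differ from the exact Pipeline Pilot protocol; the function name `splif_tversky`, its parameters, and the integer bit identifiers are hypothetical.

```python
def splif_tversky(docked_lig, ref_lig, docked_prot, ref_prot,
                  alpha=1.0, beta=1.0, use_sb_prot=False):
    """Tversky-type similarity between a docked pose and a reference complex.

    Each argument is a set of fingerprint bits. alpha weights bits unique to
    the docked pose (SB), beta weights bits unique to the reference (SC).
    alpha = beta = 1 gives Tanimoto similarity. use_sb_prot=False mirrors the
    text, where the unique protein bits of the docked pose are set to 0.
    """
    sa = len(docked_lig & ref_lig) + len(docked_prot & ref_prot)
    sb = len(docked_lig - ref_lig)
    if use_sb_prot:
        sb += len(docked_prot - ref_prot)
    sc = len(ref_lig - docked_lig) + len(ref_prot - docked_prot)
    denominator = sa + alpha * sb + beta * sc
    return sa / denominator if denominator > 0 else 0.0

# Hypothetical bit sets: SA = 2 + 1, SB = 2 (ligand only), SC = 2 + 1.
docked_lig, ref_lig = {1, 2, 3, 4}, {3, 4, 5, 6}
docked_prot, ref_prot = {10, 11}, {10, 12}

tanimoto = splif_tversky(docked_lig, ref_lig, docked_prot, ref_prot)
reference_biased = splif_tversky(docked_lig, ref_lig, docked_prot, ref_prot,
                                 alpha=0.0, beta=1.0)
docked_biased = splif_tversky(docked_lig, ref_lig, docked_prot, ref_prot,
                              alpha=1.0, beta=0.0)
print(tanimoto, reference_biased, docked_biased)
```

With α = 0 the bits unique to the docked pose carry no penalty, so the score measures how much of the reference interaction pattern is recovered; with α = 1 and β = 0 the opposite holds.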


Enrichment calculation

Enrichment was calculated using the Boltzmann-Enhanced Discrimination of Receiver Operating Characteristic (BEDROC) metric by Truchon et al.,35 wherein actives were ranked based on either the docking or the similarity score. Unless stated otherwise, BEDROC scores (α = 160.9) were used as the default metric for enrichment. BEDROC scores focus on early enrichment: higher values of α increase the weight of the early part of the enrichment curve. For instance, an α of 160.9 corresponds to 80% of the BEDROC score coming from the top 1% of the scored compounds. Bootstrap resampling was performed using the R package ‘boot’36, 37 with 1000 iterations; in this way 95% confidence intervals were calculated.
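The link between α and the early part of the list can be checked numerically. The sketch below implements the BEDROC expression as published by Truchon and Bayly in plain Python; it is an illustration rather than the code used in the study, and the function names and the boolean-list input format are assumptions. The helper `alpha_for_early_fraction` uses the large-α approximation in which the normalizing term 1 - exp(-α) is taken as 1.

```python
import math

def alpha_for_early_fraction(top_fraction, weight):
    """Alpha for which `weight` of the exponential weighting lies in the
    top `top_fraction` of the ranked list (large-alpha approximation)."""
    return -math.log(1.0 - weight) / top_fraction

def bedroc(is_active_ranked, alpha=160.9):
    """BEDROC for a ranked list of booleans (best-scored compound first)."""
    total = len(is_active_ranked)
    actives = sum(is_active_ranked)
    if actives == 0 or actives == total:
        raise ValueError("need at least one active and one decoy")
    ratio = actives / total
    # Sum of exponential weights at the (1-based) ranks of the actives.
    weight_sum = sum(
        math.exp(-alpha * rank / total)
        for rank, active in enumerate(is_active_ranked, start=1)
        if active
    )
    # Expected value of the same sum for a random ordering.
    random_sum = ratio * (1.0 - math.exp(-alpha)) / (math.exp(alpha / total) - 1.0)
    rie = weight_sum / random_sum
    factor = ratio * math.sinh(alpha / 2.0) / (
        math.cosh(alpha / 2.0) - math.cosh(alpha / 2.0 - alpha * ratio)
    )
    constant = 1.0 / (1.0 - math.exp(alpha * (1.0 - ratio)))
    return rie * factor + constant

# 80% of the weight in the top 1% gives an alpha close to the 160.9 used here.
print(round(alpha_for_early_fraction(0.01, 0.80), 1))

# Hypothetical screen: 10 actives among 500 compounds.
perfect = [True] * 10 + [False] * 490
worst = [False] * 490 + [True] * 10
print(bedroc(perfect), bedroc(worst))
```

A perfect ranking gives a value of 1 and the worst possible ranking a value of 0 (up to floating-point error), which is why BEDROC values can be compared across ligand sets of different sizes more readily than raw enrichment sums.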

Model Ensembling

If multiple crystal structures were available for a receptor, these were used in a model ensembling approach (see ‘Receptors’). The docking results were aggregated and Z-scores were calculated per structure by subtracting the mean similarity score over the whole dataset for that structure and dividing by the standard deviation of those scores. Based on these Z-scores, the maximum, average, and weighted average over all docking results for these receptors were taken. The weighted average was calculated by weighting the individual Z-scores by the enrichment and subsequently taking the average. For the Z2, Z3, and Z4 scores, the average of the two, three, or four highest Z-scores was calculated, as has been published before.38
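The fusion rules described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the workflow used in the study: the population standard deviation is used, the weighted average is normalized by the sum of the weights (which does not change the ranking), and the function names, structure labels, and scores are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Standardize the similarity scores obtained with one crystal structure."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        raise ValueError("scores have zero variance")
    return [(score - mu) / sigma for score in scores]

def top_k_mean(values, k):
    """Average of the k highest values (Z2 for k=2, Z3 for k=3, ...)."""
    return mean(sorted(values, reverse=True)[:k])

def weighted_mean(values, weights):
    """Average of values weighted by, e.g., the per-structure enrichment."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def fuse(scores_by_structure, weights, k=2):
    """Combine per-structure scores into one fused score per compound.

    scores_by_structure: dict structure_id -> list of scores, with the same
    compound order in every list. weights: dict structure_id -> weight.
    """
    ids = list(scores_by_structure)
    z_columns = [z_scores(scores_by_structure[i]) for i in ids]
    per_compound = list(zip(*z_columns))  # one tuple of Z-scores per compound
    w = [weights[i] for i in ids]
    return {
        "max": [max(z) for z in per_compound],
        "mean": [mean(z) for z in per_compound],
        "weighted": [weighted_mean(z, w) for z in per_compound],
        f"Z{k}": [top_k_mean(z, k) for z in per_compound],
    }

# Hypothetical similarity scores for four compounds docked into three structures.
scores = {
    "structure_a": [0.62, 0.20, 0.35, 0.10],
    "structure_b": [0.40, 0.25, 0.55, 0.15],
    "structure_c": [0.30, 0.28, 0.31, 0.29],
}
enrichment = {"structure_a": 0.56, "structure_b": 0.32, "structure_c": 0.12}
fused = fuse(scores, enrichment, k=2)
print(fused["Z2"])
```

Standardizing per structure before fusing puts scores from different structures on a common scale, so that a structure yielding uniformly high similarities does not dominate the maximum or the top-k average.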


Results

More complex IFPs outperform simpler IFPs

After docking of the ligand/decoy sets on the five different targets, five IFPs were generated based on the highest scoring pose. An overview of the results is shown in Table 1. Because the interest of this study was in the early enrichment of the Virtual Screen, BEDROC scores (α = 160.9) were used; for reference, AUC values are listed in Table S1. The simpler IFPs (Deng, Credo, and Elements) were unsuccessful in enriching ligands over decoys, as demonstrated by the overall low BEDROC scores and confidence intervals. The two more complex IFPs (Sybyl and SPLIF) outperformed the other IFPs, with the latter showing the best enrichment. Interestingly most IFPs were still better than docking, with the histamine H1 receptor being the exception, where docking performed better than all other methods. However, enrichments obtained by SPLIF (and other methods) were inferior to 2D fingerprint similarity. 2D similarity was expected to be among the best performing methods since decoys were selected based on dissimilarity (see Methods). Although the same type of fingerprint (ECFC) was used to generate the fingerprints, SPLIF and 2D similarity are rather different approaches, as demonstrated by the fact that these two methods did not show any significant correlation (R² = 0.31, SI Figure S2). Remarkably, there is a correlation between the similarities calculated by the first three methods (Deng/Credo/Elements), yet this correlation is not observed in the eventual enrichment. The enrichment of Credo (BEDROC of 0.40) was higher than that of the Deng method (BEDROC of 0.14). This corroborates that inclusion of more interaction types helps in distinguishing between actives and decoys.


Fingerprints that were atom- or fingerprint-based (Elements, SYBYL, and SPLIF) yielded higher enrichments than the residue-based Deng and Credo fingerprints. This observation supports the case that IFPs with a higher resolution lead to better results. To study whether enrichment could be further improved, the best performing IFP, i.e. SPLIF, was fine-tuned. Different settings were tested and validated one by one; these are discussed in the following sections.

Influence of number of poses

To determine the influence of docking sampling on the scoring of IFPs, the number of poses was incrementally increased from 1 (used in the previous section) to 50. SPLIF fingerprints were used with the same settings as previously described. The most pronounced effect was observed going from 1 to 5 poses. In all cases enrichment increased using more poses, with an optimum at 8 or 9 poses, after which enrichment either remained constant or slowly decreased for some targets (Figure 1). The decrease can be explained by the abundance of decoys relative to ligands, which makes it more probable that a poorly scoring decoy pose reaches a high maximum similarity, a form of chance correlation. The average enrichment demonstrates that sampling 7–10 poses is generally optimal in terms of enriching ligands over decoys, with the average BEDROC improving from 0.57 to 0.66. The most pronounced effect was observed for the chemokine CCR5 receptor, where at least 5 poses per ligand should be used. Generally the CCR5 ligands tend to be large and have many rotatable bonds: on average a MW of 474 Da and 7.6 rotatable bonds per ligand, compared with an average MW of 369 Da and 5.6 rotatable bonds for the other datasets. Additionally, the relatively large binding pocket could be the cause of the pronounced sampling effect. This was already reflected by the very low enrichment for the docking score in the CCR5 receptor (average BEDROC 0.04, Table 1).
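The trade-off described above can be illustrated with a toy example. The sketch assumes that each compound is scored by the maximum IFP similarity over its k best-docked poses, which is inferred from the mention of maximum similarities rather than stated as the exact protocol; the function names and all similarity values are invented for illustration.

```python
def best_of_k(similarities_by_pose, k):
    """Highest IFP similarity among the first k poses.

    similarities_by_pose is ordered by docking rank (best docking score first).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return max(similarities_by_pose[:k])

def rank_compounds(poses_by_compound, k):
    """Rank compound names by their best-of-k similarity, highest first."""
    scored = {name: best_of_k(sims, k) for name, sims in poses_by_compound.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Invented values: the ligand's second pose reproduces the reference
# interactions well, while the decoy's third pose scores highly by chance.
poses = {
    "ligand": [0.20, 0.55, 0.30],
    "decoy": [0.25, 0.10, 0.60],
}
for k in (1, 2, 3):
    print(k, rank_compounds(poses, k))
```

In this toy case the ligand is ranked below the decoy with one pose, above it with two poses, and below it again with three, mirroring the rise and later decline in enrichment as more poses are sampled.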

Influence of SPLIF Fingerprint type

To test the influence of different settings of SPLIF on the enrichment, settings were changed one by one starting from the default (Table 2). In all cases 8 poses were used, as this was shown to be the best trade-off between speed and accuracy (see previous section). First the similarity calculation was modified, changing the number of overlapping atoms to the number of overlapping fingerprints. Using fingerprints increased the enrichment except for the histamine H1 receptor, for which enrichment slightly decreased. Next the influence of the atom abstraction in the fingerprint was determined. By default Atom Type was selected, which uses the type of atom, charge, and hybridization (ECFC), coupled to a maximum diameter of 2 bonds (ECFC2). Twelve different fingerprints were tested and validated (SI Table S2), and here it was observed that the optimal type of fingerprint appears to be target dependent. For instance, on average over all targets the best fingerprint is ECFC6 (showing the most consistent performance), yet this particular fingerprint was surpassed by FCFC6 for three individual targets (M2, β2, and CCR5). Conversely, ECFC6 was better than FCFC6 on the A2A (0.568/0.511) and H1 (0.554/0.458) receptors. Finally, for most fingerprints, increasing the radii increased enrichment. In general, although individual changes led to small increases, the sum of the changes significantly improved SPLIF performance. In Table 2 the enrichments are shown for the optimal fingerprint per target, with an average enrichment of 0.70.


Influence of SPLIF Similarity coefficient

Further exploring the limits of SPLIF, the similarity calculation was modified to a Tanimoto-like similarity (see Methods), which slightly increased enrichment for most targets (average enrichment rising from 0.65 to 0.68). Next a range of different settings of the Tversky similarity coefficient was evaluated. Depending on the value of α (see Methods), more weight is put on either the sub- or the superstructure. With a low value of α, the similarity is biased towards the reference ligand (subsimilarity), while with a high value of α it is biased towards the docked ligand (supersimilarity). Based on the different settings that were tested, it was observed that these settings are also target dependent. For instance, for the CCR5 receptor the optimal value of α was found to be 0, which indicates that especially the substructures in the docked and crystal structure ligand are shared. On average the optimal enrichment was found at α = 0.4 (close to 0.5, equal weights), further indicating that these settings are target dependent.

Optimal settings

Because the differences in results between the different settings were not significant in most cases, and individual changes led to very small increases, it was worth exploring how a combination of different settings would work. To find the optimal settings, experiments were performed to determine whether a combination of the individual settings could be found such that one of two conditions was met. The first condition covered settings that lead to better performance than prior to optimization for all targets (combination 1, “default”), whereas the second covered settings that gave the best performance obtainable for an individual target (combination 2, “optimal”). For this, all the different settings from Table 2 were explored and a grid search was performed on 72 combinations of variables (matching atoms or fingerprints, atom abstraction, similarity metric; SI Tables S2-S7). In this search the previously found optimal settings for each target were used for the Tversky similarity. The best default setting proved to be atom similarity, using ECFC6, and Tanimoto similarity. Although the enrichments of the Tversky similarity were higher, proper fine-tuning of the parameter α is needed, rendering this approach not applicable as a “default” (combination 1) setting for future usage on other datasets. On average combination 1 (“default”) was found to work slightly better than 2D similarity, with an enrichment of 0.71 vs. 0.70 (Tables 1 and 2, respectively). By using the optimal settings per target (combination 2, “optimal”) average enrichment further increased from 0.71 to 0.76, outperforming both the default settings and 2D similarity. Indeed, by optimizing the SPLIF parameters the average enrichment increased from 0.58 (Table 1) to 0.76 (Table 2). Still, here too the increase in performance obtained with different settings was target specific. Therefore it is recommended to optimize the settings of the IFP for every target individually.

Including multiple crystal structures

As more crystal structures of a single GPCR are being published, with different ligands and in different conformational states, the use of multiple crystal structures to enhance enrichment was explored. For two of the explored targets, the adenosine A2A receptor and the β2-adrenergic receptor, multiple crystal structures were available. Therefore ligand/decoy sets were docked into crystal structures co-crystallized with an antagonist (see Methods). For the similarity calculation the settings from Table 2 were used (combination 1, “default”). Interestingly the results were not the same for both GPCRs (Figure 2). For the β2-adrenergic receptor enrichments were on average higher among all crystal structures (lowest enrichment: 0.51 and highest: 0.79, SI Table S8), possibly because the co-crystallized ligands are more similar for the β2-adrenergic receptor. For the adenosine A2A receptor enrichment was poor for the structures co-crystallized with xanthine-based scaffolds (average enrichment 0.12, SI Table S9) and highest for the crystal structures co-crystallized with ZM 241385 (average enrichment 0.56, SI Table S9), with the remainder in between (average enrichment 0.32, SI Table S9). The enrichment was stable over the different methods of data fusion that were sampled (see Methods, ‘Model Ensembling’). Overall, the weighted average worked best, suggesting that certain crystal structures do not contribute to the enrichment at all. Recently the average of the two highest Z-scores (Z2 score) has been shown to be superior to other data-fusion methods.38 Building on this, the average of the two (Z2), three (Z3), or four (Z4) highest Z-scores was tested. Indeed the Z2 scores yielded high enrichments, with an enrichment of 0.73 for the adenosine A2A receptor and 0.83 for the β2-adrenergic receptor, both higher than for the other ensemble methods tested here; the Z3 scores performed similarly. However, performance dropped slightly going from Z3 to Z4, indicating that either the Z2 or the Z3 score is preferred.


Validation on original DUD-e ligand and decoy set

To further demonstrate the applicability of this method to GPCRs, SPLIF was applied to the original dataset of the Directory of Useful Decoys, Enhanced (DUD-e) paper.19 The GPCR subset was used as a starting point to compare the docking score, the original SPLIF, the best default method (combination 1, “default”), and an ensemble of structures if available. Original enrichment factors are also reported; in most cases SPLIF achieves better enrichments than both docking methods (Table 3). On average the largest enrichment was found when the default combination SPLIF (combination 1) was used. This was observed for all targets except for the chemokine CXCR4 receptor, indicating that this setting was superior to the default settings in most cases. However, if the goal is to achieve the maximal possible enrichment, proper fine-tuning of the settings is required (e.g. combination 2, “optimal”; Table 2). Moreover, similar to the CCR5 receptor, the CXCR4 receptor contains a large binding pocket for which docking (Glide DockScore) performed poorly, especially when compared with the enrichment obtained with IFPs. Interestingly none of the methods performed well on the dopamine D3 receptor, although enrichment at this receptor has been relatively low in other (successful) prospective screens as well.39 The reason for this may be the high chemical diversity of other D3 receptor ligands compared to the co-crystallized ligand. Model ensembling using either the Z2 or Z3 score performed better than single models for both the adenosine A2A receptor and the β2-adrenergic receptor, where for the latter enrichment increased almost twofold (33.5% versus 19.5%). This is likely because the enrichment of the crystal structure used (3NY8: EF1% = 14.3%, also in Figure 2) was the lowest, hence there is room for improvement.

Journal of Chemical Information and Modeling

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Conversely, when the enrichment of the ensemble with the maximum enrichment on a single target (3D4S: EF1% = 31%) is compared enrichment was also increased, albeit by 8% only. This effect was not observed for the β1-adrenergic receptor where enrichment of the single structure was slightly higher than the ensemble (2VT4: EF=32.0% versus 26.3%), most likely due to the fact that enrichment of the used single structure was already relatively high.
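The EF1% values quoted in this section can be reproduced with a few lines of code. The following is a minimal Python sketch, assuming the usual enrichment-factor definition (hit rate among the top-ranked fraction divided by the hit rate in the whole set) and assuming that higher scores rank first. The function name `enrichment_factor` and the toy data are hypothetical and are not taken from the original protocol.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked list.

    scores    : one score per compound; higher means ranked earlier (assumption)
    is_active : one boolean per compound, True for known ligands
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and of equal length")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Size of the top slice, e.g. 1% of the list, but never fewer than one compound.
    n_top = max(1, round(fraction * n_total))
    # Indices sorted by descending score.
    order = sorted(range(n_total), key=lambda i: scores[i], reverse=True)
    hits_in_top = sum(1 for i in order[:n_top] if is_active[i])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy example: 1000 compounds, 10 actives, 5 of which rank in the top 10.
toy_scores = [float(1000 - i) for i in range(1000)]
toy_labels = [i < 5 or 500 <= i < 505 for i in range(1000)]
print(round(enrichment_factor(toy_scores, toy_labels), 1))  # 50.0
```

Under this definition a value of 1 corresponds to random ranking, and the maximum attainable value depends on the ratio of ligands to decoys in the set.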


Discussion

In this study, different IFPs were systematically studied on a combined total of eight diverse GPCRs. The IFPs can be classified into types ranging from simple (in which only interactions between the ligand and protein residues are captured) to complex (e.g. atom-atom based IFPs such as SYBYL). A correlation between the different methods was observed (SI Figure S2), with a relatively high correlation among the first three methods. Yet differences in ligand-over-decoy enrichment were observed between those IFPs. Residue-based methods such as the IFP of Deng et al.10 were expected to be inferior to methods that include more residue interaction types, such as the method introduced by Rognan et al.12 This was confirmed by the results, in which the Credo IFP33 outperformed the IFP of Deng et al. Credo is almost identical to the IFP implemented by Rognan et al.,12 except that it includes explicit halogen bonds and carbonyl interactions.33,40 Interestingly, even the relatively simple Credo IFP achieved better enrichment than docking. The more complex atom-atom interaction IFPs performed better than the residue-based IFPs. Similarly, the SYBYL IFP was better than the Elements IFP, which contains fewer atom-atom interaction types than the SYBYL IFP. Recently, Credo, Elements, and SYBYL were applied to a subset of the PDBbind database by Ballester et al. to create a knowledge-based scoring function based on Random Forests.14 Contrary to the present study, they found that the Elements IFP performed better than the SYBYL IFP, although a few aspects should be considered when comparing the current work with that of Ballester et al. First, residue numbers were explicitly included in the IFP here, in contrast to the total sum of interactions used in their study. This was done to enhance interpretability and to distinguish between different residues and their corresponding protein-ligand interactions.

Future applications to GPCRs could benefit from alignment/structure-based residue numbers,41 as applied by Kooistra et al. to distinguish between agonists and antagonists.18 Second, the PDBbind42 set used by Ballester et al. can be considered a sparse set of diverse protein-ligand interactions, whereas here only similarities with respect to the co-crystallized ligand in the GPCRs were considered. It is anticipated that future studies on a larger and more diverse set of crystal structures, combined with an IFP, could provide a useful machine learning-based docking score. Still, care should be taken that such a method is validated thoroughly.43

In fact, a combination of multiple IFPs such as SPLIF has recently been used to train a convolutional neural network to classify actives over decoys.44 The proposed method, AtomNet, achieved better enrichment than SMINA45 (another empirical scoring function) on a number of different datasets.

Although the majority of IFPs performed better than docking for most targets, a clear target dependency was observed. Contrary to our hypothesis, for some targets docking was better than all IFPs (e.g. the histamine H1 receptor). In these cases it is beneficial to include both the docking scores and the IFP similarities. This has been applied successfully before, for instance by Kooistra et al. in a prospective VS in which a hit rate of 73% was achieved, indicating the power of combining both methods.17 Conversely, there are targets for which docking performs poorly (e.g. the chemokine CCR5 and CXCR4 receptors). For these targets, using IFPs enhanced the enrichment, especially when multiple poses were included. In the original SPLIF paper the top 30 poses were used;11 according to the present results, no more than 10 are needed in the case of GPCRs, although this might also depend on the docking algorithm. In a subsequent VS paper on the Mer receptor tyrosine kinase by Kireev et al.,46 a total of three (highest scoring) compounds were included. They additionally used FCFP4 fingerprints, in contrast to the ECFP2 fingerprints of the original protocol. This corroborates the observation that SPLIF scoring can be optimized by tailoring the scoring parameters to individual targets. Finally, decreasing the number of poses also increases the throughput of the method. Indeed, in the current work further optimization of the SPLIF method also demonstrated that optimal results are target dependent; there is no single optimal setting for all targets. Moreover, the results on the ensemble of crystal structures demonstrated that enrichment could be enhanced even further, again depending on the target.

With the exponential growth in available crystal structures for GPCRs, docking against an ensemble of these structures could become a routine task. Since generating different IFPs and calculating their similarities is computationally not very demanding, 3D-based similarities such as the IFPs will likely be incorporated more frequently. For two well-sampled protein classes, kinases47 and phosphodiesterases,48 such IFP/structure-based databases already exist.
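The effect of the number of poses discussed above can be illustrated with a short Python sketch. It assumes that each compound is scored by the highest IFP similarity found among its top-N docked poses; this aggregation rule is an assumption made for illustration, and the function name `best_pose_similarity` and the toy values are hypothetical.

```python
def best_pose_similarity(pose_similarities, max_poses=10):
    """Score each compound by its best IFP similarity among the top poses.

    pose_similarities : dict mapping a compound ID to a list of similarities
                        to the reference IFP, ordered by docking rank
                        (best docking pose first)
    max_poses         : number of top-ranked poses to consider per compound
    """
    if max_poses < 1:
        raise ValueError("max_poses must be at least 1")
    return {
        compound: max(similarities[:max_poses])
        for compound, similarities in pose_similarities.items()
        if similarities  # skip compounds without any docked pose
    }


# Toy example: the second pose of 'lig_a' resembles the reference most closely.
toy = {"lig_a": [0.2, 0.9, 0.1], "decoy_b": [0.5, 0.3]}
print(best_pose_similarity(toy, max_poses=1))  # {'lig_a': 0.2, 'decoy_b': 0.5}
print(best_pose_similarity(toy, max_poses=2))  # {'lig_a': 0.9, 'decoy_b': 0.5}
```

In this picture, a smaller `max_poses` means fewer IFPs to generate and compare per compound, which is why reducing the number of poses raises throughput.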


Conclusions

With more GPCR structures becoming available and the good performance of IFPs on membrane proteins, the current work serves as a guideline for implementing IFPs in structure-based GPCR virtual screens. The SPLIF IFP, originally selected for its speed, flexibility, and performance, was optimized. Enrichments exceeding those of 2D similarity were achieved, especially with the optimized version of SPLIF. For receptors with larger binding pockets in particular, such as the chemokine CCR5 and CXCR4 receptors, SPLIF outperformed docking. Further testing of SPLIF on ensembles of crystal structures co-crystallized with antagonists demonstrated that combining the models using Z2 or Z3 scores enhanced the enrichment. Additional experiments on the Directory of Useful Decoys, Enhanced (DUD-E) further validated these findings.


Figure 1. Overview of the influence of the number of poses on enrichment (BEDROC, α = 160.9) for the ligand/decoy sets of the five different GPCR targets. Enrichment was calculated for 1, 5-15, 20, 25, and 50 poses.


Figure 2. Performance of the different methods of model ensembling. The minimum, maximum, and average enrichment for different crystal structures is shown along with the performance of the different methods of model ensembling. Both the adenosine A2A and β2-adrenergic receptors, for which multiple crystal structures were available, were considered (see text for further explanation).


Table 1. Overview of the enrichment (BEDROC, α = 160.9) for the different IFPs on the five different GPCRs

                   Interaction Fingerprints (a)                                     Other (b)
Receptor (PDB ID)  Deng         Credo        Elements     SYBYL        SPLIF        Docking      2D
Adenosine A2A      0.15         0.25         0.20         0.33         0.49         0.21         0.60
(4EIY)             (0.14-0.33)  (0.14-0.36)  (0.08-0.30)  (0.20-0.45)  (0.37-0.62)  (0.11-0.31)  (0.49-0.74)
Histamine H1       0.06         0.55         0.44         0.59         0.44         0.66         0.32
(3RZE)             (0.00-0.11)  (0.42-0.68)  (0.31-0.58)  (0.47-0.71)  (0.32-0.57)  (0.56-0.78)  (0.18-0.45)
Muscarinic M2      0.16         0.49         0.68         0.68         0.70         0.40         0.84
(3UON)             (0.07-0.23)  (0.36-0.62)  (0.59-0.81)  (0.58-0.80)  (0.60-0.81)  (0.27-0.53)  (0.78-0.92)
β2-adrenergic      0.15         0.31         0.33         0.51         0.60         0.29         0.90
(2RH1)             (0.06-0.24)  (0.18-0.44)  (0.20-0.47)  (0.40-0.65)  (0.49-0.72)  (0.19-0.39)  (0.86-0.96)
Chemokine CCR5     0.20         0.40         0.53         0.74         0.65         0.04         0.85
(4MBS)             (0.09-0.31)  (0.28-0.52)  (0.42-0.66)  (0.65-0.84)  (0.55-0.79)  (0.01-0.07)  (0.80-0.93)
Average            0.14         0.40         0.44         0.57         0.58         0.32         0.70

Overview of the enrichment (BEDROC, α = 160.9) for the ligand/decoys sets of the five different GPCR targets. a) Enrichment for the five different IFPs is given including enrichments of the basic IFPs (methods – ‘Basic Interaction Fingerprints’) and enrichment of the SPLIF IFP (methods – ‘Structural Protein-Ligand Interaction Fingerprints’). b) In addition, we have added enrichment for the docking score and 2D similarity. The enrichments highlighted in bold show the best performing IFP of the methods tested here. Only the top ranked pose was considered in IFP calculations.
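For readers who want to recompute the BEDROC values reported in these tables, the following is a minimal Python sketch. It assumes the definition of Truchon and Bayly,35 in which BEDROC is the exponentially weighted sum over the ranks of the actives, rescaled so that the worst possible ranking gives 0 and the best possible ranking gives 1. With α = 160.9, roughly 80% of the weight falls on the top 1% of the ranked list. The function name `bedroc` and the toy ranks are hypothetical.

```python
import math


def bedroc(active_ranks, n_total, alpha=160.9):
    """BEDROC from the 1-based ranks of the actives in a list of n_total compounds."""
    n_actives = len(active_ranks)
    if not 0 < n_actives < n_total:
        raise ValueError("need at least one active and at least one decoy")
    if len(set(active_ranks)) != n_actives or not all(1 <= r <= n_total for r in active_ranks):
        raise ValueError("ranks must be distinct integers between 1 and n_total")

    def weighted_sum(ranks):
        # Exponential weight: early ranks contribute far more than late ranks.
        return sum(math.exp(-alpha * r / n_total) for r in ranks)

    observed = weighted_sum(active_ranks)
    best = weighted_sum(range(1, n_actives + 1))                      # all actives first
    worst = weighted_sum(range(n_total - n_actives + 1, n_total + 1))  # all actives last
    # Rescale the observed sum between the worst (0) and best (1) possible ranking.
    return (observed - worst) / (best - worst)


print(bedroc([1, 2, 3], n_total=1000))         # 1.0 (perfect ranking)
print(bedroc([998, 999, 1000], n_total=1000))  # 0.0 (worst ranking)
```

The rescaling uses explicit sums for the best and worst rankings instead of the closed-form expressions; the two are equivalent, and the explicit form is easier to check by hand.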


Table 2. Enrichments (BEDROC, α = 160.9) obtained by further optimization of SPLIF

Receptor (PDB ID)  SPLIF        SPLIF        SPLIF        SPLIF (Tversky)  Fingerprint      Combination  Combination
                   (default)    (FP)         (Tanimoto)   [optimal α]      [best type]      1            2
Adenosine A2A      0.55         0.62         0.53         0.57 [0.7]       0.57 [LCFC2]     0.56         0.63
(4EIY)             (0.43-0.67)  (0.52-0.75)  (0.41-0.67)  (0.46-0.71)      (0.45-0.70)      (0.45-0.69)  (0.53-0.76)
Histamine H1       0.51         0.48         0.61         0.64 [0.1]       0.54 [ECFC6]     0.61         0.71
(3RZE)             (0.40-0.64)  (0.36-0.60)  (0.51-0.75)  (0.53-0.75)      (0.43-0.67)      (0.50-0.74)  (0.61-0.82)
Muscarinic M2      0.76         0.78         0.78         0.78 [0.5]       0.78 [SCFC6]     0.78         0.80
(3UON)             (0.67-0.87)  (0.68-0.89)  (0.70-0.88)  (0.69-0.88)      (0.68-0.89)      (0.69-0.89)  (0.72-0.90)
β2-adrenergic      0.64         0.73         0.66         0.67 [0.6]       0.75 [FCFC6]     0.72         0.77
(2RH1)             (0.55-0.76)  (0.64-0.84)  (0.56-0.78)  (0.56-0.78)      (0.68-0.85)      (0.64-0.83)  (0.69-0.88)
Chemokine CCR5     0.81         0.81         0.83         0.84 [0.0]       0.86 [SCFC6]     0.86         0.89
(4MBS)             (0.74-0.90)  (0.74-0.90)  (0.77-0.92)  (0.78-0.92)      (0.81-0.93)      (0.80-0.93)  (0.84-0.95)
Average            0.65         0.68         0.68         0.70             0.70             0.71         0.76

The default performance based on 8 poses (as shown in Figure 1) is given. First, we tested the influence of using atom-based (default) or fingerprint-based (FP) IFPs. Next, we implemented and tested both a Tanimoto and a Tversky similarity version of SPLIF. Tversky similarities were tested using values for α between 0 and 1 (see Methods); the optimal value for α is given in italics. Finally, we tested different atom abstraction types; the best fingerprint is shown in italics. After fine-tuning the individual settings, we tested different combinations of the settings: combination 1, which can be considered a “default” setting irrespective of fine-tuning, and combination 2 (“optimal”), which combines the best settings per protein in this table. The best performing method is indicated in bold.
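The difference between the Tanimoto and Tversky variants mentioned in this caption can be made concrete with a small Python sketch. It treats each fingerprint as a set of "on" features and uses the standard set-based forms of both coefficients. Two points are assumptions for illustration only: that the second Tversky weight is β = 1 − α (suggested by α being scanned between 0 and 1), and that α weights the features unique to the reference ligand. The function names are hypothetical.

```python
def tanimoto(reference, query):
    """Tanimoto coefficient between two sets of fingerprint features."""
    ref, qry = set(reference), set(query)
    union = len(ref | qry)
    return len(ref & qry) / union if union else 0.0


def tversky(reference, query, alpha, beta=None):
    """Tversky index between two sets of fingerprint features.

    alpha weights features found only in the reference; beta weights features
    found only in the query. beta defaults to 1 - alpha (an assumption).
    """
    if beta is None:
        beta = 1.0 - alpha
    ref, qry = set(reference), set(query)
    common = len(ref & qry)
    denominator = common + alpha * len(ref - qry) + beta * len(qry - ref)
    return common / denominator if denominator else 0.0


# Toy fingerprints: two shared features, two unique to the reference, one to the query.
reference_bits = {1, 2, 3, 4}
query_bits = {3, 4, 5}
print(tanimoto(reference_bits, query_bits))            # 0.4
print(tversky(reference_bits, query_bits, alpha=1.0))  # 0.5
```

With α = β = 1 the Tversky index reduces to the Tanimoto coefficient, and with α = β = 0.5 it reduces to the Dice coefficient, so scanning α shifts the penalty between interactions missing from the query and extra interactions made by the query.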


Table 3. Validation of the different methods on the GPCR subset of the Directory of Useful Decoys, Enhanced (DUD-E) database.

Receptor (PDB ID)      Published  DockScore  SPLIF    Combination 1    Ensemble
                       (EF1%)     (EF1%)     (EF1%)   ‘default’ SPLIF  Z2/Z3
                                                      (EF1%)           (EF1%)
β1-adrenergic (2VT4)   10.5       22.9       30.0     32.0             26.7/26.3
β2-adrenergic (3NY8)   3.9        17.9       14.3     19.5             33.0/33.5
Dopamine D3 (3PBL)     4.4        2.5        3.8      4.4              -
CXCR4 (3ODU)           17.5       0.0        30.0     25.0             -
Adenosine A2A (3EML)   21.8       13.1       42.0     46.2             50.7/50.7
Average                11.6       19.3       23.7     25.3             -

Enrichment factors (EF1%) as published are listed together with the enrichment of the docking score (DockScore), the original SPLIF method (SPLIF), the default optimized SPLIF (combination 1, Table 2), and the ensemble (Z2 and Z3 scores) for receptors for which multiple crystal structures were available. The best performing method is shown in bold.


Supporting Information

Correlation between the different methods, enrichment of individual settings and crystal structures. This material is available free of charge via the Internet at http://pubs.acs.org.

Corresponding Author

Dr. Gerard J. P. van Westen, Leiden Academic Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, the Netherlands. Phone: +31 (0)71 527 4651, Fax: +31 (0)71 527 4277, E-mail: [email protected]

Acknowledgements

We would like to acknowledge D. Kireev, from the Center for Integrative Chemical Biology and Drug Discovery, University of North Carolina at Chapel Hill, for providing the Pipeline Pilot script to calculate the SPLIF fingerprints. Adriaan P. IJzerman and Eelke B. Lenselink thank the Dutch Research Council (NWO) for financial support (NWO-TOP #714.011.001). Gerard J.P. van Westen thanks NWO and Stichting Technologie Wetenschappen (STW) for financial support (STW/NWO-Veni #14410).

Author Contributions

Eelke B. Lenselink conceived the study. Eelke B. Lenselink and Willem Jespers performed the experiments. Eelke B. Lenselink, Willem Jespers and Gerard J.P. van Westen wrote the manuscript. All authors discussed, contributed to, and revised the manuscript.


References

1. Katritch, V.; Cherezov, V.; Stevens, R. C., Structure-function of the G protein-coupled Receptor Superfamily. Annu. Rev. Pharmacol. Toxicol. 2013, 53, 531-556.
2. Andrews, S. P.; Brown, G. A.; Christopher, J. A., Structure-Based and Fragment-Based GPCR Drug Discovery. ChemMedChem 2014, 9, 256-275.
3. Palczewski, K.; Kumasaka, T.; Hori, T.; Behnke, C. A.; Motoshima, H.; Fox, B. A.; Le Trong, I.; Teller, D. C.; Okada, T.; Stenkamp, R. E., Crystal Structure of Rhodopsin: A G Protein-coupled Receptor. Science 2000, 289, 739-745.
4. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E., The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235-242.
5. Beuming, T.; Lenselink, B.; Pala, D.; McRobb, F.; Repasky, M.; Sherman, W., Docking and Virtual Screening Strategies for GPCR Drug Discovery. G Protein-Coupled Receptors in Drug Discovery: Methods and Protocols 2015, 251-276.
6. Warren, G. L.; Andrews, C. W.; Capelli, A.-M.; Clarke, B.; LaLonde, J.; Lambert, M. H.; Lindvall, M.; Nevins, N.; Semus, S. F.; Senger, S., A Critical Assessment of Docking Programs and Scoring Functions. J. Med. Chem. 2006, 49, 5912-5931.
7. Li, Y.; Han, L.; Liu, Z.; Wang, R., Comparative Assessment of Scoring Functions on an Updated Benchmark: 2. Evaluation methods and general results. J. Chem. Inf. Model. 2014, 54, 1717-1736.
8. Alvarez, J.; Shoichet, B., Virtual screening in drug discovery. CRC Press: 2005.
9. Irwin, J. J.; Shoichet, B. K., Docking Screens for Novel Ligands Conferring New Biology. J. Med. Chem. 2016.
10. Deng, Z.; Chuaqui, C.; Singh, J., Structural Interaction Fingerprint (SIFt): a Novel Method for Analyzing Three-dimensional Protein-ligand Binding Interactions. J. Med. Chem. 2004, 47, 337-344.
11. Da, C.; Kireev, D., Structural Protein–Ligand Interaction Fingerprints (SPLIF) for Structure-Based Virtual Screening: Method and Benchmark Study. J. Chem. Inf. Model. 2014, 54, 2555-2561.
12. Marcou, G.; Rognan, D., Optimizing Fragment and Scaffold Docking by Use of Molecular Interaction Fingerprints. J. Chem. Inf. Model. 2007, 47, 195-207.
13. Mpamhanga, C. P.; Chen, B.; McLay, I. M.; Willett, P., Knowledge-based Interaction Fingerprint Scoring: a Simple Method for Improving the Effectiveness of Fast Scoring Functions. J. Chem. Inf. Model. 2006, 46, 686-698.
14. Ballester, P. J.; Schreyer, A.; Blundell, T. L., Does a More Precise Chemical Description of Protein–ligand Complexes Lead to More Accurate Prediction of Binding Affinity? J. Chem. Inf. Model. 2014, 54, 944-955.
15. Pérez-Nueno, V. I.; Rabal, O.; Borrell, J. I.; Teixidó, J., APIF: a New Interaction Fingerprint Based on Atom Pairs and Its Application to Virtual Screening. J. Chem. Inf. Model. 2009, 49, 1245-1260.
16. Rogers, D.; Hahn, M., Extended-connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742-754.
17. de Graaf, C.; Kooistra, A. J.; Vischer, H. F.; Katritch, V.; Kuijer, M.; Shiroishi, M.; Iwata, S.; Shimamura, T.; Stevens, R. C.; de Esch, I. J.; Leurs, R., Crystal Structure-based Virtual Screening for Fragment-like Ligands of the Human Histamine H(1) Receptor. J. Med. Chem. 2011, 54, 8195-8206.
18. Kooistra, A. J.; Leurs, R.; de Esch, I. J.; de Graaf, C., Structure-Based Prediction of G-Protein-Coupled Receptor Ligand Function: A β-Adrenoceptor Case Study. J. Chem. Inf. Model. 2015.
19. Mysinger, M. M.; Carchia, M.; Irwin, J. J.; Shoichet, B. K., Directory of Useful Decoys, enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. J. Med. Chem. 2012, 55, 6582-6594.
20. Small-Molecule Drug Discovery Suite 2014-3: Glide, version 6.4, Schrödinger, LLC, New York, NY, 2014.
21. Liu, W.; Chun, E.; Thompson, A. A.; Chubukov, P.; Xu, F.; Katritch, V.; Han, G. W.; Roth, C. B.; Heitman, L. H.; IJzerman, A. P.; Cherezov, V.; Stevens, R. C., Structural Basis for Allosteric Regulation of GPCRs by Sodium Ions. Science 2012, 337, 232-236.
22. Shimamura, T.; Shiroishi, M.; Weyand, S.; Tsujimoto, H.; Winter, G.; Katritch, V.; Abagyan, R.; Cherezov, V.; Liu, W.; Han, G. W., Structure of the Human Histamine H1 Receptor Complex with Doxepin. Nature 2011, 475, 65-70.
23. Tan, Q.; Zhu, Y.; Li, J.; Chen, Z.; Han, G. W.; Kufareva, I.; Li, T.; Ma, L.; Fenalti, G.; Li, J., Structure of the CCR5 Chemokine Receptor–HIV Entry Inhibitor Maraviroc Complex. Science 2013, 341, 1387-1390.
24. Cherezov, V.; Rosenbaum, D. M.; Hanson, M. A.; Rasmussen, S. G.; Thian, F. S.; Kobilka, T. S.; Choi, H.-J.; Kuhn, P.; Weis, W. I.; Kobilka, B. K., High-resolution Crystal Structure of an Engineered Human β2-adrenergic G Protein–coupled Receptor. Science 2007, 318, 1258-1265.
25. Haga, K.; Kruse, A. C.; Asada, H.; Yurugi-Kobayashi, T.; Shiroishi, M.; Zhang, C.; Weis, W. I.; Okada, T.; Kobilka, B. K.; Haga, T.; Kobayashi, T., Structure of the Human M2 Muscarinic Acetylcholine Receptor Bound to an Antagonist. Nature 2012, 482, 547-551.
26. Li, H.; Robertson, A. D.; Jensen, J. H., Very Fast Empirical Prediction and Rationalization of Protein pKa Values. Proteins: Struct., Funct., Bioinf. 2005, 61, 704-721.
27. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P., ChEMBL: a Large-scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100-D1107.
28. Shelley, J. C.; Cholleti, A.; Frye, L. L.; Greenwood, J. R.; Timlin, M. R.; Uchimaya, M., Epik: a Software Program for pKa Prediction and Protonation State Generation for Drug-like Molecules. J. Comput.-Aided Mol. Des. 2007, 21, 681-691.
29. Greenwood, J. R.; Calkins, D.; Sullivan, A. P.; Shelley, J. C., Towards the Comprehensive, Rapid, and Accurate Prediction of the Favorable Tautomeric States of Drug-like Molecules in Aqueous Solution. J. Comput.-Aided Mol. Des. 2010, 24, 591-604.
30. Schrödinger Release 2014-3: Maestro, v., Schrödinger, LLC, New York, NY, 2014.
31. O’Boyle, N. M.; Morley, C.; Hutchison, G. R., Pybel: a Python Wrapper for the OpenBabel Cheminformatics Toolkit. Chem. Cent. J. 2008, 2.
32. Jones, E.; Oliphant, T.; Peterson, P., SciPy: Open Source Scientific Tools for Python. 2015.
33. Schreyer, A.; Blundell, T., CREDO: a Protein–ligand Interaction Database for Drug Discovery. Chem. Biol. Drug Des. 2009, 73, 157-167.
34. Clark, M.; Cramer, R. D.; Van Opdenbosch, N., Validation of the General Purpose Tripos 5.2 force field. J. Comput. Chem. 1989, 10, 982-1012.
35. Truchon, J. F.; Bayly, C. I., Evaluating Virtual Screening Methods: Good and Bad Metrics for the "Early Recognition" Problem. J. Chem. Inf. Model. 2007, 47, 488-508.
36. Davison, A. C.; Hinkley, D. V., Bootstrap methods and their application. Cambridge University Press: 1997; Vol. 1.
37. Canty, A.; Ripley, B., boot: Bootstrap R (S-Plus) functions. 2012.
38. Sastry, G. M.; Inakollu, V. S.; Sherman, W., Boosting Virtual Screening Enrichments with Data Fusion: Coalescing Hits from Two-dimensional Fingerprints, Shape, and Docking. J. Chem. Inf. Model. 2013, 53, 1531-1542.
39. Carlsson, J.; Coleman, R. G.; Setola, V.; Irwin, J. J.; Fan, H.; Schlessinger, A.; Sali, A.; Roth, B. L.; Shoichet, B. K., Ligand Discovery from a Dopamine D3 Receptor Homology Model and Crystal Structure. Nat. Chem. Biol. 2011, 7, 769-778.
40. Schreyer, A. M.; Blundell, T. L., CREDO: a Structural Interactomics Database for Drug Discovery. Database 2013, 2013, bat049.
41. Isberg, V.; Vroling, B.; van der Kant, R.; Li, K.; Vriend, G.; Gloriam, D., GPCRDB: an Information System for G Protein-coupled Receptors. Nucleic Acids Res. 2013, gkt1255.
42. Liu, Z.; Li, Y.; Han, L.; Li, J.; Liu, J.; Zhao, Z.; Nie, W.; Liu, Y.; Wang, R., PDB-wide Collection of Binding Data: Current Status of the PDBbind Database. Bioinformatics 2014, btu626.
43. Gabel, J.; Desaphy, J.; Rognan, D., Beware of Machine Learning-Based Scoring Functions: On the Danger of Developing Black Boxes. J. Chem. Inf. Model. 2014, 54, 2807-2815.
44. Wallach, I.; Dzamba, M.; Heifets, A., AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery. arXiv preprint arXiv:1510.02855 2015.
45. Koes, D. R.; Baumgartner, M. P.; Camacho, C. J., Lessons Learned in Empirical Scoring with Smina from the CSAR 2011 Benchmarking Exercise. J. Chem. Inf. Model. 2013, 53, 1893-1904.
46. Da, C.; Stashko, M.; Jayakody, C.; Wang, X.; Janzen, W.; Frye, S.; Kireev, D., Discovery of Mer kinase Inhibitors by Virtual Screening Using Structural Protein–Ligand Interaction Fingerprints. Bioorg. Med. Chem. 2015, 23, 1096-1101.
47. Kooistra, A. J.; Kanev, G. K.; van Linden, O. P.; Leurs, R.; de Esch, I. J.; de Graaf, C., KLIFS: a Structural Kinase-ligand Interaction Database. Nucleic Acids Res. 2015, 44, D365-D371.
48. Jansen, C.; Kooistra, A. J.; Kanev, G. K.; Leurs, R.; De Esch, I. J.; de Graaf, C., PDEStrIAn: A phosphodiesterase Structure and Ligand Interaction Annotated Database as a Tool for Structure-based Drug Design. J. Med. Chem. 2016, DOI: 10.1021/acs.jmedchem.5b01813.
