An Inexpensive Method for Selecting Receptor Structures for Virtual Screening


Article

An Inexpensive Method for Selecting Receptor Structures for Virtual Screening Zunnan Huang, and Chung F. Wong J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.5b00299 • Publication Date (Web): 14 Dec 2015 Downloaded from http://pubs.acs.org on December 19, 2015


Journal of Chemical Information and Modeling is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Information and Modeling

An Inexpensive Method for Selecting Receptor Structures for Virtual Screening

Zunnan Huang(1)* and Chung F. Wong(2)

(1) China-America Cancer Research Institute, Guangdong Provincial Key Laboratory of Medical Molecular Diagnostics, Dongguan Scientific Research Center, Guangdong Medical University, Dongguan, Guangdong Province, P. R. China. 523808.
(2) Department of Chemistry and Biochemistry and Center for Nanoscience, University of Missouri-Saint Louis, One University Boulevard, St. Louis, Missouri 63121

Abstract: This article introduces a screening performance index (SPI) to help select, from a number of experimental structures, one or a few that are more likely to identify more actives among their top hits in virtual screening of a compound library. The approach docks only known actives to the experimental structures, without a large number of decoys, to reduce computational cost. The SPI is calculated from the docking energies of the actives to all the receptor structures. We evaluated the SPI on eight protein systems: fatty acid binding protein adipocyte FABP4, serine/threonine-protein kinase BRAF, beta-1 adrenergic receptor ADRB1, TGF-beta receptor type I TGFR1, adenosylhomocysteinase SAHH, thyroid hormone receptor beta-1 THB, phospholipase A2 group IIA PA2GA, and cytochrome P450 3A4 CP3A4. We found that the SPI agreed with the results from other popular performance metrics such as Boltzmann-Enhanced Discrimination of Receiver Operating Characteristic (BEDROC), Robust Initial Enhancement (RIE), Area Under the Accumulation Curve (AUAC), and Enrichment Factor (EF), but was less expensive to calculate. The SPI also performed better than the best docking energy, the molecular volume of the bound ligand, and the resolution of the crystal structure in selecting good receptor structures for virtual screening. The implications of these findings are further discussed in the context of ensemble docking, in situations where no experimental structure for the targeted protein is available, and in circumstances where quick choices of receptor structures need to be made before quantitative indexes such as the SPI and BEDROC can be calculated.

Key words: Molecular docking-based virtual screening, Screening Performance Index, SP docking, lowest docking energy, resolution of crystal structure, molecular volume of bound ligand, Boltzmann-enhanced discrimination of receiver operating characteristic

ACS Paragon Plus Environment


1. Introduction

Target-based virtual screening (TBVS) has been used extensively in large-scale discovery of compound hits for biological targets [1-4]. This technique requires the expensive process of docking a large number of compounds from real or virtual libraries to a biological receptor. Using it effectively calls for an efficient method to select good structures for docking, whether from the many experimental structures that might be available or from structures obtained from homology modeling or molecular simulation. A good choice of structure(s) could identify more actives among the top hits in a virtual screening, thus reducing the number of compounds that need to be evaluated experimentally to find good drug candidates.

One technique for selecting a receptor structure involves docking a library containing known actives and a large number of decoys to the receptor [5-9]. The larger the number of actives found among the top hits and the higher the rank of the actives, the better the receptor structure is for virtual screening. However, this method is tedious and can be computationally expensive when more sophisticated models, such as the Molecular Mechanics/Poisson-Boltzmann Surface Area model [10], are used to re-score the results, or when the analysis is repeated with multiple scoring functions.

This work examined whether one could use the docking results of a number of known actives only, without a large number of decoys, to choose a good experimental structure for virtual screening. We expected a good experimental structure for virtual screening to be one that gave favorable docking energies for the largest number of actives among the candidate structures. By being able to accommodate a diverse set of active molecules, such a structure is also more likely to reject false positives.

To help identify these structures, we have introduced a Screening Performance Index (SPI) derived from the docking energies of the actives to the experimental structures. We found that this SPI matched well the results from one of the most useful metrics introduced recently, the Boltzmann-Enhanced Discrimination of Receiver Operating Characteristic (BEDROC), on eight test systems: fatty acid binding protein adipocyte FABP4, serine/threonine-protein kinase BRAF, beta-1 adrenergic receptor ADRB1, TGF-beta receptor type I TGFR1, adenosylhomocysteinase SAHH, thyroid hormone receptor beta-1 THB, phospholipase A2 group IIA PA2GA, and cytochrome P450 3A4 CP3A4. The SPI is, however, much cheaper to compute. It also performed better than the best docking energy, the molecular volume of the bound ligand, or the resolution of the crystal structure. Our analysis has also suggested useful quick rules for choosing experimental structures for virtual screening when a quantitative analysis such as the one described in this article is not carried out.

2. Methods

2.1. Screening Performance Index for selecting receptor structures for virtual screening

We evaluated the five terms listed in Table 1 to examine whether they could serve as good screening performance indices. Term 1 measured how favorable the best docking energy within a receptor structure was in comparison with the best docking energies across all structures. Term 2 and Term 3 used the average docking energies and the average deviation of the docking energies from the most favorable one to evaluate whether compounds other than the one with the most favorable docking energy also possessed favorable docking energies: the more compounds with docking energies close to the most favorable one, the better. Term 4 penalized structures to which fewer compounds could be docked successfully by Glide. Term 5 favored structures to which many actives bound with docking energies more favorable than the overall average docking energy across all structures.

2.2. Statistical measures for evaluating screening efficiency

We compared the performance of the SPI with several commonly used statistical measures shown in Table 2. These measures included BEDROC, Robust Initial Enhancement (RIE), Area Under the Accumulation Curve (AUAC), and Enrichment Factor (EF) [11-13]. Among these measures, BEDROC and RIE favor screening models that place more actives near the top of the rank-ordered list from a virtual screening [11, 14]. Currently, BEDROC is regarded as one of the most useful metrics for gauging the usefulness of a screening model in suggesting compounds for experimental synthesis and/or screening. On the other hand, AUAC and EF have been criticized for lacking the power of “early recognition”, in that they cannot distinguish a small rank-ordered list that contains most of the actives at the top from one that contains most of the actives near its end [11-13]. Therefore, in this work, we took BEDROC as the “gold” standard for identifying the best receptor structures for virtual screening and examined how well the SPI could reproduce its results at only a fraction of the cost.

2.3. System Setup

We chose eight protein test systems in this study: (1) fatty acid binding protein adipocyte FABP4, (2) serine/threonine-protein kinase BRAF, (3) beta-1 adrenergic receptor ADRB1, (4) TGF-beta receptor type I TGFR1, (5) adenosylhomocysteinase SAHH, (6) thyroid hormone receptor beta-1 THB, (7) phospholipase A2 group IIA PA2GA, and (8) cytochrome P450 3A4 CP3A4 (Table 3 contains more details).
We downloaded 34, 36, 16, 13, 28, 18, 20, and 24 crystal structures for FABP4, BRAF, ADRB1, TGFR1, SAHH, THB, PA2GA, and CP3A4, respectively, from the Protein Data Bank [15]. Actives and decoys for the protein receptors were obtained from the DUD-E benchmarking set [16] (http://decoys.docking.org). We converted each structure from the Protein Data Bank into an all-atom model ready for docking by using Maestro's Protein Preparation Wizard [17]. Default settings were used, except that missing side chains were added by Prime. All crystal waters were deleted in the preprocessing step. For structures containing covalently bonded ligands, such as 1A2D, 1A18, 1ADL, and 3JS1 for the FABP4 receptor, we separated the ligands from their protein receptors by breaking the covalent bonds. After refining the structures with restrained minimization, we aligned the receptor structures by using Protein Structure Alignment in Maestro [18].

For the FABP4 system, the root-mean-square deviations (RMSDs) of the protein atoms in the 33 other structures from the reference structure 1A2D ranged from 0.283 Å (1A18) to 0.658 Å (1AB0). For the BRAF system, the 35 protein structures deviated from the reference structure 1UMH by 0.442 Å (4FC0) to 1.717 Å (4PP7), with the exception of 4MNE at 2.867 Å. For the ADRB1 system, the 15 protein structures deviated from the reference structure 2VT4 by 0.338 Å (2YCX) to 1.684 Å (4GPO). For the TGFR1 system, the 12 protein structures deviated from the reference structure 1VJY by 0.232 Å (3HMM) to 1.365 Å (3FAA). For the SAHH system, the 27 protein structures deviated from the reference structure 1A7A by 0.273 Å (3NJ4) to 2.743 Å (3D64). For the THB system, the 17 protein structures deviated from the reference structure 1BSX by 0.446 Å (1Y0X) to 1.028 Å (1Q4X). For the PA2GA system, the 19 protein structures deviated from the reference structure 1AYP by 0.337 Å (1KVO) to 1.229 Å (2ARM). For the CP3A4 system, the 23 protein structures deviated from the reference structure 1TQN by 0.448 Å (3UA1) to 1.572 Å (2V0M).

A receptor grid for docking was created with Glide's Receptor Grid Generation [19] for every structure. For structures containing bound ligands, we used Glide's default settings to construct the grid box and to choose its center based on the ligand in its binding pocket. For apo structures of the FABP4, ADRB1, and TGFR1 systems, the grid box was set to 30 Å on each side, and the center of the box was determined by the residues surrounding the binding site. For the seven FABP4 apo structures, we selected 18 residues: PHE16, TYR19, MET20, THR29, PRO38, MET40, ILE51, ALA57, THR60, ILE62, GLU72, ALA75, ASP76, GLN95, ARG106, CYS117, ARG126, and TYR128. For the one ADRB1 apo structure, 4GPO, we used 17 residues: LEU101, VAL102, TRP117, THR118, ASP121, VAL122, THR126, PHE201, TYR207, SER211, SER212, SER215, PHE306, PHE307, ASN310, ASN329, and TYR333. For the two TGFR1 apo structures, we chose 12 residues: ILE211, VAL219, LYS232, GLU245, TYR249, LEU260, SER280, TYR282, ASP290, ARG294, LEU340, and ASP351. For the five, five, and six apo structures of the SAHH, PA2GA, and CP3A4 systems, respectively, the grid box and its center were constructed based on the co-crystallized ligand transferred into the binding pocket from the crystal structures with PDB ids 1A7A, 1AYP, and 1WOG, respectively, after protein structure alignment.

Afterwards, we used the Virtual Screening Workflow (VSW) in Schrödinger [20] to perform virtual screening with each active-decoy set for the eight protein systems. In a normal VSW procedure, docking is performed hierarchically using the Glide program [19, 21, 22] through three stages with different accuracy levels: HTVS (High Throughput Virtual Screening), SP (Standard Precision), and XP (Extra Precision). In this work, we performed SP docking for all eight systems but XP docking only for the first four systems.
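For the apo structures described above, where a 30 Å grid box is centered on the residues surrounding the binding site, the placement can be illustrated with a minimal sketch. It assumes the center is the unweighted centroid of the atom coordinates of the selected residues; the text does not state exactly how the center was derived, and the function name `box_from_residue_atoms` is hypothetical. It is not a reproduction of Glide's Receptor Grid Generation.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float, float]


def box_from_residue_atoms(coords: Iterable[Coord], side: float = 30.0):
    """Return (center, min_corner, max_corner) of a cubic box.

    Assumption: the center is the unweighted centroid of the supplied
    atom coordinates. `side` is the edge length in angstroms.
    """
    pts = list(coords)
    if not pts:
        raise ValueError("at least one coordinate is required")
    n = len(pts)
    # Average each Cartesian component separately to get the centroid.
    center = tuple(sum(p[i] for p in pts) / n for i in range(3))
    half = side / 2.0
    min_corner = tuple(c - half for c in center)
    max_corner = tuple(c + half for c in center)
    return center, min_corner, max_corner
```

In practice the coordinates would come from the atoms of the listed binding-site residues in the aligned apo structure.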
We used default parameters for docking (e.g., scaling factor = 0.8 and partial charge cutoff = 0.15). The VSW run for each receptor structure generated a screening-result file with ligands ranked by their docking energies. We then post-processed the results by using Maestro's Enrichment Calculator to calculate the enrichment metrics and an in-house program to calculate the screening performance index. In addition, we used Maestro's Python script volume_calc.py in the Schrödinger Command Prompt to calculate the molecular volume of the co-crystallized ligand for each crystal structure, and the “Clustering based on Volume Overlap” panel from Maestro's Scripts menu to group similar ligands into clusters for each test system.
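The enrichment measures named in Section 2.2 can be illustrated with a short sketch computed from a ranked list of active/decoy flags. This is not Maestro's Enrichment Calculator; it follows the standard published definitions of EF, RIE, and BEDROC, and the exponential weighting parameter `alpha = 20` is an assumption, since the text does not state the value used.

```python
import math
from typing import Sequence


def enrichment_factor(is_active: Sequence[bool], fraction: float) -> float:
    """EF at the given top fraction of a list sorted best score first."""
    n_total = len(is_active)
    n_actives = sum(is_active)
    if n_actives == 0:
        raise ValueError("the ranked list contains no actives")
    n_top = max(1, int(round(fraction * n_total)))
    hits = sum(is_active[:n_top])
    # Hit rate in the top slice divided by the hit rate of the whole list.
    return (hits / n_top) / (n_actives / n_total)


def rie(is_active: Sequence[bool], alpha: float = 20.0) -> float:
    """Robust Initial Enhancement with exponential rank weighting."""
    n_total = len(is_active)
    n_actives = sum(is_active)
    if n_actives == 0:
        raise ValueError("the ranked list contains no actives")
    observed = sum(
        math.exp(-alpha * rank / n_total)
        for rank, flag in enumerate(is_active, start=1)
        if flag
    )
    # Expected per-active weight if the actives were placed at random ranks.
    random_mean = (1.0 / n_total) * (1.0 - math.exp(-alpha)) / (
        math.exp(alpha / n_total) - 1.0
    )
    return observed / (n_actives * random_mean)


def bedroc(is_active: Sequence[bool], alpha: float = 20.0) -> float:
    """BEDROC as RIE rescaled between its minimum and maximum values."""
    n_total = len(is_active)
    n_actives = sum(is_active)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one decoy")
    ratio = n_actives / n_total
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie(is_active, alpha) - rie_min) / (rie_max - rie_min)
```

The rescaling in `bedroc` is what bounds the value between 0 (all actives at the bottom) and 1 (all actives at the top), which makes it convenient for comparing receptor structures screened against the same library.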

3. Results

To examine which of the five terms in Table 1 best tracked BEDROC, which we took as the reference screening performance metric, we calculated the correlation coefficient between each term and BEDROC using the crystal structures for each of the eight protein systems studied (Tables S1 to S8). Table 4 summarizes the results. Term 5 consistently gave the best correlation for all the systems studied. Therefore, we chose Term 5 as the SPI. Fig. 1 shows the correlation plots between the SPI (Term 5) and BEDROC for the eight protein systems, and Fig. 2 gives histograms of the correlation coefficients between five enrichment metrics and the SPI. The SPI is cheaper to use than BEDROC and other metrics because it only requires docking the active compounds to a receptor, without a much larger number of decoys. Yet it paralleled the results from BEDROC and the other enrichment metrics well for the eight protein test systems studied here.
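The idea behind Term 5 can be sketched in a few lines. The exact formula is given in Table 1, which is not reproduced in this text, so the following is only an assumed reading of the verbal description: for each structure, the fraction of actives whose docking energy is more favorable (lower) than the overall average over all actives and all structures. The function name `spi_sketch` and the use of `None` for failed dockings are hypothetical.

```python
from typing import Dict, Optional

# structure id -> active id -> docking energy (None if docking failed)
Energies = Dict[str, Dict[str, Optional[float]]]


def spi_sketch(energies: Energies) -> Dict[str, float]:
    """Assumed Term-5-like index: share of actives better than the global mean."""
    docked = [
        e for per_active in energies.values()
        for e in per_active.values() if e is not None
    ]
    if not docked:
        raise ValueError("no successful dockings")
    overall_mean = sum(docked) / len(docked)

    scores = {}
    for structure, per_active in energies.items():
        # Count actives docked with an energy below (better than) the mean.
        n_better = sum(
            1 for e in per_active.values()
            if e is not None and e < overall_mean
        )
        scores[structure] = n_better / len(per_active) if per_active else 0.0
    return scores
```

Under this reading only the actives need to be docked to each structure, which is the source of the cost saving over decoy-based metrics.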


Table 5 compares the different screening performance metrics obtained by docking 47 actives and 2749 decoys to the 34 crystal structures of FABP4. These metrics include BEDROC, RIE, AUAC, 1% EF, 10% EF, and the screening performance index (SPI) calculated for each experimental structure. The lowest docking energy (LDE) among the actives docked to each crystal structure, the resolution of the crystal structure (RES), and the molecular volume (MV) of the co-crystallized ligand are also shown. The crystal structure with PDB id 3FR2 offered the best docking results, with the highest BEDROC, RIE, and AUAC values of 0.828, 14.07, and 0.91, respectively. In addition, its enrichment factors calculated at 1% and 10% of the ranked database were 55 and 8.7, respectively. The best SPI did not pick out this structure, but the second best did. The best SPI corresponded to the 6th best receptor structure identified by BEDROC, PDB id 3HK1. This structure is still acceptable for virtual screening and not much inferior to the best structure. It also gave good values for several other screening performance metrics, e.g., AUAC = 0.89 and an enrichment factor of 34 at 1% of the ranked database. For the worst structure for virtual screening, the SPI picked out the crystal structure with PDB id 1ALB, the same as that identified by three other major screening performance metrics examined in this study: BEDROC, RIE, and the enrichment factor at 1% of the ranked database. AUAC ranked this structure as its 4th worst. The enrichment factor at 1% of the ranked database was 0, meaning that no actives were found in the top 1% of the database for this worst structure. It is worth noting that the best apo structure identified by BEDROC, PDB id 1LIB, ranked 19th out of the 34 FABP4 crystal structures, better than 9 ligand-bound structures and 6 other apo structures. Therefore, ligand-bound structures do not necessarily produce better results than apo structures if inferior ligand-bound structures are used.

Table 6 summarizes the results for docking 152 actives and 9942 decoys to the 36 crystal structures of BRAF. For this system, the SPI identified the same best structure for virtual screening (PDB id 3IDP) as the other five screening performance metrics: BEDROC, RIE, AUAC, 1% EF, and 10% EF. In addition, the structure that the SPI identified as its worst was the sixth from the bottom according to BEDROC. Thus, the SPI is also useful for excluding inferior structures from virtual screening.

Table 7 shows similar results for ADRB1, for which 247 actives and 15842 decoys were docked to 16 crystal structures. In this case, the SPI identified the same best (PDB id 4AMI) and worst (PDB id 4GPO) structures as the other five major screening performance metrics. This further supports the view that the much-easier-to-calculate SPI is adequate for selecting good crystal structures for virtual screening.

Table 8 shows similar results for TGFR1, for which 133 actives and 8498 decoys were docked to 13 crystal structures. This time, the best SPI identified the fourth best structure according to BEDROC, while the second best SPI identified the best structure according to BEDROC (PDB id 3TZM). If one were to choose several structures for virtual screening, the SPI would have identified the best and several very good structures. For identifying the worst structure, the SPI gave the same result as the other five screening performance metrics.

Table 9 summarizes the results for docking 63 actives and 3450 decoys to the 28 crystal structures of SAHH. For this system, the SPI identified the same best (PDB id 3DHY) and worst (PDB id 3X2E) structures for virtual screening as the other five screening performance metrics: BEDROC, RIE, AUAC, 1% EF, and 10% EF.


Table 10 shows similar results for THB, for which 103 actives and 7441 decoys were docked to 18 crystal structures. In this case, the best SPI identified the best structure (PDB id 1Q4X) according to all five of the other major screening performance metrics, while the worst and 2nd worst SPI values identified, respectively, the worst structure according to BEDROC, RIE, and 1% EF (PDB id 1NQ2) and the worst structure according to AUAC and 10% EF (PDB id 1NUO).

Table 11 shows similar results for PA2GA, for which 99 actives and 5146 decoys were docked to 20 crystal structures. This time, the best SPI identified the best structure (PDB id 1DB4) according to BEDROC, RIE, AUAC, and 10% EF, while the second best SPI identified the best structure (PDB id 3U8H) according to 1% EF. In addition, the structure that the SPI identified as its worst was the seventh from the bottom according to BEDROC, although it ranked just above the worst structures identified by each of the other five major screening performance metrics. Therefore, again, the SPI is useful for excluding inferior structures from virtual screening.

Table 12 shows similar results for CP3A4, for which 170 actives and 11796 decoys were docked to 24 crystal structures. For this system, the SPI identified the same best (PDB id 4I4H) and worst (PDB id 5A1R) structures for virtual screening as the other five screening performance metrics.

Picking the best over the worst structure for virtual screening can make a big difference in screening performance for some systems. For the 28 crystal structures of SAHH studied here, the BEDROC values covered nearly the entire possible range, from 0.031 to 0.981 (Table 9 and Fig. 1e). The 34 crystal structures of FABP4 gave a wide range of BEDROC values, from 0.121 to 0.828 (Table 5 and Fig. 1a). The 13 structures of TGFR1 also varied substantially in quality: their BEDROC scores ranged from a poor value of 0.095 to a reasonably good value of 0.700 (Table 8 and Fig. 1d). The BEDROC scores of the 36 BRAF structures covered a smaller range, from 0.234 to 0.633 (Table 6 and Fig. 1b), because even the best structure did not perform very well; the same was true of the 18 THB structures (0.294 to 0.523; Table 10 and Fig. 1f) and the 20 PA2GA structures (0.041 to 0.531; Table 11 and Fig. 1g). The 16 structures of ADRB1 varied less in quality, with BEDROC values ranging from 0.188 to 0.454 (Table 7 and Fig. 1c), suggesting poorer virtual screening performance even with the best structure available for this protein. Finally, none of the 24 structures of CP3A4 gave a BEDROC value better than 0.223 (Table 12 and Fig. 1h), indicating the poorest, nearly random performance (AUAC values below 0.62) for this system in receptor structure-based virtual screening.

The results in Table 13 give some insight into whether the most favorable binding energy obtained for each structure among the actives, the resolution of the structure, or the molecular volume of the co-crystallized ligand correlated with BEDROC. The correlation decreased in the order: most favorable binding energy, molecular volume, structural resolution. The correlation for resolution was so low that using structural resolution to aid the selection of good structures for virtual screening is not recommended. The correlation between the most favorable energy and the BEDROC value was higher but generally weak in comparison with that of the SPI. Rueda, Bottegoni, and Abagyan reported that protein structures co-crystallized with the largest ligands performed better in identifying actives [5]; consistent with this, we found that the molecular volume of the co-crystallized ligand correlated well with BEDROC for some systems, such as SAHH. However, the molecular volume alone did not provide as good an index as the SPI for selecting good structures for virtual screening.
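The comparisons above rest on a correlation coefficient between a candidate index and BEDROC, computed over the crystal structures of one protein system. A minimal sketch follows, assuming the Pearson coefficient is meant; the text does not name the coefficient, so this choice is an assumption.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross products about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Each element of `x` would hold one structure's index value (for example the SPI or the lowest docking energy) and the matching element of `y` its BEDROC value. Note that an energy-based index is more favorable when lower, so its sign convention has to be handled before correlations are compared.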
Figures S1 to S3 of the Supporting Information illustrate in greater detail the correlation between the BEDROC value and the lowest docking energy, the structural resolution, and the molecular volume. For the lowest docking energy (Figure S1 of the Supporting Information and the LDE column in Tables 5 through 12), the crystal structure with the lowest docking energy ranked no better than 7th in five systems, whereas the one with the highest SPI ranked no worse than 6th in all eight systems studied here. In addition, the correlation coefficients of the SPI with BEDROC were better than those of the lowest docking energy for seven systems (Term 1 vs. Term 5 in Table 4 and Column 2 vs. Column 5 in Table 13). The only exception was the SAHH system, for which the five bottom-ranked apo structures gave SPI = 0, which weakened the correlation between the SPI and BEDROC (see Fig. 1e). Nevertheless, when those five apo structures were excluded, the correlation coefficient between the SPI and BEDROC for SAHH was higher than that between the lowest docking energy and BEDROC (data not shown). Thus, although there appeared to be some correlation between screening performance and the lowest docking energy, the lowest docking energy did not perform as well as the SPI in suggesting a good structure to use.

For structural resolution, Figure S2 of the Supporting Information shows that some good crystal structures for virtual screening had low resolution. For example, the best crystal structures identified by BEDROC for ADRB1, THB, and CP3A4 had relatively low resolutions of 3.2 Å, 2.8 Å, and 2.9 Å, respectively (see also the RES column in Tables 7, 10, and 12). For BRAF and TGFR1, the 2nd best crystal structures had low resolutions of 3.5 Å and 3.7 Å, respectively (see also the RES column in Tables 6 and 8). For PA2GA, the 3rd best crystal structure had a low resolution of 2.8 Å (see also the RES column in Table 11). This observation contradicts a previous suggestion that crystal structures with resolution worse than 2.8 Å are not useful for screening.
Among all the metrics studied here, structural resolution was the least useful parameter for selecting good receptor structures for virtual screening. In fact, for the PA2GA system, the resolution of the crystal structure even showed a negative correlation with BEDROC (see also Table 13). Although not as poor as structural resolution, molecular volume did not always perform well either. Although Rueda et al. suggested that crystal structures containing large ligands performed better, we found exceptions. The most noticeable examples, for BRAF and PA2GA (Figure S3 of the Supporting Information), show that several crystal structures containing large ligands did not display good screening performance.

Table 14 shows the BEDROC ranks of the best structures identified by four other measures for the eight proteins studied, and vice versa. The best structures identified by the SPI for the eight systems ranked 6th, 1st, 1st, 4th, 1st, 1st, 3rd, and 1st according to BEDROC. This is quite good considering that the SPI uses only the docking energies of the actives, without any decoys. In addition, the best structures identified by BEDROC ranked highly, at 2nd, 1st, 1st, 2nd, 1st, 1st, and 1st, according to the SPI for seven protein systems. The only exception was the PA2GA system: although the best structure identified by BEDROC ranked only 8th/9th by the SPI, the 2nd and 3rd best structures by BEDROC still ranked highly, at 3rd and 2nd, by the SPI (see Table 11). Thus, if one were to choose two or more structures instead of just one based on the SPI, the best structures identified by BEDROC would have been included. On the other hand, the performance of the other three measures (LDE, RES, and MV) was far inferior. Although they might identify a good structure for one protein, they could not consistently pick out good structures for all eight proteins studied here.
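The cross-ranking summarized in Table 14 can be sketched as follows: take the structure that one measure rates best and look up its rank under another measure. The function names and the toy values in the usage note are hypothetical, and ties are broken by input order here, which is an assumption.

```python
from typing import Dict


def ranks(scores: Dict[str, float], higher_is_better: bool = True) -> Dict[str, int]:
    """Map each structure id to its 1-based rank under one measure."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_of_best(
    selector: Dict[str, float],
    reference: Dict[str, float],
    selector_higher_is_better: bool = True,
    reference_higher_is_better: bool = True,
) -> int:
    """Rank, under `reference`, of the structure that `selector` rates best."""
    pick = max if selector_higher_is_better else min
    best = pick(selector, key=selector.get)
    return ranks(reference, reference_higher_is_better)[best]
```

For a lowest-docking-energy selector, `selector_higher_is_better=False` would be passed, since a more negative energy is more favorable.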
Because the actives chosen in the above analysis might over-represent ligands with certain scaffolds [16, 23], we also performed cluster analysis to group similar ligands together and repeated the above analysis by using only one representative ligand from each group. In addition, because the binding energies of close analogues are harder to distinguish [24], removing structurally similar compounds also makes it easier to rank order the remaining, more structurally distinct compounds, as demonstrated by the results below. We used the Kelley method [25, 26] for clustering based on volume overlap. Figure S4 of the Supporting Information shows that 19, 31, 28, and 24 clusters were found for the FABP4, BRAF, ADRB1, and TGFR1 systems, respectively, and Table S9 of the Supporting Information lists the representative actives for these clusters. Table 15 shows that the SPI still correlated well with BEDROC and AUAC for this more careful choice of non-redundant ligands. Tables S10 to S13 of the Supporting Information echo the analysis above: the SPI was able to find the best structures identified by BEDROC among its top hits and the worst structures among its bottom hits, for both SP docking and XP docking, using only the subset of “diverse” actives.
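Once ligands have been assigned to clusters, one representative per cluster has to be chosen. The text does not say how the representatives in Table S9 were picked, so the sketch below simply assumes the medoid, i.e. the member with the highest mean similarity to the other members of its cluster. It does not implement the Kelley method or the volume-overlap calculation; cluster labels and pairwise similarities are taken as given inputs, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable


def pick_representatives(
    cluster_of: Dict[str, Hashable],
    similarity: Dict[str, Dict[str, float]],
) -> Dict[Hashable, str]:
    """Return one ligand per cluster: the member most similar to the rest."""
    members = defaultdict(list)
    for ligand, cluster in cluster_of.items():
        members[cluster].append(ligand)

    representatives = {}
    for cluster, ligands in members.items():
        def mean_similarity(candidate: str) -> float:
            others = [other for other in ligands if other != candidate]
            if not others:
                return 0.0  # singleton cluster: the only member is chosen
            return sum(similarity[candidate][o] for o in others) / len(others)

        # Sorting first makes the choice deterministic when scores tie.
        representatives[cluster] = max(sorted(ligands), key=mean_similarity)
    return representatives
```

The SPI would then be recomputed using only the docking energies of the returned representatives.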

4. Discussion

Our new method could also help select structures for ensemble docking. Ensemble docking uses multiple receptor structures rather than a single structure. Since the pioneering work of Lorber et al. [27] in 1998, ensemble docking has become a popular method to account for protein flexibility in molecular docking [3, 5, 28-34]. High success rates have been reported [35-37]. However, performance can depend on how the multiple structures are selected [5, 38]. Rueda et al. [5] found that including too many structures could lead to poor screening performance while increasing computational costs. They found that high screening performance could be achieved by including only a few structures, those that already produce reasonably good results by themselves, rather than adding many “poor” structures [5]. On a similar note, Xu and Lill [7] found that using known active ligands to help select as few as three structures could already improve the enrichment factor substantially over a single structure. Using fewer structures also keeps the cost of virtual screening low. The SPI can help choose a small number of good structures for ensemble docking in virtual screening.

The SPI should also be useful for selecting theoretical structures for virtual screening. Tang et al. demonstrated that adding theoretical models could give better hit rates than using X-ray structures alone for the beta-2 adrenoreceptor [8]. Mordalski et al. also found that adding homology models outperformed the use of crystal structures alone in their virtual screening [9]. More recently, Kalenkiewicz and others reported that the conformations obtained from combined cosolvent-accelerated molecular dynamics simulations starting with the apo-Bcl-xL structure gave better docking results for known inhibitors than the available experimental structures [39].
Adding theoretical structures obtained from homology modeling or molecular simulation is particularly useful when few experimental structures are available for a protein target. Because our results above showed that crystal structures with lower resolution could give better screening performance than structures with higher resolution, small errors in well-built theoretical structures should likewise be tolerable, especially when these structures are selected with the help of the SPI and a large number of known actives are available.
Our analysis also suggested some rules for selecting good structures for docking before quantitative indexes such as the SPI can be computed. We found that apo structures did not perform as well as the best ligand-bound structures did. This agrees with previous findings that apo structures could lack the needed ligand-induced conformational changes29, 40. On the other hand, we found that some ligand-bound structures performed less well than the apo structures. These tend to be structures bound with small ligands. Rueda et al. found that structures co-crystallized with the largest ligands performed well5. Our data supported this but further suggested that the best structures for virtual screening were not the ones bound with the largest ligands, but those bound with somewhat smaller ligands. Our analysis also found that the resolution of the crystal structure was not a good criterion for selecting structures for virtual screening: structures with lower resolution could give better performance than some structures with higher resolution.
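The selection step discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SPI values are copied from Table 8 for five TGFR1 structures, the function names `select_structures` and `merge_ensemble_scores` are hypothetical, the ligand scores are invented, and keeping the lowest score per ligand is one common way to combine ensemble results rather than a procedure prescribed in this work.

```python
# SPI values for five TGFR1 structures, copied from Table 8.
spi_by_structure = {
    "3TZM": 0.759,
    "2X7O": 0.707,
    "2WOT": 0.797,
    "3HMM": 0.759,
    "1B6C": 0.000,
}


def select_structures(spi_values, k):
    """Return the k structure ids with the highest SPI (ties keep input order)."""
    ranked = sorted(spi_values, key=lambda s: spi_values[s], reverse=True)
    return ranked[:k]


def merge_ensemble_scores(scores_by_structure):
    """Keep, for each ligand, its most favorable (lowest) score over all structures.

    Assumption: taking the minimum score is an illustrative merging rule only.
    """
    best = {}
    for scores in scores_by_structure.values():
        for ligand, score in scores.items():
            if ligand not in best or score < best[ligand]:
                best[ligand] = score
    return best


# Hypothetical docking scores (kcal/mol) for three ligands on two structures.
hypothetical_scores = {
    "2WOT": {"lig1": -9.1, "lig2": -6.0},
    "3TZM": {"lig1": -8.4, "lig2": -7.2, "lig3": -5.5},
}

chosen = select_structures(spi_by_structure, k=3)
merged = merge_ensemble_scores(hypothetical_scores)
print(chosen)
print(merged)
```

Restricting the ensemble to the few structures with the highest SPI follows the observation of Rueda et al. that adding “poor” structures can hurt performance while raising cost.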

5. Conclusion
We have introduced a novel measure, the screening performance index (SPI), to help select good structures to use in virtual screening, particularly for identifying actives early in a ranked list. It is cheaper to calculate the SPI than other performance indices such as BEDROC, because it requires only the docking of actives to the structures, without a large number of decoys. This is particularly helpful when one wants to examine docking results with more than one scoring model, including expensive ones such as the Molecular Mechanics/Poisson-Boltzmann Surface Area model. For investigators who do not calculate the SPI or other metrics to help select receptor structures, our analysis offers some useful hints: pick structures with large ligands bound, even though they might not be the crystal structures determined at the highest resolution by X-ray crystallography. Using structures containing small ligands could give poorer performance than using apo structures and should be avoided unless no apo structure or structure with a large ligand bound is available.
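The cost argument can be made concrete with the FABP4 numbers in Table 3 (34 structures, 47 actives, 2749 decoys). The sketch below simply counts docking runs, on the assumption that each compound-structure pair costs about the same; the helper name `docking_runs` is hypothetical.

```python
def docking_runs(n_structures, n_actives, n_decoys=0):
    """Number of compound-structure docking calculations required."""
    return n_structures * (n_actives + n_decoys)


# FABP4 counts taken from Table 3.
spi_runs = docking_runs(n_structures=34, n_actives=47)
bedroc_runs = docking_runs(n_structures=34, n_actives=47, n_decoys=2749)

print(spi_runs, bedroc_runs, round(bedroc_runs / spi_runs, 1))
```

Under this crude count, ranking the 34 FABP4 structures by SPI needs 1598 docking runs, against 95064 for a decoy-based metric such as BEDROC, roughly a 60-fold difference.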

ASSOCIATED CONTENT
Supporting Information contains 13 additional tables:
(1) The values of BEDROC and the five terms in Table 1 of main text for the 34 FABP4 crystal structures.
(2) The values of BEDROC and the five terms in Table 1 of main text for the 36 BRAF crystal structures.
(3) The values of BEDROC and the five terms in Table 1 of main text for the 16 ADRB1 crystal structures.
(4) The values of BEDROC and the five terms in Table 1 of main text for the 13 TGFR1 crystal structures.
(5) The values of BEDROC and the five terms in Table 1 of main text for the 28 SAHH crystal structures.
(6) The values of BEDROC and the five terms in Table 1 of main text for the 18 THB crystal structures.
(7) The values of BEDROC and the five terms in Table 1 of main text for the 20 PA2GA crystal structures.
(8) The values of BEDROC and the five terms in Table 1 of main text for the 24 CP3A4 crystal structures.
(9) The representative actives found for the four protein systems by the Kelley method.
(10) Screening performance evaluated by docking 19 diverse ligands to 34 FABP4 crystal structures.
(11) Screening performance evaluated by docking 31 diverse ligands to 36 BRAF crystal structures.
(12) Screening performance evaluated by docking 28 diverse ligands to 16 ADRB1 crystal structures.
(13) Screening performance evaluated by docking 24 diverse ligands to 13 TGFR1 crystal structures.
Supporting Information also contains four additional figures:
(1) Correlation plots between the lowest docking energy among all actives to each crystal structure and the BEDROC value obtained from docking a set of actives and decoys to each structure for the eight protein systems included in this study.
(2) Correlation plots between structural resolution and the BEDROC value for the eight protein systems in this study.
(3) Correlation plots between the molecular volume calculated from the co-crystallized ligand and the BEDROC value for each crystal structure for the eight protein systems in this study.
(4) Plots of Kelley Penalty versus number of clusters identified for the actives in the DUD-E benchmarking set for the first four protein systems in this study.
This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION
Corresponding Author: *Email: [email protected]
Notes
The authors declare no competing financial interest.

ACKNOWLEDGEMENTS
This work was supported by the National Natural Science Foundation of China (31170676), the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (2014-1685), and the Scientific Research Foundation for Returned Overseas Scholars of Guangdong Medical University, China (B2012082). This work was also supported by funds from the 2013 Sail Plan “the Introduction of the Shortage of Top-Notch Talent” Project (YueRenCaiBan [2014] 1), the Science and Technology Planning Project (2013B021800072), and the Education Discipline Construction Project (2013KJCX0090) of Guangdong Province, China. Finally, we thank Dr. Tingjun Hou of Zhejiang University for useful suggestions and Schrödinger for providing an evaluation license.

REFERENCES
1. Breda, A.; Basso, L. A.; Santos, D. S.; de Azevedo, W. F., Virtual Screening of Drugs: Score Functions, Docking, and Drug Design. Curr Comput Aided Drug Des 2008, 4, 265-272.
2. Andricopulo, A. D.; Salum, L. B.; Abraham, D. J., Structure-Based Drug Design Strategies in Medicinal Chemistry. Curr Top Med Chem 2009, 9, 771-790.
3. Lionta, E.; Spyrou, G.; Vassilatis, D. K.; Cournia, Z., Structure-Based Virtual Screening for Drug Discovery: Principles, Applications and Recent Advances. Curr Top Med Chem 2014, 14, 1923-1938.
4. Verma, S.; Prabhakar, Y. S., Target Based Drug Design - a Reality in Virtual Sphere. Curr Med Chem 2015, 22, 1603-1630.
5. Rueda, M.; Bottegoni, G.; Abagyan, R., Recipes for the Selection of Experimental Protein Conformations for Virtual Screening. J Chem Inf Model 2010, 50, 186-193.
6. Li, Y.; Kim, D. J.; Ma, W.; Lubet, R. A.; Bode, A. M.; Dong, Z., Discovery of Novel Checkpoint Kinase 1 Inhibitors by Virtual Screening Based on Multiple Crystal Structures. J Chem Inf Model 2011, 51, 2904-2914.
7. Xu, M.; Lill, M. A., Utilizing Experimental Data for Reducing Ensemble Size in Flexible-Protein Docking. J Chem Inf Model 2012, 52, 187-198.
8. Tang, H.; Wang, X. S.; Hsieh, J. H.; Tropsha, A., Do Crystal Structures Obviate the Need for Theoretical Models of GPCRs for Structure-Based Virtual Screening? Prot Struct Funct Bioinform 2012, 80, 1503-1521.


9. Mordalski, S.; Witek, J.; Smusz, S.; Rataj, K.; Bojarski, A. J., Multiple Conformational States in Retrospective Virtual Screening - Homology Models Vs. Crystal Structures: Beta-2 Adrenergic Receptor Case Study. J Cheminform 2015, 7, 13.
10. Kuhn, B.; Kollman, P. A., Binding of a Diverse Set of Ligands to Avidin and Streptavidin: An Accurate Quantitative Prediction of Their Relative Affinities by a Combination of Molecular Mechanics and Continuum Solvent Models. J Med Chem 2000, 43, 3786-3791.
11. Truchon, J. F.; Bayly, C. I., Evaluating Virtual Screening Methods: Good and Bad Metrics for the "Early Recognition" Problem. J Chem Inf Model 2007, 47, 488-508.
12. Triballeau, N.; Acher, F.; Brabet, I.; Pin, J.-P.; Bertrand, H.-O., Virtual Screening Workflow Development Guided by the "Receiver Operating Characteristic" Curve Approach. Application to High-Throughput Docking on Metabotropic Glutamate Receptor Subtype 4. J Med Chem 2005, 48, 2537-2547.
13. Jacobsson, M.; Liden, P.; Stjernschantz, E.; Bostrom, H.; Norinder, U., Improving Structure-Based Virtual Screening by Multivariate Analysis of Scoring Data. J Med Chem 2003, 46, 5781-5789.
14. Sheridan, R. P.; Singh, S. B.; Fluder, E. M.; Kearsley, S. K., Protocols for Bridging the Peptide to Nonpeptide Gap in Topological Similarity Searches. J Chem Inf Comput Sci 2001, 41, 1395-1406.
15. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E., The Protein Data Bank. Nuc Acids Res 2000, 28, 235-242.
16. Mysinger, M. M.; Carchia, M.; Irwin, J. J.; Shoichet, B. K., Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking. J Med Chem 2012, 55, 6582-6594.
17. Schrödinger Suite 2014-3: Protein Preparation Wizard; Epik Version 2.9, Schrödinger, LLC: New York, NY, 2014; Impact Version 6.4, Schrödinger, LLC: New York, NY, 2014; Prime Version 3.7, Schrödinger, LLC: New York, NY, 2014.
18. Schrödinger Release 2014-3: Maestro, Version 9.9, Schrödinger, LLC: New York, NY, 2014.
19. Schrödinger Release 2014-3: Glide, Version 6.4, Schrödinger, LLC: New York, NY, 2014.
20. Schrödinger Release 2014-3, Schrödinger, LLC: New York, NY, 2014.
21. Friesner, R. A.; Banks, J. L.; Murphy, R. B.; Halgren, T. A.; Klicic, J. J.; Mainz, D. T.; Repasky, M. P.; Knoll, E. H.; Shelley, M.; Perry, J. K.; Shaw, D. E.; Francis, P.; Shenkin, P. S., Glide: A New Approach for Rapid, Accurate Docking and Scoring. 1. Method and Assessment of Docking Accuracy. J Med Chem 2004, 47, 1739-1749.
22. Friesner, R. A.; Murphy, R. B.; Repasky, M. P.; Frye, L. L.; Greenwood, J. R.; Halgren, T. A.; Sanschagrin, P. C.; Mainz, D. T., Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein-Ligand Complexes. J Med Chem 2006, 49, 6177-6196.
23. Kirchmair, J.; Markt, P.; Distinto, S.; Wolber, G.; Langer, T., Evaluation of the Performance of 3D Virtual Screening Protocols: RMSD Comparisons, Enrichment Assessments, and Decoy Selection-What Can We Learn from Earlier Mistakes. J. Comput. Aided Mol. Des. 2008, 22, 213-228.


24. Gkeka, P.; Eleftheratos, S.; Kolocouris, A.; Cournia, Z., Free Energy Calculations Reveal the Origin of Binding Preference for Aminoadamantane Blockers of Influenza A/M2TM Pore. J Chem Theory and Comput 2013, 9, 1272-1281.
25. Kelley, L. A.; Gardner, S. P.; Sutcliffe, M. J., An Automated Approach for Clustering an Ensemble of NMR-Derived Protein Structures into Conformationally Related Subfamilies. Protein Eng 1996, 9, 1063-1065.
26. Kelley, L. A.; Gardner, S. P.; Sutcliffe, M. J., An Automated Approach for Defining Core Atoms and Domains in an Ensemble of NMR-Derived Protein Structures. Protein Eng 1997, 10, 737-741.
27. Lorber, D. M.; Shoichet, B. K., Flexible Ligand Docking Using Conformational Ensembles. Prot Sci 1998, 7, 938-950.
28. Claussen, H.; Buning, C.; Rarey, M.; Lengauer, T., FlexE: Efficient Molecular Docking Considering Protein Structure Variations. J Mol Biol 2001, 308, 377-395.
29. Huang, S. Y.; Zou, X., Ensemble Docking of Multiple Protein Structures: Considering Protein Structural Variations in Molecular Docking. Prot Struct Funct Bioinform 2007, 66, 399-421.
30. Totrov, M.; Abagyan, R., Flexible Ligand Docking to Multiple Receptor Conformations: A Practical Alternative. Curr Opin Struct Biol 2008, 18, 178-184.
31. Fukunishi, Y., Structural Ensemble in Computational Drug Screening. Expert Opin Drug Metab Toxicol 2010, 6, 835-849.
32. Korb, O.; Olsson, T. S.; Bowden, S. J.; Hall, R. J.; Verdonk, M. L.; Liebeschuetz, J. W.; Cole, J. C., Potential and Limitations of Ensemble Docking. J Chem Inf Model 2012, 52, 1262-1274.
33. Tian, S.; Sun, H.; Pan, P.; Li, D.; Zhen, X.; Li, Y.; Hou, T., Assessing an Ensemble Docking-Based Virtual Screening Strategy for Kinase Targets by Considering Protein Flexibility. J Chem Inf Model 2014, 54, 2664-2679.
34. Ellingson, S. R.; Miao, Y.; Baudry, J.; Smith, J. C., Multi-Conformer Ensemble Docking to Difficult Protein Targets. J Phys Chem B 2015, 119, 1026-1034.
35. Ferrari, A. M.; Wei, B. Q.; Costantino, L.; Shoichet, B. K., Soft Docking and Multiple Receptor Conformations in Virtual Screening. J Med Chem 2004, 47, 5076-5084.
36. Cavasotto, C. N.; Abagyan, R. A., Protein Flexibility in Ligand Docking and Virtual Screening to Protein Kinases. J Mol Biol 2004, 337, 209-225.
37. Bolstad, E. S.; Anderson, A. C., In Pursuit of Virtual Lead Optimization: The Role of the Receptor Structure and Ensembles in Accurate Docking. Prot Struct Funct Bioinform 2008, 73, 566-580.
38. Barril, X.; Morley, S. D., Unveiling the Full Potential of Flexible Receptor Docking Using Multiple Crystallographic Structures. J Med Chem 2005, 48, 4432-4443.
39. Kalenkiewicz, A.; Grant, B. J.; Yang, C. Y., Enrichment of Druggable Conformations from Apo Protein Structures Using Cosolvent-Accelerated Molecular Dynamics. Biology (Basel) 2015, 4, 344-366.
40. McGovern, S. L.; Shoichet, B. K., Information Decay in Molecular Docking Screens against Holo, Apo, and Modeled Conformations of Enzymes. J Med Chem 2003, 46, 2895-2907.


Table 1: Several potential measures to use as the Screening Performance Index

In all formulas, $n$ is the number of actives successfully docked to a specific receptor structure by GLIDE, $N = \sum_{\text{all structures}} n$, $l$ is the total number of active ligands, and $E_i$ is the docking energy of an active bound to a receptor structure.

Term 1
Rationale: Structures that bind ligands with the most favorable docking energies might be more likely to select other true actives and to reject false positives.
Formula: $T_1 = \dfrac{\min(\{E_i\}_{1,\ldots,n})}{\min(\{E_i\}_{1,\ldots,N})}$

Term 2
Rationale: If the average docking energy of all the actives to a crystal structure is closer to the docking energy of the best binder, more actives are bound with favorable energies to this structure, suggesting that this structure could pick out actives more readily.
Formula: $T_2 = \dfrac{\frac{1}{n}\sum_{i=1}^{n} E_i}{\min(\{E_i\}_{1,\ldots,n})}$

Term 3
Rationale: Qualitatively similar to Term 2 but quantitatively different. If more actives give docking energies as favorable as that of the best binder, this structure is more likely to pick out actives.
Formula: $T_3 = \dfrac{1}{n-1}\sum_{i=1}^{n}\left[E_i - \min(\{E_i\}_{1,\ldots,n})\right]^2$

Term 4
Rationale: A structure to which GLIDE could dock more actives successfully might be a structure that could pick out actives more readily.
Formula: $T_4 = -\dfrac{l - n}{l}$

Term 5 (SPI)
Rationale: If many actives can dock to a structure with docking energies more favorable than the overall average docking energy to all structures, this structure might be more likely to pick out many actives in virtual screening.
Formula: $T_5 = \dfrac{k}{l}$, where $k = \sum_{i=1}^{n} x_i$, with $x_i = 1$ if $E_i \le \dfrac{1}{N}\sum_{i=1}^{N} E_i$ and $x_i = 0$ otherwise.
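Term 5 can be computed directly from the docking energies of the actives. The following is a minimal sketch of that calculation, not the authors' own script: the function name `screening_performance_index`, the structure labels, and the energies are all hypothetical, and the input is assumed to hold one best docking energy per successfully docked active.

```python
from statistics import mean


def screening_performance_index(energies_by_structure, n_actives_total):
    """Return {structure id: SPI} following Term 5 of Table 1.

    energies_by_structure maps a structure id to the docking energies
    (kcal/mol, more negative is better) of the actives that docked to it.
    n_actives_total is l, the total number of active ligands.
    """
    # Pool all successfully docked actives: N is the sum of n over structures.
    pooled = [e for energies in energies_by_structure.values() for e in energies]
    overall_mean = mean(pooled)

    spi = {}
    for structure, energies in energies_by_structure.items():
        # k counts the actives docked at least as favorably as the overall mean.
        k = sum(1 for e in energies if e <= overall_mean)
        spi[structure] = k / n_actives_total
    return spi


# Hypothetical energies for l = 4 actives docked to three structures.
toy_energies = {
    "holo_A": [-10.0, -9.0, -8.0, -7.0],
    "holo_B": [-8.0, -6.0, -5.0],  # one active failed to dock
    "apo_C": [-5.0, -4.0, -4.0, -4.0],
}
spi = screening_performance_index(toy_energies, n_actives_total=4)
print(spi)
```

Here the pooled mean energy is about -6.36 kcal/mol, so all four actives count for `holo_A`, one for `holo_B`, and none for `apo_C`. Because the threshold is shared by all structures, no decoys are needed to compare them.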


Table 2: Statistical measures for screening performance

BEDROC (Boltzmann-enhanced Discrimination of Receiver Operator Characteristic)
$$\mathrm{BEDROC} = \frac{\sum_{i=1}^{n} e^{-\alpha x_i}}{\dfrac{n}{N}\left(\dfrac{1 - e^{-\alpha}}{e^{\alpha/N} - 1}\right)} \times \frac{R_a\, e^{\alpha R_a}\,(e^{\alpha} - 1)}{(e^{\alpha} - e^{\alpha R_a})(e^{\alpha R_a} - 1)} + \frac{1}{1 - e^{\alpha(1 - R_a)}}$$
where $R_a$ is the ratio of the total number of actives $n$ to the total number of compounds $N$ screened, and $x_i$ is the relative rank of the $i$th active in the ordered list. BEDROC gives the probability that an active is ranked ahead of a compound randomly selected from a hypothetical exponential probability distribution function $\alpha e^{-\alpha x}/(1 - e^{-\alpha})$. It lies between 1 and 0, with 1 reflecting the best possible screening performance. In this work, $\alpha = 20.0$, with which the first ~8% of compounds contributed ~80% to BEDROC.

RIE (Robust Initial Enhancement)
$$\mathrm{RIE} = \frac{\sum_{i=1}^{n} e^{-\alpha x_i}}{\dfrac{n}{N}\left(\dfrac{1 - e^{-\alpha}}{e^{\alpha/N} - 1}\right)}$$
Actives with higher ranks are weighted more heavily. Screening models giving large positive RIE values are more capable of identifying actives early in the ranked list. BEDROC has a more rigorous statistical interpretation than RIE.

AUAC (Area Under the Accumulation Curve)
$$\mathrm{AUAC} = \frac{1}{2nN}\sum_{k=0}^{N-1}\left[F_a(k) + F_a(k+1)\right]$$
where $n$ and $N$ are the total number of actives and the total number of compounds screened, and $F_a(k)$ is the cumulative count of actives at rank position $k$. AUAC measures the probability that an active in a rank-ordered list is placed before a compound selected randomly from a uniform distribution. It lies between 1 and 0, with 1 signifying the best possible performance and 0.5 random behavior.

x% EF (Enrichment Factor)
$$x\%\ \mathrm{EF} = \frac{a/n}{A/N}$$
where $a$ is the number of actives found in the first x% ($n$ compounds) of a rank-ordered list containing $A$ actives among a total number of $N$ actives and decoys screened. It estimates how many times better than random selection a screening model picks out actives.
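The measures in Table 2 can be evaluated from the ranks of the actives alone. The sketch below is an illustrative implementation written for this purpose, not the code used in this work: the function names are hypothetical, ranks are assumed to be 1-based positions in the sorted list, and the relative rank is taken as $x_i = r_i/N$.

```python
import math


def rie(active_ranks, n_total, alpha=20.0):
    """Robust Initial Enhancement from 1-based ranks of the actives."""
    n = len(active_ranks)
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks)
    expected = (n / n_total) * (1 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1)
    return observed / expected


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC, written with the same exponential terms as in Table 2."""
    ra = len(active_ranks) / n_total
    ea, eara = math.exp(alpha), math.exp(alpha * ra)
    scale = ra * eara * (ea - 1) / ((ea - eara) * (eara - 1))
    offset = 1 / (1 - math.exp(alpha * (1 - ra)))
    return rie(active_ranks, n_total, alpha) * scale + offset


def auac(active_ranks, n_total):
    """Area under the accumulation curve by the trapezoidal sum in Table 2."""
    ranks = set(active_ranks)
    area, found = 0, 0
    for k in range(n_total):
        previous = found            # F_a(k)
        if (k + 1) in ranks:
            found += 1              # F_a(k + 1)
        area += previous + found
    return area / (2 * len(active_ranks) * n_total)


def enrichment_factor(active_ranks, n_total, fraction):
    """x% EF with fraction = x / 100; the top list holds at least one compound."""
    n_top = max(1, round(fraction * n_total))
    a = sum(1 for r in active_ranks if r <= n_top)
    return (a / n_top) / (len(active_ranks) / n_total)


# Hypothetical ideal screen: 3 actives ranked first among 100 compounds.
perfect = [1, 2, 3]
print(bedroc(perfect, 100), auac(perfect, 100), enrichment_factor(perfect, 100, 0.10))
```

For an ideal ranking the BEDROC expression reduces to 1, which is a convenient sanity check on any implementation.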


Table 3: Structures used for the eight protein test systems in this work

| Protein | Protein PDB id | Docking compounds: actives | Docking compounds: decoys |
| --- | --- | --- | --- |
| FABP4 | 1A2D#, 1A18#, 1AB0*, 1ACD*, 1ADL#, 1ALB*, 1G7N*, 1G74, 1LIB*, 1LIC, 1LID, 1LIE, 1LIF, 1TOU, 1TOW, 2ANS, 2HNX, 2NNQ, 2Q9S, 2QM9, 3FR2, 3FR4, 3FR5, 3HK1, 3JS1#, 3JSQ, 3P6C, 3P6D, 3P6E, 3P6F, 3P6G, 3P6H, 3Q6L*, 3RZY* | 47 | 2749 |
| BRAF | 1UWH, 1UWJ, 2FB8, 3C4C, 3D4Q, 3IDP, 3II5, 3OG7, 3PPJ, 3PPK, 3PRF, 3PRI, 3PSB, 3PSD, 3Q4C, 3Q96, 3SKC, 3TV4, 3TV6, 4DBN, 4E4X, 4E26, 4EHE, 4EHG, 4FC0, 4FK3, 4G9C, 4G9R, 4H58, 4JVG, 4KSP, 4KSQ, 4MBJ, 4MNE, 4MNF, 4PP7 | 152 | 9942 |
| ADRB1 | 2VT4, 2Y00, 2Y01, 2Y02, 2Y03, 2Y04, 2YCW, 2YCX, 2YCY, 2YCZ, 3ZPQ, 3ZPR, 4AMI, 4AMJ, 4BVN, 4GPO* | 247 | 15842 |
| TGFR1 | 1B6C*, 1IAS*, 1PY5, 1RW8, 1VJY, 2WOT, 2WOU, 2X7O, 3FAA, 3GXL, 3HMM, 3KCF, 3TZM | 133 | 8498 |
| SAHH | 1A7A, 1B3R*, 1D4F, 1K0U, 1KY4*, 1KY5, 1LI4, 1V8B, 1XWF, 2H5L, 2ZIZ, 2ZJ0, 2ZJ1, 3CE6, 3D64*, 3DHY, 3G1U, 3GLQ, 3H9U, 3N58, 3NJ4, 3OND, 3ONE, 3ONF, 3X2E*, 3X2F*, 4PFJ, 4PGF | 63 | 3450 |
| THB | 1BSX, 1N46, 1NAX, 1NQ0, 1NQ1, 1NQ2, 1NUO, 1Q4X, 1R6G, 1XZX, 1Y0X, 2J4A, 2PIN, 3D57, 3GWS, 3IMY, 3JZC, 4ZO1 | 103 | 7441 |
| PA2GA | 1AYP, 1BBC*, 1DB4, 1DB5, 1DCY, 1JIA, 1KQU, 1KVO, 1N28*, 1N29*, 1POD*, 1POE, 1SV3, 1ZYX, 2ARM, 2OLI, 3U8B*, 3U8D, 3U8H, 3U8I# | 99 | 5146 |
| CP3A4 | 1TQN*, 1W0E*, 1W0F*, 1W0G, 2J0D, 2V0M, 3NXU, 3TJS, 3UA1, 4D6Z, 4D7D, 4D75, 4D78, 4I3Q*, 4I4G, 4I4H, 4K9T, 4K9U, 4K9V, 4K9W, 4K9X, 4NY4, 5A1P*, 5A1R* | 170 | 11796 |

# Protein structures co-crystallized with covalent ligands.
* Protein structures without ligand bound (apo form).


Table 4: Correlation coefficients between BEDROC and Term 1, Term 2, Term 3, Term 4, or Term 5 in Table 1 for eight protein test systems.

| Protein | Term 1 | Term 2 | Term 3 | Term 4 | Term 5 (SPI) |
| --- | --- | --- | --- | --- | --- |
| FABP4 | 0.7754 | 0.5105 | -0.0760 | 0.4330 | 0.9286 |
| BRAF | 0.5537 | 0.2062 | -0.2100 | 0.2168 | 0.8256 |
| ADRB1 | 0.4072 | 0.3350 | -0.0403 | 0.2140 | 0.8271 |
| TGFR1 | 0.6775 | 0.6270 | 0.2377 | 0.1042 | 0.9050 |
| SAHH | 0.9225 | 0.9680 | 0.6775 | -0.6537 | -0.3465 |
| THB | 0.8709 | 0.7562 | -0.0908 | -0.5561 | 0.4285 |
| PA2GA | 0.8956 | 0.3208 | -0.4886 | 0.3952 | 0.9432 |
| CP3A4 | 0.7381 | 0.0383 | -0.1843 | 0.3974 | 0.7751 |
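As an illustration of how an entry in Table 4 relates to the per-structure data, the sketch below computes a Pearson correlation coefficient between the BEDROC and SPI values listed for the 13 TGFR1 structures in Table 8. It assumes that the tabulated coefficients are Pearson coefficients, which is consistent with the TGFR1 value; the helper name `pearson` is hypothetical.

```python
import math

# BEDROC and SPI for the 13 TGFR1 structures, in the row order of Table 8.
bedroc_tgfr1 = [0.700, 0.632, 0.584, 0.576, 0.546, 0.518, 0.460,
                0.448, 0.426, 0.407, 0.335, 0.288, 0.095]
spi_tgfr1 = [0.759, 0.707, 0.684, 0.797, 0.759, 0.504, 0.609,
             0.632, 0.609, 0.481, 0.150, 0.331, 0.000]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_tgfr1 = pearson(bedroc_tgfr1, spi_tgfr1)
print(round(r_tgfr1, 3))
```

The result is close to the 0.9050 reported for TGFR1 in the Term 5 column of Table 4.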


Table 5: Screening performance to 34 FABP4 crystal structures measured by different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | RES (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3FR2 | 0.828 | 14.07 | 0.91 | 55 | 8.7 | 0.872 | -10.274 | 2.20 | 250 |
| 3FR5 | 0.778 | 13.22 | 0.91 | 47 | 8.5 | 0.872 | -9.710 | 2.20 | 289 |
| 3FR4 | 0.740 | 12.57 | 0.88 | 49 | 7.9 | 0.809 | -10.308 | 2.16 | 283 |
| 2HNX | 0.691 | 11.74 | 0.89 | 47 | 7.4 | 0.723 | -9.083 | 1.50 | 234 |
| 2NNQ | 0.674 | 11.46 | 0.88 | 36 | 7.4 | 0.830 | -10.672 | 1.80 | 390 |
| 3HK1 | 0.658 | 11.17 | 0.89 | 34 | 7.6 | 0.894 | -11.016 | 1.70 | 256 |
| 1LID | 0.628 | 10.66 | 0.87 | 34 | 7.9 | 0.872 | -11.255 | 1.60 | 252 |
| 2Q9S | 0.620 | 10.54 | 0.88 | 34 | 7.4 | 0.574 | -8.651 | 2.30 | 249 |
| 1LIF | 0.576 | 9.78 | 0.86 | 25 | 7.9 | 0.851 | -11.164 | 1.60 | 256 |
| 3P6H | 0.575 | 9.77 | 0.88 | 40 | 6.6 | 0.681 | -9.195 | 1.15 | 175 |
| 3P6G | 0.550 | 9.34 | 0.86 | 28 | 6.8 | 0.617 | -9.392 | 1.20 | 176 |
| 1TOW | 0.549 | 9.33 | 0.89 | 34 | 7.0 | 0.617 | -9.192 | 2.00 | 211 |
| 1ADL# | 0.531 | 9.02 | 0.86 | 32 | 6.8 | 0.787 | -11.087 | 1.60 | 271 |
| 1TOU | 0.522 | 8.87 | 0.86 | 23 | 7.2 | 0.553 | -8.469 | 2.00 | 232 |
| 1LIC | 0.517 | 8.79 | 0.84 | 25 | 7.0 | 0.809 | -10.555 | 1.60 | 258 |
| 1LIE | 0.516 | 8.76 | 0.81 | 32 | 6.4 | 0.702 | -11.005 | 1.60 | 223 |
| 2QM9 | 0.499 | 8.48 | 0.86 | 15 | 7.0 | 0.511 | -8.124 | 2.31 | 340 |
| 3P6F | 0.492 | 8.36 | 0.81 | 34 | 5.9 | 0.404 | -8.703 | 1.20 | 145 |
| 1LIB* | 0.470 | 7.98 | 0.83 | 25 | 6.4 | 0.362 | -8.008 | 1.70 | |
| 3JS1# | 0.409 | 6.94 | 0.84 | 13 | 6.6 | 0.362 | -7.948 | 1.81 | 137 |
| 3JSQ | 0.395 | 6.70 | 0.80 | 15 | 5.7 | 0.234 | -7.923 | 2.30 | 140 |
| 1AB0* | 0.392 | 6.66 | 0.77 | 23 | 4.0 | 0.362 | -9.160 | 1.90 | |
| 3Q6L* | 0.351 | 5.97 | 0.82 | 2.1 | 5.9 | 0.128 | -7.433 | 1.40 | |
| 3RZY* | 0.346 | 5.88 | 0.81 | 6.4 | 5.1 | 0.085 | -7.448 | 1.08 | |
| 3P6D | 0.345 | 5.86 | 0.77 | 13 | 5.1 | 0.362 | -9.064 | 1.06 | 163 |
| 2ANS | 0.318 | 5.39 | 0.78 | 2.1 | 5.5 | 0.170 | -7.686 | 2.50 | 221 |
| 3P6C | 0.315 | 5.35 | 0.72 | 8.5 | 4.5 | 0.106 | -8.002 | 1.25 | 131 |
| 1A2D# | 0.251 | 4.26 | 0.73 | 11 | 3.8 | 0.021 | -7.328 | 2.40 | 141 |
| 1G7N* | 0.241 | 4.09 | 0.52 | 6.4 | 3.2 | 0.000 | -6.559 | 1.50 | |
| 3P6E | 0.236 | 4.01 | 0.74 | 11 | 4.0 | 0.149 | -8.312 | 1.08 | 149 |
| 1A18# | 0.164 | 2.78 | 0.74 | 0 | 3.6 | 0.064 | -7.234 | 2.40 | 191 |
| 1G74 | 0.152 | 2.58 | 0.51 | 2.1 | 3.0 | 0.043 | -7.345 | 1.70 | 257 |
| 1ACD* | 0.140 | 2.38 | 0.67 | 2.1 | 2.1 | 0.000 | -6.999 | 2.70 | |
| 1ALB* | 0.121 | 2.06 | 0.69 | 0 | 2.5 | 0.000 | -7.072 | 2.50 | |

# Protein structures co-crystallized with covalent ligands.
* Protein structures without ligand bound (apo form).
Bold: The best number for each column (lowest for LDE and RES; highest for all others)
Bold: the worst number for each column (highest for LDE and RES; lowest for all others)
SPI, Screening performance index; LDE, lowest docking energy; RES, resolution of crystal structure; MV, molecular volume.


Table 6: Screening performance to 36 BRAF crystal structures measured by different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | Res. (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3IDP | 0.633 | 10.93 | 0.91 | 35 | 7.5 | 0.816 | -14.500 | 2.70 | 373 |
| 1UWJ | 0.568 | 9.81 | 0.90 | 33 | 7.0 | 0.796 | -14.659 | 3.50 | 338 |
| 1UWH | 0.554 | 9.39 | 0.89 | 32 | 6.1 | 0.724 | -14.099 | 2.95 | 340 |
| 4KSQ | 0.533 | 9.20 | 0.87 | 27 | 6.3 | 0.737 | -14.416 | 3.30 | 412 |
| 4MNF | 0.518 | 8.95 | 0.84 | 33 | 5.9 | 0.520 | -12.361 | 2.80 | 261 |
| 3PRI | 0.503 | 8.69 | 0.82 | 28 | 6.3 | 0.500 | -11.984 | 3.50 | 304 |
| 2FB8 | 0.500 | 8.63 | 0.84 | 30 | 5.5 | 0.507 | -12.193 | 2.90 | 368 |
| 4G9R | 0.496 | 8.57 | 0.86 | 27 | 6.6 | 0.789 | -14.713 | 3.20 | 374 |
| 4H58 | 0.496 | 8.57 | 0.87 | 28 | 6.0 | 0.553 | -11.156 | 3.10 | 311 |
| 3OG7 | 0.494 | 8.52 | 0.87 | 22 | 6.2 | 0.428 | -10.612 | 2.45 | 366 |
| 4DBN | 0.494 | 8.52 | 0.86 | 32 | 5.7 | 0.599 | -12.939 | 3.15 | 408 |
| 3D4Q | 0.489 | 8.44 | 0.84 | 30 | 5.7 | 0.618 | -12.750 | 2.80 | 298 |
| 3PSD | 0.470 | 8.12 | 0.85 | 24 | 5.7 | 0.678 | -12.595 | 3.60 | 302 |
| 3PPJ | 0.465 | 8.04 | 0.83 | 24 | 5.4 | 0.566 | -11.728 | 3.70 | 260 |
| 4KSP | 0.464 | 8.01 | 0.85 | 23 | 5.8 | 0.618 | -13.631 | 2.93 | 413 |
| 3PSB | 0.461 | 7.97 | 0.85 | 24 | 5.7 | 0.572 | -11.686 | 3.40 | 270 |
| 4E26 | 0.461 | 7.96 | 0.81 | 27 | 5.4 | 0.322 | -10.706 | 2.55 | 258 |
| 3PPK | 0.458 | 7.91 | 0.82 | 28 | 5.5 | 0.507 | -12.137 | 3.00 | 303 |
| 4JVG | 0.454 | 7.83 | 0.85 | 24 | 5.8 | 0.651 | -14.052 | 3.09 | 427 |
| 4FC0 | 0.451 | 7.79 | 0.85 | 28 | 5.4 | 0.625 | -13.619 | 2.95 | 426 |
| 3PRF | 0.447 | 7.73 | 0.78 | 27 | 5.1 | 0.401 | -10.885 | 2.90 | 255 |
| 3II5 | 0.405 | 6.99 | 0.81 | 20 | 5.1 | 0.421 | -12.369 | 2.79 | 407 |
| 3TV6 | 0.377 | 6.52 | 0.80 | 14 | 5.0 | 0.368 | -11.302 | 3.30 | 294 |
| 4PP7 | 0.376 | 6.49 | 0.81 | 14 | 5.5 | 0.151 | -9.574 | 3.40 | 307 |
| 3Q4C | 0.374 | 6.47 | 0.77 | 24 | 4.7 | 0.230 | -10.359 | 3.20 | 238 |
| 3Q96 | 0.374 | 6.45 | 0.82 | 21 | 4.6 | 0.414 | -13.563 | 3.10 | 412 |
| 4FK3 | 0.362 | 6.25 | 0.83 | 9.9 | 5.3 | 0.151 | -11.096 | 2.65 | 322 |
| 3C4C | 0.351 | 6.05 | 0.80 | 9.9 | 5.0 | 0.349 | -11.446 | 2.57 | 288 |
| 4E4X | 0.347 | 5.99 | 0.83 | 9.2 | 4.8 | 0.303 | -11.531 | 3.60 | 331 |
| 4EHG | 0.335 | 5.78 | 0.82 | 12 | 4.5 | 0.355 | -10.940 | 3.50 | 312 |
| 3TV4 | 0.327 | 5.65 | 0.82 | 7.2 | 5.3 | 0.059 | -10.367 | 3.40 | 279 |
| 4MBJ | 0.288 | 4.97 | 0.76 | 11 | 3.9 | 0.362 | -10.613 | 3.60 | 283 |
| 4EHE | 0.268 | 4.63 | 0.76 | 7.9 | 3.9 | 0.105 | -9.977 | 3.30 | 299 |
| 3SKC | 0.244 | 4.21 | 0.78 | 7.9 | 3.8 | 0.197 | -11.391 | 3.20 | 325 |
| 4G9C | 0.241 | 4.15 | 0.75 | 14 | 3.1 | 0.342 | -13.520 | 3.50 | 370 |
| 4MNE | 0.234 | 4.03 | 0.67 | 13 | 3.0 | 0.211 | -11.213 | 2.85 | 293 |

See footnote for Table 5.


Table 7: Screening performance to 16 ADRB1 crystal structures measured by different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | Res. (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4AMI | 0.454 | 7.81 | 0.84 | 25 | 6.0 | 0.749 | -12.423 | 3.20 | 290 |
| 4AMJ | 0.366 | 6.30 | 0.83 | 14 | 5.3 | 0.704 | -12.635 | 2.30 | 331 |
| 2Y02 | 0.351 | 6.05 | 0.82 | 15 | 5.1 | 0.611 | -12.173 | 2.60 | 294 |
| 2VT4 | 0.349 | 6.00 | 0.82 | 12 | 5.2 | 0.700 | -13.193 | 2.70 | 229 |
| 2Y01 | 0.337 | 5.80 | 0.80 | 16 | 4.5 | 0.534 | -11.738 | 2.60 | 247 |
| 2Y03 | 0.322 | 5.54 | 0.79 | 13 | 4.7 | 0.352 | -11.406 | 2.85 | 170 |
| 2YCY | 0.322 | 5.55 | 0.77 | 11 | 4.6 | 0.462 | -11.948 | 3.15 | 231 |
| 2YCX | 0.309 | 5.32 | 0.79 | 12 | 4.5 | 0.619 | -12.247 | 3.25 | 232 |
| 2Y00 | 0.300 | 5.16 | 0.78 | 13 | 4.3 | 0.559 | -12.172 | 2.50 | 243 |
| 3ZPQ | 0.297 | 5.12 | 0.74 | 13 | 4.3 | 0.547 | -12.265 | 2.80 | 159 |
| 4BVN | 0.294 | 5.06 | 0.75 | 12 | 4.5 | 0.583 | -14.922 | 2.10 | 232 |
| 2YCZ | 0.289 | 4.98 | 0.77 | 10 | 4.0 | 0.466 | -12.868 | 3.65 | 254 |
| 3ZPR | 0.280 | 4.82 | 0.76 | 10 | 4.3 | 0.413 | -10.700 | 2.70 | 186 |
| 2YCW | 0.277 | 4.77 | 0.79 | 9.7 | 4.0 | 0.502 | -11.690 | 3.00 | 242 |
| 2Y04 | 0.266 | 4.58 | 0.77 | 9.3 | 4.0 | 0.377 | -10.526 | 3.05 | 205 |
| 4GPO* | 0.188 | 3.24 | 0.72 | 2.8 | 3.2 | 0.028 | -10.286 | 3.50 | |

See footnote for Table 5.


Table 8: Screening performance to 13 TGFR1 crystal structures using different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | Res. (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3TZM | 0.700 | 12.05 | 0.93 | 38 | 8.3 | 0.759 | -11.569 | 1.70 | 297 |
| 2X7O | 0.632 | 10.87 | 0.93 | 29 | 7.9 | 0.707 | -10.827 | 3.70 | 406 |
| 3KCF | 0.584 | 10.06 | 0.91 | 32 | 7.6 | 0.684 | -11.356 | 2.80 | 294 |
| 2WOT | 0.576 | 9.92 | 0.91 | 26 | 7.7 | 0.797 | -11.410 | 1.85 | 369 |
| 3HMM | 0.546 | 9.41 | 0.88 | 29 | 6.5 | 0.759 | -11.262 | 1.70 | 246 |
| 1RW8 | 0.518 | 8.92 | 0.87 | 26 | 6.8 | 0.504 | -11.168 | 2.40 | 229 |
| 1VJY | 0.460 | 7.91 | 0.89 | 17 | 6.3 | 0.609 | -11.832 | 2.00 | 225 |
| 3FAA | 0.448 | 7.72 | 0.86 | 27 | 5.4 | 0.632 | -10.958 | 3.35 | 276 |
| 3GXL | 0.426 | 7.34 | 0.85 | 15 | 5.9 | 0.609 | -11.405 | 1.80 | 274 |
| 1PY5 | 0.407 | 7.01 | 0.84 | 22 | 5.4 | 0.481 | -10.803 | 2.30 | 215 |
| 1IAS* | 0.335 | 5.77 | 0.74 | 17 | 4.7 | 0.150 | -10.678 | 2.90 | |
| 2WOU | 0.288 | 4.96 | 0.78 | 11 | 4.1 | 0.331 | -11.592 | 2.30 | 280 |
| 1B6C* | 0.095 | 1.64 | 0.72 | 1.5 | 1.5 | 0.000 | -8.227 | 2.60 | |

See footnote for Table 5.


Table 9: Screening performance to 28 SAHH crystal structures measured by different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | Res. (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3DHY | 0.981 | 16.49 | 0.99 | 56 | 10 | 1.000 | -12.608 | 2.00 | 228 |
| 1V8B | 0.948 | 15.93 | 0.98 | 53 | 9.8 | 0.810 | -11.383 | 2.40 | 190 |
| 3G1U | 0.931 | 15.65 | 0.96 | 56 | 9.5 | 0.778 | -11.177 | 2.20 | 189 |
| 3OND | 0.909 | 15.28 | 0.97 | 49 | 9.7 | 0.810 | -11.698 | 1.17 | 192 |
| 1D4F | 0.897 | 15.08 | 0.88 | 56 | 8.9 | 0.810 | -12.400 | 2.80 | 190 |
| 3GLQ | 0.894 | 15.02 | 0.89 | 56 | 8.9 | 0.762 | -12.386 | 2.30 | 194 |
| 4PGF | 0.894 | 15.03 | 0.90 | 56 | 8.9 | 0.635 | -11.794 | 2.59 | 187 |
| 3N58 | 0.889 | 14.94 | 0.89 | 56 | 8.9 | 0.778 | -11.579 | 2.39 | 188 |
| 1LI4 | 0.888 | 14.93 | 0.89 | 54 | 9.1 | 0.778 | -12.121 | 2.01 | 190 |
| 4PFJ | 0.884 | 14.86 | 0.90 | 54 | 9.1 | 0.667 | -12.474 | 2.30 | 187 |
| 1KY5 | 0.878 | 14.76 | 0.88 | 56 | 8.7 | 0.762 | -11.880 | 2.80 | 190 |
| 1XWF | 0.877 | 14.74 | 0.98 | 46 | 9.7 | 0.635 | -11.951 | 2.80 | 187 |
| 3CE6 | 0.874 | 14.70 | 0.88 | 56 | 8.7 | 0.746 | -12.476 | 1.60 | 186 |
| 2ZIZ | 0.869 | 14.60 | 0.86 | 56 | 8.6 | 0.762 | -11.988 | 2.20 | 194 |
| 3ONF | 0.868 | 14.59 | 0.98 | 48 | 9.8 | 0.873 | -12.081 | 2.00 | 185 |
| 2ZJ0 | 0.865 | 15.54 | 0.85 | 56 | 8.6 | 0.730 | -11.446 | 2.42 | 196 |
| 3NJ4 | 0.865 | 14.54 | 0.86 | 54 | 8.6 | 0.794 | -12.104 | 2.50 | 198 |
| 2H5L | 0.854 | 14.36 | 0.86 | 54 | 8.7 | 0.730 | -11.390 | 2.80 | 188 |
| 2ZJ1 | 0.844 | 14.19 | 0.89 | 53 | 9.1 | 0.778 | -12.444 | 2.01 | 188 |
| 1K0U | 0.842 | 14.15 | 0.89 | 49 | 8.9 | 0.714 | -11.366 | 3.00 | 178 |
| 1A7A | 0.821 | 13.80 | 0.86 | 51 | 8.4 | 0.524 | -11.179 | 2.80 | 168 |
| 3ONE | 0.764 | 12.85 | 0.80 | 49 | 7.8 | 0.365 | -10.818 | 1.35 | 101 |
| 3H9U | 0.718 | 12.06 | 0.69 | 43 | 7.0 | 0.190 | -10.884 | 1.90 | 98 |
| 1B3R* | 0.357 | 6.00 | 0.83 | 11 | 5.1 | 0.000 | -7.517 | 2.80 | |
| 1KY4* | 0.237 | 3.98 | 0.79 | 6.4 | 3.5 | 0.000 | -7.856 | 2.80 | |
| 3D64* | 0.234 | 3.94 | 0.82 | 4.8 | 4.1 | 0.000 | -7.709 | 2.30 | |
| 3X2F* | 0.054 | 0.90 | 0.62 | 0 | 1.3 | 0.000 | -6.887 | 2.04 | |
| 3X2E* | 0.031 | 0.53 | 0.52 | 0 | 0.95 | 0.000 | -6.478 | 2.85 | |

See footnote for Table 5.


Table 10: Screening performance to 18 THB crystal structures measured by different metrics.

| PDB id | BEDROC | RIE | AUAC | 1% EF | 10% EF | SPI | LDE (kcal/mol) | Res. (Å) | MV (Å2) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1Q4X | 0.523 | 9.16 | 0.84 | 35 | 6.5 | 0.631 | -14.311 | 2.80 | 308 |
| 3GWS | 0.478 | 8.37 | 0.62 | 31 | 5.2 | 0.388 | -13.458 | 2.20 | 292 |
| 2J4A | 0.452 | 7.91 | 0.56 | 33 | 4.9 | 0.417 | -14.212 | 2.20 | 273 |
| 4ZO1 | 0.448 | 7.85 | 0.67 | 26 | 5.4 | 0.388 | -13.391 | 3.22 | 299 |
| 3IMY | 0.437 | 7.65 | 0.70 | 25 | 5.2 | 0.320 | -12.157 | 2.55 | 271 |
| 1R6G | 0.428 | 7.48 | 0.81 | 20 | 5.5 | 0.398 | -12.781 | 3.00 | 362 |
| 1NAX | 0.425 | 7.43 | 0.55 | 30 | 4.8 | 0.340 | -13.123 | 2.70 | 262 |
| 1N46 | 0.419 | 7.33 | 0.62 | 30 | 4.8 | 0.311 | -13.649 | 2.20 | 295 |
| 2PIN | 0.411 | 7.20 | 0.54 | 26 | 4.3 | 0.388 | -13.730 | 2.30 | 274 |
| 1NQ1 | 0.405 | 7.09 | 0.59 | 26 | 4.7 | 0.301 | -12.757 | 2.90 | 268 |
| 3D57 | 0.400 | 7.01 | 0.59 | 25 | 4.6 | 0.388 | -13.439 | 2.20 | 270 |
| 1Y0X | 0.400 | 6.99 | 0.60 | 25 | 4.6 | 0.330 | -12.836 | 3.10 | 328 |
| 1XZX | 0.388 | 6.78 | 0.54 | 27 | 4.5 | 0.320 | -13.471 | 2.50 | 299 |
| 1NQ0 | 0.373 | 6.54 | 0.59 | 21 | 4.4 | 0.359 | -12.854 | 2.40 | 267 |
| 1BSX | 0.371 | 6.48 | 0.68 | 21 | 4.8 | 0.330 | -12.737 | 3.70 | 296 |
| 3JZC | 0.362 | 6.34 | 0.56 | 24 | 4.5 | 0.243 | -12.198 | 2.50 | 269 |
| 1NUO | 0.342 | 5.98 | 0.43 | 21 | 3.9 | 0.194 | -11.967 | 3.10 | 269 |
| 1NQ2 | 0.294 | 5.14 | 0.64 | 13 | 4.0 | 0.165 | -11.669 | 2.40 | 281 |

See footnote for Table 5.


Table 11: Screening performance to 20 PA2GA crystal structures measured by different metrics.

PDB id BEDROC  RIE    AUAC  1% EF  10% EF  SPI    LDE (kcal/mol)  Res. (Å)  MV (Å2)
1DB4   0.531   8.85   0.79  33     5.9     0.677  -12.457         2.20      317
3U8H   0.528   8.79   0.78  40     5.3     0.737  -12.181         2.30      396
1DB5   0.515   8.59   0.79  31     5.7     0.778  -12.151         2.80      305
1KQU   0.503   8.38   0.77  34     5.1     0.727  -12.001         2.10      330
1KVO   0.500   8.32   0.77  36     4.8     0.717  -11.841         2.00      410
1J1A   0.495   8.24   0.75  35     4.7     0.687  -11.763         2.20      409
1AYP   0.458   7.63   0.76  32     4.6     0.697  -12.608         2.57      443
3U8D   0.440   7.33   0.76  26     5.2     0.677  -12.336         1.80      333
1POE   0.396   6.60   0.74  17     4.7     0.737  -11.276         2.10      385
2OLI   0.240   3.99   0.68  2      3.8     0.273  -8.042          2.21      151
1ZYX   0.237   3.94   0.70  6.1    3.6     0.182  -7.657          1.95      298
1DCY   0.216   3.60   0.71  8.2    3.5     0.657  -10.854         2.70      256
1SV3   0.121   2.01   0.65  1      2.3     0.071  -7.565          1.35      126
1BBC*  0.105   1.76   0.49  5.1    1.3     0.000  -6.861          2.20
1N29*  0.074   1.24   0.60  0      1.1     0.091  -7.757          2.60
2ARM   0.059   0.98   0.58  0      1.3     0.010  -7.016          1.23      239
1POD*  0.049   0.81   0.49  1      0.81    0.071  -8.974          2.10      #
3U8I   0.047   0.78   0.28  1      0.81    0.061  -9.592          1.10      125
1N28*  0.042   0.69   0.46  0      0.71    0.030  -7.526          1.50
3U8B*  0.041   0.68   0.50  0      0.81    0.040  -7.459          2.30

See footnote for Table 5.


Table 12: Screening performance to 24 CP3A4 crystal structures measured by different metrics.

PDB id BEDROC  RIE    AUAC  1% EF  10% EF  SPI    LDE (kcal/mol)  Res. (Å)  MV (Å2)
4I4H   0.223   3.89   0.60  11     3.2     0.712  -11.413         2.90      573
3NXU   0.212   3.68   0.59  9.4    2.6     0.653  -11.120         2.00      569
4K9W   0.202   3.52   0.62  8.8    2.6     0.571  -10.010         2.40      519
4D6Z   0.198   3.45   0.59  9.4    2.5     0.559  -11.415         1.93      272
3UA1   0.197   3.43   0.60  9.4    2.7     0.694  -10.387         2.15      471
4K9V   0.193   3.37   0.59  10     2.6     0.682  -10.793         2.60      439
4NY4   0.186   3.25   0.59  8.8    2.4     0.482  -9.789          2.95      348
3TJS   0.181   3.15   0.59  6.5    2.5     0.288  -9.234          2.25      463
4I4G   0.175   3.04   0.57  10     2.4     0.459  -11.079         2.72      551
4K9X   0.174   3.02   0.58  7.6    2.5     0.400  -9.558          2.76      400
4K9U   0.165   2.88   0.59  6.5    2.2     0.194  -9.610          2.85      439
4D78   0.157   2.74   0.58  7.6    2.2     0.488  -9.680          2.80      349
1W0G   0.156   2.71   0.58  4.7    2.2     0.224  -9.356          2.74      191
1W0F*  0.146   2.55   0.52  7.6    1.8     0.353  -9.391          2.65
4D75   0.143   2.50   0.59  5.3    2.2     0.582  -10.468         2.25      258
1W0E*  0.139   2.42   0.54  4.1    1.9     0.229  -8.855          2.80
4D7D   0.136   2.36   0.56  4.7    1.9     0.529  -11.253         2.76      368
4K9T   0.135   2.35   0.58  2.9    2.2     0.635  -10.283         2.50      414
2V0M   0.125   2.18   0.54  4.1    1.9     0.494  -9.603          2.80      397
2J0D   0.119   2.07   0.56  3.5    1.8     0.141  -8.550          2.75      580
1TQN*  0.074   1.29   0.51  2.3    1.3     0.059  -8.422          2.05
4I3Q*  0.064   1.12   0.45  1.2    1.2     0.082  -8.247          2.60
5A1P*  0.064   1.12   0.45  2.3    1.0     0.100  -8.354          2.50
5A1R*  0.034   0.59   0.43  0.59   0.47    0.059  -8.800          2.45

See footnote for Table 5.


Table 13: Correlation coefficients between BEDROC and the lowest docking energy, the resolution of the crystal structure, or the molecular volume for eight protein test systems.

Correlation Coefficient            FABP4    BRAF     ADRB1    TGFR1    SAHH     THB      PA2GA    CP3A4
Lowest Docking Energy (LDE)        -0.7754  -0.5537  -0.4072  -0.6775  -0.9680  -0.7562  -0.8956  -0.7381
Crystal Resolution (RES)           -0.0825  -0.2354  -0.2534  -0.1006  -0.1905  -0.0910  0.4196   -0.0203
Molecular Volume (MV)              0.5408   0.2288   0.5874   0.4942   0.8751   0.2636   0.8138   0.3088
Screening Performance Index (SPI)  0.9286   0.8256   0.8271   0.9050   0.9225   0.8709   0.9432   0.7751
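As an illustration of how an entry of Table 13 can be checked, the sketch below computes a plain Pearson correlation coefficient between BEDROC and SPI, and between BEDROC and LDE, for the SAHH system. It assumes that the reported coefficients are ordinary Pearson coefficients taken over all 28 structures, which this part of the manuscript does not state explicitly. The numbers are transcribed from Table 9, and the helper name `pearson` is ours, not the authors'.

```python
import math

# (PDB id, BEDROC, SPI, LDE in kcal/mol), transcribed from Table 9 (SAHH).
SAHH = [
    ("3DHY", 0.981, 1.000, -12.608), ("1V8B", 0.948, 0.810, -11.383),
    ("3G1U", 0.931, 0.778, -11.177), ("3OND", 0.909, 0.810, -11.698),
    ("1D4F", 0.897, 0.810, -12.400), ("3GLQ", 0.894, 0.762, -12.386),
    ("4PGF", 0.894, 0.635, -11.794), ("3N58", 0.889, 0.778, -11.579),
    ("1LI4", 0.888, 0.778, -12.121), ("4PFJ", 0.884, 0.667, -12.474),
    ("1KY5", 0.878, 0.762, -11.880), ("1XWF", 0.877, 0.635, -11.951),
    ("3CE6", 0.874, 0.746, -12.476), ("2ZIZ", 0.869, 0.762, -11.988),
    ("3ONF", 0.868, 0.873, -12.081), ("2ZJ0", 0.865, 0.730, -11.446),
    ("3NJ4", 0.865, 0.794, -12.104), ("2H5L", 0.854, 0.730, -11.390),
    ("2ZJ1", 0.844, 0.778, -12.444), ("1K0U", 0.842, 0.714, -11.366),
    ("1A7A", 0.821, 0.524, -11.179), ("3ONE", 0.764, 0.365, -10.818),
    ("3H9U", 0.718, 0.190, -10.884), ("1B3R", 0.357, 0.000, -7.517),
    ("1KY4", 0.237, 0.000, -7.856), ("3D64", 0.234, 0.000, -7.709),
    ("3X2F", 0.054, 0.000, -6.887), ("3X2E", 0.031, 0.000, -6.478),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


bedroc = [row[1] for row in SAHH]
spi = [row[2] for row in SAHH]
lde = [row[3] for row in SAHH]

r_spi = pearson(bedroc, spi)  # compare with the SPI row, SAHH column
r_lde = pearson(bedroc, lde)  # compare with the LDE row, SAHH column
print(f"SAHH: r(BEDROC, SPI) = {r_spi:.4f}, r(BEDROC, LDE) = {r_lde:.4f}")
```

The printed values can be set against the SAHH column of Table 13. A negative coefficient for LDE is expected, because a more negative docking energy accompanies a higher BEDROC.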


Table 14: BEDROC ranks for the best crystal structures identified by four other screening performance metrics and vice versa for eight protein test systems.

        Rank by BEDROC                          Rank by each measure for the best structure identified by BEDROC
System  SPI   LDE   RES               MV        SPI      LDE   RES             MV
FABP4   6th   7th   25th              5th       2nd      9th   25th/26th       11th
BRAF    1st   8th   10th              19th      1st      3rd   5th             9th
ADRB1   1st   11th  11th              2nd       1st      4th   13th            3rd
TGFR1   4th   7th   1st/5th           2nd       2nd      3rd   1st/2nd         3rd
SAHH    1st   1st   4th               1st       1st      1st   5th/6th         1st
THB     1st   1st   2nd/3rd/8th/11th  6th       1st      1st   12th            3rd
PA2GA   3rd   7th   18th              7th       8th/9th  2nd   11th/12th/13th  8th
CP3A4   1st   4th   4th               18th      1st      2nd   23rd            2nd

*Multiple numbers appear (e.g., 1st/5th) when structures tied. SPI, screening performance index; LDE, lowest docking energy; RES, resolution; MV, molecular volume.
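The tied entries in Table 14 can be reproduced with a small ranking routine. The sketch below is a minimal illustration, not the authors' procedure: it assumes that a lower resolution value ranks better and that tied structures share a range of ranks. The resolutions are transcribed from Table 9, and the names `SAHH_RES` and `rank_range` are hypothetical.

```python
# Crystal resolution (Å) of the 28 SAHH structures, transcribed from Table 9.
SAHH_RES = {
    "3DHY": 2.00, "1V8B": 2.40, "3G1U": 2.20, "3OND": 1.17, "1D4F": 2.80,
    "3GLQ": 2.30, "4PGF": 2.59, "3N58": 2.39, "1LI4": 2.01, "4PFJ": 2.30,
    "1KY5": 2.80, "1XWF": 2.80, "3CE6": 1.60, "2ZIZ": 2.20, "3ONF": 2.00,
    "2ZJ0": 2.42, "3NJ4": 2.50, "2H5L": 2.80, "2ZJ1": 2.01, "1K0U": 3.00,
    "1A7A": 2.80, "3ONE": 1.35, "3H9U": 1.90, "1B3R": 2.80, "1KY4": 2.80,
    "3D64": 2.30, "3X2F": 2.04, "3X2E": 2.85,
}


def rank_range(values, key, lower_is_better=True):
    """Return (first, last) rank of `key`; first < last when it is tied."""
    target = values[key]
    if lower_is_better:
        better = sum(1 for v in values.values() if v < target)
    else:
        better = sum(1 for v in values.values() if v > target)
    # Count every structure sharing the target value, including `key` itself.
    tied = sum(1 for v in values.values() if v == target)
    return better + 1, better + tied


# 3DHY is the best SAHH structure by BEDROC (Table 9).
first, last = rank_range(SAHH_RES, "3DHY")
print(f"3DHY ranks {first} to {last} by resolution")
```

For 3DHY this yields ranks 5 to 6, which corresponds to the "5th/6th" entry in the SAHH row. For measures where a larger value is assumed to be better, `lower_is_better=False` applies.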


Table 15: Correlation coefficients between SPI and BEDROC or AUAC obtained from SP-docking or XP-docking for the first four protein systems, using “diverse” actives only.

Correlation Coefficient FABP4 BRAF ADRB1 TGFR1

SP-docking BEDROC AUAC 0.8579 0.8690 0.8922 0.7625 0.7795 0.8867 0.8992 0.9558

XP-docking BEDROC AUAC 0.9094 0.9346 0.9238 0.8269 0.7799 0.8466 0.8584 0.8793


Figure 1. Correlation plots of the screening performance index (SPI) against BEDROC: (a) for the FABP4 system (r = 0.9286), (b) for the BRAF system (r = 0.8256), (c) for the ADRB1 system (r = 0.8271), (d) for the TGFR1 system (r = 0.9050), (e) for the SAHH system (r = 0.9225), (f) for the THB system (r = 0.8709), (g) for the PA2GA system (r = 0.9432), and (h) for the CP3A4 system (r = 0.7751).

Figure 2. Correlation coefficients between five screening performance metrics (BEDROC, RIE, AUAC, 1% EF, and 10% EF) and SPI: (a) for the FABP4 system, (b) for the BRAF system, (c) for the ADRB1 system, (d) for the TGFR1 system, (e) for the SAHH system, (f) for the THB system, (g) for the PA2GA system, and (h) for the CP3A4 system.








Figure 3. Overview of the method used to calculate the SPIs from the XP-docking energies of 19 diverse actives against 34 FABP4 crystal structures (only the data for PDB ids 3FR2, 3P6H, 2QM9, and 1ACD are shown here). The example illustrates that the SPIs correlated well with the BEDROC values obtained by screening a compound library containing 19 actives and 2749 decoys against each of the four crystal structures.
