Application Note pubs.acs.org/jcim

WebMolCS: A Web-Based Interface for Visualizing Molecules in Three-Dimensional Chemical Spaces

Mahendra Awale, Daniel Probst, and Jean-Louis Reymond*

Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland

ABSTRACT: The concept of chemical space provides a convenient framework to analyze large collections of molecules by placing them in property spaces where distances represent similarities. Here we report webMolCS, a new type of web-based interface visualizing up to 5000 user-defined molecules in six different three-dimensional (3D) chemical spaces obtained by principal component analysis or similarity mapping of multidimensional property spaces describing composition (MQN: 42D molecular quantum numbers, SMIfp: 34D SMILES fingerprint), shapes and pharmacophores (APfp: 20D atom pair fingerprint, Xfp: 55D category extended atom pair fingerprint), and substructures (Sfp: 1024D binary substructure fingerprint, ECfp4: 1024D extended connectivity fingerprint). Each molecule is shown as a sphere, and its structure appears on mouseover. The sphere is color-coded by similarity to the first compound in the list, by the list rank, or by a user-defined value, which reveals the relationship between any property encoded by these values and structural similarities. WebMolCS is freely available at www.gdb.unibe.ch.



INTRODUCTION

One of the defining features of modern drug discovery is the necessity to constantly interact with hundreds to thousands of molecules, such as hit lists from virtual or experimental screening, and to make appropriate choices.1−6 The concept of chemical space can assist in understanding the structural diversity of such series of molecules by placing them in multidimensional property spaces where distances represent similarities.7−10 These property spaces can then be subjected to various dimensionality reduction methods to obtain two- or three-dimensional (2D or 3D) representations suitable for visualization.11−15

Herein we report webMolCS, a new type of web-based interface for visualizing sets of up to 5000 user-defined molecules in 3D chemical spaces and selecting subsets. WebMolCS first places the molecules in six different multidimensional property spaces representing a selection of simple molecular features computed from the 2D structures of molecules and relevant for molecular diversity and biological properties, including composition (MQN and SMIfp), shape and pharmacophores (APfp and Xfp), as well as detailed substructures (Sfp and ECfp4, Table 1). WebMolCS then projects each of these multidimensional spaces into a 3D space suitable for visualization using either principal component analysis (PCA) or similarity mapping (SIM),16−21 and offers interactive molecule view and selection features. The interface is freely available at www.gdb.unibe.ch and completes the mapping for any user-defined list of compounds in only a few minutes.

WebMolCS extends our previously reported mapplets22,23 and similarity mapplets,21 which were downloadable Java applets displaying interactive color-coded 2D maps, and our recently reported WebDrugCS, a web-based application to explore DrugBank in 3D chemical spaces.24 Specifically, WebMolCS handles user-defined compound series rather than predefined databases and displays these compounds in more diverse 3D chemical spaces than previously possible. Furthermore, WebMolCS offers enhanced visualization options in 3D by making nearest neighbor relationships visible in the form of a minimum spanning tree and by enabling color-coding by list rank or by user-defined values. The application also allows one to create compound subsets by either manual or automatic compound selection from the maps.

The main intended use of WebMolCS is to upload a list of compounds, typically top scoring compounds from a virtual screening campaign or sets of bioactive compounds, and to provide interactive visualization to assist in understanding the structural diversity at hand and selecting a smaller subset, for example toward experimental evaluation. To the best of our knowledge the web-based capabilities of WebMolCS, in terms of uploading compound sets of up to 5000 molecules, color-coded visualization of the compound set in six different chemical spaces, and subset selection features, are unprecedented and nicely complement other online library visualization tools (Table 2)25−29 as well as various desktop applications.30−36

Received: November 11, 2016
Published: March 19, 2017
© 2017 American Chemical Society

DOI: 10.1021/acs.jcim.6b00690 J. Chem. Inf. Model. 2017, 57, 643−649


Journal of Chemical Information and Modeling



[...] the form used to compute the various fingerprints. The 3D point containing the first molecule in the list, which is used for similarity sorting, is easily identifiable in these maps as a white/red blinking point. Four sets of demonstration examples with different compound set sizes are provided on the webpage, each with the complete set of the 12 possible precomputed chemical space maps, which are directly accessible. Selected views are displayed in Figure 2 and discussed below. The control panel at upper right, displayed in Figure 2A, offers the following options for the interactive view:

(a) Map Color. The default color code of the map represents the rank of the molecules when the list has been sorted by similarity to the first molecule in the list according to the fingerprint selected for map generation. For ligand-based virtual screening results this molecule should preferably be the seed molecule used for searching. The second option, “List-rank”, uses the rank of each molecule in the list originally uploaded by the user for color coding. As a third option, the map can be color-coded according to a user-defined value inserted as a third column in the input list.

(b) PC Variance Percent. This displays the variance covered by the first three principal components used to generate the 3D view. For the first three fingerprints (APfp, SMIfp, and MQN), variance coverage by the three PCs is usually high (>70%), such that the 3D space offers similar relationships as the original multidimensional space. For the more complex fingerprints Xfp, Sfp, and ECfp4, the variance covered is usually relatively low [...]

[Table 2: comparison of webMolCS with other online library visualization tools (refs 25−29); column headers and tool names are not recoverable. Surviving entry for this work: 3D; 6; 5000; yes; yes.]

aColor coding the map using any parameter of interest. bMultidimensional scaling. cExact limit on number of compounds is not available, tested here with 5000 cpds. Fps: fingerprints. NA: not available.
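The percentage reported under option (b) can be illustrated with a small calculation. The sketch below is not the webMolCS code (which the paper describes as a Java program built on JSci); it is a minimal pure-Python illustration, assuming that "variance covered" means the sum of the three largest eigenvalues of the covariance matrix divided by its trace. The function names and the use of power iteration with deflation are choices made for this example only.

```python
import math
import random

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvalues(cov, k=3, iters=500, seed=0):
    """Largest k eigenvalues of a symmetric positive semidefinite matrix.

    Uses power iteration with deflation (an assumption of this sketch,
    chosen to avoid third-party dependencies).
    """
    rng = random.Random(seed)
    d = len(cov)
    a = [row[:] for row in cov]
    values = []
    for _ in range(min(k, d)):
        v = [rng.random() + 0.1 for _ in range(d)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:      # remaining matrix is numerically zero
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = norm            # for a unit eigenvector, |A v| equals the eigenvalue
        values.append(lam)
        # Deflate: remove the component just found before the next pass.
        for i in range(d):
            for j in range(d):
                a[i][j] -= lam * v[i] * v[j]
    return values

def variance_percent(rows, k=3):
    """Percentage of total variance covered by the first k principal components."""
    cov = covariance_matrix(rows)
    total = sum(cov[i][i] for i in range(len(cov)))
    if total == 0:
        return 0.0
    return 100.0 * sum(top_eigenvalues(cov, k)) / total
```

For a fingerprint whose dimensions are strongly correlated, such as the lower-dimensional APfp, SMIfp, and MQN, this ratio tends to be high; for four uncorrelated dimensions of equal variance it is exactly 75%.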


Figure 1. (a) Starting page of webMolCS. (b) Intermediate page showing status of the job submitted by a user. For details refer to the main text.

Figure 2. Selected views of WebMolCS. (a) 5000 MQN-nearest neighbors of nicotine from GDB-13, displayed as an ECfp4 similarity map. The control panel is shown. The map is color-coded by ECfp4-similarity to nicotine. Three distinct clouds of compounds are visible. See the main text for discussion of control panel functions and details of the nicotine maps. (b) View of 1541 Aurora A inhibitors from ChEMBL in APfp direct PCA space. The color-code is by APfp-similarity to the most active compound, shown at lower left. The color changes from blue → cyan → green → yellow → red with increasing similarity value. (c) Automated selection of 20 nicotine analogs taken from the ECfp4-similarity map in part a. (d) View of 48 compounds studied for Aurora A inhibition in a SMIfp direct PCA map, showing three groups of compounds. The map is color-coded by increasing inhibitor potency. The color changes from blue → red with increasing pIC50. The most potent inhibitor is shown as the white pixel at lower right. (e) View of 197 compounds active against three different targets in a Sfp direct PCA map: 100 compounds for Focal Adhesion Kinase 1 (fak1, blue), 56 compounds for Thymidine Kinase (kith, green), and 41 compounds for Catechol O-methyltransferase (comt, red). Compounds were extracted from the DUD-E database.

[...] as can be judged by the fact that the fp-rank color-coding follows the spatial distribution and by inspecting structures in the interactive 3D maps.

(c) Axes and MST. These display the reference (PC1, PC2, PC3) axes and the minimum spanning tree (MST) calculated using Kruskal’s algorithm (Figure 2C).44,45 The MST provides direct information on the spatial relationship between different points (or molecules) in the 3D map.

(d) Point Size. This allows adjusting the size of each 3D point on the display.

(e) Set as Pivot Point. This function centers the rotation of the map on the selected 3D point. The 3D point can be selected (or unselected) by mouse double click.

(f) Reset View. This feature resets the view to the initial view and reference point.

(g) No. of Cpds and Automatic Cpd List. This function automatically picks the selected number of compounds along the list ordered by fingerprint similarity to the first compound in the list. The automatically selected points appear white. Individual molecules can be added manually to this list by clicking on the molecule ID. This function allows one to perform compound selection during virtual screening.46

(h) Show List. This opens a molecule view panel showing the selected molecules. At the top of this panel three buttons are


counterions, checking for valence errors, and adjusting the protonation state at pH 7.4.

Fingerprints. All fingerprints are calculated using the JChem Java library from ChemAxon Pvt. Ltd. Calculation of the APfp, SMIfp, MQN, and Xfp fingerprints is discussed in detail in the respective publications from our group. The Sfp fingerprint is calculated using the ChemicalFingerprint class with a path length (bond count) of 7 and a fingerprint length of 1024 bits. The ECfp4 fingerprint is calculated using the ECFP class with a bond diameter of 4 and a fingerprint length of 1024 bits.

Principal Component Analysis. The PCA is performed using an in-house Java program utilizing the JSci Java library (a science API for Java: http://jsci.sourceforge.net/) to find eigenvalues and eigenvectors from the covariance matrix. The Java source code is based on the tutorial of Lindsay I. Smith (http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf).

3D Space. For each molecule in the list, PC-1, PC-2, and PC-3 values are calculated. The largest (PCmax) and smallest (PCmin) values appearing among all PC-1, PC-2, and PC-3 values are used to define the value range ΔPC = PCmax − PCmin and to set the binning scale as ΔPC/300. The PC-1, PC-2, and PC-3 values are binned onto a 300 × 300 × 300 3D grid using the same absolute bin size on the PC-1, PC-2, and PC-3 axes. Each molecule is assigned to a point on this 3D grid.

Color Coding. Each grid point in the 3D map is color-coded according to the average rank of all the compounds at that point and the standard deviation of rank computed considering all the molecules within ±5 grid points in each direction. The hue−saturation−lightness (HSL) color space is used for color coding. The hue and saturation values are set according to the average rank and standard deviation of each grid point, respectively. The color changes from blue → cyan → green → yellow → red with an increasing average rank of compounds in a grid point.

Minimum Spanning Tree. The minimum spanning tree (MST) is created using our in-house Java-based implementation of Kruskal’s algorithm.44 The initial input for Kruskal’s algorithm is the positions of points in a 3D map. Given that Kruskal’s algorithm requires a weighted graph as input, a k-nearest neighbor graph (k-NNG)45 is constructed with the weight of the edges set to the Manhattan (city block) distance between the connected nodes (points in the 3D map). The resulting k-NNG is not guaranteed to be connected, depending on the size and sparsity of potential clusters compared to the number of nearest neighbors to be coupled during the construction of the graph. In the case of a disconnected graph, the output of our Java program is a minimum spanning forest.

webMolCS. The 3D rendering and visualization in webMolCS is supported by Three.js (http://threejs.org/), an open-source JavaScript library to create and display animated 3D computer graphics in a web browser. webMolCS has been successfully tested on the IE, Chrome, and Opera browsers. The only requirement for webMolCS is to have JavaScript enabled in the web browser.
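The grid binning and the MST construction described above can be sketched in a few lines. The following Python code is an illustration only, not the authors' Java implementation. It assumes that the global minimum PC value serves as the grid origin on all three axes, that k-NNG edges are treated as undirected, and that a brute-force neighbor search is acceptable; the function names `bin_to_grid`, `knn_edges`, and `kruskal_forest` are hypothetical.

```python
GRID = 300  # grid resolution per axis, as stated in the text

def bin_to_grid(pcs, grid=GRID):
    """Map (PC-1, PC-2, PC-3) triples onto integer grid points.

    One global minimum and one bin size are shared by all three axes.
    Using the global minimum as the origin is an assumption of this sketch.
    """
    lo = min(min(p) for p in pcs)
    hi = max(max(p) for p in pcs)
    bin_size = (hi - lo) / grid or 1.0  # fall back to 1.0 if all values are equal
    return [tuple(min(int((v - lo) / bin_size), grid - 1) for v in p) for p in pcs]

def knn_edges(points, k):
    """Undirected k-nearest-neighbor edges as sorted (weight, i, j) tuples.

    Weights are Manhattan (city block) distances. The search is brute
    force, O(n^2 log n), which is enough for an illustration.
    """
    edges = set()
    for i, p in enumerate(points):
        dists = sorted((sum(abs(a - b) for a, b in zip(p, q)), j)
                       for j, q in enumerate(points) if j != i)
        for dist, j in dists[:k]:
            edges.add((dist, min(i, j), max(i, j)))
    return sorted(edges)

def kruskal_forest(n_points, edges):
    """Kruskal's algorithm with union-find on weight-sorted edges.

    Returns a minimum spanning tree if the graph is connected,
    otherwise a minimum spanning forest (one tree per component).
    """
    parent = list(range(n_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for weight, u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:               # edge joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree
```

A typical call would bin the PC triples, keep the distinct grid points with `points = sorted(set(bin_to_grid(pcs)))`, and pass `knn_edges(points, k)` to `kruskal_forest(len(points), ...)`. With a small k and well-separated clusters the result has fewer than `len(points) - 1` edges, which is the forest case mentioned in the text.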

displayed allowing one to sort the list, save the list as a SMILES list, and clear the list.

(i) ExportCoord. This exports a text file containing the 3D-map coordinates of the compounds displayed on the map together with their SMILES strings and ID codes.

(j) Help. This opens a panel explaining the control functions in webMolCS.

The use of WebMolCS is exemplified by the demonstration examples available on the web page (Figure 2). The first example shows the main intended use of WebMolCS, which is to facilitate the visual inspection of virtual screening hit lists and the selection of subsets for experimental evaluation. In this example we have selected the 5000 MQN-nearest neighbors of the natural product nicotine found in the chemical universe database GDB-13.47−49 Visualizing these 5000 nicotine analogs in WebMolCS helps to appreciate their structural diversity. For example, the similarity map in ECfp4 similarity space reveals three main compound classes: first, N-methyl aryl-pyrrolidines and aryl-piperidines that are highly similar to the parent nicotine; second, aryl-pyrrolidines and aryl-piperidines that resemble nor-nicotine; and third, a large group of cyclic amidines (Figure 2A). The automated compound list function set to 20 compounds provides a selection of these structural types which could be envisioned for testing (Figure 2C).

The second example features 1503 Aurora A kinase inhibitors (IC50 ≤ 10 μM) extracted from ChEMBL and ordered by decreasing activity.50 Here WebMolCS allows one to browse through the structural types and appreciate the diversity of molecules reported to be active against Aurora A, as illustrated here for the APfp direct PCA map color-coded by similarity to the most active inhibitor of the series, which distributes compounds according to their size and molecular shape (Figure 2B).

The third case reports a related but much smaller set of 48 compounds recently investigated for Aurora A kinase inhibition in our group.51 Here the SMIfp direct PCA map clearly shows the three compound series explored in this study, two of which provided potent inhibitors (Figure 2D). Finally, the fourth case shows the combination of active inhibitors for three different targets taken from DUD-E,52 shown in the direct PCA of Sfp space with color-coding according to compound series, which reveals partial intermixing of two of the three series (Figure 2E).



CONCLUSION

The webMolCS web interface represents a new opportunity to apply the concept of chemical space by placing molecules in various 3D chemical spaces and visualizing their structural similarities by proximity and color-coding. Structure−activity relationships can be revealed by using the list-rank color code in an activity-sorted list of molecules or any values provided by the user such as activity. The possibility to automatically or manually select a subset of molecules furthermore allows one to select compounds from virtual screening results for experimental evaluation. The interface is freely available at www.gdb.unibe.ch.





METHODS

Processing of Molecules. Molecules are processed in SMILES format using an in-house Java program that uses the JChem Java chemistry library from ChemAxon Pvt. Ltd. as a starting point. Molecules are processed by removing

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Fax: +41 31 631 80 57.

ORCID

Jean-Louis Reymond: 0000-0003-2724-2942

Author Contributions

M.A. designed and realized webMolCS and wrote the paper. D.P. designed and realized the MST algorithm. J.-L.R. codesigned and supervised the project and wrote the paper.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported financially by the University of Berne, the Swiss National Science Foundation, and the NCCR TransCure.

REFERENCES

(1) Forino, M.; Jung, D.; Easton, J. B.; Houghton, P. J.; Pellecchia, M. Virtual Docking Approaches to Protein Kinase B Inhibition. J. Med. Chem. 2005, 48, 2278−2281.
(2) Klebe, G. Virtual Ligand Screening: Strategies, Perspectives and Limitations. Drug Discovery Today 2006, 11, 580−594.
(3) Kolb, P.; Ferreira, R. S.; Irwin, J. J.; Shoichet, B. K. Docking and Chemoinformatic Screens for New Ligands and Targets. Curr. Opin. Biotechnol. 2009, 20, 429−36.
(4) Geppert, H.; Vogt, M.; Bajorath, J. Current Trends in Ligand-Based Virtual Screening: Molecular Representations, Data Mining Methods, New Application Areas, and Performance Evaluation. J. Chem. Inf. Model. 2010, 50, 205−216.
(5) Scior, T.; Bender, A.; Tresadern, G.; Medina-Franco, J. L.; Martinez-Mayorga, K.; Langer, T.; Cuanalo-Contreras, K.; Agrafiotis, D. K. Recognizing Pitfalls in Virtual Screening: A Critical Review. J. Chem. Inf. Model. 2012, 52, 867−881.
(6) Heikamp, K.; Bajorath, J. The Future of Virtual Compound Screening. Chem. Biol. Drug Des. 2013, 81, 33−40.
(7) Wetzel, S.; Klein, K.; Renner, S.; Rauh, D.; Oprea, T. I.; Mutzel, P.; Waldmann, H. Interactive Exploration of Chemical Space with Scaffold Hunter. Nat. Chem. Biol. 2009, 5, 581−583.
(8) Ertl, P.; Rohde, B. The Molecule Cloud - Compact Visualization of Large Collections of Molecules. J. Cheminf. 2012, 4, 12.
(9) Hilbig, M.; Rarey, M. Mona 2: A Light Cheminformatics Platform for Interactive Compound Library Processing. J. Chem. Inf. Model. 2015, 55, 2071−2078.
(10) Korb, O.; Kuhn, B.; Hert, J.; Taylor, N.; Cole, J.; Groom, C.; Stahl, M. Interactive and Versatile Navigation of Structural Databases. J. Med. Chem. 2016, 59, 4257−4266.
(11) Oprea, T. I.; Gottfries, J. Chemography: The Art of Navigating in Chemical Space. J. Comb. Chem. 2001, 3, 157−166.
(12) Medina-Franco, J. L.; Maggiora, G. M.; Giulianotti, M. A.; Pinilla, C.; Houghten, R. A. A Similarity-Based Data-Fusion Approach to the Visual Characterization and Comparison of Compound Databases. Chem. Biol. Drug Des. 2007, 70, 393−412.
(13) Rosen, J.; Gottfries, J.; Muresan, S.; Backlund, A.; Oprea, T. I. Novel Chemical Space Exploration Via Natural Products. J. Med. Chem. 2009, 52, 1953−1962.
(14) Gaspar, H. A.; Baskin, I. I.; Marcou, G.; Horvath, D.; Varnek, A. Chemical Data Visualization and Analysis with Incremental Generative Topographic Mapping: Big Data Challenge. J. Chem. Inf. Model. 2015, 55, 84−94.
(15) Osolodkin, D. I.; Radchenko, E. V.; Orlov, A. A.; Voronkov, A. E.; Palyulin, V. A.; Zefirov, N. S. Progress in Visual Representations of Chemical Space. Expert Opin. Drug Discovery 2015, 10, 959−973.
(16) Benigni, R.; Cotta-Ramusino, M.; Giorgi, F.; Gallo, G. Molecular Similarity Matrices and Quantitative Structure-Activity Relationships: A Case Study with Methodological Implications. J. Med. Chem. 1995, 38, 629−635.
(17) Kubinyi, H.; Hamprecht, F. A.; Mietzner, T. Three-Dimensional Quantitative Similarity-Activity Relationships (3d Qsiar) from Seal Similarity Matrices. J. Med. Chem. 1998, 41, 2553−2564.
(18) Klein, C.; Kaiser, D.; Kopp, S.; Chiba, P.; Ecker, G. F. Similarity Based Sar (Sibar) as Tool for Early Adme Profiling. J. Comput.-Aided Mol. Des. 2002, 16, 785−793.
(19) Raghavendra, A. S.; Maggiora, G. M. Molecular Basis Sets - a General Similarity-Based Approach for Representing Chemical Spaces. J. Chem. Inf. Model. 2007, 47, 1328−1240.
(20) Maggiora, G. M.; Shanmugasundaram, V. Molecular Similarity Measures. Methods Mol. Biol. 2011, 672, 39−100.
(21) Awale, M.; Reymond, J. L. Similarity Mapplet: Interactive Visualization of the Directory of Useful Decoys and Chembl in High Dimensional Chemical Spaces. J. Chem. Inf. Model. 2015, 55, 1509−1516.
(22) Awale, M.; van Deursen, R.; Reymond, J. L. Mqn-Mapplet: Visualization of Chemical Space with Interactive Maps of Drugbank, Chembl, Pubchem, Gdb-11, and Gdb-13. J. Chem. Inf. Model. 2013, 53, 509−518.
(23) Ruddigkeit, L.; Awale, M.; Reymond, J. L. Expanding the Fragrance Chemical Space for Virtual Screening. J. Cheminf. 2014, 6, 27−39.
(24) Awale, M.; Reymond, J. L. Web-Based 3d-Visualization of the Drugbank Chemical Space. J. Cheminf. 2016, 8, 25.
(25) Rosén, J.; Lövgren, A.; Kogej, T.; Muresan, S.; Gottfries, J.; Backlund, A. Chemgps-Npweb: Chemical Space Navigation Online. J. Comput.-Aided Mol. Des. 2009, 23, 253−259.
(26) Backman, T. W. H.; Cao, Y.; Girke, T. Chemmine Tools: An Online Service for Analyzing and Clustering Small Molecules. Nucleic Acids Res. 2011, 39, W486−W491.
(27) Athanasiadis, E.; Cournia, Z.; Spyrou, G. Chembioserver: A Web-Based Pipeline for Filtering, Clustering and Visualization of Chemical Compounds Used in Drug Discovery. Bioinformatics 2012, 28, 3002−3003.
(28) Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; et al. Open Phacts: Semantic Interoperability for Drug Discovery. Drug Discovery Today 2012, 17, 1188−1198.
(29) Lu, J.; Carlson, H. A. Chemtreemap: An Interactive Map of Biochemical Similarity in Molecular Datasets. Bioinformatics 2016, btw523.
(30) Givehchi, A.; Dietrich, A.; Wrede, P.; Schneider, G. Chemspaceshuttle: A Tool for Data Mining in Drug Discovery by Classification, Projection, and 3d Visualization. QSAR Comb. Sci. 2003, 22, 549−559.
(31) Lounkine, E.; Wawer, M.; Wassermann, A. M.; Bajorath, J. Saranea: A Freely Available Program to Mine Structure−Activity and Structure−Selectivity Relationship Information in Compound Data Sets. J. Chem. Inf. Model. 2010, 50, 68−78.
(32) Gütlein, M.; Karwath, A.; Kramer, S. Ches-Mapper - Chemical Space Mapping and Visualization in 3d. J. Cheminf. 2012, 4, 7.
(33) Hoksza, D.; Škoda, P.; Voršilák, M.; Svozil, D. Molpher: A Software Framework for Systematic Chemical Space Exploration. J. Cheminf. 2014, 6, 7.
(34) Zhang, B.; Hu, Y.; Bajorath, J. Analogexplorer: A New Method for Graphical Analysis of Analog Series and Associated Structure−Activity Relationship Information. J. Med. Chem. 2014, 57, 9184−9194.
(35) Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. Datawarrior: An Open-Source Program for Chemistry Aware Data Visualization and Analysis. J. Chem. Inf. Model. 2015, 55, 460−473.
(36) Lewis, R.; Guha, R.; Korcsmaros, T.; Bender, A. Synergy Maps: Exploring Compound Combinations Using Network-Based Visualization. J. Cheminf. 2015, 7, 36.
(37) Schwartz, J.; Awale, M.; Reymond, J. L. Smifp (Smiles Fingerprint) Chemical Space for Virtual Screening and Visualization of Large Databases of Organic Molecules. J. Chem. Inf. Model. 2013, 53, 1979−1989.
(38) Nguyen, K. T.; Blum, L. C.; van Deursen, R.; Reymond, J.-L. Classification of Organic Molecules by Molecular Quantum Numbers. ChemMedChem 2009, 4, 1803−1805.
(39) van Deursen, R.; Blum, L. C.; Reymond, J. L. A Searchable Map of Pubchem. J. Chem. Inf. Model. 2010, 50, 1924−1934.
(40) Hagadone, T. R. Molecular Substructure Similarity Searching: Efficient Retrieval in Two-Dimensional Structure Databases. J. Chem. Inf. Model. 1992, 32, 515−521.
(41) Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742−754.
(42) Awale, M.; Reymond, J. L. Cluster Analysis of the Drugbank Chemical Space Using Molecular Quantum Numbers. Bioorg. Med. Chem. 2012, 20, 5372−5378.
(43) Jin, X.; Awale, M.; Zasso, M.; Kostro, D.; Patiny, L.; Reymond, J. L. Pdb-Explorer: A Web-Based Interactive Map of the Protein Data Bank in Shape Space. BMC Bioinf. 2015, 16, 339.
(44) Kruskal, J. B. On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. Proc. Am. Math. Soc. 1956, 7, 48−50.
(45) Eppstein, D.; Paterson, M. S.; Yao, F. F. On Nearest-Neighbor Graphs. Discrete Comput. Geom. 1997, 17, 263−282.
(46) Awale, M.; Reymond, J. L. A Multi-Fingerprint Browser for the Zinc Database. Nucleic Acids Res. 2014, 42, W234−W239.
(47) Blum, L. C.; Reymond, J. L. 970 Million Druglike Small Molecules for Virtual Screening in the Chemical Universe Database Gdb-13. J. Am. Chem. Soc. 2009, 131, 8732−8733.
(48) Blum, L. C.; van Deursen, R.; Reymond, J. L. Visualisation and Subsets of the Chemical Universe Database Gdb-13 for Virtual Screening. J. Comput.-Aided Mol. Des. 2011, 25, 637−647.
(49) Blum, L. C.; van Deursen, R.; Bertrand, S.; Mayer, M.; Burgi, J. J.; Bertrand, D.; Reymond, J. L. Discovery of Alpha7-Nicotinic Receptor Ligands by Virtual Screening of the Chemical Universe Database Gdb-13. J. Chem. Inf. Model. 2011, 51, 3105−3112.
(50) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; et al. Chembl: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100−D1107.
(51) Kilchmann, F.; Marcaida, M. J.; Kotak, S.; Schick, T.; Boss, S. D.; Awale, M.; Gönczy, P.; Reymond, J.-L. Discovery of a Selective Aurora a Kinase Inhibitor by Virtual Screening. J. Med. Chem. 2016, 59, 7188−7211.
(52) Mysinger, M. M.; Carchia, M.; Irwin, J. J.; Shoichet, B. K. Directory of Useful Decoys, Enhanced (Dud-E): Better Ligands and Decoys for Better Benchmarking. J. Med. Chem. 2012, 55, 6582−6594.