
Application Note Cite This: J. Chem. Inf. Model. 2018, 58, 1−7

pubs.acs.org/jcim

SmilesDrawer: Parsing and Drawing SMILES-Encoded Molecular Structures Using Client-Side JavaScript

Daniel Probst* and Jean-Louis Reymond*

Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland

ABSTRACT: Here we present SmilesDrawer, a dependency-free JavaScript component capable of both parsing and drawing SMILES-encoded molecular structures client-side, developed to be easily integrated into web projects and to display organic molecules in large numbers and in fast succession. SmilesDrawer can draw structurally and stereochemically complex structures such as maitotoxin and C60 without using templates, yet it has an exceptionally small computational footprint and low memory usage and does not require loading images or any other form of client−server communication, making it easy to integrate even in secure (intranet, firewalled) or offline applications. These features allow the rendering of thousands of molecular structure drawings on a single web page within seconds on a wide range of hardware supporting modern browsers. The source code as well as the most recent build of SmilesDrawer is available on GitHub (http://doc.gdb.tools/smilesDrawer/). Both yarn and npm packages are also available.



© 2017 American Chemical Society

INTRODUCTION

The past decade has seen the release of a myriad of web-based applications in the fields of bio- and cheminformatics. A major advantage of these browser-rendered frontends is the large variety of JavaScript libraries and components available through package managers like yarn or npm. Among these libraries, many components deal with the display and transmission of molecular structure information, in particular creating SMILES (simplified molecular-input line-entry system) from molecular structures drawn by the user,1,2 such that the molecular information can be transmitted and processed rapidly. Indeed, together with InChI,3 SMILES is the de facto standard for encoding chemical species as short, single-line ASCII strings.4 A SMILES string is created from a molecular structure by computing a spanning tree of the undirected graph representing the molecule (atoms as vertices, bonds as edges), retaining the broken cycles by indexing the removed edges (bonds) on both participating vertices (atoms), and identifying the longest path in the resulting spanning tree. Next, the SMILES is generated by following the longest path, writing out the current chemical element's symbol followed by a bond and the index of a broken cycle if available. In the case of branching vertices (atoms), each branch is written enclosed in parentheses.5 Here we address the lack of corresponding easy-to-use, small-footprint JavaScript components to perform the inverse task, that is, to draw drug-like molecular structures from SMILES, which can help in applications dealing with the display of molecular structures from very large databases such as the GDB and GDB-derived databases published by our group.6,7
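The SMILES generation procedure described above can be illustrated with a minimal Python sketch. It is not taken from SmilesDrawer or from ref 5, and it makes simplifying assumptions: all bonds are single and implicit, the graph is connected, the spanning tree is built by a depth-first search from atom 0, and the last child of each atom is treated as the main chain instead of computing the longest path. The function name `write_smiles` and the input format are hypothetical.

```python
def write_smiles(symbols, bonds):
    """Return a non-canonical SMILES string for a connected graph of single bonds."""
    neighbors = {i: [] for i in range(len(symbols))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    parent = {0: None}                      # spanning tree rooted at atom 0
    children = {i: [] for i in neighbors}
    ring_bonds = []                         # edges removed to break cycles

    def build_tree(atom):
        for nxt in neighbors[atom]:
            if nxt == parent[atom]:
                continue
            if nxt not in parent:           # tree edge: descend
                parent[nxt] = atom
                children[atom].append(nxt)
                build_tree(nxt)
            elif {atom, nxt} not in ring_bonds:
                ring_bonds.append({atom, nxt})  # broken cycle, recorded once

    build_tree(0)

    # Index every removed edge on both participating atoms.
    closures = {i: "" for i in neighbors}
    for index, bond in enumerate(ring_bonds, start=1):
        for atom in bond:
            closures[atom] += str(index)    # indices above 9 would need %nn

    def emit(atom):
        text = symbols[atom] + closures[atom]
        kids = children[atom]
        for kid in kids[:-1]:               # side branches go in parentheses
            text += "(" + emit(kid) + ")"
        if kids:                            # last child continues the chain
            text += emit(kids[-1])
        return text

    return emit(0)


ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(write_smiles(["C"] * 6, ring))                                # C1CCCCC1
print(write_smiles(["C", "C", "C", "C"], [(0, 1), (1, 2), (1, 3)]))  # CC(C)C
```

The six-membered ring shows how the removed edge reappears as the index 1 on both of its atoms, and the branched example shows the parenthesized side branch.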

Currently, most web applications and database frontends (PubChem Sketcher, Marvin JS by ChemAxon Inc.) rely on a server-side backend for providing pregenerated images, dynamically generated images, or atom coordinates.8−14 Loading information from a server has the following drawbacks: (I) Pregenerating images requires the creation, possible update, and storage of many images in persistent memory. Serving these images negatively influences loading times depending on the server hardware and the image size, especially on high-latency mobile networks.15 Pregenerated images are also only available in certain resolutions, colors, and drawing styles, possibly requiring the creation of a new set of images depending on the front-end. (II) Dynamically generating images using GET requests requires either the provisioning of a web service capable of doing so or the reliance on a service provided by a third party, thus sending potentially confidential information to an external server. (III) Calculating atom coordinates server-side, as implemented in Marvin JS by ChemAxon, with subsequent drawing of the structure client-side resolves performance issues when implemented correctly, but still requires the provisioning of a web service, resulting in infrastructure overhead and potential security issues.

Rendering a molecular structure from SMILES directly is challenging since the SMILES notation only records topology but no spatial information, in contrast to other formats such as PDB or CML,16,17 which explains the use of server-side computation to circumvent this limitation. The only currently available JavaScript component to convert SMILES to molecular structure drawings without any server-side code is OpenChemLib-JS, a feature applied by JSME.2 OpenChemLib-JS is maintained by the Cheminformatics Department of the Swiss Federal Institute of Technology and is cross-compiled from Java to JavaScript with OpenChemLib as the codebase, which is part of Actelion's DataWarrior.18 This implementation, however, has two major disadvantages: (I) The conversionless structure drawing from SMILES is implemented using SVG (scalable vector graphics), implying retained mode rendering, in which each element of the drawing (letters, lines, ...) is added to the DOM, thus generating object management overhead for the web browser. This approach leads to unpredictable and generally lower performance across different devices and browsers while complicating development given the lack of an API. A canvas depicter option for OpenChemLib-JS is available but is neither well documented nor customizable. (II) Because the codebase of OpenChemLib-JS is written in Java and relies on being cross-compiled by GWT, it is virtually impossible to customize, optimize, or adapt for integration into a web application without considerable development overhead.

Here, we introduce SmilesDrawer, a small (97 KB minified), dependency-free JavaScript component capable of drawing molecular structures from SMILES strings client-side, which is much smaller and faster and overcomes many of the limitations of OpenChemLib-JS. SmilesDrawer can be used by developers of web applications as well as JavaScript-based mobile and desktop applications to render molecular structures from SMILES strings without the need for additional libraries or communication with a server, which often presents a major drawback when processing sensitive information. The component is fully customizable and well-documented, and its source code is available under the MIT license. SmilesDrawer is written in JavaScript ES6, transpiled adhering to the current ES6 implementation status using Babel, and then packaged for both yarn and npm. SmilesDrawer is useful for web-based tools that need to display thousands of molecules because it generates the drawing from SMILES, which reduces the amount of data transmitted. SmilesDrawer has been utilized for the 3D visualization of a multitude of chemical spaces populated by SureChEMBL data.19

Received: July 18, 2017. Published: December 19, 2017.
DOI: 10.1021/acs.jcim.7b00425. Journal of Chemical Information and Modeling 2018, 58, 1−7.

Figure 1. Molecular structures drawn by SmilesDrawer. SmilesDrawer applies a dynamic system simulation based on Kamada and Kawai's algorithm to determine the position of atoms when it encounters a bridged ring in a molecule. This enables SmilesDrawer to depict a wide range of molecules, such as cubane (drawing time: 3.5 ms) (1), trinorbornane (4.3 ms) (2), heptacyclo[6.4.0.0^2,7.0^3,6.0^4,11.0^5,10.0^9,12]dodecane (10.5 ms) (3), dodecahedrane (11.7 ms) (4), buckminsterfullerene (85.7 ms) (5), vitamin B12 (cyanocobalamin) (55.5 ms) (6), hydroxymethylglutaryl-coenzyme A (23 ms) (7), cholesterol (7.2 ms) (8), arachidonic acid (9.2 ms) (9), prostaglandin E1 (2.6 ms) (10), strychnine (11.9 ms) (11), quinine (5.1 ms) (12), vancomycin (115 ms) (13), calicheamicin γ1 (31.4 ms) (14), cyclosporin A (9.8 ms) (15), and maitotoxin (140 ms) (16). SMILES of all molecules shown are available in Table S1.


RESULTS AND DISCUSSION

SmilesDrawer Components. The SmilesDrawer JavaScript component achieves the conversion of a SMILES string to a two-dimensional structure drawing by combining two modules: a SMILES parser that converts the SMILES back to its parent spanning tree, and a SMILES drawer that converts this spanning tree to a two-dimensional structure drawing. Both are written in JavaScript, and while the drawer relies on the output of the parser, the parser is usable as a standalone function and can be applied in other projects. The parser accounts for approximately 1/10 of the source code and is not directly customizable, as it was generated by a parser generator.

The parser module generates a parse tree from the input SMILES, in which each atom is encoded by a node object in a linked tree data structure. The topology of the parse tree is identical to the spanning tree used to generate the SMILES string. The parser was generated using PEG.js, a parser generator for JavaScript, by translating the grammar provided by the OpenSMILES specification into an unambiguous parsing expression grammar (PEG).20,21 PEG was chosen over a context-free grammar (CFG) to avoid ambiguities (shift-reduce conflicts).22 Additionally, the generated parsers are an implementation of the packrat parsing algorithm and thus achieve linear time complexity through memoization, at the cost of increased space complexity.23,24 In practice, parsing expression grammars simplify syntax definitions, conform closely to syntax practices (prioritized choices, unlimited lookahead), and allow linear time parsing for any TDPL grammar. The higher average space complexity of packrat parsing, which is directly proportional to input length, is well compensated for by the generally short length of SMILES strings. In addition to generating the parse tree, the parser can identify the location of an erroneous symbol.

The SMILES drawer module converts the parse tree obtained from the SMILES to a 2D-structure drawing.
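Returning briefly to the parser module described above, the following Python sketch illustrates the ideas of a linked parse tree, prioritized choice, per-position memoization, and error location. It is a hand-written toy for a small hypothetical SMILES subset (a few element symbols, single-digit ring closures, and parenthesized branches); it is neither the PEG.js-generated SmilesDrawer parser nor a translation of the OpenSMILES grammar, and all names in it are assumptions. For a grammar this small the memoization cache is rarely hit; it is included only to show the mechanism.

```python
from functools import lru_cache

# Hypothetical subset of element symbols; two-letter symbols come first so the
# prioritized choice tries "Cl" before "C".
ATOMS = ("Cl", "Br", "B", "C", "N", "O", "P", "S", "F", "I")


class SmilesSyntaxError(ValueError):
    def __init__(self, position):
        super().__init__(f"unexpected input at position {position}")
        self.position = position


def parse_smiles(text):
    """Parse a small SMILES subset into a linked tree of atom nodes."""
    furthest = 0  # furthest position at which an atom was attempted

    @lru_cache(maxsize=None)  # memoize each rule result per input position
    def atom(pos):
        for symbol in ATOMS:
            if text.startswith(symbol, pos):
                return symbol, pos + len(symbol)
        return None

    @lru_cache(maxsize=None)
    def chain(pos):
        nonlocal furthest
        furthest = max(furthest, pos)
        matched = atom(pos)
        if matched is None:
            return None
        symbol, pos = matched
        node = {"element": symbol, "rings": [], "branches": [], "next": None}
        while pos < len(text) and text[pos] in "0123456789":  # ring indices
            node["rings"].append(int(text[pos]))
            pos += 1
        while pos < len(text) and text[pos] == "(":  # parenthesized branches
            inner = chain(pos + 1)
            if inner is None or not text.startswith(")", inner[1]):
                return None
            node["branches"].append(inner[0])
            pos = inner[1] + 1
        rest = chain(pos)  # optional continuation of the main chain
        if rest is not None:
            node["next"], pos = rest
        return node, pos

    result = chain(0)
    if result is None or result[1] != len(text):
        raise SmilesSyntaxError(furthest)
    return result[0]


tree = parse_smiles("CC(C)O")
print(tree["next"]["branches"][0]["element"])  # C
try:
    parse_smiles("CCX")
except SmilesSyntaxError as error:
    print(error)  # unexpected input at position 2
```

Each node links to its branches and to the next atom of its chain, so the tree has the same topology as the spanning tree from which the SMILES was written. Because `chain` calls itself once per atom, long inputs lead to deep recursion in this sketch.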
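The drawer module positions acyclic atoms, atoms in fused rings, and atoms in spiro systems based on Euclidean and molecular geometry according to the VSEPR model.25 The placement of bridged ring systems with $n_{\mathrm{rings}} \geq 2$ is treated as a two-dimensional graph embedding problem and solved based on graph theoretic distances as described by Kamada and Kawai.26 The algorithm sets up a virtual dynamic system in which weighted topological distances between all vertices are modeled as springs. In contrast, other spring embedders such as the Eades and Fruchterman−Reingold algorithms, which have been adapted to depict molecular structures, model edges as springs and introduce repulsive electrical forces between nonconnected vertices to keep them apart.27−29 The formula for the energy functional of the dynamic system according to Kamada and Kawai is shown in eq 1, where $k_{i,j} = K/d_{i,j}^2$ is the spring strength between vertices i and j based on the topological distance $d_{i,j}$ and the constant K, and $l_{i,j} = L \times d_{i,j}$ is the relaxed spring length based on the topological distance and the desired edge length L.

$$E = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{1}{2} k_{i,j} \left( (x_i - x_j)^2 + (y_i - y_j)^2 + l_{i,j}^2 - 2 l_{i,j} \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \right) \tag{1}$$

$$\frac{\partial^2 E}{\partial x_m^2}\left(x_m^{(t)}, y_m^{(t)}\right) \delta x + \frac{\partial^2 E}{\partial x_m \partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right) \delta y = -\frac{\partial E}{\partial x_m}\left(x_m^{(t)}, y_m^{(t)}\right) \tag{2}$$

$$\frac{\partial^2 E}{\partial x_m \partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right) \delta x + \frac{\partial^2 E}{\partial y_m^2}\left(x_m^{(t)}, y_m^{(t)}\right) \delta y = -\frac{\partial E}{\partial y_m}\left(x_m^{(t)}, y_m^{(t)}\right) \tag{3}$$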
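As an illustration of the energy functional of eq 1, the following Python sketch evaluates E for a given two-dimensional layout and a matrix of topological distances. It is not the SmilesDrawer implementation (which is written in JavaScript); the function name, the argument layout, and the default values K = 1 and L = 1 are assumptions made for the example. Each term equals half the spring strength times the squared difference between the actual and the relaxed spring length, so a layout in which every pair of vertices sits exactly at its relaxed length has zero energy.

```python
import math


def kamada_kawai_energy(positions, dist, K=1.0, L=1.0):
    """Evaluate eq 1 for 2D positions and a matrix of topological distances."""
    n = len(positions)
    energy = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            k_ij = K / dist[i][j] ** 2      # spring strength
            l_ij = L * dist[i][j]           # relaxed spring length
            energy += 0.5 * k_ij * (dx * dx + dy * dy + l_ij ** 2
                                    - 2.0 * l_ij * math.hypot(dx, dy))
    return energy


# Three vertices connected as a path 0-1-2 (topological distances 1, 1, 2).
path_dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
bent = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(kamada_kawai_energy(straight, path_dist))    # 0.0
print(kamada_kawai_energy(bent, path_dist) > 0.0)  # True
```

In the bent layout only the pair (0, 2) is closer than its relaxed length of 2L, and that single stretched spring accounts for the entire energy.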

The functional E of the system is then minimized by iterative local minimizations, one vertex m at a time, where vertex m is the vertex with the largest value of $\Delta_m = \sqrt{(\partial E/\partial x_m)^2 + (\partial E/\partial y_m)^2}$. Vertex m is then moved by δx and δy until $\Delta_m$ falls below a threshold. δx and δy are computed by applying a two-dimensional Newton−Raphson method to solve eqs 2 and 3. Our implementation of the algorithm by Kamada and Kawai enables the SMILES drawer module to depict highly complex ring systems such as buckminsterfullerene without the need for templates (Figure 1). A drawback of Kamada and Kawai's algorithm for structure drawing arises when depicting macrocycles and bridged ring systems that include macrocycles: rings might be distorted, stereochemistry around double bonds might be ignored, or overlaps might be produced (Figure 1, compounds 6, 13, 15). Discovery of the smallest set of smallest rings is implemented using a robust algorithm based on path-included distance matrices.30

Once atoms have been positioned, overlaps are resolved by rotating rotatable bonds where the resulting positions yield a lower overlap score. If overlaps are still present after the first step of overlap resolution, a second step rotates overlapping terminal vertices away from the overlap-causing position. The overlap score for each vector (atom) $v_i$ is defined as

$$\mathrm{overlap}_i = \sum_j \frac{l - \lVert v_i - v_j \rVert}{l} \quad \text{for } l - \lVert v_i - v_j \rVert > 0$$

where l is the optimal bond length.

Chirality determination was implemented based on the algorithm developed by Teixeira et al. and relies on the parity of the permutation of ligands after ordering according to CIP rules compared to their index of occurrence in the SMILES string as defined in the OpenSMILES specification.31 For the depiction of wedges, we developed an algorithm that chooses the bond to be flipped up or down based on the following simple set of rules (ordered by priority from highest to lowest): (1) Wedges are not to be drawn between two stereocenters. (2) If possible, wedges are not to be drawn inside a ring. (3) Drawing wedges toward heteroatoms takes priority. (4) Wedges are drawn in the direction of the shortest subtree.

The resulting structures are drawn using the Canvas API supported by all modern browsers. Unlike the commonly used SVG (scalable vector graphics) format, Canvas implements immediate mode rendering, thus removing the need for a performance-impacting object model kept in memory. After the HTML5 standard specifying the Canvas API became a stable W3C recommendation in October 2014, the technology has been applied by several web-based cheminformatics and bioinformatics applications, which assert its increased performance and reduced code complexity compared to scalable vector graphics.32−34 The SMILES drawer module implements the complete OpenSMILES specification except for square planar, trigonal bipyramidal, and octahedral chirality. These types of chirality
are, according to the specification, only implemented by very few SMILES systems, and we did not encounter them in any of the organic molecule databases known to us. In addition, the proposed extensions provided by the OpenSMILES specification, including external R-groups, polymers and crystals, atom-based double bond configuration, radical centers, and twisted SMILES, are not supported.

Assessing the Performance of SmilesDrawer. The aesthetic performance of SmilesDrawer was assessed by visual inspection. SmilesDrawer performs extremely well for rendering structures of a wide range of molecules (Figure 1). Excellent drawings are produced for polycyclic hydrocarbons such as cubane (1), trinorbornane (2),35 heptacyclo[6.4.0.0^2,7.0^3,6.0^4,11.0^5,10.0^9,12]dodecane (3), dodecahedrane (4), or buckminsterfullerene (5) without using any template. Depictions of complex biomolecules are also very good, for example the essential coenzyme vitamin B12 (6), the steroid precursor hydroxymethylglutaryl-coenzyme A (7), its biosynthetic end product cholesterol (8), the polyunsaturated omega-6 fatty acid arachidonic acid (9), and the immune modulator prostaglandin E1 (10). Complex natural products are also satisfactorily rendered, such as the alkaloids strychnine (11) and quinine (12), the antibiotic vancomycin (13), the complex cytotoxic natural product calicheamicin γ1 (14), the immunosuppressor cyclic peptide cyclosporin A (15), and the complex polycyclic toxin maitotoxin (16), which possesses 98 chiral centers. In addition to drawing small to medium-sized molecules, SmilesDrawer also excels at drawing large and topologically complex peptides such as peptide dendrimers36,37 with minimal overlap, in contrast to OpenChemLib-JS and ChemAxon Marvin JS (Figure S1).

To evaluate the runtime performance of SmilesDrawer, a test set including Drugbank (n = 7238) and samples from ChEMBL, FDB-17, GDB-17, and SureChEMBL (each n = 7238) was assembled. This pooled set containing all compounds (n_total = 36 190) was analyzed for SMILES length and number of rings, as these two features intuitively have the highest impact on drawing speed. Preliminary benchmarks confirmed this assessment through correlation analysis, as shown in Figure 2c, d. While parsing time primarily correlates with the SMILES length of a molecule (ρPearson = 0.28, ρSpearman = 0.68), rendering time correlates well with both the length of the SMILES (ρPearson = 0.26, ρSpearman = 0.73) and the number of rings (ρPearson = 0.15, ρSpearman = 0.67). SMILES length was chosen as the measurement variable for ensuing performance benchmarks, as it correlates well with both parsing and rendering time. Whereas monotonic relationships were expected between rendering time and both the SMILES length and the number of rings, the nonlinear relationship between SMILES length and parsing time is unexpected given the theoretically linear runtime of the parser. We suspect this behavior to be caused by current JavaScript implementations not supporting tail call optimization and the generated parser heavily relying on recursion. The theoretical time complexities are O(n) and O(n^3) for the parser and drawer, respectively.

Figure 2. Analysis of the pooled test sets. Test sets include Drugbank (n = 7238) and samples from ChEMBL, FDB-17, GDB-17, and SureChEMBL (each n = 7238). The sets were pooled into a super set containing all data (n_total = 36 190). Subplots a and b show the distribution of ring count and length of SMILES, respectively. The range from 0 to 15 rings covers 99.981% (36 082), and the range from 0 to 150 characters covers 98.933% (35 704) of all molecules in the pooled set. SMILES length was chosen as a measurement variable as it correlates best with both parse and render time (c, d). The ρ values yielded by Pearson's and Spearman's methods suggest a strong nonlinear, monotonic relationship between SMILES length and performance.

Benchmarks were conducted using the Drugbank data set, containing 7238 compounds.8 In addition, random subsets of equal size were extracted from the ChEMBL,9 FDB-17,38 GDB-17,6 and SureChEMBL13 databases. The performance was assessed using desktop as well as mobile hardware and software (Intel Core i7-7700 3.60 GHz, 16.0 GB DDR4 RAM, Windows 10.0.16299, Chrome Version 63.0.3239.84 64-bit; Samsung Exynos 8895 0.455−2.314 GHz, 4 GB LPDDR4X, Android 7.0, Linux version 4.4.13-12401979, Chrome Version 63.0.3239.71). SmilesDrawer shows excellent performance for both parsing ($\bar{t}_{\mathrm{parsing}}$ = 0.04 ± 0.085 ms) and drawing ($\bar{t}_{\mathrm{drawing}}$ = 2.445 ± 14.144 ms). The rendering time for molecules containing bridged ring systems, which are complex to depict with our proposed approach (n = 5473 with an average of 4.033 ± 1.453 rings), is still low ($\bar{t}_{\mathrm{drawing,bridged}}$ = 4.079 ± 17.731 ms). Drawing speed is excellent even on mobile hardware, with $\bar{t}_{\mathrm{parsing}}$ = 0.169 ± 0.557 ms, $\bar{t}_{\mathrm{drawing}}$ = 7.481 ± 23.471 ms, and $\bar{t}_{\mathrm{drawing,bridged}}$ = 13.692 ± 42.451 ms. Per set performance is shown in Figure 3.

Comparison of SmilesDrawer on Different Devices and with OpenChemLib-JS. To compare the total depiction time (parsing + rendering time) of SmilesDrawer with the JavaScript port of OpenChemLib, we ran the benchmarks on the latest version of OpenChemLib-JS using its undocumented canvas depicter. The performance of OpenChemLib-JS was assessed using the same desktop test setting as for the SmilesDrawer test case. The results are shown in Figure 3. The total depiction time values show that the performance of SmilesDrawer on a mobile phone is comparable to that of OpenChemLib-JS on a desktop computer (Figure 3a). While the render times of SmilesDrawer and OpenChemLib-JS are close, SmilesDrawer shows generally lower variance on the desktop system (Figure 3b). SmilesDrawer's parser is orders of magnitude faster than that of OpenChemLib-JS on both the desktop and the mobile system (Figure 3c).
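The gap between the Pearson and Spearman coefficients reported in the correlation analysis above is what one expects from a relationship that is monotonic but not linear. The following Python sketch makes this concrete on purely synthetic data (a noise-free cubic curve, chosen only for illustration and unrelated to the benchmark measurements); the helper names are hypothetical and no external libraries are assumed.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Synthetic, noise-free example: time grows with the cube of the length.
lengths = list(range(10, 160, 10))
times = [1e-6 * n ** 3 for n in lengths]
print(round(spearman(lengths, times), 6))  # 1.0
print(pearson(lengths, times) < 0.99)      # True
```

Because Spearman's coefficient depends only on the ordering of the values, it reaches 1 for any strictly increasing relationship, whereas Pearson's coefficient is reduced by the curvature. Real timing data additionally contain noise and outliers, which lower both coefficients.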
To further assess the comparative runtime performance, we analyzed the data using two-dimensional KDE plots, which show a detailed comparison of the drawing (parsing + rendering) performance of SmilesDrawer with that of OpenChemLib-JS for the test sets GDB-17 (Figure 4a), FDB-17 (Figure 4b), SureChEMBL (Figure 4c), ChEMBL (Figure 4d), and Drugbank (Figure 4e). GDB-17 and FDB-17 are constrained to a relatively low maximum atom count of 17, causing the parser to take up a significant part of the drawing time. This is reflected in Figure 4a, b, where bimodal distributions are caused by the significantly slower parser of OpenChemLib-JS. The analysis of these benchmarks shows that our JavaScript module generally performs better throughout the test sets than the transcompiled OpenChemLib-JS and that Kamada and Kawai's algorithm is indeed suited for placing the atoms of bridged ring systems without any negative impact on overall rendering performance (Figure S2). For mobile applications, the overall performance of SmilesDrawer measured during benchmarking is comparable to the latency of mobile networks (5−100 ms, depending on carrier and network generation), facilitating application-scale performance improvements over loading structures as image files from a web server on such networks.15
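The comparison with network latency can be checked with a few lines of arithmetic. The Python sketch below uses only the mean timings and the latency range quoted in the text; it assumes, for illustration, that molecules are depicted one after another and that the mean is representative, which ignores the large standard deviations reported alongside these means. The names are hypothetical.

```python
# Mean per-molecule timings in milliseconds, as quoted in the text.
DESKTOP = {"parsing": 0.04, "drawing": 2.445}
MOBILE = {"parsing": 0.169, "drawing": 7.481}
LATENCY_RANGE_MS = (5.0, 100.0)  # mobile network latency range quoted in the text


def total_ms(timings, molecules=1):
    """Total depiction time (parsing + drawing) for a number of molecules."""
    return molecules * (timings["parsing"] + timings["drawing"])


print(f"{total_ms(MOBILE):.2f} ms per molecule on the mobile device")     # 7.65
print(f"{total_ms(DESKTOP, 1000):.0f} ms for 1000 molecules on desktop")  # 2485
print(f"{total_ms(MOBILE, 1000):.0f} ms for 1000 molecules on mobile")    # 7650
```

Under these assumptions, the mean time to depict one molecule on the mobile device falls inside the quoted latency range of a single network round trip, and depicting one thousand molecules on the desktop system takes about 2.5 s.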

Figure 3. Performance comparison between SmilesDrawer and OpenChemLib-JS. Performance was established for three different test cases: SmilesDrawer on a desktop computer (blue), OpenChemLib-JS on a desktop computer (green), and SmilesDrawer on a mobile phone (red). The total depiction time (parsing + rendering) values show that the performance of SmilesDrawer on a mobile phone is comparable to that of OpenChemLib-JS on a desktop computer (a). While the render times of SmilesDrawer and OpenChemLib-JS are close, SmilesDrawer shows generally lower variance on the desktop system (b). SmilesDrawer's parser is orders of magnitude faster than that of OpenChemLib-JS on both the desktop and the mobile system (c).

Figure 4. Comparison to OpenChemLib-JS. The two-dimensional KDE plots show a detailed comparison of the drawing (parsing + rendering) performance of SmilesDrawer with that of OpenChemLib-JS for the test sets (a) GDB-17, (b) FDB-17, (c) SureChEMBL, (d) ChEMBL, and (e) Drugbank. GDB-17 and FDB-17 are constrained to a relatively low maximum atom count of 17, causing the parser to take up a significant part of the drawing time. This is reflected in subplots a and b, where bimodal distributions are caused by the significantly slower parser of OpenChemLib-JS. In total, 456 (1.262%) and 509 (1.409%) compounds were removed from the SmilesDrawer and OpenChemLib-JS sets, respectively, as they interfered with the readability of these plots.

CONCLUSION

SmilesDrawer is a highly customizable, easy-to-use, and performant JavaScript component consisting of both a SMILES parser and a Canvas API drawing module. It is tailored to be used in modern web applications in need of a method to display molecular structures. SmilesDrawer differentiates itself from previously reported JavaScript components for SMILES drawing in that it does not require any third-party libraries, has a codebase written entirely in JavaScript, does not require the deployment of web services, and applies the algorithm proposed by Kamada and Kawai for positioning atoms in bridged rings while applying simple Euclidean geometry for the placement of other atoms. Although SmilesDrawer was implemented and optimized for limited use on SureChEMBL data sets, its performance carries over well to the Drugbank, ChEMBL, FDB-17, and GDB-17 data sets, and it even depicts complex molecules. SmilesDrawer should be generally useful to display molecules from SMILES in web applications.

ACKNOWLEDGMENTS

This work was supported financially by the University of Berne, the Swiss National Science Foundation, and the NCCR TransCure. We thank ChemAxon for providing access to Marvin JS.



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00425. Additional benchmarking of SmilesDrawer (Figures S1 and S2) and SMILES of all molecules shown in Figures 1 and S1 (Table S1) (PDF)



AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected] (D.P.).
*E-mail: [email protected] (J.-L.R.).

ORCID

Jean-Louis Reymond: 0000-0003-2724-2942

Author Contributions

D.P. designed and developed both modules of SmilesDrawer and wrote the paper. J.-L.R. codesigned, supervised the project, and wrote the paper.

Notes

The authors declare no competing financial interest.

REFERENCES

(1) Ihlenfeldt, W. D.; Bolton, E. E.; Bryant, S. H. The Pubchem Chemical Structure Sketcher. J. Cheminf. 2009, 1, 20.
(2) Bienfait, B.; Ertl, P. Jsme: A Free Molecule Editor in Javascript. J. Cheminf. 2013, 5, 24.
(3) Heller, S. R.; McNaught, A.; Pletnev, I.; Stein, S.; Tchekhovskoi, D. Inchi, the Iupac International Chemical Identifier. J. Cheminf. 2015, 7, 23.
(4) Weininger, D. Smiles, a Chemical Language and Information System. 1. Introduction to Methodology and Encoding Rules. J. Chem. Inf. Model. 1988, 28, 31−36.
(5) Weininger, D.; Weininger, A.; Weininger, J. L. Smiles. 2. Algorithm for Generation of Unique Smiles Notation. J. Chem. Inf. Model. 1989, 29, 97−101.
(6) Ruddigkeit, L.; van Deursen, R.; Blum, L. C.; Reymond, J. L. Enumeration of 166 Billion Organic Small Molecules in the Chemical Universe Database Gdb-17. J. Chem. Inf. Model. 2012, 52, 2864−2875.
(7) Awale, M.; Visini, R.; Probst, D.; Arus-Pous, J.; Reymond, J. L. Chemical Space: Big Data Challenge for Molecular Diversity. Chimia 2017, 71, 661−666.
(8) Wishart, D. S.; Knox, C.; Guo, A. C.; Shrivastava, S.; Hassanali, M.; Stothard, P.; Chang, Z.; Woolsey, J. Drugbank: A Comprehensive Resource for in Silico Drug Discovery and Exploration. Nucleic Acids Res. 2006, 34, D668−D672.
(9) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. Chembl: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100−D1107.
(10) Awale, M.; van Deursen, R.; Reymond, J. L. Mqn-Mapplet: Visualization of Chemical Space with Interactive Maps of Drugbank, Chembl, Pubchem, Gdb-11, and Gdb-13. J. Chem. Inf. Model. 2013, 53, 509−518.
(11) Awale, M.; Reymond, J. L. Similarity Mapplet: Interactive Visualization of the Directory of Useful Decoys and Chembl in High Dimensional Chemical Spaces. J. Chem. Inf. Model. 2015, 55, 1509−1516.
(12) Awale, M.; Reymond, J. L. Web-Based 3d-Visualization of the Drugbank Chemical Space. J. Cheminf. 2016, 8, 25.
(13) Papadatos, G.; Davies, M.; Dedman, N.; Chambers, J.; Gaulton, A.; Siddle, J.; Koks, R.; Irvine, S. A.; Pettersson, J.; Goncharoff, N.; Hersey, A.; Overington, J. P. Surechembl: A Large-Scale, Chemically Annotated Patent Document Database. Nucleic Acids Res. 2016, 44, D1220−D1228.
(14) Awale, M.; Probst, D.; Reymond, J. L. Webmolcs: A Web-Based Interface for Visualizing Molecules in Three-Dimensional Chemical Spaces. J. Chem. Inf. Model. 2017, 57, 643−649.
(15) Nikravesh, A.; Choffnes, D. R.; Katz-Bassett, E.; Mao, Z. M.; Welsh, M. Mobile Network Performance from User Devices: A Longitudinal, Multidimensional Analysis. In Passive and Active Measurement. Pam 2014. Lecture Notes in Computer Science, Vol 8362; Faloutsos, M. K. A., Ed.; Springer: Cham, 2014.
(16) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235−242.
(17) Murray-Rust, P.; Rzepa, H. S.; Wright, M. Development of Chemical Markup Language (Cml) as a System for Handling Complex Chemical Content. New J. Chem. 2001, 25, 618−634.
(18) Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. Datawarrior: An Open-Source Program for Chemistry Aware Data Visualization and Analysis. J. Chem. Inf. Model. 2015, 55, 460−473.
(19) Probst, D.; Reymond, J. L. Fun: A Framework for Interactive Visualizations of Large, High Dimensional Datasets on the Web. Bioinformatics 2017, DOI: 10.1093/bioinformatics/btx760.
(20) www.opensmiles.org (accessed December 12, 2017).
(21) Ford, B. Parsing Expression Grammars. In Proceedings of the 31st Acm Sigplan-Sigact Symposium on Principles of Programming Languages−Popl '04; ACM Press: New York, USA, 2004; pp 111−112.
(22) Parikh, R. J. On Context-Free Languages. J. Assoc. Comput. Mach. 1966, 13, 570−581.
(23) Ford, B. Packrat Parsing. ACM SIGPLAN Not. 2002, 37, 36−47.
(24) Mizushima, K.; Maeda, A.; Yamaguchi, Y. Packrat Parsers Can Handle Practical Grammars in Mostly Constant Space. In Proceedings of the 9th Acm Sigplan-Sigsoft Workshop on Program Analysis for Software Tools and Engineering, Paste; ACM: New York, 2010; pp 29−36.
(25) Hargittai, I.; Chamberland, B. The Vsepr Model of Molecular Geometry. Comput. Math. Appl. 1986, 12, 1021−1038.
(26) Kamada, T.; Kawai, S. An Algorithm for Drawing General Undirected Graphs. Inf. Process. Lett. 1989, 31, 7−15.
(27) Eades, P. A Heuristic for Graph Drawing. Congr. Numer. 1984, 42, 149−160.
(28) Fruchterman, T. M. J.; Reingold, E. M. Graph Drawing by Force-Directed Placement. Softw. Pract. Exp. 1991, 21, 1129−1164.
(29) Fraczek, T. Simulation-Based Algorithm for Two-Dimensional Chemical Structure Diagram Generation of Complex Molecules and Ligand-Protein Interactions. J. Chem. Inf. Model. 2016, 56, 2320−2335.
(30) Lee, C. J.; Kang, Y. M.; Cho, K. H.; No, K. T. A Robust Method for Searching the Smallest Set of Smallest Rings with a Path-Included Distance Matrix. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 17355−17358.
(31) Teixeira, A. L.; Leal, J. P.; Falcao, A. O. Automated Identification and Classification of Stereochemistry: Chirality and Double Bond Stereoisomerism. arXiv.org 2013, No. arXiv:1303.1724.
(32) Miller, C. A.; Anthony, J.; Meyer, M. M.; Marth, G. Scribl: An Html5 Canvas-Based Graphics Library for Visualizing Genomic Data over the Web. Bioinformatics 2013, 29, 381−383.
(33) Taylor, S.; Noble, R. Html5 Pivotviewer: High-Throughput Visualization and Querying of Image Data on the Web. Bioinformatics 2014, 30, 2691−2692.
(34) Vanderkam, D.; Aksoy, B. A.; Hodes, I.; Perrone, J.; Hammerbacher, J. Pileup.Js: A Javascript Library for Interactive and in-Browser Visualization of Genomic Data. Bioinformatics 2016, 32, 2378−2379.
(35) Delarue Bizzini, L.; Muntener, T.; Haussinger, D.; Neuburger, M.; Mayor, M. Synthesis of Trinorbornane. Chem. Commun. 2017, 53, 11399−11402.
(36) Stach, M.; Siriwardena, T. N.; Kohler, T.; van Delden, C.; Darbre, T.; Reymond, J. L. Combining Topology and Sequence Design for the Discovery of Potent Antimicrobial Peptide Dendrimers against Multidrug-Resistant Pseudomonas Aeruginosa. Angew. Chem., Int. Ed. 2014, 53, 12827−12831.
(37) Bergmann, M.; Michaud, G.; Visini, R.; Jin, X.; Gillon, E.; Stocker, A.; Imberty, A.; Darbre, T.; Reymond, J. L. Multivalency Effects on Pseudomonas Aeruginosa Biofilm Inhibition and Dispersal by Glycopeptide Dendrimers Targeting Lectin Leca. Org. Biomol. Chem. 2016, 14, 138−148.
(38) Visini, R.; Awale, M.; Reymond, J.-L. Fragment Database Fdb17. J. Chem. Inf. Model. 2017, 57, 700−709.