
Article

An Optimized Virtual Screening Workflow for the Identification of Novel G-Quadruplex Ligands Teresa Kaserer, Riccardo Rigo, Philipp Schuster, Stefano Alcaro, Claudia Sissi, and Daniela Schuster J. Chem. Inf. Model., Just Accepted Manuscript • DOI: 10.1021/acs.jcim.5b00658 • Publication Date (Web): 03 Feb 2016 Downloaded from http://pubs.acs.org on February 11, 2016



Journal of Chemical Information and Modeling

An Optimized Virtual Screening Workflow for the Identification of Novel G-Quadruplex Ligands

Teresa Kaserer,† Riccardo Rigo,‡ Philipp Schuster,† Stefano Alcaro,§ Claudia Sissi,‡ Daniela Schuster†*

† Computer-Aided Molecular Design Group, Institute of Pharmacy / Pharmaceutical Chemistry and Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Innrain 80-82, 6020 Innsbruck, Austria

‡ Department of Pharmaceutical and Pharmacological Sciences, via Marzolo 5, 35131 Padova, Italy

§ Dipartimento di Scienze della Salute, Università “Magna Graecia” di Catanzaro, Campus “S. Venuta”, Viale Europa, 88100, Catanzaro, Italy


ABSTRACT

G-quadruplexes, alternative DNA secondary structures present in telomeres, have emerged as promising targets for the treatment of cancer, because they prevent telomere elongation and, accordingly, cell proliferation. In this study, theoretically validated pharmacophore- and shape-based models as well as a theoretically validated docking protocol were generated and applied in parallel for virtual screening and the identification of novel G-quadruplex ligands. Top-ranked hits retrieved with each method independently, and additionally in a consensus approach, were selected for biological testing. Of the 32 tested virtual hits, seven selectively stabilized G-quadruplexes over duplex DNA in the fluorescence melting assay. For the five most active compounds, chemically closely related analogues were collected and subjected to in vitro analysis, which identified seven further novel G-quadruplex ligands. These molecules not only represent novel scaffolds; some of them are also more potent G-quadruplex stabilizers than the established reference compound berberine. This study proposes an optimized in silico workflow for the identification of novel G-quadruplex stabilizers, which can also be applied in future studies. In addition, structurally novel and promising lead candidates with strong and selective G-quadruplex stabilizing properties are reported.


INTRODUCTION

Most current pharmacological drug targets are proteins. Nucleic acids, however, are the classical targets of most anticancer drugs in current use. Among these macromolecules, G-quadruplexes represent a structurally and functionally distinct subclass. These non-canonical nucleic acid arrangements, adopted by guanine-rich sequences, are involved in the regulation of cell cycle progression and therefore represent attractive therapeutic targets for anticancer treatment. Moreover, G-quadruplexes have structural features quite distinct from those of the more abundant double helix, which opens the possibility of designing ligands with high selectivity for these nucleic acid regions. In addition to their exploration as anticancer targets, G-quadruplexes in combination with specific probes have been employed in the development of diverse diagnostic assays, detecting for example nicking endonuclease activity,1 DNA, RNA, aptamers, metal ions,2 or even cocaine3 in analytical samples.

To date, several ligands have been reported, which can roughly be divided into two chemical classes: fused planar heteroaromatic compounds and cyclic or acyclic linked heteroaromatic molecules. The majority of published ligands belong to the fused planar heteroaromatic class and, in most cases, contain one or more additional flexible side chains.4 The representative G-quadruplex ligands BRACO-19 (1),5 BMSG-SH-3 (2),6 MM41 (3),7 and berberine (4)8 are depicted in Chart 1. These molecules were co-crystallized with G-quadruplex (PDB entries 3CE5,5 3SC8,6 3UYH,7 and 3R6R8) and were employed for model generation in this study.

Chart 1. Structures of the G-quadruplex ligands 1 (BRACO-19),5 2 (BMSG-SH-3),6 3 (MM41),7 and 4 (berberine).8


Despite the wealth of available data on both the structures of G-quadruplexes and their ligands, only a limited number of studies have so far exploited these data for virtual screening and the identification of novel bioactive molecules.9-12 Multiple computational methods are available for this purpose, for example pharmacophore modeling, shape-based modeling, and docking. A pharmacophore model is defined as “an ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response”.13 A pharmacophore model therefore represents a binding mode of a ligand to its target. Shape-based modeling relies on the assumption that compounds have to be geometrically complementary to the binding pocket to be active. The 3D structure of a known active molecule serves as a template for the generation of a shape model. In the shape-based modeling program Rapid Overlay of Chemical Structures (ROCS),14,15 the shape-based models can be further refined by the addition of chemical information.16 Pharmacophore and shape-based models can then be used to filter large chemical databases for compounds that satisfy the requirements of the model and are therefore potentially active. Docking, on the other hand, explicitly requires structural data of the target. During a docking run, molecules are positioned in the empty binding pocket of the target, and the quality of each docking pose is estimated via calculation of the free energy of binding.17

In recent studies, considerable differences in the performance of these methods were reported.18,19 In one of these studies,19 the combination of multiple methods in a consensus approach improved the enrichment of active molecules in the virtual hit list. To further improve the chances of identifying novel G-quadruplex stabilizing molecules, we therefore applied all methods in parallel and, additionally, in a consensus approach. The activities of selected virtual hits from all approaches were then assessed experimentally in solution assays. Besides the discovery of novel bioactive molecules, this study design also allows the identification of the most successful virtual screening strategy for future projects on G-quadruplex ligands.

Study design

This study was conducted analogously to our previous study.19 An overview is provided in Figure 1. Pharmacophore- and shape-based models were generated, optimized, and theoretically validated using datasets of known G-quadruplex ligands and decoys. In addition, a theoretically validated docking protocol was developed that could discriminate between active molecules and decoys. The pharmacophore- and shape-based models that performed best during the theoretical validation, as well as the docking workflow, were employed for virtual screening of the commercial Specs database (www.specs.net). The virtual hits of every method were independently ranked according to the respective fit value, and the ten top-ranked molecules were selected for further investigation. In addition, five molecules that were predicted as active by all three methods were included. All selected molecules were further analyzed independently and in parallel using external bioactivity profiling tools. The predictions obtained for every molecule with every applied method were summarized in a prediction matrix. After the experimental assessment of the in silico predictions with in-solution biophysical assays, the performances of all applied tools were analyzed and compared. For the most active compounds, derivatives were purchased and investigated as well.

Figure 1. Study design. SAR, structure-activity relationship.

RESULTS

Pharmacophore modeling

In the course of pharmacophore modeling, 109 different pharmacophore models were generated with LigandScout.20 For the prospective screening, however, only seven refined models were applied (Figure 2).


Figure 2. G-quadruplex pharmacophore models. Models pm-3sc8-1 (A), pm-3sc8-2 (B), and pm-3sc8-3 (C) were generated with the PDB entry 3SC8.6 Models pm-3uyh-1 (D) and pm-3uyh-2 (E) were created with the PDB entry 3UYH.7 The berberine-human G-quadruplex complex (PDB entry 3R6R8) served as the basis for the generation of model pm-3r6r-1 (F). Model pm-G-quadruplex-1 (G) was generated with the two known ligands acridone derivative 521 (grey) and bis-triazole derivative 622 (blue). Aro, aromatic feature; PI, positively ionizable feature; H, hydrophobic feature; Xvol, exclusion volume.

These models performed best during the theoretical validation and, together, mapped the majority of the active compounds in the dataset. All models except model pm-G-quadruplex-1 were generated using a structure-based approach. The ligand-based model, pm-G-quadruplex-1, was created with the two known active molecules 521 and 622 (Chart 2).

Chart 2. Compounds 521 and 622 served as training compounds for the generation of the pharmacophore model pm-G-quadruplex-1.

A list of the seven selected models, their quality metrics from the theoretical validation, and the number of hits retrieved during the prospective screening of the Specs database are provided in Table 1. A detailed description of pharmacophore model generation is provided in Section S1 in the supporting information.

Table 1. Summary of G-quadruplex pharmacophore models.

name                 origin             EF     % maxEF   no. of prospective virtual hits from the Specs database
pm-3sc8-1            PDB entry 3SC86    32.6   77        6
pm-3sc8-2            PDB entry 3SC86    31.3   73        2
pm-3sc8-3            PDB entry 3SC86    30.1   71        0
pm-3uyh-1            PDB entry 3UYH7    31.7   74        4
pm-3uyh-2            PDB entry 3UYH7    28.7   67        1
pm-3r6r-1            PDB entry 3R6R8    35.4   83        235
pm-G-quadruplex-1    521 and 622        34.1   80        7

Together, these models covered 303 of the 360 active molecules (84.2%) but retrieved only 128 of the 14,974 decoys (0.9%). In combination, they yielded an enrichment factor (EF) of 30.2, corresponding to 71% of the maximum EF (maxEF), and an area under the curve of the ROC plot (AUC) of 0.92. In total, 252 unique molecules of the Specs database mapped at least one of these models, corresponding to 0.2% of the input database. All virtual hits were ranked according to the relative pharmacophore fit score, and the ten top-ranked and diverse molecules were selected for further investigation. A detailed list of the selected compounds and their fit scores is provided in Table 2. All structures are depicted in Chart 3.
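The AUC quoted above summarizes how well a score separates actives from decoys. As a minimal sketch (not the authors' implementation), the AUC can be computed as the probability that a randomly chosen active scores higher than a randomly chosen decoy, with ties counted as one half. The function name `roc_auc` and the toy fit scores are hypothetical.

```python
def roc_auc(active_scores, decoy_scores):
    """AUC as the fraction of (active, decoy) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U formulation).
    """
    if not active_scores or not decoy_scores:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical fit scores, for illustration only.
toy_actives = [0.95, 0.90, 0.60]
toy_decoys = [0.70, 0.40, 0.30, 0.20]
print(round(roc_auc(toy_actives, toy_decoys), 3))
```

The pairwise loop is quadratic in the dataset size; for datasets as large as the one used here, a rank-based formulation would be the more practical choice.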

Table 2. All predictions generated for every compound with all applied methods are summarized in a prediction matrix.

Cpd-No.          LigandScout (a)   ROCS (b)     GOLD (c)   ∆Tm (°C) (l)

Pharmacophore modeling top-ranked hits
16               0.952 (d)         -            -          n. o.
17               0.944 (d)         -            -          n. o.
18               0.941 (d)         1.093 (g)    -          24.0
19               0.939 (d)         -            -          n. o.
20               0.939 (d)         1.085 (g)    -          n. o.
21               0.939 (e)         -            -          n. o.
22               0.938 (d)         -            -          11.7
23               0.936 (d)         -            -          0.8
24               0.936 (d)         -            -          n. o.
25               0.934 (d)         -            -          n. o.

Shape-based modeling top-ranked hits
26               -                 1.299 (g)    -          n. o.
27               -                 1.263 (h)    81.3       n. o.
28               -                 1.246 (g)    79.7       n. o.
29               -                 1.244 (i)    90.8       n. o.
30               -                 1.244 (g)    81.0       21.1
31               -                 1.220 (g)    -          n. d.
32               -                 1.218 (j)    80.4       n. o.
33               -                 1.218 (g)    -          n. o.
34               -                 1.215 (g)    -          n. o.
35               -                 1.214 (g)    -          n. o.

Docking top-ranked hits
36               -                 -            115.2      n. o.
37               -                 -            112.2      n. d.
38               -                 -            111.9      n. o.
39               -                 -            110.5      n. d.
40               -                 -            109.4      n. o.
41               -                 -            109.2      n. o.
42               -                 -            108.9      n. o.
43               -                 -            108.8      n. o.
44               -                 1.074 (k)    108.7      n. o.
45               -                 -            108.7      n. o.

Consensus hits
46               0.930 (d)         1.064 (h)    88.8       20.8
47               0.915 (d)         1.034 (h)    80.8       n. o.
48               0.937 (d)         1.128 (g)    79.8       4.4
49               0.870 (f)         1.007 (h)    81.9       n. o.
50               0.936 (d)         1.056 (g)    86.3       10.8

Pos. control
1                                                          31.7
4                                                          6.5

(a) only highest relative pharmacophore fit score is listed
(b) only highest normalized ComboScore is listed
(c) only highest GoldScore is listed
(d) compound mapped model pm-3r6r-1
(e) compound mapped model pm-3sc8-1
(f) compound mapped model pm-3sc8-2
(g) compound mapped model shape-3r6r-2
(h) compound mapped model shape-3uyh-2
(i) compound mapped model shape-G-quadruplex-6
(j) compound mapped model shape-G-quadruplex-5
(k) compound mapped model shape-G-quadruplex-2
(l) at a compound concentration of 5 µM
- not predicted as active
n. d. not determined, because of solubility issues
n. o. not observed

Chart 3. Structures of the selected test compounds.


Shape-based modeling

In total, 148 shape-based models were generated with vROCS,14,15 of which the nine that performed best during the theoretical validation were selected for the prospective screening (Figure 3). Whenever possible, co-crystallized ligands from X-ray crystal structures were used as query molecules for model generation, because they represent an experimentally determined conformation of the compounds bound to G-quadruplex. However, to recover the majority of known ligands from the dataset, models based on one low-energy conformation of active compounds (Chart 4) were also created. A detailed description of the shape-based models is provided in Section S1 in the supporting information.

Most models were generated and applied using the Implicit Mills-Dean force field. For the last two models, however, a modified force field was employed, in which the ring (R) features were replaced by aromatic (Aro) features to favor a higher ranking of aromatic compounds. We observed, however, that changing the feature type alone did not result in a markedly better ranking of compounds with aromatic rings compared with compounds with aliphatic rings. Therefore, the feature weight of aromatic rings was additionally increased to 10 within the modified force field. A detailed list of all models, their performance in the theoretical validation, the applied force field, and the number of hits retrieved in the prospective screening of the Specs database is provided in Table 3.


Figure 3. G-quadruplex shape-based models. Models shape-3uyh-2 (A), shape-3r6r-2 (B), and shape-3ce5-1 (C) were generated with the co-crystallized ligands from the PDB entries 3UYH,7 3R6R,8 and 3CE5,5 respectively. The models shape-G-quadruplex-1 (D) and shape-G-quadruplex-2 (E) were generated with one low-energy conformation of the known active compounds quinazoline derivative 723 and triarylpyridine derivative 824, respectively. The prealigned poses of acridine derivative 925 and acridone derivative 1021, and of quindoline derivative 1126, quindoline derivative 1227, and imidazo phenanthrolin derivative 1328, served as queries for the models shape-G-quadruplex-3 (F) and shape-G-quadruplex-4 (G), respectively. For the last two models, shape-G-quadruplex-5 (H) and shape-G-quadruplex-6 (I), a modified aromatic force field was applied for virtual screening. These models were generated with one low-energy conformation of 14 (telomestatin)29 and diarylurea derivative 1530, respectively. C, cation; HBA, hydrogen bond acceptor; HBD, hydrogen bond donor.

Chart 4. Training set compounds for the generation of shape-based models.

Table 3. Summary of G-quadruplex shape-based models.

model                   origin                   force field       EF     % maxEF   no. of prospective virtual hits from the Specs database
shape-3uyh-2            PDB entry 3UYH7          implicit MD (a)   30.2   71        622
shape-3r6r-2            PDB entry 3R6R8          implicit MD       20.8   49        1,664
shape-3ce5-1            PDB entry 3CE55          implicit MD       30.0   70        21
shape-G-quadruplex-1    723                      implicit MD       40.9   96        0
shape-G-quadruplex-2    824                      implicit MD       42.6   100       3
shape-G-quadruplex-3    925 and 1021             implicit MD       34.6   81        3
shape-G-quadruplex-4    1126, 1227, and 1328     implicit MD       31.2   73        38
shape-G-quadruplex-5    1429                     aromatic-10       11.0   26        335
shape-G-quadruplex-6    1530                     aromatic-10       31.5   74        18

(a) MD, Mills Dean

In total, all models combined mapped 287 of the 360 active compounds (79.7%) and 230 of the 14,974 decoys (1.5%) above the individually defined activity cut-offs. The whole model collection therefore yielded an EF of 23.6, representing 56% of the maxEF, and an AUC of 0.89. The prospective screening of the Specs database yielded 2,620 unique virtual hits out of 127,452 compounds, corresponding to 2.1% of the Specs database. The ComboScore of each virtual hit was normalized to the activity cut-off value defined during theoretical validation. These normalized ComboScores were then used to rank the hits, and the ten highest-ranked and diverse molecules were selected for further investigation. A detailed list of these compounds and their normalized ComboScores is provided in Table 2. The structures of the compounds are depicted in Chart 3.
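One way to read the normalization described above is as the ratio of each hit's ComboScore to the activity cut-off of the model that retrieved it. Hits from models with different cut-offs then become comparable, and every retained hit has a value of at least 1. The sketch below assumes this ratio form; the cut-off values, model names, compound identifiers, and scores are hypothetical, and the diversity filtering applied by the authors is not reproduced.

```python
def rank_by_normalized_score(hits, cutoffs, top_n=10):
    """Rank compounds by their best cut-off-normalized score.

    hits    -- iterable of (compound_id, model_name, raw_score)
    cutoffs -- dict mapping model_name to its activity cut-off
    Returns up to top_n tuples (compound_id, model_name, normalized_score),
    keeping only the highest normalized score per compound.
    """
    best = {}
    for compound, model, raw in hits:
        normalized = raw / cutoffs[model]  # assumed normalization: ratio to cut-off
        if normalized < 1.0:
            continue  # below the model's activity cut-off, not a virtual hit
        if compound not in best or normalized > best[compound][1]:
            best[compound] = (model, normalized)
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(cpd, model, score) for cpd, (model, score) in ranked[:top_n]]


# Hypothetical cut-offs and raw ComboScores, for illustration only.
toy_cutoffs = {"model-A": 1.0, "model-B": 0.8}
toy_hits = [
    ("cpd-1", "model-A", 1.10),
    ("cpd-1", "model-B", 1.00),  # 1.25 after normalization, the better score
    ("cpd-2", "model-A", 1.20),
    ("cpd-3", "model-B", 0.70),  # below cut-off, discarded
]
for row in rank_by_normalized_score(toy_hits, toy_cutoffs, top_n=2):
    print(row)
```

Keeping only the best score per compound mirrors the convention of Table 2, where only the highest normalized ComboScore of each compound is listed.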


Docking

We generated multiple docking protocols with GOLD31,32 using the four crystal structures 3SC86, 3UYH7, 3CE55, and 3R6R8, which were also employed for the generation of the pharmacophore- and shape-based models. For the prospective part, however, only the protocol that performed best in the theoretical validation was applied, to limit the time required for the investigation of about 127,000 molecules. This docking workflow employed the crystal structure of 1 (PDB entry 3CE55). The “actives” and “decoys” datasets were also too large for the theoretical validation of docking; we therefore used a reduced dataset containing 16 known active and 50 decoy molecules. For the generation of this dataset, the “Find diverse molecules” protocol implemented in Discovery Studio33 was employed. The resulting files were intended to contain 35 diverse active and 300 diverse inactive molecules, respectively. However, inspection of the dataset revealed that many similar compounds with the same core structure were still included. The automatically generated dataset was therefore manually refined, and finally 16 diverse active and 50 diverse inactive molecules were kept. The selected docking protocol scored 13 of the 16 active molecules (81.3%) and 9 of the 50 decoys (18.0%) above the activity cut-off of GoldScore ≥ 79.5. Our docking protocol therefore yielded an EF of 2.4, which represents 59% of the maxEF, and an AUC of 0.84. After application of the activity cut-off, 20,873 unique compounds from the Specs database were predicted to be active in the prospective screening. These compounds were ranked according to their GoldScore, and the ten top-ranked diverse molecules were selected for further investigation. A detailed list of all selected compounds and their GoldScore is provided in Table 2 and Chart 3.
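The enrichment metrics quoted for the shape-based model collection and for the docking protocol follow from the hit counts given in the text, if EF is taken as the fraction of actives in the hit list divided by the fraction of actives in the whole dataset, and maxEF as the EF of a hit list containing only actives. This common definition is assumed here rather than taken from the authors' methods, and the function name is illustrative.

```python
def enrichment(true_pos, false_pos, n_actives, n_decoys):
    """Return (EF, maxEF, percent of maxEF) for a virtual hit list."""
    hit_rate = true_pos / (true_pos + false_pos)    # actives among hits
    base_rate = n_actives / (n_actives + n_decoys)  # actives in dataset
    ef = hit_rate / base_rate
    max_ef = 1.0 / base_rate                        # hit list of actives only
    return ef, max_ef, 100.0 * ef / max_ef


# Counts taken from the text: shape-based models (full dataset) and docking
# (reduced dataset of 16 actives and 50 decoys).
shape = enrichment(287, 230, 360, 14974)
docking = enrichment(13, 9, 16, 50)
print("shape:   EF %.1f, %.0f%% of maxEF" % (shape[0], shape[2]))
print("docking: EF %.1f, %.0f%% of maxEF" % (docking[0], docking[2]))
```

Because the reduced docking dataset contains a much larger share of actives (16 of 66), its maxEF is only about 4.1, so the docking EF of 2.4 is not directly comparable with the EF values obtained on the full dataset; the percentage of maxEF is the more meaningful comparison.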

Consensus hits

The hit lists derived from all three methods were further analyzed to identify consensus hits, i.e. compounds that were predicted as active by all three methods. Only nine unique molecules were identified as consensus hits. Five of these nine molecules were very similar, so only one member of this group was selected for further investigation. This molecule had the highest relative pharmacophore fit score and the highest ComboScore among all nine compounds. In the end, five molecules were selected for biological investigation. These molecules and their relative pharmacophore fit score, normalized ComboScore, and GoldScore are provided in Table 2 and Chart 3.
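The consensus step described above amounts to intersecting the hit lists of the individual methods. A minimal sketch, with hypothetical compound identifiers and an illustrative function name, could look as follows; it does not include the similarity-based pruning that reduced the nine consensus hits to five.

```python
def consensus_hits(*hit_lists):
    """Return compounds present in every hit list, sorted for stable output."""
    if not hit_lists:
        return []
    common = set(hit_lists[0])
    for hits in hit_lists[1:]:
        common &= set(hits)
    return sorted(common)


# Hypothetical compound identifiers, for illustration only.
pharmacophore_hits = {"cpd-a", "cpd-b", "cpd-c"}
shape_hits = {"cpd-b", "cpd-c", "cpd-d"}
docking_hits = {"cpd-c", "cpd-b", "cpd-e"}
print(consensus_hits(pharmacophore_hits, shape_hits, docking_hits))
```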

External profiling tools

All selected compounds were merged into the overall hit list and further investigated with the external bioactivity profiling tools SEA,34 PASS,35 and PharmMapper.36 However, none of the molecules under investigation was predicted as a G-quadruplex ligand by SEA, PASS, or PharmMapper. In the case of PASS, a list of included activities was available, which showed that G-quadruplex is not included in the prediction spectrum of the program. For SEA and PharmMapper, this information was not available. To gain further insight, compounds 1, 2, 3, and 4 were also investigated with these tools. SEA predicted DNA as a target; this could include G-quadruplexes but also other DNA structures and does not explicitly refer to G-quadruplexes. Likewise, none of these known active compounds was predicted as a G-quadruplex ligand by PharmMapper. We therefore did not consider any of the bioactivity profiling tools in our further analysis.

Generation of a prediction matrix


All 35 compounds selected for further investigation were merged into an overall hit list, and it was checked whether these compounds were also predicted above the activity cut-off by any of the other applied methods, i.e. pharmacophore modeling, shape-based modeling, and docking. All predictions generated with all methods for every compound were then summarized in a prediction matrix (Table 2). The structures of the selected compounds are depicted in Chart 3.
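A prediction matrix of this kind can be represented as a nested mapping from compound to method to score. The sketch below uses the scores of compounds 18, 36, and 46 as listed in Table 2; the data structure, the function name, and the use of `None` for "not predicted as active" are illustrative choices rather than the authors' implementation.

```python
METHODS = ("LigandScout", "ROCS", "GOLD")


def build_prediction_matrix(compounds, predictions):
    """Build {compound: {method: score or None}}.

    predictions -- dict mapping method name to {compound: best score above
                   that method's activity cut-off}
    None marks "not predicted as active" (shown as "-" in Table 2).
    """
    return {
        cpd: {m: predictions.get(m, {}).get(cpd) for m in METHODS}
        for cpd in compounds
    }


# Scores for compounds 18, 36, and 46 as listed in Table 2.
predictions = {
    "LigandScout": {18: 0.941, 46: 0.930},
    "ROCS": {18: 1.093, 46: 1.064},
    "GOLD": {36: 115.2, 46: 88.8},
}
matrix = build_prediction_matrix([18, 36, 46], predictions)
for cpd, row in matrix.items():
    n_methods = sum(score is not None for score in row.values())
    print(cpd, row, "predicted by", n_methods, "method(s)")
```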

Fluorescence melting studies

Since the G-quadruplex ligand dataset was built from derivatives that were able to increase the thermal stability of the human telomeric G-quadruplex, we experimentally screened all selected hits in a fluorescence melting assay. As the first target DNA we used the human telomeric sequence HTS in potassium-containing solution, where it folds predominantly into hybrid-like G-quadruplex structures. Of the 35 hits, three (compounds 31, 37, and 39) were not soluble in the assay buffer and were discarded. The shift of the melting temperature induced by 5 µM of each of the other compounds is reported in Table 2. Data for two reference compounds (1 and 4) are included as well. This screen identified seven active compounds that can be clustered into two main classes according to their efficiency: compounds 18, 30, and 46 as the most active ones, and compounds 22, 23, 48, and 50 with low activity (Table 2). Although they are less efficient than 1, all of them stabilized the telomeric G-quadruplex to a larger extent than 4, the only exceptions being 23 and 48. To preliminarily assess whether G-quadruplex topology alters the ligand ranking order, the same analysis was repeated in Na+-containing buffer, where HTS preferentially assumes an antiparallel G-quadruplex structure, and with HT24, which in potassium preferentially folds into a hybrid 1 form. Although the hybrid forms were more stabilized by ligand binding, the ranking order of efficiency was preserved on all tested DNA substrates. Interestingly, when the assay was performed on double-stranded DNA, remarkably lower Tm shifts were detected, suggesting a preferential interaction with the telomeric G-quadruplex (Figure 4).
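The ∆Tm values discussed above are differences between the melting temperature of the DNA with and without ligand. As a sketch of how such a value can be derived from a melting curve, the code below normalizes the fluorescence between its minimum and maximum and takes Tm as the temperature at which the normalized signal crosses 0.5, using linear interpolation. The midpoint criterion, the function names, and the synthetic curves are assumptions for illustration, not the authors' analysis procedure.

```python
def melting_temperature(temps, fluorescence):
    """Estimate Tm as the temperature where normalized signal crosses 0.5.

    Assumes fluorescence increases with temperature as the labeled
    G-quadruplex unfolds, and that temps are sorted in ascending order.
    """
    lo, hi = min(fluorescence), max(fluorescence)
    if hi == lo:
        raise ValueError("flat curve, no transition")
    norm = [(f - lo) / (hi - lo) for f in fluorescence]
    for i in range(1, len(norm)):
        if norm[i - 1] < 0.5 <= norm[i]:
            frac = (0.5 - norm[i - 1]) / (norm[i] - norm[i - 1])
            return temps[i - 1] + frac * (temps[i] - temps[i - 1])
    raise ValueError("no midpoint crossing found")


def delta_tm(temps, f_dna_alone, f_with_ligand):
    """Shift in melting temperature induced by the ligand."""
    return (melting_temperature(temps, f_with_ligand)
            - melting_temperature(temps, f_dna_alone))


# Synthetic, piecewise-linear curves for illustration only.
temps = [30, 40, 50, 60, 70, 80, 90]
dna_alone = [1.0, 1.0, 2.0, 4.0, 5.0, 5.0, 5.0]    # midpoint at 55 °C
with_ligand = [1.0, 1.0, 1.0, 2.0, 4.0, 5.0, 5.0]  # midpoint at 65 °C
print(delta_tm(temps, dna_alone, with_ligand))
```

A positive value indicates that the ligand stabilizes the folded structure; real melting data would usually be smoothed or fitted before the midpoint is read off.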

Figure 4. Thermal stabilization of different DNA templates promoted by 10 µM ligand concentration. [Bar chart: ∆Tm (°C) for compounds 18, 22, 23, 30, 46, 48, and 50 on HTS in K+, HT24 in K+, HTS in Na+, and duplex DNA.]

To further investigate the selectivity of the active compounds for G-quadruplex vs. double-stranded DNA, we performed a fluorescence melting competition assay. The data showed that, for all tested ligands, the thermal stability of the G-quadruplex was not affected by the addition of double-stranded DNA up to a 40-fold excess of base pairs vs. G-quadruplex (Figure 5).

Figure 5. Melting profile of HTS in the presence of 5 µM of compound 46, recorded in LiP buffer, 50 mM KCl, pH 7.5, with increasing concentrations of double-stranded DNA. Duplex and G4 refer to base pairs and G-quadruplex concentration, respectively. [Plot: F/Fo versus T (°C) for duplex:G4 ratios of 0:1, 2:1, 4:1, 10:1, and 40:1.]

Selection of derivatives

In a next step, derivatives structurally related to the five most active compounds 18, 22, 30, 46, and 50 were collected. For compound 18, no structurally similar molecules were commercially available. For compound 22, five derivatives (compounds 51 to 55) were selected. The first two molecules, 51 and 52, matched the G-quadruplex pharmacophore model pm-3r6r-1 but were not predicted by any other method. The other three compounds did not contain a charged nitrogen atom and only matched the pharmacophore model when the nitrogen was protonated. Compound 54 was additionally predicted to be active by docking.


One derivative of compound 30, compound 56, was selected. This molecule matched only model pm-3r6r-1 and was not predicted by any other method or model. For compound 46, two similar compounds were chosen for further biological testing. One of them, compound 57, was predicted by the pharmacophore model pm-3r6r-1, and the other, compound 58, by docking. Two further molecules, compounds 59 and 60, were selected as derivatives of compound 50. Both were predicted as active by docking but did not match any pharmacophore- or shape-based model. In total, ten molecules structurally related to the most active compounds from the initial virtual screening were selected for further experimental investigation. Their molecular structures are shown in Chart 5.

Chart 5. The structures of the derivatives structurally related to the active hits.

According to the previously reported fluorescence melting assay, seven of these ten molecules stabilized HTS. The duplex structures were stabilized to a lesser extent (Figure 6).


Analyzing this data set group by group shows that 30 and 56 do not differ substantially in their behavior towards the tested G-quadruplex. The main difference seems to be the better solubility profile of 56. The same holds for 22, 51, and 52, confirming that the position of the fluorine substituent on the aromatic ring does not influence the compounds' activity. In this group, the related compounds 53, 54, and 55 lack any activity, likely because they are not charged under the applied experimental conditions. A modest modulation was observed among the 46-related compounds, with 57 performing only slightly better than the rest. Conversely, the substitution of one benzyl ring in 50 with two or one methyl group(s) leads to a progressive reduction of the G-quadruplex stabilization.


Figure 6. Variation of the melting temperature (∆Tm) of G-quadruplex (solid line) induced by increasing concentrations of selected parent compounds 22 (A), 30 (B), 46 (C), and 50 (D), and their derivatives 51 – 60. Compounds that increased the melting temperature of the G-quadruplex were also investigated for their stabilizing properties concerning duplex DNA (dashed line).


Circular dichroism

To gain further insight into the telomeric G-quadruplex recognition mode of our ligands, we selected the most active compound of each group (18, 50, 51, 56, and 57) and used it to titrate the target DNA. Circular dichroism (CD) analyses confirmed that all compounds bind the telomeric sequence and cause a comparable change in its dichroic features, corresponding mainly to an increment in the 290 nm dichroic band (Figure 7). Only in the case of 50 are these changes remarkably less intense. Nevertheless, in all instances, saturation curves were similar and suggest a complex stoichiometry of two ligands for each G-quadruplex.

These data suggest that the binding mode of the selected compounds with telomeric G-quadruplex is quite similar. Interestingly, all active compounds mapped at least one model derived from berberine. Consistently, the modification of the CD features of telomeric G-quadruplex caused by the binding of the natural compound is comparable.37 Therefore, based on the increment of the DNA thermal stability and the binding stoichiometry, an external stacking mode can be envisaged.


Figure 7. (A) Variation of the CD spectrum of telomeric G-4 upon addition of selected compounds, determined in 10 mM Tris, 1 mM EDTA, 50 mM KCl, pH 7.5, 25 °C. The spectrum of 4 was taken from ref 37. (B) Relative variation of the 290 nm dichroic signal as a function of the [ligand]/[G-quadruplex] ratio.

Surface plasmon resonance

To fully characterize the binding efficiency of our hits on telomeric G-quadruplex, we performed surface plasmon resonance (SPR) analysis. This was performed with derivatives 56 and 50, which appear to be the most and the least efficient ones, respectively, in stabilizing the G-quadruplex conformation. Compound 4 was included as a reference. The target DNA was the biotinylated human telomeric sequence Tel22, which was folded and immobilized on the chip surface through interaction with streptavidin moieties. Examples of the titrations are reported in Figure 8. From the sensorgrams, we evaluated the signals at the steady state, which were analyzed as a function of ligand concentration. The data were well fitted by a one-binding-site model, which yielded the dissociation constants reported in Table 4.

Table 4. Dissociation constants determined at the steady state by SPR titrations of Tel22 with selected binders.

| Compound | Kd (µM) |
|---|---|
| 56 | 0.36 ± 0.10 |
| 50 | 5.81 ± 0.78 |
| 4 | 2.32 ± 0.29 |

Data for the natural compound 4 agreed well with previously reported results (Bhadra, BBA, 2011, 1810, 485-496).38 The novel compounds showed a more pronounced non-specific binding component on the chip surface. Nevertheless, the major binding event clearly showed that 50 has an affinity for the nucleic acid target comparable to that of 4. Remarkably, 56 is almost one order of magnitude more efficient. This result confirms that the VS program succeeded in identifying a novel group of binders for telomeric G-quadruplex.

Figure 8. Sensorgrams corresponding to the titration of Tel22 with compound 50 in 10 mM Tris, 50 mM KCl, pH 7.4 (Panel A). Plot of the response units (RU) recorded at the steady state as a function of ligand concentration in the flow solution (Panel B). Lines represent the best fit using the appropriate binding model described in the text.

Analysis of in silico predictions

Based on the results of the initial prospective screening, the early enrichment (EE) and overall enrichment (OE) rates, the accuracy (Acc), the true positive (TP) rate, the percentage of the maximum (max) TP rate, and the true negative (TN), maxTN, false positive (FP), maxFP, false negative (FN), and maxFN rates were calculated for all applied methods (Table 5). For a detailed description of these quality metrics, please refer to the Experimental Section. The activities of the derivatives were not taken into account. We also calculated all quality metrics for the consensus approach to investigate how many active compounds were missed and how the FP rate was improved by applying multiple methods. For the consensus approach, the OE rate yielded the same value as the EE rate, because no further compounds of the overall hit list were predicted as active. The profiling tools were not included in the analysis, as no prediction could be generated with them. A detailed graphical summary of the performances of all applied methods and approaches is depicted in Figure 9. A graphical representation of the maxTP, maxTN, maxFP, and maxFN rates of all methods is provided in Figure S1 of the supporting information.

Table 5. Analysis of the performances of all applied methods in the prospective screening (all values in %).

| Method | EE | OE | Acc | TP | maxTP | TN | maxTN | FP | maxFP | FN | maxFN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LigandScout | 30.0 | 40.0 | 68.8 | 18.8 | 85.7 | 50.0 | 64.0 | 28.1 | 36.0 | 3.1 | 14.3 |
| ROCS | 11.1 | 29.4 | 56.3 | 15.6 | 71.4 | 40.6 | 52.0 | 37.5 | 48.0 | 6.3 | 28.6 |
| GOLD | 0.0 | 22.2 | 46.9 | 12.5 | 57.1 | 34.4 | 44.0 | 43.8 | 56.0 | 9.4 | 42.9 |
| Consensus | 60.0 | 60.0 | 81.3 | 9.4 | 42.9 | 71.9 | 92.0 | 6.3 | 8.0 | 12.5 | 57.1 |
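The entries of Table 5 can be reproduced from simple confusion-matrix counts. The following Python sketch is illustrative only: the formulas are inferred from the tabulated percentages (the definitions proper are given in the Experimental Section), and the LigandScout counts used below (6 TP, 16 TN, 9 FP, and 1 FN among 32 tested compounds, with 3 actives among its 10 top-ranked picks) are back-calculated assumptions rather than values quoted from the text. The function name `screening_metrics` is hypothetical.

```python
def screening_metrics(tp, tn, fp, fn, top_actives, top_selected):
    """Return Table 5-style quality metrics in percent.

    tp, tn, fp, fn: confusion-matrix counts over all tested compounds.
    top_actives, top_selected: actives among the method's own top-ranked picks.
    The formulas are inferred from the tabulated values (an assumption).
    """
    total = tp + tn + fp + fn   # all experimentally tested compounds
    actives = tp + fn           # all confirmed actives
    inactives = tn + fp         # all confirmed inactives

    def pct(part, whole):
        return 100.0 * part / whole

    return {
        "EE": pct(top_actives, top_selected),  # early enrichment
        "OE": pct(tp, tp + fp),                # overall enrichment
        "Acc": pct(tp + tn, total),            # accuracy
        "TP": pct(tp, total), "maxTP": pct(tp, actives),
        "TN": pct(tn, total), "maxTN": pct(tn, inactives),
        "FP": pct(fp, total), "maxFP": pct(fp, inactives),
        "FN": pct(fn, total), "maxFN": pct(fn, actives),
    }


# Back-calculated counts for LigandScout (assumed, see the note above).
ligandscout = screening_metrics(tp=6, tn=16, fp=9, fn=1,
                                top_actives=3, top_selected=10)
for name, value in ligandscout.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions, the "max" columns simply normalize each count by the number of actives (for TP and FN) or inactives (for TN and FP) instead of by all tested compounds.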


Figure 9. Detailed graphical representation of the performances of all methods. (A) Shows the EE, OE, and Acc values retrieved by all applied methods and the consensus approach. (B) Displays the composition of the hit lists obtained with every method and the consensus approach with respect to TP, TN, FP, and FN rates.

DISCUSSION

Evaluation of the applied virtual screening tools

So far, virtual screening methods have rarely been employed for the prediction and subsequent biological investigation of novel lead candidates as G-quadruplex ligands. In 2008, Ma et al.9 reported the discovery of a novel active compound by docking, and Chen and colleagues employed a pharmacophore model for the identification of TSIZ01.10 In 2011, Alcaro et al.11 applied a combination of ligand- and structure-based virtual screening methods to filter novel active compounds from the ZINC database. Very recently, Castillo-González et al.12 applied a sequential in silico workflow comprising both ligand-based virtual screening and docking for the identification of novel G-quadruplex binders. The results of these virtual screening studies are summarized in Table 6.

Table 6. Comparison of G-quadruplex virtual screening studies

| Study | Applied method | No. of compounds in screening database | No. of compounds selected for testing | No. of active compounds | Most active compound |
|---|---|---|---|---|---|
| Ma et al.6 | Docking | > 100,000 drug-like compounds | 1 | 1 | ∆Tm of 17.9 °C @ 1 µM |
| Chen et al.7 | Pharmacophore-based screening | > 5,000 natural products and derivatives | 4 | 1 | ∆Tm of 23.5 °C @ 1 µM, KA = 2.19 µM |
| Alcaro et al.8 | Combination | ~ 2.7 million | 28 (+ 12 derivatives of active compound) | 1 (+ 4 derivatives) | ∆Tm ~ 14 °C @ 10 µM |
| Castillo-González et al.9 | Combination | > 600,000 | 17 | 4 | ∆Tm of 7.3 °C @ 5 µM |
| our study | all together | ~ 127,000 | 32 (+ 10 derivatives of active compounds) | 7 (+ 7 derivatives) | ∆Tm of 24.8 °C @ 5 µM, Kd = 0.36 µM |
| | Pharmacophore-based screening (a) | ~ 127,000 | 10 | 3 | ∆Tm of 24.0 °C @ 5 µM |
| | Shape-based screening (a) | ~ 127,000 | 9 | 1 | ∆Tm of 21.1 °C @ 5 µM |
| | Docking (a) | ~ 127,000 | 8 | 0 | - |
| | Consensus (a) | ~ 127,000 | 5 | 3 | ∆Tm of 20.8 °C @ 5 µM |

(a) Only initial virtual screening hits (but not derivatives) were considered.

In the study presented here, multiple virtual screening tools were applied in parallel and in a consensus approach for the identification of novel G-quadruplex ligands. The selection of test compounds from every approach and their subsequent experimental assessment allowed for a direct, prospective comparison of the performances of all methods and the consensus approach.

Pharmacophore modeling retrieved high EE and OE rates and also a very high Acc. It correctly identified six out of the seven active compounds, while not mapping too many inactive molecules. One reason for this good performance can be found in the feature definition implemented in LigandScout. It is well accepted that aromatic interactions are crucial for ligand activity. In addition, positively charged groups can support binding via interaction with the negatively charged DNA backbone. LigandScout includes both Aro and PI features in its default settings, which allows for a good representation of the binding mode.

Shape-based modeling also performed very well, as it correctly identified five out of the seven active molecules. Compared to pharmacophore modeling, it missed one more of the active compounds and retrieved a few more inactive compounds. The performance of ROCS, however, might depend strongly on the database selected for virtual screening. ROCS does not provide a color feature specific to aromatic interactions in its default settings. This might be of subordinate relevance for databases containing many aromatic compounds, and Specs apparently is one of them. In addition, the prior filtering for compounds with at least two aromatic rings might also have supported the success of ROCS. The discrimination between aromatic and aliphatic rings can, however, be crucial for databases that include many aliphatic rings. We also applied our ROCS models for the screening of natural product databases (data not shown), which contained many steroids. These compounds were ranked very high, although they clearly did not possess the structural requirements for G-quadruplex binding. However, ROCS seems to be very suitable for representing the co-planar, or almost co-planar, geometry that most of the G-quadruplex ligands reported so far have in common. In combination with the suitably designed screening library, ROCS proved to be a powerful tool for the identification of novel G-quadruplex ligands.

Although docking did not predict active compounds among the top-ten ranked molecules, it performed quite well. It has been reported before that docking suffers from problems with scoring and with the correct ranking of active molecules.17, 39 Furthermore, we18 and others40, 41 have observed that the performance of docking, and of GOLD in particular, depends on the size of the binding site. The G-quadruplex binding “pocket” is rather a surface than a pocket, and with up to 390 Å² this surface comprises a large area.4, 42 In addition, most docking programs may have difficulties in accurately calculating the interaction patterns of G-quadruplex ligands.4 We also had to include a hydrophobic constraint in our docking workflow to enable the re-docking of the ligand with an RMSD lower than 2.0 Å. Considering these numerous drawbacks, docking performed very well. It correctly predicted four out of the seven active compounds as active and could successfully identify 13 inactive compounds as such. However, other methods such as pharmacophore modeling might be better suited to identify novel active compounds, at least when applied alone.

The consensus approach outperformed every single method in terms of enrichment of active compounds in the hit list (EE rate of 60% compared to 30%, 11.1%, and 0% for pharmacophore modeling, shape-based modeling, and docking, respectively). Although this finding is generally anticipated as a universal rule, it might not apply to all projects. In our recent study investigating cytochrome P450 metabolic enzymes,18 we observed that it was difficult to describe the properties of active compounds with multiple methods, and the application of a consensus approach did not lead to improved virtual screening performances. The remarkable EE rate retrieved in this study was accompanied by a decreased number of TP and an increased number of FN hits. In previous studies, we also observed that consensus scoring leads to the exclusion of active compounds that were identified by only a few methods or even only one.18, 19 Our results strongly suggest the application of a consensus approach for lead-identification projects. However, when a project requires the successful identification of as many active compounds as possible, e.g. when predicting the risk of adverse effects, a different strategy should be employed. Nevertheless, the consensus approach applied in this study clearly appeared to be most suitable for the identification of novel G-quadruplex ligands whilst keeping the number of FP hits low.

The evaluation of the external profiling tools SEA and PharmMapper was largely hampered by the fact that no information about the included targets was available. Since even well-known active molecules were not predicted as G-quadruplex ligands by these programs, we did not consider them in our analysis. However, a list of the targets and activities covered by a program, provided alongside the profiling tool, would be highly desirable for future studies. In the case of PASS, this information was provided and revealed that G-quadruplexes were not included as targets. Consequently, we could not predict the activity of our compounds with respect to G-quadruplex binding by using PASS.

Novel compounds

In the initial virtual screening, seven novel G-quadruplex ligands were identified. To investigate the structural novelty of these compounds, we compared them to the known active compounds in the dataset we employed for model refinement and theoretical validation. For this purpose, we applied the “Compare Libraries” tool implemented in Discovery Studio 4.0 (ref 43) and selected ECFP4 for the calculation of the global fingerprints. The similarity of the libraries was determined with the Tanimoto coefficient (Tc).44, 45 The Tc ranges from zero to one, with similar compounds retrieving high and distinct molecules low values. Taken together, both databases, i.e. the validation dataset and the dataset of newly identified compounds, display a global similarity of 0.048. In detail, the compounds 18, 22, 23, 30, 46, 48, and 50 retrieved similarity values of 0.016, 0.017, 0.015, 0.015, 0.017, 0.020, and 0.018, respectively, compared to the global fingerprints of the “actives” dataset.

In addition, we employed the similarity search option in SciFinder to investigate whether similar compounds were already reported in the literature. For none of the novel compounds was G-quadruplex binding data available. One compound with a similarity of ≥ 80% to compound 23 was reported as a G-quadruplex ligand,46 suggesting that this scaffold has already been related to G-quadruplex binding before. For compounds 18 (ref 47), 30 (refs 48, 49), 46 (ref 50), and 48 (ref 47), molecules with ≥ 80% similarity have been investigated for their DNA-binding properties. However, all of these studies focused on duplex DNA and none of them linked these compounds to G-quadruplex binding. The newly identified compounds, except compound 23, represent promising lead candidates for further optimization as they constitute novel scaffolds for G-quadruplex ligands. Moreover, they have more promising binding profiles.

In addition, ten derivatives of the novel active compounds 22, 30, 46, and 50 were investigated. Seven of them were confirmed as active in the experimental testing. The three inactive derivatives were related to compound 22: compounds 53-55 differed from their active analogues in the cationic nitrogen. At physiological pH, these compounds might not be charged, which apparently leads to a diminished binding efficiency. The site of the fluorine substitution did not influence the activity, as compounds 22, 51, and 52 displayed a similar effect. The two selected derivatives of compound 50 were less active than the parent compound. Both of these molecules have a smaller aromatic core. Compound 59 showed a reduced activity compared to compound 60. Intriguingly, these compounds differ only by one methyl group. For the remaining derivatives, activities similar to those of their parent compounds were observed.

In their excellent review, Ohnmacht and Neidle4 determined the structural features that most of the polycyclic heteroaromatic G-quadruplex ligands have in common: they all contain large planar surfaces, a positive charge, and at least one side chain, and the number of side chains can lead to selective binding of specific G-quadruplex structures.4 The novel G-quadruplex ligands reported in this study differ somewhat from these findings, as none of them contains any side chains. Some known ligands lack these side chains as well, for example the natural product 4. Many of the novel compounds were identified with models based on the crystal structure of 4 bound to a G-quadruplex structure (PDB entry 3R6R, ref 8). Intriguingly, many of the newly identified compounds not only differ in their structural features, but are even more effective than the established reference ligand 4. This is all the more remarkable as none of these compounds underwent further chemical optimization steps.
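The Tanimoto coefficient used in the novelty analysis above can be illustrated with a few lines of Python. This is a minimal sketch that treats a fingerprint as the set of its "on" bits; it does not reproduce the ECFP4 calculation or the "Compare Libraries" global-fingerprint procedure of Discovery Studio, and the bit indices below are hypothetical toy values.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Toy fingerprints (hypothetical bit indices, not real ECFP4 output).
known_ligand = {3, 17, 42, 101, 256}
new_hit = {17, 42, 300}
print(f"Tc = {tanimoto(known_ligand, new_hit):.3f}")
```

A value close to zero, as reported for the seven hits, means that very few features are shared between the compared fingerprints.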

CONCLUSION

In this study, multiple virtual screening approaches were employed for the identification of novel G-quadruplex ligands. Our results strongly recommend the application of a consensus approach for this target, because it yielded the highest EE of active compounds in the hit list. However, the application of pharmacophore modeling and shape-based modeling for virtual screening was also very successful (which, for shape-based modeling, may partly be due to the unintended selection of a very suitable prospective screening database). Docking may be less appropriate as an exclusively applied virtual screening approach, although the application of the consensus approach as well as the prediction of the derivatives 59 and 60 proved the value of the method as an auxiliary tool. In summary, 14 structurally novel G-quadruplex ligands were identified in this study. Most of these compounds were highly selective for G-quadruplex over duplex DNA and were even more potent G-quadruplex stabilizers than established G-quadruplex ligands such as 4. The investigation of the biological effects of these promising molecules should therefore be continued in further studies.

EXPERIMENTAL SECTION

Hardware specification

All processes and predictions were performed on a multi-core workstation with 2.4+ GHz, 8 GB of RAM, a 1+ TB fast mass storage, and an NVIDIA graphical processing unit. All programs were run on the Windows 7 platform.

Dataset

A dataset containing 360 confirmed G-quadruplex ligands was manually assembled from the literature. It comprises only compounds that were proved to bind G-quadruplex by FRET or CD melting assays, SPR, NMR, or electrospray ionization-MS. In addition, these compounds were required to stabilize telomeric G-quadruplex and to be selective over duplex DNA. Compounds with a high affinity towards G-quadruplex that failed to increase the melting temperature were discarded, because the ability to stabilize G-quadruplex (which is determined via melting assays) is crucial for their biological activity.


The 2D structures of the molecules were manually generated with ChemBioDraw Ultra version 11 (ref 51) and converted to an sd-file with Pipeline Pilot version 8.5 (ref 52). One low-energy conformation was generated for every input molecule using OMEGA version 2.3.2 (refs 53-55). A list containing the SMILES codes of all compounds is provided in Table S1 in the supporting information.

In addition, an optimized decoy set for G-quadruplex was generated for the theoretical validation. In a first step, the ChEMBL database (ref 56), version 15 (ref 57), was downloaded and all molecules related to the term “telomerase” were removed. The molecular properties of the known G-quadruplex ligands in the dataset were calculated using the “Calculate Molecular Properties” tools implemented in Discovery Studio version 3.5 (ref 33). The ChEMBL-decoy set was then filtered according to the mean ± standard deviation of the molecular properties using a modified Lipinski filter in Pipeline Pilot version 8.5 (ref 52). In detail, the retained property ranges were a molecular weight from 402 to 713, a number of HBDs from one to five, a number of HBAs from three to nine, a number of rotatable bonds from five to 14, and a clogP between two and six. The remaining compounds (about 234,000) were clustered, and 14,974 cluster centers were kept for the final optimized and diverse G-quadruplex decoy set. As for the “actives” dataset described above, one low-energy conformation was generated for every input molecule using OMEGA version 2.3.2 (refs 53-55).

The Specs database version May2013_10mg (www.specs.net) was selected for prospective virtual screening. To reduce the size of the 200,000-compound database, it was filtered with a modified Lipinski filter in Pipeline Pilot version 8.5 (ref 52), and only molecules with a minimum of two aromatic rings were kept. Finally, 127,452 compounds were subjected to the prospective virtual screening conducted with pharmacophore modeling, shape-based modeling, and docking.
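The property-window filter used for the decoy set can be sketched in Python as follows. This is not the Pipeline Pilot protocol itself: the properties are assumed to be precomputed, the bounds are assumed to be inclusive, and the dictionary keys and the two example molecules are hypothetical.

```python
# Property windows quoted in the text; inclusive bounds are an assumption.
PROPERTY_RANGES = {
    "mol_weight": (402, 713),
    "hbd": (1, 5),
    "hba": (3, 9),
    "rotatable_bonds": (5, 14),
    "clogp": (2, 6),
}


def passes_decoy_filter(props):
    """True if every precomputed property lies inside its window."""
    return all(low <= props[name] <= high
               for name, (low, high) in PROPERTY_RANGES.items())


# Two hypothetical molecules with precomputed properties.
candidates = [
    {"id": "mol-A", "mol_weight": 455.5, "hbd": 2, "hba": 6,
     "rotatable_bonds": 7, "clogp": 3.4},
    {"id": "mol-B", "mol_weight": 310.2, "hbd": 0, "hba": 4,
     "rotatable_bonds": 3, "clogp": 1.1},
]
decoys = [mol["id"] for mol in candidates if passes_decoy_filter(mol)]
print(decoys)
```

Matching the decoys to the property space of the known ligands in this way keeps a model from separating actives and decoys on trivial grounds such as molecular size alone.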

Applied programs

LigandScout version 3.1 (ref 20), vROCS version 3.0.0 (refs 14, 15), and GOLD version 5.2 (refs 31, 32) were employed for all pharmacophore modeling, shape-based modeling, and docking procedures, respectively. Whenever possible, the models were based on X-ray crystal structures, because they provide experimentally derived information about potential binding modes. In detail, the structures of the ligands 1 (PDB entry 3CE5, ref 5), 2 (PDB entry 3SC8, ref 6), 3 (PDB entry 3UYH, ref 7), and 4 (PDB entry 3R6R, ref 8) in complex with human G-quadruplex were employed. All selected compounds from the prospective experiments were further investigated with the external bioactivity profiling tools SEA,34 PASS,35 and PharmMapper.36 For a detailed description of the methods and the parameters applied in this study, please refer to Section S2 in the supporting information.

Theoretical model validation

During the theoretical validation, the enrichment factor (EF) of every model was calculated with Equation 1:

$$EF = \frac{TP / n}{A / N}$$ (Equation 1)

In this equation, TP represents the number of active compounds in the virtual hit list, n the number of all compounds in the virtual hit list, A the number of active compounds in the overall validation dataset, and N the number of all compounds in the overall validation dataset. The EF therefore quantifies the enrichment of active compounds in comparison to a random selection. Since the EF is highly dependent on the actual composition of the validation dataset, we also calculated the percentage of the maxEF retrieved by every model. To ensure the retrieval of the majority of active compounds from the dataset while keeping the number of retrieved decoy molecules low, multiple restrictive models were applied in parallel.58 To also determine the quality of these model collections, the area under the curve (AUC) of the receiver operating characteristic (ROC) plots was calculated. For a detailed description of the ROC plot and the AUC, please refer to Triballeau et al.59
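The enrichment factor and its share of the maximum can be computed as in the following Python sketch. The maxEF expression (every hit is an active, capped by the number of actives available) is an assumption, as its definition is not spelled out here. The example model (50 hits, 40 of them active) is hypothetical, and combining the 360 actives with the 14,974 decoys into one validation set is likewise assumed for illustration.

```python
def enrichment_factor(tp, n, a, n_total):
    """EF = (TP / n) / (A / N), following Equation 1."""
    return (tp / n) / (a / n_total)


def max_enrichment_factor(n, a, n_total):
    """Best achievable EF for a hit list of size n (assumed definition)."""
    best_tp = min(n, a)  # a hit list cannot hold more actives than exist
    return (best_tp / n) / (a / n_total)


# Hypothetical model: 50 virtual hits, 40 of them known actives.
A, N = 360, 360 + 14974  # assumed composition of the validation dataset
ef = enrichment_factor(tp=40, n=50, a=A, n_total=N)
max_ef = max_enrichment_factor(n=50, a=A, n_total=N)
percent_of_max = 100.0 * ef / max_ef
print(f"EF = {ef:.1f}, maxEF = {max_ef:.1f}, {percent_of_max:.0f}% of maxEF")
```

Reporting the percentage of maxEF makes models comparable across validation sets with different active-to-decoy ratios, which is the motivation given in the text.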


Selection of virtual hits for biological testing

To be able to directly compare the performances of all applied methods, the virtual hits of all methods were independently ranked according to the respective fitness score. The ten top-ranked compounds of every method were selected for further investigations. In cases where very similar compounds were ranked among the top ten, only the highest scoring ones were included, discarding the others in favor of more diverse molecules further down the line. In order to compare the performance of each single method to a consensus approach, five consensus hits, predicted as active by all three methods, were also subjected to the biological testing.
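The selection rule described above (take the best-scoring compounds, skip near-duplicates of compounds already chosen) can be sketched in Python. How similarity was judged in the study is not specified, so the set-based Tanimoto comparison, the 0.8 threshold, and all identifiers below are hypothetical choices made purely for illustration.

```python
def select_diverse_top(ranked_ids, fingerprints, n=10, max_similarity=0.8):
    """Pick up to n compounds from a best-first ranking, skipping any
    compound too similar to one that has already been selected."""

    def similarity(x, y):
        union = fingerprints[x] | fingerprints[y]
        if not union:
            return 0.0
        return len(fingerprints[x] & fingerprints[y]) / len(union)

    selected = []
    for cid in ranked_ids:
        if all(similarity(cid, kept) < max_similarity for kept in selected):
            selected.append(cid)
        if len(selected) == n:
            break
    return selected


# Toy ranking (best first) with hypothetical fingerprints.
ranking = ["c1", "c2", "c3", "c4"]
toy_fps = {
    "c1": {1, 2, 3, 4, 5},
    "c2": {1, 2, 3, 4, 5, 6},  # close analogue of c1 (Tc = 5/6)
    "c3": {7, 8, 9},
    "c4": {10, 11},
}
print(select_diverse_top(ranking, toy_fps, n=3))
```

In this toy case the second-ranked compound is skipped as a close analogue of the first, and the next, more diverse molecules move up.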

Generation of the prediction matrix

The top ten compounds selected by LigandScout, ROCS, and GOLD, as well as the five hits from the consensus approach, were merged into an overall hit list, thereby creating a prediction matrix. It shows whether the top-ten ranked compounds were also predicted by any of the other applied methods above the activity cut-off.
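A prediction matrix of this kind can be represented as a nested mapping from compound to method to a yes/no prediction. The Python sketch below is an illustration under assumed inputs: the compound identifiers and the per-method hit sets are toy data, and the function name `build_prediction_matrix` is hypothetical.

```python
def build_prediction_matrix(selected, predicted):
    """selected: method -> compounds picked for testing (ranked list).
    predicted: method -> set of all compounds scored above its cut-off.
    Returns compound -> {method: predicted active?} for the merged hit list."""
    hit_list = []
    for compounds in selected.values():
        for cid in compounds:
            if cid not in hit_list:  # merge while keeping first-seen order
                hit_list.append(cid)
    return {cid: {method: cid in hits for method, hits in predicted.items()}
            for cid in hit_list}


# Toy data with hypothetical compound identifiers.
selected = {"LigandScout": ["c1", "c2"], "ROCS": ["c2", "c3"], "GOLD": ["c4"]}
predicted = {
    "LigandScout": {"c1", "c2", "c4"},
    "ROCS": {"c2", "c3"},
    "GOLD": {"c2", "c4"},
}
matrix = build_prediction_matrix(selected, predicted)
consensus = [cid for cid, row in matrix.items() if all(row.values())]
print(consensus)
```

Compounds whose row is true for every method correspond to consensus hits; rows with a single true entry are hits unique to one method.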

Compound characterization

The identity and purity of the tested compounds were confirmed by Specs via 1H NMR and/or HPLC-MS. The melting points (m. p.) of the novel G-quadruplex ligands were measured with a light microscope (System Kofler), which was calibrated with benzanilide (m. p. 163 °C), dicyandiamide (m. p. 201 °C), and phenolphthalein (m. p. 263 °C) prior to its use. All determined m. p. are available in Table S2 in the supporting information. To investigate the risk of interference with the experimental assays, all compounds were analyzed with the PAINS filter implemented at the endocrine disruptome homepage.60 For compound 58, a PAINS alert was reported, because it contained the het_pyridinium_(B2) substructure. However, no interference of the optical properties of the molecules with the fluorescence melting assay was observed. None of the other active compounds violated the PAINS rules; they are therefore unlikely to interfere with biological assays.

Fluorescence melting assay

Experiments were performed in a Roche LightCycler, using an excitation source at 488 nm and recording the fluorescence emission at 520 nm. Target DNAs were the human telomeric sequences HTS d[AG3(T2AG3)3T] and HT24 d[TTG3(T2AG3)3A], and a 22 bp double-stranded DNA (5’G2A(TG)2A(GT)2GA(GT)2GAG2). All sequences were designed to locate a fluorophore (6-FAM) and a quencher (Dabcyl) in close proximity when the nucleic acid was folded into the expected secondary structure, thus resulting in a quenching of the FAM fluorescence signal. Conversely, upon DNA denaturation the two labelling groups move apart and fluorescence increases. For competition assays, non-labelled 22 bp double-stranded DNA was used. Solutions of properly annealed nucleic acid (0.25 µM) were prepared in LiP buffer (40 mM Li3PO4, pH 7.5) containing 50 mM KCl or NaCl, and increasing amounts of the tested compounds (0-20 µM) were added. Recordings were taken during melting (1 °C/min), and Tm values were determined from the first derivatives of the melting profiles using the Roche LightCycler software. Each curve was repeated at least three times, and errors were ± 0.4 °C. ∆Tm values were calculated by subtracting the Tm value recorded in the absence of ligand from the corresponding value in the presence of the ligand.61
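Determining Tm from the first derivative of a melting profile can be illustrated with a short Python sketch. It is not the algorithm of the Roche LightCycler software: it uses a plain central finite difference on idealized, noise-free sigmoidal curves, and the two transition temperatures (60 and 75 °C) are hypothetical. ∆Tm is taken here as Tm with ligand minus Tm without ligand, so that stabilization gives a positive value.

```python
import math


def melting_temperature(temps, signal):
    """Return the temperature at which dF/dT (central difference) is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_curve(temps, tm, width=2.0):
    """Idealized sigmoidal unfolding curve; fluorescence rises on melting."""
    return [1.0 / (1.0 + math.exp(-(t - tm) / width)) for t in temps]


temps = [float(t) for t in range(30, 91)]  # 30-90 °C in 1 °C steps
tm_free = melting_temperature(temps, synthetic_curve(temps, tm=60.0))
tm_bound = melting_temperature(temps, synthetic_curve(temps, tm=75.0))
delta_tm = tm_bound - tm_free
print(f"Tm (no ligand) = {tm_free} °C, Tm (ligand) = {tm_bound} °C, "
      f"dTm = {delta_tm} °C")
```

With real data, smoothing before differentiation is usually needed, because numerical derivatives amplify noise.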

Circular dichroism

CD spectra were recorded at 25 °C in 10 mM Tris-HCl, 1 mM EDTA, 50 mM KCl, pH 7.5, on a Jasco J-810 spectropolarimeter equipped with a Peltier-type temperature control system, using a 10 mm path length cell. Before data acquisition, the human telomeric sequence Tel22 d[G3(T2AG3)3T] (4 µM) was heated at 95 °C for 5 min and left to cool to room temperature overnight. Spectra were then acquired in the absence and in the presence of increasing concentrations of the tested ligands. The reported spectrum of each sample represents the average of 3 scans recorded with 1 nm step resolution. Observed ellipticities were converted to mean residue ellipticity, [θ], in deg × cm2 × dmol-1 (Mol. Ellip.).
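The conversion itself is not spelled out in the text. A commonly used form, given here as an assumption rather than as the authors' stated procedure, relates the observed ellipticity θ_obs (in mdeg) to the concentration c (mol/L), the path length l (cm), and the number of residues n; setting n = 1 gives the molar ellipticity. In the worked line, c = 4 µM and l = 1 cm follow the conditions above, whereas the 4 mdeg reading is a hypothetical value.

```latex
[\theta] = \frac{\theta_{\mathrm{obs}}}{10 \, c \, l \, n}
\qquad \left[\mathrm{deg \cdot cm^{2} \cdot dmol^{-1}}\right]

% Worked example with n = 1 and a hypothetical reading of 4 mdeg:
[\theta] = \frac{4}{10 \times (4 \times 10^{-6}) \times 1 \times 1}
         = 1.0 \times 10^{5}\ \mathrm{deg \cdot cm^{2} \cdot dmol^{-1}}
```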

Surface plasmon resonance SPR measurements were performed on a Biacore X100. A streptavidin-coated sensor chip was prepared for use by conditioning with 1 min injections of 1 M NaCl, 50 mM NaOH in 50% isopropanol and was finally washed extensively with 0.22 µm filtered buffer (10 mM Tris, 50 mM KCl, 0.5% DMSO, 0.025% P20). Previously annealed 5’-biotinylated Tel22 was then immobilized on one cell of the chip surface by flowing a 50 nM DNA solution at a flow rate of 1 µl/min until a response of 400 RU was obtained. A second cell was left blank as a control. Sensorgrams were acquired while compound solutions (0-70 µM) were injected at a flow rate of 25 µl/min for 100 s. After each run, a 30 s regeneration step was performed with 10 mM glycine, pH 2.5, followed by a 60 s stabilization period with running buffer. The experimental RU values recorded at the steady state were fitted to a one- or two-binding-site model.
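The fitting equations are not given in the text. As a hedged illustration, the sketch below evaluates the standard steady-state expressions for one class and for two independent classes of binding sites and scores a parameter set by its sum of squared residuals. The function names, the parameter names (`r_max`, `k_d`), and all numerical values are hypothetical; an actual analysis would optimize the parameters with a least-squares routine.

```python
from typing import Callable, Sequence


def one_site(conc: float, r_max: float, k_d: float) -> float:
    """Steady-state response for a single class of binding sites."""
    return r_max * conc / (k_d + conc)


def two_site(conc: float, r_max1: float, k_d1: float,
             r_max2: float, k_d2: float) -> float:
    """Sum of two independent one-site terms."""
    return one_site(conc, r_max1, k_d1) + one_site(conc, r_max2, k_d2)


def sum_squared_residuals(concs: Sequence[float], observed: Sequence[float],
                          model: Callable[..., float], *params: float) -> float:
    """Goodness-of-fit score for a given model and parameter set."""
    return sum((obs - model(c, *params)) ** 2 for c, obs in zip(concs, observed))


# Hypothetical titration (µM) generated from a one-site model with K_D = 5 µM.
concs = [1.0, 2.5, 5.0, 10.0, 20.0, 40.0, 70.0]
observed = [one_site(c, 100.0, 5.0) for c in concs]

half_saturation = one_site(5.0, 100.0, 5.0)  # response at conc == K_D
perfect_fit = sum_squared_residuals(concs, observed, one_site, 100.0, 5.0)
print(half_saturation, perfect_fit)
```

Comparing the residuals of the two models on the same data is one simple way to decide whether the second site is justified.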

Analysis of results

After the biological testing, the performance of all applied tools was analyzed within three categories. In the first category, EE, the ability of the programs to correctly predict active compounds within the top-ten ranked molecules was investigated using Equation 2:

$\mathrm{EE} = [\ldots] \times 100$ (Equation 2)

The next category, OE (Equation 3), analyzed whether a method correctly predicted the active molecules (TP) above the defined activity cut-off (independently of the top-ten positions):

$\mathrm{OE} = [\ldots] \times 100$ (Equation 3)

The last category, Acc (Equation 4),62 investigated how many of the predictions were correct. This included compounds correctly predicted as active, but also those molecules that were correctly classified as inactive (TN). For this purpose, the numbers of FP (compounds that were predicted as active but were inactive in the experimental testing) and FN (compounds that were not predicted as active but were active in the biological assessment) hits were also calculated:

$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \times 100$ (Equation 4)

However, the numbers of TP, TN, FP, and FN hits are highly dependent on the actual composition of active and inactive molecules in the overall hit list. To better assess the performances of the applied tools independently of the composition of the overall hit list, the maximum possible values were determined (maxTP, maxTN, maxFP, and maxFN), and for every method the retrieved % of these maximum values was calculated. For example, the maxTP rate is 21.9% (seven out of 35 investigated molecules were active). A method that correctly classified all of the seven active compounds yields 100%, whereas a method predicting only one retrieves 14.3% of the maxTP rate. The % of the maxTN, maxFP, and maxFN rates were calculated analogously.
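A short sketch may make the bookkeeping concrete. It assumes, as the description of Acc suggests, that accuracy is the share of correct predictions (TP + TN) among all predictions, and that the normalized rates are the retrieved counts divided by the maximum achievable counts. The function names and the confusion-matrix counts are hypothetical; only the seven active compounds come from the text.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Percentage of correct predictions among all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (tp + tn) / total * 100


def percent_of_max(retrieved: int, maximum: int) -> float:
    """Retrieved count as a percentage of the maximum achievable count."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return retrieved / maximum * 100


n_actives = 7  # number of experimentally confirmed actives stated in the text

all_found = percent_of_max(7, n_actives)   # a method retrieving all seven actives
one_found = percent_of_max(1, n_actives)   # a method retrieving a single active
print(round(all_found, 1), round(one_found, 1))

# Hypothetical confusion-matrix counts for one method (illustration only).
print(round(accuracy(tp=3, tn=20, fp=8, fn=4), 1))
```

The same `percent_of_max` helper applies to the TN, FP, and FN counts once their respective maxima are known.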

SUPPORTING INFORMATION The supporting information provides: Section S1. Detailed description of pharmacophore- and shape-based model generation. Section S2. Detailed description of the applied virtual screening methods. Table S1. SMILES codes of the active compounds in the theoretical validation dataset. Figure S1. Detailed graphical representation of the maxTP, maxTN, maxFP, and maxFN rates retrieved by all methods. Table S2. Melting points of novel G-quadruplex ligands. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION Corresponding Author *[email protected], phone/fax:+43 512 507 58253 / +43 512 507 58299

ACKNOWLEDGMENT This study was supported by the foundation “Verein zur Förderung der Ausbildung und Tätigkeit von Südtirolern an der Landesuniversität Innsbruck” (T. K.), by the Erasmus program student mobility placement financed by the E. U. and supported by the “Standortagentur Tirol” (T. K.), by a grant of the TWF (T. K.), and by the Erika Cremer habilitation program of the University of Innsbruck (D. S.). The research was also funded by the University of Padova (CPDA147272/14 and 60A04-5935/15). S. A. acknowledges the Italian Ministry of Education Funding for Investments of Base Research for the years 2009–2014 (code FIRB-IDEAS RBID082ATK). Many thanks to Veronika Temml and Sonja Herdlinger for technical support. We thank OpenEye and Inte:Ligand for providing software free of charge.

ABBREVIATIONS Acc, accuracy; Aro, aromatic feature; AUC, area under the curve; C, cation; CD, circular dichroism; EE, early enrichment; EF, enrichment factor; FN, false negatives; FP, false positives; H, hydrophobic feature; HBA, hydrogen bond acceptor; HBD, hydrogen bond donor; maxEF, maximum enrichment factor; maxFN, maximum false negative rate; maxFP, maximum false positive rate; maxTN, maximum true negative rate; maxTP, maximum true positive rate; OE, overall enrichment; PI, positively ionizable feature; R, ring feature; SPR, surface plasmon resonance; Tc, Tanimoto coefficient; TN, true negatives; TP, true positives; Xvol, exclusion volume.

REFERENCES
(1) Lu, L.; Shiu-Hin Chan, D.; Kwong, D. W. J.; He, H.-Z.; Leung, C.-H.; Ma, D.-L. Detection of Nicking Endonuclease Activity Using a G-Quadruplex-Selective Luminescent Switch-On Probe. Chem. Sci. 2014, 5, 4561–4568.
(2) He, H.-Z.; Chan, D. S.-H.; Leung, C.-H.; Ma, D.-L. G-Quadruplexes for Luminescent Sensing and Logic Gates. Nucleic Acids Res. 2013, 41, 4345–4359.

(3) Ma, D.-L.; Wang, M.; He, B.; Yang, C.; Wang, W.; Leung, C.-H. A Luminescent Cocaine Detection Platform Using a Split G-Quadruplex-Selective Iridium(III) Complex and a Three-Way DNA Junction Architecture. ACS Appl. Mater. Interfaces 2015, 7, 19060–19067.
(4) Ohnmacht, S. A.; Neidle, S. Small-Molecule Quadruplex-Targeted Drug Discovery. Bioorg. Med. Chem. Lett. 2014, 24, 2602–2612.
(5) Campbell, N. H.; Parkinson, G. N.; Reszka, A. P.; Neidle, S. Structural Basis of DNA Quadruplex Recognition by an Acridine Drug. J. Am. Chem. Soc. 2008, 130, 6722–6724.
(6) Collie, G. W.; Promontorio, R.; Hampel, S. M.; Micco, M.; Neidle, S.; Parkinson, G. N. Structural Basis for Telomeric G-Quadruplex Targeting by Naphthalene Diimide Ligands. J. Am. Chem. Soc. 2012, 134, 2723–2731.
(7) Micco, M.; Collie, G. W.; Dale, A. G.; Ohnmacht, S. A.; Pazitna, I.; Gunaratnam, M.; Reszka, A. P.; Neidle, S. Structure-Based Design and Evaluation of Naphthalene Diimide G-Quadruplex Ligands as Telomere Targeting Agents in Pancreatic Cancer Cells. J. Med. Chem. 2013, 56, 2959–2974.
(8) Bazzicalupi, C.; Ferraroni, M.; Bilia, A. R.; Scheggi, F.; Gratteri, P. The Crystal Structure of Human Telomeric DNA Complexed with Berberine: An Interesting Case of Stacked Ligand to G-Tetrad Ratio Higher than 1:1. Nucleic Acids Res. 2013, 41, 632–638.
(9) Ma, D.-L.; Lai, T.-S.; Chan, F.-Y.; Chung, W.-H.; Abagyan, R.; Leung, Y.-C.; Wong, K.-Y. Discovery of a Drug-Like G-Quadruplex Binding Ligand by High-Throughput Docking. ChemMedChem 2008, 3, 881–884.

(10) Chen, S.-B.; Tan, J.-H.; Ou, T.-M.; Huang, S.-L.; An, L.-K.; Luo, H.-B.; Li, D.; Gu, L.-Q.; Huang, Z.-S. Pharmacophore-Based Discovery of Triaryl-Substituted Imidazole as New Telomeric G-Quadruplex Ligand. Bioorg. Med. Chem. Lett. 2011, 21, 1004–1009.
(11) Alcaro, S.; Musetti, C.; Distinto, S.; Casatti, M.; Zagotto, G.; Artese, A.; Parrotta, L.; Moraca, F.; Costa, G.; Ortuso, F.; Maccioni, E.; Sissi, C. Identification and Characterization of New DNA G-Quadruplex Binders Selected by a Combination of Ligand and Structure-Based Virtual Screening Approaches. J. Med. Chem. 2013, 56, 843–855.
(12) Castillo-González, D.; Mergny, J.-L.; De Rache, A.; Pérez-Machado, G.; Cabrera-Pérez, M. A.; Nicolotti, O.; Introcaso, A.; Mangiatordi, G. F.; Guédin, A.; Bourdoncle, A.; Garrigues, T.; Pallardó, F.; Cordeiro, M. N. D. S.; Paz-y-Miño, C.; Tejera, E.; Borges, F.; Cruz-Monteagudo, M. Harmonization of QSAR Best Practices and Molecular Docking Provides an Efficient Virtual Screening Tool for Discovering New G-Quadruplex Ligands. J. Chem. Inf. Model. 2015, 55, 2094–2110.
(13) Wermuth, G.; Ganellin, C. R.; Lindberg, P.; Mitscher, L. A. Glossary of Terms Used in Medicinal Chemistry (IUPAC Recommendations 1998). Pure Appl. Chem. 1998, 70, 1129–1143.
(14) vROCS version 3.0.0; OpenEye Scientific Software: Santa Fe, NM, http://www.eyesopen.com (access date December 30, 2015).
(15) Hawkins, P. C.; Skillman, A. G.; Nicholls, A. Comparison of Shape-Matching and Docking as Virtual Screening Tools. J. Med. Chem. 2007, 50, 74–82.
(16) Kirchmair, J.; Distinto, S.; Markt, P.; Schuster, D.; Spitzer, G. M.; Liedl, K. R.; Wolber, G. How to Optimize Shape-Based Virtual Screening: Choosing the Right Query and Including Chemical Information. J. Chem. Inf. Model. 2009, 49, 678–692.

(17) Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Docking and Scoring in Virtual Screening for Drug Discovery: Methods and Applications. Nat. Rev. Drug Discovery 2004, 3, 935–949.
(18) Kaserer, T.; Höferl, M.; Müller, K.; Elmer, S.; Ganzera, M.; Jäger, W.; Schuster, D. In Silico Predictions of Drug-Drug Interactions Caused by CYP1A2, 2C9, and 3A4 Inhibition - A Comparative Study of Virtual Screening Performance. Mol. Inf. 2015, 34, 431–457.
(19) Kaserer, T.; Temml, V.; Kutil, Z.; Vanek, T.; Landa, P.; Schuster, D. Prospective Performance Evaluation of Selected Common Virtual Screening Tools. Case Study: Cyclooxygenase (COX) 1 and 2. Eur. J. Med. Chem. 2015, 96, 445–457.
(20) Wolber, G.; Langer, T. LigandScout: 3-D Pharmacophores Derived from Protein-Bound Ligands and Their Use as Virtual Screening Filters. J. Chem. Inf. Model. 2005, 45, 160–169.
(21) Cuenca, F.; Moore, M. J. B.; Johnson, K.; Guyen, B.; De Cian, A.; Neidle, S. Design, Synthesis and Evaluation of 4,5-di-Substituted Acridone Ligands with High G-Quadruplex Affinity and Selectivity, Together with Low Toxicity to Normal Cells. Bioorg. Med. Chem. Lett. 2009, 19, 5109–5113.
(22) Moorhouse, A. D.; Haider, S.; Gunaratnam, M.; Munnur, D.; Neidle, S.; Moses, J. E. Targeting Telomerase and Telomeres: A Click Chemistry Approach Towards Highly Selective G-Quadruplex Ligands. Mol. BioSyst. 2008, 4, 629–642.
(23) Li, Z.; Tan, J.-H.; He, J.-H.; Long, Y.; Ou, T.-M.; Li, D.; Gu, L.-Q.; Huang, Z.-S. Disubstituted Quinazoline Derivatives as a New Type of Highly Selective Ligands for Telomeric G-Quadruplex DNA. Eur. J. Med. Chem. 2012, 47, 299–311.

(24) Smith, N. M.; Labrunie, G.; Corry, B.; Tran, P. L. T.; Norret, M.; Djavaheri-Mergny, M.; Raston, C. L.; Mergny, J.-L. Unraveling the Relationship Between Structure and Stabilization of Triarylpyridines as G-Quadruplex Binding Ligands. Org. Biomol. Chem. 2011, 9, 6154–6162.
(25) Laronze-Cochard, M.; Kim, Y.-M.; Brassart, B.; Riou, J.-F.; Laronze, J.-Y.; Sapi, J. Synthesis and Biological Evaluation of Novel 4,5-Bis(Dialkylaminoalkyl)-Substituted Acridines as Potent Telomeric G-Quadruplex Ligands. Eur. J. Med. Chem. 2009, 44, 3880–3888.
(26) Zhou, J.-L.; Lu, Y.-J.; Ou, T.-M.; Zhou, J.-M.; Huang, Z.-S.; Zhu, X.-F.; Du, C.-J.; Bu, X.-Z.; Ma, L.; Gu, L.-Q.; Li, Y.-M.; Chan, A. S.-C. Synthesis and Evaluation of Quindoline Derivatives as G-Quadruplex Inducing and Stabilizing Ligands and Potential Inhibitors of Telomerase. J. Med. Chem. 2005, 48, 7315–7321.
(27) Lu, Y.-J.; Ou, T.-M.; Tan, J.-H.; Hou, J.-Q.; Shao, W.-Y.; Peng, D.; Sun, N.; Wang, X.-D.; Wu, W.-B.; Bu, X.-Z.; Huang, Z.-S.; Ma, D.-L.; Wong, K.-Y.; Gu, L.-Q. 5-N-Methylated Quindoline Derivatives as Telomeric G-Quadruplex Stabilizing Ligands: Effects of 5-N Positive Charge on Quadruplex Binding Affinity and Cell Proliferation. J. Med. Chem. 2008, 51, 6381–6392.
(28) Wei, C.-Y.; Wang, J.-H.; Wen, Y.; Liu, J.; Wang, L.-H. 4-(1H-Imidazo[4,5-f]-1,10-phenanthrolin-2-yl)phenol-Based G-Quadruplex DNA Binding Agents: Telomerase Inhibition, Cytotoxicity and DNA-Binding Studies. Bioorg. Med. Chem. 2013, 21, 3379–3387.
(29) De Cian, A.; Guittat, L.; Shin-ya, K.; Riou, J.-F.; Mergny, J.-L. Affinity and Selectivity of G4 Ligands Measured by FRET. Nucleic Acids Symp. Ser. 2005, 49, 235–236.

(30) Drewe, W. C.; Nanjunda, R.; Gunaratnam, M.; Beltran, M.; Parkinson, G. N.; Reszka, A. P.; Wilson, W. D.; Neidle, S. Rational Design of Substituted Diarylureas: A Scaffold for Binding to G-Quadruplex Motifs. J. Med. Chem. 2008, 51, 7751–7767.
(31) GOLD version 5.2; CCDC: Cambridge, UK, www.ccdc.cam.ac.uk (access date December 30, 2015).
(32) Jones, G.; Willett, P.; Glen, R. C.; Leach, A. R.; Taylor, R. Development and Validation of a Genetic Algorithm for Flexible Docking. J. Mol. Biol. 1997, 267, 727–748.
(33) Accelrys Software Inc., Discovery Studio Release 3.5; San Diego: Accelrys Inc., 2012.
(34) Keiser, M. J.; Roth, B. L.; Armbruster, B. N.; Ernsberger, P.; Irwin, J. J.; Shoichet, B. K. Relating Protein Pharmacology by Ligand Chemistry. Nat. Biotechnol. 2007, 25, 197–206 (access dates January 31, February 1–4, 20, and 23–24, and March 2, 2015).
(35) Filimonov, D. A.; Lagunin, A. A.; Gloriozova, T. A.; Rudik, A. V.; Druzhilovskii, D. S.; Pogodin, P. V.; Poroikov, V. V. Prediction of the Biological Activity Spectra of Organic Compounds Using the Pass Online Web Resource. Chem. Heterocycl. Compd. 2014, 50, 444–457 (access date February 20, 2015).
(36) Liu, X.; Ouyang, S.; Yu, B.; Liu, Y.; Huang, K.; Gong, J.; Zheng, S.; Li, Z.; Li, H.; Jiang, H. PharmMapper Server: A Web Server for Potential Drug Target Identification Using Pharmacophore Mapping Approach. Nucleic Acids Res. 2010, 38, W609–614 (access dates February 1, 20, and 23, 2015).
(37) Bessi, I.; Bazzicalupi, C.; Richter, C.; Jonker, H. R. A.; Saxena, K.; Sissi, C.; Chioccioli, M.; Bianco, S.; Bilia, A. R.; Schwalbe, H.; Gratteri, P. Spectroscopic, Molecular Modeling, and NMR-Spectroscopic Investigation of the Binding Mode of the Natural Alkaloids Berberine and Sanguinarine to Human Telomeric G-Quadruplex DNA. ACS Chem. Biol. 2012, 7, 1109–1119.
(38) Bhadra, K.; Kumar, G. S. Interaction of Berberine, Palmatine, Coralyne, and Sanguinarine to Quadruplex DNA: A Comparative Spectroscopic and Calorimetric Study. Biochim. Biophys. Acta, Gen. Subj. 2011, 1810, 485–496.
(39) Leach, A. R.; Shoichet, B. K.; Peishoff, C. E. Prediction of Protein-Ligand Interactions. Docking and Scoring: Successes and Gaps. J. Med. Chem. 2006, 49, 5851–5855.
(40) Meslamani, J.; Li, J.; Sutter, J.; Stevens, A.; Bertrand, H. O.; Rognan, D. Protein-Ligand-Based Pharmacophores: Generation and Utility Assessment in Computational Ligand Profiling. J. Chem. Inf. Model. 2012, 52, 943–955.
(41) Kellenberger, E.; Rodrigo, J.; Muller, P.; Rognan, D. Comparative Evaluation of Eight Docking Tools for Docking and Virtual Screening Accuracy. Proteins: Struct., Funct., Bioinf. 2004, 57, 225–242.
(42) Alcaro, S.; Artese, A.; Costa, G.; Distinto, S.; Ortuso, F.; Parrotta, L. Conformational Studies and Solvent-Accessible Surface Area Analysis of Known Selective DNA G-Quadruplex Binders. Biochimie 2011, 93, 1267–1274.
(43) Dassault Systèmes BIOVIA, Discovery Studio Modeling Environment, version 4.0; San Diego: Dassault Systèmes, 2013.
(44) Tanimoto, T. T. IBM Internal Report, 17th Nov. 1957.
(45) Jaccard, P. Distribution de la Flore Alpine dans le Bassin des Dranses et dans Quelques Régions Voisines. Bull. Soc. Vaud. sci. nat. 1901, 37, 241–272.

(46) Sassano, M. F.; Schlesinger, A. P.; Jarstfer, M. B. Identification of G-Quadruplex Inducers Using a Simple, Inexpensive and Rapid High Throughput Assay, and Their Inhibition of Human Telomerase. Open Med. Chem. J. 2012, 6, 20–28.
(47) Suarez, R. M.; Bosch, P.; Sucunza, D.; Cuadro, A. M.; Domingo, A.; Mendicuti, F.; Vaquero, J. J. Targeting DNA with Small Molecules: A Comparative Study of a Library of Azonia Aromatic Chromophores. Org. Biomol. Chem. 2015, 13, 527–538.
(48) Pastor, J.; Siro, J. G.; García-Navío, J. L.; Vaquero, J. J.; Alvarez-Builla, J.; Gago, F.; de Pascual-Teresa, B.; Pastor, M.; Rodrigo, M. M. Azino-Fused Benzimidazolium Salts as DNA Intercalating Agents. 2. J. Org. Chem. 1997, 62, 5476–5483.
(49) Pastor, J.; Siró, J.; García-Navío, J.; Vaquero, J. J.; Melia Rodrigo, M.; Ballesteros, M.; Alvarez-Builla, J. Synthesis of New Azino Fused Benzimidazolium Salts. A New Family of DNA Intercalating Agents. I. Bioorg. Med. Chem. Lett. 1995, 5, 3043–3048.
(50) Martin, M. A.; del Castillo, B.; Lerner, D. A. Study of the Luminescence Properties of a New Series of Quinolizinium Salts and Their Interaction with DNA. Anal. Chim. Acta 1988, 205, 105–115.
(51) PerkinElmer. ChemBioDraw Ultra 11.0, 2008, Waltham, MA.
(52) Accelrys Software Inc., Pipeline Pilot, Release 8.5; San Diego: Accelrys Inc., 2011.
(53) OMEGA version 2.3.2; OpenEye Scientific Software: Santa Fe, NM, http://www.eyesopen.com (access date December 30, 2015).

(54) Hawkins, P. C.; Skillman, A. G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer Generation with OMEGA: Algorithm and Validation Using High Quality Structures from the Protein Databank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584.
(55) Hawkins, P. C.; Nicholls, A. Conformer Generation with OMEGA: Learning from the Data Set and the Analysis of Failures. J. Chem. Inf. Model. 2012, 52, 2919–2936.
(56) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100–D1107.
(57) chembl_15_release_notes. ftp://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_15/ (access date May 16, 2013).
(58) Schuster, D.; Waltenberger, B.; Kirchmair, J.; Distinto, S.; Markt, P.; Stuppner, H.; Rollinger, J. M.; Wolber, G. Predicting Cyclooxygenase Inhibition by Three-Dimensional Pharmacophoric Profiling. Part I: Model Generation, Validation and Applicability in Ethnopharmacology. Mol. Inf. 2010, 29, 75–86.
(59) Triballeau, N.; Acher, F.; Brabet, I.; Pin, J.-P.; Bertrand, H.-O. Virtual Screening Workflow Development Guided by the “Receiver Operating Characteristic” Curve Approach. Application to High-Throughput Docking on Metabotropic Glutamate Receptor Subtype 4. J. Med. Chem. 2005, 48, 2534–2547.
(60) Kolšek, K.; Mavri, J.; Sollner Dolenc, M.; Gobec, S.; Turk, S. Endocrine Disruptome—An Open Source Prediction Tool for Assessing Endocrine Disruption Potential Through Nuclear Receptor Binding. J. Chem. Inf. Model. 2014, 54, 1254–1267 (access date March 17, 2015).

(61) Darby, R. A. J.; Sollogoub, M.; McKeen, C.; Brown, L.; Risitano, A.; Brown, N.; Barton, C.; Brown, T.; Fox, K. R. High Throughput Measurement of Duplex, Triplex and Quadruplex Melting Curves Using Molecular Beacons and a LightCycler. Nucleic Acids Res. 2002, 30, e39–e39.
(62) Jacobsson, M.; Lidén, P.; Stjernschantz, E.; Boström, H.; Norinder, U. Improving Structure-Based Virtual Screening by Multivariate Analysis of Scoring Data. J. Med. Chem. 2003, 46, 5781–5789.


Table of Contents Graphic

Chart 1. Structures of the G-quadruplex ligands 1 (BRACO-19; ref 5), 2 (BMSG-SH-3; ref 6), 3 (MM41; ref 7), and 4 (berberine; ref 8).

Chart 2. Compounds 5 (ref 21) and 6 (ref 22) served as training compounds for the generation of the pharmacophore model pm-G-quadruplex-1.

Chart 3. Structures of the selected test compounds.

Chart 4. Training set compounds for the generation of shape-based models.

Chart 5. Structures of the derivatives structurally related to the active hits.

Figure 1. Study design. SAR, structure-activity relationship.

Figure 2. G-quadruplex pharmacophore models. Models pm-3sc8-1 (A), pm-3sc8-2 (B), and pm-3sc8-3 (C) were generated with the PDB entry 3SC8 (ref 6). Models pm-3uyh-1 (D) and pm-3uyh-2 (E) were created with the PDB entry 3UYH (ref 7). The berberine-human G-quadruplex complex (PDB entry 3R6R; ref 8) served as the basis for the generation of model pm-3r6r-1 (F). Model pm-G-quadruplex-1 (G) was generated with the two known ligands acridone derivative 5 (ref 21; grey) and bis-triazole derivative 6 (ref 22; blue). Aro, aromatic feature; PI, positively ionizable feature; H, hydrophobic feature; Xvol, exclusion volume.

Figure 3. G-quadruplex shape-based models. Models shape-3uyh-2 (A), shape-3r6r-2 (B), and shape-3ce5-1 (C) were generated with the co-crystallized ligands from the PDB entries 3UYH (ref 7), 3R6R (ref 8), and 3CE5 (ref 5), respectively. The models shape-G-quadruplex-1 (D) and shape-G-quadruplex-2 (E) were generated with one low-energy conformation of the known active compounds quinazoline derivative 7 (ref 23) and triarylpyridine derivative 8 (ref 24), respectively. The pre-aligned poses of acridine derivative 9 (ref 25) and acridone derivative 10 (ref 21), and of quindoline derivative 11 (ref 26), quindoline derivative 12 (ref 27), and imidazo phenanthrolin derivative 13 (ref 28), served as queries for the models shape-G-quadruplex-3 (F) and shape-G-quadruplex-4 (G), respectively. For the last models, shape-G-quadruplex-5 (H) and shape-G-quadruplex-6 (I), a modified aromatic force field was applied for virtual screening. These models were generated with one low-energy conformation of 14 (telomestatin; ref 29) and diarylurea derivative 15 (ref 30), respectively. C, cation; HBA, hydrogen bond acceptor; HBD, hydrogen bond donor.

Figure 4. Thermal stabilization of different DNA templates promoted by 10 µM ligand concentration.

Figure 5. Melting profile of HTS in the presence of 5 µM of compound 46, recorded in LiP buffer, 50 mM KCl, pH 7.5, with increasing concentrations of double-stranded DNA. Duplex and G4 refer to base pair and G-quadruplex concentrations, respectively.

Figure 6. Variation of the melting temperature (∆Tm) of the G-quadruplex (solid line) induced by increasing concentrations of the selected parent compounds 22 (A), 30 (B), 46 (C), and 50 (D), and their derivatives 51–60. Compounds that increased the melting temperature of the G-quadruplex were also investigated for their stabilizing properties toward duplex DNA (dashed line).

Figure 7. (A) Variation of the CD spectrum of telomeric G-4 upon addition of selected compounds, determined in 10 mM Tris, 1 mM EDTA, 50 mM KCl, pH 7.5, 25 °C. The spectrum of 4 was taken from ref 37. (B) Relative variation of the 290 nm dichroic signal as a function of the [ligand]/[G-quadruplex] ratio.

Figure 8. Sensorgrams corresponding to the titration of Tel22 with compound 50 in 10 mM Tris, 50 mM KCl, pH 7.4 (Panel A). Plot of the response units (RU) recorded at the steady state as a function of the metal complex concentration in the flow solutions (Panel B). Lines represent the best fit using the appropriate binding model described in the text.

Figure 9. Detailed graphical representation of the performances of all methods. (A) EE, OE, and Acc values retrieved by all applied methods and the consensus approach. (B) Composition of the hit lists obtained with every method and the consensus approach with respect to TP, TN, FP, and FN rates.
