Two-Quartet G-Quadruplexes Formed by DNA Sequences Containing


Article

Two-Quartet G-quadruplexes Formed by DNA Sequences Containing Four Contiguous GG Runs Mingyan Qin, Zhuxi Chen, Qichao Luo, Yi Wen, Naixia Zhang, Hualiang Jiang, and Huaiyu Yang J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/jp512914t • Publication Date (Web): 17 Feb 2015 Downloaded from http://pubs.acs.org on February 18, 2015


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

Two-Quartet G-quadruplexes Formed by DNA Sequences Containing Four Contiguous GG Runs Mingyan Qin, Zhuxi Chen, Qichao Luo, Yi Wen, Naixia Zhang, Hualiang Jiang, Huaiyu Yang* Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China

* To whom correspondence should be addressed. Tel: +86-21-50800619; Fax: +86-21-50807088; E-mail address: [email protected]


ACS Paragon Plus Environment


ABSTRACT

DNA sequences containing four contiguous GG runs (G2NxG2NyG2NzG2, G2 sequences) have the potential to form two-quartet G-quadruplexes. However, the prevalence, structure and function of G2 sequences have not been well studied. Here, bioinformatics analysis reveals the abundance of G2 sequences in the human genome and their enrichment in promoter regions. The density of G2 sequences in the genome and in promoters is much higher than that of G3 sequences (G3NxG3NyG3NzG3). Experiments show that the conformations and thermal stabilities of the two-quartet G-quadruplexes of G2 sequences are highly sensitive to the length and composition of the loops. Among the two-quartet G-quadruplexes, the parallel G-quadruplex with a loop length of 1 and the anti-parallel G-quadruplex with a loop length of 3 show high thermal stabilities. Additionally, the stable parallel G-quadruplexes are stacked into intermolecular higher-order structures. This work determines the prevalence of G2 sequences in the human genome and demonstrates that the G-quadruplex structures for certain loop lengths and compositions may be stable in vivo. Thus, more attention should be paid to the structure and function of the two-quartet G-quadruplex.


INTRODUCTION

The identification of G-quadruplexes formed by G-rich DNA sequences dates back to the 1960s [1]; increasing attention has been drawn to the structure and function of G-quadruplexes since the 1990s, when an abundance of G-rich sequences was found at the ends of telomeres [2-4]. The G-quadruplexes formed by the tandem repeats of TTAGGG in human telomere DNA are involved in telomere maintenance [5-7]. In addition to the G-rich sequences at telomere ends, there are also various G-rich DNA sequences containing four contiguous runs of three or more guanine bases (G3NxG3NyG3NzG3, G3 sequence) in other DNA regions, particularly in gene promoters [8-13]. The G-quadruplexes formed by G3 sequences in promoters modulate a variety of biological processes, including gene expression, epigenetic regulation and DNA replication [14-17].

Structurally, a G-quadruplex unit is built from stacked quartets (G-tetrads) that are stabilized by Hoogsteen hydrogen bonds among the guanines and by coordination interactions with the metal ions located between quartets [18-21]. The four guanine runs in a G3 sequence can form three or more quartets, which significantly lowers the enthalpy of the G-quadruplex unit. This favorable enthalpic contribution to stability is strong enough to outweigh the unfavorable entropic contribution, which is related to the length of the loops [21-22]. As a result, these G-quadruplexes are stable under physiological conditions over a range of loop lengths [21,23-27].

Theoretically, a sequence containing four contiguous runs of two guanine bases (G2NxG2NyG2NzG2, G2 sequence) has the potential to fold into two quartets, forming a two-quartet G-quadruplex. Indeed, such G-quadruplex structures have been reported for telomere DNAs and DNA aptamers [28-36]. G2 sequences are present in the human genome. However, their prevalence in the human genome is unknown. Furthermore, the structure and function of G2 sequences have not been well investigated. Because fewer quartets provide a smaller enthalpy contribution, the structural features of the two-quartet G-quadruplexes of G2 sequences may differ from those of G-quadruplexes that have three or more quartets. Additionally, different structures may be associated with distinct functions.

In this work, we analyzed the distribution of G2 sequences with loop lengths smaller than 8 in the human genome. Then, using G2(TxG2)3 (x = 1-7) sequences as models, we studied the structural features of G2 sequences. Nuclear magnetic resonance (NMR) and circular dichroism (CD) were used to determine whether these sequences could fold into G-quadruplexes and, if so, to determine their conformations and stabilities. Finally, to evaluate the effect of loop composition on G-quadruplex structures, we studied the structures of randomly selected human G2 sequences, whose loops consist of various nucleotides, in contrast to the model sequences, whose loops contain only thymine nucleotides.

EXPERIMENTAL METHODS

Bioinformatics analysis. The DNA sequences of the human genome and human promoters were downloaded from the Ensembl gene database (ENSEMBL GENES 77, GRCh38) via the Biomart (version 0.7) module (http://www.ensembl.org/). The search for these sequences was accomplished using the quadparser program developed by Shankar Balasubramanian’s group [9,37]. The program was developed to study DNA sequences of the form d(G≥3NxG≥3NyG≥3NzG≥3) (DNA sequences with four guanine runs of 3 or more guanines each and loop lengths of 1-7). In the earlier study, the quadparser program found 188836 G≥3 sequences (overlapping sequences were not included) in the ENSEMBL database, using the NCBI build 34 version of the human genome sequence [9]. In agreement with that study, we found 192711 G≥3 sequences in the ENSEMBL Genes 77 database using the human genome assembly GRCh38 (overlapping sequences were not included; if they were included, the number was 304826). The quadparser program can also be used to count DNA sequences of the form d(G≥2NxG≥2NyG≥2NzG≥2) (DNA sequences with four guanine runs of 2 or more guanines each and loop lengths of 1-7). To determine the total number of G2 sequences, the number of d(G≥3NxG≥3NyG≥3NzG≥3) sequences was subtracted from the number of d(G≥2NxG≥2NyG≥2NzG≥2) sequences. Thus, G≥3 sequences are not counted as potential G2 sequences. Because previous studies have revealed that G3 sequences with loop lengths from 1 to 7 can form G-quadruplexes, with stability decreasing as loop length increases, bioinformatics studies have mainly focused on G3 sequences with loop lengths of less than 8 [8-13]. For consistency with the G3 analyses, we focused on G2 sequences with loop lengths from 1 to 7. This restriction on loop length does not mean that G2 sequences with longer loops cannot fold into G-quadruplex structures.

Preparation of DNA. The oligonucleotides were purchased from Sangon Biotech Co., Ltd. (Shanghai, China). The oligonucleotides were dissolved in 50 mM Tris-HCl buffer (pH 7). KCl was added to the solution at a final concentration of 100 mM. Samples were heated for 10 min at 90 °C and then slowly cooled to room temperature before the NMR and CD analyses. The DNA concentrations were 10 µM.


Nuclear magnetic resonance. NMR experiments were performed on a Bruker Avance III 600 MHz spectrometer with a TCI cryoprobe at 25 °C. DNA samples of 3 mM were prepared in 0.5 mL of 90% H2O/10% D2O with 100 mM KCl (pH 7). Spectra were recorded using the p11 pulse sequence with jump-and-return water suppression. The water proton signal (4.7 ppm at 25 °C) was used as an internal reference.

Gel electrophoresis. Nondenaturing gel electrophoresis experiments were performed in 1× TAE buffer. Samples were loaded onto a nondenaturing 20% polyacrylamide gel and run for 2 h (4 V/cm) at room temperature. Gels were stained with SYBR Gold DNA gel stain purchased from Invitrogen China (Shanghai, China). The gel was imaged using a Tanon 2500 Gel Image System (Shanghai, China).

Circular dichroism spectroscopy and melting temperature measurement. CD spectra were collected on a Jasco J-810 spectropolarimeter (Japan) equipped with a thermoelectrically controlled cell holder. Spectra were scanned from 220 to 320 nm with a bandwidth of 1 nm at a scan speed of 50 nm per minute. For the CD melting experiments, 10 µM samples were heated from 20 °C to 80 °C at 1 °C/min. For each melting curve, the apparent melting temperature (T1/2) was determined graphically as the intersection of the melting curve with the median line between the low-temperature and high-temperature linear absorbance baselines [38]. For each DNA sample, three independent CD melting experiments were performed.
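The graphical T1/2 determination described above can be approximated numerically. The sketch below fits straight lines to the first and last few points of a melting curve, takes their average as the median line, and linearly interpolates the temperature at which the curve crosses it. The function names, the number of baseline points (`n_base`) and the synthetic two-state curve are assumptions made for illustration; they are not the authors' procedure or data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def apparent_t_half(temps, signal, n_base=8):
    """Temperature where the curve crosses the median of the two linear baselines."""
    s_lo, b_lo = fit_line(temps[:n_base], signal[:n_base])    # low-temperature baseline
    s_hi, b_hi = fit_line(temps[-n_base:], signal[-n_base:])  # high-temperature baseline
    # Signed distance of each data point from the median line
    diff = [y - 0.5 * ((s_lo * t + b_lo) + (s_hi * t + b_hi))
            for t, y in zip(temps, signal)]
    for i in range(len(diff) - 1):
        if diff[i] == 0.0:
            return float(temps[i])
        if diff[i] * diff[i + 1] < 0:  # sign change: crossing lies in this interval
            frac = diff[i] / (diff[i] - diff[i + 1])
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("melting curve never crosses the median line")

# Synthetic two-state curve (hypothetical): folded fraction, midpoint 48.3 °C,
# sampled every 1 °C from 20 to 80 °C as in the heating ramp described above.
temps = list(range(20, 81))
signal = [1.0 / (1.0 + math.exp((t - 48.3) / 2.0)) for t in temps]
t_half = apparent_t_half(temps, signal)
```

With real data the baseline windows need to be chosen by inspection, since a transition that starts close to 20 °C or ends close to 80 °C leaves too few points for a reliable linear fit.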

RESULTS

The presence of G2 sequences in the human genome. Numerous bioinformatics studies have suggested that G-rich DNA sequences are prevalent in the human genome [8-12,17,24-26,39]. Because G3 sequences have been found to form stable and functional G-quadruplexes, the prevalence of G3 sequences in the human genome has been extensively studied [14-15,40,41]. However, the prevalence of G2 sequences in the human genome is unknown. We conducted a genome-wide search to analyze the distribution of G2 sequences (G2NxG2NyG2NzG2) in the human genome using the quadparser program [9,37]. We obtained the total number of G2 sequences (8482461) by subtracting the number of G≥3 sequences from the number of G≥2 sequences. The density of G2 sequences in the human genome was 2.9 per kb, much higher than that of G≥3 sequences (0.1 per kb) [37]. This result shows that G2 sequences are much more prevalent than G≥3 sequences. We also analyzed the numbers of G2 sequences under various loop length constraints (Table 1).
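The counting procedure described above (search for G≥2 motifs, search for G≥3 motifs, subtract, then normalize by sequence length) can be sketched in a few lines of Python. This is only an illustrative regular-expression sketch and not the quadparser program itself; the function names, the `[ACGTN]` loop alphabet and the choice to also scan for the complementary C-run pattern are assumptions that the text does not specify.

```python
import re

def motif_pattern(min_run, base="G", max_loop=7):
    """Regex for four runs of >= min_run `base`, separated by loops of 1..max_loop nt."""
    run = f"{base}{{{min_run},}}"
    loop = f"[ACGTN]{{1,{max_loop}}}"
    return re.compile(f"(?:{run}{loop}){{3}}{run}")

def count_motifs(seq, min_run, both_strands=True):
    """Count non-overlapping matches (re.finditer never returns overlapping hits)."""
    seq = seq.upper()
    # Assumption: C runs on this strand stand in for G runs on the opposite strand.
    bases = ("G", "C") if both_strands else ("G",)
    return sum(1 for b in bases for _ in motif_pattern(min_run, b).finditer(seq))

def count_g2_only(seq, both_strands=True):
    """G2 count as described in the text: (number of G>=2 motifs) - (number of G>=3 motifs)."""
    return count_motifs(seq, 2, both_strands) - count_motifs(seq, 3, both_strands)

def density_per_kb(count, length_bp):
    """Motifs per kilobase of sequence."""
    return count / (length_bp / 1000.0)

# Toy sequences (hypothetical, for illustration only)
g2_toy = "TTGGTGGTGGTGGTT"        # four GG runs, loops of 1
g3_toy = "AAGGGTGGGTGGGTGGGAA"    # four GGG runs, loops of 1
print(count_g2_only(g2_toy), count_g2_only(g3_toy))
```

Because the subtraction is done on totals rather than motif by motif, the result depends on how each pattern resolves overlaps; a production analysis would need to fix that convention explicitly.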

Table 1. Density of G2 sequences with various loop lengths in the human genome and promoter regions. Lmax is the longest loop length. In total, 8482461 and 815402 G2 sequences were found in the human genome and promoter regions, respectively.

Lmax                            Total   1      2      3      4      5      6      7
Density in genome (per kb)      2.9     0.13   0.23   0.37   0.67   0.45   0.47   0.59
Density in promoter (per kb)    8.7     0.61   0.81   1.23   1.61   1.44   1.42   1.57


Previous studies also revealed that G≥3 sequences are enriched in the promoter regions of human genes, indicating that the G-quadruplexes formed by G≥3 sequences could be involved in gene regulation [14-15,31,40-42]. In this study, we analyzed the density of G2 sequences as a function of distance upstream of transcription start sites (TSSs). In accordance with the G≥3 sequence results, we found that within promoters the probability of finding a G2 sequence increases with proximity to the TSS (Figure 1). The density of G2 sequences in promoters (defined as the 1 kb region upstream of the TSS) is 8.7 per kb, much higher than the density of G2 sequences in the whole genome (2.9 per kb) and than that of G3 sequences (0.8 per kb) (Table 1) [37].
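The positional analysis described above amounts to binning motif positions by their distance upstream of the TSS and normalizing each bin by the total sequence length it covers. A minimal sketch follows; the function name, the 100 bp bin width and the toy input are hypothetical, and only the 1 kb promoter definition and the densities of 8.7 and 2.9 per kb come from the text.

```python
def density_by_distance(upstream_bp, n_promoters, bin_bp=100, promoter_bp=1000):
    """Return (bin_start, bin_end, motifs per kb) for each distance bin upstream of the TSS.

    upstream_bp: distance in bp (0 = at the TSS) of every motif found in any promoter.
    n_promoters: number of promoter regions that were scanned.
    """
    n_bins = promoter_bp // bin_bp
    counts = [0] * n_bins
    for d in upstream_bp:
        if 0 <= d < promoter_bp:
            counts[d // bin_bp] += 1
    kb_per_bin = n_promoters * bin_bp / 1000.0  # total sequence covered by one bin
    return [(i * bin_bp, (i + 1) * bin_bp, c / kb_per_bin)
            for i, c in enumerate(counts)]

# Toy input: four motifs found across two promoters (hypothetical numbers)
profile = density_by_distance([10, 50, 150, 950], n_promoters=2)

# Enrichment of promoters over the whole genome, using the densities reported in the text
enrichment = 8.7 / 2.9
```

With the reported densities, the promoter-to-genome ratio works out to about 3, which is the sense in which G2 sequences are enriched in promoters.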

Figure 1. Density (number of G2 sequences per kilobase) of G2 sequences as a function of distance upstream of the TSS. G2 sequences are enriched in the promoter region, and their density increases with proximity to the TSS.


NMR spectra of the model G2 sequences. The bioinformatics analysis shows that G2 sequences are abundant in the human genome. Therefore, it may be of biological importance to explore the structures of G2 sequences. In NMR spectra, the imino protons involved in Watson-Crick duplexes show chemical shifts of 13-14 ppm, whereas the characteristic guanine imino protons in a quartet have chemical shifts of 10-12.5 ppm. The guanine imino protons in a quartet also exchange more slowly with the solvent and remain detectable long after the sample is dissolved in D2O [41]. Using the DNA sequences G2(TxG2)3 (x = 1-7), which contain four GG runs separated by 1-7 thymine nucleotides, as models for G2 sequences, we first applied NMR to probe the structures of the G2 sequences. The TBA aptamer with the sequence GGTGGTGTGGTGG was used as a control; the eight peaks between 10 and 12.5 ppm in its 1D 1H NMR spectrum indicate the formation of a two-quartet G-quadruplex (Figure 2) [28-29,32,38].


Figure 2. 1D 1H NMR spectra of the G2(TxG2)3 (L1-L7, x = 1-7) sequences. The TBA aptamer with the sequence GGTGGTGTGGTGG was used as a control; its NMR spectrum shows eight major peaks and a minor peak between 10 and 12.5 ppm. In accordance with the TBA spectrum, the G2(TxG2)3 (x = 2-3) sequences show eight major well-resolved imino proton resonances and one or two minor peaks. The spectra of the other sequences are more complicated.

In Figure 2, the presence of characteristic peaks between 10 and 12.5 ppm in the 1D 1H NMR spectra of the G2(TxG2)3 (x = 1-5) sequences shows that these sequences form G-quadruplex structures. In particular, like the spectrum of TBA, the NMR spectra of the G2(TxG2)3 (x = 2-3) sequences show eight major well-resolved imino proton resonances with sharp line widths (~6 Hz) (Figure 2), indicating that the two-quartet G-quadruplex structure is the dominant conformation. Minor conformations are also present, as indicated by the weak resonances (Figure 2). The NMR spectrum of the G2(T4G2)3 sequence shows twelve major well-resolved imino proton resonances and one minor peak, indicating the formation of more than one two-quartet G-quadruplex structure. The G2(TxG2)3 (x = 1 and 5) sequences most likely also form G-quadruplexes. However, there is no dominant G-quadruplex conformation, because the characteristic peaks in their NMR spectra are relatively indistinct. Instead, there may be various G-quadruplex conformations, or there may be intermolecular higher-order G-quadruplexes and other conformations [42,43]. The very broad profiles of the NMR spectra of the G2(TxG2)3 (x = 6-7) sequences indicate that the quadruplex structure may be unfavorable for sequences with loop lengths ≥ 6 (Figure 2).

CD spectra of the model G2 sequences. The NMR data suggest that the G2(TxG2)3 (x = 1-5) sequences form G-quadruplexes. We next tried to distinguish the conformations of the G-quadruplexes. Although assigning G-quadruplex conformations from CD spectra is not always reliable [44], a CD spectrum with a positive peak near 265 nm and a negative peak near 240 nm is usually interpreted as indicating parallel conformations, and a CD spectrum with two positive peaks, near 240 nm and 295 nm, and a negative peak near 265 nm is usually interpreted as indicating anti-parallel conformations [18,45-48]. We used CD to study the structures of the G2(TxG2)3 (x = 1-7) sequences in K+ solutions at room temperature. As a control, the CD spectra of these oligonucleotides were also recorded in Li+ solutions, in which the G-quadruplex structure is not favorable [49-51].
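The empirical CD rules quoted above can be written down as a simple decision procedure. The sketch below is a deliberately crude illustration: it looks only at the sign of the signal near 240, 265 and 295 nm, and, as the text cautions, such assignments are not always correct. The function names, the ±3 nm averaging window, the `min_signal` threshold and the toy spectra are all assumptions, not measured data.

```python
def mean_near(spectrum, center_nm, half_width_nm=3):
    """Average CD signal within +/- half_width_nm of center_nm."""
    vals = [v for wl, v in spectrum.items() if abs(wl - center_nm) <= half_width_nm]
    if not vals:
        raise ValueError(f"no data points near {center_nm} nm")
    return sum(vals) / len(vals)

def classify_cd(spectrum, min_signal=0.5):
    """Label a CD spectrum {wavelength_nm: signal} from its sign pattern at 240/265/295 nm."""
    s240, s265, s295 = (mean_near(spectrum, wl) for wl in (240, 265, 295))

    def pos(s):
        return s > min_signal

    def neg(s):
        return s < -min_signal

    if pos(s265) and pos(s295):
        # Positive bands at both ~265 and ~295 nm fit neither rule cleanly
        return "mixed or hybrid-like"
    if pos(s265) and neg(s240):
        return "parallel-like"
    if pos(s240) and pos(s295) and neg(s265):
        return "anti-parallel-like"
    return "unassigned"

# Toy spectra (hypothetical values, arbitrary units)
parallel_toy = {240: -2.0, 265: 6.0, 295: 0.1}
antiparallel_toy = {240: 1.5, 265: -2.0, 295: 4.0}
labels = [classify_cd(parallel_toy), classify_cd(antiparallel_toy)]
```

Comparing each K+ spectrum with the corresponding Li+ control, as done in this study, remains necessary before applying any such rule, because a sequence that does not fold gives a spectrum that the sign pattern alone cannot interpret.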

Figure 3. CD spectra of the G2(TxG2)3 (x = 1-7) sequences. The wavelength varied from 220 to 320 nm. (A-G) The spectra are recorded in both K+ solutions (red) and Li+ solutions (black). (H) All the spectra of the G2(TxG2)3 (x = 1-7) sequences recorded in K+ solutions are shown together for comparison.

The CD data are consistent with the NMR data. For sequences with loop lengths ≥ 6, the CD spectra recorded in K+ solutions are almost the same as those recorded in Li+ solutions (Figure 3F, G), indicating the absence of the G-quadruplex structure. In contrast, the CD spectra of the G2(TxG2)3 (x = 1-5) sequences show the signals of G-quadruplexes (Figure 3A-E). When the loop length is 1, the CD spectrum shows a strong positive peak near 265 nm and a negative peak near 240 nm (Figure 3A), indicating the presence of parallel G-quadruplexes. For the G2(TxG2)3 (x = 3-5) sequences, two positive peaks are found near 240 nm and 295 nm, and a negative peak is present near 265 nm, indicating the anti-parallel conformation. Interestingly, the CD spectrum of the G2(T2G2)3 sequence shows two positive peaks and two negative peaks (Figure 3B). The two positive peaks are near 295 nm and 260 nm, and the two negative peaks are near 235 nm and 275 nm. The NMR signals are clear, suggesting that this spectrum does not represent mixed populations of parallel and anti-parallel complexes, but it is not clear what structure the sequence adopts. We infer the formation of a hybrid structure containing both parallel and anti-parallel conformational features, because the CD spectrum contains positive peaks near both 295 nm and 260 nm, which are characteristic of anti-parallel and parallel conformations, respectively.


Figure 4. CD melting profiles and apparent melting temperatures of the G2(TxG2)3 (x = 1-5) sequences. CD melting was monitored at 265 nm (L1) or 295 nm (L2-L5). (A) Melting profiles recorded at a DNA concentration of 10 µM. (B) Apparent melting temperatures of the G2 sequences at DNA concentrations between 10 and 200 µM. The apparent melting temperature of G2(T1G2)3 increases with increasing DNA concentration, whereas the apparent melting temperatures of the other sequences are independent of DNA concentration.

Thermal stabilities of the G-quadruplexes of the model G2 sequences. The above results reveal that the G2(TxG2)3 (x = 1-5) sequences can form G-quadruplexes in the presence of potassium ions at room temperature. We next explored the thermal stabilities of the G-quadruplexes using the CD melting method. For the G2(TxG2)3 (x = 2-4) sequences, the NMR data suggest that the two-quartet G-quadruplex structure is the dominant conformation at room temperature. Consistently, their melting temperatures were significantly higher than room temperature (Figure 4). In contrast, the apparent melting temperature of the G2(T5G2)3 sequence, with a loop length of 5, is 34 ± 2 °C, which is not much higher than room temperature. Compared with the G2(TxG2)3 (x = 2-4) sequences, the G2(T5G2)3 sequence is less thermally stable, which most likely results in various conformations at room temperature, in agreement with the NMR data.

Interestingly, the G2(TG2)3 sequence, with a loop length of 1, shows the highest thermal stability, with an apparent melting temperature of 61 ± 2 °C. Its apparent melting temperature increases with increasing DNA concentration, whereas the apparent melting temperatures of the other sequences are independent of DNA concentration (Figure 4). The CD data suggest that this sequence adopts the parallel G-quadruplex conformation. Previous studies of three-layer G-quadruplexes suggested that the parallel G-quadruplex conformation can form an intermolecular higher-order structure and shows strong thermal stability [50]. Our NMR data for the G2(TG2)3 sequence are similar to the NMR data for the intermolecular higher-order structure formed by three-layer G-quadruplexes [43]. Therefore, we propose that the parallel two-quartet G-quadruplex of the G2(TG2)3 sequence can form an intermolecular higher-order structure.


Figure 5. NMR and CD spectra of randomly selected human G2 sequences with loop lengths of 1 (A) and 3 (B), compared with the model systems.

The structures of several randomly selected human G2 sequences. The above studies on model G2 sequences with loops composed of thymine nucleotides suggest that the structure of the two-quartet G-quadruplexes depends on the loop length. In contrast to the model sequences, the loops of human G2 sequences are not solely composed of thymine nucleotides. To test whether the two-quartet G-quadruplexes are sensitive to the loop composition, we studied the structures of 15 randomly selected human G2 sequences with various loop compositions. We found that for the G2 sequences with loop lengths of 1 and 3, the CD spectra show the same pattern as those of the model sequences. We therefore propose that the G2 sequences with a loop length of 1 form parallel G-quadruplexes and that the sequences with a loop length of 3 adopt the anti-parallel conformation (Figure 5). The similarity of the NMR signals between the model sequences and the human sequences with loop lengths of 1 and 3 further supports this conclusion (Figure 5). It should be noted that the complicated NMR signal of sequence S3, with a loop length of 3 (Figure 5B), indicates that other conformations may form in addition to the anti-parallel structure. Moreover, we measured the CD signals of another ten randomly selected human G2 sequences with loop lengths of 1 or 3. The CD data for these sequences further support the assertion that human G2 sequences with loop lengths of 1 and 3 can form parallel and anti-parallel G-quadruplexes, respectively (Figure S1). Additionally, the parallel and anti-parallel G-quadruplexes of the human sequences with loop lengths of 1 and 3 show high thermal stabilities (Table S1 of the Supporting Information).

However, for the human sequences with loop lengths of 2, 4 and 5, the NMR and CD spectra (Figure 6) and the melting temperatures (Table S1 of the Supporting Information) of sequences with the same loop length show significant diversity, indicating that other G-quadruplexes or hairpins may form. Among these sequences, we noticed that little G-quadruplex formation is detected for any of the three randomly selected DNA sequences with a loop length of 4, even though the model sequence G2(T4G2)3 can form anti-parallel G-quadruplexes. Thus, whether G2 sequences fold into G-quadruplex structures and, if so, which conformations they adopt depend on the loop composition.


Figure 6. NMR and CD spectra of randomly selected human G2 sequences with loop lengths of 2 (A), 4 (B) and 5 (C), compared with the model systems.

DISCUSSION Our genome-wide search suggests the abundance of G2 sequences in human genome, especially in the promoter region (8.7 per kb). The density of G2 sequences (2.9 per kb) is much higher than that of G3 sequences (0.1 per kb). Characterization of the structural features of these sequences can provide important information for the future study on their functions. However, previous studies of the structures and functions of Gquadruplexes usually focused on G3 sequences.22,31,53-56 In this study, we used NMR, CD and gel electrophoresis to study the structures of the G-quadruplexes formed by G2 18

ACS Paragon Plus Environment

Page 19 of 30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

sequences. We find that whether a G2 sequence could form a G-quadruplex is highly dependent on the loop length and composition. The model G2 sequences with short loop lengths can form G-quadruplexes, whereas the G-quadruplex structure is unfavorable for the model sequences with loop lengths ≥ 6. Although the model sequences with loop lengths of 4 or 5 could fold into G-quadruplexes, several human sequences with the same loop lengths but loop compositions that differed from that of model sequences could not. Both model and human G2 sequences with loop lengths of 1 to 3 could form G-quadruplex structures. In agreement with previous studies,9,37 G2 sequences with loop lengths of 1 to 3 are among the most common G≥2 sequences and the most common set of loop lengths was {1, 1, 1} (G2 sequences with loop lengths of 1), which accounted for ~4.6% of all the G2 sequences (Table S2 of the Supporting Information). Previous studies have showed that three-quartet G-quadruplexes with a loop length of 1 adopt the parallel conformation, and with the increases in the loop length, the parallel conformation transforms to the anti-parallel conformation.52,54,55 This rule of conformational change may also be applicable to the two-quartet G-quadruplexes of certain G2 sequences. For example, by comprehensively analyzing the NMR, CD and melting temperature data of the model G2 sequences, we conclude that the two-quartet G-quadruplexes with loop lengths of 1 form the parallel conformation, and the twoquartet G-quadruplexes with loop lengths of 3-5 prefer the anti-parallel conformation. Previous studies on three-quartet G-quadruplexes reveal that parallel G-quadruplexes are stacked into intermolecular higher-order structures.52,54,55 Our study shows that this stacking is also applicable to two-quartet G-quadruplexes, because the dependence of 19

ACS Paragon Plus Environment

The Journal of Physical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 20 of 30

stability on DNA concentration (Figure 4B) and the multiple gel bands (Figure S2) suggest the formation of intermolecular higher-order structures for the G2(T1G2)3 sequence. The extremely pronounced hysteresis in the melting and annealing profiles of the G2(T1G2)3 sequence (Figure S3) also supports the formation of intermolecular higher-order structures.

The NMR data indicate that the G2(T2G2)3 sequence adopts a two-quartet G-quadruplex. However, the CD spectrum alone cannot distinguish the conformation of the G-quadruplex. Notably, the CD spectrum of the G2(T2G2)3 sequence shows two positive peaks and two negative peaks (Figure 3B). The positive peaks near 295 nm and 260 nm might correspond to the 295 nm peak of anti-parallel three-quartet G-quadruplexes and the 265 nm peak of parallel three-quartet G-quadruplexes, respectively. The shift from 265 nm to 260 nm is likely due to the presence of the negative peak near 275 nm. We propose that this DNA might fold into a hybrid structure containing conformational features of both parallel and anti-parallel G-quadruplexes.

Another similarity between the three-quartet and two-quartet G-quadruplexes is that the two-quartet G-quadruplex conformations of G2 sequences also depend on the loop composition. For the human G2 sequences with loop lengths of 2, 4 and 5, sequences with the same loop length but different loop compositions form different structures (Figure 6). However, the structures of the G2 sequences with loop lengths of 1 or 3 do not appear to depend significantly on the loop composition. All the tested human G2 sequences with a loop length of 1 form the parallel conformation, and the sequences with a loop length of 3 adopt the anti-parallel conformation. This behavior may arise because the high thermal stability of the G-quadruplexes with a loop length of 1 or 3 can outweigh the influence of a change in loop composition.

Numerous studies of three-quartet G-quadruplexes formed by G3 sequences have shown that they are involved in biological processes such as telomere maintenance, gene expression, epigenetic regulation and DNA replication.5-17 Recently, a few reports have also described the structures and biological functions of two-quartet G-quadruplexes, such as those in the promoters of the oncogene HIF1α and of TK1.57,58 Our findings suggest that more attention should be paid to the structure and function of G2 sequences. Because their structures depend on the length and composition of the loops, G2 sequences show great structural diversity, which may be associated with distinct functions. In particular, attention may first be directed to G-quadruplexes with high thermal stabilities, such as those formed by sequences with loop lengths of 1 or 3. Further study should also confirm whether a single G2 sequence with a loop length of 1 can fold into a parallel G-quadruplex in vivo, because a single sequence lacks the multiple G2 sequences needed to form an intermolecular higher-order structure.

It should be noted that for each G2 sequence used in this study, the three loops have the same length. G2 sequences with three loops of inhomogeneous lengths have not been studied here; these sequences may show even more structural diversity. Moreover, this study did not investigate the folding and unfolding mechanisms of the G-quadruplexes. The folding kinetics of a two-layer G-quadruplex could indicate whether the folding time scale is biologically relevant; thus, further efforts should be made to measure these kinetics.
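As an illustration of the loop notation used in this discussion, the following minimal Python sketch splits a G2 sequence into its three loops and reports whether the loops share the same length and composition. It is not part of the original analysis. The function names `build_g2` and `loop_profile` are hypothetical, and the sketch assumes that loops contain no guanine, so that the four GG runs can be identified unambiguously.

```python
import re

# Assumption (not from the paper): loops contain no guanine, so the
# four GG runs of a G2 sequence can be located without ambiguity.
_G2_PATTERN = re.compile(r"GG([ACT]+)GG([ACT]+)GG([ACT]+)GG")


def build_g2(loop, repeats=3):
    """Build a sequence of the form G2(loop G2)n, e.g. G2(T1G2)3 for loop='T'."""
    return "GG" + (loop + "GG") * repeats


def loop_profile(seq):
    """Return loop strings and lengths for a G2 sequence, or None if it does not fit."""
    match = _G2_PATTERN.fullmatch(seq.upper())
    if match is None:
        return None
    loops = list(match.groups())
    lengths = [len(loop) for loop in loops]
    return {
        "loops": loops,
        "lengths": lengths,
        # True for the homogeneous-loop sequences considered in this study
        "same_length": len(set(lengths)) == 1,
        "same_composition": len(set(loops)) == 1,
    }


if __name__ == "__main__":
    print(loop_profile(build_g2("T")))    # G2(T1G2)3
    print(loop_profile(build_g2("TT")))   # G2(T2G2)3
    print(loop_profile("GGTGGTTAGGCGG"))  # loops of unequal length
```

Under these assumptions, a sequence such as GGTGGTTAGGCGG would be flagged as having inhomogeneous loop lengths, the class of sequences that was not examined here.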


CONCLUSIONS
In the current work, we studied the G-quadruplex structures formed by G-rich DNA sequences containing only two guanines in each of the four guanine runs. The structures of these sequences depend strongly on the loop length and composition. Although the G-quadruplexes of G2 sequences with long loops are not as stable as those of G3 sequences, the melting temperatures of certain G2 sequences, such as those with loop lengths of 1 or 3, can still exceed 50 °C. In addition, bioinformatics analysis revealed the prevalence of G2 sequences in the human genome; these sequences have the potential to fold into G-quadruplex structures. We speculate that the two-layer G-quadruplex structures formed by these sequences in human promoters may modulate biological processes such as gene transcription and DNA replication.
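To make the idea of searching a genome for G2 sequences concrete, the sketch below scans a DNA string for putative G2NxG2NyG2NzG2 motifs with a regular expression. It is a hedged illustration only and is not the quadparser program used in this work. The function name `find_g2_motifs`, the default loop range of 1 to 5 nucleotides, and the restriction of loops to non-guanine bases are all assumptions made for the example.

```python
import re


def find_g2_motifs(seq, min_loop=1, max_loop=5):
    """Return (start, motif) pairs for putative G2 motifs in seq.

    Assumptions (hypothetical, for illustration only):
    - each G run is exactly GG (no flanking G, no G inside loops);
    - each loop is min_loop..max_loop nucleotides long.
    """
    loop = rf"[ACT]{{{min_loop},{max_loop}}}"
    # (?<!G) and (?!G) keep the motif from starting or ending inside a longer G run
    pattern = re.compile(rf"(?<!G)GG(?:{loop}GG){{3}}(?!G)")
    # finditer reports non-overlapping matches, scanning left to right
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_g2_motifs("ACATGGTGGTGGTGGAAC"))
    print(find_g2_motifs("AGGGTGGTGGTGGA"))
```

In this sketch a run of three guanines, as in the second example, is not counted as a GG run, so that sequence yields no match. A real genome-wide survey would also need to handle the complementary strand and overlapping motifs, which are omitted here.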

ASSOCIATED CONTENT
Supporting Information
Melting temperatures of human G2 sequences; the 10 most common and 10 least common sets of G2 sequence loop lengths; CD spectra of human G2 sequences; gel electrophoresis of the G2(T1G2)3 sequence; and CD melting and annealing profiles of G2(T1G2)3. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION
Corresponding Author
*Tel: +86-21-50800619; Fax: +86-21-50807088; E-mail address: [email protected].
Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENTS
We thank Professor Shankar Balasubramanian's group for providing us with the quadparser program. This work was supported by the National Natural Science Foundation of China [grant numbers 21422208, 81173027, 81230076], the Instrument Developing Project of the Chinese Academy of Sciences [grant number YZ201245], the Hi-Tech Research and Development Program of China [grant number 2012AA020302], and the SA-SIBS Scholarship Program.

REFERENCES
(1) Gellert, M.; Lipsett, M. N.; Davies, D. R. Helix formation by guanylic acid. Proc. Natl. Acad. Sci. U. S. A. 1962, 48, 2013-2018.
(2) Harley, C. B.; Futcher, A. B.; Greider, C. W. Telomeres shorten during aging of human fibroblasts. Nature 1990, 345, 458-460.
(3) Laughlan, G.; Murchie, A. I.; Norman, D. G.; Moore, M. H.; Moody, P. C.; Lilley, D. M.; Luisi, B. The high-resolution crystal structure of a parallel-stranded guanine tetraplex. Science 1994, 265, 520-524.
(4) Aboul-ela, F.; Murchie, A. I.; Norman, D. G.; Lilley, D. M. Solution structure of a parallel-stranded tetraplex formed by d(TG4T) in the presence of sodium ions by nuclear magnetic resonance spectroscopy. J. Mol. Biol. 1994, 243, 458-471.
(5) Ray, S.; Bandaria, J. N.; Qureshi, M. H.; Yildiz, A.; Balci, H. G-quadruplex formation in telomeres enhances POT1/TPP1 protection against RPA binding. Proc. Natl. Acad. Sci. U. S. A. 2014, 111, 2990-2995.
(6) Wang, F.; Tang, M. L.; Zeng, Z. X.; Wu, R. Y.; Xue, Y.; Hao, Y. H.; Pang, D. W.; Zhao, Y.; Tan, Z. Telomere- and telomerase-interacting protein that unfolds telomere G-quadruplex and promotes telomere extension in mammalian cells. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 20413-20418.
(7) Neidle, S.; Parkinson, G. Telomere maintenance as a target for anticancer drug discovery. Nat. Rev. Drug Discovery 2002, 1, 383-393.
(8) Todd, A. K. Bioinformatics approaches to quadruplex sequence location. Methods 2007, 43, 246-251.
(9) Huppert, J. L.; Balasubramanian, S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005, 33, 2908-2916.
(10) Rawal, P.; Kummarasetti, V. B. R.; Ravindran, J.; Kumar, N.; Halder, K.; Sharma, R.; Mukerji, M.; Das, S. K.; Chowdhury, S. Genome-wide prediction of G4 DNA as regulatory motifs: Role in Escherichia coli global regulation. Genome Res. 2006, 16, 644-655.
(11) Kostadinov, R.; Malhotra, N.; Viotti, M.; Shine, R.; D'Antonio, L.; Bagga, P. GRSDB: a database of quadruplex forming G-rich sequences in alternatively processed mammalian pre-mRNA sequences. Nucleic Acids Res. 2006, 34, D119-D124.
(12) Kikin, O.; D'Antonio, L.; Bagga, P. S. QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences. Nucleic Acids Res. 2006, 34, W676-W682.
(13) Todd, A. K.; Johnston, M.; Neidle, S. Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 2005, 33, 2901-2907.
(14) Siddiqui-Jain, A.; Grand, C. L.; Bearss, D. J.; Hurley, L. H. Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 11593-11598.
(15) Catasti, P.; Chen, X.; Moyzis, R. K.; Bradbury, E. M.; Gupta, G. Structure-function correlations of the insulin-linked polymorphic region. J. Mol. Biol. 1996, 264, 534-545.
(16) Mirkin, S. M. DNA replication: driving past four-stranded snags. Nature 2013, 497, 449-450.
(17) Zhang, C.; Liu, H. H.; Zheng, K. W.; Hao, Y. H.; Tan, Z. DNA G-quadruplex formation in response to remote downstream transcription activity: long-range sensing and signal transducing in DNA double helix. Nucleic Acids Res. 2013, 41, 7144-7152.
(18) Phan, A. T. Human telomeric G-quadruplex: structures of DNA and RNA sequences. FEBS J. 2010, 277, 1107-1117.
(19) Burge, S.; Parkinson, G. N.; Hazel, P.; Todd, A. K.; Neidle, S. Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res. 2006, 34, 5402-5415.
(20) Sket, P.; Plavec, J. Tetramolecular DNA quadruplexes in solution: insights into structural diversity and cation movement. J. Am. Chem. Soc. 2010, 132, 12724-12732.
(21) Rachwal, P. A.; Brown, T.; Fox, K. R. Effect of G-tract length on the topology and stability of intramolecular DNA quadruplexes. Biochemistry 2007, 46, 3036-3044.
(22) Guedin, A.; Alberti, P.; Mergny, J. L. Stability of intramolecular quadruplexes: sequence effects in the central loop. Nucleic Acids Res. 2009, 37, 5559-5567.
(23) Perrone, R.; Nadai, M.; Poe, J. A.; Frasson, I.; Palumbo, M.; Palu, G.; Smithgall, T. E.; Richter, S. N. Formation of a unique cluster of G-quadruplex structures in the HIV-1 nef coding region: implications for antiviral activity. PLoS ONE 2013, 8, e73121.
(24) Beaudoin, J. D.; Jodoin, R.; Perreault, J. P. New scoring system to identify RNA G-quadruplex folding. Nucleic Acids Res. 2014, 42, 1209-1223.
(25) Tluckova, K.; Marusic, M.; Tothova, P.; Bauer, L.; Sket, P.; Plavec, J.; Viglasky, V. Human papillomavirus G-quadruplexes. Biochemistry 2013, 52, 7207-7216.
(26) Lexa, M.; Kejnovsky, E.; Steflova, P.; Konvalinova, H.; Vorlickova, M.; Vyskot, B. Quadruplex-forming sequences occupy discrete regions inside plant LTR retrotransposons. Nucleic Acids Res. 2014, 42, 968-978.
(27) Maizels, N.; Gray, L. T. The G4 genome. PLoS Genet. 2013, 9, e1003468.
(28) Bunka, D. H.; Stockley, P. G. Aptamers come of age - at last. Nat. Rev. Microbiol. 2006, 4, 588-596.
(29) Macaya, R. F.; Schultze, P.; Smith, F. W.; Roe, J. A.; Feigon, J. Thrombin-binding DNA aptamer forms a unimolecular quadruplex structure in solution. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 3745-3749.
(30) Bates, P. J.; Laber, D. A.; Miller, D. M.; Thomas, S. D.; Trent, J. O. Discovery and development of the G-rich oligonucleotide AS1411 as a novel treatment for cancer. Exp. Mol. Pathol. 2009, 86, 151-164.
(31) Tran, P. L. T.; Mergny, J. L.; Alberti, P. Stability of telomeric G-quadruplexes. Nucleic Acids Res. 2011, 39, 3282-3294.
(32) Amrane, S.; Ang, R. W. L.; Tan, Z. M.; Li, C.; Lim, J. K. C.; Lim, J. M. W.; Lim, K. W.; Phan, A. T. A novel chair-type G-quadruplex formed by a Bombyx mori telomeric sequence. Nucleic Acids Res. 2009, 37, 931-938.
(33) Lim, K. W.; Amrane, S.; Bouaziz, S.; Xu, W. X.; Mu, Y. G.; Patel, D. J.; Luu, K. N.; Phan, A. T. Structure of the human telomere in K+ solution: a stable basket-type G-quadruplex with only two G-tetrad layers. J. Am. Chem. Soc. 2009, 131, 4301-4309.
(34) Smirnov, I.; Shafer, R. H. Effect of loop sequence and size on DNA aptamer stability. Biochemistry 2000, 39, 1462-1468.
(35) Dapic, V.; Abdomerovic, V.; Marrington, R.; Peberdy, J.; Rodger, A.; Trent, J. O.; Bates, P. J. Biophysical and biological properties of quadruplex oligodeoxyribonucleotides. Nucleic Acids Res. 2003, 31, 2097-2107.
(36) Nagatoishi, S.; Isono, N.; Tsumoto, K.; Sugimoto, N. Loop residues of thrombin-binding DNA aptamer impact G-quadruplex stability and thrombin binding. Biochimie 2011, 93, 1231-1238.
(37) Huppert, J. L.; Balasubramanian, S. G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 2007, 35, 406-413.
(38) Mergny, J. L.; De Cian, A.; Ghelab, A.; Sacca, B.; Lacroix, L. Kinetics of tetramolecular quadruplexes. Nucleic Acids Res. 2005, 33, 81-94.
(39) Wanrooij, P. H.; Uhler, J. P.; Shi, Y. H.; Westerlund, F.; Falkenberg, M.; Gustafsson, C. M. A hybrid G-quadruplex structure formed between RNA and DNA explains the extraordinary stability of the mitochondrial R-loop. Nucleic Acids Res. 2012, 40, 10334-10344.
(40) Dai, J.; Dexheimer, T. S.; Chen, D.; Carver, M.; Ambrus, A.; Jones, R. A.; Yang, D. An intramolecular G-quadruplex structure with mixed parallel/antiparallel G-strands formed in the human BCL-2 promoter region in solution. J. Am. Chem. Soc. 2006, 128, 1096-1098.
(41) Adrian, M.; Heddi, B.; Phan, A. T. NMR spectroscopy of G-quadruplexes. Methods 2012, 57, 11-24.
(42) Ambrus, A.; Chen, D.; Dai, J. X.; Bialis, T.; Jones, R. A.; Yang, D. Z. Human telomeric sequence forms a hybrid-type intramolecular G-quadruplex structure with mixed parallel/antiparallel strands in potassium solution. Nucleic Acids Res. 2006, 34, 2723-2735.
(43) Hansel, R.; Lohr, F.; Foldynova-Trantirkova, S.; Bamberg, E.; Trantirek, L.; Dotsch, V. The parallel G-quadruplex structure of vertebrate telomeric repeat sequences is not the preferred folding topology under physiological conditions. Nucleic Acids Res. 2011, 39, 5768-5775.
(44) Vorlickova, M.; Kejnovska, I.; Sagi, J.; Renciuk, D.; Bednarova, K.; Motlova, J.; Kypr, J. Circular dichroism and guanine quadruplexes. Methods 2012, 57, 64-75.
(45) Bejugam, M.; Sewitz, S.; Shirude, P. S.; Rodriguez, R.; Shahid, R.; Balasubramanian, S. Trisubstituted isoalloxazines as a new class of G-quadruplex binding ligands: Small molecule regulation of c-kit oncogene expression. J. Am. Chem. Soc. 2007, 129, 12926-12927.
(46) Heddi, B.; Phan, A. T. Structure of human telomeric DNA in crowded solution. J. Am. Chem. Soc. 2011, 133, 9824-9833.
(47) Miller, M. C.; Buscaglia, R.; Chaires, J. B.; Lane, A. N.; Trent, J. O. Hydration is a major determinant of the G-quadruplex stability and conformation of the human telomere 3' sequence of d(AG3(TTAG3)3). J. Am. Chem. Soc. 2010, 132, 17105-17107.
(48) Nagatoishi, S.; Tanaka, Y.; Tsumoto, K. Circular dichroism spectra demonstrate formation of the thrombin-binding DNA aptamer G-quadruplex under stabilizing-cation-deficient conditions. Biochem. Biophys. Res. Commun. 2007, 352, 812-817.
(49) Lee, J. Y.; Yoon, J. M.; Kihm, H. W.; Kim, D. S. Structural diversity and extreme stability of unimolecular Oxytricha nova telomeric G-quadruplex. Biochemistry 2008, 47, 3389-3396.
(50) Wlodarczyk, A.; Grzybowski, P.; Patkowski, A.; Dobek, A. Effect of ions on the polymorphism, effective charge, and stability of human telomeric DNA. Photon correlation spectroscopy and circular dichroism studies. J. Phys. Chem. B 2005, 109, 3594-3605.
(51) Hardin, C. C.; Watson, T.; Corregan, M.; Bailey, C. Cation-dependent transition between the quadruplex and Watson-Crick hairpin forms of d(CGCG3GCG). Biochemistry 1992, 31, 833-841.
(52) Smargiasso, N.; Rosu, F.; Hsia, W.; Colson, P.; Baker, E. S.; Bowers, M. T.; De Pauw, E.; Gabelica, V. G-quadruplex DNA assemblies: loop length, cation identity, and multimer formation. J. Am. Chem. Soc. 2008, 130, 10208-10216.
(53) Guedin, A.; Gros, J.; Alberti, P.; Mergny, J. L. How long is too long? Effects of loop size on G-quadruplex stability. Nucleic Acids Res. 2010, 38, 7858-7868.
(54) Hazel, P.; Huppert, J.; Balasubramanian, S.; Neidle, S. Loop-length-dependent folding of G-quadruplexes. J. Am. Chem. Soc. 2004, 126, 16405-16415.
(55) Webba da Silva, M.; Trajkovski, M.; Sannohe, Y.; Ma'ani Hessari, N.; Sugiyama, H.; Plavec, J. Design of a G-quadruplex topology through glycosidic bond angles. Angew. Chem. Int. Ed. 2009, 48, 9167-9170.
(56) Tippana, R.; Xiao, W.; Myong, S. G-quadruplex conformation and dynamics are determined by loop length and sequence. Nucleic Acids Res. 2014, 42, 8106-8114.
(57) Chen, H.; Long, H. T.; Cui, X. J.; Zhou, J.; Xu, M.; Yuan, G. Exploring the formation and recognition of an important G-quadruplex in a HIF1α promoter and its transcriptional inhibition by a benzo[c]phenanthridine derivative. J. Am. Chem. Soc. 2014, 136, 2583-2591.
(58) Basundra, R.; Kumar, A.; Amrane, S.; Verma, A.; Phan, A. T.; Chowdhury, S. A novel G-quadruplex motif modulates promoter activity of human thymidine kinase 1. FEBS J. 2010, 277, 4254-4264.
