Sequence Requirements of Intrinsically Fluorescent G-Quadruplexes


Article

Taťána Majerová, Tereza Streckerová, Lucie Bednárová, and Edward Curtis
Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00252 • Publication Date (Web): 13 Jun 2018


Biochemistry is published by the American Chemical Society, 1155 Sixteenth Street N.W., Washington, DC 20036. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Sequence requirements of intrinsically fluorescent G-quadruplexes

Taťána Majerová¹, Tereza Streckerová¹,², Lucie Bednárová¹, and Edward A. Curtis¹*

¹ The Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 166 10, Czech Republic

² Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague 166 10, Czech Republic

Keywords: G-quadruplex, loop, fluorescence, GFP, fluorescent G-quadruplex


ABSTRACT

G-quadruplexes are four-stranded nucleic acid structures typically stabilized by GGGG tetrads. These structures are intrinsically fluorescent, which expands the known scope of nucleic acid function and raises the possibility that they could eventually be used as signaling components in label-free sensors constructed from DNA or RNA. In this study we systematically investigated the effects of mutations in tetrads, loops, and overhanging nucleotides on the fluorescence intensity and maximum emission wavelength of more than 500 sequence variants of a reference DNA G-quadruplex. Some of these mutations modestly increased the fluorescence intensity of this G-quadruplex, while others shifted its maximum emission wavelength. Mutations that increased fluorescence intensity were distinct from those that increased maximum emission wavelength, suggesting a tradeoff between these two biochemical properties. Fluorescence intensity and maximum emission wavelength were also correlated with multimeric state: the most fluorescent G-quadruplexes were monomers, while those with the highest maximum emission wavelengths typically formed dimeric structures. Oligonucleotides containing multiple G-quadruplexes were in some cases more fluorescent than those containing a single G-quadruplex, although this depended on both the length and sequence of the spacer linking the G-quadruplexes. These experiments provide new insights into the properties of fluorescent G-quadruplexes, and should aid in the development of improved label-free nucleic acid sensors.


INTRODUCTION

Under certain conditions the hydrozoan jellyfish Aequorea victoria produces fluorescent green light (1). Biochemical fractionation experiments using material prepared from thousands of jellyfish revealed that this fluorescence is produced by a 238 amino acid protein called green fluorescent protein (GFP) (2-3). The core of GFP contains three amino acids that cyclize to generate an aromatic chromophore called 4-(p-hydroxybenzylidene)imidazolidin-5-one (HBI) (4-5). In the absence of GFP this chromophore is not fluorescent, but in the context of a cylindrical cavity created by the three-dimensional fold of the protein, its fluorescence is significantly enhanced (6-7). In addition to increasing its fluorescence, the structural context of the chromophore in GFP can alter its properties in other ways. For example, by mutagenizing amino acids within and near the chromophore in the tertiary structure of the protein, it has been possible to generate blue, cyan, and yellow versions of GFP as well as variants with shifted absorption spectra (3,8). These variants have been particularly useful for FRET studies in which a GFP with a given maximum emission wavelength is tethered to a second GFP with an overlapping excitation wavelength by a linker that changes its conformation in the presence of a ligand of interest (3). GFP is widely used as a genetic reporter to analyze protein expression and localization (9). It is also a powerful tool in bioimaging applications such as fluorescence microscopy (9).

The importance of GFP in biotechnology and basic research has stimulated the search for GFP-like nucleic acid structures that generate fluorescence. The first example of such a motif was discovered during the characterization of aptamers that bind the small-molecule fluorophore malachite green (10). When bound to these aptamers, the fluorescence intensity of malachite green is enhanced more than 2000-fold (11). Aptamers that enhance the fluorescence intensity of
several other fluorophores have recently been developed, including the spinach aptamer (12) and the mango aptamer (13). In some cases these aptamers can be used as sensors or reporters in the context of living cells (14-16). For example, variants of the spinach aptamer have been constructed that generate an enhanced fluorescent signal in the presence of specific ligands such as SAM and ADP, and were used to measure the cellular concentrations of these metabolites in E. coli (14). The structural basis of fluorescence has also been elucidated for the spinach and mango aptamers (17-18). In both cases, a G-quadruplex with an unusual topology in the core of the structure binds the fluorophore using stacking energy and hydrogen bonding. Although motifs such as spinach and mango can be thought of as functional analogs of GFP, the mechanism by which they generate fluorescence is fundamentally different because it requires the presence of an external fluorophore.

Enhancement of nucleic acid fluorescence in the absence of an external fluorophore has also been described (19-21). This requires a folded DNA or RNA G-quadruplex structure, and the mechanism of enhancement is probably related to the aromatic nature of the nucleotide building blocks of G-quadruplexes. Guanine is weakly fluorescent by itself (22), and this is likely enhanced by the extended system of conjugation generated by a tetrad (23-25). Stabilization of tetrads by stacking interactions probably also plays a role by restricting molecular motions that inhibit fluorescence (26). The relative orientation of tetrads is also thought to be important (25,27), and probably explains why multimerization modulates the fluorescence of some G-quadruplexes (28-30). Fluorescence anisotropy experiments suggest that energy transfer occurs among the bases in fluorescent G-quadruplexes, and time-resolved studies indicate that the fluorescence lifetimes of G-quadruplexes are longer than those of nucleotides (25, 31-32). Although the signal generated by fluorescent G-quadruplexes is only about 20-fold above background, with a fluorescence quantum yield of approximately 10⁻³ (19, 33), their sequence requirements have only been explored to a limited extent (24). This raises the possibility that more comprehensive searches of sequence space could provide new insights into the mechanism by which fluorescence is enhanced, and also identify variants with improved fluorescent properties.

To investigate these possibilities we measured the fluorescence intensity and maximum emission wavelengths of approximately 500 variants of a reference G-quadruplex structure. Our library contained mutations in tetrads, loops, and overhanging nucleotides, and most mutations were present in several sequence backgrounds. A handful of mutants in this library were slightly more fluorescent than the reference G-quadruplex, and others had maximum emission wavelengths that were shifted relative to the starting construct. Mutations that increased fluorescence intensity were distinct from those that increased maximum emission wavelength, suggesting a tradeoff between these two biochemical properties. Fluorescence intensity and maximum emission wavelength were also correlated with multimeric state: the most fluorescent G-quadruplexes were monomers, while those with the highest maximum emission wavelengths typically formed dimeric structures. We also explored several strategies to enhance the fluorescence intensity of G-quadruplex structures. These experiments revealed that in some cases concatemerization can modestly increase the fluorescence of oligonucleotides containing multiple G-quadruplexes, although this enhancement depends on both the length and sequence of the spacer linking the G-quadruplexes. Taken together, our results provide new insights into the properties of fluorescent G-quadruplexes, and should aid in the development of improved label-free nucleic acid sensors.

MATERIALS/EXPERIMENTAL DETAILS

Reagents

Desalted DNA oligonucleotides, salts, and buffer were purchased from Sigma. Oligonucleotides were resuspended in Milli-Q water at a concentration of 100 µM and typically used without additional purification. Control experiments indicated that the fluorescent properties of these oligonucleotides were similar before and after HPLC purification (Figure S1). Longer oligonucleotides (i.e., those used in concatemerization experiments) were purified on PAGE gels. Stock solutions were stored at -20°C and thawed at room temperature before use.

Fluorescence measurements

In a typical assay, a 100 µM G-quadruplex stock solution (stored at -20°C) was thawed at room temperature. After vortexing, 15 µl was mixed with 60 µl of Milli-Q water. The solution was then heated at 65°C for 5 minutes, cooled at room temperature for 5 minutes, and mixed with 75 µl of 2× buffer. Final concentrations were 10 µM G-quadruplex in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1. After incubating for 30 minutes, the sample was excited at 290 nm, and the emission spectrum was typically measured from 330 nm to 500 nm using a FluoroMax-4 spectrofluorometer (Horiba Scientific). Some of the measurements in Figure 1D were made using a Spark fluorescent plate reader (Tecan), and we confirmed that results were similar to those obtained using the spectrofluorometer (Figure S2). Background fluorescence was determined by measuring the emission spectrum of a sample containing buffer alone, and was subtracted from each G-quadruplex measurement. Fluorescence quantum yields were also measured for several constructs (Figure S3). For these examples, fluorescence intensities (measured as described above) reflected the fluorescence quantum yields.
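The arithmetic behind this protocol can be sketched in a few lines of Python. The sketch below checks the final G-quadruplex concentration from the stated volumes, subtracts a buffer-only spectrum, and reads off the maximum emission wavelength. The function names are hypothetical, and the spectra are synthetic placeholders (a Gaussian peak centred at an arbitrary 385 nm), not measured data; the authors' actual analysis software is not described in the text.

```python
import math


def final_concentration(stock_uM, stock_ul, water_ul, buffer_ul):
    """Concentration after mixing stock, water, and 2x buffer (C1*V1 = C2*V2)."""
    return stock_uM * stock_ul / (stock_ul + water_ul + buffer_ul)


def subtract_background(sample, buffer_only):
    """Point-by-point subtraction of the buffer-only emission spectrum."""
    return [s - b for s, b in zip(sample, buffer_only)]


def emission_maximum(wavelengths_nm, intensities):
    """Return (wavelength, intensity) at the highest point of a spectrum."""
    peak = max(range(len(intensities)), key=intensities.__getitem__)
    return wavelengths_nm[peak], intensities[peak]


def percent_of_reference(intensity, reference_intensity):
    """Peak intensity expressed as a percentage of the reference construct."""
    return 100.0 * intensity / reference_intensity


# Volumes from the protocol: 15 µl of 100 µM stock, 60 µl water, 75 µl 2x buffer.
print(final_concentration(100, 15, 60, 75))  # 10.0 (µM)

# Synthetic spectra for illustration only (not data from the paper).
wavelengths = list(range(330, 501))          # 330 nm to 500 nm in 1 nm steps
background = [5.0] * len(wavelengths)
sample = [5.0 + 100.0 * math.exp(-(((w - 385) / 30.0) ** 2)) for w in wavelengths]

corrected = subtract_background(sample, background)
peak_nm, peak_intensity = emission_maximum(wavelengths, corrected)
print(peak_nm, peak_intensity)               # 385 100.0
```

Taking the single highest point is the simplest possible definition of the emission maximum; with noisy spectra, smoothing or peak fitting may be preferable.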

Native gels

In a typical assay, material from a 100 µM G-quadruplex stock solution was mixed with a trace amount (≤ 10 nM) of a radiolabeled version of the sequence. The solution was then heated at 65°C for 5 minutes, cooled at room temperature for 5 minutes, and mixed with buffer. Final concentrations were 10 µM G-quadruplex in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1. After incubating at room temperature for 30 minutes, the material was analyzed on 10% native PAGE gels containing 5 mM KCl in both the gel and buffer. Gels were run at 300 V for 30 minutes and scanned using a Typhoon phosphorimager. For more information see references 34 and 35.

RESULTS/DISCUSSION

Fluorescence of G-quadruplexes with mutated tetrads

In several recent studies we investigated the effects of mutating the central tetrad in a parallel-strand G-quadruplex on its ability to bind GTP, promote peroxidase reactions, and form multimeric structures (34-36). The reference construct used in these experiments is fluorescent (Figures 1A-C and Figure S3), and we speculated that mutations in tetrads could affect its spectroscopic properties. We were motivated in part by the idea that the fluorescence of G-quadruplexes is generated by the extended conjugation of GGGG tetrads (23-25). If true, G-quadruplexes containing mutations in tetrads, especially those that form noncanonical tetrads, might exhibit unusual fluorescent properties. Moreover, since stacking of tetrads at the interfaces of multimeric G-quadruplexes can in some cases alter their properties (28-30), we hypothesized that mutations in our library previously shown to induce formation of higher-order structures (35) might also affect G-quadruplex fluorescence.

[Figure 1, panels A-E. Panel B plots fluorescence (FU × 1000) against wavelength (nm) for the reference construct (Ref) and a random sequence pool (Rand). Panels D and E are heat maps with positions 2 and 6 on one axis and positions 11 and 15 on the other; panel D is binned by percent fluorescence intensity relative to the reference construct, and panel E by shift in emission wavelength relative to the reference construct.]

Figure 1. Effect of mutations in tetrads on G-quadruplex fluorescence. (A) Primary sequence and proposed topology of the reference construct used in these experiments. Mutated positions in the central tetrad and loops are numbered. (B) Fluorescence spectrum of the reference construct used in these experiments (blue curve = ref) compared to a random sequence pool of the same length (orange curve = rand). (C) Maximum fluorescence intensity of the reference construct compared to that of a 17-nucleotide random sequence pool. (D) Heat map showing the relative fluorescence intensity of all possible variants of the central tetrad in the reference G-quadruplex. (E) Maximum emission wavelength of all possible variants of the central tetrad in the reference G-quadruplex. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. Experiments in panels B and C were performed using a G-quadruplex with the sequence GGGTGGGAAGGGTGGGA. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs.

To address these questions in a systematic way, we characterized the fluorescence of all possible sequence variants of the central tetrad of our reference G-quadruplex. Each of these 256 variants was folded in a buffer optimized with respect to DNA concentration, potassium concentration, and pH (Figure S4). After exciting at a wavelength of 290 nm (the optimal excitation wavelength of the reference construct), emission was measured between 330 nm and 500 nm. This revealed that, as is the case for G-quadruplexes with other biochemical activities (34-35), approximately 10% of these mutants were above our cutoff for fluorescence intensity, including several examples as fluorescent as the reference construct (Figure 1D). The sequence requirements of fluorescent G-quadruplexes were more similar to those of G-quadruplexes that bind GTP and form tetramers than to those of G-quadruplexes that promote peroxidase reactions and form dimers. In particular, all G-quadruplexes containing a GGNN mutation in the central tetrad were above our cutoff for fluorescence intensity (compare to Figure 3 of reference 34 and Figure 5 of reference 35). We also noticed that the emission spectra of some variants were shifted relative to that of the reference construct (Figure 1E and Figure S5). This was most pronounced for three variants with emission peaks more than 20 nm higher than that of the reference construct (Figure 1E and Figure S5). In contrast to the most fluorescent mutants in the library, each of these variants contained an NNGG rather than a GGNN mutation in the central tetrad of the reference construct (Figure 1E). Control experiments indicated that the excitation wavelengths of these variants were not shifted (Figure S6). Taken together, these experiments indicate that mutations in the central tetrad of the reference construct can alter both its fluorescence intensity and maximum emission wavelength. They also suggest that mutations that enhance fluorescence are distinct from those that modulate maximum emission wavelength.
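The size of the central tetrad library follows from four positions with four possible nucleotides each (4⁴ = 256). The Python sketch below enumerates these variants from the reference sequence given in the Figure 1 caption. It assumes that the four-letter labels used in the text (for example AGGG or GGAG) list positions 2, 6, 11, and 15 in that order, which is consistent with the dimer and tetramer sequences quoted in the figure captions, and that "GGNN" and "NNGG" mean the first or last two of these positions are guanine. The function and variable names are hypothetical.

```python
from itertools import product

REFERENCE = "GGGTGGGAAGGGTGGGA"    # reference sequence (Figure 1 caption)
TETRAD_POSITIONS = (2, 6, 11, 15)  # 1-based central tetrad positions (Figure 1 axes)
NUCLEOTIDES = "ACGT"


def mutate(sequence, positions, nucleotides):
    """Return a copy of sequence with the given 1-based positions replaced."""
    bases = list(sequence)
    for position, nucleotide in zip(positions, nucleotides):
        bases[position - 1] = nucleotide
    return "".join(bases)


def central_tetrad_library(reference=REFERENCE):
    """Map each four-letter tetrad label (e.g. 'AGGG') to its full sequence."""
    return {
        "".join(combo): mutate(reference, TETRAD_POSITIONS, combo)
        for combo in product(NUCLEOTIDES, repeat=len(TETRAD_POSITIONS))
    }


library = central_tetrad_library()
print(len(library))     # 256 variants, including the unmutated reference
print(library["AGGG"])  # GAGTGGGAAGGGTGGGA
print(library["GGAG"])  # GGGTGGGAAGAGTGGGA

# Assumed reading of the labels: GGNN keeps G at positions 2 and 6,
# NNGG keeps G at positions 11 and 15.
ggnn = [label for label in library if label.startswith("GG")]
nngg = [label for label in library if label.endswith("GG")]
print(len(ggnn), len(nngg))  # 16 16
```

Under this reading, each of the GGNN and NNGG classes contains 16 sequences, and both include the unmutated GGGG reference.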

Fluorescence of G-quadruplexes with mutated loops

We next turned our attention to the effects of mutations in loops on G-quadruplex fluorescence. A previous study of G-quadruplexes with poly(A), poly(C), and poly(T) loops of different lengths showed that G-quadruplexes with shorter loops tend to be more fluorescent than those with longer ones, and that G-quadruplexes with adenosine-rich loops tend to have higher maximum emission wavelengths than those with other sequences (24). To investigate the effects of loop sequences on G-quadruplex fluorescence in a more systematic way, a library was prepared that contained all possible mutations (A, C or T but not G) at loop positions 4, 8, 9 and 13 in the reference G-quadruplex (Figure 1A). The reference construct also contains a 3' adenosine overhang (Figure 1A), but this position was not mutated in our library. Because G-quadruplex fluorescence can be affected by multimerization (28-30), mutations were analyzed in two additional sequence backgrounds: that of a dimer-forming sequence that contains an AGGG mutation in the central tetrad of the reference construct, and that of a tetramer-forming sequence that contains a GGAG mutation in the central tetrad of the reference construct. As was previously observed for G-quadruplexes that bind GTP, promote peroxidase reactions, and form multimeric structures (34-35), point mutations in loops typically had only small effects on the fluorescence intensity of these G-quadruplexes (Figure 2). However, when G-quadruplexes contained multiple mutations in loops, results were typically different in the three sequence backgrounds (Figure 2 and Figure S7). The range of mutational effects was smallest in the context of the reference construct: all variants containing multiple mutations in loops were above our cutoff for fluorescence intensity, and the signal generated by the least fluorescent variant was about half of that of the most fluorescent variant (Figure 2B). On the other hand, fluorescence intensities varied more than 5-fold in both the dimer-forming and tetramer-forming backgrounds. Furthermore, ~10% of variants containing multiple mutations in loops were below our cutoff for fluorescence intensity in the background of a dimer-forming sequence (Figure 2D), and ~70% were below our cutoff in the background of a tetramer-forming sequence (Figure 2F). These differences suggest that loop nucleotides play more important roles in multimers than in monomers. This could be related to our previous observation that these multimers are less stable than the monomeric reference construct (35), and might also reflect the number of mutated positions in each type of structure (a DNA strand containing a point mutation will be present in four copies in a tetramer and two copies in a dimer, but in only one copy in a monomer).

[Figure 2, panels A-F. Panels A, C, and E are bar charts of percent fluorescence intensity for point mutations (A, C, or T) at loop positions 4, 8, 9, and 13 in the GGGG, AGGG, and GGAG backgrounds, respectively. Panels B, D, and F are heat maps of fluorescence (high to low) with positions 4 and 8 on one axis and positions 9 and 13 on the other.]

Figure 2. Effect of mutations in loops on G-quadruplex fluorescence intensity. (A) Effects of point mutations in loop positions 4, 8, 9, and 13 on the fluorescence intensity of the reference G-quadruplex. Mutations were made in the context of the sequence GGGTGGGAAGGGTGGGA. (B) Heat map showing the effects of all possible mutations (A, C or T but not G) in loop positions 4, 8, 9, and 13 on the fluorescence intensity of the reference G-quadruplex described in panel A. (C) and (D) Same as panels A and B, but mutations were made in the context of a dimeric G-quadruplex with the sequence GAGTGGGAAGGGTGGGA. (E) and (F) Same as panels A and B, but mutations were made in the context of a tetrameric G-quadruplex with the sequence GGGTGGGAAGAGTGGGA. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. The heights of bars in panels A, C, and E indicate the average of three experiments, and error bars indicate one standard deviation. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs.

Effects of mutations in loops on maximum emission wavelengths were also typically background-specific, and largest in the dimeric sequence background (Figure 3). Sixteen mutations increased the maximum emission wavelength of the reference G-quadruplex by more than 10 nm in this background, and seven mutations increased it by more than 20 nm (Figure 3D). In contrast, only four loop mutations had this effect in the background of the monomeric reference construct (Figure 3B), and none did in the tetrameric sequence background (Figure 3F). A heterodimer made up of constructs with different maximum emission wavelengths had a maximum emission wavelength equal to the average of the two homodimers, suggesting that these changes behave like partially dominant mutations (Figure S8). When G-quadruplexes contained multiple mutations in loops, both fluorescence intensity and maximum emission wavelength could be approximated using a model in which mutational effects at different loop positions are independent (Figure S9; see also references 34, 37 and 38). This is consistent with the idea that the nucleotides in the short loops of these G-quadruplexes do not physically interact with one another, although this conclusion is not necessarily applicable to G-quadruplexes with longer loops. Taken together, these experiments indicate that loop nucleotides can affect the fluorescent properties of G-quadruplexes. They also show that this is dependent on both the sequence background and the biochemical property being tested.
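The loop library and the independence model can be illustrated with a short Python sketch. With A, C, or T allowed at four loop positions, each background yields 3⁴ = 81 sequences, or 243 across the three backgrounds. The text states only that mutational effects at different loop positions are independent; the specific forms used below (multiplying single-mutant relative intensities, adding single-mutant wavelength shifts) are assumptions made for illustration, as are all function names and the example numbers, which are placeholders rather than values from the paper.

```python
from itertools import product

BACKGROUNDS = {  # sequences from the Figure 2 and Figure 3 captions
    "GGGG (monomer)": "GGGTGGGAAGGGTGGGA",
    "AGGG (dimer)": "GAGTGGGAAGGGTGGGA",
    "GGAG (tetramer)": "GGGTGGGAAGAGTGGGA",
}
LOOP_POSITIONS = (4, 8, 9, 13)  # 1-based loop positions
LOOP_NUCLEOTIDES = "ACT"        # G was excluded from the loop library


def loop_library(background):
    """Map each four-letter loop label (e.g. 'TAAT') to its full sequence."""
    variants = {}
    for combo in product(LOOP_NUCLEOTIDES, repeat=len(LOOP_POSITIONS)):
        bases = list(background)
        for position, nucleotide in zip(LOOP_POSITIONS, combo):
            bases[position - 1] = nucleotide
        variants["".join(combo)] = "".join(bases)
    return variants


def predicted_relative_intensity(loops, background_loops, single_mutant_intensity):
    """Assumed multiplicative model: multiply single-mutant relative intensities."""
    prediction = 1.0
    for position, new, old in zip(LOOP_POSITIONS, loops, background_loops):
        if new != old:
            prediction *= single_mutant_intensity[(position, new)]
    return prediction


def predicted_wavelength_shift(loops, background_loops, single_mutant_shift_nm):
    """Assumed additive model: sum single-mutant emission wavelength shifts."""
    return sum(
        single_mutant_shift_nm[(position, new)]
        for position, new, old in zip(LOOP_POSITIONS, loops, background_loops)
        if new != old
    )


libraries = {name: loop_library(seq) for name, seq in BACKGROUNDS.items()}
print(sum(len(variants) for variants in libraries.values()))  # 243

# Placeholder single-mutant effects (not measured values).
example_intensity = {(4, "A"): 0.5, (8, "C"): 0.5}
example_shift_nm = {(4, "A"): 5.0, (8, "C"): 10.0}
print(predicted_relative_intensity("ACAT", "TAAT", example_intensity))  # 0.25
print(predicted_wavelength_shift("ACAT", "TAAT", example_shift_nm))     # 15.0
```

Comparing such predictions with the measured values for multiply mutated variants is one way to test independence; large systematic deviations would point to interactions between loop positions.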

[Figure 3, panels A-F. Panels A, C, and E are bar charts of emission wavelength (nm) for point mutations (A, C, or T) at loop positions 4, 8, 9, and 13 in the GGGG, AGGG, and GGAG backgrounds, respectively. Panels B, D, and F are heat maps of change in wavelength (high to low) with positions 4 and 8 on one axis and positions 9 and 13 on the other.]

Figure 3. Effect of mutations in loops on G-quadruplex maximum emission wavelength. (A) Effects of point mutations in loop positions 4, 8, 9, and 13 on the maximum emission wavelength of the reference G-quadruplex. Mutations were made in the context of the sequence GGGTGGGAAGGGTGGGA. (B) Heat map showing the effects of all possible mutations (A, C or T but not G) in loop positions 4, 8, 9, and 13 on the maximum emission wavelength of the reference G-quadruplex described in panel A. (C) and (D) Same as panels A and B, but mutations were made in the context of a dimeric G-quadruplex with the sequence GAGTGGGAAGGGTGGGA. (E) and (F) Same as panels A and B, but mutations were made in the context of a tetrameric G-quadruplex with the sequence GGGTGGGAAGAGTGGGA. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. The heights of bars in panels A, C, and E indicate the average of three experiments, and error bars indicate one standard deviation. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs.

Effect of overhanging nucleotides on G-quadruplex fluorescence

To gain additional insight into the sequence requirements of fluorescent G-quadruplexes, we investigated the extent to which overhanging nucleotides could affect their properties. Such overhangs can inhibit multimerization by interfering with stacking interactions (35, 39-43), which in some cases can modulate G-quadruplex fluorescence (28-30). Non-guanosine nucleotides in G-quadruplexes can also interact with tetrads to form unusual structures including pentads (44), hexads (45), and heptads (46), and such a structure formed by three loop adenosines and a tetrad has been proposed to be responsible for the shifted maximum emission wavelength of the GGAGGAGGAGG G-quadruplex (24). To determine the effects of overhanging nucleotides on the G-quadruplexes studied here, different combinations of 5' and 3' overhangs were added to model constructs with different multimeric states (34-35). As was the case for mutations in loops, the effects of overhanging nucleotides depended on both sequence background and the biochemical property being tested (Figure 4). Such overhangs modestly increased the fluorescence of the monomeric reference construct, but had no effect on its maximum emission wavelength (Figures 4A-C). In contrast, the addition of overhanging nucleotides to the 3' (but not 5') terminus of a dimeric G-quadruplex slightly decreased its fluorescence while significantly increasing its maximum emission wavelength (Figures 4D-F).

ACS Paragon Plus Environment

16

Page 17 of 35

[Figure 4: panels A-I. Panels A, D, and G show proposed secondary structures (GGGG, AGGG, and GGAG tetrads). Panels B, E, and H plot percent fluorescence intensity, and panels C, F, and I plot emission wavelength (nm), for constructs with A, C, or T added as a 5' overhang, a 3' overhang, or both a 5' and 3' overhang.]

Figure 4. Effect of overhanging nucleotides on G-quadruplex fluorescence. (A) Proposed secondary structure of a monomeric reference G-quadruplex with the sequence GGGTGGGAAGGGTGGG. (B) Effect of overhanging nucleotides on the fluorescence intensity of this G-quadruplex. (C) Effect of overhanging nucleotides on the maximum emission wavelength of this G-quadruplex. (D) Proposed secondary structure of a dimeric reference G-quadruplex with the sequence GAGTGGGAAGGGTGGG. Note the potentially unstable isolated 5' tetrad. (E) Effect of overhanging nucleotides on the fluorescence intensity of this G-quadruplex. (F) Effect of overhanging nucleotides on the maximum emission wavelength of this G-quadruplex. (G) Proposed secondary structure of a tetrameric reference G-quadruplex with the sequence GGGTGGGAAGAGTGGG. Note the potentially unstable isolated 3' tetrad. (H) Effect of overhanging nucleotides on the fluorescence intensity of this G-quadruplex. (I) Effect of overhanging nucleotides on the maximum emission wavelength of this G-quadruplex. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. The heights of bars indicate the average of three experiments, and error bars indicate one standard deviation. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs.

This was most pronounced for a variant containing a 3' adenosine, which had an emission peak 50 nm higher than a G-quadruplex lacking this adenosine (Figure 4F). The overhang does not change the multimeric state of the G-quadruplex (35), indicating that this is not the reason for the increase in maximum emission wavelength. The shift in maximum emission wavelength was also observed for a heterodimer made up of one strand with a 3' adenosine overhang and one strand without it, indicating that a single adenosine was sufficient for this effect (Figure S9). Consistent with this interpretation, dimers containing two or three adenosines at each 3' terminus did not have higher maximum emission wavelengths than those containing a single adenosine (Figure S10). Overhanging nucleotides also increased maximum emission wavelengths in the context of a tetrameric G-quadruplex, but effects were smaller, and appeared to depend more on the identity of the added nucleotide than the position of the overhang (Figures 4G-I). In each background, mutational effects of 5' and 3' overhangs were independent (Figure S9). This is consistent with previously proposed models, which suggest that 5' and 3' termini are on opposite ends of both monomeric and multimeric structures (reference 35 and Figure 4). The patterns observed here were different in many cases from those reported in a previous study, in which 5' overhangs decreased maximum emission wavelengths while 3' overhangs either increased maximum emission wavelengths or had no effect (33). Taken together, these results indicate that overhanging nucleotides can modulate the fluorescence properties of G-quadruplexes in complex ways. They also highlight the important role played by sequence background in determining their effects on G-quadruplex fluorescence.
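One way to make the notion of independent 5' and 3' overhang effects concrete is an additive model, in which the shift in maximum emission wavelength caused by both overhangs together equals the sum of the shifts caused by each alone. The sketch below assumes this additive form, which is an interpretation rather than the analysis used in the study, and the wavelengths in the usage example are placeholders rather than measurements.

```python
def predicted_double_overhang(reference_nm, five_prime_nm, three_prime_nm):
    """Predict the emission maximum of a construct carrying both overhangs,
    assuming the shifts caused by each overhang simply add (an assumption)."""
    shift_5 = five_prime_nm - reference_nm
    shift_3 = three_prime_nm - reference_nm
    return reference_nm + shift_5 + shift_3

def deviation_from_additivity(reference_nm, five_prime_nm, three_prime_nm, double_nm):
    """Observed minus predicted emission maximum. Values near zero are
    consistent with independent effects under this additive model."""
    predicted = predicted_double_overhang(reference_nm, five_prime_nm, three_prime_nm)
    return double_nm - predicted

if __name__ == "__main__":
    # Placeholder wavelengths in nm, chosen only to exercise the functions;
    # they are not measurements from this study.
    print(predicted_double_overhang(385.0, 385.0, 435.0))
    print(deviation_from_additivity(385.0, 385.0, 435.0, 434.0))
```

A deviation that is small relative to the experimental error would be consistent with independence; a large deviation would suggest that the two overhangs interact.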

Tradeoff between fluorescence intensity and maximum emission wavelength

The sequence requirements of the most fluorescent G-quadruplexes identified in this study were distinct from those with the highest maximum emission wavelengths, suggesting a tradeoff between fluorescence intensity and high maximum emission wavelength. Some of these differences could be rationalized by the sequence background, but loop sequences were also important (Figures 5A-B and Figure S11). For example, the most fluorescent G-quadruplexes in the loop library usually contained a central GGGG tetrad, a C or T at position 4, and a T at position 8 (Figure 5B and Figure S11). On the other hand, G-quadruplexes with high maximum emission wavelengths typically contained an AGGG mutation in the central tetrad of the reference construct and an A at position 4 (Figure 5B and Figure S11). Fluorescence intensity and maximum emission wavelength were also correlated with different multimeric states: the most fluorescent G-quadruplexes in the library were monomers (Figure 5C), while those with the highest maximum emission wavelengths typically formed dimeric structures (Figure 5D). In addition, the range of maximum emission wavelengths in the dimeric sequence background is considerably larger than that observed in the monomeric or tetrameric backgrounds (Figure 5A).

[Figure 5: panels A-D. Panels A and B plot percent fluorescence intensity against emission wavelength (nm). Panels C and D show native PAGE of constructs 1-10 alongside ds and ss markers, with dimer and monomer positions indicated.]

Figure 5. Mutations that increase G-quadruplex fluorescence intensity and maximum emission wavelength. (A) Relationship between maximum emission wavelength and fluorescence intensity for G-quadruplexes with mutations in loops. Blue = variants with a central GGGG tetrad and HHHH loop sequences (H = A, C or T); orange = variants with an AGGG mutation in the central tetrad of the reference G-quadruplex and HHHH loop sequences; green = variants with a GGAG mutation in the central tetrad of the reference G-quadruplex and HHHH loop sequences. (B) The sequence requirements of the most fluorescent G-quadruplexes in the loop library are distinct from those with the highest maximum emission wavelengths. Blue = variants with a GGGG central tetrad and YTHH loop sequences (Y = C or T, H = A, C or T); orange = variants with an AGGG mutation in the central tetrad of the reference G-quadruplex and AHHH loop sequences. (C) Multimeric state of the ten most fluorescent G-quadruplex variants in the loop library as determined by native PAGE. (D) Multimeric state of the ten G-quadruplex variants in the loop library with the highest maximum emission wavelengths as determined by native PAGE. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs. ss = a single-stranded oligonucleotide with the sequence GACTGCCTCGTCACGAT; ds = this sequence and its reverse complement.

The observation that monomers are more fluorescent than multimers could be related to the number of stable tetrads per molecule in each type of structure. Previous work suggests that these monomers contain three stable tetrads per molecule, while dimers and tetramers contain fewer because in each case, at least one of the terminal tetrads is unstable (35). If the fluorescence intensity of a G-quadruplex is related to the number of stable tetrads it contains, monomers would be expected to be more fluorescent than dimers or tetramers, as was observed in this study. The extent of folding could also influence fluorescence, but appears to be less important than multimeric state for the G-quadruplexes analyzed here. For example, fluorescence intensity is only weakly correlated with the height of the ~260 nm peak in the circular dichroism spectra of these G-quadruplexes (Figure S12) (47). Although the height of this peak is not a direct readout of the fraction of folded G-quadruplex (it also reflects the multimeric state of the G-quadruplex and the number of tetrads per folded structure) (48), similar results were obtained when only G-quadruplexes with a given multimeric state were analyzed (Figure S12).

The observation that dimers tend to have higher maximum emission wavelengths than monomers or tetramers is more difficult to rationalize because the mechanisms by which the maximum emission wavelength of a G-quadruplex is determined are not well understood (33). Several recent studies suggest that excimers generated by stacking interactions can modulate the fluorescent properties of G-quadruplexes (28-30), and interactions between tetrads and nucleotides in loops have also been proposed to be responsible for shifted maximum emission wavelengths in some G-quadruplexes (24). Although our results do not clearly distinguish these two possibilities, they do demonstrate that significant changes in maximum emission wavelength can be achieved in the absence of changes in multimeric state (Figure 4F and Figure 9A of reference 35). Furthermore, since such shifts only occur when an overhanging nucleotide is close to a stable tetrad (i.e. 3' overhangs increase the maximum emission wavelengths of dimers while 5' overhangs have no effect), this observation is most consistent with the idea that interactions between unpaired nucleotides and tetrads can shift the maximum emission wavelength of fluorescent G-quadruplexes (24).
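The comparison between fluorescence intensity and the height of the ~260 nm circular dichroism peak discussed in this section could be quantified with a Pearson correlation coefficient, computed over all constructs and then separately for each multimeric state. The sketch below is a minimal version of that calculation. The choice of Pearson correlation, the function names, and the record layout are assumptions, since the text does not specify how the correlation was evaluated.

```python
import math
from collections import defaultdict

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

def pearson_by_state(records):
    """Compute r separately for each multimeric state.

    `records` is a hypothetical list of (state, cd_peak_height, fluorescence)
    tuples; this layout is an assumption made for illustration.
    """
    grouped = defaultdict(lambda: ([], []))
    for state, cd_height, fluorescence in records:
        grouped[state][0].append(cd_height)
        grouped[state][1].append(fluorescence)
    return {state: pearson_r(xs, ys) for state, (xs, ys) in grouped.items()}
```

Grouping by multimeric state matters here because the CD peak height also reflects multimeric state and the number of tetrads per structure, so a pooled correlation alone could be misleading.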

Effect of concatemerization on G-quadruplex fluorescence

After characterizing the effects of mutations in tetrads, loops, and overhanging nucleotides on the properties of fluorescent G-quadruplexes, we next investigated ways to increase the fluorescence intensity of these structures. The biochemical activities of functional nucleic acids can sometimes be increased by generating constructs that contain multiple copies of a motif. For example, the GTP-binding activity of RNA oligonucleotides containing the CA motif GTP aptamer increases with repeat number (49). This strategy has also been reported to increase the fluorescent signal generated by the malachite green aptamer (50). On the other hand, the addition of flanking sequence to an aptamer or ribozyme can be inhibitory, presumably because it increases the propensity of the motif to misfold (51). Furthermore, stacking can quench the signal generated by fluorescent nucleotide analogs such as 2-aminopurine (52), and it could have a similar effect on arrays of stacked G-quadruplexes. For these reasons, the expected effect of concatemerization on G-quadruplex fluorescence intensity was not obvious.

To address this question experimentally, we measured the fluorescence intensity of a series of constructs in which either two or three copies of the reference G-quadruplex were connected by linkers that differ in terms of both sequence and length (Figure 6A). This revealed that both linker sequence and length influence the fluorescence intensity of concatemerized G-quadruplexes. Concatemers containing adenosine linkers were slightly more fluorescent than those containing thymidine linkers, and significantly more fluorescent than those containing cytosine linkers (Figures 6B-C). For constructs with adenosine or thymidine linkers, fluorescence intensity increased with increasing linker length between one and ten nucleotides, but typically did not increase further for constructs with longer linkers (Figures 6B-D). On the other hand, fluorescence intensity decreased with linker length for constructs containing cytosine linkers, probably due to base pairing between cytosines and guanosines that form tetrads in G-quadruplexes (Figures 6B-C). In the case of constructs that contained adenosine or thymidine linkers of ten or twenty nucleotides, addition of a second G-quadruplex resulted in a two-fold increase in the fluorescent signal, but increases were more modest when a third G-quadruplex was added to the construct (Figures 6B-E). Synthesis of longer constructs could not be readily achieved using either chemical or enzymatic methods. This was likely due in part to the inhibitory effects of G-quadruplexes on DNA synthesis, which can lead to inefficient polymerization as well as amplification bias in the context of methods such as PCR (53-54).

Although encouraging in some respects, these findings also emphasize the need for significant improvements before fluorescent G-quadruplexes can be used as sensors. In particular, the fluorescence quantum yields of the most fluorescent G-quadruplexes identified in this study (Figure S3) are still several orders of magnitude lower than that of GFP (19,33,55), and the optimization of this parameter will represent a significant challenge for future research.
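The construct series described in this section can be sketched as a small design grid: two or three copies of the reference G-quadruplex joined by homopolymer linkers of one, three, ten, or twenty nucleotides. The code below assumes the 16-nucleotide reference sequence given in the Figure 6 caption and places linkers only between copies. The exact termini of the synthesized oligonucleotides are listed in Table S1 and may differ, and the function names are hypothetical.

```python
REFERENCE_UNIT = "GGGTGGGAAGGGTGGG"  # reference G-quadruplex sequence from the Figure 6 caption

def build_concatemer(copies, linker_base, linker_length, unit=REFERENCE_UNIT):
    """Join `copies` of `unit` with a homopolymer linker placed between adjacent copies.

    Assumption: no extra nucleotides are added at the 5' or 3' ends.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if len(linker_base) != 1 or linker_base not in "ACT":
        raise ValueError("linker_base must be one of A, C, or T")
    if linker_length < 0:
        raise ValueError("linker_length must be non-negative")
    linker = linker_base * linker_length
    return linker.join([unit] * copies)

def design_panel(copy_numbers=(2, 3), bases="ACT", lengths=(1, 3, 10, 20)):
    """Return a dict mapping (copies, base, length) to the construct sequence."""
    return {
        (n, base, length): build_concatemer(n, base, length)
        for n in copy_numbers
        for base in bases
        for length in lengths
    }

if __name__ == "__main__":
    panel = design_panel()
    print(len(panel))                  # number of designs in the grid
    print(panel[(2, "A", 3)])          # two copies joined by an AAA linker
```

Keeping the design as a dictionary keyed by copy number, linker base, and linker length makes it straightforward to pair each sequence with its measured fluorescence later.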


[Figure 6: panels A-E. Panel A is a schematic (concatemerize using linkers of different length and sequence; excite at 290 nm and measure emission spectrum). Panels B and C plot percent fluorescence intensity against linker length (1, 3, 10, 20) for poly(A), poly(C), and poly(T) linkers, for 2 and 3 copies of the G-quadruplex respectively. Panel D shows fluorescence spectra (FU × 1000 versus wavelength, 300-550 nm) for A1, A3, A10, and A20 linkers. Panel E plots percent fluorescence intensity against copy number (1, 2, 3) for the theoretical maximum and for A20, C20, and T20 linkers.]

Figure 6. Effect of concatemerization on G-quadruplex fluorescence intensity. (A) Design of constructs containing multiple copies of a fluorescent G-quadruplex linked by spacers that vary in both length and sequence. (B) Effect of spacer length and sequence on the fluorescence intensity of oligonucleotides containing two copies of a reference G-quadruplex with the sequence GGGTGGGAAGGGTGGG. (C) Effect of spacer length and sequence on the fluorescence intensity of oligonucleotides containing three copies of a reference G-quadruplex with the sequence GGGTGGGAAGGGTGGG. (D) Fluorescence spectra of oligonucleotides containing two copies of the reference G-quadruplex linked by poly(A) spacers of different lengths. (E) Effect of copy number on the fluorescence intensity of concatemerized G-quadruplexes connected by 20-nucleotide spacers. The bars labeled theoretical maximum indicate the expected values for a model in which each G-quadruplex makes an independent contribution to the fluorescence of the oligonucleotide. Experiments were performed at 10 µM G-quadruplex concentration in a buffer containing 1 M KCl and 20 mM HEPES pH 7.1 using an excitation wavelength of 290 nm. See Table S1 for more information about the sequences, fluorescence intensities, and maximum emission wavelengths of these constructs.
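The "theoretical maximum" in panel E follows from a model in which each G-quadruplex contributes independently to the signal, so the expected intensity scales linearly with copy number. The sketch below expresses that model and the fraction of the maximum achieved by a construct. Intensities are taken to be percentages relative to a single-copy reference, and the function names are assumptions introduced for illustration.

```python
def theoretical_maximum(single_copy_percent, copies):
    """Expected intensity if each copy contributes independently (linear scaling)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return single_copy_percent * copies

def fraction_of_maximum(observed_percent, single_copy_percent, copies):
    """Observed intensity as a fraction of the independent-contribution expectation."""
    expected = theoretical_maximum(single_copy_percent, copies)
    if expected <= 0:
        raise ValueError("expected intensity must be positive")
    return observed_percent / expected

if __name__ == "__main__":
    # With a single copy defined as 100 percent, two and three copies have
    # theoretical maxima of 200 and 300 percent under this model.
    print(theoretical_maximum(100.0, 2))
    print(theoretical_maximum(100.0, 3))
```

A fraction near 1 indicates that the copies behave independently, whereas a fraction well below 1 points to misfolding, quenching, or some other interaction between copies.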

ASSOCIATED CONTENT
Supporting Information. The Supporting Information is available free of charge on the ACS Publications website.
Figures S1-S12 (PDF)
Table S1 (PDF)


AUTHOR INFORMATION
Corresponding Author
* E-mail: [email protected]
Author Contributions
T.M., T.S., and E.A.C. designed the experiments. T.M., T.S., and L.B. performed the experiments. E.A.C. wrote the manuscript. All authors have given approval to the final version of the manuscript.
Funding Sources
This work was supported by an IOCB start-up grant awarded to E.A.C., InterBioMed LO 1302 from the Ministry of Education of the Czech Republic, and "Chemical biology for drugging undruggable targets (ChemBioDrug)" No. CZ.02.1.01/0.0/0.0/16_019/0000729 from the European Regional Development Fund (OP RDE).

Notes
The authors declare no competing financial interest.

ACKNOWLEDGMENT
We thank Sofia Kolesnikova for useful discussions and assistance with native PAGE. We also thank Jan Konvalinka and colleagues at the IOCB for useful discussions and support.

ABBREVIATIONS


DNA, deoxyribonucleic acid; RNA, ribonucleic acid; GFP, green fluorescent protein; HBI, 4-(p-hydroxybenzylidene)imidazolidin-5-one; FRET, Förster resonance energy transfer; SAM, S-adenosyl methionine; ADP, adenosine diphosphate; GTP, guanosine triphosphate.

REFERENCES
(1) Davenport, D., and Nicol, J.A.C. (1955) Luminescence of hydromedusae, Proc. R. Soc. London, Ser. B. 144, 399-411.
(2) Shimomura, O. (2005) The discovery of aequorin and green fluorescent protein, J. Microsc. 217, 1-15.
(3) Tsien, R.Y. (1998) The green fluorescent protein, Annu. Rev. Biochem. 67, 509-544.
(4) Shimomura, O. (1979) Structure of the chromophore of Aequorea green fluorescent protein, FEBS Lett. 104, 220-222.
(5) Cody, C.W., Prasher, D.C., Westler, W.M., Prendergast, F.G., and Ward, W.W. (1993) Chemical structure of the hexapeptide chromophore of the Aequorea green-fluorescent protein, Biochemistry 32, 1212-1218.
(6) Ormö, M., Cubitt, A.B., Kallio, K., Gross, L.A., Tsien, R.Y., and Remington, S.J. (1996) Crystal structure of the Aequorea victoria green fluorescent protein, Science 273, 1392-1395.
(7) Yang, F., Moss, L.G., and Phillips, G.N. Jr. (1996) The molecular structure of green fluorescent protein, Nat. Biotechnol. 14, 1246-1251.
(8) Shaner, N.C., Steinbach, P.A., and Tsien, R.Y. (2005) A guide to choosing fluorescent proteins, Nat. Methods 2, 905-909.


(9) Chudakov, D.M., Matz, M.V., Lukyanov, S., and Lukyanov, K.A. (2010) Fluorescent proteins and their applications in imaging living cells and tissues, Physiol. Rev. 90, 1103-1163.
(10) Grate, D., and Wilson, C. (1999) Laser-mediated, site-specific inactivation of RNA transcripts, Proc. Natl. Acad. Sci. USA. 96, 6131-6136.
(11) Babendure, J.R., Adams, S.R., and Tsien, R.Y. (2003) Aptamers switch on fluorescence of triphenylmethane dyes, J. Am. Chem. Soc. 125, 14716-14717.
(12) Paige, J.S., Wu, K.Y., and Jaffrey, S.R. (2011) RNA mimics of green fluorescent protein, Science 333, 642-646.
(13) Dolgosheina, E.V., Jeng, S.C., Panchapakesan, S.S., Cojocaru, R., Chen, P.S., Wilson, P.D., Hawkins, N., Wiggins, P.A., and Unrau, P.J. (2014) RNA mango aptamer-fluorophore: a bright, high-affinity complex for RNA labeling and tracking, ACS Chem. Biol. 9, 2412-2420.
(14) Paige, J.S., Nguyen-Duc, T., Song, W., and Jaffrey, S.R. (2012) Fluorescence imaging of cellular metabolites with RNA, Science 335, 1194.
(15) Strack, R.L., and Jaffrey, S.R. (2015) Live-cell imaging of mammalian RNAs with Spinach2, Methods Enzymol. 550, 129-146.
(16) Litke, J.L., You, M., and Jaffrey, S.R. (2016) Developing fluorogenic riboswitches for imaging metabolite concentration dynamics in bacterial cells, Methods Enzymol. 572, 315-333.
(17) Huang, H., Suslov, N.B., Li, N.S., Shelke, S.A., Evans, M.E., Koldobskaya, Y., Rice, P.A., and Piccirilli, J.A. (2014) A G-quadruplex-containing RNA activates fluorescence in a GFP-like fluorophore, Nat. Chem. Biol. 10, 686-691.


(18) Trachman III, R.J., Demeshkina, N.A., Lau, M.W.L., Panchapakesan, S.S.S., Jeng, S.C.Y., Unrau, P.J., and Ferré-D'Amaré, A.R. (2017) Structural basis for high-affinity fluorophore binding and activation by RNA mango, Nat. Chem. Biol. 13, 807-813.
(19) Mendez, M.A., and Szalai, V.A. (2009) Fluorescence of unmodified oligonucleotides: a tool to probe G-quadruplex DNA structure, Biopolymers 91, 841-850.
(20) Miannay, F.A., Banyasz, A., Gustavsson, T., and Markovitsi, D. (2009) Excited states and energy transfer in G-quadruplexes, J. Phys. Chem. C 113, 11760-11765.
(21) Kwok, C.K., Sherlock, M.E., and Bevilacqua, P.C. (2013) Decrease in RNA folding cooperativity by deliberate population of intermediates in RNA G-quadruplexes, Angew. Chem. Int. Ed. Engl. 52, 683-686.
(22) Udenfriend, S., and Zaltzman, P. (1962) Fluorescence characteristics of purines, pyrimidines, and their derivatives: measurement of guanine in nucleic acid hydrolyzates, Anal. Biochem. 3, 49-59.
(23) Changenet-Barret, P., Emanuele, E., Gustavsson, T., Improta, R., Kotlyar, A.B., Markovitsi, D., Vaya, I., Zakrzewska, K., and Zikich, D. (2010) Optical properties of guanine nanowires: experimental and theoretical study, J. Phys. Chem. C. 114, 14339-14346.
(24) Kwok, C.K., Sherlock, M.E., and Bevilacqua, P.C. (2013) Effect of loop sequence and loop length on the intrinsic fluorescence of G-quadruplexes, Biochemistry 52, 3019-3021.
(25) Changenet-Barret, P., Hua, Y., and Markovitsi, D. (2015) Electronic excitations in guanine quadruplexes, Top. Curr. Chem. 356, 183-202.


(26) Improta, R. (2014) Quantum mechanical calculations unveil the structure and properties of the absorbing and emitting excited electronic states of guanine quadruplex, Chem. Eur. J. 20, 8106-8115.
(27) Karsisiotis, A.I., Hessari, N.M., Novellino, E., Spada, G.P., Randazzo, A., and Webba da Silva, M. (2011) Topological characterization of nucleic acid G-quadruplexes by UV absorption and circular dichroism, Angew. Chem. Int. Ed. Engl. 50, 10645-10648.
(28) Dao, N.T., Haselsberger, R., Michel-Beyerle, M.E., and Phan, A.T. (2013) Excimer formation by stacking G-quadruplex blocks, Chemphyschem. 14, 2667-2671.
(29) Gao, S., Cao, Y., Yan, Y., and Guo, X. (2016) Sequence effect on the topology of 3 + 1 interlocked bimolecular DNA G-quadruplexes, Biochemistry 55, 2694-2703.
(30) Gao, S., Cao, Y., Yan, Y., Xiang, X., and Guo, X. (2016) Correlations between fluorescence emission and base stacks of nucleic acid G-quadruplexes, RSC Adv. 6, 94531-94538.
(31) Hua, Y., Changenet-Barret, P., Improta, R., Vayá, I., Gustavsson, T., Kotlyar, A.B., Zikich, D., Šket, P., Plavec, J., and Markovitsi, D. (2012) Cation effect on the electronic excited states of guanine nanostructures studied by time-resolved fluorescence spectroscopy, J. Phys. Chem. C. 116, 14682-14689.
(32) Changenet-Barret, P., Hua, Y., Gustavsson, T., and Markovitsi, D. (2015) Electronic excitations in G-quadruplexes formed by the human telomeric sequence: a time-resolved fluorescence study, Photochem. Photobiol. 91, 759-765.


(33) Sherlock, M.E., Rumble, C.A., Kwok, C.K., Breffke, J., Maroncelli, M., and Bevilacqua, P.C. (2016) Steady-state and time-resolved studies into the origin of the intrinsic fluorescence of G-quadruplexes, J. Phys. Chem. B. 120, 5146-5158.
(34) Švehlová, K., Lawrence, M.S., Bednárová, L., and Curtis, E.A. (2016) Altered biochemical specificity of G-quadruplexes with mutated tetrads, Nucleic Acids Res. 44, 10789-10803.
(35) Kolesnikova, S., Hubálek, M., Bednárová, L., Cvačka, J., and Curtis, E.A. (2017) Multimerization rules for G-quadruplexes, Nucleic Acids Res. 45, 8684-8696.
(36) Curtis, E.A., and Liu, D.R. (2013) Discovery of widespread GTP-binding motifs in genomic RNA and DNA, Chem. Biol. 20, 521-532.
(37) Wells, J.A. (1990) Additivity of mutational effects in proteins, Biochemistry 29, 8509-8517.
(38) Curtis, E.A., and Bartel, D.P. (2013) Synthetic shuffling and in vitro selection reveal the rugged adaptive fitness landscape of a kinase ribozyme, RNA 19, 1116-1128.
(39) Sen, D., and Gilbert, W. (1992) Novel DNA superstructures formed by telomere-like oligomers, Biochemistry 31, 65-70.
(40) Krishnan-Ghosh, Y., Liu, D., and Balasubramanian, S. (2004) Formation of an interlocked quadruplex dimer by d(GGGT), J. Am. Chem. Soc. 126, 11009-11016.
(41) Kato, Y., Ohyama, T., Mita, H., and Yamamoto, Y. (2005) Dynamics and thermodynamics of dimerization of parallel G-quadruplexed DNA formed from d(TTAGn) (n=3-5), J. Am. Chem. Soc. 127, 9980-9981.


(42) Do, N.Q., Lim, K.W., Teo, M.H., Heddi, B., and Phan, A.T. (2011) Stacking of G-quadruplexes: NMR structure of a G-rich oligonucleotide with potential anti-HIV and anticancer activity, Nucleic Acids Res. 39, 9448-9457.
(43) Wang, Y., and Patel, D.J. (1992) Guanine residues in d(T2AG3) and d(T2G4) form parallel-stranded potassium cation stabilized G-quadruplexes with anti glycosidic torsion angles in solution, Biochemistry 31, 8112-8119.
(44) Phan, A.T., Kuryavyi, V., Ma, J.B., Faure, A., Andréola, M.L., and Patel, D.J. (2005) An interlocked dimeric parallel-stranded DNA quadruplex: a potent inhibitor of HIV-1 integrase, Proc. Natl. Acad. Sci. USA. 102, 634-639.
(45) Kettani, A., Gorin, A., Majumdar, A., Hermann, T., Skripkin, E., Zhao, H., Jones, R., and Patel, D.J. (2000) A dimeric DNA interface stabilized by stacked A.(G.G.G.G).A hexads and coordinated monovalent cations, J. Mol. Biol. 297, 627-644.
(46) Matsugami, A., Ouhashi, K., Kanagawa, M., Liu, H., Kanagawa, S., Uesugi, S., and Katahira, M. (2001) An intramolecular quadruplex of (GGA)(4) triplet repeat DNA with a G:G:G:G tetrad and a G(:A):G(:A):G(:A):G heptad, and its dimeric interaction, J. Mol. Biol. 313, 255-269.
(47) Vorlíčková, M., Kejnovská, I., Sagi, J., Renčiuk, D., Bednářová, K., Motlová, J., and Kypr, J. (2012) Circular dichroism and guanine quadruplexes, Methods 57, 64-75.
(48) Tóthová, P., Krafčíková, P., and Víglaský, V. (2014) Formation of highly ordered multimers in G-quadruplexes, Biochemistry 31, 8112-8119.


(49) Curtis, E.A., and Liu, D.R. (2014) A naturally occurring, noncanonical GTP aptamer made of simple tandem repeats, RNA Biol. 11, 682-692.
(50) Furukawa, K., Abe, H., Abe, N., Harada, M., Tsuneda, S., and Ito, Y. (2008) Fluorescence generation from tandem repeats of a malachite green RNA aptamer using rolling circle transcription, Bioorg. Med. Chem. Lett. 18, 4562-4565.
(51) Sabeti, P.C., Unrau, P.J., and Bartel, D.P. (1997) Accessing rare activities from random RNA sequences: the importance of the length of molecules in the starting pool, Chem. Biol. 4, 767-774.
(52) Rachofsky, E.L., Osman, R., and Ross, J.B. (2001) Probing structure and dynamics of DNA with 2-aminopurine: effects of local environment on fluorescence, Biochemistry 40, 946-956.
(53) Ramos-Alemán, F., González-Jasso, E., and Pless, R.C. (2017) Use of alternative alkali chlorides in RT and PCR of polynucleotides containing G quadruplex structures, Anal. Biochem. 543, 43-50.
(54) Ingr, M., Dostál, J., and Majerová, T. (2015) Enzymological description of multitemplate PCR - shrinking amplification bias by optimization the polymerase-template ratio, J. Theor. Biol. 382, 178-186.
(55) Patterson, G.H., Knobel, S.M., Sharif, W.D., Kain, S.R., and Piston, D.W. (1997) Use of the green fluorescent protein and its mutants in quantitative fluorescence microscopy, Biophys. J. 73, 2782-2790.

[Graphic: a G-quadruplex sequence with variable positions (N) between runs of G, shown from 5' to 3'. Labels: Fold in KCl buffer; Screen for fluorescence.]