Characterization of G-quadruplexes in Chlamydomonas reinhardtii


Article

Characterization of G-quadruplexes in Chlamydomonas reinhardtii and the effects of polyamine and magnesium cations on structure and stability Warren Andrew Vinyard, Aaron M. Fleming, Jingwei Ma, and Cynthia J. Burrows Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00749 • Publication Date (Web): 09 Nov 2018 Downloaded from http://pubs.acs.org on November 14, 2018



Biochemistry

Characterization of G-quadruplexes in Chlamydomonas reinhardtii and the effects of polyamine and magnesium cations on structure and stability

W. Andrew Vinyard, Aaron M. Fleming, Jingwei Ma, and Cynthia J. Burrows*

Department of Chemistry, University of Utah, Salt Lake City, Utah 84112-0850, United States

*To whom correspondence should be addressed. E-mail: [email protected]

Abstract

Chlamydomonas reinhardtii is a green alga with a very GC-rich genome (67%) and a high density of potential G-quadruplex-forming sequences (PQSs). Using the Ensembl Plants DNA database, 19 PQSs were selected, and their ability to fold in vitro was examined using four experimental methods. Our results support in vitro folding of 18 of the 19 PQSs selected for study. The high physiological polyamine concentrations in C. reinhardtii create unique conditions for studying G4 folding. We investigated whether high polyamine concentrations affect the stability and structural fold of two polymorphic G4s selected from the cohort of PQSs. Both polymorphic G4s were found to be greatly stabilized at the physiologically high polyamine concentrations. Lastly, the effects of physiologically relevant Mg2+ concentrations were tested on both of the polymorphic G4s, and one of the G4s shifted from a dynamic mixture of folds to favor a parallel fold in the presence of Mg2+. Our work supports the concept of folding of G4s under the unique conditions observed in C. reinhardtii, and these structures, being located in promoter regions of DNA repair and photosynthetic genes, might be relevant structures in the physiology of C. reinhardtii.

ACS Paragon Plus Environment


Introduction

DNA is dynamic, exhibiting a broad range of possible secondary structures outside of the helical B-form DNA structure.1,2 These secondary structures are in equilibrium with the canonical double-helical structure, with the equilibrium being dependent on both sequence context and the physical conditions.3,4 For example, DNA structural equilibria can be shifted depending on ion concentration, pH, and base pair fidelity.5 One secondary structure that has garnered significant attention is the G-quadruplex (G4) fold. Guanine-rich regions in DNA have the potential to form G4s when a sequence contains four runs of at least three consecutive guanines (Figure 1A).6 The sequence can fold into a helical structure wherein four guanines, one from each track, associate with one another through the Hoogsteen face, bonding around a monovalent cation such as potassium to form a guanine tetrad (Figure 1B). A stable DNA G4 forms when the guanine tetrads stack on top of one another with a potassium ion sandwiched between layers (Figure 1C).7,8 Each G run is connected by loops ranging in length from 1 to 12 nucleotides.9 G4s can adopt a broad range of folds; some are highly dynamic, while others can only sample one topology.10,11


[Figure 1 graphic. Panel A: 5`-GGG… N1-12 GGG… N1-12 GGG… N1-12 GGG…-3`. Panel B: G-tetrad of four guanines (R = DNA) around a central K+. Panel C: parallel, hybrid, and antiparallel folds with 5` and 3` ends marked.]

Figure 1. (A) General sequence for a potential G-quadruplex forming sequence (PQS) motif. (B) Base pairing of a G-tetrad that is stabilized by coordination to a potassium ion. (C) Different structural folds observed for G4s in solution.
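The motif in Figure 1A can be written as a regular expression. The sketch below is a minimal illustration, not the quadparser program cited in this paper: the pattern, the function name `find_pqs`, and the choice to scan only the G-rich strand are assumptions made here. It is applied to the DMC1 sequence from Table 1 and to the single-stranded control sequence given in Materials and Methods.

```python
import re

# Assumed pattern: at least four runs of >=3 Gs, separated by loops of 1-12 nt.
# Only the G-rich strand is scanned; the complementary C-rich strand is ignored.
PQS_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,12}){3,}G{3,}")


def find_pqs(sequence):
    """Return (start, end, matched_text) for each non-overlapping PQS hit."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


dmc1 = "GTGGGTGGGTGTGGGTGGGTT"        # DMC1 PQS from Table 1
ss_control = "ATGCTTGGATGGACGTTCGAC"  # single-stranded control, no GGG run

print(find_pqs(dmc1))        # one hit covering the four GGG runs
print(find_pqs(ss_control))  # no hits
```

Because the loop class `[ACGT]` also admits G, longer G-rich sequences can match in more than one way; the regular expression reports only the first match it finds from each start position.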

The cellular function of G4s is still being investigated, and there is strong evidence that G4s have a function and are important in humans, where they are abundant.12 Bioinformatics analysis and G4-Seq studies on the whole human genome have identified a range of PQSs from 375,000 (bioinformatics) to more than 700,000 (G4-Seq).13,14 The Balasubramanian laboratory demonstrated that ~10,000 of the >700,000 PQSs fold and regulate transcription in human skin cells.14,15 Formation of G4s in cellular DNA was initially thought to have deleterious effects,3 but this notion has been heavily challenged over the last decade. For instance, research in our laboratory has supported the hypothesis that G4s play a regulatory role in cells, including up- or downregulation of gene expression during oxidative stress conditions.16,17 There is evidence that these regulatory roles impact many cellular pathways that include transcription, translation, and replication; additionally, G4 folds are critical structures in telomeres.2,15,18,19 Furthermore, evidence supports regulation of G4 formation within the cell by helicases and polymerases.20,21 Bioinformatic analysis has found that PQSs are disproportionately found in promoter regions, the first intron, and 5`-UTR regions of the human genome.22,23 Lastly, G4s have recently been proposed to regulate gene expression when oxidatively modified at a G nucleotide.16,24 However, the full extent to which G4 motifs function in cells, and the extent to which they are beneficial or detrimental to cellular processes, remains to be determined.25–27

Despite the fact that the study of G4s is a burgeoning area of research, and G4s have the potential to affect many different genes and pathways, the overwhelming majority of G4 research to date has been in mammalian systems.12,16,28 Because cellular conditions often vary greatly from one organism to another, it is crucial to investigate G4 formation in other organisms. Studies of RNA G4s in plants have concluded that G4s do appear to be biologically relevant.29–33 However, DNA G4s in plants and other photosynthetic organisms have been sparsely studied. Computational studies in the model plant species Arabidopsis thaliana found that PQSs were underrepresented in the genome with only 1,200 identifiable sequences, assuming 3-tetrad G4s.34 Other plant genomes show much higher PQS density, with O. sativa indica (rice) and Z. mays (corn) having PQS densities of 91.6 PQS/Mb and 74.8 PQS/Mb, respectively.34 The distribution of three-tetrad PQSs in plant genomes was also found to be biased toward gene promoters and introns.35

An organism of particular interest is C. reinhardtii, which belongs to a division of green algae referred to as Chlorophyta. Chlorophyta predate the monocot/dicot split and thus occupy an unusual position in the evolutionary tree.36


Furthermore, their exposure to a broad range of environmental stressors makes them of interest for studying oxidative damage and repair.37–39 C. reinhardtii is a green unicellular soil alga that has a highly GC-rich genome (67%).40 Furthermore, C. reinhardtii has a very high level of PQSs based on bioinformatics studies in our laboratory (Y. Ding, unpublished results). For over 30 years, C. reinhardtii has been cultured and used for biological experiments as a model organism for studying chloroplast-based photosynthesis and eukaryotic flagella.41,42 Recently C. reinhardtii was also earmarked as a model organism for the production of biofuels.43 The organism is capable of resisting high levels of oxidative and environmental stress.44 Thus, C. reinhardtii is an excellent organism in which to continue exploring the role of G4s in regulation of cellular processes, especially under oxidative stress conditions.

C. reinhardtii cells are exposed to a broad range of environmental changes including fluctuations in ionic concentrations.45 As with most organisms, the intracellular concentrations of ions are highly regulated in C. reinhardtii.46 Magnesium is a particularly crucial ion, acting as a cofactor in many enzymatic processes, as well as being the critical ion coordinated in the porphyrin ring of chlorophyll. Because G4s have been demonstrated to bind metal ions such as Pb2+, Mg2+, Ca2+, and Zn2+, studies with divalent metals to understand how they disrupt G4 folds, strengthen G4 folds, or shift their structural equilibria are of interest.47–49

Another interesting feature of the cellular environment of C. reinhardtii results from its high concentration of polyamines that includes 120 mM putrescine, 20 mM norspermidine, and 4 mM spermidine.50 Polyamines, such as spermine, spermidine, and putrescine, along with their derivatives, are present in all living cells generally at high micromolar to low millimolar concentrations depending on the cell type.51,52 The biological roles of polyamines are diverse, and they are clearly essential for cellular functions, one of which is interaction with DNA.51 Changes in polyamine concentrations have been observed during the exponential growth phase of several algae, and the concentration of free polyamines was found to track linearly with growth rate.53–55 Polyamines at low concentrations (μM) appear to encourage G4 folding, while at high concentrations (mM) they may stabilize some G4s or denature others in a sequence-dependent manner.56 In C. reinhardtii, the polyamines putrescine, norspermidine, and spermidine were detected by mass spectrometry, and as noted, the putrescine concentration was found to be unusually high.50 In higher organisms, the polyamine concentration is cell-type dependent, and it can vary greatly.51,57 Due to the dynamic nature of G4s, variations in cellular conditions can lead to drastic shifts in the equilibrium and stability of G4s. This phenomenon has to be explored on a case-by-case basis.

In the present studies, we inspected the C. reinhardtii genome for PQSs in DNA repair genes, which allows comparison to a previous inspection of the human genome for PQSs in these genes by our laboratory.58 A subset of these sequences was then synthesized and characterized by established biophysical methods to determine the folded state of the selected sequences. Lastly, due to the unique cellular conditions found in C. reinhardtii, we asked whether the cellular environment in C. reinhardtii supports G4 folding with the high polyamine concentrations and the fluctuations in Mg2+ concentrations observed in these cells.


Materials and Methods

Bioinformatic Analysis. All genomic sequences were obtained from the Ensembl Plants genomic database. The sequences covered 2500 nucleotides (nt) upstream and 600 nt downstream of each gene under investigation. Quadparser was used to identify PQSs in selected DNA repair genes and photosystem genes.59 The parameters required that the loop length be 1-12 nt and that there exist 4 or more runs of G with ≥3 Gs per run.

Oligodeoxynucleotide Preparation. All oligodeoxynucleotides were synthesized by the DNA/Peptide core facility at the University of Utah, using commercially available phosphoramidites and a standard solid-phase synthesis protocol. Purification was carried out on a semi-preparative anion-exchange HPLC column running line A = 9:1 ddH2O/MeCN and line B = 1 M LiCl in 9:1 ddH2O/MeCN, 25 mM Tris (pH 7.4), at a flow rate of 3 mL/min while monitoring the elution via the absorbance at 260 nm. Following purification, the samples were dialyzed in ddH2O for 36 h in a 4 L beaker. The water was changed three times to minimize the final salt concentration. Following dialysis, the samples were lyophilized and resuspended in ddH2O. The oligodeoxynucleotide concentrations were determined by measuring the absorbance at 260 nm and using the primary sequence to estimate the extinction coefficient for each sequence studied. All oligodeoxynucleotides were stored at -20 °C when not being used. The oligodeoxynucleotides were annealed in either a HEPES or phosphate buffer in the desired salt at 90 °C for 5 min. The samples were then slowly cooled to room temperature and stored at 4 °C for 24 hours prior to their study.


1H-NMR Analysis. The G4 strands were annealed in a 300-μL solution of 20 mM KPi (pH 7.0) and 50 mM KCl in 9:1 H2O/D2O at ~300 µM concentration. The annealed samples were placed in D2O-matched Shigemi NMR tubes. All experimental measurements were collected on an 800-MHz NMR spectrometer with the temperature set to 24 °C. Samples were scanned 2048 times using the Watergate solvent suppression pulse sequence.

Circular Dichroism Analysis. The PQS samples were annealed at 20 μM concentration in 5 mM NaCl, 150 mM KCl, and either 25 mM HEPES (pH 7.5) or KPi (pH 7.5), depending on the experiment. The circular dichroism (CD) data were collected in a 0.2-cm quartz cuvette at room temperature. The solvent background was subtracted from the recorded data, and the spectra were then normalized on the y-axis to units of molar ellipticity ([θ] deg*cm2*dmol-1).

Thermal Melting Analysis. The thermal melting (Tm) values were measured at concentrations of 5 μM oligodeoxynucleotide in buffered solutions with biologically relevant K+ and Na+ concentrations (25 mM KPi or HEPES pH 7.4, 150 mM KCl, and 5 mM NaCl). The experiments were started at an initial temperature of 20 °C that was held for 10 minutes, followed by heating at 0.5 °C/minute and equilibrating at each 1 °C increment for 1 min. Measurements were taken at 295 nm for every 1 °C temperature change starting at 20 °C and ending at 100 °C. Plots of absorbance at 295 nm vs. temperature were constructed, and the Tm values were determined by a two-point analysis protocol using the Shimadzu software.

Thioflavin T Fluorescence Analysis. The oligodeoxynucleotide samples were annealed at 4 μM concentrations in 25 mM KPi (pH 7.5), 150 mM KCl, and 5 mM NaCl.


The 4 μM solutions were diluted to a final concentration of 1 μM oligodeoxynucleotide and 0.5 μM thioflavin T (3,6-dimethyl-2-(4-dimethylaminophenyl) benzothiazolium cation; ThT) before making the fluorescence measurements. The samples were placed in a 0.2-cm quartz cuvette, and the fluorescence was measured using a fluorescence spectrometer. The excitation wavelength was set to 425 nm, and the emission spectra were collected over the 440-700 nm range at a 2 nm interval. The spectrum of buffer plus thioflavin T without sample was used as a baseline and was subtracted from each spectrum prior to plotting the data. The c-MYC G4 sequence (5`-GG GTG GGG AGG GTG GGG-3`) was studied as a positive control. The sequence 5`-ATG CTT GGA TGG ACG TTC GAC-3` was used as a single-stranded control.

Polyamine and Magnesium Studies. The ERCC3 and psaK1 G4s identified in the experiments described below were annealed at 20 μM concentrations in 25 mM HEPES (pH 7.5), 150 mM KCl, and 5 mM NaCl. CD and Tm measurements were taken at 0, 50, 100, and 500 mM putrescine and at 0, 10, 25, 50, and 100 mM norspermidine. The magnesium chloride hexahydrate (MgCl2·6H2O) titrations were carried out at 0 mM, 5 mM, 10 mM, 15 mM, 20 mM, and 50 mM concentrations. The data were background subtracted and normalized before being plotted and compared.

Results and Discussion

Structural characterization of C. reinhardtii PQSs. After searching through a database of relevant DNA repair genes and photosystem genes, the PQSs we identified were ranked by their likelihood of folding as predicted by the quadparser algorithm,59,60 using search parameters of ≥4 G tracks with ≥3 Gs per track and loops ≤12 nt. The selected PQSs (Table 1) were sorted into three categories: DNA repair genes (ERCC1, ERCC3, LIG1, ATR, DMC1, GPXH1, and GPXH2), the photosystem genes (CYC3, HCF136 (1-2), UQCC1 (1-2), PS2 family protein, psaK1, PS2 subunit 1, psaE, PS2 subunit 3), and sequences with 5 or more G-tracks (LIG1 5G track, GPXH1, and DMC1 5G track). Each sequence studied was from the coding (nontranscribed) strand. All of the sequences had 2-nt overhangs because previous studies have shown that their absence may affect the G4 fold topology.61 The PQSs were tested for their ability to adopt G4 folds by a variety of techniques (i.e., CD, 1H-NMR, ThT fluorescence, and Tm analysis). Each technique required slightly different physical conditions; however, we consistently included K+ ions in the solution to ensure that G4 folding occurred around the most relevant cation in C. reinhardtii cells. Following synthesis and HPLC purification of the DNA strands, the 19 PQSs were annealed in a standard NMR buffer at a concentration of ~300 μM.


Table 1. Potential G-quadruplex sequences studied in the C. reinhardtii genome.

Gene                 Sequence
ERCC1                5`-GCGGGGGAGGGGAGGGGAGGGGAA-3`
ERCC3                5`-AAGGGGAGAGGGGAAAAGGGAGAAGGGGTT-3`
LIG1                 5`-GCGGGGTGGGTTGGGGTAGGGGTC-3`
ATR1                 5`-GCGGGTGCGGGTGCGGGTGCGGGTG-3`
DMC1                 5`-GTGGGTGGGTGTGGGTGGGTT-3`
CYC3                 5`-GCGGGGAGGGGAGGGGAGGGCA-3`
HCF136 (1)           5`-TTGGGGGAAGGGGGGAGGGGGAGGGGTG-3`
HCF136 (2)           5`-GAGGGGAGAGGGTGGGAGAGGGTG-3`
UQCC1 (1)            5`-TTGGGTGGGAGGGTGGGAG-3`
UQCC1 (2)            5`-ACGGGTGGGTGCGGGTCCGGGGCT-3`
PS2 family protein   5`-GCGGGGAGTGGGTGAGGGCTGGGGCG-3`
psaK1                5`-ATGGGCCTGGGCGTGGGTCTAGGGGGAG-3`
PS2 subunit 1        5`-GTGGGAGCGGGAGCGGGAGCGGGAG-3`
psaE                 5`-ACGGGCGGGCGGGCGGGCG-3`
PS2 subunit 3        5`-CAGGGGGCAGGGGCTAGGGGGCAGGGGGCA-3`
LIG1 5G track        5`-GCGGGGTGGGTTGGGGTAGGGGTCGCTCTGCGGGGTT-3`
GPXH1                5`-AAGGGTGGGCAGGCGGAGGGGGCCGGGCGGGCTGGGGTG-3`
GPXH2                5`-CAGGGTGCAGCCGGGTTGGGTGCGGCCGGGTC-3`
DMC1 5G track        5`-GCGGGTGGGTGCGTGTCGTGTGGGTGGGTGTGGGTGGGTT-3`

The salt and buffer concentrations selected for the NMR studies gave a low ionic strength (70 mM K+), chosen on the basis of a literature report to provide the best initial parameters for determination of G4 folding.62 Hoogsteen base pairs between G nucleotides in a G tetrad are identified by the presence of imino protons with chemical shifts in the 10-12 ppm range in the 1H-NMR spectrum, and they correlate with the formation of a G4 structure (Figures 2A and S1-S19).62 With the exception of the GPXH2 sequence, all of the PQSs showed imino proton shifts (10-12 ppm) consistent with G4 formation. While these findings cannot exclude other structural folds such as hairpins and triplexes, the data support the possibility of the sequences folding into G4s. With respect to GPXH2, it is likely that the long loops (1st and 3rd loops have 7 nts) resulted in this sequence failing to adopt a stable G4 fold under the NMR conditions studied. Our findings corroborate the commonly reported observation that sequences with long G-runs tend to have broad, less resolved peaks and those with short G-runs produce more resolved structures.63–65
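The loop lengths quoted for GPXH2 can be checked directly from the sequence listed for it in Table 1. The following is a minimal sketch under a simplifying assumption made here (not taken from the paper): every run of three or more Gs is treated as one G-track, and whatever lies between consecutive tracks is a loop. The helper name `g_runs_and_loops` is hypothetical.

```python
import re


def g_runs_and_loops(sequence, min_run=3):
    """Split a PQS into its G-runs and the loops between them.

    Assumption: each maximal run of >= min_run Gs is one G-track.
    The 5' and 3' flanking overhangs are discarded.
    """
    seq = sequence.upper()
    pattern = "G{%d,}" % min_run
    runs = re.findall(pattern, seq)
    parts = re.split(pattern, seq)
    loops = parts[1:-1]  # drop the 5' and 3' flanks
    return runs, loops


gpxh2 = "CAGGGTGCAGCCGGGTTGGGTGCGGCCGGGTC"  # GPXH2 PQS from Table 1
runs, loops = g_runs_and_loops(gpxh2)
print(len(runs), [len(loop) for loop in loops])  # 4 G-tracks; loops of 7, 2 and 7 nt
```

For sequences with more than four G-runs, or with runs longer than three Gs, several track and loop assignments are possible, so this split is only one of the candidate registers.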

[Figure 2 graphic. Column A: 1H-NMR spectra, imino region (10.0-13.0 ppm). Column B: CD spectra, [θ] (deg*cm2*dmol-1) vs. wavelength (220-320 nm). Rows: psaK1 (5`-AT GGG CCT GGG CGT GGG TCTA GGGGG AG-3`), DMC1 (5`-GT GGG T GGG TGT GGG T GGG TT-3`), and ERCC3 (5`-AA GGGG AGA GGGG AAAA GGG AGAA GGGG TT-3`).]

Figure 2. Examples of (A) 1H-NMR and (B) CD spectra for psaK1, DMC1, and ERCC3. The defined peaks in the 10-12 ppm range for DMC1 support the idea that the G4 fold exists in one major folded state. In contrast, the broad, undefined peaks observed for the ERCC3 G4 suggest that this sequence adopts a polymorphic mixture of folds. For the psaK1 sequence, the defined peaks observed in the 1H-NMR spectrum support a single fold, and the CD spectrum supports a hybrid fold.


In the next studies, CD spectroscopy was utilized to differentiate between G4 folds (Figures 2B and S1-S19). Each experiment was performed in a buffer that best mimicked the physiological conditions in C. reinhardtii (25 mM KPi (pH 7.5), 150 mM KCl, and 5 mM NaCl). All of the oligomers were used at a concentration of 20 µM. Parallel-stranded G4s show a λmax = 262 nm and λmin ≈ 245 nm, while antiparallel strands have λmax = 295 nm and λmin ≈ 260 nm. Mixed hybrid conformations or a mixture of parallel and antiparallel-stranded topologies have signatures at λmax = 262 and 290-295 nm and λmin ≈ 245 nm. Lastly, a λmax = 265-280 nm and λmin ≈ 240 nm signifies that no known G4 folding event occurred.66,67 With the exception of two sequences, the sequences adopted parallel-stranded folds. The ERCC3 and psaK1 sequences were the only ones that exhibited a mixed folding topology on the basis of their CD spectra (Figures 2B and S1-S19). The broad, unresolved peaks observed in the 1H-NMR spectrum of ERCC3 correlated with the CD measurements (Figure S5), suggesting a high level of structural polymorphism. On the other hand, the psaK1 sequence furnished a well-resolved 1H-NMR spectrum, suggesting one fold in solution (Figure S17). The differences observed for psaK1 between the 1H-NMR and CD analyses likely result from these two techniques requiring different conditions to obtain the data (1H-NMR = 300 µM DNA, 20 mM KPi (pH 7), and 50 mM KCl at 20 °C; CD = 20 µM DNA, 25 mM KPi (pH 7.4), 5 mM NaCl, 150 mM KCl, at 20 °C). The GPXH2 PQS showed a broad CD spectrum and no imino peaks in the NMR experiment (Figure S8), and therefore, it was assumed that the sequence did not fold.

In order to further characterize the G4 sequences, thioflavin T (ThT) experiments were conducted (Figure 3A). Following a protocol developed by the Mergny laboratory,68 the fluorescence emission enhancement of ThT when bound to the folded G4s was measured for the 19 G4s. A >20 FI490 nm/FI0 enhancement is commonly reported in the literature, and this was used as a lower limit to support G4 formation.68 The well-established c-MYC G4-forming sequence was used as a positive control, and a single-stranded DNA oligodeoxynucleotide was used as a negative control.69,70
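The ThT criterion amounts to a ratio and a cutoff. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the intensities are invented numbers chosen to give a ratio of 29, and the inputs are assumed to be background-corrected emission intensities at 490 nm with and without DNA.

```python
def tht_enhancement(fi_with_dna, fi_tht_alone):
    """Fold enhancement FI490nm/FI0 of ThT emission at 490 nm.

    Both arguments are assumed to be background-corrected intensities
    in the same arbitrary units.
    """
    if fi_tht_alone <= 0:
        raise ValueError("ThT-only intensity must be positive")
    return fi_with_dna / fi_tht_alone


def passes_tht(enhancement, threshold=20.0):
    """True when the enhancement exceeds the >20 lower limit used in the text."""
    return enhancement > threshold


# Hypothetical intensities (arbitrary units), not measured values.
ratio = tht_enhancement(5800.0, 200.0)
print(ratio, passes_tht(ratio))
```

A ratio just above the cutoff passes this test even when the other assays disagree, which is why the ThT result is read alongside the NMR, CD, and Tm data rather than on its own.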

Figure 3. Plots of the (A) ThT and (B) Tm data for the 19 PQSs characterized in the present study. The sequences marked with >90 °C had Tm values too high to be measured, and the one marked with an “*” failed to fold and show a thermally induced transition.


In total, 17 of the 19 PQSs showed a >20 FI490 nm/FI0 enhancement (Figure 3A). The HCF136(2), HCF136(1), psaK1, PS2 subunit 3, and ERCC3 sequences showed very high enhancements of >200 FI490 nm/FI0, consistent with the NMR and CD results indicating that these sequences adopt G4 folds. In contrast, the GPXH2 sequence yielded a positive ThT enhancement of 29 FI490 nm/FI0 that is inconsistent with the lack of G4-specific imino peaks in its 1H-NMR spectrum. The fluorescence enhancement in the GPXH2 case is therefore likely an artifact, on the basis of further studies demonstrating that ThT is not strongly G4 selective.71 The other sequences provided ThT fluorescence enhancements consistent with G4 folding, further supporting the observations made in the 1H-NMR and CD studies discussed above.

The previous studies addressed the ability of the PQSs to adopt G4 folds, and in the final study we measured the Tm values for the sequences to determine the stability of the folded G4s. Thus, Tm measurements were made on each of the 19 PQSs (Figure 3B). The decrease in absorbance at 295 nm was measured over an increasing temperature range of 20 to 100 °C to determine the denaturing Tm values. All measurements were made in triplicate with relatively small error between the values (average error ~1.5 °C). From the temperature-dependent UV transitions at 295 nm, Tm values were measured for 16 of the 19 PQSs studied. A broad range of Tm values was obtained, from 52 °C (ERCC3) to 85 °C (DMC1). No values were obtained for the GPXH2, DMC1 5G, HCF136(2), and psaE PQSs. As mentioned previously, the GPXH2 sequence showed no imino peaks in the 1H-NMR spectrum, and the CD spectrum exhibited peaks associated with single-stranded DNA (λmax = 267 nm and λmin = 240 nm, Figure S8). On the basis of these previous results, a G4-specific UV transition at 295 nm was not expected. With respect to the DMC1 5G and psaE PQSs, the CD, 1H-NMR, and ThT data all support G4 folding. Denaturing events were expected in the thermal melting curves, but no transitions from which Tm values could be obtained were observed. Some G4s can have Tm values > 95 °C.72 Therefore, to reduce the G4 stability, we lowered the potassium concentration from 150 mM to 5 mM KCl, and the Tm experiments were repeated. Again, the sequences did not show an observable transition during the experiment. The experiments were repeated a third time in 500 μM KCl, but again no Tm values were obtained from lowering the KCl concentration. Therefore, we conclude that the G4s likely adopt very stable folds and fail to unfold in the temperature range studied (20–100 °C). This argument is further supported by the 1H-NMR and CD data for the sequences, which provided signatures consistent with G4 folding; taking this into consideration, the results were interpreted as false negatives.

These four biophysical methods were used to assay the probability of G4 folding events for the 19 PQSs. Sequences that showed positive results for all four of the experimental methods were classified as having a high probability of folding, while sequences that had positive results for three of the experimental methods were classified as having a medium probability of folding. Lastly, sequences that had two or fewer positive results were classified as having a low probability of folding to a G4. Strictly adhering to these criteria, 14 of the 19 G4s had a high probability of folding; however, the two G4s psaE and DMC1 5G technically failed the Tm experiments, but this is likely due to the very stable nature of the G4 folds. The CD and 1H-NMR spectroscopic studies, the two most structurally informative experimental methods, along with the ThT studies strongly supported folding for the sequences. Furthermore, these two sequences are ideal for folding into stable G4s given their short, pyrimidine-rich loop contents. Considering these observations, 17 of the 19 PQSs had a high probability of folding. The PS2 subunit 1 PQS failed the ThT test, and thus had a medium probability of folding. Lastly, the GPXH2 PQS only tested positive for the ThT experiment, and it failed all other experimental tests. Hence, 18 of the 19 PQSs can be classified as having either a high or medium probability of folding on the basis of these four biophysical assays.

Using several complementary methods to verify folding is an excellent way to exclude false positives that are sometimes observed when using a single biophysical method for determination of G4 folding. This approach is followed by the Mergny laboratory and others.73,74 An example sequence screened out by the complementary approach is the GPXH2 sequence, which passed the ThT evaluation in support of G4 folding; however, the 1H-NMR, CD, and Tm data were all negative for G4 folding.
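The four-assay scoring rule described above can be written out as a short function. This is a sketch of the strict counting rule only; the dictionary layout and the name `folding_probability` are assumptions made for illustration, and the per-assay outcomes are those stated in the text for GPXH2, PS2 subunit 1, and psaE.

```python
ASSAYS = ("NMR", "CD", "ThT", "Tm")


def folding_probability(results):
    """Classify a PQS from four boolean assay outcomes.

    4 positives -> "high", 3 -> "medium", 2 or fewer -> "low".
    """
    positives = sum(bool(results[assay]) for assay in ASSAYS)
    if positives == 4:
        return "high"
    if positives == 3:
        return "medium"
    return "low"


# Outcomes as described in the text.
gpxh2 = {"NMR": False, "CD": False, "ThT": True, "Tm": False}
ps2_subunit_1 = {"NMR": True, "CD": True, "ThT": False, "Tm": True}
psae_strict = {"NMR": True, "CD": True, "ThT": True, "Tm": False}

for name, res in [("GPXH2", gpxh2), ("PS2 subunit 1", ps2_subunit_1), ("psaE", psae_strict)]:
    print(name, folding_probability(res))
```

Under strict counting psaE scores as medium; the text moves it to the high category by treating the missing Tm transition as a false negative caused by a very stable fold, a judgment the counting rule alone does not capture.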

Effects of physiologically relevant concentrations of polyamines on the G4 stability. Due to the high physiological polyamine concentrations in C. reinhardtii, we decided to measure the effects of physiologically relevant concentrations of polyamines on the stabilities of two PQSs from the previous studies. The ERCC3 (Figure 2) and the psaK1 (Figure 2) sequences were selected for the polyamine studies because both exhibited a mixed topology by CD, and relatively low thermal melting temperatures were observed for ERCC3 (Tm = 52.0 °C) and psaK1 (Tm = 65.0 °C) in comparison to the cohort of PQSs examined in this study (Figure 3B). These two sequences allow us to determine whether polyamines cause a change in the structure on the basis of the CD signatures, alter the thermal stabilities of the folds, or both. The ERCC3 gene codes for

the excision repair 3 protein, an ATP-dependent DNA helicase that is part of the transcription factor IIH basal transcription factor complex.75 The protein is active in the nucleotide excision repair pathway. The psaK1 gene codes for the photosystem I reaction center subunit psaK1 protein, which is a component of the photosystem I (PSI) complex.76 It is unclear how the subunit integrates with the rest of PSI, but it contacts the membrane layer on the stromal side.76 It is important to test whether the equilibrium between folds could be shifted by the introduction of polyamines to determine how the unique context of C. reinhardtii impacts the G4 folds. Putrescine and norspermidine were selected for experiments because they exist in the highest cellular concentrations in C. reinhardtii (120 mM and 20 mM, respectively; see Figure S20 for the polyamine structures).50 Next, CD and Tm measurements were made for the ERCC3 and psaK1 PQSs with increasing concentrations of the polyamines. For putrescine, measurements were made at 50 mM, 100 mM, and 500 mM, and for norspermidine, measurements were made at 10 mM, 25 mM, 50 mM, and 100 mM, which provide concentration ranges indicative of intracellular concentrations in C. reinhardtii (Figures 4 and S20).

[Figure 4 graphic. Panels: (A) "ERCC3 Norspermidine Studies" (0, 10, 25, 50, and 100 mM) and (B) "ERCC3 Putrescine Studies" (0, 50, 100, and 500 mM), each plotting [θ] (deg·cm2·dmol-1) against wavelength (nm); (C) Tm (°C) against polyamine concentration (mM) for putrescine and norspermidine.]

Figure 4. The impact of polyamines on the ERCC3 PQS. Stacked CD spectra when (A) norspermidine or (B) putrescine was titrated into the PQS. (C) Plots of the Tm values measured at each polyamine titration point evaluated. In Figure S20 all Tm values measured are provided.

With the ERCC3 PQS, titration of norspermidine or putrescine (Figures 4A and 4B) into the solution led to CD spectra that consistently had the same shape. On the basis of the CD spectra, neither polyamine affected the overall fold of the ERCC3 G4. Based on the Tm, the thermal stability increased with the addition of either of the polyamines (Figure 4C). The increase in stability was first observed with the addition of 10 mM putrescine. A drastic increase in stability was observed with the addition of 50, 100, and 500 mM putrescine. At these three concentrations, ERCC3 showed a thermal stability increase of approximately 18 °C. It is likely that the G4 binding sites were saturated with polyamines, and the increased concentration had little to no effect after

reaching saturation. The norspermidine titrations showed an increase in stability starting at 10 mM, which raised the melting temperature by 17 °C. A further increase in stability was observed at 50 mM, for a final increase of 23 °C in melting temperature. However, the highest concentration of norspermidine studied (100 mM) did not appear to further stabilize the G-quadruplex fold. In fact, at 100 mM norspermidine the Tm appeared to decrease by 4 °C. Collectively, these results demonstrate that both polyamines increase the thermal stability of the ERCC3 G4 fold.

The psaK1 G4 was also titrated with putrescine and norspermidine. The CD spectra (Figures 5A and 5B) showed that the titration of 10 mM norspermidine shifted the structural equilibrium from mixed to parallel, and the structural shift was maximized with the addition of 50 mM norspermidine, with a higher concentration (i.e., 100 mM) having no additional effect (Figure 5B). This suggests that the G4 remains in a dynamic mixture but the population of parallel-folded G4s increases relative to the antiparallel G4s in the presence of norspermidine. The titration of putrescine appeared to have no effect on the structural equilibrium, as no significant changes in the shape of the CD spectra were observed as the concentration of putrescine was increased (Figure 5A). These observations suggest that norspermidine is more effective at binding and shifting the psaK1 G4 fold than putrescine.
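One way to follow such a shift numerically is to compare the CD signal at a band associated with antiparallel or hybrid folds to the signal at a band associated with the parallel fold. The sketch below is a minimal illustration and not the analysis used in this study: the band positions (about 293 nm and about 264 nm), the function names, and the example spectra are all assumptions chosen for demonstration.

```python
from typing import Sequence


def signal_at(wavelengths_nm: Sequence[float], signal: Sequence[float],
              target_nm: float) -> float:
    """Return the signal at the sampled wavelength closest to target_nm."""
    if len(wavelengths_nm) != len(signal) or not signal:
        raise ValueError("wavelengths and signal must be non-empty and equal in length")
    nearest = min(range(len(wavelengths_nm)),
                  key=lambda i: abs(wavelengths_nm[i] - target_nm))
    return signal[nearest]


def band_ratio(wavelengths_nm: Sequence[float], signal: Sequence[float],
               antiparallel_nm: float = 293.0, parallel_nm: float = 264.0) -> float:
    """Ratio of the ~293 nm band to the ~264 nm band (assumed band positions)."""
    parallel = signal_at(wavelengths_nm, signal, parallel_nm)
    if parallel == 0:
        raise ValueError("signal at the parallel band is zero; ratio undefined")
    return signal_at(wavelengths_nm, signal, antiparallel_nm) / parallel


# Hypothetical spectra in arbitrary units, for illustration only.
wavelengths = [240.0, 264.0, 293.0]
mixed_fold = [-1.0, 2.0, 1.0]
parallel_shifted = [-1.5, 4.0, 0.2]

ratio_mixed = band_ratio(wavelengths, mixed_fold)
ratio_shifted = band_ratio(wavelengths, parallel_shifted)
print(ratio_shifted < ratio_mixed)  # True: the ratio falls as the parallel band grows
```

A falling ratio across a titration series would be consistent with an increasing parallel population, although a single ratio cannot distinguish a population shift from an overall change in signal intensity.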

[Figure 5 graphic. Panels: (A) "psaK1 Putrescine Studies" (0, 50, 100, and 500 mM) and (B) "psaK1 Norspermidine Studies" (0, 10, 25, 50, and 100 mM), each plotting [θ] (deg·cm2·dmol-1) against wavelength (nm); (C) Tm (°C) against polyamine concentration (mM) for putrescine and norspermidine.]

Figure 5. The CD spectra and Tm values for the psaK1 G4-forming sequence studied in the presence of the polyamines norspermidine and putrescine. The CD spectra were recorded in the presence of (A) putrescine and (B) norspermidine, as well as (C) Tm values for the sequence determined at each point during the titration. In Figure S20 all Tm values measured are provided.

The Tm values for the titration of psaK1 with putrescine and norspermidine demonstrated a destabilizing effect for putrescine, in which the Tm value decreased by 5 °C (Figure 5C), and in contrast a stabilizing effect for norspermidine, in which the Tm value increased by 6 °C (Figure 5C). These observations suggest that putrescine and norspermidine can impact the G4 stability differently. High polyamine concentrations can have destabilization effects; in the case of psaK1, the high putrescine concentration appears to destabilize the fold, and the effect was observed starting at 10 mM and was essentially the same from 50–500 mM. With respect to the norspermidine titrations, the effect was first observed at 1 mM norspermidine. The increase in stability leveled out with the addition of 10 mM norspermidine. The increase in stability may be due to a favorable electrostatic interaction of the charged norspermidine with the phosphate backbone. It could also be due to a shift in the structural equilibrium observed with the addition of 10 mM norspermidine. Shifting the population from a dynamic mixture to favor a parallel-folded G4 could also increase the Tm values because parallel-folded G4s are typically more stable than hybrid or antiparallel-folded G4s in the mixture. Overall, the results support the folding of the psaK1 G4 in the physiologically relevant concentrations of norspermidine, while putrescine may slightly destabilize the G4 fold; however, the thermal stability measured with 500 mM putrescine present was still >30 °C above the growth temperature of C. reinhardtii.
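The Tm shifts and the point at which a titration levels out can be tabulated with a few lines of code. This is a hedged sketch rather than the authors' procedure: the function names and the 1 °C tolerance are assumptions, and the putrescine series below is an illustrative reconstruction from the values stated in the text (a psaK1 Tm of 65.0 °C without polyamine and a decrease of about 5 °C that is essentially constant from 50 to 500 mM), not the measured data set.

```python
from typing import List, Sequence


def delta_tm(tm_values_c: Sequence[float]) -> List[float]:
    """Tm shift of each titration point relative to the first (additive-free) point."""
    if not tm_values_c:
        raise ValueError("at least one Tm value is required")
    reference = tm_values_c[0]
    return [tm - reference for tm in tm_values_c]


def plateau_onset(concentrations_mm: Sequence[float],
                  tm_values_c: Sequence[float],
                  tolerance_c: float = 1.0) -> float:
    """Lowest concentration after which every Tm stays within tolerance_c of that point."""
    if len(concentrations_mm) != len(tm_values_c) or not tm_values_c:
        raise ValueError("inputs must be non-empty and equal in length")
    for i, tm_i in enumerate(tm_values_c):
        if all(abs(tm_j - tm_i) <= tolerance_c for tm_j in tm_values_c[i:]):
            return concentrations_mm[i]
    # Not reached: the final point always satisfies the condition above.
    return concentrations_mm[-1]


# Illustrative psaK1 putrescine series reconstructed from the text (not raw data).
putrescine_mm = [0.0, 50.0, 100.0, 500.0]
psak1_tm_c = [65.0, 60.0, 60.0, 60.0]

print(delta_tm(psak1_tm_c))                      # [0.0, -5.0, -5.0, -5.0]
print(plateau_onset(putrescine_mm, psak1_tm_c))  # 50.0
```

The plateau test only compares sampled points, so the reported onset is limited by the spacing of the titration series.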

Structural shifts induced by the introduction of MgCl2·6H2O. Folding of PQSs to G4s has been reported in the presence of metal ions other than K+. Divalent cations such as Sr2+, Ca2+, Mg2+, and Ba2+ have been demonstrated to affect G4 folding;49,66,77 for example, Ca2+ can shift a mixed fold or an antiparallel fold to a parallel-folded G4.49 Additionally, G4s have been proposed as metal ion sensors.78 Soil algae such as C. reinhardtii are exposed to drastic changes in soil moisture and salt concentrations in their natural environment.43,45 Considering that G4s may function as metal ion sensors, we examined whether the characterized G4s exhibiting a mixed fold would display structural shifts with the addition of Mg2+. Two G4s were selected for the study; the G4s ERCC3 (Figure 2) and psaK1 (Figure 2) are both polymorphic and exhibit a mixture of

folds with the monovalent cations, and thus had the potential to show a structural shift upon addition of Mg2+. The CD spectra of ERCC3 and psaK1 were measured at 0 mM, 1 mM, 5 mM, 10 mM, 20 mM, and 50 mM Mg2+ (Figures 6A and 6B). With respect to ERCC3, the CD spectra demonstrated an increase in the intensity of the signal with increasing Mg2+ concentration. This increase may result from stabilization of the fold by Mg2+ interacting with the phosphate backbone of the DNA. It could also be due to a shift from an intramolecular folded structure to the formation of intermolecular G4s. Oligomerization events have been reported to be induced by introduction of Mg2+.79 Further studies to understand whether high Mg2+ concentrations lead to oligomerization were not pursued. Lastly, Tm measurements were taken at varying concentrations of G4 in the presence of 50 mM Mg2+; changes in the Tm value with a change in concentration reflect a multimolecular fold, while an insignificant change in the Tm value supports a unimolecular fold. In the concentration-dependent Tm studies, no changes in the Tm values were observed. These results support the G4 being intramolecularly folded after the addition of Mg2+. On the basis of the Tm measurements, Mg2+ stabilized the ERCC3 G4 by approximately 10 °C (Figure 6C). While this is a significant increase in stability, the stability of ERCC3 did not increase with higher concentrations of Mg2+. This may result from saturation of the binding sites on the G4 fold by Mg2+ at the first point of the titration series.
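The concentration-dependence test described above can be sketched as a slope estimate: a unimolecular fold should give a Tm that is essentially flat across strand concentrations, whereas a multimolecular fold should give a Tm that rises with concentration. The code below is an assumption-laden illustration, not the authors' analysis; the function name, the strand concentrations, and the Tm values are hypothetical.

```python
import math
from typing import Sequence


def tm_slope_per_decade(strand_conc_um: Sequence[float],
                        tm_values_c: Sequence[float]) -> float:
    """Least-squares slope of Tm (°C) versus log10 of strand concentration."""
    n = len(strand_conc_um)
    if n < 2 or n != len(tm_values_c):
        raise ValueError("need at least two paired measurements")
    xs = [math.log10(c) for c in strand_conc_um]
    mean_x = sum(xs) / n
    mean_y = sum(tm_values_c) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("strand concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, tm_values_c))
    return sxy / sxx


# Hypothetical strand concentrations (µM) and Tm values (°C), for illustration only.
concentrations = [1.0, 10.0, 100.0]
flat_series = [62.0, 62.0, 62.0]    # behavior expected for a unimolecular fold
rising_series = [50.0, 55.0, 60.0]  # behavior expected for a multimolecular fold

print(tm_slope_per_decade(concentrations, flat_series))
print(tm_slope_per_decade(concentrations, rising_series))
```

What counts as an "insignificant" slope depends on the experimental uncertainty in Tm, so any threshold applied to this value would be a further assumption.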

[Figure 6 graphic. Panels: (A) "ERCC3 Mg2+-Dependent CD Studies" and (B) "psaK1 Mg2+-Dependent CD Studies" (0, 1, 5, 10, 20, and 50 mM), each plotting [θ] (deg·cm2·dmol-1) against wavelength (nm); (C) "ERCC3 Mg2+-Dependent Tm Studies" and (D) "psaK1 Mg2+-Dependent Tm Studies", each plotting Tm (°C) against [Mg2+] (0, 1, 5, 10, 20, and 50 mM).]

Figure 6. Analysis of the Mg2+ concentration dependency in the Tm values and CD spectra for the psaK1 and ERCC3 G4s. The Mg2+ titration CD spectra for the (A) ERCC3 and (B) psaK1 G4-forming sequences, and the Tm values for the (C) ERCC3 and (D) psaK1 sequences as Mg2+ was titrated into the samples.

With respect to psaK1, the CD spectra demonstrated a major structural shift from a mixture of folds to favor a parallel-stranded fold with increasing concentration of Mg2+ (Figure 6B). At a concentration of 10 mM Mg2+, the peak at 293 nm disappeared, signifying that the G4 existed favorably in the parallel-folded state. The Tm data identified an increase in stability with addition of Mg2+ (Figure 6D). At 10 mM Mg2+, the stabilizing effect leveled out, and no further increase in stability was observed. While this increase in stability might be due to the interaction of the Mg2+ cation with the phosphate backbone, it is more likely that the increased stability is due to the nature of

the fold itself. Parallel-folded G4s are generally more stable in comparison to antiparallel or mixed fold G4s, as has been observed in our laboratory.74

These studies demonstrated that the physiological conditions in C. reinhardtii support the folding of G4s. With respect to the ERCC3 G4, the high polyamine concentrations drastically stabilized the fold (Figure 4). Furthermore, the majority of the PQSs studied were demonstrated to fold in solution under physiologically relevant conditions. The examination of the effects of Mg2+ on both the psaK1 and ERCC3 G4s further supports that these G4s have the potential to fold in the unusual conditions of C. reinhardtii cells (Figure 6). As a whole, C. reinhardtii presents a unique context for studying DNA secondary structure in comparison to the biological context of other organisms that are currently used for studying G4s. Furthermore, these observations demonstrate that DNA repair gene promoters in C. reinhardtii also possess PQSs that could possibly function in regulating gene expression similar to our previous observations for the human genome.58

Conclusions

In the present report, we examined the GC-rich C. reinhardtii genome for PQSs. We focused our attention on promoter regions in genes involved in both photosynthesis and DNA repair. We selected 19 PQSs from a total of 17 gene promoter regions for our study, and we structurally characterized all 19 of the PQSs using four complementary methods. A total of 18 of the 19 PQSs showed a high or, in one case, medium probability of folding in vitro based on our experimental results. Due to our interest in how the biological conditions in C. reinhardtii affect G4 folding, we examined the effects of high polyamine concentration for the two most polymorphic G4s, psaK1 and ERCC3.

The ERCC3 G4 fold demonstrated no structural changes in the presence of either one of the polyamines; however, a drastic increase in thermal stability was observed in the presence of both polyamines (Figures 4 and 5). The psaK1 G4 showed a shift in the structural equilibrium in the presence of norspermidine, whereas putrescine had no visible effect (Figure 5). An increase in G4 stability was observed with added norspermidine, and a slight decrease in stability was observed for the titration of putrescine. Moreover, the addition of Mg2+ shifted the structural equilibrium of psaK1 to a purely parallel fold (Figure 6). This drastic shift appeared to be unique to the addition of Mg2+, and it was not observed with either of the polyamines. Our work demonstrates that C. reinhardtii is an ideal case for studying G4s in photosynthetic organisms. The unique physiological conditions observed in C. reinhardtii set it apart from most other eukaryotes in which the majority of G4 research has been conducted. The experiments reported here provide the initial step in examining the propensity of G4s selected from a photosynthetic organism to fold in vitro in physiologically relevant conditions. The current results will be used to further explore the prevalence of G4s in different organisms, and the results could be used for cellular studies designed to examine the biological role of G4s in C. reinhardtii.

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: XXX.

CD spectra, 1H-NMR spectra, ThT values, and Tm values of the 19 PQSs (PDF)

Acknowledgments

This work was supported by the National Science Foundation (CHE-1507813 and CHE-1808745) and by the National Science Foundation Graduate Research Fellowship Program (Fellow ID: 2014175577). The oligodeoxynucleotides were provided by the DNA/Peptide core facility at the University of Utah, which is supported in part by an NCI Cancer Center Support Grant (P30 CA042014).

Conflict of Interest

The authors declare no competing financial interests regarding these studies.

References

(1) Wing, R., Drew, H., Takano, T., Broka, C., Tanaka, S., Itakura, K., and Dickerson, R. E. (1980) Crystal structure analysis of a complete turn of B-DNA. Nature 287, 755–758.
(2) Bochman, M. L., Paeschke, K., and Zakian, V. A. (2012) DNA secondary structures: stability and function of G-quadruplex structures. Nat. Rev. Genet. 13, 770–780.
(3) Bacolla, A., and Wells, R. D. (2004) Non-B DNA conformations, genomic rearrangements, and human disease. J. Biol. Chem. 279, 47411–47414.
(4) SantaLucia, J., and Hicks, D. (2004) The thermodynamics of DNA structural motifs. Annu. Rev. Biophys. Biomol. Struct. 33, 415–440.
(5) Choi, J., and Majima, T. (2011) Conformational changes of non-B DNA. Chem. Soc. Rev. 40, 5893–5909.
(6) Lane, A. N., Chaires, J. B., Gray, R. D., and Trent, J. O. (2008) Stability and kinetics of G-quadruplex structures. Nucleic Acids Res. 36, 5482–5515.
(7) Wang, Y., and Patel, D. J. (1993) Solution structure of a parallel-stranded G-quadruplex DNA. J. Mol. Biol. 234, 1171–1183.
(8) Parkinson, G. N., Lee, M. P. H., and Neidle, S. (2002) Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 417, 876–880.

(9) Smirnov, I., and Shafer, R. H. (2000) Effect of loop sequence and size on DNA aptamer stability. Biochemistry 39, 1462–1468.
(10) Burge, S., Parkinson, G. N., Hazel, P., Todd, A. K., and Neidle, S. (2006) Quadruplex DNA: sequence, topology and structure. Nucleic Acids Res. 34, 5402–5415.
(11) Tippana, R., Xiao, W., and Myong, S. (2014) G-quadruplex conformation and dynamics are determined by loop length and sequence. Nucleic Acids Res. 42, 8106–8114.
(12) Hänsel-Hertsch, R., Di Antonio, M., and Balasubramanian, S. (2017) DNA G-quadruplexes in the human genome: detection, functions and therapeutic potential. Nat. Rev. Mol. Cell Biol. 18, 279–284.
(13) Todd, A. K., Johnston, M., and Neidle, S. (2005) Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 33, 2901–2907.
(14) Chambers, V. S., Marsico, G., Boutell, J. M., Di Antonio, M., Smith, G. P., and Balasubramanian, S. (2015) High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. 33, 877–881.
(15) Hänsel-Hertsch, R., Beraldi, D., Lensing, S. V., Marsico, G., Zyner, K., Parry, A., Di Antonio, M., Pike, J., Kimura, H., Narita, M., Tannahill, D., and Balasubramanian, S. (2016) G-quadruplex structures mark human regulatory chromatin. Nat. Genet. 48, 1267–1272.
(16) Fleming, A. M., Ding, Y., and Burrows, C. J. (2017) Oxidative DNA damage is epigenetic by regulating gene transcription via base excision repair. Proc. Natl. Acad. Sci. U. S. A. 114, 2604–2609.
(17) Fleming, A. M., Zhu, J., Ding, Y., and Burrows, C. J. (2017) 8-Oxo-7,8-dihydroguanine in the context of a gene promoter G-quadruplex is an on–off switch for transcription. ACS Chem. Biol. 12, 2417–2426.
(18) Sun, D., Liu, W.-J., Guo, K., Rusche, J. J., Ebbinghaus, S., Gokhale, V., and Hurley, L. H. (2008) Proximal promoter region of the human vascular endothelial growth factor gene has a G-quadruplex structure which can be targeted by G-quadruplex-interactive agents. Mol. Cancer Ther. 7, 880–889.
(19) Moye, A. L., Porter, K. C., Cohen, S. B., Phan, T., Zyner, K. G., Sasaki, N., Lovrecz, G. O., Beck, J. L., and Bryan, T. M. (2015) Telomeric G-quadruplexes are a substrate and site of localization for human telomerase. Nat. Commun. 6, 7643.
(20) Sauer, M., and Paeschke, K. (2017) G-quadruplex unwinding helicases and their function in vivo. Biochem. Soc. Trans. 45, 1173–1182.

(21) Estep, K. N., Butler, T. J., Ding, J., and Brosh, R. M. (2017) G4-interacting DNA helicases and polymerases: potential therapeutic targets. Curr. Med. Chem. doi:10.2174/0929867324666171116123345.
(22) Huppert, J. L., and Balasubramanian, S. (2007) G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 35, 406–413.
(23) Maizels, N., and Gray, L. T. (2013) The G4 Genome. PLOS Genet. 9, e1003468.
(24) Cogoi, S., Ferino, A., Miglietta, G., Pedersen, E. B., and Xodo, L. E. (2018) The regulatory G4 motif of the Kirsten ras (KRAS) gene is sensitive to guanine oxidation: implications on transcription. Nucleic Acids Res. 46, 661–676.
(25) Maizels, N. (2012) G4 motifs in human genes. Ann. N. Y. Acad. Sci. 1267, 53–60.
(26) Guo, J. U., and Bartel, D. P. (2016) RNA G-quadruplexes are globally unfolded in eukaryotic cells and depleted in bacteria. Science 353, 353–360.
(27) Lipps, H. J., and Rhodes, D. (2009) G-quadruplex structures: in vivo evidence and function. Trends Cell Biol. 19, 414–422.
(28) Henderson, A., Wu, Y., Huang, Y. C., Chavez, E. A., Platt, J., Johnson, F. B., Brosh, R. M., Sen, D., and Lansdorp, P. M. (2014) Detection of G-quadruplex DNA in mammalian cells. Nucleic Acids Res. 42, 860–869.
(29) Mullen, M. A., Assmann, S. M., and Bevilacqua, P. C. (2012) Toward a digital gene response: RNA G-quadruplexes with fewer quartets fold with higher cooperativity. J. Am. Chem. Soc. 134, 812–815.
(30) Kwok, C. K., Ding, Y., Shahid, S., Assmann, S. M., and Bevilacqua, P. C. (2015) A stable RNA G-quadruplex within the 5′-UTR of Arabidopsis thaliana ATR mRNA inhibits translation. Biochem. J. 467, 91–102.
(31) Yang, M., Wu, Y., Jin, S., Hou, J., Mao, Y., Liu, W., Shen, Y., and Wu, L. (2015) Flower bud transcriptome analysis of Sapium sebiferum (linn.) roxb. and primary investigation of drought induced flowering: pathway construction and G-quadruplex prediction based on transcriptome. PLoS ONE 10, e0118479.
(32) Cho, H., Cho, H. S., Nam, H., Jo, H., Yoon, J., Park, C., Dang, T. V. T., Kim, E., Jeong, J., Park, S., Wallner, E.-S., Youn, H., Park, J., Jeon, J., Ryu, H., Greb, T., Choi, K., Lee, Y., Jang, S. K., Ban, C., and Hwang, I. (2018) Translational control of phloem development by RNA G-quadruplex–JULGI determines plant sink strength. Nat. Plants 6, 376–390.

(33) Griffin, B. D., and Bass, H. W. (2018) Review: Plant G-quadruplex (G4) motifs in DNA and RNA; abundant, intriguing sequences of unknown function. Plant Sci. 269, 143–147.
(34) Mullen, M. A., Olson, K. J., Dallaire, P., Major, F., Assmann, S. M., and Bevilacqua, P. C. (2010) RNA G-Quadruplexes in the model plant species Arabidopsis thaliana: prevalence and possible functional roles. Nucleic Acids Res. 38, 8149–8163.
(35) Garg, R., Aggarwal, J., and Thakkar, B. (2016) Genome-wide discovery of G-quadruplex forming sequences and their functional relevance in plants. Sci. Rep. 6, 28211.
(36) Sun, L., Fang, L., Zhang, Z., Chang, X., Penny, D., and Zhong, B. (2016) Chloroplast phylogenomic inference of green algae relationships. Sci. Rep. 6, 20528.
(37) Baselga-Cervera, B., Costas, E., Bustillo-Avendaño, E., and García-Balboa, C. (2016) Adaptation prevents the extinction of Chlamydomonas reinhardtii under toxic beryllium. PeerJ. 4, 1823.
(38) Almeida, A. C., Gomes, T., Langford, K., Thomas, K. V., and Tollefsen, K. E. (2017) Oxidative stress in the algae Chlamydomonas reinhardtii exposed to biocides. Aquat. Toxicol. 189, 50–59.
(39) Gao, X., Zhang, F., Hu, J., Cai, W., Shan, G., Dai, D., Huang, K., and Wang, G. (2016) MicroRNAs modulate adaption to multiple abiotic stresses in Chlamydomonas reinhardtii. Sci. Rep. 6, 38228.
(40) Barahimipour, R., Strenkert, D., Neupert, J., Schroda, M., Merchant, S. S., and Bock, R. (2015) Dissecting the contributions of GC content and codon usage to gene expression in the model alga Chlamydomonas reinhardtii. Plant J. 84, 704–717.
(41) Yang, M., Jiang, J.-P., Xie, X., Chu, Y.-D., Fan, Y., Cao, X.-P., Xue, S., and Chi, Z.-Y. (2017) Chloroplasts isolation from Chlamydomonas reinhardtii under nitrogen stress. Front. Plant Sci. 8, 1503.
(42) Silflow, C. D., and Lefebvre, P. A. (2001) Assembly and motility of eukaryotic cilia and flagella. Lessons from Chlamydomonas reinhardtii. Plant Physiol. 127, 1500–1507.
(43) Wang, Z. T., Ullrich, N., Joo, S., Waffenschmidt, S., and Goodenough, U. (2009) Algal lipid bodies: stress induction, purification, and biochemical characterization in wild-type and starchless Chlamydomonas reinhardtii. Eukaryot. Cell 8, 1856–1868.
(44) Vega, J. M., Garbayo, I., Domínguez, M. J., and Vigara, J. (2006) Effect of abiotic stress on photosynthesis and respiration in Chlamydomonas reinhardtii: Induction of oxidative stress. Enzyme Microb. Technol. 40, 163–167.

(45) Mendez-Alvarez, S., Leisinger, U., and Eggen, R. I. (1999) Adaptive responses in Chlamydomonas reinhardtii. Int. Microbiol. 2, 15–22.
(46) Wang, S., Zhao, S.-X., Wei, C.-L., Yu, S.-Y., Shi, J.-P., and Zhang, B.-G. (2014) Effect of magnesium deficiency on photosynthetic physiology and triacylglyceride (TAG) accumulation of Chlorella vulgaris. Huan Jing Ke Xue 35, 1462–1467.
(47) Blume, S. W., Guarcello, V., Zacharias, W., and Miller, D. M. (1997) Divalent transition metal cations counteract potassium-induced quadruplex assembly of oligo(dG) sequences. Nucleic Acids Res. 25, 617–625.
(48) Hardin, C. C., Perry, A. G., and White, K. (2000) Thermodynamic and kinetic characterization of the dissociation and assembly of quadruplex nucleic acids. Biopolymers 56, 147–194.
(49) Miyoshi, D., Nakao, A., and Sugimoto, N. (2003) Structural transition from antiparallel to parallel G-quadruplex of d(G4T4G4) induced by Ca2+. Nucleic Acids Res. 31, 1156–1163.
(50) Mei, Y.-H., Wilson, T., and Khan, A. U. (1993) Polyamine content and cell cycle in a unicellular alga. Data are consistent with polyamines’ function of protecting DNA against oxidative damage by singlet oxygen. Quimica Nova 16, 337–342.
(51) Pegg, A. E. (2016) Functions of polyamines in mammals. J. Biol. Chem. 291, 14904–14912.
(52) Miller-Fleming, L., Olin-Sandoval, V., Campbell, K., and Ralser, M. (2015) Remaining mysteries of molecular biology: the role of polyamines in the cell. J. Mol. Biol. 427, 3389–3406.
(53) Naoyoshi, N., Fujihara, S., and Nishijima, T. (2006) Changes in intracellular polyamine concentration during growth of Heterosigma akashiwo (Raphidophyceae). Fish. Sci. 72, 350–355.
(54) Nishibori, N., and Nishijima, T. (2004) Changes in polyamine levels during growth of a red-tide causing phytoplankton Chattonella antiqua (Raphidophyceae). Eur. J. Phycol. 39, 51–55.
(55) Kumar, S. V., Basu, B., and Rajam, M. (2006) Modulation of polyamine levels influence growth and cell division in Chlamydomonas reinhardtii. Physiol. Mol. Biol. Plants 12, 53–58.
(56) Sun, H., Xiang, J., Liu, Y., Li, L., Li, Q., Xu, G., and Tang, Y. (2011) A stabilizing and denaturing dual-effect for natural polyamines interacting with G-quadruplexes depending on concentration. Biochimie 93, 1351–1356.

(57) Pegg, A. E. (2009) Mammalian polyamine metabolism and function. IUBMB Life 61, 880–894.
(58) Fleming, A. M., and Burrows, C. J. (2017) 8-Oxo-7,8-dihydroguanine, friend and foe: Epigenetic-like regulator versus initiator of mutagenesis. DNA Repair 56, 75–83.
(59) Huppert, J. L., and Balasubramanian, S. (2005) Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 33, 2908–2916.
(60) Manova, V., and Gruszka, D. (2015) DNA damage and repair in plants – from models to crops. Front. Plant Sci. 6, 885.
(61) Phan, A. T., Kuryavyi, V., Luu, K. N., and Patel, D. J. (2007) Structure of two intramolecular G-quadruplexes formed by natural human telomere sequences in K+ solution. Nucleic Acids Res. 35, 6517–6525.
(62) Adrian, M., Heddi, B., and Phan, A. T. (2012) NMR spectroscopy of G-quadruplexes. Methods 57, 11–24.
(63) Kendrick, S., and Hurley, L. H. (2010) The role of G-quadruplex/i-motif secondary structures as cis-acting regulatory elements. Pure Appl. Chem. 82, 1609–1621.
(64) Patel, D. J., Phan, A. T., and Kuryavyi, V. (2007) Human telomere, oncogenic promoter and 5’-UTR G-quadruplexes: diverse higher order DNA and RNA targets for cancer therapeutics. Nucleic Acids Res. 35, 7429–7455.
(65) Piazza, A., Adrian, M., Samazan, F., Heddi, B., Hamon, F., Serero, A., Lopes, J., Teulade-Fichou, M.-P., Phan, A. T., and Nicolas, A. (2015) Short loop length and high thermal stability determine genomic instability induced by G-quadruplex-forming minisatellites. EMBO J. 34, 1718–1734.
(66) Venczel, E. A., and Sen, D. (1993) Parallel and antiparallel G-DNA structures from a complex telomeric sequence. Biochemistry 32, 6220–6228.
(67) del Villar-Guerra, R., Trent, J. O., and Chaires, J. B. (2018) G-quadruplex secondary structure obtained from circular dichroism spectroscopy. Angew. Chem. Int. Ed. 57, 7171–7175.
(68) Renaud de la Faverie, A., Guédin, A., Bedrat, A., Yatsunyk, L. A., and Mergny, J.-L. (2014) Thioflavin T as a fluorescence light-up probe for G4 formation. Nucleic Acids Res. 42, e65.
(69) You, H., Wu, J., Shao, F., and Yan, J. (2015) Stability and kinetics of c-MYC promoter G-quadruplexes studied by single-molecule manipulation. J. Am. Chem. Soc. 137, 2424–2427.

(70) Yang, D., and Hurley, L. H. (2006) Structure of the biologically relevant G-quadruplex in the c-MYC promoter. Nucleosides Nucleotides Nucleic Acids 25, 951–968.
(71) Zhou, H., Wu, Z.-F., Han, Q.-J., Zhong, H.-M., Peng, J.-B., Li, X., and Fan, X.-L. (2018) Stable and label-free fluorescent probe based on G-triplex DNA and Thioflavin T. Anal. Chem. 90, 3220–3226.
(72) Huppert, J. L. (2008) Four-stranded nucleic acids: structure, function and targeting of G-quadruplexes. Chem. Soc. Rev. 37, 1375–1384.
(73) Bedrat, A., Lacroix, L., and Mergny, J.-L. (2016) Re-evaluation of G-quadruplex propensity with G4Hunter. Nucleic Acids Res. 44, 1746–1759.
(74) Fleming, A. M., Zhu, J., Ding, Y., Visser, J. A., Zhu, J., and Burrows, C. J. (2018) Human DNA repair genes possess potential G-quadruplex sequences in their promoters and 5’-untranslated regions. Biochemistry 57, 991–1002.
(75) Weeda, G., van Ham, R. C. A., Vermeulen, W., Bootsma, D., van der Eb, A. J., and Hoeijmakers, J. H. J. (1990) A presumed DNA helicase encoded by ERCC-3 is involved in the human repair disorders xeroderma pigmentosum and Cockayne’s syndrome. Cell 62, 777–791.
(76) Ozawa, S.-I., Onishi, T., and Takahashi, Y. (2010) Identification and characterization of an assembly intermediate subcomplex of photosystem I in the green alga Chlamydomonas reinhardtii. J. Biol. Chem. 285, 20072–20079.
(77) Chen, F. M. (1992) Strontium(2+) facilitates intermolecular G-quadruplex formation of telomeric sequences. Biochemistry 31, 3769–3776.
(78) Ruttkay-Nedecky, B., Kudr, J., Nejdl, L., Maskova, D., Kizek, R., and Adam, V. (2013) G-quadruplexes as sensing probes. Mol. Basel Switz. 18, 14760–14779.
(79) Kolesnikova, S., Hubálek, M., Bednárová, L., Cvačka, J., and Curtis, E. A. (2017) Multimerization rules for G-quadruplexes. Nucleic Acids Res. 45, 8684–8696.

TOC Graphic

[TOC graphic. Labels: Addition of Mg2+; Addition of Polyamines; CHLAMY; 5’ and 3’ strand ends; Stabilization (Tm).]