Detection of G-Quadruplex Structures Formed by G-Rich Sequences

7 Jul 2017 - Beijing National Laboratory for Molecular Sciences, Key Laboratory of ... Research/Education Center for Excellence in Molecular Sciences,...

Article

Detection of G-quadruplex structures formed by G-rich sequences from rice genome and transcriptome using combined probes Tianjun Chang, Weiguo Li, Zhan Ding, Shaofei Cheng, Kun Liang, Xiangjun Liu, Tao Bing, and Dihua Shangguan Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b01992 • Publication Date (Web): 07 Jul 2017 Downloaded from http://pubs.acs.org on July 7, 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Detection of G-quadruplex structures formed by G-rich sequences from rice genome and transcriptome using combined probes

Tianjun Chang1, Weiguo Li1, Zhan Ding1, Shaofei Cheng1, Kun Liang1, Xiangjun Liu2,3, Tao Bing2,3, Dihua Shangguan2,3,*

1 Department of Biology, Institute of Resources and Environment, Henan Polytechnic University, Jiaozuo, 454000, P. R. China
2 Beijing National Laboratory for Molecular Sciences, Key Laboratory of Analytical Chemistry for Living Biosystems, CAS Research/Education Center for Excellence in Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences, Beijing, 100190, P. R. China
3 University of the Chinese Academy of Sciences, Beijing 100049, China

*Corresponding author: Tel & Fax: +86-10-62528509; E-mail: [email protected] (Dr. D. Shangguan)

Keywords: G-quadruplexes; DNAzyme; fluorescent probe; Oryza sativa

ABSTRACT: Putative G-quadruplex (G4)-forming sequences (PQS) are highly prevalent in the genomes and transcriptomes of various organisms and are considered potential regulatory elements in many biological processes through the formation of G4 structures. G4 formation depends strongly on the sequence and the environment. In most cases, it is difficult to predict whether a PQS will form a G4, especially for PQS containing G2 tracts. Experimental identification of G4 formation is therefore essential in the study of G4-related biological functions. Herein, we report a rapid and simple method for detecting G4 structures by using a pair of complementary reporters, hemin and BMSP. The method was applied to G4 structures formed by PQS (DNA and RNA) identified in the genome and transcriptome of Oryza sativa. Unlike most reported G4 probes, which recognize only a subset of G4 structures, the combined probes responded positively to almost all G4 conformations, including parallel, antiparallel and mixed/hybrid G4s, but did not respond to non-G4 sequences. The method shows potential for high-throughput identification of G4 structures in genomes and transcriptomes. Furthermore, BMSP was observed to drive some PQS to form more stable G4 structures, or to induce G4 formation in some PQS that cannot form G4s under normal physiological conditions, which may provide a powerful molecular tool for gene regulation.

G-quadruplexes (G4s) are four-stranded structures adopted by G-rich DNA or RNA sequences, in which four guanine bases form stacked arrays of G-quartets via Hoogsteen base pairs with the help of monovalent cations, typically K+ or Na+.1 G4s have multiple conformations, including parallel, antiparallel and mixed/hybrid topologies.2 Bioinformatics studies have reported that putative G-quadruplex forming sequences (PQS) are highly prevalent in the human genome and transcriptome, for example in promoters, telomeres and the 5′- or 3′-untranslated regions (UTRs) of mRNA.3,4 G4 structures have been recognized as potential regulatory elements for multiple cellular processes such as replication, transcription and genomic maintenance. Although G4 research has made great progress on the human genome, few studies have focused on plant genomes. It has been reported that under drought or high salinity stress, the K+ concentration in plant cells can increase to as much as 700 mM, which might drive G4 formation.5 High temperature usually affects G4 formation and can even alter G4 structures, which implies that some PQS might act as regulators of gene expression under heat stress in plant cells.6 Therefore, studies of the distribution and structure of PQS from plants may offer new clues for understanding their responses to environmental stresses.

So far, the formation of G4 structures by many PQS is supported by physical evidence in vitro.7 The most common sequence motif used for bioinformatics searching of PQS is (G3+L1–7)3+G3+ (G3 motif), where L refers to any base.8 However, accumulating evidence shows that many sequences that do not fulfill the G3 motif do fold into G4s, such as G4s with bulges9 or hairpin loops,10 and guanine-vacancy-bearing G4s (a G4 that contains one G2 tract and three G3 tracts).11,12 DNA/RNA sequences with four G2 tracts (G2 motif) are able to form two-quartet G4s as well.13,14 Bioinformatics analysis based on the G2 motif ((G2+L1–7)3+G2+) has revealed a much higher density of PQS in the genome and transcriptome of the model plant species Arabidopsis thaliana,5 especially in genes that are differentially regulated by drought5 or upregulated under heat stress.6 Compared to G4s formed by G3 tracts, G4s formed by G2 tracts have fewer quartets to contribute enthalpy, which usually results in low thermal stability.15 Loop and flanking sequences have been reported to affect G4 formation; for example, cytosine bases in the flanks16 or loops17 might inhibit the folding of G4s. Moreover, the type and concentration of metal ions in the buffer influence G4 formation.18,19 Thus, many PQS (especially sequences containing G2 tracts) may not form G4 structures under certain conditions.5 Therefore, it is essential to experimentally verify whether these PQS actually adopt G4 structures.

Various techniques are available to assess the formation of G4 structures, such as melting temperature determination,20 thermal or isothermal difference spectra,21 circular dichroism (CD)22 and nuclear magnetic resonance (NMR);23 these methods are relatively time-consuming and require specialized equipment. To identify the G4 structures of the numerous PQS found by a bioinformatics search, rapid and simple methods are necessary. Additionally, because PQS may form multiple G4 conformations, the detection method should be able to distinguish different G4 conformations from non-G4 sequences.

Small-molecule probes that bind selectively to G4s have attracted significant attention recently. Hemin has been reported to bind selectively to G4s, especially parallel G4s,24,25 and the resulting complex exhibits peroxidase activity (the G4/hemin complex is termed a DNAzyme or RNAzyme).26,27 This DNAzyme has been widely used for G4-based label-free analysis.28-30 Fluorescent probes that bind selectively to G4s also show potential for G4 detection.31-37 However, most G4 probes are selective for parallel G4s and respond poorly to other G4 conformations. Recently, we reported a fluorescent G4 probe, 2,9-bis[4-(4-methylpiperazin-1-yl)styryl]-1,10-phenanthroline (BMSP), which showed high affinity for G4s, with the strongest affinity for the antiparallel human telomere G4, and did not bind to non-G4 sequences,38 suggesting that BMSP might be able to distinguish multiple G4 conformations from non-G4 sequences.

Rice (Oryza sativa) is one of the most important crop plants for human consumption. Drought, salt and heat stress are the major environmental factors that limit the productivity of rice.39 To understand the potential functions of PQS in rice, in this study we first analyzed the PQS in the genome and transcriptome of O. sativa Japonica by using the G2 and G3 motifs as the search patterns. We then developed a simple method to detect G4 structures by using BMSP and hemin as a pair of reporters. Using this method, we tested a group of PQS picked from the O. sativa genome and transcriptome. The fluorescence enhancement of BMSP and the increase in the catalytic rate of hemin in the presence of DNA or RNA PQS were measured to evaluate the formation of G4 structures. The detected G4 structures were confirmed by CD spectroscopy and polyacrylamide gel electrophoresis (PAGE).

EXPERIMENTAL SECTION

Materials. DNA and RNA were synthesized and HPLC purified by Sangon Biotech Co. Ltd (Shanghai, China). DNA solutions were prepared in Tris-HCl buffer (25 mM, pH 7.65) and stored at -20 °C. RNA was dissolved in diethylpyrocarbonate (DEPC)-treated water and stored at -80 °C. All RNA experiments were performed under RNase- and DNase-free conditions. To form G4 structures, the DNA and RNA solutions were heated for 5 min at 95 °C and then annealed slowly to room temperature (RT) in 150 mM KCl or NaCl. 2,9-Bis[4-(4-methylpiperazin-1-yl)styryl]-1,10-phenanthroline (BMSP) was synthesized and characterized as described in our previous work.38 Hemin was purchased from MP Biomedicals (France). BMSP and hemin stock solutions (10 mM each) were prepared in dimethyl sulfoxide (DMSO) and stored in the dark at -20 °C. 2,2′-Azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) was purchased from Sigma–Aldrich (USA). Other reagents were purchased from China National Pharmaceutical Group Corporation (Shanghai, China). All solutions were prepared with ultrapure water (resistivity of 18.3 MΩ·cm).

Bioinformatics. Genomic sequences of O. sativa Japonica were obtained from the Rice Annotation Project Database in EnsemblPlants (http://www.plants.ensembl.org/; sequence assembly accession GCA_001433935.1). The sequences of 5′- and 3′-UTRs were downloaded from EnsemblPlants (release 28) via the BioMart interface (version 0.7, http://www.plants.ensembl.org/biomart/). PQS were searched in the whole genome and transcriptome, including genic and intergenic regions, 5′-UTRs, 3′-UTRs and coding sequences (CDS), by a Perl program developed by Tan’s group.40 The search pattern was (Gx+L1–N)3+Gx+, where x is 2 or 3, N ≤ 7, the range of “3+” is from 3 to 6, and L corresponds to any base (A, G, C, T or U). The PQS in the whole genome were taken from both sense and antisense strands by searching with G and C patterns,5 while PQS in the UTR and CDS regions were taken only from the coding strands.

Fluorescence measurement. DNA or RNA (1 µM) and BMSP (0.5 µM, containing 0.5% DMSO) were mixed to a final volume of 400 µL, and the mixtures were incubated for 30 min at room temperature in the dark. Fluorescence spectra from 390 to 650 nm were recorded on a fluorescence spectrometer (F-7000, Hitachi, Japan) in a cuvette of 10 mm path length with excitation at 338 nm. The fluorescence intensity at 512 nm was taken as the response of BMSP to DNA or RNA G4s.

Catalytic oxidation of ABTS by the G4/hemin complexes. This set of experiments was performed as previously reported.25 In detail, hemin was freshly diluted in working solution containing 0.025% (v/v) Triton X-100 and 0.5% (v/v) DMSO. Freshly prepared hemin was added to the DNA or RNA solutions and incubated for 1 h at RT.
Then 2 mM ABTS and 0.4 mM H2O2 were added to the solutions to initiate the reaction. The increase in absorbance at 415 nm (the radical anion ABTS•-) was measured as a function of time on a UV-visible spectrophotometer (UV-2550, Shimadzu, Japan). The initial rates were calculated from the slope of the initial linear portion (the first 20 seconds) of the increase in absorbance at 415 nm. All kinetic measurements were repeated at least three times.

CD experiments. The CD spectra of DNA and RNA samples were measured in 25 mM Tris-HCl buffer (pH 7.65) containing 150 mM KCl or NaCl at 25 °C. Spectra between λ = 220 and 320 nm were collected on a J-1500 or J-815 spectrometer (JASCO Ltd., Japan) using a quartz cuvette of 10 mm path length at a scan rate of 200 nm/min. To measure the CD spectra of G4s in the presence of BMSP or hemin, 5 µM BMSP or hemin was added to the samples, which were then incubated for 1 h before the measurement. For each sample, three scans were averaged and the spectrum of the buffer was subtracted.

Polyacrylamide gel electrophoresis (PAGE). DNA samples (1 µM) were denatured at 95 °C for 5 min and then renatured slowly to RT in buffer containing 150 mM KCl. Hemin and BMSP were added to the DNA solutions, which were then incubated for no less than 1 h before electrophoresis. Native PAGE was performed on 20% acrylamide gels at 4 °C in TBE buffer (89 mM Tris, 2 mM Na2EDTA and 89 mM boric acid, pH 8.3). KCl (20 mM) was added to the running buffer and gels to maintain the G4 conformations throughout the electrophoresis process. Gels were visualized by silver staining and imaged with an AlphaImager system (Alpha Innotech, USA).
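The initial-rate calculation described for the ABTS assay (slope of the first 20 seconds of the absorbance trace) can be illustrated with a short Python sketch. This is not the procedure actually used in this work: the function names `initial_rate` and `rate_enhancement`, the ordinary least-squares fit, and the example readings are all assumptions made for illustration, and the first reading is assumed to be taken at t = 0 when the reaction is initiated.

```python
def initial_rate(times_s, absorbance, window_s=20.0):
    """Least-squares slope of A415 versus time over the first `window_s` seconds."""
    pts = [(t, a) for t, a in zip(times_s, absorbance) if t <= window_s]
    if len(pts) < 2:
        raise ValueError("need at least two readings inside the window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_a = sum(a for _, a in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in pts)
    if sxx == 0:
        raise ValueError("readings must be taken at different times")
    return sxy / sxx  # absorbance units per second


def rate_enhancement(v, v0):
    """Fold enhancement V/V0: rate with DNA or RNA over rate with hemin alone."""
    return v / v0


# Hypothetical traces (not measured data): readings every 5 s.
times = [0, 5, 10, 15, 20, 25, 30]
with_dna = [0.000, 0.050, 0.100, 0.150, 0.200, 0.230, 0.250]
hemin_only = [0.000, 0.005, 0.010, 0.015, 0.020, 0.025, 0.030]

v = initial_rate(times, with_dna)
v0 = initial_rate(times, hemin_only)
print(f"V/V0 = {rate_enhancement(v, v0):.1f}")
```

Restricting the fit to the first 20 seconds keeps the later, curved part of the trace (where the slope falls off) out of the rate estimate.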

RESULTS AND DISCUSSION

Bioinformatics analysis of PQS in the O. sativa genome. PQS in the genome of O. sativa Japonica were analyzed by using the G-quadruplex forming sequence motifs (G2+L1–7)3+G2+, (G2+L1–4)3+G2+, (G3+L1–7)3+G3+ and (G3+L1–4)3+G3+ as search patterns (Table 1). In the genome of O. sativa, the number of PQS matching the (G2+L1–7)3+G2+ motif is 1,523,492, a density of 4,081.7 per Mb; the number of PQS matching the (G3+L1–7)3+G3+ motif is 41,597, a density of 111.4 per Mb. The search results for the (G2+L1–4)3+G2+ and (G3+L1–4)3+G3+ motifs showed a similar tendency (Table 1). The numbers and densities of PQS in O. sativa are much higher than those in the A. thaliana genome.5 Additionally, the density of PQS containing the G2 motif in the coding region (a part of the genic region) is about 2-fold higher than that in the genic and intergenic regions. PQS in the transcriptome were also examined (Table S1). In the genic region, 43% of G2 motifs and 26% of G3 motifs are transcribed into mRNA (comprising CDS and 5′- and 3′-UTRs). These results indicate a preference for PQS with G2 motifs in the O. sativa genome and transcriptome.
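The motif search can be sketched with a regular expression. The Python snippet below is only an illustration and is not the Perl program used in this work: the function names `pqs_regex` and `find_pqs` are hypothetical, "3+" is read here as 3 to 6 repeats of the tract-plus-loop unit, matches are reported without overlap, and PQS on the antisense strand are located by running the equivalent C pattern on the sense strand. Counts from such a sketch would depend on these conventions and need not reproduce Table 1.

```python
import re


def pqs_regex(x=2, max_loop=7, base="G"):
    """Regex for (Bx+ L1-N){3,6} Bx+, with B = G (sense) or C (antisense)."""
    tract = f"{base}{{{x},}}"           # e.g. G{2,}: a tract of at least x bases
    loop = f"[ACGTU]{{1,{max_loop}}}"   # loop of 1..N arbitrary bases
    return re.compile(f"(?:{tract}{loop}){{3,6}}{tract}")


def find_pqs(seq, x=2, max_loop=7):
    """Return sorted (start, end, strand, match) tuples for G and C patterns."""
    seq = seq.upper()
    hits = []
    for strand, base in (("+", "G"), ("-", "C")):
        for m in pqs_regex(x, max_loop, base).finditer(seq):
            hits.append((m.start(), m.end(), strand, m.group()))
    return sorted(hits)


# Toy sequence (not from the rice genome): four G2 tracts with 1-nt loops.
for hit in find_pqs("TTGGAGGTGGAGGTT", x=2, max_loop=4):
    print(hit)
```

Because loops may themselves contain G (or C), a single stretch of sequence can often be parsed in several ways; how overlapping or nested candidates are counted is a design choice that this sketch resolves simply by taking the leftmost non-overlapping matches.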

Table 1. Distribution of PQS motifs in the O. sativa genome.

| PQS motif | Genome Number(a) | Genome D(b) | Intergenic Number(a) | Intergenic D(b) | Intergenic E(c) | Genic Number(a) | Genic D(b) | Genic E(c) | Coding Number(a) | Coding D(b) | Coding E(c) |
| (G2+L1–7)3+G2+ | 1,523,492 | 4,081.7 | 1,014,807 | 3,991.6 | 0.98 | 508,685 | 4,274.4 | 1.05 | 337,592 | 8,119.6 | 1.99 |
| (G2+L1–4)3+G2+ | 817,209 | 2,189.5 | 532,368 | 2,094.0 | 0.96 | 284,841 | 2,393.5 | 1.09 | 191,362 | 4,605.5 | 2.10 |
| (G3+L1–7)3+G3+ | 41,597 | 111.4 | 31,415 | 123.6 | 1.11 | 10,182 | 85.6 | 0.77 | 3,989 | 96 | 0.86 |
| (G3+L1–4)3+G3+ | 19,039 | 51 | 13,576 | 53.4 | 1.05 | 5,463 | 45.9 | 0.90 | 1,742 | 41.5 | 0.81 |

a The PQS number in the genome was taken from both sense and antisense strands with G and C search patterns. The whole genome comprises the genic and intergenic regions, and the genic region contains the coding region.
b PQS density (D) is defined as the total number of PQS per megabase (PQS/Mb) in the specified region. Number of megabases per region: whole genome 373.245 Mb, genic region 119.007 Mb, intergenic region 254.239 Mb and coding region 41.551 Mb.
c Enrichment (E) is defined as the ratio of the PQS density in the specified region to that in the whole genome.
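The density and enrichment definitions in the footnotes amount to two divisions, sketched below in Python. The names `REGION_SIZE_MB`, `pqs_density` and `enrichment` are hypothetical; the region sizes and the counts used in the example are the (G2+L1–7)3+G2+ values quoted in Table 1.

```python
# Region sizes in megabases, as given in footnote b of Table 1.
REGION_SIZE_MB = {
    "genome": 373.245,
    "genic": 119.007,
    "intergenic": 254.239,
    "coding": 41.551,
}


def pqs_density(count, region):
    """PQS density D: number of PQS per megabase of the given region."""
    return count / REGION_SIZE_MB[region]


def enrichment(region_density, genome_density):
    """Enrichment E: density in a region relative to the whole genome."""
    return region_density / genome_density


# (G2+L1-7)3+G2+ counts from Table 1.
d_genome = pqs_density(1_523_492, "genome")
d_genic = pqs_density(508_685, "genic")
print(f"D(genome) = {d_genome:.1f} PQS/Mb")
print(f"D(genic)  = {d_genic:.1f} PQS/Mb")
print(f"E(genic)  = {enrichment(d_genic, d_genome):.2f}")
```

An enrichment above 1 means the motif is denser in that region than in the genome as a whole, which is how the roughly 2-fold value for the coding region should be read.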

Identification of G4 structures formed by DNA PQS. To determine whether the PQS found in O. sativa can fold into G4 structures, we developed a rapid and simple method using BMSP and hemin as probes. Forty-six DNA sequences, mainly containing the (G2+L1–4)3+G2+ motif, were chosen from the O. sativa genome and transcriptome for investigation (Table S2). Some of them were picked from genes related to the response to heat stress (GO:0009408), water deprivation (GO:0009414) and salt stress (GO:0009651) listed in the China Rice Data Center (http://www.ricedata.cn/ontology). Sequences containing the (G3+L1–7)3+G3+ motif were also selected to confirm the applicability of the detection method. All the DNA sequences were prepared in Tris-HCl buffer (pH 7.65) containing 150 mM KCl.

The enhancements of the catalytic rate of hemin and of the fluorescence intensity of BMSP by the different DNA sequences were measured (Figure 1). Sixteen sequences weakly enhanced the catalytic rate of hemin (< 3-fold), and 12 of them (75%) also weakly increased the fluorescence of BMSP (< 3-fold). Eighteen sequences greatly enhanced the catalytic rate of hemin (> 15-fold), and 13 of them (72%) moderately increased the fluorescence of BMSP (3- to 15-fold). Twelve sequences moderately enhanced the catalytic rate of hemin (3- to 15-fold), but 7 of them (about 60%) greatly increased the fluorescence of BMSP (> 15-fold); in particular, sequences Oshp18-1 and FP65 increased the BMSP fluorescence by ~40-fold. Additionally, sequences OsCI-7 and Oshp18-3 had very little effect on the catalytic rate of hemin but increased the BMSP fluorescence by more than 20-fold. These results indicate that BMSP exhibits a high fluorescence response to DNA sequences with relatively low DNAzyme activity.

All the experiments were then transferred to 96-well plates. In addition to reading the results with a plate reader, the enhancement of the catalytic activity of hemin could be observed directly in daylight, and the fluorescence of BMSP was visible under UV light. The results observed by the naked eye matched well with those obtained with a UV-visible spectrophotometer and a fluorescence spectrometer (Figure 1B). This set of results suggests that the method could be used for high-throughput primary screening of G4s among a large number of PQS.

The DNAzyme activity of G4s is known to be structure-dependent: intramolecular parallel G4s generally show higher activity than antiparallel G4s, and non-G4 sequences do not have DNAzyme activity.41,42 BMSP has been reported to bind G4s through π-π stacking onto the terminal G-quartets.38 G4 binding provides a hydrophobic environment for BMSP and reduces its interaction with water, which decreases nonradiative relaxation and enhances BMSP fluorescence. In addition, because the loops of antiparallel G4s lie over their terminal surfaces, they provide a more hydrophobic pocket for BMSP than parallel G4s do. BMSP therefore has a higher fluorescence response to the antiparallel G4 (human telomere) than to parallel or mixed G4 conformations.38

Circular dichroism (CD) spectra provide information on G4 structures: parallel G4s exhibit a dominant positive maximum at 264 nm and a negative minimum at 240 nm, antiparallel G4s exhibit a dominant positive maximum at 295 nm and a negative minimum at 260 nm, and mixed/hybrid G4s exhibit a positive maximum at 295 nm (plus a positive shoulder near 265 nm) and a negative minimum around 235–240 nm.1 To examine the relationship between the response to hemin/BMSP and the structures of these sequences, their CD spectra were measured. As expected, the 18 sequences with high DNAzyme activity (> 15-fold enhancement) could be assigned to parallel or mixed/hybrid G4s based on the CD results (Table S2 and Figure S1). Eleven sequences with low DNAzyme activity (< 3-fold) and low fluorescence enhancement of BMSP (< 3-fold) did not show the characteristic CD signals of G4s, indicating non-G4 structures (Table S2 and Figure S1). The sequences that strongly enhanced BMSP fluorescence (> 15-fold) and showed moderate DNAzyme activity (3- to 15-fold) could form different conformations, including parallel, antiparallel and mixed/hybrid structures (Table S2 and Figure S1). These results suggest that hemin and BMSP could serve as a pair of complementary probes for G4 detection; almost all G4s could be covered by these two probes.
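The way the two readouts complement each other can be expressed as a small screening rule. The Python sketch below is a hypothetical illustration, not the authors' analysis code: it reuses the fold-change bands quoted above (< 3-fold weak, 3- to 15-fold moderate, > 15-fold strong), while the handling of the exact boundary values and the rule that a sequence is flagged when either probe reaches 3-fold are assumptions. The example values are invented to mimic the response patterns described in the text.

```python
def response_level(fold):
    """Bin a fold enhancement (F/F0 or V/V0) into the bands used in the text."""
    if fold < 3:
        return "weak"
    if fold <= 15:
        return "moderate"
    return "strong"


def is_g4_candidate(bmsp_fold, hemin_fold, threshold=3.0):
    """Assumed rule: flag a sequence if EITHER probe responds above threshold."""
    return bmsp_fold >= threshold or hemin_fold >= threshold


# Hypothetical readings: (F/F0 for BMSP, V/V0 for hemin).
examples = {
    "parallel-like": (8.0, 25.0),        # strong hemin, moderate BMSP
    "BMSP-only responder": (22.0, 1.5),  # weak hemin, strong BMSP
    "non-G4 control": (1.2, 1.1),        # weak response to both probes
}
for name, (f_fold, v_fold) in examples.items():
    print(name, response_level(f_fold), response_level(v_fold),
          is_g4_candidate(f_fold, v_fold))
```

Using a logical OR rather than requiring both signals reflects the point made above that each probe covers conformations the other responds to only weakly; a positive from this kind of screen would still need confirmation, for example by CD.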

Figure 1. Response of hemin and BMSP to PQS. (A) Fluorescence enhancement of BMSP (0.5 µM) by different DNA (1 µM) and catalytic rate enhancement of hemin (2 µM) by different DNA (2 µM) (F and F0: fluorescence intensity of BMSP in the presence and absence of DNA, respectively; V and V0: catalytic rate of hemin in the presence and absence of DNA, respectively). (B) Photograph of the catalytic oxidation of ABTS by different hemin/DNA complexes (2 µM) (left) and fluorescence image of BMSP (5 µM) in the presence of different DNA (5 µM) under UV light (right). DNA sequences from a1 to a8: OI2690, OsCI-8, DST-1, DST-3, OsHC-1, OsDOS-1, OsCI-1, OsCI-4; b1 to b8: OsfA7-2, OC1866, OK4339, Oshp18-2, FP75-3, FP75-2, OsCI-5, OsHC-2; c1 to c8: OL6136, OsfA2d, OC2392, OsDOS-3, DST-2, OsbP60-4, OsbP60-1, OsCI-6; d1 to d8: Oshp18-1, OsCI-2, FP65, OsCI-3, OsHC-3, FP75-21, HA32, OsCI-7; e1 to e8: OA1478, OJ5640, OsbP60-2, Oshp18-3, OsHC-4, OK2702, OI4662, OsbP60-3; and f1 to f8: OC1749, OE3442, OsDOS-2, OsfA7-3, NC DNA, OF5535, FP75-1, Null.

The effect of BMSP and hemin on G4 conformations. Interestingly, two PQS containing the G2 motif (OsCI-7 and Oshp18-3) possessed very weak DNAzyme activity but strongly enhanced the fluorescence of BMSP (more than 20-fold). Their CD spectra indicated that neither sequence formed a G4 structure (Figure S2), suggesting that they might be induced to form G4 conformations by BMSP. To reveal the effects of hemin and BMSP on G4 conformation, we selected 5 PQS containing the G2 motif (OsHC-1, OsHC-4, Oshp18-1, Oshp18-3 and FP75-1) with different DNAzyme activity and fluorescence response to BMSP, and investigated the changes in their CD spectra in the presence of hemin and BMSP. Hemin and BMSP showed the same CD spectrum as the buffer in the range of 235–320 nm (Figure S2), indicating that they do not disturb the CD spectra of G4s. CD spectra of these DNA sequences in K+ solutions suggested that sequences OsHC-1 and Oshp18-1 adopted parallel G4 structures; the other sequences (NC DNA, FP75-1, OsHC-4 and Oshp18-3) did not show the characteristic peaks of G4 structures (that is, they were non-G4) (Figure 2A).

After addition of hemin, the CD spectra of these sequences did not change notably, except for those of OsHC-4 and Oshp18-3, which showed a small red-shift or a slight decrease in some of their CD peaks, suggesting that hemin had very little or no effect on their structures. After addition of BMSP, the CD spectra of the parallel G4 (OsHC-1) and the control sequences (NC DNA and FP75-1) showed only very small changes, indicating no effect on the structure of these sequences. However, the addition of BMSP caused the positive CD peak of Oshp18-1 and OsHC-4 to red-shift to 260–264 nm and their negative peak to red-shift to 240 nm; both peaks also became more intense, suggesting that BMSP induced the formation of parallel G4 structures. BMSP also caused the positive CD peak of Oshp18-3 to red-shift from 280 to 295 nm and its negative peak to red-shift from 245 to 255 nm, accompanied by an increase in both peaks, suggesting that BMSP induced the formation of antiparallel G4 structures. With the addition of BMSP, the broad CD peak of OsCI-7 decreased, and two positive peaks at 265 and 295 nm appeared and increased (Figure S3), suggesting that sequence OsCI-7 was induced to fold into a mixed/hybrid G4.
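The topology assignments above rest on the positions of the CD extrema (parallel: positive near 264 nm and negative near 240 nm; antiparallel: positive near 295 nm and negative near 260 nm; mixed/hybrid: positive near 295 nm with a shoulder near 265 nm and a negative band around 235–240 nm). The Python sketch below encodes that rule of thumb for pre-picked peak positions. It is a simplification for illustration only: the function name `classify_cd`, the ±5 nm tolerance and the order in which the rules are tested are assumptions, and real assignments rely on inspection of the full spectra.

```python
def _near(peaks_nm, target_nm, tol_nm):
    """True if any peak lies within tol_nm of target_nm."""
    return any(abs(p - target_nm) <= tol_nm for p in peaks_nm)


def classify_cd(positive_peaks_nm, negative_peaks_nm, tol_nm=5.0):
    """Assign a G4 topology from CD peak positions (wavelengths in nm)."""
    pos, neg = positive_peaks_nm, negative_peaks_nm
    neg_235_240 = any(235 - tol_nm <= p <= 240 + tol_nm for p in neg)
    # Mixed/hybrid is tested first because it shares the 295 nm band
    # with antiparallel G4s and the ~240 nm minimum with parallel G4s.
    if _near(pos, 295, tol_nm) and _near(pos, 265, tol_nm) and neg_235_240:
        return "mixed/hybrid"
    if _near(pos, 295, tol_nm) and _near(neg, 260, tol_nm):
        return "antiparallel"
    if _near(pos, 264, tol_nm) and _near(neg, 240, tol_nm):
        return "parallel"
    return "no characteristic G4 signature"


print(classify_cd([264], [240]))
print(classify_cd([295], [260]))
print(classify_cd([295, 265], [238]))
print(classify_cd([275], [250]))
```

A spectrum with a positive band near 280 nm and a negative band near 245 nm, like that described for Oshp18-3 before BMSP addition, falls outside all three windows and would be reported as having no characteristic signature.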


The structural transformation of these PQS induced by BMSP was further confirmed by native PAGE (Figure 2B). The multiple coexisting structures of Oshp18-1 were converted into a uniform structure by BMSP when its concentration was above 5.0 µM. The structural transformation of OsHC-4 and Oshp18-3 was also observed when the BMSP concentration was above 5.0 µM. No structural transformation of FP75-1 was observed in the presence of BMSP. Meanwhile, hemin did not induce any conformational changes in these DNAs. These results are consistent with the CD spectra. This set of results suggests that BMSP can induce some PQS that cannot form G4 structures under physiological conditions to form G4 structures, resulting in strong fluorescence enhancement. This property may make BMSP a powerful molecular tool for the functional study of G4s.

Figure 2. Conformational changes of DNA sequences (5 µM) induced by BMSP or hemin in 150 mM KCl solutions. (A) CD spectra of DNA sequences (5 µM) in the absence or presence of BMSP or hemin (5 µM); (B) Native PAGE of DNA sequences (1 µM) in the absence or presence of different concentrations of BMSP or hemin.

Because the K+ concentration increases in plant cells under some stress conditions, we further tested the influence of K+ on the G4 structure of the sequences (OsHC-4 and Oshp18-1) that differed greatly in their response to hemin and BMSP. FP75-1 was used as a control because it did not respond to either probe. As shown in Figure 3, FP75-1 did not show any characteristic CD peaks of G4s over the tested range of K+ concentrations. However, with increasing K+, the positive CD peak of Oshp18-1 and OsHC-4 progressively red-shifted to 260–264 nm and their negative peak red-shifted to 240 nm, accompanied by an enhancement of these peaks, which suggests the formation of stable parallel G4 structures. These spectral changes were consistent with those induced by BMSP, suggesting that some PQS in the rice genome (such as OsHC-4 and Oshp18-1) may not fold into G4s under physiological conditions, but form G4 structures at high concentrations of K+ or in the presence of BMSP. Therefore, BMSP may help to discover potential G4-related genes that are involved in the response to high salt or drought stress in O. sativa.

Figure 3. CD changes of PQS (5 µM) in the presence of different concentrations of K+.

Identification of G4 structures adopted by RNA PQS. Recently, RNA G4s have attracted much attention because of their high thermal stability and potential roles in mRNA processing and translation.43 Accurate prediction of RNA G4s is crucial to understanding their biological functions and impacts. To validate the above method, we further tested the response of hemin and BMSP to RNA PQS from the O. sativa transcriptome (sequences are listed in Table S3). The enhancements of the catalytic rate of hemin and of the fluorescence intensity of BMSP by RNA PQS in K+ solutions are shown in Figure 4. rOsHC-1 and rOshp18-3 moderately enhanced the catalytic activity of hemin (7- to 19-fold), while rOsHC-4 and rOshp18-1 greatly enhanced the catalytic activity, by 42- and 106-fold, respectively. Additionally, rOsHC-1 and rOshp18-3 increased the fluorescence of BMSP by 6- to 18-fold, and rOsHC-4 and rOshp18-1 increased the fluorescence of BMSP by 113- and 78-fold, respectively (Figure 4). NC RNA, rFP75-1, rOE3442 and rOF5355 enhanced neither the catalytic rate of hemin nor the fluorescence intensity of BMSP (Figure 4).

Figure 4. Catalytic rate enhancement of hemin (0.5 µM) in the presence of 0.5 µM RNA and fluorescence enhancement of BMSP (0.5 µM) in the presence of 1 µM RNA (here, F and F0 represent the BMSP fluorescence intensity in the presence or absence of RNA; V and V0 represent the catalytic rate of hemin in the presence or absence of RNA).

The conformations of these RNAs were examined by CD (Figure 5 and Figure S4). NC RNA showed a wide positive CD signal around 260–280 nm and a wide negative signal around 230–240 nm; rFP75-1, rOE3442 and rOF5355 showed positive peaks around 265 nm, but no negative peak or only very small negative peaks around 232 nm, suggesting that they did not form G4 structures. The addition of hemin or BMSP did not cause significant changes in the CD peaks of these four sequences, except for an increase in the CD signals below 230 nm that can be attributed to the CD signals of hemin or BMSP (Figure S2). The CD spectra of rOsHC-1, rOsHC-4, rOshp18-1 and rOshp18-3 showed characteristic positive peaks around 264 nm and negative peaks around 240 nm, indicating the formation of parallel G4 structures. After addition of hemin, the CD spectra of these RNAs showed very little change, suggesting that hemin did not alter the structures of these RNA PQS. The addition of BMSP also did not change the CD signals of rOsHC-1 and rOshp18-3, but made the characteristic CD signals of rOsHC-4 and rOshp18-1 more intense, suggesting that BMSP could stabilize the parallel RNA G4s formed by rOsHC-4 and rOshp18-1. A similar response pattern of BMSP to these RNA PQS was also observed in Na+ solutions; in particular, the enhancement of the characteristic CD signals of rOsHC-4 and rOshp18-1 after the addition of BMSP was much more pronounced (Figure S5 and Figure S6). These results indicate that the detection method is well suited to RNA G4 detection as well.

Recently, an elegant approach based on an engineered structure-specific antibody achieved the direct visualization of RNA G4s inside cells and demonstrated that endogenous RNA G4s can be stabilized by a small-molecule ligand.44 Our results suggest that many RNA PQS with the G2 motif in O. sativa may also form G4 structures in vivo. Moreover, BMSP showed the ability to drive RNA PQS with the G2 motif to form more stable G4 structures, suggesting the potential of BMSP for regulating RNA functions in cells.

Figure 5. CD spectra of RNA sequences in the absence or presence of BMSP and hemin. RNA (5 µM) was dissolved in buffer containing 150 mM KCl. The final concentrations of BMSP and hemin were 5 µM.

We have investigated the distribution of PQS in the genome and transcriptome of O. sativa by bioinformatics analysis. PQS, especially those with G2 motifs, were found to be highly prevalent in O. sativa. Using a pair of complementary probes, hemin and BMSP, we examined the G4 formation of 53 PQS (46 DNA and 7 RNA) chosen from the searched PQS in buffers containing 150 mM K+. The 14 PQS that gave no response to either probe were shown by CD spectra not to form G4 structures. The other 39 PQS gave a positive response to one or both probes. Among them, 34 PQS were found to fold into G4 structures, including parallel, antiparallel and mixed/hybrid G4s. Additionally, 4 PQS that gave a strong response to BMSP did not form G4s under normal physiological conditions, but formed G4s in the presence of BMSP. They also formed G4s in the presence of a high concentration of K+, which might relate to the responses of O. sativa to drought and salt stress. Although many probes have been reported to detect G4 structures, many of them mainly recognize parallel G4s,32,33 and very few can recognize all G4 structures with good selectivity over ss-DNA and ds-DNA. Hemin is widely used for G4 detection based on the peroxidase activity of the G4/hemin complex; it responds strongly to parallel G4s and responds weakly or not at all to other G4 structures.41 BMSP recognizes different types of G4s and responds strongly to antiparallel G4s.38 Neither probe responds to non-G4 sequences. Our results indicate that the combination of the two probes can respond to all G4 structures, which shows great potential as a rapid and simple method for high-throughput identification of G4 structures in the genome and transcriptome.
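The combined-probe logic summarized above (a positive response from either hemin or BMSP flags a candidate G4, and no response from both indicates a non-G4 sequence) can be written as a short decision rule. The following Python sketch is illustrative only: the `ProbeReadout` fields, the fold-enhancement cutoffs of 2.0 and the sample names are hypothetical and are not values reported in this work.

```python
from dataclasses import dataclass


@dataclass
class ProbeReadout:
    name: str
    hemin_fold: float  # catalytic signal relative to hemin alone (assumed metric)
    bmsp_fold: float   # fluorescence relative to BMSP alone (assumed metric)


def call_g4(readout, hemin_cutoff=2.0, bmsp_cutoff=2.0):
    """Classify one sequence from its two probe responses.

    Cutoffs are hypothetical placeholders; a real assay would derive them
    from non-G4 control sequences.
    """
    hemin_pos = readout.hemin_fold >= hemin_cutoff
    bmsp_pos = readout.bmsp_fold >= bmsp_cutoff
    if hemin_pos and bmsp_pos:
        return "candidate G4 (both probes)"
    if hemin_pos:
        return "candidate G4 (hemin only)"
    if bmsp_pos:
        return "candidate G4 (BMSP only)"
    return "no G4 indicated"


# Hypothetical readouts for three made-up sequences.
samples = [
    ProbeReadout("seq_A", hemin_fold=8.0, bmsp_fold=1.2),
    ProbeReadout("seq_B", hemin_fold=1.1, bmsp_fold=6.5),
    ProbeReadout("seq_C", hemin_fold=1.0, bmsp_fold=1.1),
]
calls = {s.name: call_g4(s) for s in samples}
for name, call in calls.items():
    print(name, call)
```

Because some PQS that responded strongly to BMSP formed G4s only in the presence of BMSP, a positive call from this rule is best treated as a candidate to be confirmed by CD spectra rather than as proof of folding under physiological conditions.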

CONCLUSIONS
In this work, we developed a rapid and simple method to detect G4 structures by using hemin and BMSP as a pair of probes. With this method, almost all G4s adopted by the tested PQS from the O. sativa genome and transcriptome were detected by comparing the enhancement of either the catalytic activity of hemin or the fluorescence intensity of BMSP. The measurement can also be performed on 96-well plates for high-throughput and visual detection. Furthermore, we discovered that BMSP was able to drive some PQS with the G2 motif to form more stable G4s, including those that did not adopt G4 structures under physiological conditions. This finding might provide a powerful molecular tool to study the biological functions of PQS within cells.

ASSOCIATED CONTENT
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website. PQS in the transcriptome of O. sativa searched by bioinformatics (Table S1); detailed DNA and RNA sequences used in this work (Tables S2 and S3); CD spectra of 46 DNA sequences in K+ solutions (Figure S1); CD spectra of BMSP and hemin (Figure S2); CD changes of DNA or RNA in the presence of BMSP or hemin (Figures S3, S4 and S6); fluorescence spectra of BMSP in the presence of RNA sequences (Figure S5).

AUTHOR INFORMATION
Corresponding Author
* E-mail: [email protected]

ACKNOWLEDGMENT
This work was supported by the Natural Science Foundation of China (grant numbers 21575147, 21635008, 21375135 and 21621062), the Natural Science Foundation of Henan Province, China (grant number 162300410111), and the project of Preeminent Youth Fund of Henan Polytechnic University, China (grant number J2017-1).

REFERENCES
(1) Burge, S.; Parkinson, G. N.; Hazel, P.; Todd, A. K.; Neidle, S. Nucleic Acids Res. 2006, 34, 5402-5415.
(2) Gabelica, V.; Smargiasso, N.; Rosu, F.; Hsia, W.; Colson, P.; Baker, E. S.; Bowers, M. T.; De Pauw, E. J. Am. Chem. Soc. 2008, 130, 10208-10216.
(3) Huppert, J. L.; Balasubramanian, S. Nucleic Acids Res. 2007, 35, 406-413.
(4) Kwok, C. K.; Marsico, G.; Sahakyan, A. B.; Chambers, V. S.; Balasubramanian, S. Nat. Methods 2016, 13, 841-844.
(5) Mullen, M. A.; Olson, K. J.; Dallaire, P.; Major, F.; Assmann, S. M.; Bevilacqua, P. C. Nucleic Acids Res. 2010, 38, 8149-8163.
(6) Lukoszek, R.; Feist, P.; Ignatova, Z. BMC Plant Biol. 2016, 16, 221.
(7) Beaudoin, J. D.; Perreault, J. P. Nucleic Acids Res. 2010, 38, 7022-7036.
(8) Todd, A. K.; Johnston, M.; Neidle, S. Nucleic Acids Res. 2005, 33, 2901-2907.
(9) Mukundan, V. T.; Phan, A. T. J. Am. Chem. Soc. 2013, 135, 5017-5028.
(10) Onel, B.; Carver, M.; Wu, G.; Timonina, D.; Kalarn, S.; Larriva, M.; Yang, D. J. Am. Chem. Soc. 2016, 138, 2563-2570.
(11) Li, X. M.; Zheng, K. W.; Zhang, J. Y.; Liu, H. H.; He, Y. D.; Yuan, B. F.; Hao, Y. H.; Tan, Z. Proc. Natl. Acad. Sci. USA 2015, 112, 14581-14586.
(12) Heddi, B.; Martin-Pintado, N.; Serimbetov, Z.; Kari, T. M.; Phan, A. T. Nucleic Acids Res. 2016, 44, 910-916.
(13) Qin, M.; Chen, Z.; Luo, Q.; Wen, Y.; Zhang, N.; Jiang, H.; Yang, H. J. Phys. Chem. B 2015, 119, 3706-3713.
(14) Mullen, M. A.; Assmann, S. M.; Bevilacqua, P. C. J. Am. Chem. Soc. 2012, 134, 812-815.
(15) McManus, S. A.; Li, Y. PLoS One 2013, 8, e64131.
(16) Beaudoin, J. D.; Jodoin, R.; Perreault, J. P. Nucleic Acids Res. 2014, 42, 1209-1223.
(17) Sobczak, K.; de Mezer, M.; Michlewski, G.; Krol, J.; Krzyzosiak, W. J. Nucleic Acids Res. 2003, 31, 5469-5482.
(18) Hardin, C. C.; Perry, A. G.; White, K. Biopolymers 2000, 56, 147-194.
(19) Chang, T.; Li, G.; Zhao, K.; Ban, L.; Bian, W.; Bing, T.; Shangguan, D. Chem. J. Chinese U. 2014, 35, 2556-2562.
(20) Rachwal, P. A.; Fox, K. R. Methods 2007, 43, 291-301.
(21) Mergny, J. L.; Li, J.; Lacroix, L.; Amrane, S.; Chaires, J. B. Nucleic Acids Res. 2005, 33, e138.
(22) Del Villar-Guerra, R.; Gray, R. D.; Chaires, J. B. Curr. Protoc. Nucleic Acid Chem. 2017, 68, 171811-171816.
(23) Webba da Silva, M. Methods 2007, 43, 264-277.
(24) Chang, T.; Liu, X.; Cheng, X.; Qi, C.; Mei, H.; Shangguan, D. J. Chromatogr. A 2012, 1246, 62-68.
(25) Chang, T.; Gong, H.; Ding, P.; Liu, X.; Li, W.; Bing, T.; Cao, Z.; Shangguan, D. Chem. Eur. J. 2016, 22, 4015-4021.
(26) Travascio, P.; Li, Y.; Sen, D. Chem. Biol. 1998, 5, 505-517.
(27) Li, W.; Li, Y.; Liu, Z.; Lin, B.; Yi, H.; Xu, F.; Nie, Z.; Yao, S. Nucleic Acids Res. 2016, 44, 7373-7384.
(28) Mei, H.; Bing, T.; Qi, C.; Zhang, N.; Liu, X.; Chang, T.; Yan, J.; Shangguan, D. Chem. Commun. 2013, 49, 164-166.
(29) Fu, T.; Ren, S.; Gong, L.; Meng, H.; Cui, L.; Kong, R. M.; Zhang, X. B.; Tan, W. Talanta 2016, 147, 302-306.
(30) Du, Y. C.; Jiang, H. X.; Huo, Y. F.; Han, G. M.; Kong, D. M. Biosens. Bioelectron. 2016, 77, 971-977.
(31) Ma, D. L.; Zhang, Z.; Wang, M.; Lu, L.; Zhong, H. J.; Leung, C. H. Chem. Biol. 2015, 22, 812-828.
(32) Jin, B.; Zhang, X.; Zheng, W.; Liu, X.; Zhou, J.; Zhang, N.; Wang, F.; Shangguan, D. Anal. Chem. 2014, 86, 7063-7070.
(33) Jin, B.; Zhang, X.; Zheng, W.; Liu, X.; Qi, C.; Wang, F.; Shangguan, D. Anal. Chem. 2014, 86, 943-952.
(34) Hu, M. H.; Chen, S. B.; Wang, Y. Q.; Zeng, Y. M.; Ou, T. M.; Li, D.; Gu, L. Q.; Huang, Z. S.; Tan, J. H. Biosens. Bioelectron. 2016, 83, 77-84.
(35) Huang, H.; Suslov, N. B.; Li, N. S.; Shelke, S. A.; Evans, M. E.; Koldobskaya, Y.; Rice, P. A.; Piccirilli, J. A. Nat. Chem. Biol. 2014, 10, 686-691.
(36) Ma, D. L.; Che, C. M.; Yan, S. C. J. Am. Chem. Soc. 2009, 131, 1835-1846.
(37) Wang, M.; Wang, W. H.; Kang, T. S.; Leung, C. H.; Ma, D. L. Anal. Chem. 2016, 88, 981-987.
(38) Wu, S.; Wang, L.; Zhang, N.; Liu, Y.; Zheng, W.; Chang, A.; Wang, F.; Li, S.; Shangguan, D. Chem. Eur. J. 2016, 22, 6037-6047.
(39) Zhu, J. K. Cell 2016, 167, 313-324.
(40) Xiao, S.; Zhang, J. Y.; Zheng, K. W.; Hao, Y. H.; Tan, Z. Nucleic Acids Res. 2013, 41, 10379-10390.
(41) Cheng, X.; Liu, X.; Bing, T.; Cao, Z.; Shangguan, D. Biochemistry 2009, 48, 7817-7823.
(42) Kong, D. M.; Yang, W.; Wu, J.; Li, C. X.; Shen, H. X. Analyst 2010, 135, 321-326.
(43) Millevoi, S.; Moine, H.; Vagner, S. Wiley Interdiscip. Rev. RNA 2012, 3, 495-507.
(44) Biffi, G.; Di Antonio, M.; Tannahill, D.; Balasubramanian, S. Nat. Chem. 2014, 6, 75-80.
