Article pubs.acs.org/JACS

Cite This: J. Am. Chem. Soc. 2018, 140, 17945−17955

Identification of G‑Quadruplex-Binding Protein from the Exploration of RGG Motif/G-Quadruplex Interactions

Zhou-Li Huang,‡ Jing Dai,‡ Wen-Hua Luo, Xiang-Gui Wang, Jia-Heng Tan, Shuo-Bin Chen,* and Zhi-Shu Huang*

School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, People’s Republic of China


Supporting Information

ABSTRACT: The arginine/glycine-rich region termed the RGG domain is usually found in G-quadruplex (G4)-binding proteins and is important in G4−protein interactions. Studies on the binding mechanism of RGG domains found that small segments (RGG motif) inside the domain contribute greatly to the G4 binding affinity. However, unlike the entire RGG domains that have been broadly explored, the role of the RGG motif remains obscure, with very limited study. Herein, to clarify the role of the RGG motif in G4−protein interactions, we systematically investigated the binding affinity and mode between RGG-motif peptides and G4s. The internal arrangement of RGG repeats and gap amino acids played a more crucial role in the G4-binding mechanism than a critical number of RGG repeats. Arginines and phenylalanines at the exact position of the RGG motif might enable additional hydrogen bonding and π-stacking interaction with nucleobases and strengthen the binding of G4. Impressively, proceeding from a G4-binding RGG peptide, 12, discovered above, we identified the cold-inducible RNA-binding protein (CIRBP) as a new G4 DNA-binding protein both in vitro and in cells. In addition, we found that the key amino acids for G4 binding in peptide 12 and CIRBP were highly similar, and peptide 12 clearly played a key role in the G4 binding of CIRBP. This report is the first in which a G4-binding protein was identified from exploration of the G4-binding RGG motif. Our findings suggest a novel strategy for discovering new G4-binding proteins by exploring key peptide segments.



Received: August 29, 2018
Published: December 5, 2018
© 2018 American Chemical Society

INTRODUCTION

The G-quadruplex (G4) is a noncanonical nucleic acid secondary structure formed within guanine-rich (G-rich) sequences in both DNA and RNA and is widely found in promoter regions, telomeres, and the transcriptome.1−4 Its compact four-stranded structure is held together by Hoogsteen hydrogen bonds between guanines and π−π stacking between G-quartets and is further stabilized by monovalent cations, typically K+ or Na+.5,6 G4s play an important role in a variety of cellular processes, including DNA transcription, translation, and telomere maintenance. These sophisticated biological processes necessitate the participation of G4-binding proteins.7,8 For example, binding of nucleolin to the c-MYC NHE (nuclease hypersensitive element) III1 element induces folding of the G4 and reduces the transcription of the oncogene c-myc, while binding of NM23-H2 unfolds the G4 and promotes transcription.9,10 At the translational level, human Fragile X Mental Retardation Protein (FMRP) binds RNA G4s in the 5′-UTR of various mRNAs such as FMR1 and represses translation initiation.11,12 The G4−protein interaction is also significant for telomere maintenance.13,14 Translocated in liposarcoma (TLS) protein binds to both human telomere G4 DNA and RNA, subsequently regulating histone modifications and telomere length in vivo.15 Because these important biological events are involved, the G4−protein interaction is an area of intensive investigation, and identifying new G4-binding proteins remains of great interest and would contribute to a more global understanding of G4 functions and the biological events that involve G4s.

In recent years, an increasing number of proteins with binding affinity to G-quadruplexes have been identified. For example, DEAH-Box Helicase 36 (DHX36) was found to bind G4s by affinity chromatography, and SRA stem-loop-interacting RNA-binding protein (SLIRP) was identified as a G4-binding protein by a quantitative mass spectrometry-based approach.16,17 In G4-binding proteins, various motifs have been found to participate in G4 recognition. Notably, a recent study based on the crystal structure of the protein−DNA complex found that the N-terminal motif and OB-fold-like subdomain of helicase DHX36 mediate its G4 binding.18 On the other hand, arginine/glycine-rich regions termed the RGG domain are frequently found among G4-binding proteins.19,20 In the RGG domain, a sequence of closely spaced arginine-glycine-glycine (RGG) repeats is interspersed with other, often aromatic, amino acids. A recent

DOI: 10.1021/jacs.8b09329 J. Am. Chem. Soc. 2018, 140, 17945−17955


their concentrations were determined from the absorbance at 260 nm on the basis of their molar extinction coefficients using a NanoDrop 1000 spectrophotometer (Thermo Scientific). To induce G-quadruplex formation, oligonucleotides were annealed in the relevant buffer containing KCl by heating to 95 °C for 5 min, followed by gradual cooling to room temperature. Further dilutions of samples to working concentrations were made with the relevant buffer immediately prior to use.

Circular Dichroism Studies. CD studies were performed on a Chirascan circular dichroism spectrophotometer (Applied Photophysics). A quartz cuvette with a 1 cm path length was used for the recording of spectra with a 1 nm bandwidth, 1 nm step size, and time of 0.5 s per point. The melting data were recorded over a range of 25−95 °C, with a heating rate of 1.0 °C/min. A buffer baseline was collected in the same cuvette and was subtracted from the sample spectra. The final analysis of the data was conducted using Origin (OriginLab Corp.).

Surface Plasmon Resonance. SPR measurements were performed on a ProteOn XPR36 protein interaction array system (Bio-Rad Laboratories, Hercules, CA, USA) using a streptavidin-coated sensor chip. Biotinylated oligonucleotides were prefolded in filtered and degassed running buffer (50 mM Tris-HCl, 150 mM KCl, pH 7.4, 0.05% Tween20) and then attached to the chip. The biotinylated oligonucleotides were captured (∼1000 RU) in five flow cells, leaving one flow cell as a blank. Solutions of peptides or proteins were prepared with running buffer by serial dilutions from stock solutions. The samples were then injected at a flow rate of 50 μL/min during the association phase, which was followed by a 400 s dissociation phase at 25 °C. The chip was regenerated with a short injection of 2 M KCl between consecutive measurements. Data were analyzed with ProteOn Manager software.

Filter-Binding Assay. In brief, a nylon membrane was placed directly below the nitrocellulose membrane to trap any DNA not retained on the nitrocellulose. The nitrocellulose membrane was then treated with 0.5 M KOH for 10 min at 4 °C and washed with 0.5× TB prior to use. Biotinylated DNAs (5 nM) were incubated with peptides or proteins at 37 °C for 30 min in binding buffer (10 mM Tris-HCl, 100 mM KCl, pH 7.4). All samples were applied to the membrane under vacuum and washed with binding buffer. The cross-linking reaction was carried out under UV irradiation at 265 nm for 120 s. The detection of biotinylated DNA was carried out using a chemiluminescent nucleic acid detection module kit (Thermo Scientific). The gray levels of the dots were measured using Quantity One. The EC50 (half-maximal effective concentration for binding) value was evaluated via a Hill model. The obtained data were analyzed using GraphPad Prism (GraphPad Software, San Diego, CA, USA).

Fluorescence Studies. Fluorescence studies were performed on a Fluoromax-4 luminescence spectrophotometer (HORIBA). A quartz cuvette with a 3 mm × 3 mm path length was used for recording the spectra. For TO displacement, 0.25 μM G-quadruplexes were preincubated with 0.5 μM thiazole orange for 1 h at 37 °C (10 mM Tris-HCl, 100 mM KCl, pH 7.4) to form a stable complex, and then different concentrations of peptides were added. Fluorescence was measured with excitation at 480 nm. For the 2-Ap titration, peptides or proteins were added into the solution containing Ap-labeled oligonucleotides at a fixed concentration (1 μM) in buffer (10 mM Tris-HCl, 100 mM KCl, pH 7.4). After each addition, the reaction was stirred and allowed to equilibrate for at least 1 min, and fluorescence was measured with excitation at 305 nm.

NMR Spectroscopy. Samples for NMR were incubated in phosphate buffer (25 mM KH2PO4, 70 mM KCl, 10% D2O, pH 7.4) at 25 °C before measurement. The final concentration of DNA was 150 μM, and that of peptide 12 was 300 μM. Experiments were performed on a 400 MHz spectrometer (Bruker) at 25 °C.

Protein Expression and Purification. The CIRBP cDNA-coding region was synthesized and cloned into the BamHI and EcoRI sites of pGEX-4T-1 (expression of GST-tagged proteins). Expression was achieved in Escherichia coli BL21(DE3) with an induction temperature of 30 °C for 6 h. The proteins were purified using GSH

analysis revealed that this domain is the second most common RNA-binding domain, emphasizing its important role in mediating protein−nucleic acid interactions.21 Its low complexity facilitates interaction with a variety of nucleic acid substrates, including G4s. The interaction of the RGG domain with the G4 has attracted wide interest, and the significance of elucidating RGG-mediated G4 recognition is underscored by the observation that this domain is indispensable for protein function. For example, the C-terminal region of nucleolin, consisting of RNA-binding domains (RBDs) 3 and 4 as well as the RGG domain, is critical for the initial recognition of the c-MYC NHE III1 sequence and responsible for promoting G4 formation.22 Research has been conducted to shed light on the role of the RGG domain in mediating protein−G4 interactions. It has been demonstrated that the RGG domain alone is both essential and sufficient for G4 binding in the cases of Ewing’s sarcoma (EWS) and TLS, strongly suggesting that this domain is the core interaction element for G4−protein recognition.23−27 Studies on the binding mechanism of RGG domains found that small segments (RGG motif) inside the domain contribute greatly to the G4-binding affinity. Interestingly, recent studies found that a small peptide of the RGG motif alone was also capable of G4 binding.28,29 A peptide of the RGG motif from FMRP comprises three RGG repeats and 18 amino acids in total. Both the solution and crystal structures of this peptide with the in vitro-selected G-rich RNA Sc1 were obtained, and the originally unstructured RGG peptide folds into a β-turn conformation upon binding a duplex−quadruplex junction. These findings show that peptides as small as an RGG motif are capable of binding G4s and hint at how the RGG motif mediates the binding of a domain or whole protein to G4s. 
Most studies of the RGG sequence were carried out on the entire domain, while a smaller unit, the RGG motif, within the domain has received much less attention. The role of the RGG motif remains obscure and has been the subject of very limited study. These studies have been restricted to the known RGG motif, leaving a large number of others excluded. Whether these motifs bind G4 and how this recognition happens is worth exploring. In addition, because the function of a protein is mainly dominated by the motifs in its amino acid sequence, studying RGG motifs reveals previously unknown properties of proteins and provides clues about how a motif mediates G4− protein interactions. Herein, to clarify the role of the RGG motif in G4−protein interactions, we systematically investigated the binding affinity and mode between the RGG-motif peptides and G4s. Our investigation started with RGG peptides that were devised on the basis of nucleic acid-binding proteins. To further detail the interactions, the key amino acids in G4-binding peptides were determined by making point mutations, and the binding modes of peptides with G4 were studied by biophysical methods. Furthermore, to determine the role of the RGG motif in proteins, the G4 binding ability of the original protein was studied both in vitro and in cells.



METHODS AND MATERIALS

Peptides and Oligonucleotides. All peptides used were purchased from ChinaPeptides, dissolved in water according to the powder weight, and stored at −80 °C (Table S1). All oligonucleotides (Table S2) used in this study were purchased from Sangon and Takara. All the oligonucleotides were dissolved in relevant buffer, and


Table 1. RGG Peptides from Nucleic-Binding Proteins

                                    protein                         amino acid feature
name  motif sequence                name      ID      residue   length  RGGs  aromatic  basic  acid
1     RGGRGQNSASRGG                 MRE11     P49959  577−589   13      3     0         3      0
2     RGGSGGTRGPPSRGG               hnRNP G   P38159  113−126   15      3     0         3      0
3     RGGNFSGRGGFGGSRGG             ROA0      Q13151  192−202   15      3     0         3      0
4     RGGLGGGMRGPPRGG               G3BP1     Q13283  435−449   15      3     2         3      0
5     RGGGGGFHRRGGGGRGG             SFPQ      P23246  9−27      17      3     2         3      0
6     RGGFAGRARGRGG                 hnRNP D0  Q14103  272−284   13      3     1         4      0
7     RGGYRGRGGFQGRGG               RB56      Q92804  337−351   15      4     1         3      0
8     RGGRGSFRGCRGG                 DDX4      Q9NQI0  147−159   13      4     1         4      0
9     KGGRGGARGSARGGVRGG            SPRN      Q5BIV9  25−42     18      4     0         5      0
10    RGGHEQGGGRGGRGGYDHGGRGG       FA98A     Q8NCA5  352−374   23      4     1         6      2
11    RGGGHRGRGGFNMRGGNFRGGAPGNRGG  hnRNP U   Q00839  702−729   28      6     3         6      2
12    RGGSAGGRGFFRGGRGRGRGFSRGG     CIRBP     Q14011  94−118    25      7     3         7      0
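The sequence features tabulated above can be recomputed directly from a motif sequence. The following Python sketch is illustrative only and is not the authors' procedure: the function name `motif_features`, the use of RG dipeptide counts for the "RGGs" column, and the treatment of histidine as a basic residue are all assumptions, chosen because they reproduce the Table 1 entries for peptide 12.

```python
# Illustrative sketch: recompute simple sequence features for an RGG motif.
# Assumptions (not stated in the source): "RGGs" counts RG dipeptides,
# aromatic = F/W/Y, basic = R/K/H, acidic = D/E.

AROMATIC = set("FWY")
BASIC = set("RKH")
ACIDIC = set("DE")


def motif_features(seq):
    """Return length and residue-class counts for a one-letter peptide sequence."""
    seq = seq.upper()
    return {
        "length": len(seq),
        "rg_repeats": seq.count("RG"),  # RG cannot overlap itself, so count() is exact
        "aromatic": sum(ch in AROMATIC for ch in seq),
        "basic": sum(ch in BASIC for ch in seq),
        "acidic": sum(ch in ACIDIC for ch in seq),
    }


if __name__ == "__main__":
    peptide_12 = "RGGSAGGRGFFRGGRGRGRGFSRGG"  # CIRBP motif from Table 1
    print(motif_features(peptide_12))
    # {'length': 25, 'rg_repeats': 7, 'aromatic': 3, 'basic': 7, 'acidic': 0}
```

Other rows of Table 1 may have been compiled with a different counting convention, so agreement with every tabulated entry should not be assumed.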

Table 2. Dissociation Constants (KD) of RGG Peptides Binding G-Quadruplexes

KD (μM)  1      2     3     4     5     6     7     8     9     10    11    12
Htg22    n.d.a  n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  17.3  14.3  n.d.  n.d.  7.2
Pu22     n.d.   n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  n.d.  5.2

a n.d.: no binding was detected at concentrations less than 20 μM.
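As an illustration of how a KD can be extracted from equilibrium SPR responses, the sketch below fits a 1:1 steady-state binding model, Req = Rmax·C/(KD + C), to synthetic data. It is not the ProteOn Manager algorithm; the function names, the concentration series, the response values, and the search range are assumptions made for this example only.

```python
# Illustrative sketch: steady-state (equilibrium) fit of a 1:1 binding model,
#   Req = Rmax * C / (KD + C)
# using only the standard library. For any trial KD the best Rmax has a
# closed form, so the fit reduces to a one-dimensional search over log10(KD).
import math


def best_rmax(kd, conc, resp):
    """Least-squares Rmax for a fixed KD."""
    frac = [c / (kd + c) for c in conc]
    return sum(r * f for r, f in zip(resp, frac)) / sum(f * f for f in frac)


def sse(log_kd, conc, resp):
    """Sum of squared residuals at a given log10(KD), with Rmax optimized."""
    kd = 10.0 ** log_kd
    rmax = best_rmax(kd, conc, resp)
    return sum((r - rmax * c / (kd + c)) ** 2 for c, r in zip(conc, resp))


def fit_kd(conc, resp, log_lo=-3.0, log_hi=3.0, n_grid=601):
    """Coarse log-grid scan followed by golden-section refinement.

    Returns (KD, Rmax); KD is in the same units as conc.
    """
    step = (log_hi - log_lo) / (n_grid - 1)
    grid = [log_lo + i * step for i in range(n_grid)]
    best = min(grid, key=lambda x: sse(x, conc, resp))
    a, b = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c, conc, resp) < sse(d, conc, resp):
            b = d
        else:
            a = c
    kd = 10.0 ** ((a + b) / 2.0)
    return kd, best_rmax(kd, conc, resp)


if __name__ == "__main__":
    # Synthetic, noise-free responses generated with KD = 7.2 uM, Rmax = 100 RU.
    conc_um = [0.625, 1.25, 2.5, 5.0, 10.0, 20.0]
    resp_ru = [100.0 * c / (7.2 + c) for c in conc_um]
    kd, rmax = fit_kd(conc_um, resp_ru)
    print(f"KD = {kd:.2f} uM, Rmax = {rmax:.1f} RU")
```

With real sensorgrams, the equilibrium responses would first need blank subtraction, and a KD well above the highest injected concentration cannot be estimated reliably, which is consistent with reporting such cases as n.d.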



magnetic beads (BeaverBeads) and eluted by freshly prepared elution buffer. The purified proteins were verified by gel electrophoresis and Western blotting. pSANG10-3F-BG4 was a gift from Shankar Balasubramanian (Addgene plasmid #55756).30 The G-quadruplex antibodies BG4 and D1 were prepared following a previous report.31

Electrophoretic Mobility Shift Assay (EMSA). Binding experiments of fluorescein-labeled DNAs and proteins were performed in buffer (10 mM Tris-HCl, 100 mM KCl, pH 7.4). The samples were loaded onto a nondenaturing polyacrylamide gel (10%) after incubation at 37 °C for 30 min. Electrophoresis was performed in 1× tris-borate-EDTA (TBE) buffer with the addition of 20 mM KCl at 4 °C. The gel was photographed on a 4500SF gel-imaging system (TANON).

ChIP Assay. Chromatin immunoprecipitation (ChIP) experiments were performed using the Pierce agarose ChIP kit (Thermo Scientific) following the standard protocol. CIRBP antibody (ab191885, Abcam) was used to trap CIRBP and its substrate. Normal rabbit IgG was used as a negative control, and the total genomic DNA (input) was used as a positive control. DNA samples were amplified by PCR using primers for telomeres. The amplified products were separated on a 1.5% agarose gel and photographed on a 4500SF gel-imaging system (TANON).

Confocal Imaging. HeLa cells were first seeded on a glass-bottom plate and transfected with pEGFP-N3 containing the CIRBP coding region (expression of GFP-tagged CIRBP). The cells were then fixed with 4% paraformaldehyde, permeabilized with 0.5% Triton X-100/phosphate-buffered saline (PBS), and blocked with 5% bovine serum albumin/PBS, sequentially. For visualization of DNA G-quadruplexes, cells were treated with RNase A at 37 °C for 2 h and then incubated with G-quadruplex antibodies (BG4 or D1) at 37 °C for 30 min and FLAG antibody (#8146, Cell Signaling Technology) at 4 °C overnight. For visualization of the telomere, cells were incubated with TRF2 antibody (ab13579, Abcam) at 4 °C overnight. Cells were then incubated with Alexa 647-conjugated secondary antibody (A28181 or A21206, Thermo Scientific) at 37 °C for 1 h. After rinsing with PBS, cells were stained with DAPI. Fluorescence signals were recorded by using a FV3000 laser scanning confocal microscope (Olympus).
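The filter-binding analysis described earlier in this section evaluates EC50 via a Hill model. As a rough illustration of that idea, and not a reproduction of the GraphPad Prism nonlinear fit used in the study, the sketch below estimates EC50 and the Hill coefficient from a linearized Hill plot. The fraction-bound definition (upper-membrane signal divided by total signal), the function names, and the synthetic data are assumptions.

```python
# Illustrative sketch: EC50 from filter-binding data via a linearized Hill plot.
#   theta = C^n / (EC50^n + C^n)
#   log10(theta / (1 - theta)) = n * log10(C) - n * log10(EC50)
# Assumption: fraction bound = upper-membrane gray level / (upper + lower).
import math


def fraction_bound(upper, lower):
    """Fraction of DNA retained with peptide on the upper (nitrocellulose) membrane."""
    return upper / (upper + lower)


def hill_fit(conc, theta):
    """Return (EC50, Hill coefficient) by linear regression on the Hill plot.

    Points with theta <= 0 or theta >= 1 are skipped because the logit is undefined.
    """
    pts = [(math.log10(c), math.log10(t / (1.0 - t)))
           for c, t in zip(conc, theta) if c > 0 and 0.0 < t < 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two usable data points")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                      # Hill coefficient
    intercept = mean_y - slope * mean_x    # equals -slope * log10(EC50)
    return 10.0 ** (-intercept / slope), slope


if __name__ == "__main__":
    # Synthetic data generated with EC50 = 244 nM and a Hill coefficient of 1.5.
    conc_nm = [31.25, 62.5, 125.0, 250.0, 500.0, 1000.0]
    theta = [c ** 1.5 / (244.0 ** 1.5 + c ** 1.5) for c in conc_nm]
    ec50, hill = hill_fit(conc_nm, theta)
    print(f"EC50 = {ec50:.0f} nM, Hill coefficient = {hill:.2f}")
```

The linearized form weights noisy points near 0 or 1 heavily, so a direct nonlinear fit is usually preferable for real dot-blot data; the sketch is meant only to make the model explicit.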

RESULTS

Selection of RGG Motifs. To identify representative RGG motifs among more than 1000 RGG-containing proteins, we first narrowed the scope on the basis of the following principles. First, the peptides should be from well-reported nucleic acid-binding proteins, including transcription factors and helicases. These proteins function through binding nucleic acids, so their RGG motifs may be involved in nucleic acid binding. Second, the peptides should contain at least three repeats of RGG with a short residue gap and preferentially contain aromatic residues, which are reported to contribute to G4 binding.28,32 According to the above principles, we finally chose 12 peptides from a collected database, and their sequence information and features are listed in Table 1.33 The amino acid number of these peptides ranged from 13 to 28, and there was more than one peptide of each amino acid length, so that we could analyze the relationship between their nucleic acid affinity and amino acid components. The proteins from which we chose these peptides include transcription factors (hnRNP D0, SFPQ, RB56, hnRNP U), helicases (DDX4, G3BP1), DNA repair factors (MRE11), and RNA processing factors (hnRNP G, ROA0, CIRBP).

Binding of G-Quadruplexes by the Selected RGG Peptides. Nucleic acid binding is the most important function of the RGG motif, but limited studies on their G4 affinity have been reported.24−26,28,34 Therefore, we investigated the binding affinity of the chosen peptides to G4 using surface plasmon resonance (SPR). The G4s we applied were the classic telomeric G4 Htg22 and the c-myc promoter G4 Pu22 (Figure S1). Dissociation constants (KDs) were obtained using an equilibrium fitting mode (Table 2). Only three of the 12 RGG peptides bound DNA G4s (Figure S2). Among them, peptide 12 showed efficient binding with a KD value less than 10 μM, while peptides 8 and 9 exhibited slight binding to G4s, with a KD value greater than 10 μM. Peptide 12, with the strongest G4 affinity, has the greatest number of continuous RGG repeats among these motifs,


RGG repeat was more important, and substitution of arginine with alanine in different positions gave peptides 13 to 19. The mutant sequences are listed in Table 3. The binding affinity of mutants 13−20 for G4 DNA was studied by SPR and filter binding (Table 3, Figures S5−S9). Peptide 20, which has a positively charged state similar to that of peptide 12, showed a significant decrease in G4-binding affinity. Thus, arginine residues in the RGG repeats of peptide 12 participate in binding not only via electrostatic interactions but also via other types of interactions, such as hydrogen bonding. Among the arginine-to-alanine-substituted peptides, peptides 13, 16, and 17, with arginine mutated in the first, fourth, and fifth RGG repeats, respectively, had much weaker G4-binding affinity than the others. Their binding constants (KD) for G4 were all beyond the highest sample concentration in SPR, and their EC50 values with telomeric G4 were 1 order of magnitude higher than that of peptide 12. Meanwhile, the other arginine-to-alanine-substituted peptides exhibited only a slight change in binding constants. This result suggests that the arginines do not contribute equally and that the three arginines in the first, fourth, and fifth RGG repeats might be key arginines that mediate G4 binding.

On the other hand, aromatic residues are frequently observed within the G4-binding RGG motif and are proposed to be an important component for the specific recognition of G4.34 Therefore, a phenylalanine-to-alanine-substituted peptide, 21, and a phenylalanine-to-leucine-substituted peptide, 25, were designed to evaluate the role of phenylalanine in the motif−G4 interaction. Point mutations of peptide 12 were also made to determine the possible role of each phenylalanine. Substitutions of phenylalanine with alanine in different positions were made and named peptides 22 to 24. As shown in Table 3, peptide 21 bound weakly relative to the other peptide mutants, with an affinity only slightly higher than that of peptide 17. Thus, phenylalanines might also be important residues in RGG peptide 12. Peptide 25, which has a hydrophobic state close to that of peptide 12 but with all the phenylalanines replaced by leucine, showed a significant decrease in G4-binding affinity. Therefore, we assumed the phenylalanines in peptide 12 were more likely to participate in G4 binding via π−π interactions. The single phenylalanine-to-alanine peptides 22, 23, and 24 exhibited binding constants (KD) for telomeric G4 all beyond 10 μM, and their EC50 values with telomeric G4 were approximately 800 nM. Their similar G4-binding affinities were weaker than that of peptide 12, indicating that all these phenylalanines in peptide 12 participated in the G4 binding.

Interestingly, the arginines in RGG repeats bind in a site-specific manner, which corresponds to the observations for peptides 1 to 12, indicating that the internal arrangement of RGG repeats and gap amino acids might be crucial for G4 binding. Besides the RGG repeats and the aromatic gap amino acids, phenylalanines, it was also of interest to examine the influence of other gap amino acids on G4 binding of peptide 12. We then designed peptide 26, in which all the serine residues were replaced by alanine. The G4-binding affinity of peptide 26 was similar to that of peptide 12, so the serines in peptide 12 seem not to contribute to G4 binding. Peptides 27 and 28, in which all the serines and alanines were replaced by an acidic amino acid, aspartate, or a basic amino acid, lysine, respectively, were designed to evaluate the possible influence of nonaromatic amino acids on G4 binding of RGG peptide 12. The G4-binding affinity of peptide 27 was much weaker than that of peptide 12, while that of peptide 28 was slightly

indicating that the RGG repeats might be important for G4 recognition. However, peptide 11, with six RGG repeats but two negatively charged amino acids, did not effectively bind DNA G4s. Peptides 1 to 10 shared similar lengths and numbers of RGG repeats but had different gap amino acids. Although positively charged and aromatic amino acids are believed to participate in peptide−DNA binding, G4-binding peptides 8 and 9 contain neither the most basic nor the most aromatic amino acids among these peptides.35 The ability of the RGG peptide to bind G4 was therefore not simply related to the number of amino acids that could contribute to DNA binding. Instead, the internal arrangement of the RGG repeats and the gap amino acids might be crucial for peptide−G4 binding.

Interaction between Peptide 12 and G-Quadruplexes. To obtain more insight into the relationship between the G4-binding affinity and particular RGG peptide sequences, peptide 12, with good G4-binding affinity, was selected for further study. Before studying the structural basis of peptide 12 binding, we confirmed its binding to G4s using a filter-binding assay. In such an assay, peptide-bound nucleic acids are retained on the upper membrane, and the lower membrane holds the rest. To obtain a quantitative analysis, the gray levels of the dots on the upper and lower membranes were measured, and the effective binding concentration (EC50) was determined. As shown in Figure 1 and Figure S3, peptide 12 effectively bound G4 Htg22 DNA with an EC50 value of 244 nM, while no dots were observed for peptide 12 with the single-stranded DNA mHtg22 or the hairpin DNA Hp18, and only a few dots were observed with the G-rich single-stranded DNA mG14. It is clear that peptide 12 selectively bound the G4 DNA structure compared with single-stranded or duplex DNA.

Figure 1. Filter-binding assay of peptide 12 with different DNA structures. (A) Filter-binding dots of peptide 12 with G-quadruplex (Htg22), single-stranded (mHtg22), and hairpin (Hp18) DNA; (B) quantification of DNA on filter-binding dots.

To understand the details of the peptide 12 interaction with G4 DNA, we delved into the structural basis for its binding. Since RGG domains do not adopt a single, stable secondary structure and have an intrinsically disordered region (Figure S4), we focused on the key amino acids involved in G4 binding. A previous investigation of RGG domains determined that the arginine in RGG repeats was highly conserved and might be the key residue. The positively charged guanidinium group enables electrostatic interactions with the negatively charged phosphate backbone and might enable hydrogen bond formation and π-stacking interactions with the nucleobases. To investigate the role of arginine, we designed peptide 20, in which all the arginine residues were replaced by another cationic amino acid, lysine. In addition, arginines at different positions contribute differently to substrate binding. Thus, point mutations of peptide 12 were made to determine which


Table 3. Sequence Information of Peptide 12 Mutants and Their Binding Ability with G-Quadruplexes

a Dissociation constants (KD) were determined by SPR assay. b Effective binding concentrations (EC50) were determined by filter-binding assay. c Displacement ratios (DR) were determined by TO displacement assay. >20: binding constant was not smaller than 20 μM.

basis for the amino acids in peptide 12 used for binding, it was still intriguing to discover how the RGG motif binds to G4s with site-specific features. Before investigating the binding site on G4 for peptide 12, we first evaluated the possible impact of peptide 12 on G4 formation by circular dichroism (CD) and melting assays (Figures S14 and S15). The addition of peptide 12 to G4s with different topologies (hybrid-type G4 Htg22, parallel G4 Pu22, and antiparallel G4 Hras) did not cause a change in the global characteristics of the CD signals but increased the Tm value of the G4s by 3.5 to 20 °C. Thus, the binding of peptide 12 stabilized the G4s without inducing any conformational change.

A 1H NMR titration was carried out to further explore the mode of peptide 12 binding to G4s in detail. Telomere G4 Htg25 and c-myc promoter G4 Pu22 were used to study the G4 interaction with peptide 12. The signals of the imino protons (10−13 ppm) in these two G4s were well resolved, which enabled the binding sites of the peptide to be determined.34,37,38 As shown in the 1H NMR data of Htg25 (Figure 2A), among all the imino protons, the G3, G17, and G21 imino signals shifted significantly upon addition of peptide 12. The signals of only these three guanines, belonging to the 5′-terminal G-quartet of Htg25, were changed. This result indicated that the binding site of peptide 12 is specific for the 5′-terminal region of Htg25. On the other hand, the 1H NMR data of Pu22 (Figure 2B) show distinct chemical shifts of the imino peaks of G6, G10, G17, and G19 upon addition of peptide 12, while other imino peaks had no or little shift. Interestingly, these four guanines belonged to the terminal G-quartet, suggesting peptide 12 also may bind on the G-quartet of Pu22. All these data suggested that peptide 12 binds to G4 upon the G-quartet plane with a specific interaction rather than via random binding.

Meanwhile, modification with 2-aminopurine (2-Ap) in different loops has been widely used to estimate the binding modes of molecules with G4s.39 To obtain more detail on the binding mode, fluorescence experiments were performed using

stronger than that of peptide 12. It is clear that the negatively charged acidic amino acids could hamper the G4-binding affinity of peptide 12, while the positively charged basic amino acids could enhance the G4-binding affinity. In addition, we varied the internal arrangement of RGG repeats, and peptides R1 to R7 were obtained from peptide 12 with the RGG repeats and gap amino acids in different orders (Table S3, Figures S11−S13). Their binding affinities to G4 were studied by SPR. Impressively, the G4-binding affinities of all these rearranged RGG peptides decreased to varying degrees, suggesting the importance of the internal arrangement of the RGG motif sequence.

To corroborate the binding contributions of peptide 12 toward the G4, thiazole orange (TO) displacement of G4 DNA was also carried out. In the TO displacement assay, TO fluoresces upon binding G4, and the fluorescence intensity decreases when added peptides compete for its binding site on the G4.36 A higher displacement ratio represents a stronger affinity of peptides to G4. The addition of peptide 12 effectively decreased TO fluorescence with Htg22 and Pu22 (Table 3 and Figure S10). In addition, the results of TO displacement for the mutants were in good agreement with those of SPR and filter binding.

Taken together, our results indicate that the arginines in RGG repeats and the phenylalanines throughout the sequence of peptide 12 are all important for the G4 binding. The arginine in this RGG motif not only mediates the electrostatic interactions but also is involved in possible hydrogen bonding and π−π interactions. Meanwhile, the role of phenylalanines in peptide 12 for G4 binding might involve π−π interactions. The internal arrangement of RGG repeats and gap amino acids also has an effect on the G4 binding of RGG peptides.

Investigation of the Peptide 12 Binding Mode in the G4−Peptide 12 Complex. The great difference in G4-binding affinity of the mutants also suggests a special binding mode of peptide 12. While we learned about the molecular


5′-terminal G-quartet (G3, G9, G17, and G21) upon the addition of all the mutant peptides and compared them with that of peptide 12. The difference in signal change upon substitution of peptide 12 could reflect the influence of the substituted amino acids on G-quartet binding. As shown in Figures S16 and S17, the signal changes of all the mutant peptides differed to varying extents from that of peptide 12. Impressively, upon addition of the all-phenylalanine-substituted peptides 21 and 25, the signal change of G17 was decreased, while those of G3, G9, and G21 were increased, suggesting that the loss of the phenylalanines in peptide 12 would hamper the binding with G17. A slight decrease in the change of the G17 signal and an increase in the change of G9 were observed for peptides 22 and 23, while decreases in the changes of all these guanine signals were observed for peptide 24. The two phenylalanines in the center of peptide 12 might have specific interactions with G17, while the other might bind with G4 in another manner. On the other hand, the changes of the G3, G17, and G21 signals were decreased upon the addition of arginine-substituted peptides, suggesting that all the arginines contribute to G-quartet binding of peptide 12. Among the single arginine-substituted peptides, peptide 16 had the least change in the imino signals, indicating that the arginine in the fourth RGG repeat should be the most important arginine involved in the G-quartet binding of peptide 12. Moreover, the G-quartet binding mode is likely similar to that seen in the NMR structures of peptide segments of DHX36 (Rhau18) and prion (P16) with the G4, both of which preferentially bind on the G-quartet plane (Figure S18).40,41 Therefore, the structural principles of the RGG motif−G4 complex reported in our article might still be applicable to other peptide−G4 complexes. These results implied a new way in which the RGG motif mediates protein−G4 recognition.

Identification of CIRBP as a G4-Binding Protein. RGG domains have been reported to mediate G4 binding in several G4-binding proteins. The effective binding of RGG peptide 12 might also mediate protein G4 recognition. Thus, we looked back to the protein from which peptide 12 originates, the cold-inducible RNA-binding protein (CIRBP). CIRBP has an RNA-binding function in the stress response.42,43 A recent study reported that knocking out CIRBP led to a shortened telomere length.44 Interestingly, the telomere region contains an abundance of repeated G-rich DNA sequences that fold into a G4 and are involved in the regulation of the telomere length. Peptide 12 also effectively bound telomere G4s. Thus, this evidence suggests a correlation between CIRBP and telomere

Figure 2. Chemical shift perturbation analysis of G-quadruplexes with peptide 12. (A) Imino proton regions of the 1H NMR spectra of Htg25 either alone (bottom) or with peptide 12 at a ratio of 1:2 (top). (B) Imino proton regions of the 1D 1H NMR spectra of Pu22 either alone (bottom) or with peptide 12 at a ratio of 1:2 (top). The assays were performed in 25 mM KH2PO4 buffer (70 mM KCl, 10% D2O, pH 7.4) using 400 MHz Bruker spectrometers at 25 °C.

the telomere G4 Htg22 and Pu22 with 2-Ap substitutions at different positions in the G-quartet or in the loop. Upon titration with peptide 12, the fluorescence intensity of Ap1 in the 5′ terminal region in Htg22 was significantly disturbed and that of Ap7 in the first loop was slightly disturbed (Figure 3A). However, the fluorescence of Ap13 and Ap19 was not affected. This result indicated that peptide 12 bound the 5′ terminal region and the propeller loop of Htg22. On the other hand, the fluorescence intensity of Ap21 in the 3′ terminal region in Pu22 was significantly disturbed upon titration with peptide 12, and that of Ap7 and Ap16 in the loop was slightly disturbed (Figure 3B). This result indicated that peptide 12 bound the terminal region and two propeller loops of Pu22 simultaneously. Combining the results of the NMR and 2-Ap titrations, we deduced that peptide 12 was restricted to binding on the G-quartet plane of G4s. Recent studies have found that the G-quartet is important for G4 binding by the RGG domain.34 An NMR-based binding assay revealed that the aromatic amino acids phenylalanine and tyrosine were critical in the binding. It is also important to see the key amino acids modulating G-quartet binding of the RGG peptide. 1H NMR titrations of mutated peptides to Htg25 were carried out to reveal the amino acids for G-quartet binding. Since the RGG peptide 12 exhibits a binding mode specific to the G-quartet, we focus on the imino signals belonging to the

Figure 3. Fluorescence titrations of 1 μM 2-AP-labeled G-quadruplexes with stepwise addition of peptide 12 in 10 mM Tris-HCl buffer, 100 mM KCl, pH 7.4. (A) Plot of normalized fluorescence intensity at 375 nm of Htg22 labeled with individual 2-Ap versus the binding ratio of [peptide 12]/[Htg22]. (B) Plot of normalized fluorescence intensity at 375 nm of Pu22 labeled with individual 2-Ap versus the binding ratio of [peptide 12]/[Pu22]. The 2-Ap-labeled sites in the G-quadruplexes are shown in the chart. 17950


G4s. To determine whether RGG peptide 12 mediates the protein−G4 interaction, we investigated the binding of CIRBP to G4 both in vitro and in cells. CIRBP was purified with a glutathione S-transferase (GST) tag, following a previous report (Figure S19). The affinity of CIRBP for different types of nucleic acids was determined by SPR to investigate whether CIRBP binds G4 in a structure-dependent manner. In addition to G4 (Htg22 and Pu22), hairpin (Hp18), and single-stranded (mG14) DNA, G4 RNA (TERRA) and single-stranded RNA (mTERRA) were included because CIRBP has been reported to bind RNA. As shown in Table S4 and Figures S20 and S21, CIRBP exhibited strong binding of G4s (Htg22, Pu22, and TERRA), with KD values of 35.8, 44.1, and 86.9 nM, respectively. Meanwhile, CIRBP also showed good binding of single-stranded RNA (mTERRA), with a KD value of 110.0 nM, but no binding to hairpin or single-stranded DNA, which was consistent with its role as an RNA-binding protein. It seems that the selective binding of CIRBP toward the G4 is relevant only for DNA. Filter binding and electrophoretic mobility shift assays (EMSAs) were then performed to confirm the selective binding of CIRBP to G4s. The filter binding results were in good agreement with SPR: CIRBP clearly bound G4s with an EC50 value of 73.8 nM, and no distinct binding was observed with single-stranded DNA (mHtg22 and mG14) or double-stranded hairpin DNA (Hp18) (Figure 4 and Figure S22). The sequences were then expanded in EMSAs, and similar selectivity in DNA binding was observed for CIRBP (Figure 4C and Figures S23 and S24). Biolayer interferometry (BLI) indicated that CIRBP bound telomeric G4 Htg22 DNA with a KD of 27.1 nM, and a competition assay also confirmed its selectivity for telomeric G4 DNA (Figure S25). Taken together, these data represent the first time that CIRBP is identified as a G4-binding protein. In addition, CIRBP bound different nucleic acid substrates, which is a behavior known as promiscuous binding or degenerate specificity. It was also interesting to see that CIRBP showed similar binding selectivity to peptide 12, implying that the RGG motif is crucial for CIRBP binding G4. Therefore, it could be deduced that its binding flexibility was mostly due to the plasticity of the RGG motif.

Figure 4. CIRBP binds to the G-quadruplex. (A) Filter-binding dots of CIRBP with G-quadruplex (Htg22), single-stranded (mHtg22) DNA, and hairpin (Hp18) DNA. (B) Quantification of DNA on filter-binding dots. (C) EMSA binding bands of CIRBP with G-quadruplex (Htg22), single-stranded (mHtg22) DNA, and hairpin (Hp18) DNA.

The RGG Motif Is Essential for Binding G-Quadruplexes. As discussed above, the similarity of the binding selectivity of CIRBP and peptide 12 prompted us to speculate that the RGG motif within CIRBP was responsible for the binding affinity and selectivity for its nucleic acid substrates. To pursue this hypothesis, we constructed a mutant protein with a deleted RGG motif (peptide 12), termed ΔRGG (Figure 5A). The KD values of ΔRGG for G4s were around 10 μM, which were more than 2 orders of magnitude (∼250 times) higher than those of CIRBP, denoting the necessity of the RGG motif in CIRBP binding G4 (Table S5, Figures S26 and S27). In addition, the results of filter binding and EMSAs corroborated this conclusion, as no distinct band corresponding to G4 binding was observed with ΔRGG (Figure S28). In the studies of RGG peptides, arginine at the fourth and fifth RGG repeats played a key role in G4 binding. We next explored whether these arginine residues in the RGG motif play important roles in CIRBP binding G4. Therefore, two mutant proteins were designed, each with an arginine substituted by an alanine (R108A for mRGG1 and R110A for mRGG2) (Figure 5A). Both mRGG1 and mRGG2 exhibited weaker binding of telomeric G4 than CIRBP, with KD values of 825.2 and 355.3 nM, respectively (Table S5, Figure S27). The EMSAs of these two mutants with G4s also exhibited a dramatic reduction in the nucleic acid affinity (Figure 5B, Figure S28), confirming that these two amino acid residues played an important role in mediating the CIRBP−G4 interaction. The lack of binding observed for ΔRGG and the seriously impaired binding observed for mRGG1 and mRGG2 suggested that the RGG motif (peptide 12) in CIRBP was essential for its G4 binding and that the peptide sequence should be involved as a core binding mediator to G4.

Figure 5. (A) Alterations in the RGG motif of CIRBP. (B) Filter-binding assays of RGG motif deletion (ΔRGG) or mutation (mRGG1 and mRGG2) constructs with telomeric G-quadruplex Htg22 DNA. (C) Quantification of DNA on filter-binding dots.

Interaction Mode between G-Quadruplexes and CIRBP. It has been reported that the RGG domain of G4-binding proteins effectively recognizes the loop of G4s. However, peptide 12 alone was more likely to bind on the G-quartet plane. It is therefore of interest how this RGG motif participates in G4 binding by CIRBP. To investigate this behavior, we used human telomeric DNA as a study model. The telomere G4 DNA, with different repeats of TTAGGG, can form different topologies containing different loops, including a tetramolecular G4 without a loop, a bimolecular G4 with two propeller loops, and a unimolecular G4 with two basket loops and one propeller loop.45−47 The formation of G4 structures was confirmed by CD spectra (Figure S1). We first compared the affinity of CIRBP binding to these three G4s by competition assays using BLI.48 CIRBP showed weaker binding affinity with dimer and tetramer telomere G4s than with the G4s tested above (Figure 6A). This result suggested that CIRBP might interact with the loop of G4. Then, 2-Ap titration was carried out to directly determine the possible CIRBP binding sites on telomeric G4 DNA. In contrast to the RGG peptide alone, CIRBP affected all 2-Aps, both in the G-quartet (Ap1) and in the loop sites (Ap7, Ap13, Ap19). CIRBP might therefore bind both the G-quartet and the loop sites (Figure 6B). The binding stoichiometry between CIRBP and G-quadruplex DNA was also determined by isothermal titration calorimetry (ITC). The binding enthalpy was well fitted with a 1:1 binding model, suggesting that one CIRBP might interact with both the G-quartet and the loop of G4 simultaneously (Figure S29). Considering these results, it could be concluded that CIRBP efficiently bound by surrounding the telomeric G4 DNA. In addition to the RGG motif that could bind the G-quartet, other components of CIRBP might also participate in G4 recognition, so that all loops could be reached by CIRBP. The effect of CIRBP on the G4 structure was then evaluated by fluorescence resonance energy transfer (FRET) and CD; as the G4 signal was only slightly affected, CIRBP might bind directly to G4 without helicase activity (Figures S30 and S31).

Figure 6. (A) BLI competition assay of 50 nM CIRBP binding with Htg22 in the presence of various DNAs. (B) Plot of normalized fluorescence intensity at 375 nm of Htg22 labeled with individual 2-Ap versus the binding ratio of [CIRBP]/[Htg22].

CIRBP Binds to Telomeric G-Quadruplex DNA in Cells. Because CIRBP selectively binds DNA G4 structures in vitro, we next asked whether CIRBP interacts with G4 DNA in cells. To address this question, an immunofluorescence assay was conducted.
Notably, endogenous CIRBP was expressed at a low level, making it difficult for us to observe its location. Therefore, we transfected HeLa cells with a plasmid that expresses GFP-tagged CIRBP. DNA G4s were visualized by the G4 antibody BG4 or D1 after RNase A treatment, which removed the RNA.30,31 In the images in Figure 7, partial colocalization of GFP-CIRBP and DNA G4s was observed. Because CIRBP does not bind G4 exclusively and because there are many more nucleic acids than G4s, this extent of colocalization was plausible. On the other hand, colocalization of the RGG-motif-deleted mutant (GFP-ΔRGG) and DNA G4s was not observed (Figure S32), suggesting that the RGG motif in CIRBP might also be essential for G4 binding in cells.

Figure 7. Immunofluorescence image of GFP-tagged CIRBP (green) and G-quadruplex antibody BG4 or D1 (red) in HeLa cells after RNase A treatment. The nucleus was stained with DAPI (blue). The colocalized foci are indicated by white arrows and shown in an enlarged image of the yellow box.

Telomeric G4 DNA is one of the most abundant G4s in the nucleus, and the formation of telomeric DNA G4s could play an important role in telomere maintenance. CIRBP has been reported to play a role in telomere maintenance and could effectively bind to telomeric DNA G4s in vitro in our study. Thus, CIRBP might also participate in telomere regulation as a telomeric G4-binding protein in vivo. We then investigated the interaction between CIRBP and telomeric G4s in cells. The binding of CIRBP to telomeric DNA was first evaluated by ChIP. The results indicated that CIRBP could indeed bind to telomeric repeat DNA (Figure S33). The binding of CIRBP to the telomere region was confirmed by immunofluorescence.49 In Figure S34, a large portion of GFP-tagged CIRBP was colocalized with the telomeric protein TRF2, which marks the location of telomere regions. At the same time, colocalization with the telomere was not observed for GFP-tagged ΔRGG. Taken together, the ChIP and immunofluorescence data illustrated an association of CIRBP with G4s and telomeres in cells, which was possibly driven by the G4-binding RGG motif.
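The dissociation constants quoted in the Results can be put side by side with a few lines of arithmetic. The sketch below is illustrative only: it takes the SPR KD of wild-type CIRBP for Htg22 (35.8 nM) as the reference, treats the "around 10 μM" value for ΔRGG as exactly 10 000 nM, and assumes that the mutant values were measured under comparable conditions. The function names are ours and are not part of the original analysis.

```python
import math

# KD values quoted in the text, in nM. The dRGG entry stands for the
# deletion mutant; its value is an approximation of "around 10 uM".
KD_NM = {
    "CIRBP": 35.8,      # wild type, SPR, Htg22
    "mRGG1": 825.2,     # R108A
    "mRGG2": 355.3,     # R110A
    "dRGG": 10_000.0,   # RGG motif deleted (approximate)
}


def fold_change(kd_variant_nm: float, kd_reference_nm: float) -> float:
    """Return how many times weaker the variant binds than the reference."""
    return kd_variant_nm / kd_reference_nm


def orders_of_magnitude(fold: float) -> float:
    """Express a fold change as a base-10 logarithm."""
    return math.log10(fold)


if __name__ == "__main__":
    reference = KD_NM["CIRBP"]
    for name, kd in KD_NM.items():
        fold = fold_change(kd, reference)
        print(f"{name}: {fold:.1f}-fold, {orders_of_magnitude(fold):.2f} log10 units")
```

Under these assumptions the deletion mutant binds a few hundred times more weakly than the wild type, that is, between 2 and 3 orders of magnitude, whereas the single arginine mutants lose roughly one order of magnitude or less.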



DISCUSSION

G4 has long been recognized as a regulating element for its involvement in telomere maintenance and gene expression. These processes are regulated by numerous G4-binding proteins. However, how a protein recognizes G4 is obscure. In this study, we focused on the mechanism of RGG motif binding to G4s; this motif is found in parts of G4-binding


proteins. To clarify the role of the RGG motif in G4−protein interactions, we systematically investigated the binding affinity and mode between the RGG-motif peptides and G4s. From the SPR results of peptides 1 to 12, we found that RGG peptides with similar lengths and numbers of RGG repeats but different gap amino acids and internal arrangements exhibited great differences in G4-binding ability. Among these peptides, peptide 12, with seven RGG repeats, was found to effectively bind G4 DNA over single-stranded and hairpin DNA. Mutation studies of peptide 12 revealed that both arginine and phenylalanine residues are crucial for the G4 binding of peptide 12 and that the positions of the residues are site specific. The arginine in the RGG repeat is more capable of binding G4s than lysine or alanine, and it is likely that arginine not only provides positive charges for electrostatic interactions but also provides hydrogen bonding or π-stacking interaction sites with substrates. Thus, arginine is conserved in the RGG motif. Moreover, the phenylalanine is more capable of binding G4s than leucine or alanine, suggesting its role in G4 binding via π−π interactions. In addition, the internal arrangement of RGG repeats and gap amino acids, rather than a critical number of RGG repeats, also plays a role in the G4-binding mechanism. We also found that RGG peptide 12 binds mainly on one G-quartet plane of G4s. The arginine in the fourth RGG repeat and the phenylalanines in the middle of the sequence were found to be the most important amino acids in G-quartet binding. This G-quartet binding mode was also observed in the RGG3 domain of TLS/FUS and in the peptide Rhau18.34,40 Interestingly, a recent crystal structure showed that Rhau18 mediated the binding between protein DHX36 and a G4 with a similar binding mode (Figure S18).18 These intriguing results brought our focus to the protein of origin of peptide 12, CIRBP.

According to SPR, BLI, filter binding, and EMSA results, we found that CIRBP bound both G4 and single-stranded structures in RNA but only G4 structures in DNA. As CIRBP has been reported as an RNA-binding protein whose action is mediated in part through an RNA recognition motif, it is not surprising that CIRBP has a broad substrate-binding scope in RNA. However, the DNA binding of CIRBP is specific for the G4 structure, which might be mediated by the G4-binding RGG motif. The cellular interaction between CIRBP and G4 DNA was confirmed by an immunofluorescence assay in HeLa cells. In addition, the binding of CIRBP to DNA is associated with telomeres. This observation suggests that CIRBP might bind telomeric G4 DNA in vivo. Considering the effect of CIRBP in telomeric DNA maintenance, CIRBP might also be involved in this function via binding with telomeric G4 DNA. The G4 DNA-binding domain of CIRBP was confirmed by the deletion and mutation of the RGG motif. The deletion of the RGG motif (peptide 12) completely abolished the binding affinity of CIRBP to G4s, while mutations of arginine 108 or 110 in the RGG motif resulted in much weaker binding affinity. Both the deletion and the mutations indicated that the RGG motif−G4 interaction is necessary for the CIRBP recognition of G4 DNA. In particular, arginines 108 and 110 in CIRBP might be the key residues for its binding to DNA G4s and are also key amino acids for the binding of peptide 12 alone. The highly consistent key amino acids clearly suggest that this RGG motif is the core mediator of the CIRBP−G4 interaction.

In conclusion, our study provides new insight into how RGG peptides bind to G4s. The internal arrangement of RGG repeats and gap amino acids played a more crucial role in the G4-binding mechanism than a critical number of RGG repeats. Arginines and phenylalanines at the exact position of the RGG motif might enable additional hydrogen bonding and π-stacking interaction with nucleobases and strengthen the binding to G4s. Impressively, proceeding from the G4-binding RGG peptide 12, we identified CIRBP as a new G4 DNA-binding protein both in vitro and in cells. This is the first time that a G4-binding protein was identified from the exploration of the G4-binding RGG motif. In addition, we found that the RGG motif acted as a key mediator in the G4 binding of CIRBP. The binding ability of this key peptide segment to G4 could represent the binding ability of the full-length protein. A recent analysis also identified the RGG sequence as a motif shared among 77 G4-binding proteins.20 In future studies, based on deeper insight into the role of the G4-binding RGG motif, it might be possible to rapidly identify G4-binding proteins by analyzing the internal arrangement of RGG repeats. This approach might be a novel strategy for discovering new G4-binding proteins by exploring key peptide segments.
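The idea of analyzing the internal arrangement of RGG repeats in a protein sequence can be sketched in a few lines. The example below is a hypothetical illustration and not the procedure used in this study: the function name, the returned fields, and the toy sequence are all assumptions, and the toy sequence is neither peptide 12 nor CIRBP. It locates each RGG repeat, extracts the gap residues between consecutive repeats, and counts aromatic gap residues (F and Y), which are the features the text highlights.

```python
import re


def rgg_arrangement(sequence: str) -> dict:
    """Summarize RGG repeats and the gap residues between them.

    Hypothetical helper for illustration; the summary fields are our choice.
    """
    seq = sequence.upper()
    # Start index of every RGG tripeptide (RGG cannot overlap itself).
    starts = [m.start() for m in re.finditer("RGG", seq)]
    # Residues lying between the end of one repeat and the start of the next.
    gaps = [seq[a + 3:b] for a, b in zip(starts, starts[1:])]
    aromatic = sum(gap.count("F") + gap.count("Y") for gap in gaps)
    return {
        "n_repeats": len(starts),
        "starts": starts,
        "gaps": gaps,
        "aromatic_in_gaps": aromatic,
    }


if __name__ == "__main__":
    toy = "RGGFRGGYGRGGRGG"  # made-up sequence, for demonstration only
    print(rgg_arrangement(toy))
```

Two sequences with the same number of repeats can then be compared by their gap lists, which is the kind of difference the peptide series in this work points to.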



ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/jacs.8b09329.



Characterization of the peptides and proteins; supplemental spectra and graphs (PDF)

AUTHOR INFORMATION

Corresponding Authors

*[email protected]
*[email protected]

ORCID

Jia-Heng Tan: 0000-0002-1612-7482
Shuo-Bin Chen: 0000-0001-9118-2185
Zhi-Shu Huang: 0000-0002-6211-5482

Author Contributions

‡Z.-L. Huang and J. Dai contributed equally.

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (81330077, 21708053, 81872732, and 21672265), the Natural Science Foundation of Guangdong Province (2017A030308003 and 2017A030313040), the Ministry of Education of China (No. IRT-17R111), the Fundamental Research Funds for the Central Universities (17ykpy18), and the Guangdong Provincial Key Laboratory of Construction Foundation (2017B030314030).



REFERENCES

(1) Bochman, M. L.; Paeschke, K.; Zakian, V. A. DNA secondary structures: stability and function of G-quadruplex structures. Nat. Rev. Genet. 2012, 13 (11), 770−80.
(2) Chambers, V. S.; Marsico, G.; Boutell, J. M.; Di Antonio, M.; Smith, G. P.; Balasubramanian, S. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat. Biotechnol. 2015, 33 (8), 877−81.
(3) Kwok, C. K.; Marsico, G.; Sahakyan, A. B.; Chambers, V. S.; Balasubramanian, S. rG4-seq reveals widespread formation of G-quadruplex structures in the human transcriptome. Nat. Methods 2016, 13 (10), 841−4.
(4) Chen, X.-C.; Chen, S.-B.; Dai, J.; Yuan, J.-H.; Ou, T.-M.; Huang, Z.-S.; Tan, J.-H. Tracking the Dynamic Folding and Unfolding of RNA G-Quadruplexes in Live Cells. Angew. Chem., Int. Ed. 2018, 57 (17), 4702−6.
(5) Kuryavyi, V.; Phan, A. T.; Patel, D. J. Solution structures of all parallel-stranded monomeric and dimeric G-quadruplex scaffolds of the human c-kit2 promoter. Nucleic Acids Res. 2010, 38 (19), 6757−73.
(6) Dhakal, S.; Cui, Y.; Koirala, D.; Ghimire, C.; Kushwaha, S.; Yu, Z.; Yangyuoru, P. M.; Mao, H. Structural and mechanical properties of individual human telomeric G-quadruplexes in molecularly crowded solutions. Nucleic Acids Res. 2013, 41 (6), 3915−23.
(7) Tian, T.; Chen, Y.-Q.; Wang, S.-R.; Zhou, X. G-Quadruplex: A Regulator of Gene Expression and Its Chemical Targeting. Chem. 2018, 4 (6), 1314−44.
(8) Takahashi, S.; Brazier, J. A.; Sugimoto, N. Topological impact of noncanonical DNA structures on Klenow fragment of DNA polymerase. Proc. Natl. Acad. Sci. U. S. A. 2017, 114 (36), 9605−10.
(9) Sutherland, C.; Cui, Y.; Mao, H.; Hurley, L. H. A Mechanosensor Mechanism Controls the G-Quadruplex/i-Motif Molecular Switch in the MYC Promoter NHE III1. J. Am. Chem. Soc. 2016, 138 (42), 14138−51.
(10) Reyes-Reyes, E. M.; Teng, Y.; Bates, P. J. A new paradigm for aptamer therapeutic AS1411 action: uptake by macropinocytosis and its stimulation by a nucleolin-dependent mechanism. Cancer Res. 2010, 70 (21), 8617−29.
(11) Blice-Baum, A. C.; Mihailescu, M. R. Biophysical characterization of G-quadruplex forming FMR1 mRNA and of its interactions with different fragile X mental retardation protein isoforms. RNA 2014, 20 (1), 103−14.
(12) Zhang, Y.; Gaetano, C. M.; Williams, K. R.; Bassell, G. J.; Mihailescu, M. R. FMRP interacts with G-quadruplex structures in the 3′-UTR of its dendritic target Shank1 mRNA. RNA Biol. 2014, 11 (11), 1364−74.
(13) Neidle, S.; Parkinson, G. Telomere maintenance as a target for anticancer drug discovery. Nat. Rev. Drug Discovery 2002, 1 (5), 383−93.
(14) Ray, S.; Bandaria, J. N.; Qureshi, M. H.; Yildiz, A.; Balci, H. G-quadruplex formation in telomeres enhances POT1/TPP1 protection against RPA binding. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (8), 2990−5.
(15) Takahama, K.; Takada, A.; Tada, S.; Shimizu, M.; Sayama, K.; Kurokawa, R.; Oyoshi, T. Regulation of telomere length by G-quadruplex telomere DNA- and TERRA-binding protein TLS/FUS. Chem. Biol. 2013, 20 (3), 341−50.
(16) Williams, P.; Li, L.; Dong, X.; Wang, Y. Identification of SLIRP as a G Quadruplex-Binding Protein. J. Am. Chem. Soc. 2017, 139 (36), 12426−9.
(17) Vaughn, J. P.; Creacy, S. D.; Routh, E. D.; Joyner-Butt, C.; Jenkins, G. S.; Pauli, S.; Nagamine, Y.; Akman, S. A. The DEXH protein product of the DHX36 gene is the major source of tetramolecular quadruplex G4-DNA resolving activity in HeLa cell lysates. J. Biol. Chem. 2005, 280 (46), 38117−20.
(18) Chen, M. C.; Tippana, R.; Demeshkina, N. A.; Murat, P.; Balasubramanian, S.; Myong, S.; Ferre-D’Amare, A. R. Structural basis of G-quadruplex unfolding by the DEAH/RHA helicase DHX36. Nature 2018, 558 (7710), 465−9.
(19) Brazda, V.; Haronikova, L.; Liao, J. C.; Fojta, M. DNA and RNA quadruplex-binding proteins. Int. J. Mol. Sci. 2014, 15 (10), 17493−517.
(20) Brazda, V.; Cerven, J.; Bartas, M.; Mikyskova, N.; Coufal, J.; Pecinka, P. The Amino Acid Composition of Quadruplex Binding Proteins Reveals a Shared Motif and Predicts New Potential Quadruplex Interactors. Molecules 2018, 23 (9), 2341−56.
(21) Thandapani, P.; O’Connor, T. R.; Bailey, T. L.; Richard, S. Defining the RGG/RG motif. Mol. Cell 2013, 50 (5), 613−23.
(22) Gonzalez, V.; Hurley, L. H. The C-terminus of nucleolin promotes the formation of the c-MYC G-quadruplex and inhibits c-MYC promoter activity. Biochemistry 2010, 49 (45), 9706−14.
(23) Takahama, K.; Kino, K.; Arai, S.; Kurokawa, R.; Oyoshi, T. Identification of Ewing’s sarcoma protein as a G-quadruplex DNA- and RNA-binding protein. FEBS J. 2011, 278 (6), 988−98.
(24) Takahama, K.; Oyoshi, T. Specific binding of modified RGG domain in TLS/FUS to G-quadruplex RNA: tyrosines in RGG domain recognize 2’-OH of the riboses of loops in G-quadruplex. J. Am. Chem. Soc. 2013, 135 (48), 18016−9.
(25) Ozdilek, B. A.; Thompson, V. F.; Ahmed, N. S.; White, C. I.; Batey, R. T.; Schwartz, J. C. Intrinsically disordered RGG/RG domains mediate degenerate specificity in RNA binding. Nucleic Acids Res. 2017, 45 (13), 7984−96.
(26) Yagi, R.; Miyazaki, T.; Oyoshi, T. G-quadruplex binding ability of TLS/FUS depends on the beta-spiral structure of the RGG domain. Nucleic Acids Res. 2018, 46 (12), 5894−5901.
(27) Takahama, K.; Sugimoto, C.; Arai, S.; Kurokawa, R.; Oyoshi, T. Loop lengths of G-quadruplex structures affect the G-quadruplex DNA binding selectivity of the RGG motif in Ewing’s sarcoma. Biochemistry 2011, 50 (23), 5369−78.
(28) Vasilyev, N.; Polonskaia, A.; Darnell, J. C.; Darnell, R. B.; Patel, D. J.; Serganov, A. Crystal structure reveals specific recognition of a G-quadruplex RNA by a beta-turn in the RGG motif of FMRP. Proc. Natl. Acad. Sci. U. S. A. 2015, 112 (39), E5391−400.
(29) Phan, A. T.; Kuryavyi, V.; Darnell, J. C.; Serganov, A.; Majumdar, A.; Ilin, S.; Raslin, T.; Polonskaia, A.; Chen, C.; Clain, D.; Darnell, R. B.; Patel, D. J. Structure-function studies of FMRP RGG peptide recognition of an RNA duplex-quadruplex junction. Nat. Struct. Mol. Biol. 2011, 18 (7), 796−804.
(30) Biffi, G.; Tannahill, D.; McCafferty, J.; Balasubramanian, S. Quantitative visualization of DNA G-quadruplex structures in human cells. Nat. Chem. 2013, 5 (3), 182−6.
(31) Liu, H.-Y.; Zhao, Q.; Zhang, T.-P.; Wu, Y.; Xiong, Y.-X.; Wang, S.-K.; Ge, Y.-L.; He, J.-H.; Lv, P.; Ou, T.-M.; Tan, J.-H.; Li, D.; Gu, L.-Q.; Ren, J.; Zhao, Y.; Huang, Z.-S. Conformation Selective Antibody Enables Genome Profiling and Leads to Discovery of Parallel G-Quadruplex in Human Telomeres. Cell Chem. Biol. 2016, 23 (10), 1261−70.
(32) Nagatoishi, S.; Isono, N.; Tsumoto, K.; Sugimoto, N. Hydration is required in DNA G-quadruplex-protein binding. ChemBioChem 2011, 12 (12), 1822−6.
(33) Corley, S. M.; Gready, J. E. Identification of the RGG box motif in Shadoo: RNA-binding and signaling roles? Bioinf. Biol. Insights 2008, 2, 383−400.
(34) Kondo, K.; Mashima, T.; Oyoshi, T.; Yagi, R.; Kurokawa, R.; Kobayashi, N.; Nagata, T.; Katahira, M. Plastic roles of phenylalanine and tyrosine residues of TLS/FUS in complex formation with the G-quadruplexes of telomeric DNA and TERRA. Sci. Rep. 2018, 8 (1), 2864−75.
(35) Sathyapriya, R.; Vishveshwara, S. Interaction of DNA with clusters of amino acids in proteins. Nucleic Acids Res. 2004, 32 (14), 4109−18.
(36) Monchaud, D.; Teulade-Fichou, M. P. G4-FID: a fluorescent DNA probe displacement assay for rapid evaluation of quadruplex ligands. Methods Mol. Biol. 2010, 608, 257−71.
(37) Dai, J.; Carver, M.; Hurley, L. H.; Yang, D. Solution structure of a 2:1 quindoline-c-MYC G-quadruplex: insights into G-quadruplex-interactive small molecule drug design. J. Am. Chem. Soc. 2011, 133 (44), 17673−80.
(38) Luu, K. N.; Phan, A. T.; Kuryavyi, V.; Lacroix, L.; Patel, D. J. Structure of the Human Telomere in K+ Solution: An Intramolecular (3 + 1) G-Quadruplex Scaffold. J. Am. Chem. Soc. 2006, 128 (30), 9963−70.
(39) Cravens, H. A scientific project locked in time. The Terman Genetic Studies of Genius, 1920s-1950s. Am. Psychol. 1992, 47 (2), 183−9.
(40) Heddi, B.; Cheong, V. V.; Martadinata, H.; Phan, A. T. Insights into G-quadruplex specific recognition by the DEAH-box helicase RHAU: Solution structure of a peptide-quadruplex complex. Proc. Natl. Acad. Sci. U. S. A. 2015, 112 (31), 9608−13.
(41) Mashima, T.; Matsugami, A.; Nishikawa, F.; Nishikawa, S.; Katahira, M. Unique quadruplex structure and interaction of an RNA aptamer against bovine prion protein. Nucleic Acids Res. 2009, 37 (18), 6249−58.
(42) Yang, R.; Weber, D. J.; Carrier, F. Post-transcriptional regulation of thioredoxin by the stress inducible heterogenous ribonucleoprotein A18. Nucleic Acids Res. 2006, 34 (4), 1224−36.
(43) Chen, J. K.; Lin, W. L.; Chen, Z.; Liu, H. W. PARP-1-dependent recruitment of cold-inducible RNA-binding protein promotes double-strand break repair and genome stability. Proc. Natl. Acad. Sci. U. S. A. 2018, 115 (8), E1759−68.
(44) Zhang, Y.; Wu, Y.; Mao, P.; Li, F.; Han, X.; Zhang, Y.; Jiang, S.; Chen, Y.; Huang, J.; Liu, D.; Zhao, Y.; Ma, W.; Songyang, Z. Cold-inducible RNA-binding protein CIRP/hnRNP A18 regulates telomerase activity in a temperature-dependent manner. Nucleic Acids Res. 2016, 44 (2), 761−75.
(45) Parkinson, G. N.; Lee, M. P.; Neidle, S. Crystal structure of parallel quadruplexes from human telomeric DNA. Nature 2002, 417 (6891), 876−80.
(46) Dai, J.; Punchihewa, C.; Ambrus, A.; Chen, D.; Jones, R. A.; Yang, D. Structure of the intramolecular human telomeric G-quadruplex in potassium solution: a novel adenine triple formation. Nucleic Acids Res. 2007, 35 (7), 2440−50.
(47) Hounsou, C.; Guittat, L.; Monchaud, D.; Jourdan, M.; Saettel, N.; Mergny, J. L.; Teulade-Fichou, M. P. G-quadruplex recognition by quinacridines: a SAR, NMR, and biological study. ChemMedChem 2007, 2 (5), 655−66.
(48) Chen, S.-B.; Hu, M.-H.; Liu, G.-C.; Wang, J.; Ou, T.-M.; Gu, L.-Q.; Huang, Z.-S.; Tan, J.-H. Visualization of NRAS RNA G-Quadruplex Structures in Cells with an Engineered Fluorogenic Hybridization Probe. J. Am. Chem. Soc. 2016, 138 (33), 10382−5.
(49) Rodriguez, R.; Muller, S.; Yeoman, J. A.; Trentesaux, C.; Riou, J. F.; Balasubramanian, S. A novel small molecule that alters shelterin integrity and triggers a DNA-damage response at telomeres. J. Am. Chem. Soc. 2008, 130 (47), 15758−9.
