
B: Biophysics; Physical Chemistry of Biological Systems and Biomolecules

G-Quadruplex and I-Motif Structures Within the Telomeric DNA Duplex. A Molecular Dynamics Analysis of Protonation States as Factors Affecting Their Stability
Pawel Wolski, Krzysztof Nieszporek, and Tomasz Panczyk
J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.8b11547 • Publication Date (Web): 27 Dec 2018
Downloaded from http://pubs.acs.org on January 4, 2019



The Journal of Physical Chemistry

G-quadruplex and I-motif Structures Within the Telomeric DNA Duplex. A Molecular Dynamics Analysis of Protonation States as Factors Affecting Their Stability

Pawel Wolski1, Krzysztof Nieszporek2, Tomasz Panczyk1,*

1 Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, ul. Niezapominajek 8, 30239 Cracow, Poland
e-mail: [email protected] phone: +48 815375620; fax: +48 815375685

2 Department of Chemistry, Maria Curie-Sklodowska University, pl. M. Curie-Sklodowskiej 3, 20031 Lublin, Poland

Abstract

Molecular dynamics simulations were employed to study the properties of G-quadruplex and i-motif secondary DNA structures formed within the canonical telomere fragment of the Watson-Crick duplex. These secondary structures were built symmetrically at the same position of the duplex and were analyzed in standard unbiased simulations and with a metadynamics scheme used to determine the potential of mean force associated with the enforced unfolding of the i-motif parts of the systems. The enforced formation of i-motif structures, starting from a partially unfolded duplex, was also studied in order to find out whether formation of the i-motif facilitates spontaneous formation of the G-quadruplex. We found that the i-motif formed from single-stranded DNA is unstable at neutral pH and room temperature. On the other hand, the i-motif is strongly stabilized by the presence of the complementary G-quadruplex, which should be the most likely configuration when these secondary structures form from double-stranded DNA. The stabilization is observed at both neutral and acidic pH, though in the neutral case the i-motif can also show considerable stability in the hairpin configuration. We did not observe spontaneous folding of the guanine-rich strand into a G-quadruplex when the cytosine-rich strand was dragged into the i-motif configuration. This observation suggests that both folding and unfolding transitions are kinetically blocked.

*Corresponding Author, e-mail: [email protected] phone: +48 815375620, fax: +48 815375685

ACS Paragon Plus Environment


1. Introduction

The telomeric DNA consists of highly repetitive short sequences (repeated approximately 2500 times in humans) of the nucleotides (TTAGGG):(CCCTAA).1-4 The biological role of this DNA fragment is to protect the chromosome from deterioration and from fusion with neighboring chromosomes. Telomeres are also involved in genome integrity, cellular aging and perhaps cancer. Both the guanine-rich (G-rich) and cytosine-rich (C-rich) strands of telomeric DNA are able to form unusual secondary structures. The G-rich strand can form the so-called G-quadruplex structure, in which four guanines form planar quartets. The G-quadruplex is normally observed at neutral pH in the presence of cations (Na+, K+).4-6 The C-rich strand, as found by Gehring et al.,7 can form intercalated, quadruple-helical structures under acidic conditions. This structure (the i-motif) consists of two parallel duplexes combined in an antiparallel fashion through intercalated hemiprotonated cytosine-cytosine base pairs. These noncanonical DNA structures, that is, the i-motif1,3,7-12 and the G-quadruplex,4,5,13-15 have recently been carefully studied using both experimental and computational techniques.16-21 However, in the majority of cases these structures were studied separately and as short individual fragments. To the best of our knowledge, theoretical studies of these secondary structures were always limited to individually existing short sequences of telomeric DNA. Because folding and unfolding of telomeric DNA in vivo have important consequences, e.g. for the inhibition of telomerase activity,22,23 a careful analysis of those processes by means of computational tools is important.

Molecular dynamics simulations of G-quadruplex folding and unfolding led to several important observations; namely, it was found that those processes are described by a kinetic partitioning mechanism. This means that there is a competition between many well-separated and structurally different conformational ensembles.24 Thus, even in the case of G-quadruplex formation from a single strand of a guanine-rich sequence, at least three different intermediates were observed before the final G-quadruplex was formed; moreover, the timescale of such a process reached several hours.25 Similar complexities were also observed in unfolding transitions of G-quadruplexes.24,25 The folding/unfolding transitions of the i-motif structure are less well understood, and they represent a slightly more complex problem from the computational point of view because protonation/deprotonation reactions are involved.16,18 However, most of the studies carried out so far indicate that formation of the i-motif occurs in a sequential manner with intermediate duplex and triplex states. The timescales of the folding and unfolding transitions reach the millisecond regime, as found using the stopped-flow circular dichroism technique.10


Considering naturally formed structures, it seems that the G-quadruplex and i-motif should appear together. This is because formation of one of the structures leaves uncompensated strain in the complementary chain. An exception may be the formation of G-quadruplexes within the telomeric 3’ overhangs, because there are no complementary cytosine-rich strands. Within the telomeric WC duplex these secondary structures may appear at the same position or at different locations on the duplex.1 Therefore, our studies are focused mainly on a structure where the i-motif and G-quadruplex appear at exactly the same position of the duplex, as this seems to be the most likely case. However, recent studies demonstrated that these structures are often mutually exclusive due to steric hindrance.26,27 In still other studies, concerning the promoter region of the RET oncogene, which also contains G-rich and C-rich sequences, the G-quadruplex/i-motif analog was found to be a stable low-energy construct. Its existence was confirmed by circular dichroism spectra and molecular footprinting methods.28 This makes the structure even more intriguing, and a careful molecular dynamics analysis is necessary in order to better understand the physical factors responsible for its existence (or not) in vivo.

The aim of this study is thus the analysis of an extended structure containing both the G-quadruplex and i-motif and terminated by canonical Watson-Crick duplex fragments. It allows us to study how the presence of one of the structures (e.g. the G-quadruplex) affects the stability of the other (the i-motif), and also to draw some conclusions concerning the likelihood of simultaneous existence of these structures at the same place of the telomeric duplex in vivo. The studies also address the role of the protonation state of cytosines as a factor affecting the stability of the overall structure. Temperature and the presence of longer single-stranded fragments of DNA terminating the i-motif spatial structure are also considered as additional factors which can affect the stability of the considered species.

2. Methods

The studied structure of the i-motif + G-quadruplex (iG) was composed using the NAB (nucleic acid builder) language from the AmberTools16 package29 and pdb files from the PDB database. The starting structure of the Watson-Crick (WC) duplex of the sequence [5’-GGG(TTAGGG)6]:[5’-CCC(TAACCC)6] was built using NAB, and next the atomic coordinates of bases 13 to 33 were replaced by the corresponding coordinates from pdb files 2JPZ and 1EL2. In that way we obtained the structure iG shown schematically in Fig. 1, that is, the telomeric fragment of DNA with the secondary structures of the i-motif and G-quadruplex placed symmetrically in the middle of the WC duplex. It should be underlined that we consider the situation in which the i-motif and G-quadruplex are located at the same position on the duplex, in contrast to other possible cases such as their individual appearance or an offset between them. Such an arrangement intuitively seems to be the most likely in vivo. For comparison, we also studied two smaller structures, i.e. the i-motif with intact single-stranded fragments (long i-motif, liM) and the i-motif without the single-stranded fragments (short i-motif, siM). All three structures (after the equilibration stage) are shown in Fig. 2.
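The sequence bookkeeping behind this construction can be checked with a short script. The following is only an illustrative sketch: the two strand sequences are taken from the text, while the helper names and the assumption that bases are numbered per strand from the 5' end are ours and are not stated in the paper.

```python
# Illustrative sketch: sequences come from the text, helper names are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def fragment(seq: str, first: int, last: int) -> str:
    """Return bases first..last (1-based, inclusive, counted from the 5' end)."""
    return seq[first - 1:last]


# Strands as written in the text, both read 5' -> 3'.
g_strand = "GGG" + "TTAGGG" * 6
c_strand = "CCC" + "TAACCC" * 6

# Region whose coordinates are replaced (bases 13 to 33; per-strand numbering is assumed).
g_core = fragment(g_strand, 13, 33)
c_core = fragment(c_strand, 13, 33)

# Cytosines protonated in the acidic pH case (assumed to follow the c_strand numbering).
protonated = [13, 14, 15, 19, 20, 21]

if __name__ == "__main__":
    print(len(g_strand), len(c_strand))
    print(reverse_complement(g_strand) == c_strand)
    print(g_core, c_core)
    print([c_strand[i - 1] for i in protonated])
```

Under the assumed numbering, the protonated positions all fall on cytosines of the C-rich strand, and the two strands are exact reverse complements of each other.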

Figure 1. Schematic representation of the secondary structure of the telomeric DNA fragment iG studied in this work. The i-motif and G-quadruplex are located symmetrically at the same position of the Watson-Crick duplex. The i-motif and G-quadruplex form due to pairing of the corresponding bases (Hoogsteen base pairing) shown in the insets.

Construction of the simulation box, the software applied and all settings were very similar to those in ref. 16. Thus, all calculations were based on the amber force field for nucleic acids30 ff99 with the bsc1 modifications.31 The topologies were constructed using self-designed scripts, and the force field was generated using the tleap program from the AmberTools16 package. The input scripts for tleap launched commands for using the tip3p water model, suitable amounts of Na+ and Cl- to give an ionic strength of 0.145 mol L-1, and protonation of cytosines in the case of acidic conditions. At neutral pH all cytosines are in their standard unprotonated forms, and we will refer to this state as the neutral pH case. However, in order to study the effects of reduced pH, cytosines 13, 14, 15, 19, 20 and 21 were additionally protonated by adding extra protons to nitrogen atoms. We will refer to this state as the acidic pH case. The protonation modifies the charge distribution within the molecules, and the new set of point charges comes from dedicated quantum chemical calculations done by the amber force field developers.30 All those effects were accounted for by the tleap program, which is able to use the force field parameterization for protonated nucleic acids as well. It should be underlined that the protonated state of the i-motif obtained in that way represents a different molecular topology of the system, as it contains extra hydrogen atoms, a different total charge and a different set of point charges within the cytosine residues. Moreover, it is a static configuration: the deprotonation cannot occur partially, as this is a non-reactive force field. Of course, full deprotonation is possible, as it corresponds to the neutral pH case, i.e. to the standard molecular topologies of cytosines.

All calculations were done using the lammps molecular dynamics engine.32 Every studied system was initially heated from 100 K to 310 K within 1 ns of simulation time at constant volume. Next, equilibration runs were performed at a constant temperature of 310 K and a pressure of 1 atm for 2 ns. The final states from the equilibration runs were used as starting configurations in production runs. The lengths of the production runs in unbiased simulations were 40 ns, and most of the analyses were done using results from these 40 ns runs. However, in order to confirm the convergence (or, in some cases, the lack of stationary states), extra 10 ns runs were performed for each analyzed system. In the case of metadynamics simulations, used for the determination of the potential of mean force profiles, the total simulation times were 60 ns. The calculations were carried out in the NPT ensemble using a 2 fs integration timestep and a 12 Å cutoff distance for interatomic interactions. The pressure and temperature were controlled using the Nose-Hoover barostat33 with relaxation time constants of 2 ps and 0.2 ps for pressure and temperature, respectively. The sizes of the simulation boxes were ca. 70 x 70 x 100 Å, and the numbers of water molecules were ca. 18000 to 31000, depending on the system being studied. The total numbers of atoms in the simulation boxes ranged from 56681 in the smallest system to 96936 in the largest. The water molecules were kept rigid using the shake34 algorithm.
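To give a feeling for the scale of the protocol described above, the run lengths can be converted to integration steps and the salt concentration to an approximate number of ion pairs. This is a rough sketch, not the authors' set-up script: it assumes that the 2 fs timestep was used in every stage, it uses the approximate box size quoted in the text, and it ignores both the volume occupied by the DNA and the extra counterions needed to neutralize it. All names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # 1/mol
TIMESTEP_FS = 2.0          # fs, from the text (assumed to apply to all stages)


def n_steps(duration_ns: float, timestep_fs: float = TIMESTEP_FS) -> int:
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1.0e6 / timestep_fs)


def ion_pairs(conc_mol_per_l: float, box_angstrom: tuple) -> float:
    """Estimated number of Na+/Cl- pairs for a given concentration and box size.

    Uses 1 cubic angstrom = 1e-27 L and neglects the solute volume.
    """
    lx, ly, lz = box_angstrom
    volume_l = lx * ly * lz * 1.0e-27
    return conc_mol_per_l * volume_l * AVOGADRO


# Stage lengths in ns, as quoted in the text.
stages_ns = {
    "heating": 1.0,
    "equilibration": 2.0,
    "production": 40.0,
    "extra_production": 10.0,
    "metadynamics": 60.0,
}
steps = {name: n_steps(length) for name, length in stages_ns.items()}

if __name__ == "__main__":
    for name, count in steps.items():
        print(f"{name}: {count} steps")
    print(f"ion pairs: {ion_pairs(0.145, (70.0, 70.0, 100.0)):.1f}")
```

With these assumptions a 40 ns production run corresponds to 20 million steps, and the quoted ionic strength corresponds to roughly 43 ion pairs in a 70 x 70 x 100 Å box.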


Figure 2. Spatial structures of the considered systems at acidic pH: (A) iG system, in which the upper secondary structure (red) is the i-motif and the lower one (blue) is the G-quadruplex; (B) long i-motif, liM; (C) short i-motif, siM.

Free energy computations were based on the metadynamics bias scheme35 by utilizing the collective variables module (colvars).36 This approach was mainly used for the controlled decomposition of i-motif structures under various conditions and to monitor the energetic costs of those transformations. A closer description of the variable definitions and computational set-ups will be given later.

3. Results and Discussion

3.1 Role of pH in stabilization of i-motif containing DNA structures

There is a large body of literature showing that the i-motif can reversibly fold and unfold to a hairpin structure in response to pH change. Most of the studies were experimental ones based on monitoring circular dichroism spectra or on various fluorescence techniques.9-11,20,37-39 Theoretical analyses of the i-motif, including computer simulations, are less frequent, though the unfolding processes have been thoroughly investigated by Smiatek et al.17-19 and by Panczyk and Wolski.16 Thus, several important conclusions have already been drawn concerning the i-motif stability as a function of pH. At acidic pH the i-motif structure is highly stable, but an increase of pH (deprotonation of cytosines) leads to spontaneous unfolding of the i-motif to a hairpin structure. The free energy barrier accompanying the unfolding process was determined by Smiatek17,18 and is confirmed by our current analysis. It is not larger than 25 kJ mol-1; thus thermal agitation is able to destroy the i-motif in a short time. We also observed spontaneous unfolding of the i-motif at neutral pH in unbiased simulations, but the process lasted ca. 40 ns.16

It should also be mentioned that the distinction between the behavior of the protonated and deprotonated forms of the i-motif in folding/unfolding transitions is a simplification. This is because the pKa of cytosines might change to some extent depending on the local environment, i.e. in the folded i-motif it might be different than in the unfolded state. In a recent paper, Kim and Chalikian9 proposed a theoretical model which allows the pKa of cytosines in the unfolded chain and in the i-motif structure to be determined and distinguished. They found that the transition from the unfolded state to the i-motif occurs at pH = 6.1; however, at the still lower pH = 3.6 the i-motif unfolds again due to protonation of the rest of the cytosines. We are considering the case of weakly acidic conditions in which half of the cytosines become protonated; thus, the appropriate pKa of the i-motif is 6.1. The pKa of cytosines in the unfolded state was determined by these authors to be ca. 4.8. Thus, it is very likely that during the unfolding of the i-motif (due to an increase of pH or other forces) the cytosines undergo fast deprotonation. But this effect should not affect our computational results significantly, since in the unfolded state the distances between atoms capable of forming hydrogen bonds become large.

Figs. 1 and 2 show both the i-motif and G-quadruplex forming one complex structure which is additionally capped by duplex fragments. It is obvious that the stability of that complex structure, iG, differs from that of the short fragment containing 22 nucleotides, siM. It is also very likely that the stability of an isolated i-motif capped by short single-stranded fragments of telomeric sequence, liM (Fig. 2B), will differ from that of the other two. Looking at Figs. 1 and 2 we can see that the stability of these secondary structures should be enhanced by the formation of hydrogen bonds. In the case of the G-quadruplex, four guanines form planar quartets and each guanine contributes to four hydrogen bonds with the neighboring bases. It is important to note that guanines do not undergo protonation/deprotonation reactions under the considered conditions; thus a pH change formally has no effect on the stability of G-quadruplexes. Instead, the presence of Na+ or K+ ions stabilizes the G-quadruplex structure in vivo.40,41
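The statement that a barrier of at most 25 kJ mol-1 can be overcome by thermal agitation can be put in perspective by expressing the barrier in units of RT. The sketch below is only a back-of-the-envelope illustration with hypothetical function names; the Boltzmann factor it prints is not a rate constant, since no prefactor or transition-state model is assumed.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def barrier_in_rt(barrier_kj_mol: float, temperature_k: float) -> float:
    """Free energy barrier expressed as a multiple of RT."""
    return barrier_kj_mol * 1000.0 / (R_GAS * temperature_k)


def boltzmann_factor(barrier_kj_mol: float, temperature_k: float) -> float:
    """exp(-barrier/RT): relative weight of the barrier top, not a rate."""
    return math.exp(-barrier_in_rt(barrier_kj_mol, temperature_k))


if __name__ == "__main__":
    # 25 kJ/mol is the upper bound quoted in the text; 310 K is the simulation temperature.
    for temperature in (310.0, 400.0, 500.0):
        print(temperature,
              round(barrier_in_rt(25.0, temperature), 2),
              f"{boltzmann_factor(25.0, temperature):.2e}")
```

At 310 K the quoted upper bound corresponds to a little under 10 RT, and the same barrier becomes proportionally smaller in RT units at the elevated temperatures used later in the paper.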

The i-motif is, in turn, highly sensitive to pH because the cytosines may be reversibly protonated and deprotonated in the pH range from 4 to 7 (ref. 1). This means that regulation of pH leads to the formation or cleavage of the third hydrogen bond in each C-C+ pair (Fig. 1) and, as a result, the stability of the i-motif structure changes. This was seen and computationally supported in the case of the siM fragment.16,17 Larger structures like liM or iG have not been studied in this context so far. Fig. 3 shows the simulation snapshots of the considered structures taken after 40 ns; however, extra 10 ns calculations were performed in each case in order to confirm the observations already made.
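The pH sensitivity discussed here can be illustrated with the Henderson-Hasselbalch relation for a single titratable site. This is a deliberately simplified sketch and not the model of Kim and Chalikian cited above: it treats each cytosine as an independent site and only reuses the two pKa values quoted in the text (6.1 for the i-motif and ca. 4.8 for the unfolded chain). The function name is hypothetical.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of sites in the protonated form for an independent single-site model."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # pKa values quoted in the text; the independent-site treatment is an assumption.
    for ph in (4.0, 5.0, 6.1, 7.0):
        folded = protonated_fraction(ph, 6.1)
        unfolded = protonated_fraction(ph, 4.8)
        print(f"pH {ph}: i-motif {folded:.3f}, unfolded chain {unfolded:.3f}")
```

In this picture half of the sites are protonated when pH equals pKa, which matches the hemiprotonated state assumed for the acidic pH case, while at pH 7 a site with pKa 4.8 is almost entirely deprotonated.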


Rmsd values (root-mean-square displacements from the initial state) were determined over the whole simulation timescales, i.e. up to 50 ns. The applied rmsd definition assumes that the displacement is calculated after subtraction of the rotation and the center-of-mass motion, i.e. it represents the displacements at the best superposition of the initial and the current state. As seen in Fig. 3, the rmsds were determined separately for atoms forming the G-quadruplexes and i-motifs, as this facilitates analysis of the stability of these subsystems. For comparison, the rmsds for all atoms of the whole considered structures are presented in Fig. S1 in the electronic supplementary material file.

As shown in Fig. 3, the stability of the iG structure at acidic pH is high, as we do not see any qualitative changes of the initial structure after 40 ns of simulation time. Moreover, as seen in the rmsd profile, the i-motif and G-quadruplex reached their stable states already after 5-10 ns of the simulations and they keep their spatial structures intact for the rest of the time. The stability of iG should be weakened at neutral pH, but as seen in the snapshots the structure does not decompose within the relatively long time of 40 or 50 ns. The rmsds show that after 5-10 ns the G-quadruplex structure reaches its final state, and this state does not differ from that in the acidic pH case. On the other hand, the i-motif part at neutral pH shows larger rmsd values, but they no longer grow after reaching a plateau at ca. 10 ns. This means that the i-motif spatial structure is indeed weakened at neutral pH but it still seems to be stable. Recall that the siM structure at neutral pH totally unfolded to a hairpin within 40 ns.16 The liM version of the i-motif also seems to be unstable at neutral pH, and it slowly but spontaneously unfolds to the hairpin structure. The rmsd profile of the i-motif in this case reveals a growing tendency after 30 ns and quite large fluctuations within the studied 50 ns simulation time. Interestingly, within the first 30 ns the i-motif part of iG seems to be more distorted than in the case of liM, as the rmsd has a larger value. Probably, the presence of the complementary G-rich strand affects to some extent the spatial structure of the i-motif but at the same time it enhances its stability.

The above conclusions are also supported by the rmsd plots determined for the whole structures, i.e. with all the atoms included in the rmsd calculation. These plots are provided in the supplementary material in Fig. S1. We can see that after the first 10 ns the rmsds for all structures reach similar stationary values, but at the same time the resolution of these plots is much worse than for the G-quadruplex and i-motif analyzed individually. In particular, in the case of liM it is difficult to observe the instability of the i-motif which, in turn, was obvious in Fig. 3. Simply, the other parts of the structures, like the double-stranded terminal DNA parts (or single-stranded ones in the case of liM), mask the information coming from the analyzed secondary DNA structures. These parts of the systems also undergo quite intense structural changes (like bending), but these motions are not coupled with the unfolding of the i-motifs.
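The rmsd definition used above (displacement after removing overall translation and rotation) can be sketched with the Kabsch superposition algorithm. The paper does not say which implementation was used, so the following is an assumption-laden illustration: it relies on NumPy, uses the unweighted geometric center rather than the mass-weighted center of mass, and the function name is hypothetical.

```python
import numpy as np


def superposed_rmsd(reference, current) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    ref = np.asarray(reference, dtype=float)
    cur = np.asarray(current, dtype=float)

    # Remove translation (unweighted centroid; mass weighting is omitted here).
    ref_c = ref - ref.mean(axis=0)
    cur_c = cur - cur.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    covariance = cur_c.T @ ref_c
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    sign = 1.0 if np.linalg.det(u @ vt) >= 0.0 else -1.0
    rotation = u @ np.diag([1.0, 1.0, sign]) @ vt

    # Rotate the current frame onto the reference and measure the residual.
    diff = cur_c @ rotation - ref_c
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(20, 3))
    angle = 0.7
    rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                      [np.sin(angle), np.cos(angle), 0.0],
                      [0.0, 0.0, 1.0]])
    moved = ref @ rot_z.T + np.array([5.0, -3.0, 2.0])
    print(superposed_rmsd(ref, moved))
```

Applying the function only to the atoms of the i-motif or of the G-quadruplex, rather than to all atoms, corresponds to the per-subsystem analysis described in the text.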


Figure 3. Snapshots of the considered structures after 40 ns simulations at a constant temperature of 310 K and a pressure of 1 atm, and the rmsds of atoms forming the G-quadruplexes and i-motifs as functions of time. (A) iG structure at acidic pH, protonated i-motif; (B) iG at neutral pH, unprotonated i-motif; (C) liM at neutral pH, unprotonated i-motif. The G-quadruplexes (located in the lower parts) are colored blue while the i-motifs (located in the upper parts) are colored red.

The above conclusions, drawn from the simulation snapshots and rmsd plots shown in Fig. 3, are supported by the distances measured between some predefined pairs of bases. Table 1 shows these results together with the definitions of the base pairs between which hydrogen bonds are expected. The standard deviations determined for these distances are provided in the electronic supplementary material in Table S1. Thus, at acidic pH the distances between atoms forming hydrogen bonds in C-C+ pairs are below 3 Å (with a few exceptions), which definitely means that strong hydrogen bonds exist. The same happens within the G-quadruplex structure at acidic pH, though the distances are more diverse, ranging from 2.83 to 5.94 Å. Still, most of these distances (with a few exceptions above 4 Å) indicate strong hydrogen bonds acting between the given atoms and creating a stable spatial structure. However, at neutral pH the situation changes significantly: the distances between NN and NO atoms increase and formally all hydrogen bonds responsible for formation of the i-motif are cleaved. The i-motif is thus destroyed, though, as seen in Fig. 3, its shape is still preserved, at least qualitatively, within the available simulation time. The G-quadruplex, in contrast, remains almost intact and is still stabilized by strong hydrogen bonds within the guanine quartets, with, however, a few exceptions in the innermost guanine quartet G13-G21-G33-G25. It seems that the i-motif part of the iG structure at neutral pH is still preserved due to the existence of the complementary G-quadruplex in the same part of the telomeric DNA. The liM structure lacks that complementary guanine-rich strand, and therefore its i-motif slowly unfolds to the hairpin structure at neutral pH, as seen in Fig. 3. The siM also readily unfolded at neutral pH, as shown in ref. 16.

Obviously, the above conclusions, drawn from unbiased simulations, concern a very short period of real time, i.e. ca. 50 ns. On a longer timescale, reaching for example seconds, minutes or typical experimental times, other stable or metastable states may appear. However, such a timescale is not accessible to molecular dynamics simulations.

Table 1. Mean distances, determined from the last 10 ns of the simulation runs, between nitrogen-nitrogen, d(NN), and nitrogen-oxygen, d(NO), atoms belonging to a given pair or quartet of nitrogenous bases (second column), as defined and graphically displayed in Fig. 1.

Structure                Bases              d(NN), Å                  d(NO), Å

Acidic pH, protonated i-motif
i-motif, C+-C            C13-C25            3.50                      2.85, 4.21
                         C14-C26            2.88                      2.84, 2.88
                         C15-C27            2.87                      2.77, 2.93
                         C19-C31            2.91                      2.81, 3.04
                         C20-C32            2.87                      2.78, 2.92
                         C21-C33            2.90                      2.82, 2.99
G-quadruplex, G-G-G-G    G13-G21-G33-G25    3.03, 5.94, 3.02, 5.31    4.12, 5.26, 3.01, 4.04
                         G14-G20-G32-G26    3.05, 3.14, 3.19, 3.01    2.94, 2.86, 2.83, 2.91
                         G15-G19-G31-G27    3.12, 3.49, 4.18, 3.71    2.85, 3.86, 2.94, 3.23

Neutral pH, unprotonated i-motif
i-motif, C-C             C13-C25            6.01                      6.29, 6.32
                         C14-C26            4.94                      7.38, 2.96
                         C15-C27            3.79                      4.80, 6.33
                         C19-C31            8.45                      7.79, 9.37
                         C20-C32            14.64                     14.96, 15.22
                         C21-C33            3.45                      3.04, 4.40
G-quadruplex, G-G-G-G    G13-G21-G33-G25    5.01, 6.73, 3.01, 4.88    5.76, 5.78, 3.00, 3.74
                         G14-G20-G32-G26    3.06, 2.99, 3.15, 3.03    2.88, 2.91, 2.87, 2.88
                         G15-G19-G31-G27    3.09, 3.57, 4.25, 3.65    2.86, 3.96, 2.92, 3.22
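The way the distances in Table 1 are read as the presence or absence of hydrogen bonds can be made explicit with a simple distance criterion. The sketch below uses the i-motif entries of Table 1; the 3.2 Å cutoff on the heavy-atom distance is our illustrative assumption (the text only speaks of distances "below 3 Å" as strong bonds), and the variable and function names are hypothetical.

```python
CUTOFF = 3.2  # Å, illustrative donor-acceptor distance cutoff (an assumption)

# (d(NN), d(NO), d(NO)) in Å for the i-motif base pairs, copied from Table 1.
acidic = {
    "C13-C25": (3.50, 2.85, 4.21),
    "C14-C26": (2.88, 2.84, 2.88),
    "C15-C27": (2.87, 2.77, 2.93),
    "C19-C31": (2.91, 2.81, 3.04),
    "C20-C32": (2.87, 2.78, 2.92),
    "C21-C33": (2.90, 2.82, 2.99),
}
neutral = {
    "C13-C25": (6.01, 6.29, 6.32),
    "C14-C26": (4.94, 7.38, 2.96),
    "C15-C27": (3.79, 4.80, 6.33),
    "C19-C31": (8.45, 7.79, 9.37),
    "C20-C32": (14.64, 14.96, 15.22),
    "C21-C33": (3.45, 3.04, 4.40),
}


def count_contacts(pairs: dict, cutoff: float = CUTOFF) -> int:
    """Count individual mean distances that fall below the cutoff."""
    return sum(d < cutoff for distances in pairs.values() for d in distances)


if __name__ == "__main__":
    total = sum(len(v) for v in acidic.values())
    print("acidic :", count_contacts(acidic), "of", total)
    print("neutral:", count_contacts(neutral), "of", total)
```

With this cutoff most of the 18 tabulated i-motif distances qualify as hydrogen-bond contacts at acidic pH, while only isolated ones do at neutral pH; a different cutoff would shift the counts slightly but not the contrast.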

3.2 Thermal agitation of i-motif containing DNA structures

The iG spatial structure seems to be preserved at neutral pH, though formally its i-motif part should unfold. Thus, in order to assess its stability and its ability to undergo structural transformations, we performed prolonged calculations at elevated temperatures. Fig. 4 shows the results of those calculations for the neutral pH case. As qualitatively seen in the simulation snapshots, a 40 ns run at the high temperature of 400 K did not lead to clearly visible destruction of the iG spatial structure. A similar conclusion can be drawn from the plot of the rmsd of all atoms, which is shown in the supplementary material file, Fig. S2. Also, the rmsd of the G-quadruplex part of that structure reached a stationary value already after 10 ns, and this value is preserved until the end of the run. However, the absolute value of the rmsd is larger than the corresponding one from Fig. 3, indicating that at 400 K the G-quadruplex has deteriorated to some extent. This conclusion is supported by the data from Table 2 showing the distances between atoms forming the hydrogen bonds. Clearly, at 400 K the innermost G-quartet has been weakened, and the hydrogen bonds keeping its structure together have been lost. The rmsd of the i-motif part of iG at 400 K indicates that this structure is deteriorating continuously: from ca. 5 Å at the beginning it has grown to ca. 7.5 Å after 50 ns of the simulation. Also, as seen in Table 2, all hydrogen bonds have been lost and, statistically, more of the distances between atoms forming those bonds are larger than their counterparts from Table 1. At 500 K the iG structure underwent significant deformation, and the rmsds of all atoms, of the i-motif and of the G-quadruplex taken individually reveal significant fluctuations and reach higher values. Clearly, 500 K is enough to destroy the iG structure. The liM structures were also destroyed at high temperatures, but in this case 400 K is sufficient to initiate significant deformations and strong fluctuations of the rmsd. More quantitative conclusions can be drawn from the distances in Table 2. The standard deviations for the distances from Table 2 are provided in the supplementary material in Table S2.


Figure 4. Rmsd plots and snapshots of i-motif containing structures taken after 40 ns of simulation at elevated constant temperature and constant volume. (A) iG structure at neutral pH and 400 K; (B) iG structure at neutral pH and 500 K; (C) liM structure at neutral pH and 400 K; (D) liM structure at neutral pH and 500 K.

Analysis of the existence (or not) of hydrogen bonds between NN and NO atoms leads to the conclusion that the i-motif parts of iG at neutral pH are not held together by hydrogen bonds either at the normal temperature of 310 K or at the high temperatures of 400-500 K. The distances between given pairs of atoms are only slightly larger at 400 K than they were at 310 K. Application of the extreme temperature of 500 K leads to a significant increase of those distances, and obviously the specific symmetry of the i-motif is lost, but visually we cannot notice unfolding to a hairpin as happened in the case of siM.16 The G-quadruplex part of iG is also very resistant to temperature; at 400 K most of the hydrogen bonds are preserved, and only the G13-G21-G33-G25 quartet reveals significant deformation. The same part of the G-quadruplex is affected by the extreme temperature of 500 K, while the other two quartets are still highly temperature resistant. Contrary to the iG behavior, the liM structure unfolds spontaneously at elevated temperatures, similarly to siM analyzed in our previous work.16 This is directly seen in the simulation snapshots, but the rmsd plots also reveal continuous deterioration of the i-motif parts of the liM structures.

Table 2. Temperature dependence of the mean distances between nitrogen-nitrogen atoms, d(NN), and nitrogen-oxygen atoms, d(NO), determined from the last 10 ns of the simulation runs. The distances belong to a given pair or quartet of nitrogenous bases (second column) and are defined and graphically displayed in Fig. 1.

Structure                Bases              d(NN), Å                      d(NO), Å

Temperature 400 K
i-motif, C-C             C13-C25            4.02                          4.07, 5.18
                         C14-C26            8.37                          7.62, 9.41
                         C15-C27            19.28                         17.50, 21.29
                         C19-C31            5.29                          3.62, 8.28
                         C20-C32            6.70                          6.69, 7.62
                         C21-C33            4.43                          6.63, 3.34
G-quadruplex, G-G-G-G    G13-G21-G33-G25    8.13, 6.64, 14.00, 13.50      9.15, 7.33, 13.69, 14.31
                         G14-G20-G32-G26    3.03, 3.03, 3.00, 2.90        2.89, 2.92, 3.07, 3.13
                         G15-G19-G31-G27    2.95, 2.93, 2.98, 3.00        2.99, 2.94, 2.96, 2.95

Temperature 500 K
i-motif, C-C             C13-C25            21.19                         24.44, 18.28
                         C14-C26            19.78                         21.37, 18.32
                         C15-C27            5.57                          7.45, 5.65
                         C19-C31            14.79                         15.25, 15.26
                         C20-C32            27.51                         27.18, 28.08
                         C21-C33            23.99                         21.95, 26.32
G-quadruplex, G-G-G-G    G13-G21-G33-G25    18.14, 7.50, 7.11, 7.06       15.04, 9.94, 7.23, 5.72
                         G14-G20-G32-G26    4.55, 3.24, 3.09, 3.70        4.42, 3.03, 3.84, 3.02
                         G15-G19-G31-G27    5.76, 3.59, 4.76, 2.98        4.67, 3.51, 4.19, 3.16
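The way Table 2 is read in the text, with distances close to 3 Å indicating an intact hydrogen bond, can be sketched as a simple cutoff test. The 3.5 Å cutoff and the names `HBOND_CUTOFF`, `count_intact` and `summarize` are illustrative assumptions, not values or procedures taken from the paper; the distances are the 400 K G-quartet entries of Table 2.

```python
# Illustrative donor-acceptor distance cutoff in angstroms.
# This value is an assumption made for the sketch, not taken from the paper.
HBOND_CUTOFF = 3.5

# Mean distances (angstroms) for the G-quartets at 400 K, copied from Table 2.
QUARTETS_400K = {
    "G13-G21-G33-G25": {"NN": [8.13, 6.64, 14.00, 13.50],
                        "NO": [9.15, 7.33, 13.69, 14.31]},
    "G14-G20-G32-G26": {"NN": [3.03, 3.03, 3.00, 2.90],
                        "NO": [2.89, 2.92, 3.07, 3.13]},
    "G15-G19-G31-G27": {"NN": [2.95, 2.93, 2.98, 3.00],
                        "NO": [2.99, 2.94, 2.96, 2.95]},
}


def count_intact(distances, cutoff=HBOND_CUTOFF):
    """Number of distances that fall within the hydrogen-bond cutoff."""
    return sum(d <= cutoff for d in distances)


def summarize(quartets, cutoff=HBOND_CUTOFF):
    """Map each quartet to its number of intact N-N plus N-O contacts."""
    return {
        name: count_intact(d["NN"], cutoff) + count_intact(d["NO"], cutoff)
        for name, d in quartets.items()
    }


if __name__ == "__main__":
    for name, n_intact in summarize(QUARTETS_400K).items():
        print(f"{name}: {n_intact} of 8 contacts within {HBOND_CUTOFF} Å")
```

With this cutoff, only the G13-G21-G33-G25 quartet loses its contacts at 400 K, in line with the discussion above. The count depends on the chosen cutoff, so it should be treated as a reading aid rather than a quantitative result.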

The presented analysis of iG behavior at elevated temperatures leads to quite unexpected conclusions. Namely, it seems that formation of iG can be very difficult to reverse, which may have important implications in vivo. Obviously, conclusions drawn from such very short molecular dynamics runs are not directly transferable to in vivo situations; however, we can expect that an i-motif and a G-quadruplex appearing in the same place of the double stranded DNA might represent a structure which is practically irreversible under normal conditions. Perhaps a specific enzymatic action would be effective, as in the unfolding of isolated G-quadruplexes, but this iG structure definitely represents a particularly stable state. Of course, neutral pH and high temperature lead to destruction of the i-motif symmetry, but they are not enough for full unfolding and recovery of the WC duplex. On the other hand, it is unlikely that the double stranded DNA structure can be recovered within 40 ns; therefore free energy analysis is necessary in order to assess the likelihood of the unfolding transition. But we can already state that the main reason for that behavior is the presence of the complementary G-quadruplex together with the terminal G-rich fragments. Fig. 4C shows that extension of the i-motif by addition of extra C-rich single strands to its ends (i.e. formation of the liM structure) does not enhance the stability to a large extent. The main source of the high stability of iG is the presence of the G-quadruplex, which is stable at both neutral and acidic pH. Destabilization of the G-quadruplex could be achieved by reducing the salt concentration or replacing the salt with LiCl, for instance,26,27 but these methods are not applicable in vivo.

3.3 Unfolding the i-motif containing DNA structures in biased dynamics

The observed strong stabilization of the i-motif by the G-quadruplex, formed in the same place of the complementary strand, needs more detailed analysis. Thus, we performed dedicated calculations aimed at determining the free energy barriers accompanying the unfolding of the i-motif within the iG structure. To that purpose we applied the metadynamics35,36 bias scheme to the i-motifs in the studied iG structures. Metadynamics is based on inserting into the system Hamiltonian an extra potential energy term, usually a Gaussian hill with a given height and width. The hills are inserted at the current position of the system in the space of suitably chosen collective variables (these positions are usually the local potential energy minima), and the insertion is repeated many times during a run. Eventually, all local minima are filled and the energy landscape becomes almost flat. According to the foundations of metadynamics,35 the potential of mean force (free energy) can be recovered as the negative of the sum of the Gaussian hills and plotted as a function of the collective variables which define the positions of the inserted hills. The hills were added every 2 ps, their widths were 2.5 Å and their heights were 0.25 kJ mol⁻¹ (neutral pH) or 0.5 kJ mol⁻¹ (acidic pH). The calculations involved two collective variables: one was the distance between the C13 and T34 bases (their centers of mass), called rC13-T34, while the second was the distance between the center of mass of C13 and T34 taken together and the A23 base, called rC13/T34-A23. This definition of the collective variables is the same as in the papers by Smiatek et al.17-19 and it allows tracking the unfolding of the i-motif to a hairpin (rC13/T34-A23 increases) and to a random coil (both variables increase). Of course, the choice of the collective variables determines to some extent the enforced transition pathway. Our definition of the collective variables assumes an intuitively most probable pathway, i.e. one going through the hairpin. This does not mean that the hairpin is always an intermediate state; other intermediates are possible, but the choice of collective variables obviously affects their appearance to some extent.
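The two collective variables and the recovery of the pmf from the deposited hills can be sketched as follows. This is a minimal illustration, not the simulation setup used in this work: the hills are taken as isotropic 2D Gaussians whose width parameter plays the role of a standard deviation (an assumption about the functional form), and all function names are hypothetical.

```python
import numpy as np


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()


def collective_variables(c13, t34, a23):
    """Return (r_C13-T34, r_C13/T34-A23).

    Each argument is a (coords, masses) pair for the atoms of one base.
    """
    com_c13 = center_of_mass(*c13)
    com_t34 = center_of_mass(*t34)
    com_a23 = center_of_mass(*a23)
    # Center of mass of C13 and T34 taken together as one group.
    pair_xyz = np.vstack([c13[0], t34[0]])
    pair_m = np.concatenate([c13[1], t34[1]])
    com_pair = center_of_mass(pair_xyz, pair_m)
    r1 = float(np.linalg.norm(com_c13 - com_t34))
    r2 = float(np.linalg.norm(com_pair - com_a23))
    return r1, r2


def bias(s, hill_centers, height, width):
    """Sum of Gaussian hills evaluated at the point s = (s1, s2)."""
    centers = np.asarray(hill_centers, dtype=float).reshape(-1, 2)
    d2 = ((centers - np.asarray(s, dtype=float)) ** 2).sum(axis=1)
    return float(height * np.exp(-d2 / (2.0 * width ** 2)).sum())


def pmf_estimate(s, hill_centers, height, width):
    """Free energy estimate: the negative of the accumulated bias."""
    return -bias(s, hill_centers, height, width)
```

In a run of the kind described above, a new entry would be appended to `hill_centers` every 2 ps at the current values of the two collective variables, with `height=0.25` (kJ mol⁻¹, neutral pH) and `width=2.5` (Å).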

The metadynamics runs were continued for 60 ns in each case, and the results are plotted as 2D contour maps in Figs. 5 and 6 for the iG structures at neutral and acidic pH, respectively. Because the iG structures are relatively large, the computations involving them are quite time consuming, and the 60 ns runs represent the upper limit of our computational resources. Thus, in order to confirm the convergence of the metadynamics within the applied 60 ns simulation time, we checked how the determined potentials of mean force (pmf) evolve within the last 20 ns of the computations. These results are shown in Fig. S3 and Fig. S4 in the electronic supplementary information file. Looking at these figures, we can notice that between the 40 ns and 50 ns timepoints the pmf landscapes still change significantly. However, between 50 ns and 60 ns the landscapes remain almost unchanged. In particular, the depths of the energy wells preserve their values. Moreover, the time evolution of the analyzed collective variables rC13-T34 and rC13/T34-A23, presented in Fig. S5, confirms that both variables begin to fluctuate strongly for times larger than 40 ns. Additionally, we monitored how the difference between pmf landscapes separated by a 1 ns interval evolves in time. These differences were of the order of the applied Gaussian hill (or zero in the currently unprobed colvar subspace) once the calculation time exceeded 50 ns. All these pieces of information allow us to assume that the pmf landscapes obtained at 60 ns represent converged results.
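The last convergence criterion, the change between pmf landscapes separated by 1 ns compared with the hill height, can be sketched as below. The tolerance `factor` and the function names are assumptions made for the illustration; the text only states that the differences were of the order of the hill height.

```python
import numpy as np


def max_change(pmf_a, pmf_b):
    """Largest absolute pointwise difference between two pmf grids."""
    a = np.asarray(pmf_a, dtype=float)
    b = np.asarray(pmf_b, dtype=float)
    return float(np.max(np.abs(b - a)))


def converged_from(snapshots, hill_height, factor=2.0):
    """Index of the first interval from which all later changes stay small.

    snapshots: pmf grids saved at equal time intervals (e.g. every 1 ns).
    A change counts as small if it does not exceed factor * hill_height;
    the factor is a hypothetical tolerance. Returns None if the condition
    is never met.
    """
    changes = [max_change(a, b) for a, b in zip(snapshots[:-1], snapshots[1:])]
    for i in range(len(changes)):
        if all(c <= factor * hill_height for c in changes[i:]):
            return i
    return None
```

For landscapes saved every 1 ns, a returned index `i` means that all changes from the interval between snapshots `i` and `i + 1` onward stay within the tolerance.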

Figure 5. Contour maps of the potential of mean force (pmf) accompanying the biased unfolding of the i-motif within the iG structure at neutral pH, i.e. with the unprotonated i-motif. The collective variables rC13-T34 and rC13/T34-A23 are defined as the distances between the centers of mass of the C13 and T34 bases (see Fig. 1) and between C13 and T34 taken together and A23, respectively. State (A) is the initial configuration of iG as shown in Fig. 3, and it is the reference point (zero) of the pmf map. State (B) is the lowest energy configuration found, while states (C) and (D) are unfolded states close to the hairpin (C) or the random coil (D) configuration of the i-motif.

As seen in Fig. 5, the initial structure of iG obtained in unbiased calculations at neutral pH (A) is not the lowest energy one, though its transition to the most stable structure (B) needs only a slight increase of the distance rC13-T34. The explanation of this effect is as follows: triggering conformational changes within the i-motif leads to some adjustment of the conformation of the whole structure, comprising the double stranded DNA and the G-quadruplex. As a result, the free energy minimum corresponding to the undeformed i-motif is slightly shifted by extra forces coming from the conformational changes of the other parts of the structure. The lowest energy state, corresponding to a slightly deformed i-motif, then appears. In that state the system is trapped in a quite deep energy well reaching -60 kJ mol⁻¹ relative to the initial state. Thus, a significant energy barrier appears, and the probability of surmounting it spontaneously is quite low, though the transition is not totally blocked. However, transitions between (A) and (B) are not particularly interesting; more important is the likelihood of spontaneous unfolding of the i-motif, or of the whole structure, to the WC duplex. It is well known that at neutral pH the i-motif spontaneously disappears and switches to a hairpin or random coil.1,9,10,17 This was observed in isolated single strands, but here we can see that the situation is more complex. Transitions from (B) to (C) (hairpin) or to (D) (random coil) require crossing energy barriers of ca. 65 - 70 kJ mol⁻¹, and it is very unlikely that such a transition can occur spontaneously due to thermal agitation only. The reverse transition from (D) to (B), in turn, involves a much lower energy barrier of ca. 12 kJ mol⁻¹, so the state with the i-motif unfolded to the random coil is rather unlikely (it represents a much less favorable free energy level). The reverse transition from state (C) to (B) needs about 60 kJ mol⁻¹; thus the hairpin state can be considered a second relatively stable state, because the free energy reveals quite a deep minimum (-40 kJ mol⁻¹) for this configuration. However, none of the states generated by the enforced scanning of both distances resembled a structure with both i-motif and G-quadruplex fully unfolded and approaching the WC duplex structure. Moreover, the G-quadruplex part of the system remained intact during the enforced unfolding of the i-motif. Thus, we can conclude that increasing the pH of the solution, which normally leads to unfolding of the i-motif, cannot destroy the iG structure. Perhaps the simultaneous action of two factors, i.e. removal of Na+ ions and an increase of pH, would be necessary for total unfolding of the iG structure, but this would be an unusual situation in vivo.
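The statement that barriers of 65 - 70 kJ mol⁻¹ are hardly crossed by thermal agitation, while a 12 kJ mol⁻¹ barrier is, can be made concrete with the Boltzmann factor exp(-ΔG/RT). The sketch below assumes the normal simulation temperature of 310 K and gives only the relative statistical weight of reaching the barrier top; it is not a rate, because no kinetic prefactor is included. The function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_factor(barrier_kj_mol, temperature_k=310.0):
    """Relative weight exp(-dG / RT) of reaching a barrier of the given height."""
    return math.exp(-barrier_kj_mol / (R_KJ * temperature_k))


if __name__ == "__main__":
    cases = [
        ("(B) -> (C)/(D), lower estimate", 65.0),
        ("(C) -> (B)", 60.0),
        ("(D) -> (B)", 12.0),
    ]
    for label, barrier in cases:
        print(f"{label}: exp(-dG/RT) = {boltzmann_factor(barrier):.2e}")
```

At 310 K, RT is about 2.58 kJ mol⁻¹, so the 65 kJ mol⁻¹ barrier corresponds to a factor of order 10⁻¹¹, whereas the 12 kJ mol⁻¹ barrier corresponds to a factor of order 10⁻².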

Figure 6. Contour maps of the potential of mean force (pmf) accompanying the biased unfolding of the i-motif within the iG structure at acidic pH, i.e. with the protonated i-motif. The meaning of all symbols is the same as in Fig. 5.

Fig. 6 shows a similar pmf map for the iG system at acidic pH. Again, the starting configuration was not the minimum energy one, and the system needs a small adjustment of the collective variables to reach that state. The explanation of this effect is the same as in the case of Fig. 5. However, that state represents a very deep free energy well; the difference between the starting configuration (A) and the minimum energy configuration (B) reaches 150 kJ mol⁻¹. Thus, the iG structure at acidic pH is extremely stable, and this is obviously due to the formation of 6 strong hydrogen bonds within the i-motif part of iG. Conversion of the i-motif to the hairpin (C) or random coil (D) structures is practically impossible at acidic pH, as the barriers against these transitions are well above 150 kJ mol⁻¹. Of course, the reverse transitions from (C) or (D) to (B) are not blocked by any energy barriers; thus, even if the hairpin configuration appears (assuming initially neutral pH followed by a quick change to acidic pH), it spontaneously converts to the i-motif state.

The above analysis leads to the conclusion that the stability of the iG structure is high at both neutral and acidic pH. The i-motif is not able to unfold due to the stabilizing effect of the G-quadruplex located in its proximity. The stability of the G-quadruplex seems to be very high, and the hydrogen bonds within the G-quadruplex structure do not break during the biased calculations leading to strong deformations of the i-motif part of iG. The cooperative unfolding of both i-motif and G-quadruplex is another interesting process, but it would involve the simultaneous action of two factors: an increase of pH and a reduction of salt concentration. Analysis of such a process, however, represents a very demanding task from the computational point of view.

3.4 Steered formation of i-motif from WC duplex

The role of the neighborhood in the stabilization of the i-motif was found to be crucial. Thus, it is interesting to check how the enforced formation of the i-motif affects its neighborhood, particularly whether the G-quadruplex forms spontaneously in such circumstances. For that purpose we need an initial structure, which should simply be the WC duplex. In the next step, we intended to apply moving springs attached to a few points (the centers of mass of the A23, C13 and T34 bases) within the C-rich strand of the duplex, with the trajectories of those springs programmed in such a way that at some fixed point in time they form the spatial template of the i-motif. However, we quickly found that direct formation of the i-motif from the WC duplex according to that scheme is impossible, because such a process leads to the formation of knots within the duplex. Therefore, we must assume that the WC duplex has to be unwrapped prior to the formation of any secondary DNA structure. The unwrapping of the duplex during DNA replication is normally carried out by specialized enzymes (helicases). It seems that formation of secondary structures also needs some catalytic action of enzymes, because the thermodynamic stability of the duplex is so high that its instantaneous and spontaneous unwrapping, in order to form e.g. an i-motif, is impossible. Therefore, the initial structure from which we formed the i-motif was prepared by imposing 4 spring forces on the duplex in a separate calculation. These forces were imposed on the T34 and A12 nucleotides in both strands, and as a result we obtained the unwrapped structure shown in Fig. 7A.

Fig. 7 shows the unwrapped WC duplex (A), which is the starting point of further analysis, and the same structure after 40 ns of unbiased calculations (B). This part of the study was aimed at checking whether the unwrapped structure (Fig. 7A) can spontaneously recover a spatial structure resembling the WC duplex. As we can see in Fig. 7B, it cannot; the obtained structure is rather a random coil with only a weak tendency to form a double helix. Moreover, the rmsd of all atoms suggests that during 40 ns of simulation the structure has not reached equilibrium or a stationary state. It would simply evolve further, and in unbiased simulations we cannot reach its target configuration. We can thus conclude that the reconstruction of the WC duplex is kinetically blocked by severe entropic limitations, i.e. the process needs many configurational changes (not involving significant energetic effects) in order to finally reach its target state. Also, it cannot be excluded that the transition involves distinct intermediates which are separated by significant energetic barriers. Similar kinetic partitioning effects are commonly observed in the dynamics of G-quadruplex folding.25,42 Simply put, the trajectory leading to creation of the duplex is very complex, contains many local free energy minima and needs a lot of time to be completed.

Figure 7. (A) The unwrapped WC duplex obtained by applying external spring forces. (B) The relaxed structure of (A) after 40 ns of simulations without any bias and its rmsd evolution in time. The results concern the structure with the protonated cytosines, i.e. acidic pH.

Thus, the essential part of this fragment of the study, that is the analysis of the steered formation of the i-motif, was initiated from the unwrapped WC duplex from Fig. 7A. For that purpose, moving springs (with linear trajectories) were attached to the cytosines (from C13 to C33), and the trajectories were programmed in such a way that after 2 ns they ended up in a spatial template of the i-motif. The G-quadruplex part of the unfolded WC duplex was left without any external forces and moved freely during the enforced formation of the i-motif. These calculations were performed for both neutral and acidic pH (normal and protonated cytosines in the C-rich strand), and after the formation of the i-motifs was finished (2 ns) the spring forces were removed. Next, the unbiased calculations were continued for 50 ns, and the final configurations are shown in Fig. 8.
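The moving springs with linear trajectories can be sketched as a harmonic restraint whose anchor point travels on a straight line from the current position of a group to its position in the i-motif template. This is a minimal illustration under stated assumptions: the spring constant `k`, the harmonic form and the function names are hypothetical, and only the 2 ns duration comes from the text.

```python
import numpy as np


def anchor_position(t, start, target, t_end):
    """Spring anchor moving linearly from start to target during [0, t_end].

    The anchor stays at the target for t >= t_end.
    """
    frac = min(max(t / t_end, 0.0), 1.0)
    start = np.asarray(start, dtype=float)
    target = np.asarray(target, dtype=float)
    return (1.0 - frac) * start + frac * target


def spring_force(position, t, start, target, t_end, k):
    """Harmonic force -k (x - x_anchor(t)) pulling a group toward the anchor.

    k is a hypothetical spring constant; its value is not given in the text.
    """
    position = np.asarray(position, dtype=float)
    return -k * (position - anchor_position(t, start, target, t_end))
```

In the protocol described above, such a force would act only during the first 2 ns (`t_end = 2.0` with time in ns) and would then be switched off for the unbiased continuation.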

Figure 8. Snapshots after 40 ns of simulations and the rmsd plots of the G-quadruplex and i-motif forming segments after steered formation of i-motifs at (A) acidic and (B) neutral pH. The reference states for the rmsd calculation were the unfolded WC duplex structures from Fig. 7A.

As can be seen, the i-motif formed at acidic pH (Fig. 8A) preserves its spatial structure during the subsequent, quite long calculations. But the G-rich strand was not able to fold spontaneously into the G-quadruplex within that time. It formed a poorly ordered structure without any tendency to create guanine quartets, though the structure seems to be quite compact. This is confirmed in Table 3, where distances between NN and NO pairs are shown as in Tables 1 and 2. The standard deviations for the distances from Table 3 are provided in the supplementary material, Table S3. The distances within the G-rich strand are very long, which shows that the structure in Fig. 8A is very far from the G-quadruplex, in which these distances are close to 3 Å (see Table 1 or 2). The i-motif spatial structure is, roughly speaking, preserved at acidic pH, though the distances from Table 3 suggest that its stability is significantly reduced compared to the case when the i-motif and G-quadruplex exist in the same place of the duplex. We can see that the distances in the C13-C25 and C15-C27 pairs are too long to be classified as hydrogen bonds. Similarly, the distances in the C21-C33 pair are much longer than 3 Å, which means that the i-motif is in this case significantly loosened. Thus, the lack of the G-quadruplex leads to significant destabilization of the i-motif at acidic pH.

In the case of neutral pH (Fig. 8B) the i-motif formed by steered dynamics turns out to be unstable. The final structure is far from the i-motif symmetry, and the distances between NN and NO pairs (Table 3) confirm that the i-motif symmetry has been lost. These distances are even longer than those from Table 2, which corresponded to thermal agitation of the iG structures. Clearly, the i-motif without the complementary G-quadruplex is unstable at neutral pH. However, full unfolding to the random coil or hairpin structures has not happened, whereas the G-rich strand unfolded almost completely to the random coil.

The rmsd plots for both protonated and unprotonated i-motifs shown in Fig. 8A and B indicate that steered formation of i-motifs brought the whole structures to some intermediate states with lifetimes comparable to or longer than 50 ns. The first 2 ns of these plots correspond to the stage of enforced formation of i-motifs from the fully unfolded states, so these parts are not particularly interesting. The remainder of the rmsd curves, for times greater than 2 ns, shows their evolution without any bias applied, and their analysis supports the above conclusion. Namely, the protonated i-motif part shows fairly static values of the rmsd, meaning that this subsystem reached an intermediate state. The unprotonated i-motif reveals a small decreasing tendency, meaning that the structure decays toward the reference state, which is the fully unfolded WC duplex from Fig. 7A. The G-quadruplex forming parts behave differently in the two cases, meaning that the state of the i-motif part influences the folding trajectory. The protonated i-motif led the G-quadruplex forming part to an intermediate structure with a lifetime greater than 50 ns. However, the unprotonated i-motif makes the G-quadruplex forming part unstable, because the rmsd is decreasing, which means that it is approaching the unfolded reference state. The rmsds of all atoms within the structures are plotted in Fig. S6 in the supplementary material file. Analysis of these plots supports the conclusion that the protonation state of the i-motif affects to some extent the folding transition of the whole structure. The unprotonated i-motif leads to a structure which is closer to the reference state, but both rmsds only fluctuate around their mean values, suggesting the existence of intermediate structures.

Table 3. Mean distances, determined from the last 10 ns of the simulation runs, between nitrogen-nitrogen atoms, d(NN), and nitrogen-oxygen atoms, d(NO), belonging to a given pair or quartet of nitrogenous bases, defined and graphically displayed in Fig. 1, for the structures shown in Fig. 8.

Structure                Bases              d(NN), Å                      d(NO), Å

Fig. 8A – acidic pH, protonated i-motif
i-motif, C+-C            C13-C25            5.84                          6.06, 6.18
                         C14-C26            2.89                          2.84, 2.98
                         C15-C27            7.15                          9.43, 5.72
                         C19-C31            2.92                          2.85, 3.01
                         C20-C32            3.02                          2.81, 3.78
                         C21-C33            10.59                         11.97, 10.03
G-quadruplex, G-G-G-G    G13-G21-G33-G25    25.23, 25.26, 30.54, 28.69    27.93, 25.96, 29.86, 30.03
                         G14-G20-G32-G26    33.35, 13.22, 26.97, 18.82    29.35, 16.57, 28.26, 21.02
                         G15-G19-G31-G27    20.52, 18.83, 25.53, 22.64    22.39, 16.61, 25.92, 24.11

Fig. 8B – neutral pH, unprotonated i-motif
i-motif, C-C             C13-C25            5.20                          7.34, 4.24
                         C14-C26            9.30                          10.75, 8.52
                         C15-C27            13.71                         12.59, 15.42
                         C19-C31            10.46                         11.14, 9.89
                         C20-C32            17.42                         18.92, 16.26
                         C21-C33            15.07                         15.78, 14.75
G-quadruplex, G-G-G-G    G13-G21-G33-G25    17.91, 30.05, 16.31, 26.71    16.61, 25.31, 15.47, 23.48
                         G14-G20-G32-G26    26.72, 19.10, 32.07, 16.71    25.72, 17.77, 31.00, 14.30
                         G15-G19-G31-G27    35.34, 8.62, 36.54, 13.21     31.68, 10.29, 36.99, 14.10

Generally, the lack of any tendency toward G-quadruplex formation within the G-rich part suggests that such a process is kinetically blocked and that much more time is needed to recover the G-quadruplex symmetry. It can be cautiously stated that the presence of the protonated i-motif (acidic pH) facilitates G-quadruplex formation and that neutral pH hinders it to some small extent. But we can definitely state that in ex vivo cases both i-motif and G-quadruplex can exist in the same place of the duplex and there are no steric hindrances which can prevent the existence of the iG structure.

4. Summary and Conclusions

The presented analysis of secondary DNA structures which may appear within the telomeric region leads to several important conclusions. Namely, it was found that the i-motif (siM) structure is highly stable at acidic pH, when hydrogen bonds appear between semiprotonated cytosines. It loses stability, however, when the pH is increased and the cytosines become deprotonated. Then the i-motif spontaneously unfolds to hairpin or random coil structures. The stability of the i-motif is not significantly affected by the presence of C-rich single strands attached to its terminal parts, forming the liM structure. However, formation of the iG structure, that is, the presence of a G-quadruplex within the complementary G-rich strand, enhances the stability of the i-motif at both acidic and neutral pH. In particular, at neutral pH we observe another stable secondary structure within the C-rich strand, namely the hairpin. We can thus cautiously conclude that the iG structure, with the i-motif and G-quadruplex formed in the same place of the duplex, needs a specific action of enzymatic machinery for its unfolding; otherwise it could not be a biologically relevant case. This is because such a structure would be practically irreversible under normal conditions.

On the other hand, the lack of a G-quadruplex within the complementary strand destabilizes the i-motif significantly. Its symmetry is strongly distorted at neutral pH, and at acidic pH three hydrogen bond pairs also disappear, contrary to the other studied cases (especially thermal agitation). However, the G-rich strand has not revealed a tendency to G-quadruplex formation within the applied simulation time. Thus, we can conclude that either the time for G-quadruplex formation must be much longer, or the formation of the G-quadruplex (as complementary to the i-motif) is not necessary and the i-motif can exist alone within the telomeric DNA duplex.

Supporting Information

Additional results concerning: (i) standard deviations for numbers in Tables 1-3, (ii) RMSD and (iii) potential of mean force plots.

Acknowledgments

This work was supported by Polish National Science Centre grant 2017/27/B/ST4/00108. TP additionally acknowledges partial financial support from the statutory fund of ICSC.

References

(1) Phan, A. T.; Mergny, J.-L. Human Telomeric DNA: G-Quadruplex, i-Motif and Watson–Crick Double Helix. Nucleic Acids Res. 2002, 30, 4618–4625.
(2) Shammas, M. A. Telomeres, Lifestyle, Cancer, and Aging. Curr. Opin. Clin. Nutr. Metab. Care 2011, 14, 28–34.
(3) Day, H. A.; Pavlou, P.; Waller, Z. A. E. I-Motif DNA: Structure, Stability and Targeting with Ligands. Bioorg. Med. Chem. 2014, 22, 4407–4418.
(4) Russo Krauss, I.; Ramaswamy, S.; Neidle, S.; Haider, S.; Parkinson, G. N. Structural Insights into the Quadruplex–Duplex 3′ Interface Formed from a Telomeric Repeat: A Potential Molecular Target. J. Am. Chem. Soc. 2016, 138, 1226–1233.
(5) Sen, D.; Gilbert, W. Formation of Parallel Four-Stranded Complexes by Guanine-Rich Motifs in DNA and Its Implications for Meiosis. Nature 1988, 334, 364–366.
(6) Chen, Y.; Qu, K.; Zhao, C.; Wu, L.; Ren, J.; Wang, J.; Qu, X. Insights into the Biomedical Effects of Carboxylated Single-Wall Carbon Nanotubes on Telomerase and Telomeres. Nat. Commun. 2012, 3, 1074.
(7) Gehring, K.; Leroy, J.-L.; Guéron, M. A Tetrameric DNA Structure with Protonated Cytosine-Cytosine Base Pairs. Nature 1993, 363, 561–565.
(8) Debnath, M.; Ghosh, S.; Chauhan, A.; Paul, R.; Bhattacharyya, K.; Dash, J. Preferential Targeting of I-Motifs and G-Quadruplexes by Small Molecules. Chem. Sci. 2017, 8, 7448–7456.
(9) Kim, B. G.; Chalikian, T. V. Thermodynamic Linkage Analysis of pH-Induced Folding and Unfolding Transitions of i-Motifs. Biophys. Chem. 2016, 216, 19–22.
(10) Choi, J.; Kim, S.; Tachikawa, T.; Fujitsuka, M.; Majima, T. pH-Induced Intramolecular Folding Dynamics of i-Motif DNA. J. Am. Chem. Soc. 2011, 133, 16146–16153.
(11) Ren, W.; Zheng, K.; Liao, C.; Yang, J.; Zhao, J. Charge Evolution during the Unfolding of a Single DNA I-Motif. Phys. Chem. Chem. Phys. 2018, 20, 916–924.
(12) Peng, Y.; Wang, X.; Xiao, Y.; Feng, L.; Zhao, C.; Ren, J.; Qu, X. I-Motif Quadruplex DNA-Based Biosensor for Distinguishing Single- and Multiwalled Carbon Nanotubes. J. Am. Chem. Soc. 2009, 131, 13813–13818.
(13) Balasubramanian, S.; Neidle, S. G-Quadruplex Nucleic Acids as Therapeutic Targets. Curr. Opin. Chem. Biol. 2009, 13, 345–353.
(14) Bergues-Pupo, A. E.; Gutiérrez, I.; Arias-Gonzalez, J. R.; Falo, F.; Fiasconaro, A. Mesoscopic Model for DNA G-Quadruplex Unfolding. Sci. Rep. 2017, 11756.
(15) Burge, S.; Parkinson, G. N.; Hazel, P.; Todd, A. K.; Neidle, S. Quadruplex DNA: Sequence, Topology and Structure. Nucleic Acids Res. 2006, 34, 5402–5415.
(16) Panczyk, T.; Wolski, P. Molecular Dynamics Analysis of Stabilities of the Telomeric Watson-Crick Duplex and the Associated i-Motif as a Function of pH and Temperature. Biophys. Chem. 2018, 237, 22–30.
(17) Smiatek, J.; Chen, C.; Liu, D.; Heuer, A. Stable Conformations of a Single Stranded Deprotonated DNA I-Motif. J. Phys. Chem. B 2011, 115, 13788–13795.
(18) Smiatek, J.; Heuer, A. Deprotonation Mechanism of a Single-Stranded DNA i-Motif. RSC Adv. 2014, 4, 17110–17113.
(19) Smiatek, J.; Janssen-Müller, D.; Friedrich, R.; Heuer, A. Systematic Detection of Hidden Complexities in the Unfolding Mechanism of a Cytosine-Rich DNA Strand. Physica A: Stat. Mech. 2014, 394, 136–144.
(20) Garabedian, A.; Butcher, D.; Lippens, J. L.; Miksovska, J.; Chapagain, P. P.; Fabris, D.; Ridgeway, M. E.; Park, M. A.; Fernandez-Lima, F. Structures of the Kinetically Trapped I-Motif DNA Intermediates. Phys. Chem. Chem. Phys. 2016, 18, 26691–26702.

(21) Bian, Y.; Ren, W.; Song, F.; Yu, J.; Wang, J. Exploration of the Folding Dynamics of Human Telomeric G-Quadruplex with a Hybrid Atomistic Structure-Based Model. J. Chem. Phys. 2018, 148, 204107.
(22) Wang, Q.; Liu, J.-q.; Chen, Z.; Zheng, K.-w.; Chen, C.-y.; Hao, Y.-h.; Tan, Z. G-Quadruplex Formation at the 3’ End of Telomere DNA Inhibits Its Extension by Telomerase, Polymerase and Unwinding by Helicase. Nucleic Acids Res. 2011, 39, 6229–6237.
(23) Zahler, A. M.; Williamson, J. R.; Cech, T. R.; Prescott, D. M. Inhibition of Telomerase by G-Quartet DNA Structures. Nature 1991, 350, 718–720.
(24) Šponer, J.; Bussi, G.; Stadlbauer, P.; Kührová, P.; Banáš, P.; Islam, B.; Haider, S.; Neidle, S.; Otyepka, M. Folding of Guanine Quadruplex Molecules–Funnel-like Mechanism or Kinetic Partitioning? An Overview from MD Simulation Studies. Biochim. Biophys. Acta BBA - Gen. Subj. 2017, 1861, 1246–1263.
(25) Ghoshdastidar, D.; Bansal, M. Dynamics of Physiologically Relevant Noncanonical DNA Structures: An Overview from Experimental and Theoretical Studies. Brief. Funct. Genomics 2018, doi: 10.1093/bfgp/ely026.
(26) Cui, Y.; Kong, D.; Ghimire, C.; Xu, C.; Mao, H. Mutually Exclusive Formation of G-Quadruplex and i-Motif Is a General Phenomenon Governed by Steric Hindrance in Duplex DNA. Biochemistry 2016, 55, 2291–2299.
(27) Dhakal, S.; Yu, Z.; Konik, R.; Cui, Y.; Koirala, D.; Mao, H. G-Quadruplex and i-Motif Are Mutually Exclusive in ILPR Double-Stranded DNA. Biophys. J. 2012, 102, 2575–2584.
(28) Guo, K.; Pourpak, A.; Beetz-Rogers, K.; Gokhale, V.; Sun, D.; Hurley, L. H. Formation of Pseudosymmetrical G-Quadruplex and i-Motif Structures in the Proximal Promoter Region of the RET Oncogene. J. Am. Chem. Soc. 2007, 129, 10220–10228.
(29) Case, D. A.; Cheatham, T. E.; Darden, T.; Gohlke, H.; Luo, R.; Merz, K. M.; Onufriev, A.; Simmerling, C.; Wang, B.; Woods, R. J. The Amber Biomolecular Simulation Programs. J. Comput. Chem. 2005, 26, 1668–1688.
(30) Weiner, S. J.; Kollman, P. A.; Nguyen, D. T.; Case, D. A. An All Atom Force Field for Simulations of Proteins and Nucleic Acids. J. Comput. Chem. 1986, 7, 230–252.
(31) Ivani, I.; Dans, P. D.; Noy, A.; Pérez, A.; Faustino, I.; Hospital, A.; Walther, J.; Andrio, P.; Goñi, R.; Balaceanu, A.; et al. Parmbsc1: A Refined Force Field for DNA Simulations. Nat. Methods 2016, 13, 55–58.
(32) Plimpton, S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J. Comput. Phys. 1995, 117, 1–19.
(33) Shinoda, W.; Shiga, M.; Mikami, M. Rapid Estimation of Elastic Constants by Molecular Dynamics Simulation under Constant Stress. Phys. Rev. B 2004, 69, 134103.
(34) Andersen, H. C. Rattle: A “Velocity” Version of the Shake Algorithm for Molecular Dynamics Calculations. J. Comput. Phys. 1983, 52, 24–34.
(35) Laio, A.; Parrinello, M. Escaping Free-Energy Minima. Proc. Natl. Acad. Sci. 2002, 99, 12562–12566.
(36) Fiorin, G.; Klein, M. L.; Hénin, J. Using Collective Variables to Drive Molecular Dynamics Simulations. Mol. Phys. 2013, 111, 3345–3362.
(37) Adam, C.; Olmos, J. M.; Doneux, T. Electrochemical Monitoring of the Reversible Folding of Surface-Immobilized DNA i-Motifs. Langmuir 2018, 34, 3112–3118.
(38) Endo, M.; Xing, X.; Zhou, X.; Emura, T.; Hidaka, K.; Tuesuwan, B.; Sugiyama, H. Single-Molecule Manipulation of the Duplex Formation and Dissociation at the G-Quadruplex/i-Motif Site in the DNA Nanostructure. ACS Nano 2015, 9, 9922–9929.
(39) Alba, J. J.; Sadurní, A.; Gargallo, R. Nucleic Acid i-Motif Structures in Analytical Chemistry. Crit. Rev. Anal. Chem. 2016, 46, 443–454.

(40) Hud, N. V.; Sklenář, V.; Feigon, J. Localization of Ammonium Ions in the Minor Groove of DNA Duplexes in Solution and the Origin of DNA A-Tract Bending. J. Mol. Biol. 1999, 286, 651–660.
(41) Turel, I.; Kljun, J. Interactions of Metal Ions with DNA, Its Constituents and Derivatives, Which May Be Relevant for Anticancer Research. Curr. Top. Med. Chem. 2011, 11, 2661–2687.
(42) Stadlbauer, P.; Kührová, P.; Banáš, P.; Koča, J.; Bussi, G.; Trantírek, L.; Otyepka, M.; Šponer, J. Hairpins Participating in Folding of Human Telomeric Sequence Quadruplexes Studied by Standard and T-REMD Simulations. Nucleic Acids Res. 2015, 43, 9626–9644.

TOC Graphics
