What Makes Telomeres Unique? - The Journal of Physical Chemistry


Article

What Makes Telomeres Unique? Adam Kazimierz Sieradzan, Pawel Krupa, and David J. Wales J. Phys. Chem. B, Just Accepted Manuscript • Publication Date (Web): 14 Feb 2017 Downloaded from http://pubs.acs.org on February 15, 2017


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

What Makes Telomeres Unique? Adam K. Sieradzan 1,∗ , Paweł Krupa 1,2 and David J. Wales 3

1 Chemistry Department, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland, 2 Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, PL-02668 Warsaw, Poland and 3 Department of Chemistry, Cambridge University, Lensfield Road, Cambridge CB2 1EW, United Kingdom. ∗ To whom correspondence should be addressed. Tel: +48 58 523 5350; Fax: +48 58 523 5012; Email: [email protected]

Abstract Telomeres are repetitive nucleotide sequences that are essential for protecting the termini of chromosomes. Thousands of such repetitions are necessary to maintain the stability of the whole chromosome. Several similar repeated telomeric sequences have been found in different species, but why has nature chosen them? What features do telomeres have in common? In this article we study the physical properties of human-like (TTAGGG), plant (TTTAGGG), insect (TTAGG) and Candida guilliermondii (GGTGTAC) telomeres in comparison with seven non-telomeric control sequences. We used steered molecular dynamics with the Nucleic Acid united RESidue (NARES) coarse-grained force field, which we compared to the all-atom AMBER14 force field and experimental data. Our results reveal important features common to all the telomeric sequences, including their exceptionally high mechanical resistance and stability to untangling and stretching, compared to non-telomeric sequences. We find that the additional stability of the telomeres comes from their ability to form triplex structures and wrap around loose chains of linear DNA by regrabbing the chain. With slower pulling speeds, regrabbing and triplex formation are more frequent. We also find that some sequences known experimentally to form triplexes, such as TTTTTCCCC, can mimic telomeric properties.

1

Introduction

Telomeres are found at the termini of linear DNA, and play a central role in the fate of cells. 1 They take part in the ageing process, where shortening of the telomeres acts as a “cell-clock”, keeping track of previous cell divisions. 2 Telomeres adjust the cellular response to stress and growth stimulation, 3 and loop-like or lasso-like structures 4 play an important role in chromosome stability, 5 protecting DNA from premature replicative senescence and carcinogenesis. Short telomeres can lead to DNA instability, 6 which is associated with disorders such as cancer, 6 dyskeratosis congenita, 7 and pulmonary fibrosis. 8 Various forces are induced in vivo in cells. Forces originating outside the nucleus are damped by the nuclear lamina, 9 but additional forces are induced inside the nucleus. During transcription and replication, DNA is subject to large forces, and the telomeres undergo significant reorganisation from closed-loop to open structures, 4 which are more vulnerable to damage. In vivo there are many repair systems that correct potentially harmful double-strand DNA damage, for example homologous recombination (HR) and non-homologous end joining (NHEJ). 10,11 Therefore the most important factor is the efficiency of the repair mechanisms, rather than the mechanical stability of DNA. However, even when these mechanisms fail, “backup non-homologous end joining” can occur. 11

Many telomere sequences are found in nature, 12–16 although they possess similar functions and features. Why these sequences have been chosen by natural selection is not well understood. Hence, there is great interest in studies of the properties of double-strand and quadruplex forms of DNA. 17–19 However, most of the existing computational methods do not allow us to study systems of reasonable size on experimental time-scales due to the high computational cost. 20 Coarse-grained force fields are a partial solution to this problem, but existing approximations often lead to unphysical behaviour for the DNA. In this article we employ the physics-based NARES coarse-grained force field to show that the mechanical stability of telomere sequences is a common feature, suggesting that it is important in protecting DNA from mechanical stress. We compare our results with all-atom simulations, and propose a mechanism that distinguishes telomeric from non-telomeric sequences, based on the high stretching resistance in long pulling regions.

2

MATERIALS AND METHODS

To study the physical properties of several DNA sequences a series of steered molecular dynamics (SMD) simulations 21 using the all-atom AMBER14 22–24 force field and the coarse-grained nucleic-acid (NARES-2P) 25,26 force field were performed.

2.1

Starting structure generation

Initial structures for the DNA chains were generated using the NAB (Nucleic Acid Builder) program from the AmberTools14 package, 22–24 which was set to produce double-stranded canonical Arnott B-DNA structures. Four telomeric structures (d(TTAGG)a, d(TTAGGG)b, d(TTTAGGG)c and d(GGTGTAC)c) and seven reference structures (d(AT)d, d(GC)d, d(ATGC)e, d(AATTGGCC)f, d(AAATTTGGGCCC)g, d(AAAATTTTGGGGCCCC)h and d(TTTTTCCCC)i) were obtained. The values of a-i, the number of repeats of each pattern, were set to obtain systems containing ∼30 bp, ∼60 bp, ∼120 bp, and ∼240 bp. For example, d(AT)30 is a two-chain system in which each chain contains 30 repetitions of the AT fragment, giving 60 bp (120 nucleotides in total) and forming ∼6 full turns (twists) with a total length of ∼205 Å. Due to computational limitations, 60 bp systems were the focus for the analysis and comparison between all-atom and coarse-grained force fields. Longer and shorter DNA structures were used with the coarse-grained NARES force field to analyse the influence of the length of the DNA on the mechanical properties. All initial structures were energy minimised using the Amber ff14SB force field. Additionally, for the NARES coarse-grained simulations, models were converted to the NARES representation and relaxed in short MD simulations.
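The sizing rule described above can be illustrated with a short sketch. It is not the authors' script: the rounding rule used to pick the number of repeats, the rise of 3.4 Å per base pair and the value of 10 base pairs per turn are assumptions based on commonly quoted B-DNA parameters, and the function names are hypothetical.

```python
# Assumed canonical B-DNA geometry (not taken from the manuscript).
RISE_PER_BP_ANGSTROM = 3.4   # assumed rise per base pair
BP_PER_TURN = 10.0           # assumed helical repeat


def repeats_for_target(unit: str, target_bp: int) -> int:
    """Number of repeats of `unit` whose total length is closest to target_bp.

    Rounding to the nearest integer is an assumption about how the
    repeat counts a-i were chosen.
    """
    return max(1, round(target_bp / len(unit)))


def duplex_summary(unit: str, n_repeats: int) -> dict:
    """Base pairs, nucleotides, contour length and turns of a d(unit)n duplex."""
    bp = len(unit) * n_repeats
    return {
        "bp": bp,
        "nucleotides": 2 * bp,  # two complementary chains
        "length_angstrom": bp * RISE_PER_BP_ANGSTROM,
        "turns": bp / BP_PER_TURN,
    }


if __name__ == "__main__":
    for unit in ("TTAGG", "TTAGGG", "TTTAGGG", "AT", "GC", "ATGC"):
        n = repeats_for_target(unit, 60)
        print(f"d({unit}){n}", duplex_summary(unit, n))
```

Under these assumptions the ∼60 bp targets give d(TTAGG)12, d(TTAGGG)10, d(TTTAGGG)9, d(AT)30, d(GC)30 and d(ATGC)15, and d(AT)30 comes out at 60 bp, 120 nucleotides, about 204 Å and 6 turns, in line with the values quoted in the text.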


2.2

All-atom simulations

The initial B-DNA structures obtained from NAB were energy minimised with Amber ff14SB 24 using 600 steps of steepest-descent and 400 steps of conjugate-gradient optimisation. SMD simulations were carried out in implicit solvent, using the standard pairwise generalised Born model, with total lengths of 4, 20, 100 and 500 ns and a time step of 2 fs; the SHAKE algorithm was used to constrain bonds involving hydrogen atoms. The maximum pulling distance was set to 500 Å, which corresponds to pulling speeds of 12.5, 2.5, 0.5 and 0.1 m/s for the 4, 20, 100, and 500 ns simulations, respectively. Cutoff values in the range 12 to 25 Å with particle mesh Ewald (PME) summation were used to treat long-range interactions, and Langevin dynamics was used to keep the temperature at 300 K. The pulling force constant (spring constant) was set in the range 10 to 500 kcal mol−1 Å−2.
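The quoted pulling speeds follow directly from dividing the maximum pulling distance by the simulation length. A minimal sketch of that unit conversion is given below; the function name is hypothetical and the snippet only reproduces the arithmetic, not any part of the simulation setup.

```python
ANGSTROM_IN_M = 1e-10  # 1 Å in metres
NS_IN_S = 1e-9         # 1 ns in seconds


def pulling_speed_m_per_s(distance_angstrom: float, duration_ns: float) -> float:
    """Average pulling speed for a given pulling distance and simulation length."""
    return (distance_angstrom * ANGSTROM_IN_M) / (duration_ns * NS_IN_S)


if __name__ == "__main__":
    # 500 Å pulled over each of the four all-atom simulation lengths.
    for duration_ns in (4, 20, 100, 500):
        speed = pulling_speed_m_per_s(500.0, duration_ns)
        print(f"{duration_ns:>4} ns -> {speed:.1f} m/s")
```

For a 500 Å pull this gives 12.5, 2.5, 0.5 and 0.1 m/s for the 4, 20, 100 and 500 ns runs, matching the values stated above.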

2.3

SIRAH coarse-grained simulations

For comparison, the Southamerican Initiative for a Rapid and Accurate Hamiltonian (SIRAH) coarse-grained force field was used to simulate these model systems. SIRAH 27,28 is a relatively fine coarse-grained model of amino-acid residues, solvent and nucleotides, and for clarity will be referred to as fine-grained in the rest of the manuscript. Nucleotides in SIRAH are simplified to six interaction sites, which provides a ∼15-times speed-up for the systems studied compared to all-atom simulations. The main advantages of the SIRAH model are full compatibility with AMBER and GROMACS 29 software, and that in SIRAH simulations there are no restraints imposed on the structure of the DNA. Hence it can be used for studies of mechanical stability and conformational changes.

2.4

The NARES model of DNA

NARES-2P 25,26 is a coarse-grained model derived from the physics-based coarse-grained protein model UNited RESidue (UNRES), 30,31 and shares the same philosophy. Developed over 20 years, the UNRES force field has proved to be a successful tool for protein structure prediction in blind tests, 32,33 and a useful tool for studies of protein complexes. 34,35 The NARES-2P model is a new construction, and has already been found to reproduce many biologically relevant properties of double-helix B-DNA, such as duplex formation, breathing motions, and melting temperatures. 25 In NARES-2P, each nucleotide is reduced to two interaction sites (Fig. 1), namely a united sugar-base and a united phosphate group. The sequence of virtual sugar atoms located at the geometric centre of the sugar ring is used to describe the geometry of the chain.

2.5

Coarse-grained simulations

To study the mechanical properties of DNA sequences, SMD simulations 21 with constant pulling velocity were employed:

F = k(αt − x0),

where F is the pulling force, k is the spring constant, α is the pulling velocity, x0 is the initial distance between pulling centres, and t is the simulation time. Simulations were conducted for two protocols (Fig. 2): firstly with pulling from opposite ends (parallel to the axis of the DNA chain), and secondly from the same end (perpendicular to the axis of the DNA chain). The simulations were performed with pulling speeds (α) of 0.0402 m/s and 0.402 m/s. We note that due to coarse-graining, internal motions are effectively faster than for an all-atom force field. For the UNRES force field, the prototype of the NARES force field, a speed-up of at least three orders of magnitude was obtained, 36,37 and a similar speed-up for the NARES force field is expected. Therefore, the effective pulling speed is significantly smaller than that calculated directly from the conversion of molecular time units and Å to s and m, and is approximately equal to 0.04 and 0.40 mm/s. The spring force constant was set to 2.0 kcal mol−1 Å−2. For each system and set of pulling parameters, 64 trajectories were run using a Berendsen thermostat set to 300 K, with the angular momentum reset every 1000 steps. The time step was set to 0.1 mtu (molecular time units). The molecular time unit (mtu) arises from expressing energy in kcal mol−1, mass in g mol−1 and distance in Å instead of SI units, and is equal to 48.9 fs. The simulations were performed until dissociation of the chains (for systems with 120 bp), resulting in ∼0.4 ms of simulation per trajectory. In the simulation analysis a “regrabbing” event was defined as the situation in which a local peak in the pulling force was followed by a local minimum and a subsequent local maximum (resistance, breaking and reforming, respectively, which together constitute a regrab). Both maxima need to be at least 50 pN greater than the local minimum between them to define a regrab.
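The regrab criterion can be sketched as a simple scan over a force trace. This is an illustrative reading of the definition, not the analysis code used in the study: the function names are hypothetical, local extrema are taken as strict interior extrema of the sampled force, and no smoothing is applied (which a noisy trajectory would likely need).

```python
from typing import List, Sequence, Tuple


def local_extrema(force: Sequence[float]) -> List[Tuple[int, str]]:
    """Return (index, kind) for strict interior local extrema; kind is 'max' or 'min'."""
    found = []
    for i in range(1, len(force) - 1):
        if force[i] > force[i - 1] and force[i] > force[i + 1]:
            found.append((i, "max"))
        elif force[i] < force[i - 1] and force[i] < force[i + 1]:
            found.append((i, "min"))
    return found


def count_regrabs(force: Sequence[float], threshold_pn: float = 50.0) -> int:
    """Count max-min-max patterns in which both maxima exceed the minimum
    between them by at least threshold_pn (50 pN in the text)."""
    extrema = local_extrema(force)
    count = 0
    for j in range(1, len(extrema) - 1):
        idx, kind = extrema[j]
        prev_idx, prev_kind = extrema[j - 1]
        next_idx, next_kind = extrema[j + 1]
        if kind == "min" and prev_kind == "max" and next_kind == "max":
            drop = force[prev_idx] - force[idx]
            recovery = force[next_idx] - force[idx]
            if drop >= threshold_pn and recovery >= threshold_pn:
                count += 1
    return count


if __name__ == "__main__":
    # Synthetic force trace in pN, for illustration only.
    trace = [0, 100, 200, 120, 190, 60, 130, 20]
    print(count_regrabs(trace))
```

In this reading a single maximum can close one regrab and open the next, so consecutive saw-teeth are each counted; whether the original analysis shares maxima between events is not stated in the text.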
Contacts between nucleic bases were defined when the centre of mass of a sugar-base from chain A was closer than 5.0 Å to the centre of mass of any sugar-base from chain B. Twists were calculated as a sum of γ angles, not including situations when angles were close to fully extended (180°). For the analysis of triplex formation, one pair forms canonical base pairing while the other forms Hoogsteen base pairing. Therefore, two residues had to form a contact (with the same criterion as previously, 5.0 Å) and the third residue had to be at least four residues away from the contact-forming residues and at most 6.0 Å away from one of them. The larger distance criterion is used because Hoogsteen base pairing is weaker than the canonical form.
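The distance criteria above can be sketched as follows. This is one possible reading rather than the authors' implementation: the function names are hypothetical, the inputs are assumed to be lists of sugar-base centre-of-mass coordinates in Å for each chain, and "at least four residues away" is interpreted as a sequence separation measured along the chain to which the third residue belongs.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def contacts(chain_a: Sequence[Point], chain_b: Sequence[Point],
             cutoff: float = 5.0) -> List[Tuple[int, int]]:
    """Pairs (i, j) whose sugar-base centres are closer than `cutoff` (Å)."""
    return [(i, j)
            for i, a in enumerate(chain_a)
            for j, b in enumerate(chain_b)
            if math.dist(a, b) < cutoff]


def triplex_candidates(chain_a: Sequence[Point], chain_b: Sequence[Point],
                       pair_cutoff: float = 5.0, third_cutoff: float = 6.0,
                       min_separation: int = 4) -> List[Tuple[int, int, str, int]]:
    """Return (i, j, chain, k): a contact (i, j) plus a third residue k on
    `chain` that is far along the sequence but within third_cutoff of i or j."""
    hits = []
    for i, j in contacts(chain_a, chain_b, pair_cutoff):
        a, b = chain_a[i], chain_b[j]
        for name, chain, partner in (("A", chain_a, i), ("B", chain_b, j)):
            for k, c in enumerate(chain):
                if abs(k - partner) < min_separation:
                    continue  # too close along the sequence (assumed reading)
                if min(math.dist(c, a), math.dist(c, b)) <= third_cutoff:
                    hits.append((i, j, name, k))
    return hits


if __name__ == "__main__":
    # Toy coordinates in Å, for illustration only.
    chain_a = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (200.0, 0.0, 0.0),
               (300.0, 0.0, 0.0), (400.0, 0.0, 0.0), (0.0, 5.5, 0.0)]
    chain_b = [(0.0, 0.0, 4.0), (500.0, 0.0, 0.0)]
    print(contacts(chain_a, chain_b))
    print(triplex_candidates(chain_a, chain_b))
```

The looser 6.0 Å cutoff for the third residue mirrors the reasoning in the text that Hoogsteen pairing is weaker than canonical pairing.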

2.6

Conversion of coarse-grained structures to all-atom models

Coarse-grained models were subjected to all-atom reconstruction using the LEaP program, followed by restrained energy minimisation and short relaxation by restrained MD simulation in implicit solvent using the all-atom Amber ff14SB force field. All the reconstructed models were also subjected to short unrestrained all-atom MD simulations to check the stability of the structure.


3

RESULTS

3.1

Pulling at large speed from opposite ends

In all-atom simulations no significant difference in stretching pattern between the strongest (CG) and weakest (AT) base pairings was observed (Supplementary Fig. S1). In both cases the first step is untangling of B-DNA, leading to a straight “ladder-like” structure, regardless of the applied pulling speed (Supplementary Fig. S2). The first significant jump in the force value is observed at a stretch of approximately 100% of the DNA length and a half-peak value of ∼130% of the initial length, whereas the experimentally observed overstretching occurs at about 70% of the DNA length. The shift of overstretching and the ladder-like untangling pattern are probably caused by the unphysically rapid pulling speed (0.1-12.5 m/s). When this experiment was repeated with the NARES-2P force field at a speed about 1000 times faster than typical experiments (∼40 mm/s), similar results were obtained (Supplementary Fig. S3), including a similar untangling pattern (compare Supplementary Movies 1 and 2). The ladder-like (D-ladder) structure observed for all-atom simulations, and the almost ladder-like structure (S-ladder) observed for the NARES force field, are similar to structures predicted in earlier studies. 38 We also note that in all-atom simulations a saw-tooth profile is observed with a jump length of 25 Å. Analysis of the trajectories shows that this shape is caused by “regrabbing”, corresponding to reformation of hydrogen-bond patterns. This pattern is similar to the experimentally observed profile. However, as the experimental system size is much larger, the jump length was around 500 Å. 39 There is also an interesting conformational change in all-atom simulations (Figure 3). Upon stretching there is a change in conformation of the phosphate groups, leading to torsional angles approaching the extended (180°) form. The sugars, on the other hand, remain in the same C2′-endo conformation.
While DNA behaved normally in short MD simulations with SIRAH (100 ns), the strands started to detach from each other in longer simulations (5000 ns), which makes an analysis of mechanical stability at slower speed pointless. In short simulations the pulling force pattern is similar to that of the all-atom simulations, but the DNA never forms a “ladder-like” structure. This result may indicate that, in contrast to the all-atom simulations, in which the hydrogen bonds are too strong, in the SIRAH force field the hydrogen bonds are too weak. However, short simulations with the SIRAH force field (100 ns) produced force peaks in similar positions to the all-atom simulations, i.e. at approximately 100% stretch of the DNA.

3.2

Pulling at slow speed from opposite ends

The force against extension profile for a pulling speed of 0.04 mm/s is shown in Fig. 4. The peak stretching force is strongly sequence-dependent and ranges from around 100 pN for d(AT)30 up to about 600 pN for d(GC)30. We note that the stretch force peak values depend on the pulling speed (Supplementary Fig. S3 and S4). This effect is similar to experimental observations for proteins. 40 The initial length of our system is ∼200 Å. The force starts to rise between 250 and 280 Å, and the peak value occurs at around 330 to 350 Å stretch. The half peak force value is observed at a length of around 300 Å, which is about a 70% overstretch of the initial length, and matches experimental results. 39 Interesting patterns occur for all telomere sequences, which exhibit high stretch resistance in the long stretch region (beyond twice the initial length), whereas all non-telomeric sequences are separated or almost separated, and the chains do not interact with each other (Fig. 4). This pattern is also observed for higher speed and for systems twice as large (when each chain contains around 120 nucleotides; Supplementary Fig. S5), so the trend does not depend on these factors, but on the sequences. When a four times larger system was studied (Figure 8) the tailing effect was still observed but was significantly less pronounced. This result may be attributed to larger forces interfering with the regrabbing, which occurs more frequently. Interestingly, for smaller systems of half the reference size (when each chain contains around 30 nucleotides) the telomeres no longer exhibit strong resistance (Supplementary Fig. S13) in the long stretch regime, suggesting that there is a minimum number of repeats required to obtain this effect. Interestingly, the non-telomeric sequence TTTTTCCCC, which can form a triplex, 41 also reveals resistance in the long stretch regime. The additional force needed to stretch telomeric sequences can be explained by the release-grab-wrap mechanism. At some point the telomere reduces tension by releasing part of the pairing and subsequently by grabbing, wrapping around, and forming new base pairs (Figure 5 and Supplementary Movie 3). The regrabbing is associated with reformation of the hydrogen-bond pattern (Figure 6), which occurs after sliding one chain along the other (indicated by an increase of the distance between centres of mass).
It should be noted that for telomeres much longer simulations (Figure 6) are necessary because full dissociation occurs at much longer distances and the reformation of the hydrogen-bond pattern is much more pronounced. This behaviour is much less frequently observed for non-telomeric sequences (Table 1). A schematic representation of the stretching mechanism is shown in Fig. 7. For shorter chains regrabbing cannot occur, due to the low elasticity. Moreover, this resistance in the long stretch regime is not associated with the window size of repetition (Figure 4B). Interestingly, the TTTTTCCCC sequence repeats that can form a triplex also exhibit resistance in the long stretch regime. As both telomere sequences and the TTTTTCCCC sequence can form triplexes 19,41,42 under some conditions, this mechanical property might rather be correlated with triplex formation ability. Example single trajectory peak patterns are shown in Supplementary Figures S7 to S12 for d(TTAGG)12, d(TTAGGG)10, d(TTTAGGG)9, d(AT)30, d(GC)30, and d(ATGC)15, respectively. When single trajectories are analysed, a saw-tooth peak pattern is observed for all sequences. The pattern occurs least often for d(AT)30 and most often for the telomere sequences (Table 1). The length of the peak ranges from 25 Å (the same as for the all-atom simulations) up to 75 Å, suggesting that the increase of the chain length, the decrease of the pulling speed, and the sequence all influence the jump size. We note that the peak pattern is complicated and trajectory-dependent. It is also worth mentioning that in the initial stages of stretching an increase in the number of contacts is always observed. This result can be explained by the reduction of breathing motions 43,44 in the initial stages of stretching, by tightening of the ideal B-DNA conformations used as starting structures, and by the use of a coarse-grained model in which the sugar-base energy minima for contacts are closer than in the ideal B-DNA structure.
This process is not associated with the total number of twists. The analysis of work and average force at different stages of stretching (0-25% is an average force associated with relaxation, 25%-160% extension is associated with the main stretching events and 160%-200% corresponds to long range resistance) is shown in Supplementary Table ST1. There is a faster-than-linear dependence of the work done on the simulated sequence length. There is only a slight increase in the average force needed to break apart the structure (average force in stretch ranging from 25% to 160% of the initial length). This pattern is not reproduced by the d(AT) sequences. In d(AT)30 the average force is significantly larger than in the other cases. The increase in the force in comparison with d(AT)15 can be explained by the longer chain. However, the decrease with the larger length of chain is associated with the simulation time. When this time is short the breathing motions 44 do not occur frequently. This event occurs only in AT sequences as those domains are the most susceptible 44 to breathing motions.
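The stage-wise work and average force can be sketched as a trapezoidal integration of the force-extension curve over each stretching window. This is an illustration under stated assumptions, not the procedure used for Supplementary Table ST1: the function and stage names are hypothetical, the percentages are read as relative extension (x − L0)/L0, the extension samples are assumed to increase monotonically, and only sampling intervals lying entirely inside a window are counted.

```python
from typing import Dict, Sequence, Tuple

# Stage boundaries as relative extension (x - L0) / L0; this reading of the
# percentages quoted in the text is an assumption.
STAGES = {
    "relaxation": (0.0, 0.25),
    "main stretching": (0.25, 1.60),
    "long-range resistance": (1.60, 2.00),
}


def stage_work_and_mean_force(
    extension: Sequence[float],
    force: Sequence[float],
    initial_length: float,
    stages: Dict[str, Tuple[float, float]] = STAGES,
    eps: float = 1e-9,
) -> Dict[str, Tuple[float, float]]:
    """Return {stage: (work, mean_force)} using the trapezoidal rule.

    With extension in Å and force in pN, work is in pN·Å.
    """
    results = {}
    for name, (low, high) in stages.items():
        work = 0.0
        span = 0.0
        for k in range(len(extension) - 1):
            x0, x1 = extension[k], extension[k + 1]
            r0 = (x0 - initial_length) / initial_length
            r1 = (x1 - initial_length) / initial_length
            if r0 >= low - eps and r1 <= high + eps:
                work += 0.5 * (force[k] + force[k + 1]) * (x1 - x0)
                span += x1 - x0
        results[name] = (work, work / span if span > 0 else 0.0)
    return results


if __name__ == "__main__":
    # Synthetic curve (Å, pN), for illustration only.
    ext = [200.0, 225.0, 250.0, 300.0]
    frc = [0.0, 10.0, 20.0, 40.0]
    print(stage_work_and_mean_force(ext, frc, initial_length=200.0))
```

Dividing the work by the covered extension gives an extension-averaged force, which is one natural definition; a time-averaged force would differ if the extension did not grow uniformly in time.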

3.3

Pulling at slow speed from the same end

The plot of force against extension for pulling in parallel with the hydrogen-bond framework is shown in Fig. 9, which shows that the forces are significantly lower than for pulling perpendicular to the hydrogen-bond framework. For all non-telomere sequences the peak values are less than 50 pN. For telomere sequences the pattern is significantly different. Firstly, as for perpendicular pulling, telomere sequences exhibit significant resistance to stretching for long distances. This resistance is observed regardless of pulling speed (Fig. 9 and Supplementary Fig. S6). Secondly, a complex force pattern upon stretching occurs for all telomeric sequences. This pattern is strongly correlated with the window size of the repetition. The larger the repetition size, the longer the “wavelength” (distance between force peak values) of the initial stretching pattern. This result can be explained by sudden breaks of H-bonds, as the T..A pairing is much weaker than the G..C pairing, resulting in alternating weak and strong pairing. Analysis of the simulations revealed that additional stability was induced by spontaneous formation of triplex structures (Fig. 10). One of the chains that is pulled folds back to form a loosely-bound triplex structure with the free (non-pulled) dsDNA end. These complexes are not very stable and dissociate upon further stretching. However, they contribute significantly to the higher mechanical resistance of the telomeric sequences. It should be stressed that in the final stages of the simulations a stable bent triplex is formed. A schematic representation of the unravelling mechanism is shown in Fig. 11. When larger systems are analysed (Fig. 12) telomeres are no longer the only examples exhibiting resistance in the long stretch regime. This result indicates that when shorter fragments are analysed, only telomeres are flexible enough to form a triplex with a loose (unravelled) chain.
The triplex can also be formed spontaneously, not only by telomeric sequences, but also when the window of repeats is larger (Table 2). Our analysis shows that local repetitions in consecutive nucleotides within repetitive blocks promote triplex formation upon unwinding. Sequences with a majority of one type of nucleotide have a higher probability of forming triplexes, which can be partially explained by the requirement for a purine/pyrimidine domain. 45


Moreover, such behaviour can be explained by the complementarity of the nucleotides: runs of more than one repeated nucleotide have a higher chance of finding and attaching to the loose chain. Finally, sequence-dependent stiffness of the chain seems to play a role in triplex formation; however, its distribution is non-trivial and difficult to analyse. 46 Our study shows that under specific conditions the range of sequences that can form a triplex (or at least a loosely-formed triplex, with a very small number of Hoogsteen contacts) is much broader. Triplex formation explains stability in the long range region, as the sequences exhibiting resistance in the long range regime form triplexes at least ten times more frequently (or at least three times more frequently in the case of fast pulling speed), as shown in Table 2.

4

DISCUSSION

4.1

Comparison of all-atom and coarse-grained results

The all-atom and coarse-grained simulations both exhibit saw-tooth patterns during stretching. When the speed of the coarse-grained simulation was significantly increased, the overstretching peak shifted towards 100% of the initial length, corresponding to the all-atom overstretch peak value, but this limit was never reached. Despite a significant increase of pulling speed in the coarse-grained simulations, the forces observed are still significantly lower than in the all-atom simulations. In all-atom simulations a ladder-like structure is observed, whereas in coarse-grained simulations some hydrogen bonds are always broken earlier. Even with a significantly higher pulling speed for the coarse-grained force field, similar ladder-like structures are seen only for fragments of dsDNA. Both the all-atom and coarse-grained structures are similar to previously described results. 38 Coarse-grained simulations with high pulling speed and all-atom simulations share many common features. However, the effective pulling speed in the all-atom simulations is too high to reproduce all the features seen experimentally, in contrast to the coarse-grained simulations where slower pulling speeds are possible.

4.2

Comparison of NARES-2P simulation results to experimental data

The force values obtained for both types of stretching are in line with experimentally observed values. 47,48 However, in coarse-grained simulations the forces are lower than those for forced separation of DNA oligomers measured with atomic force microscopy by shearing apart opposite extremities. 49 As suggested in ref. 48, those measurements ‘(i) may include a strong coupling between bases because of the shearing motion and (ii) were produced at a higher rate’, as for our faster coarse-grained simulations and the all-atom simulations. Despite the fact that our pulling speed is still larger than in experiment, our results for pulling at low speed confirm the experimental equilibrium 50 and very small pulling speed 39 value for the overstretching force jump at an extension of 70% of the initial length. We note that the position of the overstretching force jump in our simulations is velocity-dependent and shifts towards larger values at larger pulling speeds. The mechanism where hydrogen bonds in B-DNA are lost gradually (by ‘jumps’), rather than one-by-one, has been described both theoretically 51 and experimentally, 39 in similar terms to the present work. The ratio between the peak force for d(GC)30 and d(AT)30 is 2.23 for parallel stretching (38.37 pN and 17.15 pN for d(GC)30 and d(AT)30, respectively). This ratio is higher than the experimental estimate of 1.5. 48 However, the latter value was presented as a rough estimate, and in our case it might also be affected by the smaller length compared to experiment. The ranges of forces obtained for pulling parallel to the hydrogen-bond plane are slightly higher than for unwinding under tension (∼30 pN) determined experimentally. 52 Finally, breaking of dsDNA was observed for l0/l0,b = 2.14 ± 0.2, 47 which is similar to the results for non-telomere sequences, where force descent was observed in the range 1.75 to 2.4 of the initial length. We also find that the critical force for both unwinding and stretching perpendicular to the base pair plane increases with chain length (Supplementary Table 1, Figures S4-S6, S13) as previously described. 53,54 However, there are too few points to find a mathematical relation between the length and the critical force required.

4.3

Telomere resistance

Our results reveal that telomere sequences and the triplex-forming sequence TTTTTCCCC have significantly stronger mechanical resistance in long range regions than the other non-telomeric sequences used as controls. This feature of telomeric fragments is probably the reason why such sequences secure the ends of DNA, which in vivo is subject to larger mechanical tension than other chromosome fragments. This phenomenon was not observed unless large systems were studied, suggesting that long stretched structures of DNA are unfavourable. It seems that the telomere effect of resistance is most important for short fragments of DNA, particularly locally damaged ones, and becomes less important when a larger portion of DNA is damaged. The ability of telomeres to form triple-chain structures provides an additional stabilising property, preventing further untangling. As suggested by Wang and Vasquez, 42 triplex formation is usually observed after double-strand breaking of DNA. Similar results were found in the present work when longer dsDNA was analysed. It seems that triplex formation in the case of telomeres is beneficial, as the strong resistance in the long stretch regime is also exhibited by another triplex-forming sequence (TTTTTCCCC). Triplex formation prevents unravelling, which might lead to instability of the telomere and ultimately the whole chromosome. Our study suggests that the quadruplex should be even more mechanically stable than loosely bound triplexes, and perhaps this structure has evolved to protect DNA.

5 CONCLUSION

In this study we performed all-atom and coarse-grained simulations of telomeric and control (non-telomeric) DNA sequences to assess their mechanical properties. With the NARES-2P coarse-grained force field we were able to run simulations three orders of magnitude longer, with many more trajectories. Our results reveal that all-atom simulations do not provide realistic results, due to the very high pulling force required and the limitation to very short DNA sequences. The coarse-grained approach with comparable speed and size confirms that these are very important factors. However, due to the efficiency of the NARES-2P potential we were able to use significantly slower pulling speeds, close to those employed experimentally, and we observed various telomere properties in agreement with experiment. 39,47,48,50 Our results highlight some key features of telomeric sequences. Evolution has designed telomeres to be significantly more stable than other sequences (over 50 pN), in the long-stretch regime, to both perpendicular and parallel mechanical force. Telomeres exhibit features that reduce mechanical tension through a release/grab mechanism. In the case of force acting perpendicular to the DNA axis, telomeres can fold back, exhibiting additional stability from spontaneous formation of triple-chain structures, which prevents unravelling. This mixture of flexibility (being able to release and fold back) and rigidity (the grabbing mechanism and the ability to form spontaneous triplexes) makes telomeres less prone to mechanical damage, making them ideal for protecting the DNA structure. These features are not simply associated with the size of the repeating sequence but rather should be attributed to the ability to form triplexes. Quadruplex complexes cannot yet be treated with the NARES force field, because quadruplexes require cations for stability. 55–57 Potential of mean force (PMF) analysis of the DNA-cation interactions is currently underway, and we plan to study the mechanical properties of quadruplexes in the future. The NARES-2P software is available at www.unres.pl.

6 ACKNOWLEDGEMENTS

AKS was supported by the National Science Center (Poland) Sonata UMO-2015/17/D/ST4/00509, and PK was supported by the Foundation for Polish Science FNP Mistrz 7./2013. DJW gratefully acknowledges support from the EPSRC. Computational resources were also provided by (a) the supercomputer resources at the Informatics Center of the Metropolitan Academic Network (IC MAN) in Gdańsk, (b) computational resources at the Interdisciplinary Center for Mathematical and Computer Modeling in Warsaw (ICM), grant GA65-20, and (c) our 682-processor Beowulf cluster at the Faculty of Chemistry, University of Gdańsk.

7 Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: . Additional plots of the force needed to stretch the DNA for pulling perpendicular (Figures S1, S3-5, S7-13) and parallel (Figure S6) to the hydrogen-bonds between bases, visualization of the ladder-like structure of d(GC)30 (Figure S2), example movies of all-atom (Supplementary Movie 1) and coarse-grained (Supplementary Movies 2-3) pulling simulations, and the average force and total work (kcal/mol) needed to dissociate the double-stranded structure (Table ST1) (PDF)

References

[1] Aubert, G.; Lansdorp, P. M. Telomeres and Aging. Physiol. Rev. 2008, 88, 557–579.
[2] Wright, W. E.; Piatyszek, M. A.; Rainey, W. E.; Byrd, W.; Shay, J. W. Telomerase Activity in Human Germline and Embryonic Tissues and Cells. Dev. Genet. 1996, 18, 173–179.
[3] von Zglinicki, T.; Saretzki, G.; Döcke, W.; Lotze, C. Mild Hyperoxia Shortens Telomeres and Inhibits Proliferation of Fibroblasts: A Model for Senescence? Exp. Cell Res. 1995, 220, 186–193.
[4] Galati, A.; Micheli, E.; Cacchione, S. Chromatin Structure in Telomere Dynamics. Front. Oncol. 2013, 3, 1–16.
[5] Cech, T. R. Beginning to Understand the End of the Chromosome. Cell 2004, 116, 273–279.
[6] De Lange, T. Telomere-Related Genome Instability in Cancer. Cold Spring Harbor Symposia on Quantitative Biology. 2005; pp 197–204.
[7] Vulliamy, T. J.; Marrone, A.; Knight, S. W.; Walne, A.; Mason, P. J.; Dokal, I. Mutations in Dyskeratosis Congenita: Their Impact on Telomere Length and the Diversity of Clinical Presentation. Blood 2006, 107, 2680–2685.
[8] Armanios, M. Y.; Chen, J. J.-L.; Cogan, J. D.; Alder, J. K.; Ingersoll, R. G.; Markin, C.; Lawson, W. E.; Xie, M.; Vulto, I.; Phillips, J. A. I. et al. Telomerase Mutations in Families with Idiopathic Pulmonary Fibrosis. N. Engl. J. Med. 2007, 356, 1317–1326.
[9] Schumacher, B.; Garinis, G. A.; Hoeijmakers, J. H. Age to Survive: DNA Damage and Aging. Trends Genet. 2008, 24, 77–85.
[10] Moore, J. K.; Haber, J. E. Cell Cycle and Genetic Requirements of Two Pathways of Nonhomologous End-Joining Repair of Double-Strand Breaks in Saccharomyces Cerevisiae. Mol. Cell. Biol. 1996, 16, 2164–73.
[11] Mansour, W. Y.; Rhein, T.; Dahm-Daphi, J. The Alternative End-Joining Pathway for Repair of DNA Double-Strand Breaks Requires PARP1 but is not Dependent upon Microhomologies. Nucl. Acids Res. 2010, 38, 6065.
[12] Forney, J. D.; Blackburn, E. H. Developmentally Controlled Telomere Addition in Wild-type and Mutant Paramecia. Mol. Cell. Biol. 1988, 8, 251–258.
[13] Meyne, J.; Ratliff, R. L.; Moyzis, R. K. Conservation of the Human Telomere Sequence (TTAGGG)n Among Vertebrates. Proc. Natl. Acad. Sci. USA 1989, 86, 7049–7053.
[14] Cox, A. V.; Bennett, S. T.; Parokonny, A. S.; Kenton, A.; Callimassia, M. A.; Bennett, M. D. Comparison of Plant Telomere Locations using a PCR-generated Synthetic Probe. Ann. Bot. 1993, 72, 239–247.
[15] McEachern, M. J.; Blackburn, E. H. A Conserved Sequence Motif within the Exceptionally Diverse Telomeric Sequences of Budding Yeasts. Proc. Natl. Acad. Sci. USA 1994, 91, 3453–3457.
[16] Frydrychová, R.; Grossmann, P.; Trubac, P.; Vítková, M.; Marec, F. Phylogenetic Distribution of TTAGG Telomeric Repeats in Insects. Genome 2004, 47, 163–178.
[17] Naserian-Nik, A. M.; Tahani, M.; Karttunen, M. Pulling of Double-stranded DNA by Atomic Force Microscopy: A Simulation in Atomistic Details. RSC Adv. 2013, 3, 10516–10528.
[18] Schrodt, M. V.; Andrews, C. T.; Elcock, A. H. Large-Scale Analysis of 48 DNA and 48 RNA Tetranucleotides Studied by 1 µs Explicit-Solvent Molecular Dynamics Simulations. J. Chem. Theory Comput. 2015, 11, 5906–5917.
[19] Stadlbauer, P.; Mazzanti, L.; Cragnolini, T.; Wales, D. J.; Derreumaux, P.; Pasquali, S.; Šponer, J. Coarse-Grained Simulations Complemented by Atomistic Molecular Dynamics Provide New Insights into Folding and Unfolding of Human Telomeric G-Quadruplexes. J. Chem. Theory Comput. 2016, 12, 6077–6097.
[20] Kmiecik, S.; Gront, D.; Kolinski, M.; Wieteska, L.; Dawid, A.; Koliński, A. Coarse-Grained Protein Models and Their Applications. Chem. Rev. 2016, 116, 7898–7936.
[21] Grubmüller, H.; Heymann, B.; Tavan, P. Ligand Binding: Molecular Mechanics Calculation of the Streptavidin-Biotin Rupture Force. Science 1996, 271, 997–999.
[22] Ponder, J. W.; Case, D. A. Force Fields for Protein Simulations. Adv. Protein Chem. 2003, 66, 27–86.
[23] Cheatham, T. E.; Case, D. A. Twenty-five Years of Nucleic Acid Simulations. Biopolymers 2013, 99, 969–977.
[24] Maier, J. A.; Martinez, C.; Kasavajhala, K.; Wickstrom, L.; Hauser, K. E.; Simmerling, C. ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J. Chem. Theory Comput. 2015, 11, 3696–3713.
[25] He, Y.; Maciejczyk, M.; Ołdziej, S.; Scheraga, H. A.; Liwo, A. Mean-Field Interactions between Nucleic-Acid-Base Dipoles can Drive the Formation of a Double Helix. Phys. Rev. Lett. 2013, 110, 098101.
[26] Liwo, A.; Baranowski, M.; Czaplewski, C.; Gołaś, E.; He, Y.; Jagieła, D.; Krupa, P.; Maciejczyk, M.; Makowski, M.; Mozolewska, M. A. et al. A Unified Coarse-Grained Model of Biological Macromolecules Based on Mean-Field Multipole-Multipole Interactions. J. Mol. Model. 2014, 20, 2306.
[27] Darré, L.; Machado, M.; Brandner, A.; González, H.; Ferreira, S.; Pantano, S. SIRAH: A Structurally Unbiased Coarse-Grained Force Field for Proteins with Aqueous Solvation and Long-Range Electrostatics. J. Chem. Theory Comput. 2015, 11, 723–739.
[28] Machado, M.; Pantano, S. Exploring LacI-DNA Dynamics by Multiscale Simulations Using the SIRAH Force Field. J. Chem. Theory Comput. 2015, 11, 5012–5023.
[29] Berendsen, H.; van der Spoel, D.; van Drunen, R. GROMACS: A Message-passing Parallel Molecular Dynamics Implementation. Comput. Phys. Commun. 1995, 91, 43–56.
[30] Krupa, P.; Sieradzan, A. K.; Rackovsky, S.; Baranowski, M.; Ołdziej, S.; Scheraga, H. A.; Liwo, A.; Czaplewski, C. Improvement of the Treatment of Loop Structures in the UNRES Force Field by Inclusion of Coupling between Backbone- and Side-Chain-Local Conformational States. J. Chem. Theory Comput. 2013, 9, 4620–4632.
[31] Sieradzan, A. K.; Krupa, P.; Scheraga, H. A.; Liwo, A.; Czaplewski, C. Physics-Based Potentials for the Coupling between Backbone- and Side-Chain-Local Conformational States in the United Residue (UNRES) Force Field for Protein Simulations. J. Chem. Theory Comput. 2015, 11, 817–831.
[32] He, Y.; Mozolewska, M. A.; Krupa, P.; Sieradzan, A. K.; Wirecki, T. K.; Liwo, A.; Kachlishvili, K.; Rackovsky, S.; Jagieła, D.; Ślusarz, R. et al. Lessons from Application of the UNRES Force Field to Predictions of Structures of CASP10 Targets. Proc. Natl. Acad. Sci. USA 2013, 110, 14936–14941.
[33] Krupa, P.; Mozolewska, M. A.; Wiśniewska, M.; Yin, Y.; He, Y.; Sieradzan, A. K.; Ganzynkowicz, R.; Lipska, A. G.; Karczyńska, A.; Ślusarz, M. et al. Performance of Protein-structure Predictions with the Physics-Based UNRES Force Field in CASP11. Bioinformatics 2016, 32, 3270–3278.
[34] Sieradzan, A. K.; Liwo, A.; Hansmann, U. H. E. Folding and Self-Assembly of a Small Protein Complex. J. Chem. Theory Comput. 2012, 8, 3416–3422.
[35] Mozolewska, M.; Krupa, P.; Scheraga, H.; Liwo, A. Molecular Modeling of the Binding Modes of the Iron-Sulfur Protein to the Jac1 Co-Chaperone from Saccharomyces Cerevisiae by All-Atom and Coarse-Grained Approaches. Proteins: Struct., Funct., Bioinf. 2015, 83, 1414–1426.
[36] Khalili, M.; Liwo, A.; Jagielska, A.; Scheraga, H. A. Molecular Dynamics with the United-Residue Model of Polypeptide Chains. II. Langevin and Berendsen-bath Dynamics and Tests on Model α-Helical Systems. J. Phys. Chem. B 2005, 109, 13798–13810.
[37] Liwo, A.; Khalili, M.; Scheraga, H. A. Ab initio Simulations of Protein-Folding Pathways by Molecular Dynamics with the United-Residue Model of Polypeptide Chains. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 2362–2367.
[38] Konrad, M. W.; Bolonick, J. I. Molecular Dynamics Simulation of DNA Stretching Is Consistent with the Tension Observed for Extension and Strand Separation and Predicts a Novel Ladder Structure. J. Am. Chem. Soc. 1996, 118, 10989–10994.
[39] Gross, P.; Laurens, N.; Oddershede, L. B.; Bockelmann, U.; Peterman, E. J.; Wuite, G. J. Quantifying How DNA Stretches, Melts and Changes Twist under Tension. Nat. Phys. 2011, 7, 731–736.
[40] Rico, F.; Gonzalez, L.; Casuso, I.; Puig-Vidal, M.; Scheuring, S. High-speed Force Spectroscopy Unfolds Titin at the Velocity of Molecular Dynamics Simulations. Science 2013, 342, 741–743.
[41] Moraru-Allen, A. A.; Cassidy, S.; Alvarez, J.-L. A.; Fox, K. R.; Brown, T.; Lane, A. N. Coralyne Has A Preference for Intercalation between TA-T Triples in Intramolecular DNA Triple Helices. Nucl. Acids Res. 1997, 25, 1890–1896.
[42] Wang, G.; Vasquez, K. M. Naturally Occurring H-DNA-Forming Sequences are Mutagenic in Mammalian Cells. Proc. Natl. Acad. Sci. USA 2004, 101, 13448–13453.
[43] Mandal, C.; Kallenbach, N. R.; Englander, S. Base-pair Opening and Closing Reactions in the Double Helix. J. Mol. Biol. 1979, 135, 391–411.
[44] Altan-Bonnet, G.; Libchaber, A.; Krichevsky, O. Bubble Dynamics in Double-Stranded DNA. Phys. Rev. Lett. 2003, 90, 138101.
[45] Hanvey, J. C.; Shimizu, M.; Wells, R. D. Intramolecular DNA Triplexes in Supercoiled Plasmids. Proc. Natl. Acad. Sci. USA 1988, 85, 6292–6296.
[46] Gromiha, M. M. Structure Based Sequence Dependent Stiffness Scale for Trinucleotides: A Direct Method. J. Biol. Phys. 2000, 26, 43–50.
[47] Bensimon, D.; Simon, A. J.; Croquette, V.; Bensimon, A. Stretching DNA with a Receding Meniscus: Experiments and Models. Phys. Rev. Lett. 1995, 74, 4754.
[48] Essevaz-Roulet, B.; Bockelmann, U.; Heslot, F. Mechanical Separation of the Complementary Strands of DNA. Proc. Natl. Acad. Sci. USA 1997, 94, 11935–11940.
[49] Lee, G. U.; Chrisey, L. A.; Colton, R. J. Direct Measurement of the Forces between Complementary Strands of DNA. Science 1994, 266, 771–773.
[50] Smith, S. B.; Cui, Y.; Bustamante, C. Overstretching B-DNA: the Elastic Response of Individual Double-Stranded and Single-Stranded DNA Molecules. Science 1996, 271, 795–799.
[51] Romano, F.; Chakraborty, D.; Doye, J. P. K.; Ouldridge, T. E.; Louis, A. A. Coarse-grained Simulations of DNA Overstretching. J. Chem. Phys. 2013, 138.
[52] Gore, J.; Bryant, Z.; Nöllmann, M.; Le, M. U.; Cozzarelli, N. R.; Bustamante, C. DNA Overwinds When Stretched. Nature 2006, 442, 836–839.
[53] Mishra, R. K.; Mishra, G.; Li, M. S.; Kumar, S. Effect of Shear Force on the Separation of Double-Stranded DNA. Phys. Rev. E 2011, 84, 032903.
[54] Mosayebi, M.; Louis, A. A.; Doye, J. P. K.; Ouldridge, T. E. Force-Induced Rupture of a DNA Duplex: From Fundamentals to Force Sensors. ACS Nano 2015, 9, 11993–12003.
[55] Saintomé, C.; Amrane, S.; Mergny, J.-L.; Alberti, P. The Exception that Confirms the Rule: A Higher-order Telomeric G-quadruplex Structure More Stable in Sodium than in Potassium. Nucl. Acids Res. 2016,
[56] Bhattacharyya, D.; Mirihana Arachchilage, G.; Basu, S. Metal Cations in G-Quadruplex Folding and Stability. Front. Chem. 2016, 4, 38.
[57] Largy, E.; Marchand, A.; Amrane, S.; Gabelica, V.; Mergny, J.-L. Quadruplex Turncoats: Cation-Dependent Folding and Stability of Quadruplex-DNA Double Switches. J. Am. Chem. Soc. 2016, 138, 2780–2792.


Table 1: Frequency of regrabbing for different nucleic sequences with different pulling speed perpendicular to hydrogen-bond plane

Sequence                  fast pulling   slow pulling
d(AT)30                   1.38           0.02
d(GC)30                   0.46           1.00
d(ATGC)15                 1.27           1.52
d(AATTGGCC)8              0.97           0.97
d(AAATTTGGGCCC)5          0.22           0.39
d(AAAATTTTGGGGCCCC)4      0.19           0.45
d(TTTTTCCCC)7             1.53           2.45
d(TTAGG)12                1.56           2.73
d(TTAGGG)10               2.06           3.33
d(TTTAGGG)9               2.03           2.95
d(GGTGTAC)9               1.64           2.03
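To make the trend in Table 1 easier to read, the sketch below ranks the sequences by their slow-pulling regrabbing frequency. The numbers are copied from Table 1; the dictionary layout and variable names are illustrative choices and are not part of the original analysis.

```python
# Table 1 values: sequence -> (fast pulling, slow pulling) regrabbing frequency.
regrab_frequency = {
    "d(AT)30": (1.38, 0.02),
    "d(GC)30": (0.46, 1.00),
    "d(ATGC)15": (1.27, 1.52),
    "d(AATTGGCC)8": (0.97, 0.97),
    "d(AAATTTGGGCCC)5": (0.22, 0.39),
    "d(AAAATTTTGGGGCCCC)4": (0.19, 0.45),
    "d(TTTTTCCCC)7": (1.53, 2.45),
    "d(TTAGG)12": (1.56, 2.73),
    "d(TTAGGG)10": (2.06, 3.33),
    "d(TTTAGGG)9": (2.03, 2.95),
    "d(GGTGTAC)9": (1.64, 2.03),
}

# Rank sequences by slow-pulling regrabbing frequency, highest first.
ranked = sorted(regrab_frequency.items(), key=lambda item: item[1][1], reverse=True)

for sequence, (fast, slow) in ranked:
    # The last column is the change in frequency when pulling slowly.
    print(f"{sequence:<22} fast={fast:4.2f}  slow={slow:4.2f}  change={slow - fast:+.2f}")
```

Sorting on the slow-pulling column is only one possible choice; sorting on the fast-pulling column or on the difference requires changing the `key` function alone.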

Table 2: Fraction of simulation (in %) when at least one contact is in a triplex form for different nucleic sequences with different pulling speed parallel to hydrogen-bond plane

Sequence                  fast pulling   slow pulling
d(AT)30                   0.00           0.00
d(GC)30                   0.07           0.13
d(ATGC)15                 0.01           0.03
d(AATTGGCC)8              1.06           5.49
d(AAATTTGGGCCC)5          0.67           5.30
d(AAAATTTTGGGGCCCC)4      1.87           9.82
d(TTTTTCCCC)7             1.77           12.92
d(TTAGG)12                1.10           6.97
d(TTAGGG)10               1.80           9.66
d(TTTAGGG)9               2.65           9.35
d(GGTGTAC)9               0.21           1.57
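The effect of pulling speed on triplex formation in Table 2 can be summarised as a fold change between slow and fast pulling. The following is a minimal sketch using the values from Table 2; the helper `fold_change` and the variable names are hypothetical and introduced here only for illustration.

```python
# Table 2 values: sequence -> (fast pulling, slow pulling) triplex fraction in %.
triplex_fraction = {
    "d(AT)30": (0.00, 0.00),
    "d(GC)30": (0.07, 0.13),
    "d(ATGC)15": (0.01, 0.03),
    "d(AATTGGCC)8": (1.06, 5.49),
    "d(AAATTTGGGCCC)5": (0.67, 5.30),
    "d(AAAATTTTGGGGCCCC)4": (1.87, 9.82),
    "d(TTTTTCCCC)7": (1.77, 12.92),
    "d(TTAGG)12": (1.10, 6.97),
    "d(TTAGGG)10": (1.80, 9.66),
    "d(TTTAGGG)9": (2.65, 9.35),
    "d(GGTGTAC)9": (0.21, 1.57),
}


def fold_change(fast, slow):
    """Return slow/fast, or None when the fast-pulling fraction is zero."""
    return slow / fast if fast > 0 else None


# Fold change of the triplex fraction on going from fast to slow pulling.
changes = {seq: fold_change(fast, slow) for seq, (fast, slow) in triplex_fraction.items()}

# Sequence with the largest triplex fraction under slow pulling.
most_triplex = max(triplex_fraction, key=lambda seq: triplex_fraction[seq][1])

for seq, change in changes.items():
    label = "undefined" if change is None else f"{change:.1f}x"
    print(f"{seq:<22} slow/fast = {label}")
print("largest slow-pulling triplex fraction:", most_triplex)
```

The fold change is left undefined for d(AT)30 because no triplex contacts were recorded at either speed.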

8 Figure legends

Fig. 1. Comparison between the all-atom representation of the nucleotide chain (sticks) and the coarse-grained NARES-2P model (circles, ellipsoids and heavy black lines). Red circles represent the united sugar groups (S), which serve only as geometric reference points. Yellow circles represent the united phosphate groups (P). Ellipsoids (light blue) represent united sugar-bases, with their geometric centres shown as solid green circles. The Ps are located halfway between two consecutive sugar atoms. Dipoles are located on the bases to represent their electrostatic interaction. The electrostatic part of the base-base interactions is represented by the mean-field interactions of the base dipoles (red arrows). Virtual-bond angles, θ, and virtual-bond dihedral angles, γ, are used to describe the backbone geometry. The base orientation angles α and torsional angles β, which define the location of a base with respect to the backbone, are also indicated.

Fig. 2. Schematic representation of the pulling modes: (top) perpendicular and (bottom) parallel to hydrogen-bonds between bases. The forces are marked with red arrows, while dsDNA chains are marked with blue and green colouring.

Fig. 3. Sugar and phosphate group conformational transition occurring upon stretching perpendicular to the hydrogen-bonds between bases, represented in a stick form. Blue is the initial state, red is the transition state and black represents the final conformation of the sugar and phosphate groups obtained.

Fig. 4. Plot of the force needed to stretch the DNA perpendicular to the hydrogen-bonds between bases, as a function of sequence. Thin lines indicate averages over 64 trajectories and the coloured areas indicate one standard deviation. (A) Comparison between non-telomere and telomere sequences. (B) Influence of size of repeat window compared with the TTTAGGG telomere sequence.

Fig. 5. Structure of telomere after regrabbing. The free end (red) wraps around the second DNA chain.

Fig. 6. Number of nucleic acid contacts (red), chain twists (blue), distance between centres of mass of DNA chains (green), and distance between pulled residues (violet) for systems (A) d(GC)30 and (B) d(TTAGGG)10 as a function of time.

Fig. 7. Structures of d(TTAGG)12 found on the stretching pathway for pulling perpendicular to the hydrogen-bonds between bases at different extensions (as indicated below each structure). (A) loss of H-bond contacts at the ends, (B) loss of twist associated with release of one of the ends, (C) sliding of the whole chains with respect to each other followed by regrabbing, (D) loss of twists associated with weakening of base-pairing at the ends, (E) release of base pairing from one of the chains followed by sliding, (F) regrabbing occurring at one of the chains, (G) late-stage regrabbing.

Fig. 8. Plot of the force needed to stretch the DNA perpendicular to the hydrogen-bonds between bases, as a function of sequence for four-times larger systems with fast pulling speed. Thin lines indicate averages over 64 trajectories and the coloured areas indicate one standard deviation.

Fig. 9. Force against stretch plot for pulling parallel to the hydrogen-bond framework between bases. Thin lines indicate averages over 64 simulations, and coloured areas represent one standard deviation. (A) Comparison between non-telomere and telomere sequences. (B) Influence of size of repeat window compared with the TTTAGGG telomere sequence.

Fig. 10. Visualisation of loosely formed triplex structure. The pulled end middle section (green) folds back to pair with the free ends (blue and red).

Fig. 11. Structures of d(TTAGG)12 found on the unravelling pathway for pulling parallel to the hydrogen-bonds between bases at different stages of simulation. (A) initial loss of base-pairing at the pulled ends, (B) sudden loss of several base pairs from the pulled ends, (C) looped structure forming a triplex, (D) non-pulled ends unwind to form canonical base pairing with the stretched fragment, (E) release of the triplex/duplex formed (reformation of base pairing at the non-pulled end), (F) example structure of late-stage triplex formation.

Fig. 12. Plot of the force needed to stretch the DNA parallel to the hydrogen-bonds between bases, as a function of sequence for four-times larger systems with fast pulling speed. Thin lines indicate averages over 64 trajectories and the coloured areas indicate one standard deviation.
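Several legends above describe curves drawn as an average over 64 trajectories with a band of one standard deviation. The sketch below shows one way such a mean line and band could be computed at each stretch value. The force traces are synthetic (a hypothetical linear trend plus Gaussian noise), the stretch grid is hypothetical, and the use of the sample standard deviation is an assumption, since the legends do not specify the estimator.

```python
import random
from statistics import mean, stdev

random.seed(0)  # reproducible synthetic data

n_trajectories = 64                        # number of trajectories quoted in the legends
stretch = [10.0 * i for i in range(5)]     # hypothetical stretch grid (Å)

# Hypothetical force traces (pN): linear trend plus Gaussian noise, one list per trajectory.
traces = [
    [2.0 * s + random.gauss(0.0, 5.0) for s in stretch]
    for _ in range(n_trajectories)
]

# Transpose so that each column holds all trajectories at one stretch value.
columns = list(zip(*traces))

# One mean and one sample standard deviation per stretch value.
mean_force = [mean(col) for col in columns]
std_force = [stdev(col) for col in columns]

# Edges of the band that a plot would shade around the mean line.
lower = [m - s for m, s in zip(mean_force, std_force)]
upper = [m + s for m, s in zip(mean_force, std_force)]

for s, m, lo, hi in zip(stretch, mean_force, lower, upper):
    print(f"stretch={s:5.1f} Å  mean={m:7.2f} pN  band=[{lo:7.2f}, {hi:7.2f}]")
```

Replacing the synthetic `traces` with force values read from simulation output, sampled on a common stretch grid, would give the quantities plotted as thin lines and coloured areas.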

Figure 1

Figure 2

Figure 3

Figure 4. Panels A and B: Force [pN] against Stretch [Å]. Panel A series: GC30, AT30, TTTTTCCCC7, TTAGGG10, TTTAGGG9, TTAGG12, GGTGTAC9. Panel B series: ATCG15, AATTGGCC8, AAATTTGGGCCC5, AAAATTTTGGGGCCCC4, TTTAGGG9.

Figure 5

Figure 6. Panels A and B. Axes: Number of nucleic acid contacts and chain twists; Distance between pulled residues and distance between MC1 and MC2 [Å]; Time [µs].

Figure 7. Panels with the extension indicated for each structure: (A) 286, (B) 298, (C) 323, (D) 329, (E) 404, (F) 409, (G) 615.

Figure 8. Force [pN] against Stretch [Å]. Series: GC120, AT120, TTTTTCCCC27, TTAGGG40, TTTAGGG34, TTAGG48, GGTGTAC34.

Figure 9. Panels A and B: Force [pN] against Stretch [Å]. Panel A series: GC30, AT30, TTTTTCCCC7, TTAGGG10, TTTAGGG9, TTAGG12, GGTGTAC9. Panel B series: ATCG15, AATTGGCC8, AAATTTGGGCCC5, AAAATTTTGGGGCCCC4, TTTAGGG9.

Figure 10

Figure 11

Figure 12. Panels A and B: Force [pN] against Stretch [Å]. Panel A series: GC120, AT120, TTAGGG40, TTTAGGG34. Panel B series: TTTTTCCCC27, ATGC60, TTAGG48, GGTGTAC34.

Figure TOC
