Single-Stranded DNA Oligonucleotides Retain Rise Coordinates

Aug 2, 2018 - The structures of single-stranded DNA oligonucleotides from dimeric to hexameric sequences have been thoroughly investigated...

B: Biophysics; Physical Chemistry of Biological Systems and Biomolecules

Single Stranded DNA Oligonucleotides Retain Rise Coordinates Characteristic of Double Helices
Amedeo Capobianco, Amalia Velardo, and Andrea Peluso
J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/acs.jpcb.8b04542 • Publication Date (Web): 02 Aug 2018
Downloaded from http://pubs.acs.org on August 6, 2018



The Journal of Physical Chemistry

Single Stranded DNA Oligonucleotides Retain Rise Coordinates Characteristic of Double Helices
Amedeo Capobianco,∗ Amalia Velardo, and Andrea Peluso
Dipartimento di Chimica e Biologia “A. Zambelli”, Università di Salerno, Via G. Paolo II, I-84084 Fisciano (SA), Italy
E-mail: [email protected]


ACS Paragon Plus Environment


Abstract

The structures of single stranded DNA oligonucleotides, from dimeric to hexameric sequences, have been thoroughly investigated. Computations performed at the density functional level of theory, including dispersion forces and solvation, show that single stranded helices adopt conformations very close to crystallographic B-DNA, with rise coordinates of up to 3.33 Å. Previous results suggesting that single strands should be shorter than double helices originated largely from the incompleteness of the adopted basis set. Although appreciable deviations from standard B-DNA are predicted, computations indicate that sequences rich in stacked adenines are the most ordered ones, favoring the B-DNA pattern and also inducing regular arrangements on flanking nucleobases. Several structural properties of adenine-rich double helices are indeed already reflected in the corresponding single strands.


Introduction

Structural information is of paramount importance for understanding the biological functions of nucleic acids, inasmuch as the ability of DNA and RNA to assume proper conformations is responsible for packaging in the cell nucleus, cutting and sealing operations, and biochemical catalysis. 1–5 The optimum structure of nucleic acids results from the interplay of several factors: hydrogen bonding between complementary bases, intra-strand base-stacking interactions, constraints imposed by the conformations of the sugar-phosphate moieties, solvation, and ionic strength. Although intra-strand π-π stacking interactions are weaker than hydrogen bonding between complementary nucleobases, 6–10 there is ample evidence that not only single strands (ss) but also double helices owe their stability mainly to stacking interactions. 11–13

Significant insights into the conformational variability of nucleic acids have been achieved by molecular dynamics (MD) simulations. 14–16 MD is a well suited tool for studying DNA inasmuch as it allows the atomic structure of the solvent and thermal effects to be included in computations. Indeed, the development of increasingly refined force fields especially suited for nucleic acids is still the subject of intense research. 17 On the other hand, the molecular mechanics force fields on which MD is based, being empirical in nature, cannot reproduce the polarization of molecules due to different kinds of intra- and inter-molecular interactions, as discussed e.g. in refs 18, 19, and 20. Those effects can be included by static and dynamic methods of quantum chemistry (QM). Over the past decade, numerous quantum chemical studies of DNA have appeared, showing that among QM methods density functional theory (DFT) can provide a reliable description of DNA oligonucleotides, provided that functionals capable of handling dispersion forces are used and solvation effects are included in computations. 19–24

Previous DFT studies suggested that single stranded DNA sequences tend to adopt the B-DNA configuration in solution, but significant distortions with respect to the ideal structure were detected. The largest discrepancies were found for rise coordinates, which were predicted to be appreciably shorter than 3.3 Å, the average value of B-DNA. 19,23–25 Prediction of reliable inter-base distances in DNA sequences is of great importance for the modeling of several important phenomena. As an example, it has been demonstrated that the electronic coupling for hole transport in oxidized DNA depends mainly on the rise coordinate and to a much lesser extent on twist. 26,27 The achievement of an optimum rise coordinate is also essential for protein/DNA binding and recognition already at the level of the single strand. 28

Despite the important role single strands play in DNA replication, recombination, repair, and transcription, scarce structural information is available. 29 MD simulations predict stacked conformations as the most populated for RNA and single stranded DNA dinucleotides in solution. 30–33 Unfortunately, however, single-stranded nucleic acids are difficult to study by X-ray diffraction owing to their conformational lability. Therefore, lacking precise structural data, 34 it remains to be ascertained whether the prediction of short rises corresponds to a genuine feature of unpaired helices, possibly due to the absence of the restraints imposed by the complementary strand, or whether it arises from deficiencies of the computational methodology. 25,35,36

Herein we have addressed that problem by performing geometry optimizations at the DFT level for several ss-DNA oligonucleotides of different compositions, ranging from dimeric to hexameric sequences. The present geometry optimizations of oligonucleotide sequences started from the B-DNA structure. Indeed, previous MD simulations have shown that many of the ss-oligonucleotides investigated here should assume a slightly distorted B-DNA configuration in solution. 8 As concerns the conformations of oligonucleotides, we have carried out a complete analysis of the base step and sugar conformational parameters obtained by the different methodologies.

Upon comparing the present results with structural information available from experimental data, it emerges that the rise coordinate is very difficult to model by DFT computations and that sufficiently extended basis sets have to be used. Furthermore, the present analysis indicates that several structural properties of double helices are already found in the corresponding single strands, especially for sequences rich in stacked adenines.

Computational details

Starting geometries for the adenine/guanine (A/G) nucleobase stack and the single stranded oligomers were built in B-DNA configuration by using the 3DNA software with the default model based on fiber diffraction of calf thymus DNA. 3DNA was also used to analyze geometries in terms of standard rigid body coordinates and torsion angles of the sugar-phosphate backbone. 37,38

DFT computations have been performed by employing the M06-2X and B3LYP functionals. 39–41 The latter was augmented with the empirical correction for dispersion energy proposed by Grimme. Both the D2 (B3LYP-D2) and the D3 (B3LYP-D3) parameterizations were used. 42,43 B3LYP-D3 was employed in conjunction with the damping function by Becke and Johnson. 44 DFT-D3 is expected to perform better than DFT-D2 because its dispersion coefficients were computed from first principles and were allowed to depend on coordination numbers. Moreover, 1/R⁸ as well as three-body terms were introduced in DFT-D3 to model dispersion. 43 Although pure functionals are expected to provide accuracy similar to hybrid ones for predicting equilibrium geometries of DNA segments, 23,25 we have limited the present analysis to hybrid functionals because we are interested in hole transfer in oxidized DNA, a phenomenon occurring through excited states with partial charge transfer character, for which pure functionals are not well suited. 45–49

Solvation effects were included in all geometry optimizations of oligonucleotides via the polarizable continuum model (PCM). 50 For all the oligonucleotides, the negative charges of the phosphate groups were neutralized by sodium ions, 21 except for 5′-AA-3′, whose negative charge was left unbalanced. Free valences of the oxygens at the 5′ and 3′ positions of the terminal deoxyribose units were saturated by hydrogen.

DFT computations were performed with the Gaussian package. 51 The Turbomole suite of programs was used for MP2 computations, carried out in conjunction with the resolution of identity and frozen core approximations. 52,53 The intermolecular basis set superposition error (BSSE) was computed according to the counterpoise (CP) scheme. 54 For each of the investigated systems, the same starting structure was used for the geometry optimizations carried out with the different methodologies, in order to prevent biases and reproducibility problems possibly originating from the flexibility of oligonucleotides. Vibrational contributions to the Gibbs free energy of the 5′-AA-3′ dinucleotide were computed in the harmonic approximation.

In the following, ma-TZVP denotes the minimally augmented def2-TZVP basis set proposed by Truhlar and coworkers, 55–57 and ma-TZVP(-f) is ma-TZVP without f functions on heavy atoms.
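The counterpoise scheme mentioned above can be illustrated with a minimal Python sketch. It assumes the usual two-fragment bookkeeping at a fixed geometry of the stack (each fragment evaluated once in its own basis and once with the ghost functions of its partner); the class and function names are hypothetical, the choice of the CP-corrected energy as the denominator of the ratio is an assumption, and the energies in the usage example are placeholders rather than values from this work.

```python
from dataclasses import dataclass

HARTREE_TO_KCAL_MOL = 627.5095  # conversion factor, hartree -> kcal/mol


@dataclass(frozen=True)
class StackEnergies:
    """Single-point energies (hartree) of a two-fragment stack at one fixed geometry."""
    complex_full: float  # A/B complex in the full (A+B) basis
    a_full: float        # fragment A with the ghost basis functions of B
    b_full: float        # fragment B with the ghost basis functions of A
    a_own: float         # fragment A in its own basis only
    b_own: float         # fragment B in its own basis only


def bsse(e: StackEnergies) -> float:
    """Counterpoise estimate of the basis set superposition error (hartree)."""
    return (e.a_own - e.a_full) + (e.b_own - e.b_full)


def interaction_uncorrected(e: StackEnergies) -> float:
    """Interaction energy with fragments in their own basis sets (hartree)."""
    return e.complex_full - e.a_own - e.b_own


def interaction_cp(e: StackEnergies) -> float:
    """Counterpoise-corrected interaction energy (hartree)."""
    return e.complex_full - e.a_full - e.b_full


def bsse_ratio(e: StackEnergies) -> float:
    """BSSE relative to the magnitude of the CP-corrected interaction energy.

    Assumption: the corrected energy is used as the denominator.
    """
    return bsse(e) / abs(interaction_cp(e))


if __name__ == "__main__":
    # Placeholder energies, chosen only to exercise the functions.
    demo = StackEnergies(-100.020, -50.003, -50.002, -50.000, -50.001)
    print(f"BSSE           = {bsse(demo) * HARTREE_TO_KCAL_MOL:.2f} kcal/mol")
    print(f"E_int (CP)     = {interaction_cp(demo) * HARTREE_TO_KCAL_MOL:.2f} kcal/mol")
    print(f"BSSE / |E_int| = {bsse_ratio(demo):.2f}")
```

By construction the corrected and uncorrected interaction energies differ exactly by the BSSE, so a small ratio signals a basis set that is close to complete for the stacking interaction.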

Results and discussion

A/G stack

In order to assess the influence of the basis set on the rise coordinate of DNA oligonucleotides, we started our investigation by optimizing a simple system composed only of nucleobases: the A/G stack, initially prepared in B-DNA configuration. Lacking the constraints imposed by the sugar-phosphate backbone, the rise coordinate of A/G is determined mainly by π-π stacking interactions, for which the incompleteness of the basis set is expected to play a major role. Different basis sets were employed, going from simply polarized double-ζ sets such as SV(P) up to the very large def2-QZVP of quadruple-ζ quality. Solvation effects were included via the PCM in order to mimic the conditions of single stranded oligonucleotides.

Rise coordinates obtained by the different functionals and basis sets are reported in Figure 1. The B3LYP-D and M06-2X functionals predict quite different inter-base distances for the A/G stack; the rise coordinate increases in the order M06-2X < B3LYP-D2 < B3LYP-D3, the converged values being 3.07, 3.13 and 3.26 Å, respectively. We anticipate that this is a general trend, as will be seen in detail in the next sections concerning single stranded oligonucleotides. Notably, the possible underestimation of inter-ring distances for stacked nucleobases by M06-2X has already been conjectured in previous studies. 21,36,58

[Figure 1 plot: rise (Å) versus number of basis set functions for the B3LYP-D3, B3LYP-D2 and M06-2X functionals; labeled basis sets: 6-31G(d,p), SV(P), 6-31+G(d,p), TZVP, def2-TZVP, ma-TZVP(-f), 6-311+G(2d,p), ma-TZVP, def2-QZVP.]

Figure 1: Rise coordinates of the A/G stack predicted by PCM computations employing different functionals and basis sets.

Figure 1 also shows that, for all the functionals, the inter-base distance generally increases upon enlarging the basis set, diffuse s and p functions being more effective than polarization functions, as seen e.g. by comparing the results of def2-TZVP with those of the ma-TZVP(-f) and def2-TZVPP sets. In passing from 6-31G(d,p), largely used in previous studies, to the def2-QZVP basis set, here taken as the reference, 43 the rise coordinate increases by ≈ 0.12 Å with all functionals, thus revealing the limitations of the smaller basis sets. Although a converged value for the rise is achieved only by using ma-TZVP(-f) or even larger basis sets (Figure 1), TZVP, which keeps computations of relatively large oligonucleotides feasible, is able to recover almost 70% of the gap between the worst and the best performing basis sets with all tested functionals.

The RMSD between the geometry predicted by a given basis set and the def2-QZVP one, reported in Table 1, decreases by a factor of three in passing from 6-31G(d,p) to TZVP and by a factor of two in passing from 6-31+G(d) to TZVP according to the B3LYP-D3 functional, similar conclusions also holding for the other tested methodologies (see Tables S1 and S2 in the Supporting Information). Furthermore, the data of Table 1 show that, at variance with shift, tilt and twist, which do not exhibit appreciable changes with the basis set, the slide and roll coordinates undergo more relevant variations, ≈ 0.20 Å and 4 degrees, respectively, in passing from 6-31G(d,p) to def2-QZVP. Since the predicted slide and roll are consistent with the values observed in B-DNA and since they are known to exert a large influence on helix conformation, 59 the basis set is expected to play an important role also in the conformations of single stranded DNA helices, which also include the sugar-phosphate backbone.

To further highlight the role of the basis set, we carried out an additional geometry optimization of the A/G stack at the second order Møller-Plesset (MP2) level of theory in conjunction with the cc-pVQZ basis set, 61 without including solvation but imposing geometrical constraints able to preserve the B-DNA configuration, as sketched in Figure S1 of the Supporting Information and fully detailed in ref 35. Then the interaction energy and the corresponding BSSE of the partially optimized A/G stack were computed at the DFT level with different basis sets, yielding the results reported in Table 2. Comparison of the data of Table 2 with the rise coordinates of Figure 1 shows that the basis sets which provide the smallest ratio between BSSE and interaction energy also yield rise coordinates closer to the def2-QZVP predictions for all tested functionals, thus demonstrating that the prediction of short rise coordinates by the smaller basis sets is largely due to their incompleteness.

The TZVP basis set, as large as 6-31++G(d,p), outperforms all tested double-ζ basis sets (including aug-cc-pVDZ), singly polarized triple-ζ basis sets with diffuse functions such as 6-311++G(d,p), and even the much more extended cc-pVTZ, giving the same BSSE corrected interaction energy as def2-QZVP for all the tested functionals.
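The statement that an intermediate basis set recovers a given fraction of the gap between the worst and the best performing basis sets can be made explicit with a short Python sketch. The function name is hypothetical, and the rise values in the usage example are illustrative placeholders (the text gives only the converged values and the ≈ 0.12 Å increase, not every individual point of Figure 1).

```python
def gap_recovered(value: float, worst: float, reference: float) -> float:
    """Fraction of the worst-to-reference gap recovered by an intermediate result.

    Returns 0.0 when `value` equals the worst result and 1.0 when it matches
    the reference (here, the largest basis set).
    """
    gap = reference - worst
    if gap == 0.0:
        raise ValueError("worst and reference results coincide; gap is undefined")
    return (value - worst) / gap


if __name__ == "__main__":
    # Illustrative placeholder rises in angstrom, not data taken from Figure 1.
    rise_small_basis = 3.14
    rise_reference = 3.26
    rise_intermediate = 3.22
    fraction = gap_recovered(rise_intermediate, rise_small_basis, rise_reference)
    print(f"fraction of the gap recovered: {fraction:.0%}")
```

The same helper applies to any scalar property that converges monotonically with the basis set, such as slide or the BSSE corrected interaction energy.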


To sum up, the preliminary analysis of the A/G stack shows that the 6-31G(d,p) basis set, largely used in previous studies on DNA oligonucleotides, including those carried out in our research group, 19–21,24 is rather incomplete and yields underestimated rise coordinates. The incompleteness of the basis set is known to be relevant for molecules with folded structures whose parts can interact via dispersion forces. For single stranded DNA, if the basis set is not flexible enough, as is the case for simply polarized double-ζ basis sets, a nucleobase can make its unoccupied basis functions available to adjacent nucleobases. That gives rise to an artificial lowering of the energy of the strand, achieved by bringing nucleobases too close to each other. Similar issues were already found for n-helicenes and peptides. 62,63 Most of the above shortcomings should be remedied by the TZVP basis set, which appears to ensure sufficient reliability at a still affordable computational cost. Therefore our analysis of DNA oligonucleotides will be carried out by comparing the equilibrium geometries obtained by the widely used 6-31G(d,p) basis set with those from TZVP predictions. 24

Single stranded oligonucleotides

5′-AA-3′ single strand

Two different initial configurations of the 5′-AA-3′ single stranded dinucleotide have been considered: the stacked one, in which nucleobases are arranged as in calf thymus B-DNA, and an unstacked conformer, here used for testing purposes, whose starting geometry was adapted from a snapshot of the 5′-TTAATT-3′ single strand from the MD simulations of ref 8 and then optimized by using the AMBER force field in vacuo. 64

Predicted base step parameters and puckering conformations for the stacked configuration of 5′-AA-3′ are reported in Table 3. According to all the methodologies listed in Table 3, the AA single stranded oligonucleotide initially prepared as in calf thymus DNA is predicted to preserve the B-DNA configuration. Indeed, independent of the method, the sugar units preferably assume the C3′-exo conformation at the 5′ side and C2′-endo or C1′-exo at the 3′ side, all of them consistent with B-DNA. 65 The twist coordinate also remains close to the standard value of B-DNA (36°), especially with the TZVP basis set, and the slide and roll coordinates also fall in the B-DNA region. 59 In addition, distances between the C1′ atoms of consecutive deoxyribose units are predicted to lie in the range 4.3-4.6 Å, closer to the value of regular B-DNA (4.9 Å) than to that of A-DNA (5.4 Å), 2 and adenines assume anti conformations with respect to the ribose units, as in regular DNA (Figure 2).

Figure 2: The M06-2X/6-31G(d,p) (a) and the B3LYP-D3/TZVP (b) optimized structures of the stacked configuration of 5′-AA-3′ (color) compared with the B-DNA form (green). Structures have been superimposed by using quaternion-based methods. 66,67

Conformations of phosphate groups in DNA are usually classified as BI or BII. The former is the most common in crystallographic DNA, while the latter becomes comparatively more populated in solution. Those arrangements depend on the ε and ζ torsions (Figure 3). BI corresponds to (trans, gauche−) conformations for ε and ζ, while BII corresponds to (gauche−, trans). The ε − ζ difference thus ranges in [−160, +20] degrees for the former case and [+20, +200] degrees for the latter. 2,68,69

Figure 3: Definition of the torsional angles of the DNA backbone (left), and illustration of the BI and BII conformations of phosphate units (right).

Although the starting configuration of the oligonucleotides generated by the 3DNA software gives ε − ζ = 19.7°, a value located at the border between the BI and BII regions, geometry optimizations yield the BI configuration with all the methodologies.

Even if similar geometries are obtained with the different basis sets for stacked 5′-AA-3′, a major variation is found in passing from 6-31G(d,p) to TZVP: the rise coordinate increases by ≈ 0.15 Å according to M06-2X computations and by ≈ 0.07 Å according to the B3LYP-D functionals (Table 3), thus further confirming that quite extended basis sets are needed for accurate predictions of the rise coordinate. The differences in the rise coordinates are clearly perceivable in Figure 2, where the M06-2X/6-31G(d,p) and B3LYP-D3/TZVP predicted structures are superimposed on the starting B-DNA configuration.

All the tested methodologies assign a larger stability to the stacked configuration of AA with respect to the unstacked dinucleotide, as shown in Table 4. The free energy differences between unstacked and stacked conformations are smaller than the electronic energy differences, indicating that vibrational entropy strongly contributes to the stability of the unfolded structures, as already found for similar systems. 36 However, it is worth noting for the present purposes that the 6-31G(d,p) basis set assigns a comparatively larger stability to the stacked form than TZVP does, amounting to ca. 1 kcal/mol in terms of potential energy and ca. 3 kcal/mol in terms of free energy (Table 4). That extra stabilization is clearly ascribable to the deficiencies of the smaller basis set.

The internal basis set superposition error also affects equilibrium geometries, as shown in Figure 4, where the optimized structures of the unstacked configuration of 5′-AA-3′ evaluated at the TZVP level have been superimposed on the 6-31G(d,p) equilibrium geometries. For all the functionals, a global elongation of the structure is found upon enlarging the basis set, as shown by the distances between the geometrical centers of the adenines, which increase by 1-2 Å in passing from 6-31G(d,p) to TZVP.
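The BI/BII assignment based on the ε − ζ difference, as defined in this section, can be sketched in a few lines of Python. The function names are hypothetical; the sketch assumes torsions given in degrees in either the [0, 360) or the (−180, 180] convention, and it assigns the shared boundary value of exactly +20° to BII, which is an arbitrary choice since the two quoted intervals meet at that point. The torsion pairs in the usage example are illustrative, not values from this work.

```python
def epsilon_minus_zeta(epsilon: float, zeta: float) -> float:
    """Return the epsilon - zeta difference wrapped into [-160, +200) degrees."""
    return (epsilon - zeta + 160.0) % 360.0 - 160.0


def classify_phosphate(epsilon: float, zeta: float) -> str:
    """Classify a phosphate conformation as 'BI' or 'BII'.

    BI  : epsilon - zeta in [-160, +20) degrees
    BII : epsilon - zeta in [+20, +200) degrees
    Assumption: the boundary value +20 degrees is assigned to BII.
    """
    return "BI" if epsilon_minus_zeta(epsilon, zeta) < 20.0 else "BII"


if __name__ == "__main__":
    # Illustrative (epsilon, zeta) pairs in degrees.
    for eps, zeta in [(184.0, 265.0), (246.0, 174.0), (-176.0, -95.0)]:
        diff = epsilon_minus_zeta(eps, zeta)
        print(f"eps={eps:7.1f} zeta={zeta:7.1f} diff={diff:7.1f} -> "
              f"{classify_phosphate(eps, zeta)}")
```

Wrapping the difference before comparing it with the threshold makes the classification independent of the angle convention used by the analysis program.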

Figure 4: Optimized structures of the unstacked configuration of 5′-AA-3′. Predicted TZVP equilibrium geometries (color) are superimposed on 6-31G(d,p) geometries (green). The distances (Å) between the geometrical centers of the adenines are predicted to be 11.3, 12.3, 10.0, 12.0, 10.2, and 12.3 in going from M06-2X/6-31G(d,p) to B3LYP-D3/TZVP.

Trimeric and hexameric single strands

We have considered in some detail the 5′-CAA-3′ and 5′-AAC-3′ sequences, where adenine tracts are end capped by cytosine (C), because M06-2X/6-31G(d,p) and B3LYP-D2/6-31G(d,p) equilibrium geometries are already available from previous work. 24 The predicted rise coordinates reported in Table 5 again show that, for both sequences, the averaged inter-ring distances increase in passing from M06-2X to B3LYP-D2 and from B3LYP-D2 to B3LYP-D3. With all functionals, an increase of ≈ 0.2-0.3 Å, even larger than that found for the A/G stack and the 5′-AA-3′ stacked dinucleotide, is predicted in passing from 6-31G(d,p) to TZVP, so that the rise coordinates obtained by the TZVP basis set are more in line with those inferred from X-ray structures of B-DNA, the best agreement with standard B-DNA being achieved by the B3LYP-D functionals.

[Figure 5 plots: roll (degrees) versus slide (Å) in three panels, with the A-DNA and B-DNA domains marked.]

Figure 5: Predicted slide and roll coordinates of trimeric single stranded oligonucleotides. Left: M06-2X; center: B3LYP-D2; right: B3LYP-D3. Circles: B(1)/B(2) steps; triangles: B(2)/B(3) steps. Black: CAA/6-31G(d,p); grey: CAA/TZVP; red: AAC/6-31G(d,p); orange: AAC/TZVP. The full line separates the domains of A- and B-DNA, see ref 59. Dashed lines indicate the average slide and roll of the starting B-DNA configuration.

Very similar patterns in the slide-roll diagram are obtained by the different functionals (Figure 5), all of them indicating significant distortions from regular B-DNA. For all the examined cases, slide and roll are found to depend more on the basis set than on the functional. Appreciable changes of the slide and roll coordinates are predicted for the A/A step in AAC upon varying the basis set (red and orange circles). With all the tested functionals, the roll coordinate is lowered by almost 10 degrees in passing from 6-31G(d,p) to TZVP, thus assuming values closer to canonical B-DNA, while slide coordinates increase by ca. 0.5 Å. For CAA, the A/A step (grey and black triangles) is predicted to assume a rigid BI-DNA conformation, corresponding to positive rolls with all the methodologies. 70

C/A and A/C steps appear to be even more sensitive to the adopted basis set. With all functionals, the C/A step of CAA has a largely negative roll coordinate (up to −16° with M06-2X) according to 6-31G(d,p) computations (black circles). In passing to TZVP (grey circles), a much more regular configuration is found, the roll coordinate assuming the standard B-DNA value and the slide amounting to ca. 1.0 Å. The opposite is found for the A/C step of AAC, which should assume a regular configuration according to 6-31G(d,p), while it should possess negative slide and high positive roll (orange triangles) according to TZVP. That variation is particularly relevant for the B3LYP based functionals, which predict that the A/C step should fall in the A-DNA type region. However no actual A-DNA configuration
is detected, since no C30 -endo conformation has been found, moreover distances between phosphor atoms are predicted to lie in the range 6.6-7.1 Å, values typical of B-DNA. In addition, helical twists are very close to actual twist coordinates for all the steps, as in B-DNA. 2,65 Therefore, also according to TZVP basis set, AAC and CAA assume B-DNA rather than A-DNA form, although strong deviations from the regular configuration are encountered for the steps involving cytosine. That result is in line with previous findings, since several studies indicate cytosine as the nucleobase more prone to give irregularities in single stranded DNA and RNA short sequences. 8,32,71–77 Results of Table 6 show that also twist coordinates depend more on basis set than on the adopted functional. Indeed average twist coordinates of ≈ 41◦ for AAC and ≈ 47◦ for CAA are predicted by the TZVP set, on average larger by ca 7 degrees than 6-31G(d,p) estimates and sensibly larger than the canonical value, 36◦ . The basis set appears to play a crucial role in determining the optimum geometry of trimeric single stranded sequences, having a large influence both on the orientation of consecutive nucleobases and on the conformations of sugar-phosphate backbone. However, strong deviations from the regular configuration have been encountered for the trimers. The latter only contain terminal steps, which are known to exhibit a larger conformational heterogeneity than internal positions even in double stranded DNA. 70,78 To further corroborate our conclusions, we have therefore considered also hexameric sequences. Oligonucleotides containing six nucleobases are longer than half turn of B-DNA helix and allow to include tetradic effects on the conformations of dinucleotide steps thanks to the presence of flanking nucleobases. 
70 We mainly focused our attention on hexamers containing consecutive adenines, a particularly intriguing case inasmuch as several experimental as well as theoretical studies have shown that among nucleobases, adenine as the highest propensity to associate by stacking interactions. 8,34,79,80 Indeed RNA and DNA single stranded sequences rich of adenine were found to assume ordered stacked conformations in solution by both NMR measurements and

14

ACS Paragon Plus Environment

Page 14 of 40

Page 15 of 40 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

The Journal of Physical Chemistry

MD simulations, 8,79,81 also the outcomes of voltammetric experiments rationalized with the aid of MD simulations and femtosecond transient absorption spectroscopy pointed toward the formation of regular structures for oligonucleotides comprising consecutive adenines. 8,34 We have studied the 50 -TTTATT-30 (hereafter A1), 50 -TTAATT-30 (A2), 50 -TTAAAT30 (A3), 50 -TAAAAT-30 (A4) ss-DNA nucleotides composed by adenine and thymine (T) with all the methodologies, while 50 -AAAAAA-30 (A6) has been treated only at the B3LYPD2/TZVP level. A2, A3 and A4 sequences were already investigated in ref 8, where MD simulations , A1-A4 were also studied and in ref 25 by DFT, but optimized structures were computed only with the B3LYP-D2 functional and only with the SV(P) basis set. Although very similar results are expected (and indeed obtained) in passing from SV(P) to 6-31G(d,p), herein we have newly evaluated the B3LYP-D2/6-31G(d,p) equilibrium geometries of A1-A4 rather than relying on previous SV(P) optimizations, in order to avoid spurious comparisons suffering from the differences in the basis set. Figure 6 shows that C10 -exo and C20 -endo –both conformations distinctive of B-DNA– are predicted to be the most populated puckerings for hexameric single strands by all methodologies, although a different behavior is exhibited by the different functionals. While no substantial variation of the puckering distribution is found by M06-2X functional in passing from 6-31G(d,p) to TZVP basis set, B3LYP-D computations predict a larger variety of populated conformations, including C30 -exo and C40 -endo, when the more flexible basis set is adopted. Noteworthy, all the predicted conformations of deoxyribose units are consistent with the B-DNA form. 65 Figure 7 shows that, similar to puckering distributions, also B3LYP-D estimates of slide and roll are more sensible to the basis set than M06-2X predictions. 
Very similar slide-roll diagrams are obtained by M06-2X upon changing the basis set (Figure 7, top), both patterns closely resembling the B3LYP-D ones at the TZVP level. All the points of Figure 7 fall in the region of B-DNA; however, the TZVP basis set allows for a wider range of conformations. Considering e.g. B3LYP-D3 (bottom panels of Figure 7), slide coordinates between −0.5 and


[Figure 6 pie charts. Panels: M06-2X, B3LYP-D2 and B3LYP-D3, each with the 6-31G(d,p) and TZVP basis sets; legend: C2'-endo, C1'-exo, O4'-endo, C3'-exo, C4'-endo.]

Figure 6: Distribution of the puckering conformations for A1-A4 single stranded nucleotides.

1.5 Å are predicted at the 6-31G(d,p) level, while slides up to 2.0 Å are found with the TZVP basis set. The same also applies to roll, covering the range [−18, 6] degrees according to 6-31G(d,p) and [−20, 13] degrees according to TZVP. Two main results emerge from the analysis of Figure 7: 1) the largest deviations from B-DNA are encountered for terminal steps, above all the ones at the 5' side (diamond symbols), and 2) A4, the single strand richest in adenine, is predicted to assume the most regular structure, at least with the TZVP basis set (blue symbols). Both outcomes are in line with the conclusions of previous investigations. 19,20,24,25,82

With the exception of A1, the average twist coordinates of hexamers (Table 7) are found to increase upon enlarging the basis set according to all tested functionals, similarly to the case of trimers (Table 6). Although the B3LYP-D functionals still predict values slightly larger than 36° with the TZVP set, the larger regularity of hexameric sequences is also shown by the twist coordinates, which assume values more in line with B-DNA than those of trimers. Also for hexamers, the predicted rise coordinates appear to be strongly dependent on the basis set. In passing from 6-31G(d,p) to TZVP, increments of ≈ 0.2-0.3 Å are found by all functionals, as shown in Figure 8. With the exception of A3 at the M06-2X/6-31G(d,p) level (Figure 8), the average rise is seen to increase, approaching the standard value of B-DNA, upon increasing the number of adjacent adenines in the strand, thus showing that steps


Figure 7: Predicted slide and roll coordinates of hexamers. Top: M06-2X; center: B3LYP-D2; bottom: B3LYP-D3. Left: 6-31G(d,p); right: TZVP. Black: A1; red: A2; green: A3; blue: A4. Different symbols denote the B(1)/B(2) (diamonds), B(2)/B(3), B(3)/B(4) (•), B(4)/B(5) and B(5)/B(6) steps. The average slide and roll of the starting B-DNA configurations are reported as dashed lines. The full line separates the A- and B-DNA forms, see ref 59.

involving adenine nucleobases tend to assume more regular rises. For A6, the rise coordinates (Figure 9) predicted at the B3LYP-D2/TZVP level amount to ≈ 3.2 Å, very close to the crystallographic value and appreciably higher (by up to 0.7 Å) than those of A1. The effect of adenine on the rise coordinate is also exerted on flanking thymines (Figure 8): in the A1 nucleotide the T(3)/A(4) and A(4)/T(5) steps exhibit higher rises than T/T steps. In addition, all methodologies predict that the inner steps of A4, which involve only adenines, should possess higher rises than the terminal steps. Indeed (Figure 10), the same rise as regular B-DNA (≈ 3.3-3.4 Å) is estimated for A/A steps by B3LYP-D3/TZVP calculations and a slightly lower value (3.2 Å) comes from B3LYP-D2 predictions, while M06-2X still underestimates the rises with respect to B-DNA. Identical conclusions are inferred for the A3 sequence.

Figure 8: Predicted average rise coordinates of A1-A4 single stranded DNA nucleotides. [Plot: average rise (Å) for A1, A2, A3 and A4; series: M06-2X, B3LYP-D2 and B3LYP-D3, each with the 6-31G(d,p) and TZVP basis sets.]

Figure 9: Predicted (B3LYP-D2/TZVP) rise coordinates of 5'-TTTATT-3' (red) and 5'-AAAAAA-3' (blue) single stranded DNA nucleotides. [Plot: rise (Å) for the five consecutive steps, A(1)/A(2) to A(5)/A(6) for 5'-AAAAAA-3' and T(1)/T(2), T(2)/T(3), T(3)/A(4), A(4)/T(5), T(5)/T(6) for 5'-TTTATT-3'.]

All the results suggest that a larger regularity of single stranded helices is achieved by the sequences with the larger number of consecutive adenines. The above conclusion has several analogies with double stranded DNA: AA/TT sequences were found to act as “rigid steps” in the solid state. 4 ApA segments have been recognized as the least flexible steps in solution, by


the analysis of twist, roll, and x-displacements inferred from a large data set of 31P chemical shifts. 70 Furthermore, in contrast to DNA sequences rich in ApA·TpT and ApT·TpA steps, only GC steps are found in the canonical Z form 83–85 of DNA, and domains containing only cytosine and guanine are particularly able to promote the B → A transition. 86–88

[Figure 10 plot: rise (Å) for the T(1)/A(2), A(2)/A(3), A(3)/A(4), A(4)/A(5) and A(5)/T(6) steps; series: M06-2X, B3LYP-D2 and B3LYP-D3, each with the 6-31G(d,p) and TZVP basis sets.]

Figure 10: Predicted rise coordinates of the 5'-TAAAAT-3' single strand.

The MD investigation of the A2, A3 and A4 ss-DNA nucleotides carried out in ref 8 showed that the above sequences should retain stacked arrangements in solution; therefore, we have not undertaken a systematic conformer search here. However, we have considered a further starting geometry for A4, based on the model of C-DNA by van Dam and Levitt, consisting of purely BII nucleotides with helical twist of 40°, helical rise of 3.32 Å and rise of 3.96 Å. 69 The B3LYP-D3/TZVP equilibrium geometry of the A4 single strand optimized by starting from the C form is reported in Figure 11, where it has been superimposed on: a) the starting C-DNA geometry, b) the geometry of the same sequence adopting the standard calf-thymus B-DNA arrangement. It is seen that the structure of A4 obtained by starting from C-DNA is strongly distorted, but still assumes a B-DNA configuration dominated by BI conformations of the phosphate. Moreover, the optimized structure of A4 starting from the C form is much less regular than that obtained starting from the calf-thymus B form (see Tables S3-S6 of the Supporting Information for a comparison) and is predicted to be 6.3 kcal/mol less stable


than A4 optimized by starting with the B-DNA structure. Overall, these results suggest that in an aqueous environment the B-DNA motif should possess a large stability also at the level of the single strand, at least for sequences rich in stacked adenines.
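To give a feeling for what an energy gap of 6.3 kcal/mol means, the sketch below converts it into a Boltzmann population ratio. It rests on assumptions that are not part of the computations reported here: the energy difference is treated as if it were a free energy difference, the temperature is set to 298.15 K, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def boltzmann_ratio(delta_e_kcal, temperature=298.15):
    """Population of the higher-energy structure relative to the lower-energy one."""
    return math.exp(-delta_e_kcal / (R_KCAL * temperature))


# 6.3 kcal/mol is the gap quoted in the text; 298.15 K is an assumed temperature.
ratio = boltzmann_ratio(6.3)
print(f"relative population of the C-derived structure: {ratio:.1e}")
```

Under these assumptions the structure obtained from the C form would be populated at the level of a few parts in 10^5 relative to the one obtained from the B form, consistent with the qualitative statement that the B-DNA motif is strongly favored.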

Figure 11: Equilibrium geometry (B3LYP-D3/TZVP) of the A4 single strand optimized from a starting C-DNA configuration (color) superimposed on the initial C-DNA geometry (a, green) and on the standard B-DNA geometry (b, green).

Comparison with experimental data

Given the limited information available for short single stranded DNA oligonucleotides, a direct comparison between the present theoretical predictions and experimental data is not possible for the specific sequences investigated here. Nevertheless, the structure of the single stranded 5'-GAAAAC-3' (GA4C) DNA nucleotide, very similar to the systems investigated here, was resolved by MD simulations adopting geometrical constraints consistent with the outcomes of 2D-NMR NOESY measurements carried out in solution. 79 A strongly regular arrangement was detected for GA4C, closely resembling the structure of canonical ds-DNA, in accord with our predictions for the A4 and A6 single strands. In particular, rise coordinates amounting to ≈ 3.5 Å were found for all the steps of GA4C, in excellent agreement with our main conclusion that single stranded adenine-rich sequences tend to adopt regular rises. That result


further testifies that DFT is able to predict reliable rise coordinates for single strands only if sufficiently extended basis sets are used in computations. Moreover, twist coordinates larger than ≈ 40° were inferred for GA4C, again in better agreement with the predictions of the TZVP basis set, which yields twist angles larger than 36° and almost systematically larger than the 6-31G(d,p) estimates, see Tables 6 and 7.

Table 8 reports average rise coordinates of dinucleotide steps evaluated by using the experimental structures of ds-DNA segments containing the sequences investigated here in the form of single strands. 89 Several systems of Table 8 were used in the recent reparameterization of the AMBER force field carried out by Orozco and coworkers. 17 Herein, to make the comparison between predicted and observed data meaningful, we have selected systems containing only DNA, i.e. without ligands, proteins or RNA. Comparison of the rise coordinates of Table 8 for the CAA and AAC sequences with those of Table 5 shows that for C/A, A/A and A/C steps the best agreement with experimental rises comes from B3LYP-D2 predictions in conjunction with the TZVP basis set, while, independent of the method, 6-31G(d,p) gives appreciably underestimated values. An even more significant agreement is found between experimental and predicted rises for the A1-A4 oligonucleotides. The data of Table 8 indeed show that A/A steps assume on average rise coordinates closer to the canonical value (3.3 Å) than T/A and A/T steps, while shorter interbase distances are found for T/T steps, as indeed predicted by the present DFT computations; again, a better numerical agreement with experiment is found if the TZVP basis set is employed in computations. In particular, the observed rise coordinates for A/A steps in A6 are all very close to 3.2 Å, in excellent agreement with B3LYP-D2/TZVP predictions, see Figure 9.
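The twist and rise values discussed above can be related to the overall shape of a helix through simple arithmetic: a regular helix completes one turn every 360°/twist residues, and its pitch is the rise multiplied by that number. The sketch below is purely illustrative; it assumes an idealized regular helix in which the local step twist and rise coincide with the helical twist and helical rise (which is only approximately true when slide, roll and tilt are non-zero), and the function names are hypothetical.

```python
def residues_per_turn(twist_deg):
    """Number of residues needed to complete one helical turn."""
    return 360.0 / twist_deg


def pitch(rise_angstrom, twist_deg):
    """Axial advance (Å) per full helical turn of an idealized regular helix."""
    return rise_angstrom * residues_per_turn(twist_deg)


# Values quoted in the text: canonical B-DNA (36°, 3.3 Å) and the
# approximate GA4C estimates from ref 79 (≈ 40°, ≈ 3.5 Å).
for label, twist, rise in [("canonical B-DNA", 36.0, 3.3), ("GA4C (approx.)", 40.0, 3.5)]:
    print(f"{label}: {residues_per_turn(twist):.1f} residues/turn, pitch {pitch(rise, twist):.1f} Å")
```

In this idealized picture, a twist above 36° corresponds to fewer than ten residues per turn, so a somewhat larger rise is needed to retain a comparable pitch.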

Conclusions

DFT computations carried out with functionals capable of handling dispersion forces and including solvation effects predict that short single stranded DNA oligonucleotides assume


configurations close to the B-DNA form, typical of double helices. By using sufficiently extended basis sets in computations, it is shown that the lack of the restraints imposed by the pairing of the nucleobases with those of the complementary strand does not affect the rise coordinates of single strands to a large extent. This is in contrast with the outcomes of previous DFT studies, which found single strands appreciably shorter than double helices. The previous results thus appear to originate from the incompleteness of the basis sets adopted in computations, with simply polarized double-ζ basis sets yielding appreciably shorter rise coordinates than triple-ζ sets. The need for quite extended basis sets in order to achieve accurate equilibrium geometries further underlines the challenge facing the quantum chemical modeling of DNA in solution. Among the tested methodologies, the B3LYP-D functionals predict rise coordinates for single strands in closer agreement with experimental values, while the inter-base distances estimated by M06-2X are on average shorter by ≈ 0.15 Å. By comparing single strands of different composition, it emerges that the sequences rich in stacked adenines are predicted to assume the most regular arrangements, in agreement with experimental evidence, and that adenine also induces flanking nucleobases to assume a more regular coiling. Upon distorting the initial geometry from standard B-DNA, optimizations lead to less regular arrangements at appreciably higher energy, which nevertheless still assume a B-DNA configuration. Although a necessarily limited number of cases could be explored here, the present study confirms that the B-DNA motif is a favorable minimum-energy structure also for single strands, because it ensures a good compromise between optimum base-stacking arrangements and backbone conformation.

Acknowledgement

The financial support of the University of Salerno is gratefully acknowledged. A. C. acknowledges the CINECA award HP10CYW18T under the ISCRA initiative, for the availability of high performance computing resources.

Supporting Information Available

Base step parameters of the A/G stack obtained by B3LYP-D2 and M06-2X computations (Tables S1-S2). Comparison of structural parameters for the 5'-TAAAAT-3' single strand at B- and C-DNA configurations (Tables S4-S6). Adopted constraints for the partial geometry optimization of the A/G stack at the MP2 level (Figure S1). Cartesian coordinates.


References

(1) Tinoco, I.; Bustamante, C. How RNA Folds. J. Mol. Biol. 1999, 293, 271–281.
(2) Lu, X.-J.; Shakked, Z.; Olson, W. A-Form Conformational Motifs in Ligand-Bound DNA Structures. J. Mol. Biol. 2000, 300, 819–840.
(3) Luscombe, N. M.; Laskowski, R. A.; Thornton, J. M. Amino Acid-Base Interactions: A Three-Dimensional Analysis of Protein-DNA Interactions at an Atomic Level. Nucleic Acids Res. 2001, 29, 2860–2874.
(4) Calladine, C. R.; Drew, H. R.; Luisi, B. F.; Travers, A. A. Understanding DNA, 3rd Ed.; Elsevier Academic Press: Oxford, 2004; Chapter 3.
(5) Tolstorukov, M. Y.; Colasanti, A. V.; McCandlish, D. M.; Olson, W. K.; Zhurkin, V. B. A Novel Roll-and-Slide Mechanism of DNA Folding in Chromatin: Implications for Nucleosome Positioning. J. Mol. Biol. 2007, 371, 725–738.
(6) Caruso, T.; Capobianco, A.; Peluso, A. The Oxidation Potential of Adenosine and Adenosine-Thymidine Base-Pair in Chloroform Solution. J. Am. Chem. Soc. 2007, 129, 15347–15353.
(7) Caruso, T.; Carotenuto, M.; Vasca, E.; Peluso, A. Direct Experimental Observation of the Effect of the Base Pairing on the Oxidation Potential of Guanine. J. Am. Chem. Soc. 2005, 127, 15040–15041.
(8) Capobianco, A.; Caruso, T.; Celentano, M.; D’Ursi, A. M.; Scrima, M.; Peluso, A. Stacking Interactions between Adenines in Oxidized Oligonucleotides. J. Phys. Chem. B 2013, 117, 8947–8953.
(9) Capobianco, A.; Caruso, T.; Celentano, M.; La Rocca, M. V.; Peluso, A. Proton Transfer in Oxidized Adenosine Self-Aggregates. J. Chem. Phys. 2013, 139, 145101–4.


(10) Capobianco, A.; Caruso, T.; D’Ursi, A. M.; Fusco, S.; Masi, A.; Scrima, M.; Chatgilialoglu, C.; Peluso, A. Delocalized Hole Domains in Guanine-Rich DNA Oligonucleotides. J. Phys. Chem. B 2015, 119, 5462–5466.
(11) Bommarito, S.; Peyret, N.; SantaLucia, J. J. Thermodynamic Parameters for DNA Sequences with Dangling Ends. Nucleic Acids Res. 2000, 28, 1929–1934.
(12) Kool, E. T. Hydrogen Bonding, Base Stacking, and Steric Effects in DNA Replication. Annu. Rev. Biophys. Biomol. Struct. 2001, 30, 1–22.
(13) Yakovchuk, P.; Protozanova, E.; Frank-Kamenetskii, M. D. Base-Stacking and Base-Pairing Contributions into Thermal Stability of the DNA Double Helix. Nucleic Acids Res. 2006, 34, 564–574.
(14) Lebrun, A.; Lavery, R. Modelling Extreme Stretching of DNA. Nucleic Acids Res. 1996, 24, 2260–2267.
(15) Rezác, J.; Hobza, P.; Harris, S. A. Stretched DNA Investigated Using Molecular-Dynamics and Quantum-Mechanical Calculations. Biophys. J. 2010, 98, 101–110.
(16) Lavery, R.; Zakrzewska, K.; Beveridge, D.; Bishop, T. C.; Case, D. A.; Cheatham, T., III; Dixit, S.; Jayaram, B.; Lankas, F.; Laughton, C. et al. A Systematic Molecular Dynamics Study of Nearest-Neighbor Effects on Base Pair and Base Pair Step Conformations and Fluctuations in B-DNA. Nucleic Acids Res. 2010, 38, 299–313.
(17) Ivani, I.; Dans, P. D.; Noy, A.; Pérez, A.; Faustino, I.; Hospital, A.; Walther, J.; Andrio, P.; Goñi, R.; Balaceanu, A. et al. Parmbsc1: A Refined Force Field for DNA Simulations. Nat. Methods 2015, 13, 55–58.
(18) Sponer, J.; Riley, K. E.; Hobza, P. Nature and Magnitude of Aromatic Stacking of Nucleic Acid Bases. Phys. Chem. Chem. Phys. 2008, 10, 2595–2610.


(19) Zubatiuk, T. A.; Shishkin, O. V.; Gorb, L.; Hovorun, D. M.; Leszczynski, J. B-DNA Characteristics Are Preserved in Double Stranded d(A)3·d(T)3 and d(G)3·d(C)3 Mini-Helixes: Conclusions from DFT/M06-2X Study. Phys. Chem. Chem. Phys. 2013, 15, 18155–18166.
(20) Zubatiuk, T.; Kukuev, M. A.; Korolyova, A. S.; Gorb, L.; Nyporko, A.; Hovorun, D.; Leszczynski, J. Structure and Binding Energy of Double-Stranded A-DNA Mini-helices: Quantum-Chemical Study. J. Phys. Chem. B 2015, 119, 12741–12749.
(21) Churchill, C. D. M.; Wetmore, S. D. Developing a Computational Model that Accurately Reproduces the Structural Features of a Dinucleoside Monophosphate Unit within B-DNA. Phys. Chem. Chem. Phys. 2011, 13, 16373–16383.
(22) Gu, J.; Wang, J.; Leszczynski, J. Stacking and H-bonding Patterns of dGpdC and dGpdCpdG: Performance of the M05-2X and M06-2X Minnesota Density Functionals for the Single Strand DNA. Chem. Phys. Lett. 2011, 512, 108–112.
(23) Barone, G.; Fonseca Guerra, C.; Bickelhaupt, F. M. B-DNA Structure and Stability as Function of Nucleic Acid Composition: Dispersion-Corrected DFT Study of Dinucleoside Monophosphate Single and Double Strands. ChemistryOpen 2013, 2, 186–193.
(24) Capobianco, A.; Peluso, A. The Oxidization Potential of AA Steps in Single Strand DNA Oligomers. RSC Adv. 2014, 4, 47887–47893.
(25) Capobianco, A.; Caruso, T.; Peluso, A. Hole Delocalization over Adenine Tracts in Single Stranded DNA Oligonucleotides. Phys. Chem. Chem. Phys. 2015, 17, 4750–4756.
(26) Senthilkumar, K.; Grozema, F. C.; Fonseca Guerra, C.; Bickelhaupt, F. M.; Lewis, F. D.; Berlin, Y. A.; Ratner, M. A.; Siebbeles, L. D. A. Absolute Rates of Hole Transfer in DNA. J. Am. Chem. Soc. 2005, 127, 14894–14903.


(27) Sugiyama, H.; Saito, I. Theoretical Studies of GG-Specific Photocleavage of DNA via Electron Transfer: Significant Lowering of Ionization Potential and 5'-Localization of HOMO of Stacked GG Bases in B-form DNA. J. Am. Chem. Soc. 1996, 118, 7063–7068.
(28) Etheve, L.; Martin, J.; Lavery, R. Decomposing Protein-DNA Binding and Recognition Using Simplified Protein Models. Nucleic Acids Res. 2017, 45, 10270–10283.
(29) Clore, G. M.; Gronenborn, A. M. An Investigation into the Solution Structure of the Single-Stranded DNA Undecamer 5'd AAGTGTGATAT by Means of Nuclear Overhauser Enhancement Measurements. Eur. Biophys. J. 1984, 11, 95–102.
(30) Norberg, J.; Nilsson, L. Temperature Dependence of the Stacking Propensity of Adenylyl-3',5'-Adenosine. J. Phys. Chem. 1995, 99, 13056–13058.
(31) Norberg, J.; Nilsson, L. Solvent Influence on Base Stacking. Biophys. J. 1998, 74, 394–402.
(32) Jafilan, S.; Klein, L.; Hyun, C.; Florián, J. Intramolecular Base Stacking of Dinucleoside Monophosphate Anions in Aqueous Solution. J. Phys. Chem. B 2012, 116, 3613–3618.
(33) Brown, R. F.; Andrews, C. T.; Elcock, A. H. Stacking Free Energies of All DNA and RNA Nucleoside Pairs and Dinucleoside-Monophosphates Computed Using Recently Revised AMBER Parameters and Compared with Experiment. J. Chem. Theory Comput. 2015, 11, 2315–2328.
(34) Chen, J.; Kohler, B. Base Stacking in Adenosine Dimers Revealed by Femtosecond Transient Absorption Spectroscopy. J. Am. Chem. Soc. 2014, 136, 6362–6372.
(35) Capobianco, A.; Landi, A.; Peluso, A. Modeling DNA Oxidation in Water. Phys. Chem. Chem. Phys. 2017, 19, 13571–13571.


(36) Smith, D. A.; Holroyd, L. F.; van Mourik, T.; Jones, A. C. A DFT Study of 2-Aminopurine-Containing Dinucleotides: Prediction of Stacked Conformations with B-DNA Structure. Phys. Chem. Chem. Phys. 2016, 18, 14691–14700.
(37) Lu, X.-J.; Olson, W. K. 3DNA: A Versatile, Integrated Software System for the Analysis, Rebuilding and Visualization of Three-dimensional Nucleic-Acid Structures. Nat. Protoc. 2008, 3, 1213–1227.
(38) Lu, X.-J.; Olson, W. K. 3DNA: A Software Package for the Analysis, Rebuilding and Visualization of Three-Dimensional Nucleic Acid Structures. Nucleic Acids Res. 2003, 31, 5108–5121.
(39) Zhao, Y.; Truhlar, D. G. The M06 Suite of Density Functionals for Main Group Thermochemistry, Thermochemical Kinetics, Noncovalent Interactions, Excited States, and Transition Elements: Two New Functionals and Systematic Testing of Four M06-class Functionals and 12 Other Functionals. Theor. Chem. Acc. 2007, 120, 215–241.
(40) Becke, A. D. Density-Functional Thermochemistry. III. The Role of Exact Exchange. J. Chem. Phys. 1993, 98, 5648–5652.
(41) Stephens, P. J.; Devlin, F. J.; Chabalowski, C. F.; Frisch, M. J. Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem. 1994, 98, 11623–11627.
(42) Grimme, S. Semiempirical GGA-Type Density Functional Constructed with a Long-Range Dispersion Correction. J. Comput. Chem. 2006, 27, 1787–1799.
(43) Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A Consistent and Accurate Ab Initio Parametrization of Density Functional Dispersion Correction (DFT-D) for the 94 Elements H-Pu. J. Chem. Phys. 2010, 132, 154104–19.


(44) Grimme, S.; Ehrlich, S.; Goerigk, L. Effect of the Damping Function in Dispersion Corrected Density Functional Theory. J. Comput. Chem. 2011, 32, 1456–1465.
(45) Capobianco, A.; Carotenuto, M.; Caruso, T.; Peluso, A. The Charge-Transfer Band of an Oxidized Watson-Crick Guanosine-Cytidine Complex. Angew. Chem. Int. Ed. 2009, 48, 9526–9528.
(46) Borrelli, R.; Capobianco, A.; Landi, A.; Peluso, A. Vibronic Couplings and Coherent Electron Transfer in Bridged Systems. Phys. Chem. Chem. Phys. 2015, 17, 30937–30945.
(47) Borrelli, R.; Capobianco, A.; Peluso, A. Hole Hopping Rates in Single Strand Oligonucleotides. Chem. Phys. 2014, 440, 25–30.
(48) Capobianco, A.; Velardo, A.; Peluso, A. DFT Predictions of the Oxidation Potential of Organic Dyes for Opto-Electronic Devices. Comp. Theor. Chem. 2015, 1070, 68–75.
(49) Capobianco, A.; Borrelli, R.; Landi, A.; Velardo, A.; Peluso, A. Absorption Band Shapes of a Push-Pull Dye Approaching the Cyanine Limit: A Challenging Case for First Principle Calculations. J. Phys. Chem. A 2016, 120, 5581–5589.
(50) Miertuš, S.; Scrocco, E.; Tomasi, J. Electrostatic Interaction of a Solute with a Continuum. A Direct Utilization of Ab Initio Molecular Potentials for the Prevision of Solvent Effects. Chem. Phys. 1981, 55, 117–129.
(51) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B. et al. Gaussian 09, Revision D.01; Gaussian, Inc.: Wallingford, CT, 2009.
(52) TURBOMOLE V6.3.1 2011, a development of University of Karlsruhe and Forschungszentrum Karlsruhe GmbH, 1989-2007, TURBOMOLE GmbH, since 2007; available from http://www.turbomole.com.


(53) Weigend, F.; Köhn, A.; Hättig, C. Efficient Use of the Correlation Consistent Basis Sets in Resolution of the Identity MP2 Calculations. J. Chem. Phys. 2002, 116, 3175–3183.
(54) Boys, S. F.; Bernardi, F. The Calculation of Small Molecular Interactions by the Differences of Separate Total Energies. Some Procedures with Reduced Errors. Mol. Phys. 1970, 19, 553–566.
(55) Schaefer, A.; Huber, C.; Ahlrichs, R. Fully Optimized Contracted Gaussian-Basis Sets of Triple Zeta Valence Quality for Atoms Li to Kr. J. Chem. Phys. 1994, 100, 5829–5835.
(56) Weigend, F.; Ahlrichs, R. Balanced Basis Sets of Split Valence, Triple Zeta Valence and Quadruple Zeta Valence Quality for H to Rn: Design and Assessment of Accuracy. Phys. Chem. Chem. Phys. 2005, 7, 3297–3305.
(57) Zheng, J.; Xu, X.; Truhlar, D. G. Minimally Augmented Karlsruhe Basis Sets. Theor. Chem. Acc. 2011, 128, 295–305.
(58) Hunter, R. S.; van Mourik, T. DNA Base Stacking: The Stacked Uracil/Uracil and Thymine/Thymine Minima. J. Comput. Chem. 2012, 33, 2161–2172.
(59) El Hassan, M. A.; Calladine, C. R. Conformational Characteristics of DNA: Empirical Classifications and a Hypothesis for the Conformational Behaviour of Dinucleotide Steps. Phil. Trans. R. Soc. A 1997, 355, 43–100.
(60) Hehre, W. J.; Radom, L.; Schleyer, P. R.; Pople, J. A. Ab Initio Molecular Orbital Theory; Wiley: New York, 1986.
(61) Dunning, T. H. Gaussian Basis Sets for Use in Correlated Molecular Calculations. I. The Atoms Boron through Neon and Hydrogen. J. Chem. Phys. 1989, 90, 1007–1023.
(62) Valdés, H.; Klusák, V.; Pitonák, M.; Exner, O.; Starý, I.; Hobza, P.; Rulíšek, L. Evaluation of the Intramolecular Basis Set Superposition Error in the Calculations of Larger Molecules: [n]Helicenes and Phe-Gly-Phe Tripeptide. J. Comput. Chem. 2008, 29, 861–870.
(63) Kruse, H.; Grimme, S. A Geometrical Correction for the Inter- and Intra-Molecular Basis Set Superposition Error in Hartree-Fock and Density Functional Theory Calculations for Large Systems. J. Chem. Phys. 2012, 136, 154101–16.
(64) Case, D. A.; Darden, T. A.; Cheatham III, T. E.; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Walker, R. C.; Zhang, W.; Merz, K. M. et al. AMBER 11. University of California, San Francisco, 2011.
(65) Dickerson, R. E. In International Tables for Crystallography Volume F: Crystallography of Biological Macromolecules; Rossmann, R. G., Arnold, E., Eds.; Kluwer Academic Press: Dordrecht, 2001; pp 588–622.
(66) Horn, B. K. P. Closed-Form Solution of Absolute Orientation Using Unit Quaternions. J. Opt. Soc. Am. A 1987, 4, 629–642.
(67) Coutsias, E. A.; Seok, C.; Dill, K. A. Using Quaternions to Calculate RMSD. J. Comput. Chem. 2004, 25, 1849–1857.
(68) Hartmann, B.; Piazzola, D.; Lavery, R. BI-BII Transitions in B-DNA. Nucleic Acids Res. 1993, 21, 561–568.
(69) van Dam, L.; Levitt, M. H. BII Nucleotides in the B and C Forms of Natural-sequence Polymeric DNA: A New Model for the C Form of DNA. J. Mol. Biol. 2000, 304, 541–561.
(70) Heddi, B.; Oguey, C.; Lavelle, C.; Foloppe, N.; Hartmann, B. Intrinsic Flexibility of B-DNA: The Experimental TRX Scale. Nucleic Acids Res. 2010, 38, 1034–1047.
(71) Sen, S.; Nilsson, L. MD Simulations of Homomorphous PNA, DNA, and RNA Single Strands: Characterization and Comparison of Conformations and Dynamics. J. Am. Chem. Soc. 2001, 123, 7414–7422.
(72) Norberg, J.; Nilsson, L. Potential of Mean Force Calculations of the Stacking-Unstacking Process in Single Stranded Deoxyribonucleoside Monophosphates. Biophys. J. 1995, 69, 2277–2285.
(73) Norberg, J.; Nilsson, L. Influence of Adjacent Bases on the Stacking-Unstacking Process of Single-stranded Oligonucleotides. Biopolymers 1996, 39, 765–768.
(74) Erie, D. A.; Breslauer, K. J.; Olson, W. K. A Monte Carlo Method for Generating Structures of Short Single-Stranded DNA Sequences. Biopolymers 1993, 33, 75–105.
(75) Vokáčová, Z.; Buděšínský, M.; Rosenberg, I.; Schneider, B.; Šponer, J.; Sychrovský, V. Structure and Dynamics of the ApA, ApC, CpA, and CpC RNA Dinucleoside Monophosphates Resolved with NMR Scalar Spin-Spin Couplings. J. Phys. Chem. B 2009, 113, 1182–1191.
(76) Pearlman, D. A.; Kim, S.-H. Conformational Studies of Nucleic Acids. V. Sequence Specificities in the Conformational Energetics of Oligonucleotides: The Homo-Tetramers. Biopolymers 1988, 27, 59–77.
(77) Kabelác, M.; Hobza, P. Potential Energy and Free Energy Surfaces of all Ten Canonical and Methylated Nucleic Acid Base Pairs: Molecular Dynamics and Quantum Chemical ab initio Studies. J. Phys. Chem. B 2001, 105, 5804–5817.
(78) Zgarbová, M.; Otyepka, M.; Šponer, J.; Lankaš, F.; Jurečka, P. Base Pair Fraying in Molecular Dynamics Simulations of DNA and RNA. J. Chem. Theory Comput. 2014, 10, 3177–3189.
(79) Isaksson, J.; Acharya, S.; Barman, J.; Cheruku, P.; Chattopadhyaya, J. Single-Stranded Adenine-Rich DNA and RNA Retain Structural Characteristics of Their Respective Double-Stranded Conformations and Show Directional Differences in Stacking Pattern. Biochemistry 2004, 43, 15996–16010.
(80) Ke, C.; Humeniuk, M.; S-Gracz, H.; Marszalek, P. E. Direct Measurements of Base Stacking Interactions in DNA by Single-Molecule Atomic-Force Spectroscopy. Phys. Rev. Lett. 2007, 99, 018302.
(81) Norberg, J.; Nilsson, L. Conformational Free Energy Landscape of ApApA from Molecular Dynamics Simulations. J. Phys. Chem. 1996, 100, 2550–2554.
(82) Adhikary, A.; Kumar, A.; Khanduri, D.; Sevilla, M. D. Effect of Base Stacking on the Acid-Base Properties of the Adenine Cation Radical [A·]+ in Solution: ESR and DFT Studies. J. Am. Chem. Soc. 2008, 130, 10282–10292.
(83) Ho, P. S. The Non-B-DNA Structure of d(CA/TG)n Does Not Differ from That of Z-DNA. Proc. Natl. Acad. Sci. USA 1994, 91, 9549–9553.
(84) Johnston, B. H.; Ohara, W.; Rich, A. Stochastic Distribution of a Short Region of Z-DNA within a Long Repeated Sequence in Negatively Supercoiled Plasmids. J. Biol. Chem. 1988, 263, 4512–4515.
(85) Johnston, B. H.; Rich, A. Chemical Probes of DNA Conformation: Detection of Z-DNA at Nucleotide Resolution. Cell 1985, 42, 713–724.
(86) Foloppe, N.; MacKerell, A. D. J. Intrinsic Conformational Properties of Deoxyribonucleosides: Implicated Role for Cytosine in the Equilibrium among the A, B, and Z Forms of DNA. Biophys. J. 1999, 76, 3206–3218.
(87) Hays, F. A.; Teegarden, A.; Jones, Z. J.; Harms, M.; Raup, D.; Watson, J.; Cavaliere, E.; Ho, P. S. How Sequence Defines Structure: A Crystallographic Map of DNA Structure and Conformation. Proc. Natl. Acad. Sci. USA 2005, 102, 7157–7162.


(88) Peticolas, W. L.; Wang, Y.; Thomas, G. A. Some Rules for Predicting the Base-Sequence Dependence of DNA Conformation. Proc. Natl. Acad. Sci. USA 1988, 85, 2579–2583.
(89) Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242, https://www.rcsb.org.


Table 1: Predicted (PCM/B3LYP-D3) base step parameters (shift, slide, rise in Å; tilt, roll, twist in degrees) for the A/G stack by using different basis sets.

| Basis set | Ncf^a | Shift | Slide | Rise | Tilt | Roll | Twist | RMSD^b |
|---|---|---|---|---|---|---|---|---|
| SV(P) | 314 | 1.50 | -0.17 | 3.14 | 2.56 | 1.42 | 40.79 | 0.09 |
| 6-31G(d,p)^c | 365 | 1.52 | -0.11 | 3.13 | 3.19 | 3.20 | 40.12 | 0.11 |
| 6-31+G(d) | 419 | 1.47 | -0.19 | 3.20 | 2.92 | -0.48 | 40.19 | 0.08 |
| 6-31+G(d,p) | 449 | 1.46 | -0.18 | 3.21 | 2.84 | -0.73 | 40.27 | 0.07 |
| TZVP | 459 | 1.48 | -0.13 | 3.22 | 2.85 | -0.23 | 41.11 | 0.04 |
| 6-311+G(2d,p) | 627 | 1.48 | 0.00 | 3.24 | 2.44 | -0.52 | 41.15 | 0.02 |
| ma-TZVP(-f) | 648 | 1.50 | 0.00 | 3.25 | 2.62 | -0.58 | 41.07 | 0.02 |
| def2-TZVP | 791 | 1.50 | -0.01 | 3.24 | 2.61 | -0.37 | 41.63 | 0.01 |
| ma-TZVP | 795 | 1.48 | -0.02 | 3.26 | 2.54 | -0.76 | 41.09 | 0.01 |
| def2-QZVP | 1497 | 1.49 | -0.01 | 3.26 | 2.60 | -0.67 | 41.52 | – |

^a Number of contracted basis functions. ^b RMSD (Å) vs def2-QZVP equilibrium geometry. ^c Spherical functions were used for all basis sets except those of the 6-31G family, for which Cartesian d functions were adopted as in the original formulation.
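Footnote b of Table 1 reports an RMSD against the def2-QZVP equilibrium geometry. The source does not say how the two structures were superimposed, so the following is only a minimal sketch of the RMSD arithmetic itself, assuming both geometries list the same atoms in the same order and already share a common frame. The function name `rmsd` and the tuple-based coordinate format are illustrative choices, not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples in the same units
    (e.g. Å). No superposition is performed here: the caller is assumed
    to have aligned the two structures beforehand.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("geometries must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

With real data, `coords_a` would hold the geometry optimized with a given basis set and `coords_b` the def2-QZVP reference.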


Table 2: Gas phase interaction energy and BSSE (kcal/mol) of the A/G stack at the constrained MP2/cc-pVQZ optimized geometry preserving the B-DNA configuration, evaluated with different functionals and basis sets.

| Basis set | Ncf^b | B3LYP-D3^a Eint^c | B3LYP-D3^a BSSE | M06-2X Eint^c | M06-2X BSSE |
|---|---|---|---|---|---|
| SV(P) | 314 | 8.4 | 3.0 | 6.4 | 2.6 |
| 6-31G(d) | 335 | 7.9 | 2.9 | 5.3 | 2.5 |
| cc-pVDZ | 344 | 8.0 | 2.8 | 5.9 | 2.5 |
| 6-31G(d,p) | 365 | 7.9 | 2.9 | 5.3 | 2.5 |
| 6-31+G(d) | 419 | 8.4 | 1.2 | 6.9 | 1.3 |
| 6-311G(d,p) | 438 | 8.2 | 2.4 | 6.5 | 2.2 |
| 6-31+G(d,p) | 449 | 8.4 | 1.1 | 6.9 | 1.2 |
| 6-31++G(d,p) | 459 | 8.4 | 1.1 | 6.9 | 1.2 |
| TZVP | 459 | 8.6 | 1.0 | 6.5 | 1.0 |
| 6-311+G(d,p) | 522 | 8.4 | 1.3 | 7.1 | 1.5 |
| 6-311++G(d,p) | 532 | 8.4 | 1.3 | 7.0 | 1.5 |
| aug-cc-pVDZ | 573 | 8.5 | 1.2 | 6.9 | 1.2 |
| 6-311+G(2d,p) | 627 | 8.5 | 0.8 | 6.6 | 0.8 |
| ma-TZVP(-f) | 648 | 8.6 | 0.4 | 6.3 | 0.4 |
| def2-TZVP | 711 | 8.6 | 0.8 | 6.4 | 0.7 |
| 6-311+G(3d,p) | 732 | 8.5 | 0.8 | 6.6 | 0.8 |
| cc-pVTZ | 770 | 8.5 | 1.5 | 6.4 | 1.3 |
| 6-311+G(2df,p) | 774 | 8.5 | 0.9 | 6.9 | 1.0 |
| def2-TZVPP | 791 | 8.6 | 0.7 | 6.3 | 0.6 |
| ma-TZVP | 795 | 8.6 | 0.4 | 6.5 | 0.4 |
| 6-311+G(2df,2pd) | 854 | 8.6 | 0.8 | 6.8 | 0.9 |
| 6-311++G(3df,3pd) | 999 | 8.6 | 0.8 | 6.7 | 0.9 |
| aug-cc-pVTZ | 1196 | 8.7 | 0.3 | 6.7 | 0.4 |
| def2-QZVP^d | 1497 | 8.6 | 0.2 | 6.5 | 0.2 |

^a Because they differ only in the sum of atomic pairwise contributions, B3LYP-D2 and B3LYP-D3 yield the same BSSE; for the same reason, all B3LYP-D2 interaction energies differ from the B3LYP-D3 estimates by the same amount (they are lower by 0.5 kcal/mol in absolute value). ^b Number of contracted basis set functions. ^c BSSE-corrected interaction energy (absolute value) referred to rigid monomers. ^d Although B3LYP-D3 was parameterized with the def2-QZVP basis set in such a way that BSSE was absorbed into molecular C6 coefficients (ref 43), the BSSE is also reported for def2-QZVP for comparison purposes.
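Footnote a of Table 2 implies that B3LYP-D2 interaction energies can be obtained from the tabulated B3LYP-D3 values by a constant shift. A minimal sketch of that arithmetic follows; the dictionary holds only three rows copied from Table 2, and the names `B3LYP_D3_EINT` and `b3lyp_d2_estimate` are illustrative, not from the paper.

```python
# Constant offset stated in footnote a of Table 2 (kcal/mol, absolute values).
D3_TO_D2_SHIFT = 0.5

# A few B3LYP-D3 interaction energies (absolute value, kcal/mol) from Table 2.
B3LYP_D3_EINT = {
    "SV(P)": 8.4,
    "TZVP": 8.6,
    "def2-QZVP": 8.6,
}


def b3lyp_d2_estimate(e_int_d3):
    """Return the B3LYP-D2 interaction energy implied by footnote a."""
    return round(e_int_d3 - D3_TO_D2_SHIFT, 1)


if __name__ == "__main__":
    for basis, e_d3 in B3LYP_D3_EINT.items():
        print(f"{basis:10s} D3 = {e_d3:.1f}  D2 = {b3lyp_d2_estimate(e_d3):.1f}")
```

The BSSE column needs no conversion, since the footnote states that both dispersion corrections give the same BSSE.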


Table 3: Predicted base step parameters (shift, slide, rise, Å; tilt, roll, twist, degrees) and puckering conformations of the deoxyribose units for the stacked conformation of the 5′-AA-3′ dinucleotide.

| Method | Shift | Slide | Rise | Tilt | Roll | Twist | Puckering |
|---|---|---|---|---|---|---|---|
| M06-2X/6-31G(d,p) | 1.56 | -0.11 | 3.02 | 4.42 | 2.18 | 38.09 | C2′-endo/C1′-exo |
| M06-2X/TZVP | 1.38 | -0.08 | 3.16 | 2.12 | 2.94 | 36.49 | C3′-exo/C1′-exo |
| B3LYP-D2/6-31G(d,p) | 1.65 | 0.52 | 3.10 | 2.73 | 10.70 | 39.48 | C3′-exo/C2′-endo |
| B3LYP-D2/TZVP | 1.63 | 0.40 | 3.16 | 2.81 | 8.83 | 37.07 | C3′-exo/C1′-exo |
| B3LYP-D3/6-31G(d,p) | 1.49 | 0.52 | 3.17 | 4.24 | 9.17 | 39.49 | C3′-exo/C2′-endo |
| B3LYP-D3/TZVP | 1.36 | 0.66 | 3.24 | 4.39 | 8.04 | 37.29 | C3′-exo/C2′-endo |
| B-DNA^a | 0.46 | 0.47 | 3.33 | 4.36 | 1.78 | 35.68 | C2′-endo/C2′-endo |

^a Starting configuration, average values from fiber diffraction of calf thymus B-DNA (refs 37, 38).

Table 4: Electronic energy (ΔEel) and Gibbs free energy at 298.15 K (ΔG298) of the unstacked conformation of the 5′-AA-3′ single strand. Energies are expressed in kcal/mol and referred to the stacked conformation.

| Quantity | Basis set | M06-2X | B3LYP-D2 | B3LYP-D3 |
|---|---|---|---|---|
| ΔEel | 6-31G(d,p) | 9.2 | 11.3 | 11.5 |
| ΔEel | TZVP | 8.2 | 10.4 | 10.4 |
| ΔG298 | 6-31G(d,p) | 6.5 | 9.6 | 9.0 |
| ΔG298 | TZVP | 3.1 | 6.0 | 7.5 |
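As a rough guide to what the ΔG298 values in Table 4 imply, they can be converted into an equilibrium fraction of the unstacked conformer. The sketch below assumes a simple two-state (stacked/unstacked) Boltzmann model, which is a simplification introduced here and not the authors' analysis; the names `DELTA_G_298` and `unstacked_fraction` are illustrative.

```python
import math

R_KCAL = 1.987204e-3   # gas constant, kcal/(mol K)
TEMPERATURE = 298.15   # K, as in Table 4

# ΔG298 (kcal/mol) of the unstacked 5'-AA-3' conformer relative to the
# stacked one, copied from Table 4 as (functional, basis set) -> value.
DELTA_G_298 = {
    ("M06-2X", "6-31G(d,p)"): 6.5,
    ("M06-2X", "TZVP"): 3.1,
    ("B3LYP-D2", "6-31G(d,p)"): 9.6,
    ("B3LYP-D2", "TZVP"): 6.0,
    ("B3LYP-D3", "6-31G(d,p)"): 9.0,
    ("B3LYP-D3", "TZVP"): 7.5,
}


def unstacked_fraction(delta_g, temperature=TEMPERATURE):
    """Two-state Boltzmann population of the conformer lying delta_g higher."""
    boltzmann_factor = math.exp(-delta_g / (R_KCAL * temperature))
    return boltzmann_factor / (1.0 + boltzmann_factor)


if __name__ == "__main__":
    for (functional, basis), dg in DELTA_G_298.items():
        print(f"{functional}/{basis}: {unstacked_fraction(dg):.2e}")
```

Under this assumption every entry of Table 4 corresponds to a stacked conformer that is overwhelmingly populated at room temperature.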


Table 5: Predicted rise coordinates (Å) for 5′-AAC-3′ and 5′-CAA-3′ single stranded oligonucleotides.

5′-AAC-3′

| Step | M06-2X^a/6-31G(d,p) | M06-2X^a/TZVP | B3LYP-D2/6-31G(d,p) | B3LYP-D2/TZVP | B3LYP-D3/6-31G(d,p) | B3LYP-D3/TZVP |
|---|---|---|---|---|---|---|
| A/A | 2.94 | 3.10 | 3.01 | 3.14 | 3.16 | 3.26 |
| A/C | 2.76 | 3.09 | 2.83 | 3.38 | 2.96 | 3.55 |
| avg | 2.85 | 3.10 | 2.92 | 3.26 | 3.06 | 3.40 |

5′-CAA-3′

| Step | M06-2X/6-31G(d,p) | M06-2X/TZVP | B3LYP-D2/6-31G(d,p) | B3LYP-D2/TZVP | B3LYP-D3/6-31G(d,p) | B3LYP-D3/TZVP |
|---|---|---|---|---|---|---|
| C/A | 2.76 | 3.11 | 2.68 | 3.18 | 2.69 | 3.22 |
| A/A | 3.17 | 3.17 | 3.23 | 3.23 | 3.33 | 3.37 |
| avg | 2.94 | 3.14 | 2.96 | 3.20 | 3.01 | 3.30 |

^a B3LYP-D2/6-31G(d,p) and M06-2X/6-31G(d,p) equilibrium geometries from ref 24.
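The effect of enlarging the basis set on the average rise in Table 5 can be read off by subtracting the 6-31G(d,p) value from the TZVP value for each functional. The following is a minimal sketch of that bookkeeping, using the "avg" rows of Table 5 as input; the names `AVG_RISE` and `basis_set_shift` are illustrative.

```python
# Average rise (Å) from the "avg" rows of Table 5, stored as
# strand -> functional -> (6-31G(d,p) value, TZVP value).
AVG_RISE = {
    "5'-AAC-3'": {
        "M06-2X": (2.85, 3.10),
        "B3LYP-D2": (2.92, 3.26),
        "B3LYP-D3": (3.06, 3.40),
    },
    "5'-CAA-3'": {
        "M06-2X": (2.94, 3.14),
        "B3LYP-D2": (2.96, 3.20),
        "B3LYP-D3": (3.01, 3.30),
    },
}


def basis_set_shift(table=AVG_RISE):
    """Return TZVP minus 6-31G(d,p) average rise for every entry (Å)."""
    return {
        strand: {
            functional: round(tz - dz, 2)
            for functional, (dz, tz) in by_functional.items()
        }
        for strand, by_functional in table.items()
    }


if __name__ == "__main__":
    for strand, shifts in basis_set_shift().items():
        for functional, delta in shifts.items():
            print(f"{strand} {functional}: {delta:+.2f} Å")
```

Every shift computed this way is positive, i.e. the larger basis set yields a larger average rise for all three functionals and both trimers.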

Table 6: Predicted average twist coordinates (degrees) of trimeric single strands.

| Functional | CAA, 6-31G(d,p) | CAA, TZVP | AAC, 6-31G(d,p) | AAC, TZVP |
|---|---|---|---|---|
| M06-2X | 40.20 | 47.22 | 35.56 | 41.18 |
| B3LYP-D2 | 42.32 | 47.92 | 32.43 | 41.56 |
| B3LYP-D3 | 40.12 | 47.58 | 33.65 | 41.50 |

Table 7: Predicted average twist coordinates (degrees) of hexameric single strands.

| Strand | M06-2X/DZ^a | M06-2X/TZ | B3LYP-D2/DZ | B3LYP-D2/TZ | B3LYP-D3/DZ | B3LYP-D3/TZ |
|---|---|---|---|---|---|---|
| A1 | 33.11 | 30.75 | 34.20 | 31.20 | 33.28 | 30.00 |
| A2 | 30.94 | 34.33 | 34.95 | 36.25 | 34.60 | 35.59 |
| A3 | 30.99 | 32.73 | 35.78 | 39.95 | 35.43 | 37.47 |
| A4 | 34.91 | 35.39 | 36.27 | 39.06 | 36.25 | 41.35 |

^a DZ denotes 6-31G(d,p), TZ denotes TZVP.


Table 8: Average rise (Å) for the dinucleotide steps (5′ position on top) of double stranded DNA nucleotides containing the base sequences reported in the second column and matching as closely as possible the single strands studied here (first column). Data were retrieved from the RCSB Protein Data Bank (ref 89); PDB codes are reported in the last column.

| Single strand | Duplex sequence(s) | Steps | Average rise (Å) | PDB codes |
|---|---|---|---|---|
| 5′-CAA-3′ | 5′-CAA-3′ | C/A^a, A/A | 3.24, 3.15 | 158d^b, 1fzx, 1g14, 1rvh |
| 5′-AAC-3′ | 5′-AAC-3′ | A/A, A/C | 3.14, 3.14 | 1zf0, 1d89, 1fzx, 1g14 |
| A1 | 5′-TAT-3′ | T/A, A/T | 3.32, 3.15 | 1d56, 1zfc, 2lwg |
| A2 | 5′-TTAATT-3′, 5′-TAAT-3′, 5′-TAA-3′, 5′-AATT-3′ | T/T, T/A, A/A, A/T, T/T | 3.09, 3.28, 3.27, 3.30, 3.01 | 1d49, 1zfh, 1bna |
| A3 | 5′-TTAAA-3′ | T/T, T/A, A/A, A/A | 2.94, 3.20, 3.32, 3.28 | 1ikk |
| A4 | 5′-TAAAA-3′, 5′-AAAAT-3′ | T/A, A/A, A/A, A/A, A/T | 3.04, 3.30, 3.18, 3.32, 3.85 | 1sk5, 1rvh |
| A6 | 5′-AAAAAA-3′ | A/A, A/A, A/A, A/A, A/A | 3.21, 3.15, 3.21, 3.19, 3.25 | 1d89, 1fzx |

^a Several instances of a given step are usually contained within a single PDB entry. ^b 158d, 1zf0, 1d89, 1d56, 1zfc, 1d49, 1zfh, 1ikk, and 1sk5 refer to X-ray measurements; the remaining structures were characterized by NMR.
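One way to compare the single-strand predictions with the duplex data is to subtract the Table 8 average rise from the predicted rise of the same step. The sketch below does this for the two trimers only, taking the B3LYP-D3/TZVP column of Table 5 as the prediction; that choice of column, and the names `PREDICTED`, `DUPLEX` and `rise_differences`, are assumptions made here for illustration.

```python
# Predicted rise (Å), B3LYP-D3/TZVP column of Table 5.
PREDICTED = {
    "5'-AAC-3'": {"A/A": 3.26, "A/C": 3.55},
    "5'-CAA-3'": {"C/A": 3.22, "A/A": 3.37},
}

# Average rise (Å) of the same steps in double-stranded DNA, from Table 8.
DUPLEX = {
    "5'-AAC-3'": {"A/A": 3.14, "A/C": 3.14},
    "5'-CAA-3'": {"C/A": 3.24, "A/A": 3.15},
}


def rise_differences(predicted=PREDICTED, duplex=DUPLEX):
    """Return predicted minus duplex rise (Å) for every step of every strand."""
    return {
        strand: {
            step: round(value - duplex[strand][step], 2)
            for step, value in steps.items()
        }
        for strand, steps in predicted.items()
    }


if __name__ == "__main__":
    for strand, steps in rise_differences().items():
        for step, delta in steps.items():
            print(f"{strand} {step}: {delta:+.2f} Å")
```

The same pattern extends to the other columns of Table 5 by swapping the values in `PREDICTED`.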

TOC Graphic

[Graphic: axis labels "rise coordinate" (marked values 2.5 Å, 2.9 Å, 3.3 Å) and "extension of the basis set".]