
J. Phys. Chem. 1994, 98, 3161-3164

Nonplanar Geometries of DNA Bases. Ab Initio Second-Order Møller-Plesset Study

Jiří Šponer† and Pavel Hobza*,†,‡

J. Heyrovský Institute of Physical Chemistry, Academy of Sciences of the Czech Republic, Dolejškova 3, 182 23 Prague, Czech Republic, and Department of Physical and Theoretical Chemistry, Technical University of Munich, Garching, Germany

Received: November 16, 1993

Interactions of DNA bases represent a crucial source of DNA conformational variability. Oligonucleotide crystal studies revealed a number of base-base interactions which seem to be stabilized by nonplanar DNA base amino groups. Therefore, an accurate description of the geometry and deformability of the DNA base amino groups is very important. Here, the second-order Møller-Plesset (MP2) 6-31G*-optimized nonplanar geometries of adenine, cytosine, guanine, thymine, and isocytosine are presented. The amino groups of the bases exhibit significant sp3 pyramidalization. The dihedral angles between the cytosine and adenine rings and their amino group hydrogen atoms range from 10 to 25°, and the nonplanar cytosine and adenine are 0.4 kcal/mol more stable than the planar molecules. Dihedral angles between the two guanine amino group hydrogen atoms and the guanine ring are 43 and 12°, and the nonplanar guanine is 1.6 kcal/mol more stable than the planar molecule. Isocytosine exhibits amino group properties similar to those of guanine. Selected DNA bases were also optimized using larger basis sets of atomic orbitals: 6-31G**, DZP+, and DZ(2d). The MP2/6-31G** calculations yield results very similar to those of the MP2/6-31G* calculations, while the larger DZP+ and DZ(2d) basis sets indicate an even greater amino group nonplanarity.

Introduction


Base-stacking [1] and hydrogen-bonding [2] interactions between DNA bases represent an important source of conformational variability in DNA molecules. For many years, isolated DNA bases were expected to be planar [3a-c]. However, there is increasing evidence [3a-c] that the bases can adopt nonplanar geometry in the amino groups. As discussed in detail elsewhere, amino group nonplanarity and deformability are essential for the explanation of many base-base interactions observed in DNA crystal structures: close amino group contacts between neighboring base pairs, interstrand bifurcated hydrogen bonds, and base pairs with a large degree of nonplanarity.

Ab initio computations represent the most promising means for analysis of the properties of the DNA base amino groups [3b,c] because no experimental data are available [3b] to estimate the DNA base nonplanarity. To obtain reliable results, high-level ab initio calculations should be performed. Large basis sets containing the polarization functions as well as the electron correlation should be taken into account. Previously, isolated cytosine was optimized at various theoretical levels up to MP2/6-31G* [3c]. The inclusion of electron correlation significantly increased the amino group nonplanarity. Here, the MP2/6-31G*-optimized nonplanar geometries of the four most frequent DNA bases (adenine, cytosine, guanine, and thymine) are presented.

Small basis sets and the Hartree-Fock (HF) theory level are usually quite sufficient for an accurate description of intramolecular geometries. But this is not true in the case of the DNA base amino groups. Their geometry is the result of a delicate balance between the lone-pair character of the nitrogen electrons and their delocalization toward the base rings. Therefore, selected DNA bases were also optimized using larger basis sets (6-31G**, DZP+, and DZ(2d)).

† Academy of Sciences of the Czech Republic.
‡ Technical University of Munich.
Abstract published in Advance ACS Abstracts, February 15, 1994.

Figure 1. Molecular structure and atom numbering of adenine (A), cytosine (C), guanine (G), isocytosine (iC), and thymine (T).

Method

First, the geometry of adenine, cytosine, guanine, thymine, and isocytosine was optimized using the standard 6-31G* [6] basis set; the electron correlation was included using the second-order Møller-Plesset [7] (MP2) theory. Three different optimized geometries were obtained: (1) planar molecule (PLAN geometry); (2) molecule with planar ring (including the amino group nitrogen atom) and nonplanar amino group hydrogen atoms (NPA geometry); (3) NPA geometry as the starting point of the unconstrained optimization (FULL geometry). The isocytosine was included in the analysis because its amino group had the same properties as the guanine amino group (which exhibited a much larger nonplanarity compared to cytosine and adenine; see below). For the small isocytosine molecule, calculations with larger and more reliable basis sets are feasible.

Second, the NPA and PLAN geometries of selected DNA bases were obtained using larger basis sets: 6-31G** (cytosine, guanine), DZP+ (cytosine, isocytosine), and DZ(2d) (cytosine, isocytosine). The double-ζ (DZ) basis set was the [4s2p/2s] contraction of the (9s5p/4s) basis set by Dunning [8]. DZP+ was the DZ basis set augmented by a set of d-polarization functions (α = 0.8) on all the heavy atoms, by a set of p-polarization functions (α = 1.0) on the hydrogen atoms, and also by a set of standard
diffuse s- and p-orbitals on the heavy atoms. The DZ(2d) basis set was Dunning's DZ basis set augmented by two sets of d-polarization functions (α1 = 1.6 and α2 = 0.4) on all the heavy atoms [10]. The calculations were carried out with the Gaussian 92 [9] set of programs. The "frozen core approximation" was used; i.e., 1s electrons of C, N, and O atoms were not considered in the electron correlation energy calculations. The gradient optimization technique was used for determination of the optimum geometry; the gradient convergence criterion was equal to 0.000 45, and the 6d-polarization functions were used. The molecular structure and atom numbering of the DNA bases are shown in Figure 1.

0022-3654/94/2098-3161$04.50/0 © 1994 American Chemical Society

The Journal of Physical Chemistry, Vol. 98, No. 12, 1994

TABLE 1: Optimized Geometries of Isolated Cytosine Obtained at the MP2 Level Using Various Basis Sets

level:              MP2/6-31G*                    MP2/6-31G**                   MP2/DZP+            MP2/DZ(2d)
geometry:           PLAN(a)   NPA(b)    FULL(c)   PLAN(a)   NPA(b)    FULL(c)   PLAN(a)   NPA(b)    PLAN(a)   NPA(b)
E' (au)(d)          -0.76466  -0.76508  -0.76526  -0.80853  -0.80888  -0.80904  -0.88292  -0.88394  -0.91719  -0.91772
δE (kcal/mol)(e)              -0.27     -0.38               -0.22     -0.32               -0.64               -0.34

Bond lengths (Å)
N1-C2               1.420     1.417     1.417     1.419     1.418     1.418     1.421     1.420     1.417     1.416
C6-N1               1.358     1.356     1.357     1.357     1.358     1.358     1.361     1.361     1.354     1.355
C5-C6               1.359     1.358     1.358     1.358     1.359     1.359     1.366     1.367     1.366     1.366
C4-C5               1.438     1.435     1.435     1.438     1.437     1.437     1.444     1.442     1.439     1.438
N3-C4               1.320     1.317     1.317     1.320     1.318     1.318     1.325     1.323     1.321     1.319
C2-N3               1.379     1.379     1.379     1.379     1.381     1.382     1.382     1.382     1.376     1.378
C2-O2               1.227     1.226     1.225     1.227     1.226     1.226     1.232     1.232     1.225     1.224
C4-N4               1.358     1.367     1.369     1.358     1.368     1.369     1.365     1.379     1.361     1.375
N1-H1               1.014     1.014     1.014     1.009     1.009     1.009     1.013     1.013     1.017     1.017
C6-H6               1.086     1.086     1.086     1.081     1.081     1.081     1.085     1.085     1.091     1.091
C5-H5               1.083     1.083     1.083     1.079     1.079     1.079     1.082     1.082     1.088     1.088
N4-H41              1.007     1.010     1.011     1.002     1.005     1.006     1.006     1.01      1.009     1.013
N4-H42              1.010     1.013     1.013     1.005     1.008     1.008     1.009     1.013     1.013     1.016

Valence angles (deg)
C6-N1-C2            123.9     124.0     124.0     124.0     124.0     124.0     123.9     123.9     123.9     123.9
C5-C6-N1            119.7     119.6     119.6     119.6     119.6     119.6     119.6     119.5     119.8     119.7
C4-C5-C6            116.0     116.0     116.0     116.0     116.0     116.0     115.9     116.0     115.8     115.9
N3-C4-C5            124.5     124.5     124.5     124.6     124.6     124.6     124.5     124.5     124.3     124.3
C2-N3-C4            120.0     120.0     120.0     119.8     119.9     119.9     119.7     119.8     120.1     120.2
N1-C2-N3            115.9     115.9     115.9     116.0     115.9     115.9     116.4     116.3     116.2     116.0
O2-C2-N1            118.7     118.8     118.9     118.7     118.8     118.9     118.7     118.8     118.7     118.9
N4-C4-N3            116.6     116.8     116.8     116.7     116.8     116.8     116.8     116.9     116.8     116.9
H1-N1-C6            121.4     121.4     121.4     121.3     121.3     121.2     121.3     121.2     121.4     121.4
H6-C6-C5            123.5     123.5     123.5     123.5     123.6     123.6     123.4     123.5     123.3     123.5
H5-C5-C4            122.5     122.4     122.4     122.5     122.4     122.4     122.7     122.6     122.8     122.7
H41-N4-C4           122.5     118.9     118.5     122.3     118.8     118.4     121.9     117.4     122.0     117.6
H42-N4-C4           117.9     114.8     114.4     117.8     114.8     114.4     118.1     114.2     118.2     114.2
H42-N4-H41          119.7     116.4     115.9     120.0     116.7     116.2     120.0     115.6     119.9     115.5

Dihedral angles (deg)
C5-C6-N1-C2         0.0       0.0       0.2       0.0       0.0       0.0       0.0       0.0       0.0       0.0
C4-C5-C6-N1         0.0       0.0       0.2       0.0       0.0       0.3       0.0       0.0       0.0       0.0
N3-C4-C5-C6         0.0       0.0       0.0       0.0       0.0       -0.1      0.0       0.0       0.0       0.0
O2-C2-N1-C6         180.0     180.0     179.8     180.0     180.0     179.9     180.0     180.0     180.0     180.0
N4-C4-N3-C2         180.0     180.0     176.7     180.0     180.0     176.6     180.0     180.0     180.0     180.0
H5-C5-C4-N3         180.0     180.0     179.1     180.0     180.0     179.9     180.0     180.0     180.0     180.0
H1-N1-C6-C5         180.0     180.0     179.8     180.0     180.0     180.0     180.0     180.0     180.0     180.0
H6-C6-C5-C4         180.0     180.0     180.1     180.0     180.0     179.2     180.0     180.0     180.0     180.0
H41-N4-C4-C5        0.0       -22.8     -26.2     0.0       -22.6     -26.1     0.0       -24.9     0.0       -24.7
H42-N4-C4-N3        0.0       12.7      14.1      0.0       12.6      13.9      0.0       15.0      0.0       15.3

(a) Planar structure. (b) Only the amino group hydrogen atoms are nonplanar. (c) Optimization made without any constraint. (d) E = E' - 393. (e) The energy difference between the respective nonplanar structure and the planar one.
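The δE entries of Table 1 follow directly from the tabulated E' values. As a minimal sketch, the stabilization energies can be recomputed from the energy differences. The conversion factor of about 627.51 kcal/mol per hartree is a standard value that the paper does not state, and the dictionary and function names are illustrative only.

```python
# Standard hartree-to-kcal/mol conversion (assumed; not quoted in the paper).
HARTREE_TO_KCAL_PER_MOL = 627.5095

# E' values (au) for cytosine as listed in Table 1, where E = E' - 393.
cytosine_e_prime = {
    "MP2/6-31G*": {"PLAN": -0.76466, "NPA": -0.76508, "FULL": -0.76526},
    "MP2/6-31G**": {"PLAN": -0.80853, "NPA": -0.80888, "FULL": -0.80904},
    "MP2/DZP+": {"PLAN": -0.88292, "NPA": -0.88394},
    "MP2/DZ(2d)": {"PLAN": -0.91719, "NPA": -0.91772},
}


def stabilization_energies(e_prime):
    """Return delta-E (kcal/mol) of each nonplanar geometry relative to PLAN."""
    result = {}
    for level, geometries in e_prime.items():
        planar = geometries["PLAN"]
        result[level] = {
            name: (energy - planar) * HARTREE_TO_KCAL_PER_MOL
            for name, energy in geometries.items()
            if name != "PLAN"
        }
    return result


if __name__ == "__main__":
    for level, values in stabilization_energies(cytosine_e_prime).items():
        for name, delta in values.items():
            print(f"{level:12s} {name:5s} {delta:6.2f} kcal/mol")
```

Because E' is tabulated to five decimals only, the recomputed values can differ from the published δE by about 0.01 kcal/mol.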

Results and Discussion

Cytosine. Table 1 presents the cytosine geometries obtained using various basis sets. Evidently, the geometrical parameters other than the amino group geometry are quite insensitive to the choice of the basis set. The amino group adopts a nonplanar (pyramidal) geometry. The amino group hydrogen atoms deviate from the cytosine plane in one direction, while the amino group nitrogen atom is slightly shifted in the opposite direction. The amino group deformation is asymmetric; i.e., the absolute value of the H41N4C4C5 dihedral angle is significantly larger than that of the H42N4C4N3 dihedral angle. This is caused by a repulsion between the H41 amino group hydrogen atom and the neighboring H5 ring hydrogen atom [3c]. The repulsion influences the planar structure as well, because the C4-N4-H41 valence bond angle is larger (122-122.5°) than the C4-N4-H42 angle (approximately 118°). The less expensive NPA geometry optimization yields a result similar to that of the unconstrained FULL optimization, although both the amino group hydrogen atom dihedrals and the energy stabilizing the nonplanarity are somewhat underestimated.

Table 1 demonstrates that the optimized amino group geometry does not depend much on the choice of the basis set. However, the energy δE (the energy difference between nonplanar cytosine and the planar molecule) is a more basis-set-dependent quantity. The smallest absolute value of δE was obtained with the 6-31G** basis set (-0.32 and -0.22 kcal/mol for the FULL and NPA structures, respectively). The 6-31G* basis set yields very similar results (-0.38 and -0.27 kcal/mol), indicating that the p-polarization functions on the hydrogen atoms are unimportant. On the other hand, the absolute value of δE increases when the split-valence 6-31G basis sets with polarization functions are replaced by the more reliable DZ basis sets with polarization functions (δE is -0.34 and -0.64 kcal/mol for the MP2/DZ(2d) NPA and MP2/DZP+ NPA structures, respectively). Because the polarization functions on the hydrogen atoms do not change the results significantly, the MP2/DZ(2d) level is expected to be the most reliable in the present study.

Guanine and Isocytosine. Table 2 presents the MP2/6-31G*- and MP2/6-31G**-optimized geometries of guanine. The results confirm the previous calculations made at the HF/6-31G(NH2*)


TABLE 2: Optimized Geometries of Isolated Guanine Obtained at the MP2/6-31G* and MP2/6-31G** Levels

level:              MP2/6-31G*                    MP2/6-31G**
geometry:           PLAN(a)   NPA(b)    FULL(c)   PLAN(a)   NPA(b)
E' (au)(d)          -0.99346  -0.99558  -0.99605  -1.03879  -1.04070
δE (kcal/mol)(e)              -1.33     -1.63               -1.20

Bond lengths (Å)
O6-C6               1.226     1.225     1.225     1.225     1.225
N1-C6               1.431     1.430     1.430     1.431     1.430
C2-N1               1.372     1.373     1.374     1.372     1.375
N3-C2               1.314     1.311     1.311     1.313     1.311
C4-N3               1.366     1.364     1.366     1.364     1.366
C5-C4               1.394     1.394     1.394     1.394     1.394
C5-C6               1.440     1.442     1.440     1.440     1.440
N7-C5               1.379     1.378     1.378     1.377     1.378
C8-N7               1.323     1.324     1.324     1.323     1.324
N9-C8               1.375     1.376     1.375     1.375     1.376
C4-N9               1.370     1.370     1.370     1.370     1.370
N2-C2               1.363     1.385     1.363     1.384     1.386
H1-N1               1.017     1.012     1.012     1.017     1.017
H8-C8               1.078     1.082     1.082     1.078     1.082
H9-N9               1.013     1.013     1.008     1.008     1.013
H21-N2              1.015     1.009     1.003     1.009     1.015
H22-N2              1.014     1.007     1.001     1.009     1.015

Valence angles (deg)
N1-C6-O6            119.9     119.6     119.7     119.9     112.0
C2-N1-C6            127.1     127.0     127.0     127.1     127.0
N3-C2-N1            124.0     123.8     123.9     124.0     124.0
C4-N3-C2            111.6     111.5     111.5     111.6     111.6
C5-C4-N3            129.6     129.8     129.6     129.5     129.8
C6-C5-C4            118.9     118.9     118.9     118.9     118.9
N1-C6-C5            109.0     109.1     109.1     109.0     109.0
N7-C5-C4            111.6     111.6     111.7     111.6     111.6
C8-N7-C5            103.8     103.8     103.8     103.8     103.8
N9-C8-N7            112.9     112.9     112.9     113.0     112.9
C4-N9-C8            107.0     106.9     107.0     107.0     107.0
N9-C4-C5            104.7     104.7     104.7     104.7     104.7
N2-C2-N1            116.2     116.9     116.8     116.1     115.9
H1-N1-C6            113.0     113.4     113.4     113.3     112.8
H8-C8-N7            125.2     125.2     125.3     125.3     125.2
H9-N9-C8            127.7     127.6     127.6     127.7     127.7
H21-N2-C2           111.1     117.5     117.4     111.1     110.9
H22-N2-C2           115.5     123.2     122.9     115.4     115.2
H22-N2-H21          112.3     119.4     119.7     112.6     112.0

Dihedral angles (deg)
C2-N1-C6-O6         180.0     180.0     180.3     180.0     180.0
N3-C2-N1-C6         0.0       0.0       0.4       0.0       0.0
C4-N3-C2-N1         0.0       0.0       0.8       0.0       0.0
C5-C4-N3-C2         0.0       0.0       -2.2      0.0       0.0
N7-C5-C4-N3         180.0     180.0     181.1     180.0     180.0
C8-N7-C5-C4         0.0       0.0       0.3       0.0       0.0
N9-C8-N7-C5         0.0       0.0       0.0       0.0       0.0
N2-C2-N1-C6         180.0     180.0     176.8     180.0     180.0
H1-N1-C6-C5         180.0     180.0     174.3     180.0     180.0
H8-C8-N7-C5         180.0     180.0     179.9     180.0     180.0
H9-N9-C8-N7         180.0     180.0     180.3     180.0     180.0
H21-N2-C2-N3        0.0       -11.2     -11.8     0.0       -11.5
H22-N2-C2-N1        0.0       39.3      43.2      0.0       38.7

(a) Planar structure. (b) Only the amino group hydrogen atoms are nonplanar. (c) Optimization made without any constraint. (d) E = E' - 540. (e) The energy difference between the respective nonplanar structure and the planar one.

TABLE 3: Optimized Geometries of Isolated Isocytosine Obtained at the MP2 Level Using Various Basis Sets

level:              MP2/6-31G*          MP2/DZP+            MP2/DZ(2d)
geometry:           PLAN(a)   NPA(b)    PLAN(a)   NPA(b)    PLAN(a)   NPA(b)
E' (au)(c)          -0.76519  -0.76683  -0.88293  -0.88524  -0.91739  -0.91938
δE (kcal/mol)(d)              -1.03               -1.45               -1.25

Bond lengths (Å)
O6-C6               1.231     1.230     1.236     1.236     1.229     1.229
N1-C6               1.417     1.415     1.418     1.416     1.416     1.414
C2-N1               1.366     1.365     1.369     1.368     1.362     1.361
N3-C2               1.314     1.311     1.318     1.314     1.315     1.311
C4-N3               1.372     1.375     1.375     1.378     1.370     1.374
C5-C4               1.366     1.365     1.374     1.373     1.373     1.371
C6-C5               1.442     1.444     1.447     1.450     1.443     1.445
N2-C2               1.360     1.380     1.368     1.388     1.362     1.386
H1-N1               1.017     1.018     1.016     1.016     1.020     1.020
H4-C4               1.088     1.088     1.087     1.087     1.093     1.092
H5-C5               1.083     1.084     1.082     1.082     1.088     1.089
H21-N2              1.009     1.014     1.008     1.013     1.011     1.018
H22-N2              1.007     1.013     1.006     1.012     1.009     1.018

Valence angles (deg)
N1-C6-O6            119.5     119.8     119.5     119.8     119.3     119.7
C2-N1-C6            124.5     124.5     124.4     124.4     124.7     124.7
N3-C2-N1            123.2     123.3     123.3     123.4     123.2     123.3
C4-N3-C2            115.1     115.3     115.2     115.3     115.2     115.3
C5-C4-N3            125.7     125.4     125.6     125.3     125.6     125.3
C6-C5-C4            119.9     120.0     119.7     119.8     119.7     119.8
N1-C6-C5            111.6     111.5     111.8     111.7     111.6     111.5
N2-C2-N1            117.6     116.8     117.4     116.6     117.7     116.8
H1-N1-C2            121.5     121.0     121.1     120.6     121.1     120.6
H4-C4-C5            120.0     120.2     119.9     120.1     119.9     120.2
H5-C5-C6            118.1     118.0     118.4     118.3     118.3     118.1
H21-N2-C2           117.1     111.4     117.2     111.3     117.2     110.5
H22-N2-C2           123.4     116.4     122.9     115.7     123.0     115.0
H22-N2-H21          119.5     113.1     119.8     112.9     119.7     111.7

Dihedral angles (deg)
H21-N2-C2-N3        0.0       -11.6     0.0       -12.8     0.0       -13.6
H22-N2-C2-N1        0.0       36.7      0.0       36.7      0.0       38.8

(a) Planar structure. (b) Only the amino group hydrogen atoms are nonplanar. (c) E = E' - 393. (d) The energy difference between the nonplanar structure and the planar one.

theoretical level, namely, that the guanine amino group nonplanarity is significantly larger than that of cytosine. The MP2/6-31G* nonplanar (FULL) guanine is stabilized by as much as 1.6 kcal/mol compared to the planar molecule. Guanine amino group pyramidalization exhibits pronounced asymmetry due to the repulsive interaction of the H1 ring hydrogen atom with the amino group H22 hydrogen atom. Also, the significant out-of-plane deviation (6°) of the H1 hydrogen atom indicates the H1-amino group repulsion. The repulsion also influences the planar structure. The H21N2C2 amino group hydrogen valence angle is only 117.5° while the other H22N2C2 angle is 123°. Because of the size of guanine, the larger DZP+ and DZ(2d) basis sets could not be used. These basis sets were employed, however, for isocytosine (Table 3). Isocytosine has the same atomic structure as guanine except that the five-membered ring is missing (see Figure 1). The isocytosine amino group is closely related to that of guanine, although the amino group nonplanarity and its stabilization energy are a bit smaller. The use of the DZ(2d) and DZP+ basis sets further increases the predicted amino group nonplanarity compared to the case for the MP2/6-31G* level.

Adenine. Table 4 presents the MP2/6-31G*-optimized geometries of adenine. The nonplanar molecule is stabilized by 0.34 kcal/mol compared to the planar molecule. The absolute values of the two amino group hydrogen dihedral angles are almost the same, because only bare ring atoms adjoin the amino group. The degree of the amino group nonplanarity is similar to that of cytosine, and no other basis set was employed.

Thymine. Table 5 presents the MP2/6-31G*-optimized geometry of thymine. The calculations started with the methyl group hydrogen, coplanar with the ring, pointed toward the O4 oxygen. However, during the course of the optimization, the methyl group turned by 60° (this geometry is shown in Figure 1).
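One simple way to quantify the pyramidalization discussed above is the sum of the three valence angles at the amino nitrogen. It equals 360° for a planar group and drops below that as the group pyramidalizes. The sketch below applies this to the isocytosine angles of Table 3. The angle-sum measure and all names are illustrative choices, not a quantity the authors report.

```python
# Valence angles (deg) at the amino nitrogen N2 of isocytosine, from Table 3,
# in the order (H21-N2-C2, H22-N2-C2, H22-N2-H21).
amino_angles = {
    ("MP2/6-31G*", "PLAN"): (117.1, 123.4, 119.5),
    ("MP2/6-31G*", "NPA"): (111.4, 116.4, 113.1),
    ("MP2/DZP+", "PLAN"): (117.2, 122.9, 119.8),
    ("MP2/DZP+", "NPA"): (111.3, 115.7, 112.9),
    ("MP2/DZ(2d)", "PLAN"): (117.2, 123.0, 119.7),
    ("MP2/DZ(2d)", "NPA"): (110.5, 115.0, 111.7),
}


def angle_sum_deficit(angles):
    """Return 360 minus the sum of the three valence angles (deg).

    A value near zero means a planar amino group; larger values mean
    stronger pyramidalization.
    """
    return 360.0 - sum(angles)


if __name__ == "__main__":
    for (level, geometry), angles in amino_angles.items():
        print(f"{level:11s} {geometry:4s} deficit = {angle_sum_deficit(angles):5.1f} deg")
```

With the tabulated angles, the planar structures give a deficit of essentially zero (within rounding), while the NPA deficit grows from the 6-31G* to the DZP+ and DZ(2d) basis sets, in line with the trend stated in the text.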

Concluding Remarks

An accurate description of the DNA base amino group properties is a very delicate problem, and the present results still exhibit some basis-set dependence. This concerns mainly the energy stabilizing the nonplanar base geometry. Nonetheless, some conclusions can be derived from the present and the previous studies [3a-c].

TABLE 4: Optimized Geometries of Isolated Adenine Obtained at the MP2/6-31G* Level of Theory

level:              MP2/6-31G*
geometry:           PLAN(a)   NPA(b)    FULL(c)
E' (au)(d)          -0.94706  -0.94742  -0.94760
δE (kcal/mol)(e)              -0.23     -0.34

Bond lengths (Å)
C2-N3               1.339     1.339     1.339
N1-C2               1.352     1.353     1.353
C6-N1               1.343     1.342     1.341
C5-C6               1.410     1.409     1.409
C4-C5               1.398     1.399     1.399
N3-C4               1.344     1.343     1.343
N9-C4               1.378     1.378     1.378
C8-N9               1.372     1.372     1.372
N7-C8               1.326     1.326     1.326
C5-N7               1.381     1.381     1.381
C6-N6               1.353     1.362     1.364
C2-H2               1.088     1.088     1.088
C8-H8               1.083     1.083     1.083
N9-H9               1.013     1.013     1.013
N6-H61              1.009     1.012     1.012
N6-H62              1.009     1.012     1.012

Valence angles (deg)
N1-C2-N3            129.1     128.9     128.9
C6-N1-C2            118.2     118.3     118.3
C5-C6-N1            118.9     118.9     118.9
C4-C5-C6            116.0     116.0     116.0
N3-C4-C5            127.1     127.1     127.1
C2-N3-C4            110.7     110.8     110.8
N9-C4-C5            104.2     104.3     104.3
C8-N9-C4            106.9     106.9     106.9
N7-C8-N9            113.5     113.6     113.6
C5-N7-C8            103.2     103.2     103.2
C4-C5-N7            112.1     112.0     112.0
N6-C6-C5            121.8     121.9     121.9
H2-C2-N1            115.0     115.1     115.1
H8-C8-N7            124.7     124.7     124.7
H9-N9-C8            127.4     127.4     127.4
H61-N6-C6           119.2     116.1     115.6
H62-N6-C6           120.3     117.3     116.8
H62-N6-H61          120.6     117.5     116.9

Dihedral angles (deg)
C6-N1-C2-N3         0.0       0.0       0.2
C5-C6-N1-C2         0.0       0.0       -0.1
C4-C5-C6-N1         0.0       0.0       -0.3
N9-C4-C5-C6         180.0     180.0     181.0
C8-N9-C4-C5         0.0       0.0       0.4
N7-C8-N9-C4         0.0       0.0       -0.1
N6-C6-C5-C4         180.0     180.0     183.0
H2-C2-N1-C6         180.0     180.0     180.0
H8-C8-N7-C5         180.0     180.0     179.8
H9-N9-C8-N7         180.0     180.0     180.0
H61-N6-C6-N1        0.0       16.3      18.7
H62-N6-C6-C5        0.0       -17.5     -21.1

(a) Planar structure. (b) Only the amino group hydrogen atoms are nonplanar. (c) Optimization made without any constraint. (d) E = E' - 465. (e) The energy difference between the respective nonplanar structure and the planar one.

TABLE 5: Optimized Geometry of Isolated Thymine Obtained at the MP2/6-31G* Level of Theory. The Energy of the Optimized Structure is -452.806 02 hartrees.

Bond lengths (Å)
O4-C4               1.230
N3-C4               1.403
C2-N3               1.386
N1-C2               1.386
C6-N1               1.380
C5-C6               1.354
C4-C5               1.462
O2-C2               1.225
CM-C5               1.496
H3-N3               1.017
H1-N1               1.013
H6-C6               1.086
HM1-CM              1.094
HM2-CM              1.093
HM3-CM              1.094

Valence angles (deg)
N3-C4-O4            120.7
C2-N3-C4            128.6
N1-C2-N3            112.2
C6-N1-C2            124.1
C5-C6-N1            122.4
C4-C5-C6            118.4
N3-C4-C5            114.4
O2-C2-N3            124.2
CM-C5-C6            124.0
H3-N3-C4            116.0
H1-N1-C2            114.9
H6-C6-N1            115.2
HM1-CM-C5           110.5
HM2-CM-C5           111.0
HM3-CM-C5           110.5

Dihedral angles (deg)
HM1-CM-C5-C4        59.1
HM2-CM-C5-C4        179.9
HM3-CM-C5-C4        -59.2
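The dihedral angles listed in the tables are four-atom torsions such as HM1-CM-C5-C4. As a minimal sketch of how such an angle is obtained from Cartesian coordinates, the function below uses the usual atan2 formulation. The sign convention (positive for a clockwise rotation when viewed along the central bond) and the function name are assumptions, and the paper does not say how its program reports signs or whether it maps values into the range shown here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane p0-p1-p2
    n2 = _cross(b2, b3)  # normal of the plane p1-p2-p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Idealized test geometry (not taken from the paper).
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Values such as 181.0 or 183.0 in the tables correspond to -179.0 and -177.0 in the range returned by this function.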

(i) The amino groups of isolated DNA bases adopt pyramidal geometry.

(ii) The pyramidalization is asymmetric due to the interaction of the amino group with a hydrogen atom or other group bonded to the ring atom adjoining the amino group.

(iii) The amino group nonplanarity is more pronounced for guanine than for adenine or cytosine. The nonplanar guanine molecule is favored over the planar form by more than 1 kcal/mol. For adenine and cytosine, the energy stabilizing the nonplanarity is less than 1 kcal/mol.

(iv) A comparison of the present results with those obtained previously [3c] shows that the amino group geometry and ratio of the amino group nonplanarity of different DNA bases are already qualitatively reproduced at the HF/6-31G(NH2*) level (d-polarization functions only on the amino group nitrogen atom).

(v) The available empirical potentials penalize nonplanar amino group geometries. They are therefore unsuitable for analysis of the highly deformed hydrogen-bonded base pairs and, especially, the close amino group contacts and bifurcated hydrogen bonds between neighboring base pairs. Both the latter phenomena are significantly influenced by nonplanar DNA base amino group geometries, which are facilitated by the intrinsic amino group nonplanarity and promoted by the intermolecular interactions.

(vi) We are certainly aware of the fact that even our largest basis set is far from saturation. For the present systems we were, however, unable to perform the optimization at a higher level. Pilot calculations on formamide [11] and formamidine [12] demonstrated a decrease of the amino group nonplanarity on passing to extended basis sets (6-311G(3df,2p) and 5s4p3df/3s2p for formamide and formamidine, respectively). In the case of formamidine (having the amino group nonplanarity comparable with A or C), the energy difference between planar and nonplanar structures decreased, compared to that for the MP2/6-31G* level, from 0.85 to 0.3 kcal/mol, while the amino group hydrogen atom dihedrals were reduced by less than 15%. The higher theoretical level used for formamide [11,12] yielded a planar structure; nonplanarity of this system is extremely small with any basis set [11,12]. It should be stressed that neither formamide nor formamidine can be used as a representative model for DNA base amino group analysis [12]. Work is in progress in our laboratory to investigate more reliable models of the DNA bases [12].
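Conclusions (ii) and (iii) can be illustrated with the two amino hydrogen dihedral angles of the unconstrained MP2/6-31G* (FULL) geometries in Tables 1, 2, and 4. The sketch below compares their absolute values. The "asymmetry" measure and the names are illustrative and not defined by the authors, and the guanine pair is the one matching the 43° and 12° quoted in the abstract.

```python
# Amino hydrogen dihedral angles (deg) of the MP2/6-31G* FULL geometries.
full_amino_dihedrals = {
    "cytosine": (-26.2, 14.1),  # H41-N4-C4-C5, H42-N4-C4-N3 (Table 1)
    "guanine": (-11.8, 43.2),   # H21-N2-C2-N3, H22-N2-C2-N1 (Table 2)
    "adenine": (18.7, -21.1),   # H61-N6-C6-N1, H62-N6-C6-C5 (Table 4)
}


def asymmetry(dihedrals):
    """Absolute difference between the magnitudes of the two dihedrals (deg)."""
    first, second = dihedrals
    return abs(abs(first) - abs(second))


if __name__ == "__main__":
    for base, angles in full_amino_dihedrals.items():
        print(f"{base:9s} asymmetry = {asymmetry(angles):4.1f} deg")
```

On this measure adenine is nearly symmetric, cytosine is moderately asymmetric, and guanine is the most asymmetric, which matches the qualitative discussion in the text.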

References and Notes

(1) (a) Dickerson, R. E. J. Mol. Biol. 1983, 166, 419. (b) Šponer, J.; Kypr, J. J. Biomol. Struct. Dyn. 1990, 7, 1211. (c) Šponer, J.; Kypr, J. J. Mol. Biol. 1991, 222, 761. (d) Hunter, C. A. J. Mol. Biol. 1993, 230, 1025. (e) Šponer, J.; Kypr, J. J. Biomol. Struct. Dyn. 1993, 11, 27.
(2) (a) Saenger, W. Principles of Nucleic Acid Structure; Springer-Verlag: New York, 1984. (b) Hobza, P.; Sandorfy, C. J. Am. Chem. Soc. 1987, 109, 1302. (c) Aida, M. J. Comput. Chem. 1988, 9, 362. (d) Cheng, Y. K.; Pettitt, B. M. Prog. Biophys. Mol. Biol. 1992, 58, 225.
(3) (a) Riggs, N. V. Chem. Phys. Lett. 1991, 177, 447. (b) Leszczynski, J. Int. J. Quantum Chem. Quantum Biol. Symp. 1992, 19, 43. (c) Šponer, J.; Hobza, P. J. Mol. Struct. (THEOCHEM) 1994, 304, 35.
(4) (a) Šponer, J.; Burcl, R.; Hobza, P. J. Biomol. Struct. Dyn., in press. (b) Šponer, J.; Kypr, J. Int. J. Biol. Macromol., in press. (c) Šponer, J.; Hobza, P. J. Am. Chem. Soc. 1994, 116, 709-714.
(5) (a) Boggs, J. E.; Niu, Z. J. Comput. Chem. 1985, 6, 46. (b) Niu, Z.; Boggs, J. E. J. Mol. Struct. (THEOCHEM) 1984, 109, 381.
(6) Hariharan, P. C.; Pople, J. A. Theor. Chim. Acta 1973, 28, 213.
(7) Møller, C.; Plesset, M. S. Phys. Rev. 1934, 46, 618.
(8) Dunning, T. H., Jr. J. Chem. Phys. 1970, 53, 2823.
(9) Frisch, M. J.; Head-Gordon, M.; Trucks, G. W.; Foresman, J. B.; Schlegel, H. B.; Raghavachari, K.; Binkley, J. S.; Gonzalez, C.; Defrees, D. J.; Fox, D. J.; Whiteside, R. A.; Seeger, R.; Melius, C. F.; Baker, J.; Kahn, L. R.; Stewart, J. J.; Fluder, E. M.; Topiol, S.; Pople, J. A. Gaussian 92; Gaussian Inc.: Pittsburgh, PA, 1992.
(10) The DZ(2d) basis set was used for the following reasons. We plan to investigate the planar as well as stacked base pairs, and the DZ(2d) basis set is the first basis set which could provide reliable results for the two different base pairs.
(11) Kwiatkowski, J. S.; Leszczynski, J. J. Mol. Struct. 1993, 297, 277.
(12) Šponer, J.; Hobza, P. Unpublished results.