J. Phys. Chem. 1996, 100, 18875-18881
Electron-Correlated Calculations of Electric Properties of Nucleic Acid Bases
R. C. Johnson, T. D. Power, J. S. Holt,† B. Immaraporn, J. E. Monat, A. A. Sissoko, M. M. Yanik,‡ A. V. Zagorodny, and S. M. Cybulski*
Department of Chemistry, Miami University, Oxford, Ohio 45056
Received: July 17, 1996; In Final Form: September 25, 1996
Full geometry optimizations have been performed at the MP2 level of theory using the 6-31G(d,p) basis set for the five common nucleic acid bases (uracil, thymine, cytosine, adenine, and guanine) and three related bases (fluorouracil, 5-methylcytosine, and hypoxanthine). Several electric properties were subsequently calculated using the optimized geometries and the larger polarized basis sets of Sadlej. Including electron correlation decreases the magnitude of the dipole moments by 10-15% for every base except adenine. In all cases, inclusion of electron correlation increases the mean polarizability by 8-12% and also increases the polarizability anisotropy by as much as 14-20%. Results for structurally similar bases were compared in order to qualitatively study the effects of functional group substitution on geometries and electric properties.
Introduction Intermolecular interactions greatly affect large molecules, particularly biomolecules. The macrostructures and properties of proteins and DNA are highly dependent on such interactions.1 Perhaps the most important biological intermolecular interactions are those between nucleic acid bases in DNA. The most significant of these are hydrogen-bonded and stacking interactions between bases, which preserve the unique structure of DNA. Hydrogen bonding is also responsible for DNA’s pairing scheme.1,2 These interactions are electric in nature; thus, an accurate determination of the electric properties of the nucleic acid bases is desirable. Several computational studies have been aimed at accurately determining structures and properties of the nucleic acid bases and their tautomers.3-14 Ab initio methods are efficient for small systems, but for large molecules, such as nucleic acid bases, calculations are time-consuming because of the size of basis sets needed for reliable results. Furthermore, to obtain results that even approach experimental accuracy, it is necessary to include electron correlation.3,4 These two requirements limit, at least for the moment, the number of electron-correlated methods applicable to geometry optimization of large molecules to second-order Møller-Plesset perturbation theory (MP2) and to different variants of density functional theory (DFT). 
† Current address: Department of Chemistry, University of Michigan, Ann Arbor, MI 48109. ‡ Current address: Department of Chemistry, University of Virginia, Charlottesville, VA 22901. Abstract published in Advance ACS Abstracts, November 15, 1996.
The most recent full geometry optimizations of nucleic acid bases and their tautomers have been performed at the MP2 level of theory.3,5,6 It has been shown that including electron correlation improves agreement with the geometries obtained from gas phase microwave spectroscopy.3 Theoretically obtained structures of nucleic acid bases are important because geometries derived from X-ray or neutron diffraction experiments contain errors due to the presence of lone pair electrons on nitrogen and oxygen, distortions caused by hydrogen bonding, and thermal motion.3 The main goal of this research project was to accurately determine the electric properties of nucleic acid bases. We contemplated the use of experimental geometries, but since these were not available for some bases, we decided to use as a
starting point for our calculations theoretically determined geometries. Recent geometry optimizations of the nucleic acid bases have been performed by Šponer and Hobza6 and by Stewart et al.3 Šponer and Hobza optimized the geometries of adenine, cytosine, guanine, and thymine as well as isocytosine at the MP2 level of theory. Only cytosine was fully optimized with the 6-31G(d,p) basis set. Guanine was also optimized with the same basis set but was constrained to be planar except for the amino group hydrogens. Thymine, adenine, and guanine were fully optimized with a smaller 6-31G(d) basis set with no planarity constraints. Stewart et al. optimized the geometries of the four common bases and their methyl derivatives at the MP2 level of theory using the 6-31G(d,p) basis set but did not report all the bond lengths and bond angles involving hydrogen atoms. Furthermore, their results for cytosine were slightly different from those of Šponer and Hobza. Because of these inconsistencies, we decided to fully reoptimize the geometries of all of the bases with the same basis set and without using any constraints. The most recent calculations of electric properties of nucleic acid bases are those of Basch et al.4 and of Jasien and Fitzgerald.7 Basch et al. calculated dipole moments and polarizabilities of the five common bases using a variety of double-zeta (DZ) basis sets. SCF values were obtained with the DZ basis set augmented with diffuse polarization functions on heavy atoms. Only for uracil and cytosine were MP2 calculations performed using basis sets that included these diffuse polarization functions. MP2 values for other bases were extrapolated using data obtained for smaller molecules. It thus appears that accurate MP2 calculations of electric properties for all of the bases have not yet been performed. Jasien and Fitzgerald performed first principles local density functional (LDF) calculations to investigate electric properties.
Dipole moments were calculated using DN (“double numerical”) basis sets with and without 2p polarization functions on hydrogens. Polarizabilities and polarizability anisotropies were calculated without inclusion of the polarization functions on hydrogens. The geometries for these calculations were taken from Fratini et al.,15 except for the mean polarizability calculations for which the geometries of Arnott and Hukins16 were used.
© 1996 American Chemical Society. J. Phys. Chem., Vol. 100, No. 48, 1996.
Figure 1. Schematic structures and numbering scheme of nucleic acid bases corresponding to the Cartesian coordinates given in Table 1a-h. Thymine (b) and 5-methylcytosine (e) have been slightly rotated to better show the arrangement of atoms in the methyl group.
In this paper, we present the results of MP2 calculations on the five common nucleic acid bases (uracil, thymine, cytosine,
adenine, and guanine) as well as three closely related bases (fluorouracil, 5-methylcytosine, and hypoxanthine). The latter three bases are important for a number of reasons. Fluorouracil is commonly used to treat colorectal and other types of cancer.17 5-Methylcytosine is the main modified base in eukaryotes.18 Hypoxanthine is a common purine base found as a product of metabolism.2 Optimized geometries for the bases as well as calculated values of the dipole moment, quadrupole moment, and dipole polarizability are presented and compared with previous theoretical and experimental results. The mean polarizability and polarizability anisotropy are also tabulated. These quantities are of interest in optoelectronics as well as studies of intermolecular interactions.19 Finally, values of the nuclear quadrupole coupling constants at the nitrogen and oxygen nuclei for all of the bases are presented. They may be of interest to microwave spectroscopists trying to identify hyperfine splitting patterns. Methods The geometries of the eight nucleic acid bases we investigated were fully optimized without planarity constraints at the MP2 level of theory using the 6-31G(d,p) basis set without resorting to the frozen core approximation. All optimizations have been performed with tighter than usual convergence criteria for both forces and displacements. We note, however, that the potential energy hypersurfaces in the vicinity of the minimum tend to be very flat, and tight convergence criteria for displacements are difficult to satisfy. We present our results by giving Cartesian coordinates for all of the atoms rather than bond lengths, bond angles, and dihedral angles. The numbering scheme is the conventional one for pyrimidines and purines (Figure 1).20 We believe that explicitly giving Cartesian coordinates will allow
other workers to easily transfer the geometries as well as properties to other orientations. This will facilitate comparisons with future calculations which undoubtedly will show some modification of our results. For the calculation of electric properties the origin for each system is at the center of mass, and the Cartesian axes coincide with the principal axes of the moment of inertia (a = x, b = y, c = z) with the eigenvalues of the moment of inertia tensor arranged in the order Ia ≤ Ib ≤ Ic. With this convention the z axis is perpendicular to the heterocyclic ring. The SCF values of the dipole moment, quadrupole moment, dipole polarizability, and nuclear quadrupole coupling constants at nitrogen and oxygen nuclei for each base were determined using the fully optimized MP2 structures. Second-order correlation corrections were introduced in calculations of the dipole moment and dipole polarizability. Preliminary analytical SCF results obtained with the 6-31G(d,p) basis set showed that it was not very reliable in calculations of polarizabilities. At the SCF level the 6-31G(d,p) basis set underestimated the mean polarizability of guanine and cytosine by 20% and overestimated the polarizability anisotropy by 15% in comparison to the LDF results obtained by Jasien and Fitzgerald. The perpendicular (zz) component of the polarizability was particularly affected by the choice of the basis set. The 6-31G(d,p) basis set underestimated the perpendicular component by 43% for cytosine and 38% for guanine in comparison to the results of Jasien and Fitzgerald. All calculations of electric properties, therefore, were performed using somewhat larger [5s3p2d]/[3s2p] basis sets of Sadlej.21 These basis sets were designed to give accurate values of molecular electric properties such as dipole moments and polarizabilities. They contain two sets of doubly contracted polarization functions on both heavy nuclei and hydrogens but are still of fairly modest size.
Despite this, they are comparable in quality to larger ANO basis sets for the calculation of electric properties.22 The finite field method was employed to calculate the electric properties. We applied electric field perturbations of magnitude ±0.001 au along the x, y, and z axes. The numerically calculated SCF dipole moments were compared with the analytically calculated SCF values. They were found to match to within ±0.0001 au. We expect similar accuracy for the MP2 results. The finite field method was also used to calculate the three diagonal components and the in-plane (xy) off-diagonal component of the polarizability tensor. The other off-diagonal elements (xz and yz) were not calculated because in four of the bases (uracil, thymine, fluorouracil, hypoxanthine) they are identically zero while in the remaining four cases they are very small. For the off-diagonal component we used the equation23
$$E = E_0 - \mu_i F_i - \mu_j F_j - \tfrac{1}{2}\alpha_{ii}F_i^2 - \tfrac{1}{2}\alpha_{jj}F_j^2 - \alpha_{ij}F_iF_j \qquad (1)$$
using the previously determined values of µi, µj, αii, and αjj. The finite field method provides three to four significant digits for diagonal elements of the polarizability tensor. It provides no more than two digits for αxy, because of the error involved in using the above equation with our calculated values for µi, µj, αii, and αjj. All ab initio calculations were performed using the Gaussian 94 sets of programs.24 Results Geometries. The Cartesian coordinates of the optimized structures of the nucleic acid bases are given in Table 1a-h. There is good agreement with the previously published optimized geometries of adenine, cytosine, thymine, and guanine.
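As an illustration of the finite field procedure described under Methods, the following Python sketch recovers µ and α components from energies evaluated at field strengths of ±0.001 au. It is not the authors' code. The function names (model_energy, diagonal_terms, off_diagonal_term) are hypothetical, the central-difference formulas are one common way to obtain µi and αii (the text does not state which formulas were used), and the energy is an exactly quadratic stand-in for eq 1 built from the uracil SCF entries of Tables 2a and 4a rather than an ab initio calculation. The field-free energy E0 is an arbitrary placeholder.

```python
# Finite-field sketch. All quantities are in atomic units.
# ASSUMPTION: model_energy() replaces a real SCF/MP2 run in a uniform field.

AXES = ("x", "y", "z")
MU = {"x": 0.532, "y": -1.919, "z": 0.000}       # uracil SCF, Table 2a
ALPHA_DIAG = {"x": 86.3, "y": 68.3, "z": 39.8}   # uracil SCF, Table 4a
ALPHA_XY = -4.0                                  # uracil SCF, Table 4a
E0 = -400.0   # hypothetical field-free energy, for illustration only
FIELD = 0.001  # field strength quoted in the text


def model_energy(field):
    """Quadratic model of eq 1 for a field given as {axis: strength}."""
    f = {axis: field.get(axis, 0.0) for axis in AXES}
    energy = E0
    for axis in AXES:
        energy -= MU[axis] * f[axis] + 0.5 * ALPHA_DIAG[axis] * f[axis] ** 2
    energy -= ALPHA_XY * f["x"] * f["y"]
    return energy


def diagonal_terms(energy, axis, h=FIELD):
    """Central differences for mu_i and alpha_ii from E(+h), E(-h), E(0)."""
    e_plus = energy({axis: h})
    e_minus = energy({axis: -h})
    e_zero = energy({})
    mu = -(e_plus - e_minus) / (2.0 * h)
    alpha = -(e_plus + e_minus - 2.0 * e_zero) / (h * h)
    return mu, alpha


def off_diagonal_term(energy, i, j, h=FIELD):
    """Solve eq 1 for alpha_ij using a field applied along both i and j."""
    mu_i, a_ii = diagonal_terms(energy, i, h)
    mu_j, a_jj = diagonal_terms(energy, j, h)
    e_ij = energy({i: h, j: h})
    e_zero = energy({})
    remainder = (e_ij - e_zero + mu_i * h + mu_j * h
                 + 0.5 * a_ii * h * h + 0.5 * a_jj * h * h)
    return -remainder / (h * h)


if __name__ == "__main__":
    for axis in AXES:
        mu, alpha = diagonal_terms(model_energy, axis)
        print(f"mu_{axis} = {mu:.4f}  alpha_{axis}{axis} = {alpha:.2f}")
    print(f"alpha_xy = {off_diagonal_term(model_energy, 'x', 'y'):.2f}")
```

Because the model energy is exactly quadratic, the sketch returns its input values almost to machine precision. With real SCF or MP2 energies, the convergence of the energy and the neglected higher-order terms limit the accuracy, and the off-diagonal element inherits the errors of the four previously determined quantities, which is consistent with the two significant digits quoted for αxy.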
TABLE 1: Cartesian Coordinates (in Å) of the MP2 Optimized Structures of Nucleic Acid Bases

(a) Uracil
atom            X           Y           Z
N(1)    -1.174296    1.018261    0.000007
C(2)    -1.224121   -0.368769   -0.000009
N(3)     0.034686   -0.941961   -0.000070
C(4)     1.291347   -0.311259    0.000004
C(5)     1.202336    1.141382   -0.000024
C(6)    -0.006214    1.742462    0.000001
H(7)    -2.071777    1.475157    0.000058
O(8)    -2.270502   -1.000933    0.000028
H(9)     0.047935   -1.953045   -0.000044
O(10)    2.322553   -0.974376    0.000045
H(11)    2.115284    1.711337   -0.000021
H(12)   -0.125868    2.815544    0.000024

(b) Thymine
atom            X           Y           Z
N(1)     1.114086   -1.236750    0.000000
C(2)     1.626210    0.048783    0.000019
N(3)     0.628974    1.009145    0.000007
C(4)    -0.760116    0.827576    0.000006
C(5)    -1.183411   -0.570551    0.000007
C(6)    -0.231890   -1.531627   -0.000005
H(7)     1.805414   -1.969089   -0.000004
O(8)     2.823480    0.300037   -0.000013
H(9)     0.951135    1.967647   -0.000008
O(10)   -1.520183    1.792956   -0.000009
C(11)   -2.650450   -0.850987   -0.000002
H(12)   -3.122861   -0.409559    0.875171
H(13)   -3.122950   -0.409083   -0.874888
H(14)   -2.842326   -1.921587   -0.000289
H(15)   -0.473617   -2.585055   -0.000008

(c) Fluorouracil
atom            X           Y           Z
N(1)    -1.147536   -1.235494   -0.000019
C(2)    -1.669667    0.045107   -0.000004
N(3)    -0.681322    1.022431    0.000010
C(4)     0.711161    0.878370    0.000005
C(5)     1.109391   -0.526520    0.000003
C(6)     0.202197   -1.521727   -0.000002
H(7)    -1.829526   -1.976267    0.000026
O(8)    -2.867240    0.288889    0.000006
H(9)    -1.022647    1.974661    0.000002
O(10)    1.469841    1.838052   -0.000011
F(11)    2.426759   -0.787273    0.000006
H(12)    0.490027   -2.560881    0.000013

(d) Cytosine
atom            X           Y           Z
N(1)    -1.181728    0.984620   -0.000913
C(2)    -1.170446   -0.431707   -0.000335
N(3)     0.075738   -1.024211    0.012092
C(4)     1.158736   -0.276137    0.009044
C(5)     1.156448    1.158983    0.001408
C(6)    -0.064015    1.752433   -0.000193
H(7)    -2.101277    1.398998   -0.003885
O(8)    -2.238348   -1.031810   -0.002788
N(9)     2.359310   -0.928447   -0.045951
H(10)    2.306964   -1.917158    0.139261
H(11)    3.178525   -0.454818    0.291660
H(12)    2.065043    1.738027   -0.013026
H(13)   -0.207098    2.823226   -0.004795

(e) 5-Methylcytosine
atom            X           Y           Z
N(1)     1.113522   -1.215590   -0.001083
C(2)     1.596380    0.111469   -0.001026
N(3)     0.629870    1.096445    0.013088
C(4)    -0.644885    0.769775    0.013651
C(5)    -1.157358   -0.577622    0.007824
C(6)    -0.203202   -1.546916    0.002958
H(7)     1.829347   -1.926255   -0.007144
O(8)     2.806348    0.307875   -0.005699
N(9)    -1.541628    1.804138   -0.047945
H(10)   -1.132745    2.708918    0.125277
H(11)   -2.465441    1.662026    0.320998
C(12)   -2.623545   -0.877965   -0.007493
H(13)   -3.120341   -0.521178    0.896339
H(14)   -2.793114   -1.950722   -0.068731
H(15)   -3.111788   -0.410947   -0.862475
H(16)   -0.439304   -2.602465   -0.003940

(f) Adenine
atom            X           Y           Z
N(1)     1.961731    0.450880    0.014463
C(2)     1.340334    1.650314    0.005335
N(3)     0.030085    1.916833   -0.007013
C(4)    -0.686339    0.783329   -0.007526
C(5)    -0.193767   -0.524396   -0.006637
C(6)     1.206939   -0.655693    0.006524
N(7)    -1.207847   -1.458910    0.007506
C(8)    -2.304179   -0.716359    0.007479
N(9)    -2.053714    0.631061   -0.002746
H(10)    1.998803    2.509461    0.011514
N(11)    1.814889   -1.872356   -0.047258
H(12)    2.790579   -1.888555    0.197324
H(13)    1.259861   -2.671943    0.206417
H(14)   -3.309583   -1.103368    0.014906
H(15)   -2.729298    1.377871   -0.004809

(g) Hypoxanthine
atom            X           Y           Z
N(1)     1.884735    0.432049   -0.000004
C(2)     1.327994    1.679683   -0.000032
N(3)     0.044479    1.920233   -0.000019
C(4)    -0.676043    0.761136    0.000034
C(5)    -0.223298   -0.557971    0.000036
C(6)     1.197205   -0.816604    0.000031
N(7)    -1.268429   -1.447891   -0.000018
C(8)    -2.343004   -0.674087   -0.000032
N(9)    -2.041504    0.662059   -0.000037
H(10)    2.892657    0.345448   -0.000007
H(11)    2.025132    2.506888   -0.000026
O(12)    1.817807   -1.871254    0.000018
H(13)   -3.360527   -1.026448   -0.000122
H(14)   -2.684046    1.438117    0.000548

(h) Guanine
atom            X           Y           Z
N(1)    -1.474127    0.783864    0.001244
C(2)    -1.698418   -0.568414    0.011679
N(3)    -0.747895   -1.468097    0.019161
C(4)     0.488553   -0.892081   -0.000680
C(5)     0.837614    0.455259    0.011445
C(6)    -0.208689    1.445598    0.005117
N(7)     2.202610    0.624348    0.004532
C(8)     2.673637   -0.611081   -0.006214
N(9)     1.682183   -1.560993   -0.007483
H(10)   -2.259657    1.414656   -0.088229
N(11)   -3.022269   -0.963106   -0.061862
H(12)   -3.663262   -0.426937    0.503255
H(13)   -3.112904   -1.955758    0.094357
O(14)   -0.157442    2.668142   -0.007285
H(15)    3.716342   -0.879486   -0.013173
H(16)    1.790156   -2.562307   -0.017738
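Because Table 1 lists Cartesian coordinates rather than internal coordinates, bond lengths have to be recomputed from the atomic positions. The short Python sketch below shows how this might be done for two of the bonds discussed in the Geometries section. The dictionary layout and the function name bond_length are illustrative choices, not part of the original work, and only the four atoms needed for the example are copied from Table 1.

```python
import math

# Coordinates in angstroms, copied from Table 1d (cytosine) and 1f (adenine).
# Only the atoms needed for this illustration are included.
COORDS = {
    "cytosine": {
        "C(4)": (1.158736, -0.276137, 0.009044),
        "N(9)": (2.359310, -0.928447, -0.045951),
    },
    "adenine": {
        "C(6)": (1.206939, -0.655693, 0.006524),
        "N(11)": (1.814889, -1.872356, -0.047258),
    },
}


def bond_length(molecule, atom_a, atom_b):
    """Euclidean distance (in angstroms) between two atoms of one molecule."""
    return math.dist(COORDS[molecule][atom_a], COORDS[molecule][atom_b])


if __name__ == "__main__":
    print(f"cytosine N(9)-C(4): {bond_length('cytosine', 'N(9)', 'C(4)'):.3f}")
    print(f"adenine N(11)-C(6): {bond_length('adenine', 'N(11)', 'C(6)'):.3f}")
```

The two distances obtained this way can be compared directly with the amino C-N bond lengths quoted in the text for cytosine (1.367 Å) and adenine (1.361 Å).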
TABLE 2

(a) Dipole Moment Components (au)
base              method       µx        µy        µz
uracil            SCF        0.532    -1.919     0.000
                  MP2        0.472    -1.640     0.000
thymine           SCF        0.377     1.903     0.000
                  MP2        0.345     1.662     0.000
fluorouracil      SCF        0.480     1.756     0.000
                  MP2        0.413     1.495     0.000
cytosine          SCF       -1.839    -2.128     0.292
                  MP2       -1.655    -1.836     0.272
5-methylcytosine  SCF        2.629     1.364     0.299
                  MP2        2.368     1.195     0.269
adenine           SCF        0.989     0.258     0.289
                  MP2        0.971     0.319     0.269
hypoxanthine      SCF        0.606    -2.266     0.000
                  MP2        0.612    -1.926     0.000
guanine           SCF        1.248     2.460     0.359
                  MP2        1.094     2.192     0.338

(b) Dipole Moments (au)
base              µSCF         µMP2         µSCF         µMP2         µ_DN+d^LDF b µ_DNP^LDF c  µMP2(fc) d   µexp
                  (this work)  (this work)  (ref 4)      (ref 4)      (ref 7)      (ref 7)      (ref 10)     (ref 4)
uracil            1.99         1.71         2.05         1.74         1.79         1.86                      1.7
thymine           1.94         1.70         2.00         1.69a        1.72         1.82         1.52         1.6
fluorouracil      1.82         1.55
cytosine          2.83         2.49         3.11         2.7          2.62         2.73         2.42         2.8e
5-methylcytosine  2.98         2.67
adenine           1.06         1.06         0.95         1.0a         0.95         0.95         0.98         1.2f
hypoxanthine      2.35         2.02
guanine           2.78         2.52         3.07         2.8a         2.84         2.86         2.42
a Extrapolated MP2 value. b,c µ_DN+d^LDF and µ_DNP^LDF denote LDF results obtained with a double numerical basis set without and with 2p polarization functions on hydrogens, respectively. d fc stands for frozen core approximation. e According to Bakalarski et al.10 this value should only be treated as a crude estimate. f Experimental value for 9-methyladenine.

Most of our optimized bond lengths and bond angles agree with those of the previous works to within ±0.005 Å and ±1°. However, there are several larger discrepancies between our optimized structures and the previous results of Stewart et al. We find the length of the N(9)-C(4) bond in cytosine to be 1.367 Å; Stewart et al. give this bond length as 1.358 Å. The N(11)-C(6) bond in adenine is 1.361 Å by our calculations; Stewart et al. list it as 1.353 Å. The N(3)-C(2) and N(11)-C(2) bonds in guanine have lengths 1.309 and 1.383 Å, respectively; Stewart et al. find these bonds to be 1.314 and 1.363 Å in length. The origin of these discrepancies is not entirely clear, since Stewart et al. did not specify their convergence criteria or the number of occupied orbitals used in the MP2 calculations. The most accurate results of Šponer and Hobza for all four bases, obtained with the methods and basis sets listed earlier, agree with our results to within the ±0.005 Å and ±1° thresholds. Only bases containing amino groups (cytosine, 5-methylcytosine, adenine, and guanine) were found to be significantly nonplanar. The amino nitrogens of these bases deviated from the planarity of the ring, giving dihedral angles of about 3°. The dihedral angles involving amino hydrogens for three of the bases (cytosine, 5-methylcytosine, and adenine) ranged from 10° to 28°, but for guanine they were found to be 8.8° for H13-N11-C2-N1 and 42.2° for H12-N11-C2-N1. These results validate the approach of Šponer and Hobza, who showed that constraining the rings of adenine, cytosine, and guanine to be planar while allowing the amino hydrogens to be nonplanar is a good approximation to the fully optimized geometries.6 Several pairs of bases (e.g., uracil and thymine) differ by only a single functional group. This allows us to investigate the effects of the substitution of certain functional groups on the optimized geometries of these pairs. We compared the geometries of the pairs uracil/thymine, uracil/fluorouracil, cytosine/5-methylcytosine, and hypoxanthine/guanine, which differ by substitution of a hydrogen atom in the former by a methyl group, a fluorine atom, a methyl group, and an amino group, respectively, in the latter. The most visible structural differences are between uracil and fluorouracil. The substitution of a fluorine atom at the 5-position distorts bond lengths all around the ring and also affects bond angles near the 5-position.
The bond lengths are distorted in an alternating fashion: the N(1)-C(2) and N(3)-C(4) bonds are shorter in fluorouracil, while the C(2)-N(3), C(4)-C(5), and C(6)-N(1) bonds are longer. Differences in the geometries of the other pairs are present but not as pronounced. The pairs uracil/thymine and cytosine/5-methylcytosine are related in that each pair differs by the addition of a methyl group at the 5-position. Not surprisingly, the changes for each pair nearly shadow one another. With the methyl substitution, the C(4)-C(5) bond is stretched and the C(6)-C(5)-R (R = H or CH3) angle is increased. The N(3)-C(4)-C(5) angle decreases by about 1.5° in each case. Except for the amino group, there are no major differences between the structures of hypoxanthine and guanine. Dipole Moments. Table 2a shows the components of the dipole moments of the eight bases at the SCF and MP2 levels of theory. The total dipole moments are given in Table 2b. Our MP2 results are consistent with experimental data for uracil and thymine. Bakalarski et al.10 note that while the dipole moments of thymine and uracil have been measured, the dipole moments of adenine, cytosine, and guanine are still unknown. They also note that experimental values for the dipole moments of adenine and cytosine are often misquoted in other papers. Our MP2 value of the dipole moment of adenine is higher than any previously calculated value. Electron correlation decreases the magnitude of the dipole moment of all of the bases studied except adenine, for which the correlation contribution is negligible. For the other bases the correlation contribution modifies the SCF values by 9-15%. Since the precise orientations of the bases used in the calculations of Basch et al. and of Jasien and Fitzgerald were not provided, we can only compare the total dipole moments. At the SCF level, our results for uracil, thymine, and adenine resemble those of Basch et al. The major differences are for guanine (10%) and cytosine (8%).
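The relation between the components in Table 2a and the total dipole moments in Table 2b, and the size of the correlation contribution, can be checked with a few lines of Python. This is only an illustrative sketch: the data layout and function names are hypothetical, and three bases are copied from Table 2a as examples. The conversion factor from atomic units to debye (1 au ≈ 2.5417 D) is a standard constant, not a value taken from the text.

```python
import math

AU_TO_DEBYE = 2.541746  # standard conversion, 1 e*a0 in debye

# (mu_x, mu_y, mu_z) in atomic units, copied from Table 2a.
DIPOLE_COMPONENTS = {
    "uracil": {"SCF": (0.532, -1.919, 0.000), "MP2": (0.472, -1.640, 0.000)},
    "cytosine": {"SCF": (-1.839, -2.128, 0.292), "MP2": (-1.655, -1.836, 0.272)},
    "adenine": {"SCF": (0.989, 0.258, 0.289), "MP2": (0.971, 0.319, 0.269)},
}


def magnitude(components):
    """Length of the dipole vector."""
    return math.sqrt(sum(c * c for c in components))


def correlation_change_percent(base):
    """Relative change of |mu| on going from SCF to MP2, in percent."""
    scf = magnitude(DIPOLE_COMPONENTS[base]["SCF"])
    mp2 = magnitude(DIPOLE_COMPONENTS[base]["MP2"])
    return 100.0 * (mp2 - scf) / scf


if __name__ == "__main__":
    for base, levels in DIPOLE_COMPONENTS.items():
        mp2 = magnitude(levels["MP2"])
        print(f"{base:10s} |mu(MP2)| = {mp2:.2f} au = {mp2 * AU_TO_DEBYE:.2f} D, "
              f"SCF -> MP2 change = {correlation_change_percent(base):+.1f}%")
```

For uracil and cytosine the change falls inside the 9-15% range quoted above, while for adenine it is well below 1%.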
TABLE 3: SCF Quadrupole Moments (au)a
base                  Θxx       Θyy       Θxy       Θxz       Θyz
uracil            -13.314    11.657    -2.829     0.000     0.000
thymine           -11.393     9.032     3.501     0.000     0.000
fluorouracil      -16.523    12.158    -1.374     0.000     0.000
cytosine            1.326     3.103   -11.763     2.857    -0.980
5-methylcytosine   -9.175    11.231    -9.914    -2.009     2.054
adenine            11.015    -4.148    -6.982     1.811    -2.172
hypoxanthine       18.915   -15.194     8.003    -0.003     0.002
guanine            17.500   -12.032   -11.903    -4.604    -0.835
a The quadrupole moment tensor is traceless, i.e., Θxx + Θyy + Θzz = 0.

The results listed as µMP2 (ref 4) in Table 2b are, however, somewhat misleading. As
stated in that paper, MP2 calculations with polarized basis sets were actually carried out only for cytosine and uracil, while values for the other bases were extrapolated from calculations with smaller basis sets. Of the bases for which MP2 results were extrapolated, agreement worsens as the size of the base (number of atoms) increases. Since our determination of these values did not involve extrapolation, we can infer that the extrapolated results for larger bases (i.e., adenine and guanine) are not very reliable. There is a difference ranging from 7 to 13% for all of the bases between our results and those of Jasien and Fitzgerald in which 2p polarization functions on hydrogens were included (µ_DNP^LDF). Again, our results are closer to experimental values than those of Jasien and Fitzgerald for both uracil and thymine. Surprisingly, our results are in better agreement with the LDF results obtained with a smaller basis set without polarization functions on hydrogens. As with geometries, we compared dipole moments for related bases. The magnitude of the MP2 dipole moment of fluorouracil is about 9% smaller than that of uracil. The magnitude of the dipole moment of 5-methylcytosine exceeds that of cytosine by 7% at the MP2 level, while the dipole moments of thymine
and uracil are very similar. This is surprising, since both pairs involve substitution of a methyl group at the 5-position. The only difference between hypoxanthine and guanine is the presence of an amino group at the 2-position. This increases the dipole moment of guanine by 20% in comparison with hypoxanthine. Quadrupole Moments. The most important factor in characterizing the charge distribution in a molecule is always the lowest nonvanishing multipole moment. But there is also some interest in the values of higher multipole moments, for example, in estimating the importance of higher order terms in the multipole expanded intermolecular electrostatic interaction energy. With this in mind, we give in Table 3 the SCF values of the quadrupole moments. Unfortunately, there are no previous values with which to compare our results. As is well-known, the values of the components of the quadrupole moment tensor of polar molecules are origin-dependent, but values for the new origin can be easily obtained from the formula23
$$\Theta'_{\alpha\beta} = \Theta_{\alpha\beta} - \left(\tfrac{3}{2} r'_{\alpha}\mu_{\beta} + \tfrac{3}{2} r'_{\beta}\mu_{\alpha} - r'_{\gamma}\mu_{\gamma}\delta_{\alpha\beta}\right) \qquad (2)$$
where r′ is the position of the new origin relative to the old one and summation over the repeated index γ is implied.
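As a hedged illustration of eq 2, the Python sketch below moves the quadrupole tensor of a neutral molecule to a new origin. The function name shift_quadrupole is hypothetical, and the displacement of 1 bohr along x is an arbitrary example, not a value from this work. The tensor and dipole are the uracil SCF entries of Tables 3 and 2a, with Θzz fixed by the traceless condition. Note that r′ must be expressed in atomic units (1 Å ≈ 1.8897 bohr) when the coordinates of Table 1 are used.

```python
def shift_quadrupole(theta, mu, r_new):
    """Apply eq 2: traceless quadrupole of a neutral molecule at a new origin.

    theta : 3x3 nested list, quadrupole tensor at the old origin (au)
    mu    : dipole moment vector (au)
    r_new : position of the new origin relative to the old one (bohr)
    """
    r_dot_mu = sum(r * m for r, m in zip(r_new, mu))
    shifted = []
    for a in range(3):
        row = []
        for b in range(3):
            correction = 1.5 * r_new[a] * mu[b] + 1.5 * r_new[b] * mu[a]
            if a == b:
                correction -= r_dot_mu
            row.append(theta[a][b] - correction)
        shifted.append(row)
    return shifted


# Uracil SCF values (au): Table 3 and Table 2a; theta_zz = -(theta_xx + theta_yy).
THETA_URACIL = [
    [-13.314, -2.829, 0.000],
    [-2.829, 11.657, 0.000],
    [0.000, 0.000, 1.657],
]
MU_URACIL = (0.532, -1.919, 0.000)

if __name__ == "__main__":
    # Hypothetical shift of the origin by 1 bohr along x, for illustration only.
    for row in shift_quadrupole(THETA_URACIL, MU_URACIL, (1.0, 0.0, 0.0)):
        print("  ".join(f"{value:9.4f}" for value in row))
```

The correction term is itself traceless, so the shifted tensor remains traceless, and a zero displacement returns the original tensor unchanged.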
Polarizabilities. The calculated SCF values of the mean polarizabilities are in excellent agreement with the experimental data25 as well as the calculations of Basch et al.4 The inclusion of second-order correlation yields excellent agreement with the results of the LDF calculations of Jasien and Fitzgerald. This confirms the conclusion of Jasien and Fitzgerald that the results of LDF calculations mirror MP2 results for the nucleic acid bases. While the MP2 data from Basch et al. for uracil and cytosine are in good agreement with ours, the data for thymine, adenine, and guanine are more noticeably different. These are the bases for which results were extrapolated from calculations with smaller basis sets. Again, we believe the extrapolated results are not very reliable.
TABLE 4

(a) Components of Dipole Polarizabilities (au)
base              method      αxx       αxy       αyy       αzz
uracil            SCF        86.3      -4.0      68.3      39.8
                  MP2        95.7      -1.6      75.9      42.2
thymine           SCF       100.1      -7.1      81.3      48.2
                  MP2       111.8      -4.7      88.2      51.0
fluorouracil      SCF        82.3       9.0      73.0      39.1
                  MP2        93.5       7.5      79.4      41.8
cytosine          SCF        94.8      -0.3      74.1      43.7
                  MP2       107.5       0.8      83.1      46.8
5-methylcytosine  SCF       109.4      -5.0      85.9      51.8
                  MP2       123.5      -4.1      95.2      55.2
adenine           SCF       115.9      -0.9      99.3      52.0
                  MP2       125.4      -0.5     111.5      55.3
hypoxanthine      SCF       111.2      -2.3      93.0      48.5
                  MP2       120.5       0.5     102.3      51.1
guanine           SCF       121.0      -4.9     105.2      53.8
                  MP2       137.9      -3.4     114.6      57.5

(b) Dipole Polarizabilities and Anisotropies (au)b
base              ᾱSCF         ΔαSCF        ᾱMP2         ΔαMP2        ᾱ            Δα           ᾱ            Δα           ᾱ
                  (this work)  (this work)  (this work)  (this work)  (ref 4)      (ref 4)      (ref 7)      (ref 7)      (expt25)
uracil            64.8         41.2         71.3         46.9         68.1         46.7         71.9         46.6
thymine           76.5         47.1         83.7         53.7         76.1a        48.8a        85.2         54.4         75.8
fluorouracil      64.8         42.4         71.6         48.1
cytosine          70.9         44.5         79.1         52.9         76.7         54.6         79.1         53.9         69.5
5-methylcytosine  82.4         50.9         91.3         59.9
adenine           89.1         57.4         97.4         64.3         88.7a        59.5a        99.3         68.7         88.4
hypoxanthine      84.2         55.8         91.3         62.3
guanine           93.3         61.4         103.3        71.9         94.9a        67.0a        106.7        80.9         91.8
a Extrapolated MP2 value. b ᾱ = (αaa + αbb + αcc)/3, where αaa, αbb, and αcc are the eigenvalues of the polarizability tensor. Δα = (1/√2)[(αaa - αbb)² + (αaa - αcc)² + (αbb - αcc)²]^(1/2).
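The definitions in footnote b of Table 4 can be checked against the tabulated components with the short Python sketch below. Rather than diagonalizing the tensor, it uses the equivalent invariant form of the anisotropy written in Cartesian components, Δα² = ½[(αxx - αyy)² + (αxx - αzz)² + (αyy - αzz)²] + 3(αxy² + αxz² + αyz²). The function names are illustrative, and the xz and yz components, which were not calculated in this work, are assumed to be zero.

```python
import math


def mean_polarizability(xx, yy, zz):
    """One-third of the trace of the polarizability tensor."""
    return (xx + yy + zz) / 3.0


def anisotropy(xx, yy, zz, xy=0.0, xz=0.0, yz=0.0):
    """Polarizability anisotropy from Cartesian components.

    Equivalent to the eigenvalue expression in footnote b of Table 4.
    ASSUMPTION: xz and yz default to zero because they are not tabulated.
    """
    diagonal = (xx - yy) ** 2 + (xx - zz) ** 2 + (yy - zz) ** 2
    off_diagonal = xy ** 2 + xz ** 2 + yz ** 2
    return math.sqrt(0.5 * diagonal + 3.0 * off_diagonal)


# Uracil entries of Table 4a (au).
URACIL = {
    "SCF": {"xx": 86.3, "xy": -4.0, "yy": 68.3, "zz": 39.8},
    "MP2": {"xx": 95.7, "xy": -1.6, "yy": 75.9, "zz": 42.2},
}

if __name__ == "__main__":
    for level, a in URACIL.items():
        mean = mean_polarizability(a["xx"], a["yy"], a["zz"])
        delta = anisotropy(a["xx"], a["yy"], a["zz"], xy=a["xy"])
        print(f"uracil {level}: mean = {mean:.1f} au, anisotropy = {delta:.1f} au")
```

For uracil this reproduces the Table 4b entries (64.8 and 41.2 au at the SCF level, 71.3 and 46.9 au at the MP2 level), which also confirms the row ordering of the reconstructed table.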
TABLE 5: Nuclear Quadrupole Coupling Constants (in MHz) at Nitrogen and Oxygen Nucleia

(a) At Nitrogen Nuclei
base              atom        χaa       χbb       χcc
uracil            N(1)      -2.20     -2.43      4.63
                  N(3)      -2.38     -1.91      4.29
thymine           N(1)      -2.32     -2.35      4.67
                  N(3)      -2.20     -2.05      4.25
fluorouracil      N(1)      -2.56     -2.31      4.87
                  N(3)      -2.33     -2.02      4.35
cytosine          N(1)      -2.10     -1.87      3.97
                  N(3)      -3.11      3.74     -0.63
                  N(9)      -2.38     -2.18      4.56
5-methylcytosine  N(1)      -2.16     -1.86      4.02
                  N(3)      -2.54      3.26     -0.72
                  N(9)      -2.03     -2.35      4.38
adenine           N(1)       4.10     -2.64     -1.46
                  N(3)      -1.37      2.84     -1.47
                  N(7)      -2.09      4.11     -2.02
                  N(9)      -1.83     -1.79      3.62
                  N(11)     -2.28     -2.44      4.72
hypoxanthine      N(1)      -1.75     -1.83      3.58
                  N(3)      -1.41      2.73     -1.32
                  N(7)      -2.03      4.50     -2.47
                  N(9)      -1.77     -1.60      3.37
guanine           N(1)      -1.75     -2.45      4.20
                  N(3)      -2.98      2.95      0.03
                  N(7)      -0.32      2.86     -2.54
                  N(9)      -1.62     -1.85      3.47
                  N(11)     -2.38     -1.80      4.18

(b) At Oxygen Nuclei
base              atom        χaa       χbb       χcc
uracil            O(8)       2.67     -5.11      2.44
                  O(10)      1.42     -5.96      4.54
thymine           O(8)       6.39     -8.67      2.28
                  O(10)     -4.01     -0.56      4.57
fluorouracil      O(8)       6.38     -8.75      2.37
                  O(10)     -4.24     -0.56      4.80
cytosine          O(8)       2.73     -5.90      3.17
5-methylcytosine  O(8)       6.36     -9.34      2.98
hypoxanthine      O(12)     -5.88      2.16      3.72
guanine           O(14)    -10.04      6.29      3.75
a a, b, and c denote the principal axes of the moment of inertia (Ia ≤ Ib ≤ Ic).
The MP2 polarizability anisotropies for the pyrimidine bases are also very close to Jasien and Fitzgerald’s results. The differences are 0.3, 0.7, and 1.0 au for uracil, thymine, and cytosine, respectively. However, there are greater differences between their values and ours for the purine bases, namely, 4.4 and 9.0 au for adenine and guanine. Comparing pairs of bases as before with the optimized geometries and dipole moments, we find that the mean polarizability and polarizability anisotropy of uracil are hardly affected by the substitution of fluorine. Substitution of a methyl group has a much greater effect, though, as can be seen by comparing uracil and thymine as well as cytosine and 5-methylcytosine. This substitution increases the MP2 mean polarizability and anisotropy by about 15% for both pairs. Similar increases are noticed for the amino group substitution of the hypoxanthine/guanine pair. Nuclear Quadrupole Coupling Constants. Calculated values of the nuclear quadrupole coupling constants (χ) at nitrogen and oxygen nuclei are given in Table 5, a and b, respectively. In calculating the nuclear quadrupole coupling constants, we used the equation
$$\chi_{\alpha\alpha} = -234.964730\, Q\, q_{\alpha\alpha} \qquad (3)$$
in which Q denotes the nuclear quadrupole moment in barns (1 b = 10^-28 m^2) and qαα denotes the electric field gradient at a nucleus in atomic units (e/4πε0a0^3 = 9.717365 × 10^21 V m^-2);
the resulting coupling constants are in megahertz. The recommended values of the nuclear quadrupole moments are 20.1 mb for nitrogen and -25.58 mb for oxygen.26 For convenience, we give the values for all three inertial axes. The differences between nuclear quadrupole coupling constants for pyrrolic and pyrimidinic nitrogens are well-known.27 For the former, the largest in magnitude appears to always be χcc, i.e., the constant in the direction perpendicular to the heterocyclic ring. We also find that nitrogens in amine groups in cytosine, 5-methylcytosine, adenine, and guanine are qualitatively similar to pyrrolic nitrogens. In sharp contrast, the values of χcc for pyridinic nitrogens are, in our convention, negative. Only for N(3) in guanine is the value of χcc positive, albeit very slightly. A more complex pattern can be seen for oxygen coupling constants. They are, in general, somewhat larger in magnitude than the constants for the nitrogen nuclei. The largest in magnitude appear to be either the χaa or χbb components, although in two cases (O10 in thymine and fluorouracil) χcc is the largest. Conclusions We have optimized the geometries of eight nucleic acid bases at the MP2 level of theory and used those geometries to calculate several electric properties of the bases. The geometries and values of the properties we present appear to be the most accurate to date. The optimizations were performed without the frozen core approximation and without planarity constraints. Only the four bases containing amino groups were found to be significantly nonplanar. The calculations of electric properties utilized these optimized geometries as well as the basis sets of Sadlej, which are known to be reliable for calculating such properties. In most cases inclusion of electron correlation was necessary to obtain accurate results. 
Including electron correlation decreased the magnitude of the dipole moment of every base but adenine by 10-15% and increased the mean polarizability of every base by 8-12%. We found that, for calculations involving nucleic acid bases, results obtained with Sadlej’s basis sets can be very different from those obtained with 6-31G(d) or 6-31G(d,p), especially for polarizabilities. Our findings have important implications for work currently underway involving calculations of interaction energies of nucleic acid bases. Recently Hobza, Šponer, and co-workers have performed calculations to study hydrogen-bonded and stacking interactions of various nucleic acid base pairs.28-33 The 6-31G(d) and 6-31G(d,p) basis sets were used for these calculations. However, as noted earlier, the perpendicular (zz) component of the polarizability tensor is underestimated by as much as 40% with the 6-31G(d,p) basis set at the SCF level. Therefore, the relative strengths of hydrogen-bonded and stacking interactions will be biased in favor of the former ones, and the dispersion contribution to the energy of the stacking interaction will be substantially underestimated. Acknowledgment. Calculations reported in this work were performed on an SGI workstation purchased with NSF Grant DUE-9551091. They constituted a research project performed by the first eight authors on a one base per student basis as part of a course on “Molecular Orbital Theory” taught by the last author in the spring semester of 1996. Cartesian coordinates of all of the bases can be obtained via e-mail by sending a request to
[email protected].

References and Notes

(1) Abeles, R. H.; Frey, P. A.; Jencks, W. P. Biochemistry; Jones and Bartlett: Boston, 1992.
(2) White, A.; Handler, P.; Smith, E. L. Principles of Biochemistry, 4th ed.; McGraw-Hill: New York, 1968.
(3) Stewart, E. L.; Foley, C. K.; Allinger, N. L.; Bowen, J. P. J. Am. Chem. Soc. 1994, 116, 7283.
(4) Basch, H.; Garmer, D. R.; Jasien, P. G.; Krauss, M.; Stevens, W. J. Chem. Phys. Lett. 1989, 163, 514.
(5) Leszczyński, J. Int. J. Quantum Chem., Quantum Biol. Symp. 1992, 19, 43.
(6) Šponer, J.; Hobza, P. J. Phys. Chem. 1994, 98, 3161.
(7) Jasien, P. G.; Fitzgerald, G. J. Chem. Phys. 1990, 93, 2554.
(8) Kwiatkowski, J. S.; Leszczyński, J. J. Mol. Struct. (THEOCHEM) 1990, 208, 35.
(9) Leszczyński, J. J. Phys. Chem. 1992, 96, 1649.
(10) Bakalarski, G.; Grochowski, P.; Kwiatkowski, J. S.; Lesyng, B.; Leszczyński, J. Chem. Phys. 1996, 204, 301.
(11) Ha, T. K.; Gunthard, H. H. J. Mol. Struct. (THEOCHEM) 1992, 276, 209.
(12) Šponer, J.; Hobza, P. Chem. Phys. 1996, 204, 365.
(13) Alkorta, I.; Perez, J. J. Int. J. Quantum Chem. 1996, 57, 123.
(14) Šponer, J.; Hobza, P. Int. J. Quantum Chem. 1996, 57, 959.
(15) Fratini, A. V.; Kopka, M. L.; Drew, H. R.; Dickerson, R. E. J. Biol. Chem. 1982, 257, 14686.
(16) Arnott, S.; Hukins, D. W. L. J. Mol. Biol. 1973, 81, 93.
(17) Mayer, R. J. New England J. Med. 1990, 322, 399.
(18) Kornberg, A.; Baker, T. A. DNA Replication, 2nd ed.; Freeman: New York, 1992.
(19) Hinchliffe, A.; Soscun, H. J. J. Mol. Struct. (THEOCHEM) 1995, 331, 109.
(20) Saenger, W. Principles of Nucleic Acid Structure; Springer-Verlag: New York, 1984.
(21) Sadlej, A. J. Collect. Czech. Chem. Commun. 1988, 53, 1995.
(22) Sekino, H.; Bartlett, R. J. J. Chem. Phys. 1993, 98, 3022.
(23) Buckingham, A. D. Adv. Chem. Phys. 1967, 12, 107.
(24) Gaussian 94, Revision B.3: Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Gill, P. M. W.; Johnson, B. G.; Robb, M. A.; Cheeseman, J. R.; Keith, T.; Petersson, G. A.; Montgomery, J. A.; Raghavachari, K.; Al-Laham, M. A.; Zakrzewski, V. G.; Ortiz, J. V.; Foresman, J. B.; Peng, C. Y.; Ayala, P. Y.; Chen, W.; Wong, M. W.; Andres, J. L.; Replogle, E. S.; Gomperts, R.; Martin, R. L.; Fox, D. J.; Binkley, J. S.; Defrees, D. J.; Baker, J.; Stewart, J. P.; Head-Gordon, M.; Gonzalez, C.; Pople, J. A. Gaussian, Inc., Pittsburgh, PA, 1995.
(25) Bottcher, C. F. J. Theory of Electric Polarization; Elsevier: Amsterdam, 1952.
(26) Pyykkö, P. Z. Naturforsch. 1992, 47A, 189.
(27) Brown, R. D.; Godfrey, P. D.; McNaughton, D.; Pierlot, A. P. J. Chem. Soc., Chem. Commun. 1989, 37.
(28) Hobza, P.; Šponer, J.; Polasek, M. J. Am. Chem. Soc. 1995, 117, 792.
(29) Šponer, J.; Leszczyński, J.; Hobza, P. J. Phys. Chem. 1996, 100, 1965.
(30) Šponer, J.; Leszczyński, J.; Hobza, P. J. Comput. Chem. 1996, 17, 841.
(31) Šponer, J.; Leszczyński, J.; Hobza, P. J. Phys. Chem. 1996, 100, 5590.
(32) Šponer, J.; Florian, J.; Leszczyński, J.; Hobza, P. J. Biomol. Struct. Dyn. 1996, 13, 695.
(33) Šponer, J.; Florian, J.; Hobza, P.; Leszczyński, J. J. Biomol. Struct. Dyn. 1996, 13, 827.