J. Phys. Chem. B 2009, 113, 2933–2936
2933
Nature of Bonding in Nine Planar Hydrogen-Bonded Adenine · · · Thymine Base Pairs Jiong Ran† and Pavel Hobza*,†,‡ Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Molecular Systems, FlemingoVo na´m. 2, 166 10 Prague 6, Czech Republic, and Department of Physical Chemistry, Palacky´ UniVersity, Olomouc, 771 46 Olomouc, Czech Republic ReceiVed: NoVember 13, 2008; ReVised Manuscript ReceiVed: January 5, 2009
In this work, we investigate the mode of binding of all nine hydrogen-bonded structures of the adenine · · · thymine base pair. The planar H-bonded structures were optimized at the MP2/cc-pVTZ level, and the respective interaction energies, corrected for the basis set superposition error, were determined with the aug-cc-pVDZ basis set. The energy components were obtained from the DFT-SAPT procedure using the aug-cc-pVDZ basis set. The charge-transfer character of the single structures was estimated using NBO characteristics. It was established that dipole-dipole interaction itself cannot explain the preferred structure of the pair. Of the various energy components, first-order electrostatic energy plays the most important role. Second-order energy (the sum of induction and dispersion energies) amounts to about 56% of the electrostatic energy. The δ(HF) term covering among others the charge-transfer energy is rather large. The importance of δ(HF) is reflected by the NBO characteristics and especially by the NBO charge-transfer energy. The sum of the second-order energy and the δ(HF) term is only slightly smaller than the electrostatic energy (75-77%), which reflects the importance of the nonelectrostatic terms even in the case of strong H-bonded complexes. The WC structure, which exists in DNA, represents the seventh local minimum, while the three most stable structures utilize the N9-H proton donor group of the five-membered ring. Introduction DNA contains two hydrogen-bonded nucleic acid base pairs, adenine (A) · · · thymine (T) and guanine (G) · · · cytosine (C). Adenine and thymine as well as guanine and cytosine can bind in different ways, but in DNA only one structural motif exists, namely the Watson-Crick (WC) one. It should be mentioned that nucleic acids contain also other types, such as the reversed WC (rWC), Hoogsteen (H), and reversed Hoogsteen (rH). Even broader possibilities exist in the gas phase, and molecular dynamic/quenching calculations have revealed1 as many as 12 energy minima on the potential energy surfaces (PESs) of the AT and GC pairs. In the case of the GC PES, the WC structure with three H-bonds (which exists in DNA) is clearly the most stable structure, with the other structures being about 3 kcal/ mol or more less stable.1 The situation with the AT PES is different: here the WC arrangement does not correspond to the global minimum but only to the lower local minimum. The fact that the WC arrangement of the AT base pair (not being the global energy minimum) exists in DNA is not fully understood, but a currently accepted hypothesis explains it by the considerably shorter lifetime of AT and GC WC structures (compared with other structures) in the electronically excited states.2 The aim of the present study is to explain the nature of stabilization in all the planar H-bonded AT pairs and, particularly, to elucidate the preferential stabilization in other than WC structures. The stabilization of the planar H-bonded structures stems from the electrostatic, induction, and charge-transfer contributions. For an explanation of the preferential stabilization of one structure, all these contributions should be properly * Corresponding author. † Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Molecular Systems. ‡ Palacky´ University.
considered; i.e., in addition to the total interaction energies also their components should be determined. Methods Accurate stabilization energies for various types of noncovalent complexes are only obtained at the CCSD(T) level using the complete basis set (CBS) limit. This is true for both stacked structures and for weak X-H · · · Y (Y being mostly π-electrons) H-bonded systems. For medium and strong H-bonded complexes, the situation is different, and here the MP2 procedure with extended basis sets and/or at the CBS limit already yields reliable energy as well as geometry characteristics.3 This arises from the fact that for these clusters the CCSD(T) correction term (the difference between the MP2 and CCSD(T) interaction energies) is negligible. It should be added here that for stacked structures, where a dominant part of the stabilization comes from dispersion energy, this term is large and thus cannot be neglected. The MP2 interaction energies were determined with the aug-cc-pVDZ basis set, and we have evidence that these energies are fairly close to the CBS limit for H-bonded complexes.3 The CCSD(T)/CBS and MP2/CBS interaction energies for the Watson-Crick and Hoogsteen arrangements in the optimized as well as crystal structures can be found in ref 3, while density functional theory (DFT) symmetry adapted perturbation treatment (SAPT) perturbative energies and their components for the crystal structures of the Watson-Crick arrangement are presented in ref 4. The structures of canonical adenine and thymine, and all nine of the planar H-bonded structures (cf. Figures 1 and 2) were optimized without any constraints at the MP2 level using the cc-pVTZ basis set. All the interaction energies were determined with the aug-cc-pVDZ basis set and corrected for basis set superposition error (BSSE). The frozen core approximation was used systematically throughout the paper.
10.1021/jp810001v CCC: $40.75 2009 American Chemical Society Published on Web 02/10/2009
2934 J. Phys. Chem. B, Vol. 113, No. 9, 2009
Ran and Hobza
Figure 1. Optimized planar (Cs) structures of adenine (A) and thymine (T). The arrows indicate the magnitudes and orientations of the molecular dipole moments.
To obtain a more detailed description of the bonding, we adopted the natural bond order (NBO) technique5 as implemented in the Gaussian 03 code.6 The primary aim was to determine the hyperconjugative charge-transfer E(2) energy describing the interaction between the filled orbitals in one subsystem and the unfilled orbitals in another (cf. eq 1):
E(2) ) ησFij2 /∆E
(1)
where Fij is the Fock matrix element between the i and j NBO orbitals, ∆E is the energy difference between the filled and unfilled NBO orbitals, and ησ is the population of the donor σ orbital. It should be mentioned that this energy is not directly defined in the perturbation treatment (see below). The charge-transfer energy within the NBO calculation is only obtained if the wave function is defined. These calculations were thus performed at the B3LYP/cc-pVTZ level, and the MP2/cc-pVTZ geometries were systematically utilized. In addition to the NBO analysis, the symmetry adapted perturbation treatment (SAPT)7 was also used and the DFTSAPT calculations8 were performed with the aug-cc-pVDZ basis set. The method allows for the separation of the interaction energies into physically well-defined components, such as those arising from electrostatic, induction, dispersion, and exchange. The DFT-SAPT interaction energy (Eint) is given by eq 2: (1) (1) (2) (2) Eint ) EPol + EEx + E(2) Ind + EEx-Ind + ED + (2) + σ(HF) (2) EEx-D
where the single terms describe the electrostatic, exchangerepulsion, induction, exchange-induction, dispersion, and exchange-dispersion terms. The last term is a Hartree-Fock correction for higher-order contributions to interaction energy.
Figure 2. Optimized planar (Cs) structures of adenine · · · thymine base pairs. The arrows indicate the magnitudes and orientations of the subsystem dipole moments. Structures 9, 8, 5, and 4 correspond to Watson-Crick, reverse Watson-Crick, Hoogsteen, and reverse Hoogsteen arrangements, respectively.
Throughout the study, the exchange-induction and exchangedispersion terms were included in the parent induction and dispersion terms. The SAPT interaction energies calculated with the aug-cc-pVDZ basis sets are known to be slightly underestimated, which mainly concerns the dispersion energy.9 The augcc-pVTZ SAPT energy can be approximately constructed by enlarging the dispersion energy by 15% while keeping the other terms at their aug-cc-pVDZ values. Helgaker’s CBS extrapolation scheme10 was used to estimate the SAPT CBS energy. Comparison of MP2 interaction energies of various base pairs determined at aug-cc-pVDZ, aug-cc-pVTZ, aug-cc-pVQZ, and CBS limit levels can be found in ref 3. Further, comparison of MP2, CCSD(T), and SAPT interaction energies for these complexes is presented in ref 4. Results and Discussion Isolated Systems. Figure 1 shows the optimized structure of isolated adenine and thymine, and in both cases the planar structure was considered. The amino group of adenine is nonplanar, but the nonplanarity is rather small (the sum of the amino group valence angles amounts to 353°, whereas it amounts to 360° for the planar amino group11) and so is the energy difference between the fully optimized planar and
Bonding in Planar H-Bonded A · · · T Base Pairs
J. Phys. Chem. B, Vol. 113, No. 9, 2009 2935
TABLE 1: MP2 and DFT-SAPT Interaction Energies (in kcal/mol) for Various Structures of the AT Base Pairs
TABLE 2: DFT-SAPT/aug-cc-pVDZ Interaction Energies (in kcal/mol) for Various Structures of the AT Base Pairs
ATa
∆Eb
E(aVDZ)c
E(aVTZ)d
E(CBS)e
ATa
(1) EEl
(1) EEx
E(1)
(2) EInt
ED(2)
E(2)
δ(HF)
E
1 2 3 4 5 6 7 8 9
-18.08 -15.67 -16.00 -14.80 -13.34 -14.01 -15.10 -13.78 -14.00
-17.96 -15.41 -15.92 -14.74 -14.93 -14.15 -15.18 -13.83 -14.13
-19.37 -16.87 -17.38 -16.01 -16.39 -15.42 -16.54 -15.31 -15.61
-19.96 -17.48 -17.99 -16.54 -17.00 -15.95 -17.11 -15.93 -16.23
1 2 3 4 5 6 7 8 9
-32.11 -29.85 -30.64 -28.52 -28.77 -25.62 -28.99 -28.82 -29.39
38.31 37.20 38.04 35.11 35.39 31.18 35.87 37.25 37.88
6.20 7.35 7.40 6.59 6.62 5.56 6.88 8.43 8.49
-8.29 -7.25 -7.49 -6.40 -6.48 -6.35 -7.22 -6.73 -6.88
-9.44 -9.57 -9.71 -9.68 -9.76 -8.52 -9.11 -9.86 -9.94
-17.74 -16.83 -17.20 -16.08 -16.24 -14.87 -16.34 -16.59 -16.82
-6.42 -5.93 -6.12 -5.25 -5.31 -4.83 -5.72 -5.67 -5.79
-17.96 -15.41 -15.92 -14.74 -14.93 -14.15 -15.18 -13.83 -14.13
a See Figure 2. b BSSE-corrected interaction energies at the MP2/aug-cc-pVDZ. c DFT-SAPT at the aug-cc-pVDZ. d Estimated DFTSAPT at the aug-cc-pVTZ. e Estimated DFT-SAPT at the CBS.
nonplanar structures (0.13 kcal/mol11). The arrow originating in the center of mass indicates the direction and magnitude of the system dipole moment (MP2/cc-pVTZ), from which it is clear that the dipole moment of thymine is considerably larger. Base Pairs. Figure 2 shows the optimized structures of all nine of the H-bonded adenine · · · thymine base pairs, while Table 1 presents their interaction energies determined at various levels. All the MP2 stabilization energies are large; for the most stable structure (structure 1, cf. Figure 2) it is slightly more than 18 kcal/mol. Structure 1 is considerably more stable than the other structures, and the second and third most stable minima are separated by more than 2 kcal/mol. Structure 9, the WC structure, corresponds to the seventh local minimum, with the stabilization energy difference (with respect to the global minimum) being as large as about 4 kcal/mol. The least stable structure is structure 5. It is interesting to note that four naturally occurring structures of the pair (WC 9, rWC 8, H 5, and rH 4) do not belong among four most stable structures. Let us now analyze what is behind the different stabilizations of the single structures. Figure 2 shows that the global minimum, structure 1, possesses two practically antiparallel H-bonds of the N-H · · · N and N-H · · · O types and both proton donors are of the N-H type. Evidently, this arrangement of the dipole moments is energetically not very favorable. When investigating the structures of the first and second local minima, we realized that they possess H-bonds of the same type, but in these cases the orientation of the dipoles is more favorable. All three structures thus utilize imino N-H proton donor groups from the five-membered and six-membered rings and the N and CdO proton acceptor groups from the six-membered group. The third local minimum possesses two practically linear H-bonds of the N-H · · · N and N-H · · · O types, and in this case the amino and imino N-H proton donors as well as N and CdO proton acceptors belong to the six-membered group. The two following structures (on the basis of their stabilization energies), i.e., structures 4 and 6, have the same pattern of H-bonds, but unlike structures 1, 2, and 3 they utilize the N-proton acceptor group from the five-membered ring. The least stable structures are structures 9, 8, and 5. Unmistakably, all these structures possess nicely linear H-bonds, and this geometric arrangement cannot be used as an explanation of their different stabilities. As far as the orientation of the adenine and thymine dipole moments is concerned, the most favorable orientation exists in the least stable structure, structure 5, where both dipoles are practically linear. On the other hand, the orientation of the dipoles in the more stable structures, structures 9 and 8, is clearly unfavorable. Still favorable dipole orientation was found in structure 4 and partially also in structures 7 and 2. We can thus conclude that the mere geometric structure, linearity of H-bonds, and orienta-
a
See Figure 2.
tion of molecular dipoles cannot explain the relative stabilities of various AT structures. A more detailed view of the role of the different energy terms is provided by the DFT-SAPT perturbative calculations. First, the total DFT-SAPT interaction energies should be compared with those from the MP2. Table 1 shows that the SAPT/CBS stabilization energies are systematically larger than the MP2 ones and the relative order of single structures is similar. The order of the four most stable structures is the same from both procedures, with the only important change concerning structure 5, for which the SAPT stabilization energy is considerably larger (by 27%; in other cases, the SAPT energies are larger by about 13%). For our purposes, however, the single energy terms determined with the aug-cc-pVDZ basis set (cf. Table 2) are more important. Let us remember that SAPT results with the larger aug-cc-pVTZ basis set were only extrapolated; see Methods. The discussed dipole-dipole energy term forms a part of the first-order electrostatic energy component. Table 2 shows that this term is most favorable for structure 1, followed by structures 3 and 2, and that this term is also important for structures 9 and 7. Further, the electrostatic energy difference between structures 1, 3, and 2 is about 2 kcal/mol, which is in good agreement with this difference found at the MP2 level. By far the smallest electrostatic stabilization energy was found for structure 6. Evidently, the electrostatic energy plays an important role in stabilizing single structures. Following our expectations, both second-order terms, the induction and dispersion energies are important, with the latter term being about 36% larger. The total second-order energy is most favorable for structures 1, 3, and 2, while it is far less favorable for structure 6. Finally, when examining the σ(HF) term, we realized that it is largest for structures 1, 3, and 2 and smallest for structure 6. The σ(HF) term is large, which compares well to the second-order induction energy. It should be mentioned that charge-transfer energy is covered in this energy contribution. We can conclude that the exceptional stabilities of structures 1, 3, and 2 (as shown by the MP2 and SAPT calculations) are caused by favorable electrostatic, E(2), and σ(HF) energy terms. On the other hand, these terms cannot account for the fact that the least stable structure is structure 5 and not, as expected from the above discussion, structure 6. When investigating the exchangerepulsion term, we realized that this term was considerably more repulsive for structure 5 than for structure 6. A more detailed analysis of the charge-transfer energy can be obtained from the NBO characteristics and the charge-transfer energies presented in Tables 3 and 4. All nine of the H-bonded structures are characterized by charge flow from adenine to thymine, and this flow is largest for structures 4, 8, 5, and 9 and smallest for the global minimum, structure 1. The relatively small values of the charge transfer arise from the fact that adenine acts as an electron donor in one H-bond while acting as an electron acceptor in the other. The first and second columns in Table 3 show an increase of the electron density in
2936 J. Phys. Chem. B, Vol. 113, No. 9, 2009
Ran and Hobza
TABLE 3: NBO characteristics (in e-) for Various Structures of the AT Base Pairs ATa ∆σ*(T to A)b ∆σ*(A to T)c ∆∆σ*(A to T)d ∆CT(A to T)e 1 2 3 4 5 6 7 8 9
0.037 19 0.028 54 0.030 97 0.018 49 0.020 34 0.026 31 0.030 41 0.023 06 0.025 54
0.052 64 0.057 82 0.058 18 0.062 18 0.061 76 0.047 02 0.055 83 0.064 87 0.064 85
0.015 45 0.029 28 0.027 21 0.043 69 0.041 42 0.020 71 0.025 42 0.041 81 0.039 31
0.013 87 0.024 49 0.022 14 0.040 02 0.038 10 0.022 66 0.027 61 0.038 91 0.036 63
a See Figure 2. b Charge transfer from the oxygen lone pairs of thymine to the NH antibonding orbitals of adenine. c Charge transfer from the nitrogen lone pairs of adenine to the NH antibonding orbitals of thymine. d Charge transfer difference of the two hydrogen bonds in opposite directions. e Total charge transfer at the DFT-B3LYP/cc-pVTZ//MP2/cc-pVTZ.
TABLE 4: Comparison of E(2) Energies (in kcal/mol) from the NBO and the δ(HF) and the Induction from the DFT-SAPT ATa
δ(HF)b
(2) b EInd
E(2)(A to T)c
E(2)(T to A)d
E(2)(total)
1 2 3 4 5 6 7 8 9
-6.42 -5.93 -6.12 -5.25 -5.31 -4.83 -5.72 -5.67 -5.79
-8.29 -7.25 -7.49 -6.40 -6.48 -6.35 -7.22 -6.73 -6.88
21.75 24.14 24.2 27.83 27.56 20.57 23.36 27.44 27.36
20.31 15.83 16.65 8.61 9.12 12.68 15.05 11.35 12.02
42.06 39.97 40.85 36.44 36.68 33.25 38.41 38.79 39.38
a
See Figure 2. b DFT-SAPT at the aug-cc-pVDZ. c NBO analysis at the B3LYP/cc-pVTZ//MP2/cc-pVTZ NH-N hydrogen bond from adenine to thymine. d NBO analysis at the B3LYP/cc-pVTZ//MP2/ cc-pVTZ NH-O hydrogen bond from thymine to adenine.
the antibonding σ* orbital of the proton acceptor in both H-bonds. Evidently, this electron density systematically increases, which indicates a weakening of the bond and its elongation. The sum of the total changes (the last column in Table 3) provides a measure of the total charge-transfer energies, and this value is the smallest for structure 1, followed by structures 3 and 2. Not only does the charge-transfer E(2) energy from Table 4 include the electron density changes, but it also reflects the geometry of a pair. The total E(2) energy (the last column in Table 4) is the largest for structure 1, followed by structures 3 and 2, i.e., the most stable structures. On the other hand, the smallest E(2) energy was found for structure 6, which is among the weakest ones. The WC and rWC arrangements show larger E(2) energies than H and rH ones. The remaining (unnatural) structures are connected with larger as well as smaller E(2) energies. Investigating, however, only the charge transfer (cf. Table 3), we found that all four naturally occurring base pairs exhibit larger values than all remaining unnatural structures. Clearly, the E(2) charge transfer is the term determining to a large extent the stability of a pair, but the correlation of both values is not too close. Conclusion (i) Nine H-bonded planar structures with two H-bonds were localized at the PES of the AT base pair. Their stability ranged from 18 to 13 kcal/mol, and the three most stable structures utilized the N9-H proton donor group of the five-membered ring. These structures (as well as structures 7 and 6) cannot be realized in nucleic acids since the N9 and N1 positions are blocked by the
sugar group. The WC structure, which exists in DNA, represents the seventh local minimum. (ii) From among the various energy components, first-order electrostatic energy plays the most important role. The secondorder dispersion energy is slightly larger than the second-order induction energy, with the second-order energy (the sum of the induction and dispersion energies) amounting to about 56% of the electrostatic energy. The δ(HF) term, covering among others the charge-transfer energy, is rather large (only slightly smaller than the induction term). The importance of the δ(HF) term is reflected by the NBO characteristics and especially by the NBO E(2) charge-transfer energy. The sum of the second-order energy and the δ(HF) term is only slightly smaller than the electrostatic energy (75-77%), which reflects the importance of the nonelectrostatic terms even in the case of strong H-bonded complexes. (iii) The stability order cannot be explained merely by the dipole-dipole electrostatic interaction, but the order based on the SAPT first-order electrostatic energy agrees fairly well. Very similar stability order is based on the SAPT second-order induction energy, SAPT second-order energy, SAPT δ(HF) energy, and E(2) chargetransfer NBO energy. We can thus conclude that the stabilities of the various H-bonded structures of the AT base pairs cannot be explained on the basis of a single energy term but stem from the combination of various energy components. (iv) Four naturally occurring structures (WC, rWC, H, and rH) possess by far the largest charge transfer (from adenine to thymine) among all structures. Acknowledgment. This work has been part of the research project No. Z4 055 0506 and was supported by a grant from the Ministry of Education, Youth and Sports of the Czech Republic (Center for Biomolecules and Complex Molecular Systems, LC512 and MSM6198959216). The authors acknowledge support from the Praemium Academiae, Academy of Sciences of the Czech Republic, awarded to P.H. in 2007. References and Notes (1) Kabela´cˇ, M.; Hobza, P. J. Phys. Chem. B 2001, 105, 5804. (2) Abo-Riziq, A.; Grace, L.; Nir, E.; Kabela´cˇ, M.; Hobza, P.; de Vries, M. S. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 20. (3) Jurecˇka, P.; Sˇponer, J.; C`erny´, J.; Hobza, P. Phys. Chem. Chem. Phys. 2006, 8, 1985. (4) Sedla´k, R.; Jurecˇka, P.; Hobza, P. J. Chem. Phys. 2007, 127, 075104. (5) Reed, E.; Curtiss, L. A.; Weinhold, F. Chem. ReV. 1988, 88, 899. (6) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Montgomery, J. A., Jr.; Vreven, T.; Kudin, K. N.; Burant, J. C.; Millam, J. M.; Iyengar, S. S.; Tomasi, J.; Barone, V.; Mennucci, B.; Cossi, M.; Scalmani, G.; Rega, N.; Petersson, G. A.; Nakatsuji, H.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Klene, M.; Li, X.; Knox, J. E.; Hratchian, H. P.; Cross, J. B.; Bakken, V.; Adamo, C.; Jaramillo, J.; Gomperts, R.; Stratmann, R. E.; Yazyev, O.; Austin, A. J.; Cammi, R.; Pomelli, C.; Ochterski, J. W.; Ayala, P. Y.; Morokuma, K.; Voth, G. A.; Salvador, P.; Dannenberg, J. J.; Zakrzewski, V. G.; Dapprich, S.; Daniels, A. D.; Strain, M. C.; Farkas, O.; Malick, D. K.; Rabuck, A. D.; Raghavachari, K.; Foresman, J. B.; Ortiz, J. V.; Cui, Q.; Baboul, A. G.; Clifford, S.; Cioslowski, J.; Stefanov, B. B.; Liu, G.; Liashenko, A.; Piskorz, P.; Komaromi, I.; Martin, R. L.; Fox, D. J.; Keith, T.; Al-Laham, M. A.; Peng, C. Y.; Nanayakkara, A.; Challacombe, M.; Gill, P. M. W.; Johnson, B.; Chen, W.; Wong, M. W.; Gonzalez, C.; Pople, J. A. Gaussian 03; Gaussian, Inc.: Wallingford, CT, 2004. (7) Jeziorski, B.; Moszynski, R.; Szalewicz, K. Chem. ReV. 1994, 94, 1887. (8) Hesselman, A.; Jansesn, G.; Schutz, M. J. Chem. Phys. 2005, 122, 14103. (9) Hesselman, A.; Jansesn, G.; Schutz, M. J. J. Am. Chem. Soc. 2006, 128, 11730. (10) Halkier, A.; Helgaker, T.; Jorgensen, P.; Klopper, W.; Koch, H.; Olsen, J.; Wilson, A. K. Chem. Phys. Lett. 1998, 286, 243. (11) Sˇponer, J.; Hobza, P. Collect. Czech. Chem. Commun. 2003, 68, 2231.
JP810001V