Orbital Interactions in Hydrogen Bonds Important for Cohesion in Molecular Crystals and Mismatched Pairs of DNA Bases Ce´lia Fonseca Guerra, F. Matthias Bickelhaupt,* and Evert Jan Baerends
CRYSTAL GROWTH & DESIGN 2002 VOL. 2, NO. 3 239-245
Afdeling Theoretische Chemie, Scheikundig Laboratorium der Vrije Universiteit, De Boelelaan 1083, NL-1081 HV Amsterdam, The Netherlands Received December 3, 2001
ABSTRACT: Model systems (of up to 116 atoms) for molecular crystals and mismatched pairs of DNA bases have been studied using GGA density functional theory (DFT) at BP86/TZ2P. In line with previous studies, we find that our approach is adequate for accurately describing the present model systems all of which involve hydrogen bonding. For example, our DFT bond energies for 17 DNA base pairs involving adenine (A), thymine (T), guanine (G), and cytosine (C) agree excellently with ab initio MP2 results (root-mean-square deviation ) 1.1 kcal/mol). Our main purpose is to clarify the relative importance of electrostatic attraction versus orbital interaction in the hydrogen bonds involved in our model systems, using a quantitative bond energy decomposition scheme. At variance with widespread belief, the orbital interaction component in these hydrogen bonds is found to contribute about two-fifths (36-43%) of the attractive interactions and is thus of the same order of magnitude as the electrostatic component. Interestingly, we find a similarly prominent role for orbital interaction in the hydrogen bonds that are responsible for the cohesion within a layer of the molecular crystal of Watson-Crick pairs of 9-ethylguanine and 1-methylcytosine. 1. Introduction Hydrogen bonding plays an important role in the formation of DNA base pairs and has, therefore, been the subject of several theoretical studies.1,2 Previously, we have shown that the generalized gradient approximation (GGA) of density functional theory (DFT) is an efficient alternative to conventional ab initio theory for accurately describing the hydrogen bonds involved in Watson-Crick base pairs (AT and GC, see Scheme 1)1 and in the weakly bound water dimer.1b Our bond analyses in the framework of Kohn-Sham DFT3 revealed that the contribution of occupied-virtual orbital interactions to the Watson-Crick hydrogen bonds is of the same order of magnitude as electrostatic interactions.1a,c The orbital interaction component mostly originates from donor-acceptor orbital interactions of lone-pairs on nitrogen and oxygen atoms of one DNA base with empty N-H σ* orbitals of the other base.1a,c Very recently, we found that, at variance with widespread belief,4 such an orbital interaction component is prominent even in rather weakly bound base pairs, such as those of adenine (A) with 2,4-difluorotoluene (F) or other mimics of thymine (T). These findings raise an important question. Are these isolated examples, or is the occurrence of orbital interaction in such hydrogen bonds a more general phenomenon? In the present study, we tackle the above question by extending our analyses to a series of 16 mismatched pairs of DNA bases. The main purpose is the elucidation of the hydrogen-bonding mechanism, in particular, the relative importance of frontier orbital interaction and electrostatic interactions. This is done in the conceptual framework of the Kohn-Sham molecular orbital model using a quantitative bond energy decomposition scheme (vide infra).3 * To whom correspondence should be addressed. Fax: +31-20-44 47 629; e-mail: bickel@chem.vu.nl.
However, our study also serves as a further validation of the performance of DFT for this category of molecular systems because, to the best of our knowledge, it represents the first investigation in which mismatched DNA base pairs have been studied with this quantum chemical method. Thus, we compare our optimized structures and bonding energies with ab initio results obtained by Sponer et al.2f To get an idea how the picture is affected by environment effects in molecular crystals of the DNA base pairs, we have included into our investigation model systems for the crystal of the base pair between 9-ethylguanine and 1-methylcytosine.5 This allows also for assessing the nature of the intermolecular forces between GC pairs that are responsible for the cohesion within the layers of which this crystal consists. 2. Theoretical Methods 2A. General Procedure. All calculations were performed using the Amsterdam density functional (ADF) program6 developed by Baerends et al.6b-d and parallelized6b as well as linearized6f by Fonseca Guerra et al. The numerical integration was performed using the procedure developed by te Velde et al.6g,h The MOs were expanded in a large uncontracted set of Slater type orbitals (STOs) containing diffuse functions: TZ2P (no Gaussian functions are involved).6i The basis set is of triple-ζ quality for all atoms and has been augmented with two sets of polarization functions, i.e., 3d and 4f on C, N, O, and 2p and 3d on H. The 1s core shell of carbon, nitrogen, and oxygen were treated by the frozen-core approximation.6c An auxiliary set of s, p, d, f, and g STOs was used to fit the molecular density and to represent the Coulomb and exchange potentials accurately in each self-consistent field cycle.6j Equilibrium structures were optimized using analytical gradient techniques.6k Geometries and energies were calculated at the BP86 level of the generalized gradient approximation (GGA): exchange is described by Slater’s XR potential6l with corrections due to Becke6m,n added self-consistently and correlation is treated in the Vosko-Wilk-Nusair (VWN) parametrization6o with nonlocal corrections due to Perdew6p added, again, self-consistently (BP86).6q
10.1021/cg010034y CCC: $22.00 © 2002 American Chemical Society Published on Web 03/26/2002
240
Crystal Growth & Design, Vol. 2, No. 3, 2002
Fonseca Guerra et al. Scheme 1
2B. Bonding Energy Analysis. The overall bond energy ∆E is made up of two major components (eq 1). In this formula,
∆E ) ∆Eprep + ∆Eint
(1)
the preparation energy ∆Eprep is the amount of energy required to deform the separate bases from their equilibrium structure to the geometry that they acquire in the pair or the crystal model. The interaction energy ∆Eint corresponds to the actual energy change when the prepared bases are combined to form the base pair. It is analyzed in the hydrogen-bonded model systems in the framework of the Kohn-Sham MO model using a Morokuma-type decomposition7 of the bond into electrostatic interaction, exchange repulsion (or Pauli repulsion), and (attractive) orbital interactions (eq 2).3
∆Eint ) ∆Velst + ∆EPauli + ∆Eoi
(2)
The term ∆Velst corresponds to the classical electrostatic interaction between the unperturbed charge distributions of the prepared (i.e., deformed) bases and is usually attractive. The Pauli-repulsion ∆EPauli comprises the destabilizing interactions between occupied orbitals and is responsible for the steric repulsion. The orbital interaction ∆Eoi in any MO model, and therefore also in Kohn-Sham theory, accounts for charge transfer (i.e., donor-acceptor interactions between occupied orbitals on one moiety with unoccupied orbitals of the other, including the HOMO-LUMO interactions) and polarization (empty/occupied orbital mixing on one fragment due to the presence of another fragment). Since the Kohn-Sham MO method of DFT in principle yields exact energies and, in practice, with the available density functionals for exchange and correlation, rather accurate energies, we have the special situation that a seemingly one-particle model (an MO method) in principle completely accounts for the bonding energy. In particular, the orbital-interaction term of the Kohn-Sham theory comprises the often distinguished attractive contributions charge transfer, induction (polarization), and dispersion. One could in the Kohn-Sham MO method try to separate polarization and charge transfer, as has been done by Morokuma in the Hartree-Fock model, but this distinction is not sharp. In fact, contributions such as induction and charge transfer, and also dispersion, can be given an intuitive meaning, but whether, or with what precision, they can be quantified, remains a controversial subject. In view of the conceptual difficulties, we refrain from further decomposing the KS orbital interaction term, except by symmetry; see below. We have observed that the orbital interactions are mostly of the donor-acceptor type (N or O lone pair on one moiety with N-H σ* orbital of the other), and we feel it is therefore justified to denote the full orbital interaction term for brevity just as “charge transfer” or “covalent” contribution, as opposed to the electrostatic and Pauli repulsion contributions. However, the straightforward denotation “orbital interaction” avoids confusion with the charge-transfer energy, which features in other elaborate decomposition schemes9 that also give rise to induction and dispersion contributions, which we do not attempt to quantify but which are all lumped together in the Kohn-Sham orbital interaction.
The orbital interaction energy can be decomposed into the contributions from each irreducible representation Γ of the interacting system (eq 3) using the extended transition state (ETS) scheme developed by Ziegler and Rauk.8 In systems with a clear σ, π, or A′, A′′ separation (such as our DNA bases), this symmetry partitioning proves to be most informative.
∆Eoi )
∑ ∆E
Γ
(3)
Γ
3. Results and Discussion 3A. Watson-Crick and Mismatched DNA Base Pairs. Geometries and Hydrogen Bond Strengths. The results of our BP86/TZ2P study on the formation of mismatches from DNA bases are summarized in Table 1 (energies) and Figure 1 (geometries of WatsonCrick base pairs and the mismatched pairs in Cs symmetry). We follow the nomenclature of Jeffrey and Saenger (ref 4c, Chapter 16), except for (reverse) Watson-Crick AT and GC and (reverse) Hoogsteen AT pairs, which we indicate as AT(R)WC, GC(R)WC, and AT(R)H. For comparison, Table 1 also shows the acronyms used by Sponer et al.2f The choice for the BP86 density functional6m-p and the TZ2P basis set is based on our previous investigation1b of the performance of various GGA density functionals and basis sets for the AT and GC WatsonCrick base pairs in which it was shown that BP86/TZ2P agrees excellently with experiment. In the present study, we have optimized all systems in two ways: (1) in C1 symmetry without any geometry restriction (results not shown in Figure 1), and (2) in Cs symmetry for the purpose of analyzing the intermolecular interactions (geometries shown in Figure 1). The difference between geometries obtained in these two approaches is, in general, very small. All base pairs have equilibrium structures (obtained in C1 symmetry) in which the aromatic cores of the two bases are arranged more or less in one plane. An exception is the GA pair, which is slightly propeller twisted (i.e., ∠C6(G)O6(G)N4(A)C4(A) ) 23° and ∠C6(G)N1(G)N3(A)C4(A) ) 18°) in line with computations of Sponer et al.2k Differences between full (C1) and restricted (Cs) geometry optimization are particularly small in regard to hydrogen bond distances, most of which do not differ at all within the numerical precision of 0.01 Å that can be obtained for the extremely shallow potential energy surfaces under consideration. But, again, for the GA pair, we find slight differences: the O6(G)-N6(A) distance optimizes to 2.77 and 2.78 Å and the N1-N1 distance to 2.94 and 2.89 Å in Cs and C1 symmetry, respectively. In the Cs symmetry
Cohesion in Molecular Crystals and Mismatched DNA Bases
Crystal Growth & Design, Vol. 2, No. 3, 2002 241
Figure 1. Distances (in Å) between proton-donor and proton-acceptor atoms of hydrogen bonds in mismatched pairs of DNA bases computed at BP86/TZ2P (black, blue, red, and small white spheres represent C, N, O, and H, respectively).
optimizations, amino groups are forced into a planar geometry. Amino groups involved in hydrogen bonds remain essentially planar also in case of full optimization in C1 symmetry (except for that in GCRWC). However, if they are not involved in hydrogen bonding (e.g., in GA, GCRWC, GG1, GG3, GT2, and GT3), amino groups adopt in C1 symmetry pyramidal equilibrium structures. In general, the geometrical deformation of bases in a base pair is more pronounced if the basepairing interaction is stronger. This is also reflected by a more destabilizing preparation energy ∆Eprep. Note, for example, that GCWC and GG3 are the most firmly bound base pairs and, at the same time, they also have the largest preparation energy (Table 1). Hydrogen bond strengths have been computed for the fully C1 optimized and the Cs symmetric base pairs. Table 1 shows the bond energies ∆E(C1) and ∆E(Cs), which are both relative to fully C1 optimized bases. The bond energies range from -9.9 kcal/mol for the CT6
mismatch to -26.1 kcal/mol for the GC Watson-Crick pair (GCWC). The latter is only slightly more stable than the GG3 mismatch, which is bound by -24.0 kcal/ mol. Note that ∆E(C1) and ∆E(Cs) are essentially equal, the difference being 0.1 kcal/mol or less. Only for GA and GG1, pyramidalization of amino groups makes ∆E(C1) slightly more stabilizing than ∆E(Cs) by 0.6 and 0.3 kcal/mol, respectively. Thus, we can use Cs symmetry for analyzing the hydrogen bonds in the mismatched DNA pairs (only for GA and GG1, this implies a deviation from the true bond energy that is larger than 0.1 kcal/mol). This offers the possibility to separate orbital interactions into the σ and the π (i.e., A′ and A′′) electron system (vide infra). Comparison with ab Initio Studies. In our previous work1b on AT and GC Watson-Crick base pairs, we found that hydrogen bond lengths optimized at the Hartree-Fock (HF) level2f are up to 0.2 Å longer than those optimized at BP86/TZ2P. Here, we arrive at a
242
Crystal Growth & Design, Vol. 2, No. 3, 2002
Fonseca Guerra et al.
Table 1. Analysis of the Interaction Energy (kcal/mol) between DNA Bases in Watson-Crick and Mismatched Pairsa base pair
∆EPauli
∆Velst
∆EPauli + ∆Velst
∆Eσ
∆Eπ
∆Eoi
∆Eint
∆Eprep
∆E (Cs)b
∆E (C1)c
AA4 AC4 ATH ATRH ATRWC ATWC CC3 CT5 CT6 GA GCRWC GCWC GG1 GG3 GT2 GT3 TT4 TT5
34.1 35.4 34.5 33.5 37.3 38.7 37.7 29.4 26.5 35.8 16.4 51.9 27.5 50.5 35.4 39.5 31.4 28.7
-29.0 -31.3 -30.2 -29.5 -30.1 -31.9 -35.6 -24.0 -22.1 -30.6 -18.3 -48.5 -28.9 -47.0 -30.2 -32.9 -25.2 -23.7
5.1 4.1 4.3 4.0 6.4 6.9 2.1 5.4 4.4 5.2 -1.9 3.4 -1.5 3.6 5.2 6.6 6.2 5.0
-17.2 -18.5 -18.1 -17.5 -19.5 -20.4 -20.4 -16.1 -14.3 -19.8 -9.1 -29.2 -14.5 -28.8 -19.6 -22.1 -17.1 -15.3
-1.5 -2.1 -1.6 -1.5 -1.5 -1.7 -3.0 -2.0 -1.8 -2.5 -1.2 -4.8 -2.2 -4.9 -2.7 -3.0 -1.8 -1.6
-18.7 -20.6 -19.7 -19.0 -21.0 -22.1 -23.4 -18.1 -16.1 -22.3 -10.3 -34.0 -16.7 -33.7 -22.3 -25.1 -18.9 -16.9
-13.6 -16.5 -15.4 -15.0 -14.6 -15.2 -21.3 -12.7 -11.7 -17.1 -12.2 -30.6 -18.2 -30.1 -17.1 -18.5 -12.7 -11.9
1.6 1.9 1.9 1.9 2.0 2.1 2.5 1.9 1.8 2.8 1.1 4.5 2.3 6.0 3.2 3.5 1.9 1.6
-12.0 -14.6 -13.5 -13.1 -12.6 -13.1 -18.8 -10.8 -9.9 -14.3 -11.1 -26.1 -15.9 -24.1 -13.9 -15.0 -10.8 -10.3
-12.0 -14.6 -13.5 -13.1 -12.6 -13.1 -18.8 -10.8 -9.9 -14.9 -11.2 -26.1 -16.2 -24.0 -14.0 -15.1 -10.8 -10.3
∆E (Sponer et al.)d
nomenclature (Sponer et al.)e
-11.0 -13.5 -12.7 -12.6 -11.7 -11.8 -17.5 -10.7 -10.7 -14.1
AA1 AC1
-23.8 -17.0 -22.2 -13.5 -13.9 -10.0 -10.0
CC TC2 TC1 GA1 GG3 GG1 GT1 TT2 TT1
a BP86/TZ2P. b Base pair optimized in C and bases in C symmetry. c Full optimization in C symmetry of base pair and bases. s 1 1 6-31G*(0.25)//HF/6-31G** (ref 2f). e Nomenclature used in ref 2f.
similar result for the mismatches: hydrogen bonds are 0.12 to 0.25 Å longer at the HF/6-31G** level2f than at our BP86/TZ2P level (see Figure 1). For CC3 (called CC in ref 2f), also a geometry optimization at the MP2/631G** level has been published.2f The MP2 geometry agrees better with our BP86 results than the HF geometry. For example, the N3-N4 distance in CC3 is 3.05 Å at HF, 2.92 Å at MP2, and 2.87 Å at our BP86 level. Bond energies calculated at MP2/6-31G*(0.25)//HF/ 6-31G** (0.25 refers to the exponent of the d-polarization functions on second-row elements) do not differ much from our results at BP86/TZ2P//BP86/TZ2P, despite the longer Hartree-Fock bonds. For 17 base pairs compared, the root-mean-square deviation between ab initio and DFT values is 1.1 kcal/mol, and the maximum deviation is 2.3 kcal/mol. For CC3, there is, in addition, also an MP2/6-31G**//MP2/6-31G** bond energy (i.e., bond energy and geometry computed at the same level),2f which agrees even better with our BP86/TZ2P value than the MP2/6-31G*(0.25)//HF/6-31G** bond energy. The bond energy in CC3 is -17.5 kcal/mol at MP2//HF (Table 1), -18.4 kcal/mol at MP2//MP2 (not shown in table) and -18.8 kcal/mol at BP86 (Table 1). Agreement with our BP86 values improves even further if the basis set used in the MP2 computations becomes more flexible. For example, the interaction energy ∆Eint in CC3 is -21.4 kcal/mol at the MP2/cc-pVTZ//MP2/621G** level2i,j (not shown in table) and -21.3 at our BP86/TZ2P level (Table 1). Quantitative Decomposition of the Interaction Energy. The quantitative decomposition of the interaction energy (carried out in Cs symmetry, vide supra) reveals that orbital interactions ∆Eoi constitute a sizable component of 36-43% of the bonding forces ∆Velst + ∆Eoi in the hydrogen bonds (see Table 1 and Figure 2). Thus, the orbital interaction term ∆Eoi is even of the same order of magnitude as the electrostatic interaction ∆Velst. This finding, which differs from the common conception of such hydrogen bonds (i.e., in the range of -10 to -26 kcal/mol) being mainly electrostatic phenomena,4 becomes particularly clear in the bar diagram of Figure 2. Furthermore, as follows from Table 1, without the bonding orbital interactions ∆Eoi, all base pairs would
d
MP2/
Figure 2. Relative contribution (in %) of orbital interaction ∆Eoi (black bars) and electrostatic interaction ∆Velst (grey bars) to the bonding interactions ∆Velst + ∆Eoi in Watson-Crick and mismatched DNA base pairs.
be unbound at their equilibrium geometry, except for two base pairs: the reverse GC Watson-Crick pair (GCRWC) and the GG1 mismatch, for which ∆EPauli + ∆Velst is slightly attractive (-1.9 and -1.5 kcal/mol, respectively). This is in perfect agreement with earlier studies on Watson-Crick base pairs and weakly bound mimics of AT (such as AF) in which CdO and N-H bonds have been replaced by C-F and C-H, respectively.1c,10 In the latter,10 orbital interactions are somewhat less important than in the presently studied mismatches of natural DNA bases, but still they are never less than one-third of the bonding interactions. Further decomposition of ∆Eoi into contributions from the σ and π electron systems shows that it is strongly dominated by ∆Eσ, which is about 1 order of magnitude larger than ∆Eπ (Table 1). We have shown previously that the σ orbital interactions in the hydrogen bonds of Watson-Crick pairs and artificial mimics thereof are
Cohesion in Molecular Crystals and Mismatched DNA Bases
Crystal Growth & Design, Vol. 2, No. 3, 2002 243
Figure 3. Distances (in Å) between proton-donor and proton-acceptor atoms of hydrogen bonds in the guanine-cytosine (GC) base pair and in two models for a layer of the corresponding crystal (i.e., [GC]4-A and [GC]4-B) from BP86/TZ2P computations (this work), and experimental distances from X-ray diffraction studies on a crystal of 9-ethylguanine-1-methylcytosine (from ref 5).
provided by donor-acceptor-type frontier orbital interactions between occupied, lone-pair like orbitals (e.g., on N, O, or F) with unoccupied σ* orbitals (e.g., of N-H or C-H bonds).1c,10 3B. Orbital Interactions in Hydrogen Bonds of Molecular Crystals. Proceeding from the above findings, one may wonder if orbital interactions are also important for the intermolecular forces that hold together molecular crystals of DNA bases. Thus, we have investigated the crystal of Watson-Crick pairs of 9-ethylguanine (G′) and 1-methylcytosine (C′). This system has been structurally characterized by O’Brien5 using X-ray diffraction techniques. It consists of layers in which infinite [G′C′]∞ strips are arranged next to each other; a fragment of such a strip is shown in Figure 3, right. In the crystal, the hydrogen bond distances O6N4, N1-N3, and N2-O2 are 2.94, 2.92, and 2.81 Å.5 Note how this differs from the corresponding values 2.73, 2.88, and 2.87 Å of the isolated GC Watson-Crick pair in the gas phase (Figure 3, left: GCWC). Not only do the values differ as such but also the qualitative bond length pattern is reversed, i.e., long-long-short in the crystal and short-long-long in the gas phase. Earlier,1a,b we encountered a similar situation when we compared our gas-phase GCWC model with the X-ray crystal structure11 of sodium guanylyl-3′,5′-cytidine nonahydrate. Excellent agreement between the theoretical (BP86/TZ2P) and the experimental (X-ray) structure was obtained only with model systems that account for the molecular environment of GC in the crystal, i.e., all close contacts with crystal water, counterions, and hydroxy groups of neighboring sugar units (modeled also by a water molecule).1a,b Accordingly, the molecular environment in the present case is modeled by extending our model system from GCWC to a segment of four GC pairs. This can be done in two manners, leading to [GC]4-A and [GC]4-B (each consisting of 116 atoms; Figure 3, center) in which the two central base pairs
experience all the N-H‚‚‚N close contacts that occur between the base pairs within an infinite [G′C′]∞ strip in the crystal. Note however that our [GC]4-A and [GC]4B models neither include the interactions between [G′C′]∞ strips within a layer nor the stacking interactions between [G′C′]∞ strips in two different layers. Nevertheless, going from GC (2.73, 2.88, and 2.87 Å) to [GC]4-A (2.86, 2.89, and 2.85 Å) or [GC]4-B (2.85, 2.89, and 2.84 Å) leads to a significant improvement with experiment (2.94, 2.92, and 2.81 Å; see Figure 3): the hydrogen bonds O6-N4 and N1-N3 expand (the former more than the latter) and N2-O2 contracts, making a step halfway from the bond length pattern short-longlong in gas-phase GCWC to long-long-short in the crystal. Even better agreement with experiment is obtained for the hydrogen bonds that connect the two inner GC pairs in [GC]4-A and [GC]4-B. For [GC]4-A, the computed distance between guanine-N7 of one GC pair and cytosine-N4 of the other GC pair is 3.07 Å, while the experimental value is 3.11 Å. Likewise, in [GC]4-B, the computed distance between guanine-N3 of one GC pair and guanine-N2 of the other GC pair is 2.96 Å, while the experimental value is 3.01 Å. These results suggest that the remaining discrepancy is caused by an incomplete model system (lacking, e.g., stacking interactions, vide supra) and makes us confident that our GGA density functional approach is adequate for accurately describing the interactions in the G′C′ crystal. Next, the intermolecular forces in our model crystal [GC]4-A are examined. We divide the model system [GC]4-A and portions thereof consisting of one or two GC pairs (geometry frozen to situation of [GC]4-A) into fragments in two different ways, such that we can analyze the Watson-Crick interactions (I-III) and the intermolecular [GC]-[GC] interactions between the two inner GC pairs of the crystal model (IV and V, see Scheme 2). The results of the quantitative decomposi-
244
Crystal Growth & Design, Vol. 2, No. 3, 2002
Fonseca Guerra et al. Scheme 2
Scheme 3
Table 2. Analysis of Intermolecular Interactions in a Layer of the GC Crystala,b I EPauli Velst EPauli + Velst Eσ Eπ Eoi Eint
II
III
IV
V
VI
VII
VIII
44.1 88.2 88.9 22.3 22.3 27.1 27.9 27.8 -44.7 -89.5 -88.6 -25.5 -25.3 -22.3 -24.1 -24.4 -0.6 -1.3 0.3 -3.2 -3.0 4.8 3.8 3.4 -25.3 -50.6 -50.2 -12.4 -12.4 -14.3 -13.3 -13.1 -4.2 -8.4 -8.5 -2.7 -2.7 -1.3 -0.9 -0.8 -29.5 -59.0 -58.7 -15.1 -15.2 -15.6 -14.2 -13.9 -30.1 -60.3 -58.4 -18.3 -18.2 -10.8 -10.4 -10.5
a G-C interactions are analyzed for model systems [GC]-A, [GC]2-A, and [GC]4-A (I, II, III). [GC]-[GC] interactions are analyzed for [GC]2-A and [GC]4-A (IV, V), and [GC]2-B, [GC]2-B, and [GC]4-B (VI, VII, VIII) (see footnote b). b BP86/TZ2P. See Schemes 2 and 3 for model systems: [GC]4-A and [GC]4-B are an optimized structure; [G]2-B, [GC]-A,B, and [GC]2-A,B are frozen fragments taken from the optimized [GC]4-A or [GC]4-B structure.
tion of the corresponding interaction energies ∆Eint are collected in Table 2. It follows from I that orbital interactions are still important in the Watson-Crick interaction of a GC portion in the model crystal where they contribute -29.5 kcal/mol or 40% of the attractive forces ∆Velst + ∆Eoi (Table 2). The slight geometrical deformation that a GC pair experiences in the crystal (compare GCWC and [GC]4-A in Figure 3) does apparently not change the nature of the hydrogen bonds involved (compare GCWC and I in Tables 1 and 2, respectively; for GCWC, see also refs 1a,c). Also, there is no significant influence of one neighboring GC pair that interacts via two hydrogen bonds (N4-H4′‚‚‚N7 and N7‚‚‚H4′-N4) with the first GC pair, as follows from II (Table 2): going from I to II leads just to a doubling both of the overall WatsonCrick interaction ∆Eint and of each individual term including the orbital interactions ∆Eoi. Furthermore, the two outer GC pairs in [GC]4-A have essentially no influence on strength and nature of the Watson-Crick interaction along the two inner GC pairs, as follows from comparing II and III in Table 2. Orbital interactions still
provide 40% of the attractive interactions; the other 60% is provided by electrostatic attraction. So far, we have considered “breaking” the [GC]4-A model strip along the Watson-Crick contacts of two neighboring GC pairs. This requires surmounting an interaction energy of about 59 kcal/mol (see III in Table 2). However, the weakest link in a [G′C′]∞ strip is not the Watson-Crick bond within a base pair but one of the two types of intermolecular interactions between two neighboring GC pairs. There are two types of [GC][GC] contacts, which occur alternately along the strip of GC pairs: (i) the two hydrogen bonds N4-H4′‚‚‚N7 and N7‚‚‚H4′-N4 involving the guanine and cytosine bases of both base pairs; (ii) the two hydrogen bonds N2-H2′‚‚‚N3 and N3‚‚‚H2′-N2 between two neigboring guanine bases. The [GC]-[GC] interaction ∆Eint provided by N4-H4′‚‚‚N7 and N7‚‚‚H4′-N4 amounts to -18 kcal/mol (see IV and V in Table 2). As follows from comparison of IV and V, this [GC]-[GC] interaction is not much affected by extending the model system from two (IV) to four (V) base pairs: the overall interaction energy as well as each of its components remains equal within one tenth of a kcal/mol. Interestingly, the relative contribution of the orbital interaction term ∆Eoi (-15.2 kcal/mol) is 38%. The weakest link in a [G′C′]∞ strip is the [GC]-[GC] interaction ∆Eint provided by N2H2′‚‚‚N3 and N3‚‚‚H2′-N2, which amounts to only -10.5 kcal/mol (see VI, VII, and VIII in Table 2). The relative contribution of the orbital interaction term is 36%. Again, extending the model system from two (VII) to four (VIII) base pairs causes only marginal changes in this [GC]-[GC] interaction energy and its components. Only when the model system is reduced from two GC base pairs (VII) to the two interacting guanine bases involved (VI), the interaction energy and, especially, its components change somewhat more, namely, by up to 2 kcal/mol (Table 2). We conclude that the cohesion within a [G′C′]∞ strip in the crystal is not a predominantly electrostatic phenomenon. Instead, it is provided for 36-40% by orbital interactions, which are, therefore, of the same order of magnitude as the electrostatic term. 4. Conclusions The cohesion in molecular crystals and in mismatched pairs of DNA bases is, in contrast to widespread belief, provided to an important extent by frontier orbital interactions (that is, lone-pair orbitals interacting with unoccupied N-H σ* orbitals) associated with the hydrogen bonds between these molecules. The orbital interaction term accounts for about 40% of all attractive forces; the other 60% stem from classical electrostatic
Cohesion in Molecular Crystals and Mismatched DNA Bases
attraction. This follows from our BP86/TZ2P study of 18 DNA base pairs, and model systems consisting of 4 GC pairs that simulate a layer in the crystal of WatsonCrick pairs of 9-ethylguanine and 1-methylcytosine. Our BP86/TZ2P DFT bond energies for the mismatched DNA base pairs are in close agreement with ab initio MP2/6-31G*(0.25)//HF/6-31G** results (rootmean-square deviation ) 1.1 kcal/mol). Also, for the hydrogen bond distances in a mismatched pair between two cytosines (CC3), we achieve excellent agreement between our BP86/TZ2P values and results2f from MP2/ 6-31G**. On the other hand, hydrogen bond lengths optimized at HF/6-31G** are up to 0.25 Å longer than our BP86/TZ2P values. This is further evidence that DFT in the generalized gradient approximation (GGA) is an adequate and more efficient alternative for traditional ab initio methods for theoretical investigation of (large) molecular systems involving hydrogen bonds.
Crystal Growth & Design, Vol. 2, No. 3, 2002 245
(3) (4)
(5) (6)
Acknowledgment. We thank the National Research School Combination-Catalysis (NRSC-C) for a postdoctoral fellowship for C.F.G. and the National Computer Facilities (NCF) foundation of The Netherlands Organization for Scientific Research (NWO) for financial support. References (1) (a) Fonseca Guerra, C.; Bickelhaupt, F. M. Angew. Chem. 1999, 111, 3120; Fonseca Guerra, C.; Bickelhaupt, F. M. Angew. Chem., Int. Ed. Engl. 1999, 38, 2942. (b) Fonseca Guerra, C.; Bickelhaupt, F. M.; Snijders, J. G.; Baerends, E. J. J. Am. Chem. Soc. 2000, 122, 4117. (c) Fonseca Guerra, C.; Bickelhaupt, F. M.; Snijders, J. G.; Baerends, E. J. Chem. Eur. J. 1999, 5, 3581. (2) (a) Hobza, P.; Sponer, J. Chem. Rev. 1999, 99, 3247. (b) Bertran, J.; Oliva, A.; Rodrı´guez-Santiago, L.; Sodupe, M. J. Am. Chem. Soc. 1998, 120, 8159. (c) Sponer, J.; Hobza, P. In Encyclopedia of Computational Chemistry; von Rague´ Schleyer, P.; Clark, T.; Gasteiger, J.; Kollman, P. A.; Schaefer III, H. F.; Schreiner, P. R., Eds.; John Wiley and Sons: Chichester, New York, Weinheim, Brisbane, Singapore, Toronto, 1998; Vol 1; pp 777-789. (d) Brameld, K.; Dasgupta, S.; Goddard, W. A., III J. Phys. Chem. B 1997, 101, 4851. (e) Sponer, J.; Hobza, P.; Leszczynski, J. In Comput. Chem. Rev. Curr. Trends; Leszczynski, J., Ed.; World Scientific Publisher: Singapore, 1996, pp 185-218. (f) Sponer, J.; Leszczynski, J.; Hobza, P. J. Phys. Chem. 1996, 100, 1965. (g) Gould, I. R.; Kollman, P. A. J. Am. Chem. Soc. 1994, 116, 2493. (h) Santamaria, R.; Va´zquez, A. J. Comput. Chem. 1994, 15, 981. (i) Sponer, J.; Hobza, P. J. Phys. Chem. A 2000, 104, 4592. (j) Hobza, P.; Sponer, J.; Cubero, E.; Orozco, M.; Luque, F. J. J. Phys. Chem. B,
(7) (8) (9) (10) (11)
2000, 104, 6286. (k) Sponer, J.; Florian, J.; Hobza, P.; Leszczynski, J. J. Biomol. Struct. Dyn. 1996, 13, 827. Bickelhaupt, F. M.; Baerends, E. J. In Rev. Comput. Chem.; Lipkowitz, K. B.; Boyd, D. B., Eds.; Wiley-VCH: New York, 2000; Vol. 15, pp 1-86. See, for example: (a) Stryer, L. Biochemistry, W. H. Freeman and Company: New York, 1988, Chapter 1. (b) Voet, D.; Voet, J. G. Biochemistry, Wiley: New York, 1995; Chapter 2. (c) Jeffrey, G. A.; Saenger, W. Hydrogen Bonding in Biological Structures, Springer-Verlag: Berlin, 1991, Chapter 2. (d) Jeffrey, G. A. An Introduction to Hydrogen Bonding, Oxford University Press: New York, 1997; Chapter 2. (e) Desiraju, G. R.; Steiner, T. The Weak Hydrogen Bond, Oxford University Press: New York, 1999; Chapter 1. O’Brien, E. J. Acta Crystallogr. 1967, 23, 92. (a) te Velde, G.; Bickelhaupt, F. M.; van Gisbergen, S. J. A.; Guerra, C Fonseca; Baerends, E. J.; Snijders, J. G.; Ziegler, T. J. Comput. Chem. 2001, 22, 931. (b) Fonseca Guerra, C.; Visser, O.; Snijders, J. G.; te Velde, G.; Baerends, E. J. In Methods and Techniques for Computational Chemistry; E. Clementi, G. Corongiu, Eds.; STEF: Cagliari, 1995; pp 305-395. (c) Baerends, E. J.; Ellis, D. E.; Ros, P. Chem. Phys. 1973, 2, 41. (d) Baerends, E. J.; Ros, P. Chem. Phys. 1975, 8, 412. (e) Baerends, E. J.; Ros, P. Int. J. Quantum. Chem. Symp. 1978, 12, 169. (f) Fonseca Guerra, C.; Snijders, J. G.; te Velde, G.; Baerends, E. J. Theor. Chem. Acc. 1998, 99, 391. (g) Boerrigter, P. M.; te Velde, G.; Baerends, E. J. Int. J. Quantum Chem. 1988, 33, 87. (h) te Velde, G.; Baerends, E. J. Comput. Phys. 1992, 99, 84. (i) Snijders, J. G.; Baerends, E. J.; Vernooijs, P. At. Nucl. Data Tables 1982, 26, 483. (j) Krijn, J.; Baerends, E. J. Fit-Functions in the HFS-Method; Internal Report (in Dutch), Vrije Universiteit, Amsterdam, 1984. (k) Versluis, L.; Ziegler, T. J. Chem. Phys. 1988, 88, 322. (l) Slater, J. C. Quantum Theory of Molecules and Solids, Vol. 4; McGraw-Hill: New York, 1974. (m) Becke, A. D. J. Chem. Phys. 1986, 84, 4524. (n) Becke, A. Phys. Rev. A 1988, 38, 3098. (o) Vosko, S. H.; Wilk, L.; Nusair, M. Can. J. Phys. 1980, 58, 1200. (p) Perdew, J. P. Phys. Rev. B 1986, 33, 8822 (Erratum: Phys. Rev. B 1986, 34, 7406). (q) Fan, L.; Ziegler, T. J. Chem. Phys. 1991, 94, 6057. (a) Morokuma, K. J. Chem. Phys. 1971, 55, 1236. (b) Kitaura, K.; Morokuma, K. Int. J. Quantum. Chem. 1976, 10, 325. (a) Ziegler, T.; Rauk, A. Inorg. Chem. 1979, 18, 1755. (b) Ziegler, T.; Rauk, A. Inorg. Chem. 1979, 18, 1558. (c) Ziegler, T.; Rauk, A. Theor. Chim. Acta 1977, 46, 1. (a) Stone, A. J. The Theory of Intermolecular Forces; Clarendon Press: Oxford, 1996. (b) Stone, A. J. Chem. Phys. Lett. 1993, 211, 101. Fonseca Guerra, C.; Bickelhaupt, F. M. Angew. Chem., accepted for publication. (a) Rosenberg, J. M.; Seeman, N. C.; Day, R. O.; Rich, A. J. Mol. Biol. 1976, 104, 145. See also (b) Saenger, W. Principles of Nucleic Acid Structure; Springer-Verlag: New York, 1984.
CG010034Y