J. Phys. Chem. B 1999, 103, 10955-10964
10955
Contribution of the Phosphodiester Backbone and Glycosyl Linkage Intrinsic Torsional Energetics to DNA Structure and Dynamics Nicolas Foloppe† and Alexander D. MacKerell, Jr.* Department of Pharmaceutical Sciences, School of Pharmacy, UniVersity of Maryland, Baltimore, Maryland 21201 ReceiVed: August 2, 1999
The intrinsic conformational energetics of macromolecules are an important contribution to their structure and dynamics. This information, however, is lacking for most dihedrals about rotatable bonds in nucleic acids. To bridge this gap, high-level ab initio quantum mechanical calculations have been performed on a number of furanose-based model compounds designed to model the intrinsic torsional energetics of the dihedrals , γ, β, and χ in deoxyribonucleic acid (DNA). Energy profiles have been obtained with furanose in both the C3′ endo and C2′ endo conformations, to model the torsions in the context of the two conformational ranges populated by the sugar in DNA. The resulting energy profiles are compared to the corresponding crystallographically determined population distributions, with which they can be reconciled to a large extent. This suggests that the models used to derive these energy surfaces are relevant to DNA. Torsion is found to be intrinsically very flexible with the low-energy regions encompassing the BI and BII substates. The BI internal energy is found to be on the order of 1.5 kcal/mol more stable than the BII conformation. The torsional energy barrier between BI and BII is relatively low, on the order of 2.1 kcal/mol. Estimates of other torsional energy barriers suggest that the paths of lowest energy between γ ) g+ and γ ) g- and between the χ anti and syn orientations are through γ ) 0° and χ ) 120°, respectively. The χ energy surfaces offer insights into the contribution of base-furanose interactions to the corresponding distributions in DNA and its components, in terms of purines versus pyrimidines, and point to the unique properties of cytosine. These surfaces also indicate that a syn base interacts more favorably with a north furanose than with a south furanose, in contrast with a widely accepted view.
1. Introduction It is now well-documented that DNA structural variability and flexibility is related to its biological functions. DNA’s inherent flexibility has been uncovered by the realization that double-stranded DNA can exist in different forms depending on its base sequence and the environmental conditions.1-3 The best characterized forms of DNA double helices are the A, B, and Z forms, with the B form being prevalent under physiological conditions. The critical importance of DNA flexibility for its biological functions is apparent from structures of DNAprotein complexes in which DNA can adopt a wide variety of conformations. This is illustrated by the wrapping of DNA around the histone core,4 by DNA bending and/or unwinding upon binding of some transcription factors,5-9 and by flipping of nucleotides out of the DNA double helix in some DNAenzyme complexes.10-12 The action of some DNA-targeting drugs may also be related to their ability to distort DNA, as suggested by the structure of DNA bound to the anticancer drug cis-platin.13 DNA triple helices14 and the growing number of unusual DNA conformations15 are further illustrations of DNA flexibility. It is clear that a better understanding of this wide array of structural diversity requires some knowledge of the energetics of the degrees of freedom involved. The internal degrees of freedom which can contribute to DNA flexibility include the sugar pseudorotation angle and ampli* To whom correspondence should be addressed. † Present address: Center for Structural Biochemistry, Department of Bioscience, Karolinska Institutet, S-141 57, Huddinge, Sweden.
tude,16 the torsions along the sugar-phosphate backbone (R, β, γ, , ζ) and about the glycosyl linkage (χ). Structural analysis of a variety of DNAs has long revealed direct relationships between the sugar conformation and the overall structure of the DNA double helices,17,18 arousing considerable interest in the intrinsic conformational energetics of the sugar moiety.19-26 However, an analysis of the intrinsic conformational energetics of the sugar-phosphate and glycosyl torsions is also essential to better understand and model the conformational properties of DNA and its components. This is evidenced by the welldocumented correlation between these torsions and the overall structure of DNA.17,18,27 The variability in the DNA torsions associated with the phosphodiester backbone is further illustrated by these torsions populating several substates, in both the B27-33 and Z forms.27,34 The two major substates of the B DNA backbone described so far are referred to as the BI ( in trans; ζ in g-) and BII substates ( in g-; ζ in trans), with the BI conformation more frequently observed than the BII in crystal structures as well as in solution. It has been proposed that this structural variability may play a role in its sequence-specific recognition by proteins.32,35 The BI /BII ratio in solution is, however, controversial.35,36 The intrinsic energetics of the torsions in the sugar-phosphate backbone and about the glycosyl linkage, however, has remained largely unexplored. This contrasts with the situation in protein science where the conformational energetics of protein building blocks has systematically been investigated using ab initio quantum mechanical techniques.37 Although experimental ap-
10.1021/jp992716q CCC: $18.00 © 1999 American Chemical Society Published on Web 11/16/1999
10956 J. Phys. Chem. B, Vol. 103, No. 49, 1999 proaches have yielded a wealth of information concerning the various conformational ranges accessible to the dihedral angles in DNA, the associated energetics remains unclear. Condensedphase structural information from experimental approaches includes possible contributions from solvent effects, crystalpacking interactions, or other internal degrees of freedom in the molecules investigated. It is thus difficult to derive the intrinsic energetics of the degrees of freedom of interest solely from statistical analysis of the condensed-phase structures. Another limitation of the experimental distributions is that they provide little, if any, information concerning the magnitude of the energy barriers which are expected to separate the populated conformational ranges. Theoretical calculations can complement experimental methods and provide further insights regarding the intrinsic energetics of the torsional degrees of freedom in DNA, independently of condensed-phase effects, and on the full conformational space of a given torsion. Hard-sphere models38-41 and early empirical potentials42-47 provided some valuable insights into the gross conformational features of DNA and its constituents; however, such coarse physical models cannot yield accurate torsional energy profiles. Other work used semiempirical quantum mechanical calculations to map the conformational energy about the torsions γ48 and χ.49-51 However, the accuracy of semiempirical calculations is limited by the approximations inherent in the methodology, and these calculations were carried out with fixed bond lengths and valence angles, adding to the uncertainty of the results. Furthermore, most of these earlier calculations were performed using full nucleosides or nucleotides, suggesting that the resulting energy profiles may also reflect steric interactions between distant chemical groups in addition to the intrinsic torsional energy about the torsion of interest. In the present work the torsional energy surfaces associated with the dihedral angles , γ, β, and χ (R and ζ have been discussed elsewhere42,52-55) are examined using high-level ab initio quantum mechanical calculations. Because the furanose ring occupies a central position in the structure of DNA, the calculations have been carried out on a series of furanose-based model compounds. The model compounds have been tailored to be of minimal complexity, in order to investigate one torsion at a time as independently as possible from other torsions and to avoid as much as possible steric clashes between chemical groups which are not directly covalently linked. Therefore, the resulting torsional energy profiles are expected to closely represent the intrinsic torsional energy contributions present in DNA and its components. Each torsional energy profile has been obtained with both the C3′ endo and the C2′ endo conformations of the furanose, which model the two conformational ranges populated by the furanose in DNA.17 Before the calculated energy profiles are used for interpretation, their relevance to DNA is assessed, where possible, by comparison to experimental data. This comparison allows for a better understanding of how the torsional energy associated with the corresponding degrees of freedom contributes to the conformational properties of DNA and related nucleosides and nucleotides. Comparison of the energy profiles obtained with the C3′ endo and the C2′ endo furanose allows analysis of the influence of the furanose conformation on the energetics of the exocyclic torsions, in relationship to the relative stabilities of the A and B forms of DNA. To analyze in more detail the B DNA backbone intrinsic flexibility, we have also investigated the relative intrinsic energies of the backbone BI and BII substates.
Foloppe and MacKerell
Figure 1. Model compounds A, B, C, D, E, and F. Torsions R, β, γ, , ζ, and χ are indicated by the curved arrows.
Although this analysis is of general interest to the structure, dynamics, and energetics of DNA, a major application of the present work will be the calibration of nucleic acids empirical force fields. 2. Methods Structures of the model compounds used in the present work are shown in Figure 1. Each model compound will be referred to by its letter designation throughout the text. For the model compounds we use the atom names and dihedral angle nomenclature corresponding to their nucleic acid counterparts.3 Accordingly, the dihedral angles are defined as follows:
R β γ δ ζ O s P s O5′ s C5′ s C4′ s C3′ s O3′ s P s O χ, purines: O4′-C1′-N9-C4 and χ, pyrimidines: O4′-C1′-N1-C2 The syn and anti orientations of the base relative to the furanose ring are defined as in ref 3. Torsion was modeled using model compounds A and B, torsion γ was modeled using compounds C and D, and torsion β was modeled using compound C. The χ torsion was examined with each of the four usual DNA bases, using the generic structure of compound E. Compound F was designed to model the DNA backbone in its BI and BII substates. Quantum mechanical calculations were carried out with the Gaussian 94 program56 using the 6-31G* and 6-31+G* basis sets for neutral and anionic compounds, respectively. All calculations were performed at the second-order Møller-Plesset (MP2) level of theory, except when noted. Energy minimizations were performed to the default tolerances in the Gaussian program. All torsional energy surfaces (except with compound F for the BI and BII substates) were obtained with both the C3′ endo
Phosphodiester Backbone and Glycosyl Linkage Energetics
J. Phys. Chem. B, Vol. 103, No. 49, 1999 10957
TABLE 1: Modal Values of the r, β, γ, E, ζ, and χ Dihedral Angles Distributions in Deoxyribonucleotides of the A, B, and Z Formsa torsion (deg)
A DNA
B DNA
Z DNA
R β γ ζc χ
291 175 57 205 287 199
298 168 51 187 262 252
70/200b 185 51/179b 265 74/296b 66/207b
a As obtained from NDB as of March ’98 (see Methods section). For the R, , γ, and χ torsions of Z DNA, the two reported values correspond to the two modes in the corresponding distributions. c R and ζ torsional energy profiles have not been investigated in this work; however, these dihedrals were constrained to the present modal values when investigating other torsions (see Methods). Only some modal values for A and B DNA were used to constrain the corresponding torsion in some energy surfaces (see Methods section). b
Figure 3. (A) Scattergram of zeta as a function of epsilon in B DNA crystal structures in the NDB as of March 1998. (B) Potential energy contours as a function of epsilon and zeta for compound F. The BI and BII substates are indicated.
Figure 2. Potential energy as a function of (epsilon) in (A) compound A and (B) compound B with the furanose in the C2′ endo ([) or C3′ endo (0) conformation. The potential energy profiles were offset relative to the global energy minimum (see the Methods section). (C) The probability distribution in A DNA (thick line), B DNA (thin line), and Z DNA (dotted line) crystal structures.
and C2′ endo conformations of the furanose to model the north and south conformational ranges populated by the sugar in DNA, respectively. The furanose pseudorotation angle was constrained to the C3′ endo or C2′ endo conformation by fixing either the C4′-O4′-C1′-C2′ or the C3′-C4′-O4′-C1′ dihedral to 0.0°, respectively. The pseudorotation angle was therefore kept at, or very close to, 18° (north) or 162° (south) by constraining only one furanose endocyclic torsion. This allows the puckering amplitude to adjust in response to changes in the orientation of the exocyclic substituents. With compounds A and C the additional exocyclic torsions (those not systematically varied when investigating a specific torsion) were fixed at their A DNA (C3′ endo furanose) or B DNA (C2′ endo furanose) modal values, as obtained from crystal structures of oligo-deoxyribonucleotides (Table 1; see below). The Z DNA modal values in Table 1 are given for the sake of completeness, but were not used to constrain any torsion. Dihedral angle energy profiles were obtained by fixing the dihedral angle of interest and
optimizing the structure with all other degrees of freedom allowed to relax, except for those mentioned above. All exocyclic torsions have been sampled in a 30° increment. The dihedral of interest in each dihedral angle energy surface was allowed to relax to its energy minima. For each torsion the global energy minimum for both the C2′ endo and C3′ endo furanose energy profiles was located. For clarity, the minimum of lowest energy within an individual energy profile (for a given furanose conformation) will be referred to as the main energy minimum, to avoid confusion with the overall global energy minimum defined above. Torsional energy surfaces were offset relative to either the global or main minima, as noted. The energy barrier between two energy minima is defined here as the energy difference between the energy minimum of lowest energy and the point of highest energy obtained by discrete sampling between the two minima. No attempt has been made to locate the true energy maximum. It has been checked that no intramolecular hydrogen bond is formed between the furanose O4′ oxygen and a 5′-OH or 3′-OH group when investigating torsions β, γ, and with model compounds A, B, C, and D. The BI and BII substates of compound F were both energyminimized at the Hartree-Fock (HF) level from an initial conformation with a C2′ endo furanose, R ) -60.0° and β ) 180.0°. The initial conformations for and ζ in compound F were 180.0° and 270.0°, respectively, for the BI substate, and 270.0° and 180.0° for the BII substate. None of these degrees of freedom were constrained during the energy minimization. The HF optimized BI and BII structures were used as the starting point for further energy optimization at the MP2 level. All the torsional degrees of freedom (including the furanose pseudorotation) remained close to their initial values in the energy-
10958 J. Phys. Chem. B, Vol. 103, No. 49, 1999
Foloppe and MacKerell TABLE 2: Selected Descriptors Related to the E, γ, β, and χ Torsional Energy Profilesa torsion compound
A B
γb
C D
Figure 4. Potential energy as a function of γ (gamma) in (A) compound C and (B) compound D with the furanose in the C2′ endo ([) or C3′ endo (0) conformation. The potential energy profiles were offset relative to the global energy minimum (see the Methods section). (C) The γ probability distribution in A DNA (thick line), B DNA (thin line), and Z DNA (dotted line) crystal structures.
minimized stuctures of compound F at both the HF and MP2 levels of theory. The path of lowest energy between BI and BII was mapped by calculating the HF/6-31+G* potential energy, by systematically varying both and ζ values in 10° increments, from 150° to 300°. When this two-dimensional energy map was calculated, R and β were fixed to their B DNA-like values (Table 1), and the furanose was constrained to C2′ endo. This map is presented as an energy contour plot, with iso-energy lines drawn every 0.25 kcal/mol. The point of highest energy along the path of lowest energy between BI and BII was taken as the most likely energy barrier between BI and BII. The corresponding structure was then energy-minimized at the MP2/6-31+G* level with and ζ fixed and R, β, and the furanose pseudorotation allowed to relax, providing an estimate of the torsional energy barrier energy between BI and BII. Distributions for the dihedral angles in crystal structures of DNA duplexes were from the nucleic acids database57 as of March 1998. Structures containing nonstandard DNA components, bound drugs, or protein were excluded. In Figures 2 and 4-6, the distributions are presented as probability distributions following sorting into 2° and have been obtained separately for the A, B, and Z DNA families. Each of these separate distributions was fitted by a Gaussian function, and the peak of the Gaussian was taken as the modal value of the distribution (Table 1). This procedure yields modal values which do not differ significantly from the modal values obtained directly from the raw distributions: 21 out of the 24 values contained in Table 1 differ by less than 5° from the modal values obtained directly from the raw distributions. The largest deviation (9°) is for β in B DNA, for which the modal value obtained directly from the raw distribution is 177°. The modal values listed in Table 1 are very close to the mean values given by Schneider et al.27
β
C
χ
E
Pucker
base
C2′ endo C3′ endo C2′ endo C3′ endo C2′ endo C3′ endo C2′ endo C3′ endo C2′ endo C3′ endo C2′ endo adenine guanine cytosine thymine C3′ endo adenine guanine cytosine thymine
∆E
B
275 198 300 287 293 46 47 49
0.9