Free-Energy Perturbation Calculations of DNA Destabilization by Base

Here we report calculations of the relative free energy of nucleotide ... stabilities of non-Watson−Crick base pairs by “first principle” calcul...
0 downloads 0 Views 137KB Size
10092

J. Phys. Chem. B 2000, 104, 10092-10099

Free-Energy Perturbation Calculations of DNA Destabilization by Base Substitutions: The Effect of Neutral Guanine‚Thymine, Adenine‚Cytosine and Adenine‚Difluorotoluene Mismatches Jan Floria´ n,*,†,‡ Myron F. Goodman,‡ and Arieh Warshel† Department of Chemistry and Department of Biological Sciences, UniVersity of Southern California, Los Angeles, California 90089-1062 ReceiVed: May 11, 2000; In Final Form: August 18, 2000

The ability to determine the stability of non-Watson-Crick base pairs in DNA by computer simulation is a necessary prerequisite for quantitative theoretical studies of DNA replication fidelity. Here we report calculations of the relative free energy of nucleotide mismatches in DNA. Our study evaluated the “solvation” free energy cost associated with thymine (T) f difluorotoluene (F), T f cytosine (C), and C f T transformations in aqueous solution and in a DNA duplex. The free energy differences were evaluated by the free energy perturbation (FEP) method based on classical all-atom simulations in a spherical surface-constraint water model. The dependence of the calculated free energies on the DNA sequence, the simulation length, size of the water droplet, and phosphate charges was examined. The calculations were carried out for both the negatively charged DNA backbone neutralized by explicit Na+ counterions and the neutral backbone with no counterions. Although the latter model provided overall free energy differences that were less sensitive to the radius of the simulation sphere and to the total simulation time, the results obtained with short 150 ps simulations for the fully charged system containing counterions in 18 Å water sphere showed the best overall agreement with the corresponding experimental results. The overall Watson-Crick geometry of an adenine‚thymine (A‚T) base pair has not changed during its transformation into an A‚F base pair, where F is a nonpolar isostere of thymine. The calculated change in the duplex binding free energy at 25 °C (∆∆G°bind) for this “mutation” was 5.1 kcal/mol, compared to 3.6 ( 1.7 kcal/mol determined previously from the observed DNA melting thermodynamics. The transformation of C, forming a Watson-Crick pair with guanine (G), into T yielded a wobble G‚T mismatch analogous to the one observed by X-ray crystallography. ∆∆G°bind obtained for this mismatch (relative to the Watson-Crick G‚C base pair) by FEP calculations were in a 1.0-3.4 kcal/mol range depending on DNA sequence and simulation protocol used. This result is in reasonable agreement with the experimental estimate of about 3.6 kcal/mol. Starting from an A‚T base pair, mutation of the thymine carbonyl group into the amino group led to a change of the overall geometry of the base pair from Watson-Crick to reverse wobble. The ∆∆G°bind of the resulting neutral A‚C mismatch (relative to the Watson-Crick A‚T base pair) calculated using the neutral and ionic DNA model was 10 and 7 kcal/mol, respectively. Using the observed binding affinity and pKa constants of the N1-protonated A+‚C base pair, the corresponding experimental ∆∆G°bind of the neutral A‚C base pair was determined to be 7.3 ( 1.5 kcal/mol. Our ability to reproduce reasonably stabilities of non-Watson-Crick base pairs by “first principle” calculations will be used in future calculations of DNA polymerase fidelity.

1. Introduction Base substitution mutations typically result from the errant copying of a DNA template when DNA polymerase fails to form a canonical adenine-thymine (A‚T) or guanine-cytosine (G‚C) base pair. Nucleotide misinsertion errors occur on the order of 10-3 to 10-6, depending on the properties of the polymerase, the type of mispair formed and its surrounding sequence context.1 In principle, DNA polymerase-catalyzed mispairs involve a variety of species, including neutral or charged (ionized or protonated) forms of nucleobases or their minor tautomers.2 The mispairs can assume Watson-Crick (W-C) geometries, but they are usually present in alternative conformations such as wobble structures.1 In the case of * Corresponding author. † Department of Chemistry. ‡ Department of Biological Sciences.

transition mutations, neutral wobble and ionized thymineguanine (T-‚G) and protonated adenine-cytosine (A+‚C) mispairs have been identified by NMR3,4 and X-ray diffraction,5 and have been studied further using DNA melting thermodynamics6,7 and enzyme kinetics.8 Although there is currently no convincing evidence for the participation of minor imino or enol tautomeric forms of the common base mispairs in the replication process, their involvement cannot be ruled out. Since mispairs might contain neutral and/or charged structures,9 some being too elusive for direct experimental examination, it is our goal to develop theoretical methods for examining the energetics of each of the mispaired components as a part of a gradual progress toward the understanding fidelity of enzymatic DNA replication.10 A detailed understanding of any biochemical process should include quantitative structure-function correlations in the form of free-energy differences among the states of interest in a given

10.1021/jp001760z CCC: $19.00 © 2000 American Chemical Society Published on Web 10/07/2000

Free Energy of Nucleotide Mismatches in DNA

J. Phys. Chem. B, Vol. 104, No. 43, 2000 10093

Figure 2. AMBER atom types and atomic charges for difluorotoluene.

Figure 1. Snapshot of a simulated DNA decamer inside an 18 Å water sphere. The mutated base was positioned near the center of this sphere.

macromolecular system.11 Such relationships are in principle obtainable from all-atom statistical-mechanical simulations. However, a quantitative evaluation of the relative free energies of the canonical and mismatched base pairs in DNA by computer simulations has not yet been reported. This might have been due to technical difficulties related to long-range effects or other convergence problems. In this work, we investigate the binding free energy change associated with the “mutation” of WC A‚T and G‚C base pairs into neutral A‚C and G‚T mispairs, respectively, in double helical DNA immersed in a water droplet (Figure 1). The method used is the free-energy perturbation (FEP) calculations for T f C and CfT mutations in DNA and in aqueous solution. The stability of the calculated results with respect to various computational parameters, such as the simulation length, size of the simulation sphere, and the phosphate neutralization, is analyzed. An identical computational methodology is also used to calculate the free energy cost of mutating thymine into its nonpolar isostere difluorotoluene (F)12 in an A‚T base pair embedded in DNA. 2. Computational Methods The configurational ensembles for the evaluation of free energies were generated from the 150 ps to 4 ns unconstrained molecular dynamics (MD) trajectories using the AMBER force field13 implemented in the program Q.14 This force field was selected for its ability to provide stable duplex DNA conformations from unconstrained simulations.15-20 In addition, our program ENZYMIX,21 which allowed us to use induced dipoles on the DNA atoms, was used to examine the force-field dependence of the calculated results. The B-DNA decamer 5′-d(CATGGCCATG)222 was used as a starting structure for the MD simulations. (The position of the mutated base is denoted by bold type). This decamer was neutralized by adding 18 Na+ ions 4.0 Å from the phosphorus atoms. Prior to the calculation of the G‚C f G‚T transformation, the A‚T base pair in the 5′-d(CATGGCCATG)2 decamer was

replaced by the G‚C base pair using the command “mutate residue” in the SYBYL 6.6 program.23 To evaluate the sequence and starting structure dependence of the calculated free energies, the FEP calculation of the G‚C f G‚T transformation was also carried out for the Dickerson-Drew B-DNA dodecamer 5′-d(CGCGAATTCGCG)2.24 The simulated DNA was immersed in 18 or 21 Å sphere of TIP3P water molecules (Figure 1) subject to the surfaceconstraint all-atom solvent (SCAAS) type boundary conditions.14,25,26 The mutated base was positioned near the center of this sphere. A restraining harmonic potential with the force constant of 100 kcal/mol/Å2 was applied on the position of the phosphorus atom of the mutated nucleotide to prevent diffusion of DNA to the edge of the simulation sphere. In addition, positions of DNA atoms protruding beyond the water sphere (Figure 1) were fixed and their nonbonded interactions with the atoms within the simulation sphere were turned off. The DNA in the surface-constrained water sphere (which is designed to mimic an infinite aqueous solution) was equilibrated by the 120 ps simulation that included 10 ps initial simulation with constrained DNA at the temperature 10 K, gradual heating from 10 to 298 K in 20K increments over period of 5 ps (total 75 ps), followed by 30 ps unconstrained simulation with the step 1 fs and without shake on the DNA bonds at 298 K. A shorter 50 ps equilibration protocol was used for the simulation of nucleosides in the same surface-constrained water droplet. Otherwise, the computational protocol in equilibration and production parts of nucleoside simulations was identical to the one used in corresponding DNA simulations. In both the equilibration and production simulations, the nonbonded interactions were evaluated explicitly for distances shorter than 10 Å. The local-reaction field (LRF) method26,27 was used to treat long-range electrostatic interactions for distances beyond 10 Å cutoff. The free energy differences (∆G°) were calculated by the series of single-topology FEP calculations for 51 equally spaced “windows” at 298 K. In the calculations of ∆G° for the T f C and TfF, the atomic radii and charges of thymine residue were mutated concomitantly to their values in cytosine or difluorotoluene. The hydrogen atoms of the thymine methyl group and H3 were mutated to “nothing”, i.e., a dummy atom with zero charge and atomic radius. Analogously, charges and atomic radii of the cytosine molecule were gradually changed to those in thymine in the calculation of ∆∆G°(C f T). This change included shrinking the hydrogen atoms of the cytosine amino group into “nothing”. The mutated region (also called region I) involved the whole nucleobase including the C1′ and H1′ atoms. The atomic charges that were used for the difluorotoluene residue, which are not a standard part of the AMBER force field, are presented in Figure 2. These charges were obtained

10094 J. Phys. Chem. B, Vol. 104, No. 43, 2000 by the ab initio HF/6-31G* calculation of the molecular electrostatic potential, followed by the least-squares fit of the atomic charges to reproduce this potential. Standard AMBER parameters for the atom types indicated in Figure 2 were used for other terms of the difluorotoluene force field. For the neutral DNA calculations, the charges on the ionic phosphate oxygens (O1P and O2P) were changed to -0.2761 au (from -0.7761 au). FEP calculations11,28,29 were carried out with the step 1 fs for 150 to 500 ps trajectories and 2 fs for 1 to 4 ns trajectories. In these calculations, the “shake” algorithm was applied to constrain the bond distances of DNA and solvent molecules. The energy of the system was sampled every 10th step. The first 100 energies (1-2 ps) were discarded for each window before the free energy change was evaluated. The contributions of intramolecular energy terms were determined by FEP simulations of the mutation of neutral nucleosides in gas phase. Separate FEP calculations with the ENZYMIX program and the polarizable ENZYMIX force field21 were carried out to estimate the contribution of the atomic polarizabilities on DNA atoms to the base substitution energetics. For this purpose, we compared the results of two 500 ps simulations (10 ps per FEP window, 1 fs integration steps) that were carried out using the ENZYMIX force field with the induced dipoles turned off and on. The A-coefficients for the van der Waals repulsion term (Aii) of the C3, N2, N3, O2, and O1 atom types of the ENZYMIX force field were decreased to 1200, 700, 700, 400, and 400 kcal/mol, respectively. This change was needed because the use of the default values of these parameters, which were determined for the simulation of proteins, resulted in DNA melting during the first 200 ps of the simulation. In addition, AMBER atomic charges13 were used for all DNA atoms. The “shake” algorithm was not used. Other parameters for the FEP calculations with ENZYMIX were identical to those described above for the simulations with the program Q. 3. Results and Discussion 3.1. A‚T f A‚F Substitution. Thymine was converted into difluorotoluene by replacing its pyrimidine ring and oxygen atoms by benzene ring and fluorine atoms, respectively. The free energy change accompanying this mutation when thymine and difluorotoluene are inserted in double helical DNA opposite adenine is presented in Table 1 in the column denoted ∆∆GDNA. The ∆∆GDNA energies reflect the change in the interaction of the mutated base with its environment, which includes DNA and the surrounding solvent. Because thymine and difluorotoluene are attached to DNA by covalent bonds, it is practically impossible to evaluate ∆∆GDNA experimentally. On the other hand, the calculated change in the hydration free energy, ∆∆Ghydr (Table 1) has a meaningful experimental counterpart. Because ∆Ghydr corresponds to the change in the free energy upon moving solute from 1 M gas to 1 M aqueous solution,30 ∆∆Ghydr can be determined, at least in principle, from the observed partition coefficients of thymine and difluorotoluene between aqueous solution and the gas phase. Unfortunately, for thymine, the vapor pressure that would be in equilibrium with its aqueous solution turned out to be below the detection limit of current experimental methods.30,31 Thus, for the sake of comparison with the results of our FEP calculations, we use for thymine a ∆Ghydr of -12.6 kcal/mol estimated by the Langevin dipoles solvation model.32.33 Also, experimental ∆Ghydr for difluorotoluene is not available in the literature, but its magnitude can be reasonably estimated as -0.3 kcal/mol by correcting the observed ∆Ghydr for toluene (-0.9 kcal/mol).34

Floria´n et al. TABLE 1: Calculated Destabilization of a Negatively Charged DNA Duplexa by the Presence of a A‚F Mispair (kcal/mol)

simulation time (ns)

simulation sphereb (Å)

∆∆GDNAc,d

∆∆Ghydrd

∆∆Gibind

0.15 0.5 1.0 2.0 4.0 0.15 0.5 1.0 2.0 4.0 0.15 0.5 1.0 2.0 exptf

15 15 15 15 15 18 18 18 18 18 21 21 21 21

26.2 25.9 25.5 24.5 25.0 21.5 22.0 22.1 21.8 21.6 24.3 24.0 24.0 23.6

17.8 17.8 17.5 16.8 17.1 16.4 16.3 16.2 16.0 16.2 16.4e 16.3e 16.2e 16.0e

8.4 8.1 8.0 7.7 7.9 5.1 5.7 5.9 5.8 5.4 7.9 7.7 7.8 7.6 3.6 ( 1.7

a DNA was neutralized with Na+ counterions. b Radius of the sphere of water molecules used in the calculation of ∆∆GDNA and ∆∆Ghydr. c Relative to the (5′-CATGGCCATG) DNA duplex22 containing the 2 canonical A‚T base pair (Figure 3). The actual position of the mutated base in the DNA sequence is indicated by the bold letter. d The presented ∆∆GDNA and ∆∆Ghydr energies do not include intramolecular free-energy changes associated with the mutation process. This contribution, evaluated by mutation of an isolated (gas-phase) nucleoside using 2 ns FEP calculation, amounts to 16.6 kcal/mol. The error of the calculated free energies was estimated as half of the difference between the free-energies calculated using the forward and reverse sampling. For all simulations presented in this table, this error was found to be less than 3% of the reported free energy. Note however, that the actual error should be determined by comparing different simulation conditions (see the text). e 18 Å simulation sphere. f Determined from the observed melting temperatures for the 5′-CTTTTCTTTCTT DNA duplex (the position of the base mutated to F is indicated by the bold letter).12 Note that this experimental ∆∆Gbind corresponds to a different sequence (-CTT-) than the sequence studied by our simulation (-ATG-). Therefore, we increased the uncertainty of the experimental ∆∆Gbind presented in this table from the value of 0.7 kcal/mol reported by Moran et al.12 to 1.7 kcal/mol.

Here, the larger surface area of fluorine atoms was taken into account by adding half of the difference in ∆∆Ghydr for CH4 (1.9 kcal/mol) 34 and CF4 (3.1 kcal/mol).34 The difference between ∆Ghydr of thymine and difluorotoluene mentioned above yields ∆∆Ghydr (T f F) of 12.3 kcal/mol, which is about 4 kcal/mol smaller than ∆∆Ghydr calculated in this work. This discrepancy might be related to the parametrization of the difluorotoluene force field, for which we used the default AMBER atom types (Figure 2). However, a detailed examination of this rather complex issue exceeds the scope of the present paper. Using the calculated ∆∆GDNA and ∆∆Ghydr free energies and the thermodynamic cycle presented in Figure 3, it is possible to evaluate the change in the standard binding free energy of the two complementary single-stranded oligonucleotides caused by the nucleotide base substitution in one of these oligonucleotides. It should be noted, however, that the free energy change associated with the mutation of a nucleoside in a single-stranded DNA is approximated here by the corresponding change for mutation of the nucleoside in water (Figure 3). This is a reasonable approximation because the stacking interactions in the single stranded DNA are weak so that each nucleoside base is surrounded most of the time by water molecules only. The calculated ∆∆G°bind has the advantage of being comparable to ∆∆G°bind determined calorimetrically or by measuring the dependence of DNA melting temperature (Tm) on the oligonucleotide concentration.35 Unfortunately, for mutations in

Free Energy of Nucleotide Mismatches in DNA

J. Phys. Chem. B, Vol. 104, No. 43, 2000 10095 TABLE 2: Calculated Destabilization of a Neutral DNA Duplexa by the Presence of the A‚F Mispair simulation time (ns) 0.15 0.5 1.0 2.0 4.0 0.15 0.5 1.0 2.0 4.0 expt

Figure 3. Thermodynamic cycle that was used to calculate ∆∆G°bind for the A‚T f A‚C, A‚T f A‚F, and G‚C f G‚T mutations (X ) T, C, Y ) C, F, T). The actual DNA sequences shown correspond to the A‚X f A‚Y mutation in the 5′-d(CAXGGCCATG)2 DNA decamer. ∆Gss denotes the difference in the hydration free energy of an isolated nucleoside and the same nucleoside in single-stranded DNA. It is assumed that ∆∆Gss ) ∆Gss (X) - ∆Gss (Y) ) 0. -T∆Ssym ) RT ln2 ) 0.4 kcal/mol is the free energy contribution due to the loss of the C2 symmetry of duplexes made of self-complementary oligonucleotides (the CAXGGCCATG sequence shown in the figure is selfcomplementary for X ) T, otherwise ∆Ssym is zero). ∆∆Gibind represents the intrinsic contribution of the given nucleobase mutation to the overall stability of the DNA duplex. ∆∆G°bind ) ∆∆Gibind for nonsymmetric duplexes.

duplexes that possess C2 symmetry, ∆∆G°bind contains an additional entropy term. Consequently, the magnitude of ∆∆G°bind cannot be rigorously considered as a measure of the intrinsic stability of the studied mismatch in its local environment. Therefore, we introduced the term ∆∆Gibind, which is independent of the overall DNA symmetry. The ∆∆Gibind values predicted by our calculations for the AT f AF mutation in the fully charged DNA + counterion system overestimate the experimental estimate of 3.6 ( 1.7 kcal/ mol by 1.5 to 4.8 kcal/mol (Table 1). This error may have a similar origin as the overestimate in ∆∆Ghydr for the T f F mutation, which we tentatively attributed to the difluorotoluene force field. To investigate other possible sources of this discrepancy, and also to verify the robustness of our computational methodology, we examined the extent to which the calculated results are affected by various simulation conditions. The calculated ∆∆Gibind (T f F) is relatively insensitive to the simulation length. Although this is an encouraging result, it is quite possible that even our longest 4 ns simulation may not capture completely the effect of the DNA conformational variability.36 This is because the distant DNA atoms that do not fit inside the simulation sphere are constrained at their X-ray geometry corresponding to the initial state of the transformation process. This might be the reason the 150 or 500 ps simulations show practically the same level of agreement or disagreement with the experiment as the much longer 2 or 4 ns simulations. However, the relative importance of the simulation length should increase for larger simulation spheres. Recently, it has been shown by Cubero et al.17 that the A‚F base pair tends to open for several hundred ps by flipping F out of the DNA helix after several ns of MD simulation, during which DNA stays in its standard duplex form. Although this opening was not sampled

(kcal/mol) ∆∆GDNAb 15 Å Simulation Sphere 20.9 20.9 20.8 21.0 21.1 18 Å Simulation Sphere 21.4 21.3 21.0 21.2 21.3

∆∆Gibind 3.1 3.1 3.3 4.2 4.0 5.0 5.0 4.8 5.2 5.1 3.6 ( 1.7c

a Relative to the (5′-CATGGCCATG) DNA decamer containing 2 the canonical A‚T base pair (Figure 3). The actual position of the mutated base in the DNA sequence is indicated by the bold letter. The calculated ∆∆Ghydr energies are given in Table 1. b See footnote d in Table 1. c See footnote f in Table 1.

during our simulation, the contribution of an infrequent base pair opening to the calculated free energy of the A‚F mismatch should be less than RTln 2, i.e., below 0.4 kcal/mol at 298 K. Another important component of our computational protocol is the size of the explicit water sphere that was used for FEP simulations. The effects of the radius of the simulation sphere were studied for the T f F mutation (Tables 1 and 2) and also for the T f C mutation (see below). The calculated ∆∆GDNA for the T f F mutation in a negatively charged DNA with Na+ counterions changes from 25.0 to 21.6 and 23.6 kcal/mol upon going from a 15 to 18 Å and 21 Å simulation sphere, respectively (Table 1). The sphere-size dependence of ∆∆Ghydr was found to be significantly smaller. Although, ideally, the calculated free energies should not depend on the radius of the similation sphere,26 our calculations show that such a goal is very difficult to achieve if the simulated system contains both negatively charged DNA and counterions. We suspect that a significant part of the sphere-size dependence of ∆∆GDNA can be attributed to the fact that Na+ ions, which happen to be close to the outer edge of the simulation sphere, interact in an nonphysical way with the polarization-constrained surface water molecules. This would not be surprising since the SCAAS model used here25 has been designed to treat solutions with zero ionic strength. In addition to the boundary effects, part of the error in the calculated ∆∆Gibind might be due to insufficient sampling of the DNA-water-counterion system. Since the full -1 e and +1 e charges on the phosphate groups and the counterions partially immobilize the water molecules in their vicinity, the polarization response of the solvent to the changes in the DNA structure is rather slow. Although the results presented for 150 ps to 4 ns trajectories (3-80 ps per window) seem to converge reasonably well, these trajectories might have been too short for a correct sampling of a charged macromolecule surrounded by an ionic atmosphere. In other words, the correlation time for the response of the solvated ionic atmosphere to the change in the DNA structure may be too long to be captured by the present simulations. Consequently, our simulations are not sufficiently accurate to address the effects of the ionic strength on ∆∆Gibind. In this respect, the macroscopic approaches of modeling the ionic atmosphere by a grid of residual charges37-39 would be more appropriate for addressing ionic strength effects than the explicit simulation. In particular, the trajectories obtained with

10096 J. Phys. Chem. B, Vol. 104, No. 43, 2000

Floria´n et al.

TABLE 3: Comparison of the Calculated and Experimental Geometry of the A‚F Base Pair angle (deg) distance (Å)a sequence

method

F4F-N6A

N3F-N1A

C1′F-C1′A

propeller twistb

-AFG-AFG-TFG-f

MDd MDe NMR

3.1 3.1 4.1

3.4 3.5 3.3

11.5 11.5 10.4

16.3 6.6 18.9

C1′-N torsion (A, F)c -122, -136 -114, -125 -112, -127

a Standard numbering of nucleobases is used. Superscripts F and A denote difluorotoluene and adenine, respectively. F4 in F replaces O4 in T. The angle between normal vectors of the planes defined by the atoms of the F and A rings. The plane equations were calculated using the program SYBYL.23 Positive angles correspond to the clockwise rotation of the F-plane with respect to the A-plane when viewed along the C1′F f C1′A vector. c Torsion of the glycosidic bond (O1′-C1′-N9-C4 for A, O1′-C1′-N1-C2 for F). d Average of the 80 ps trajectory from the last window of the FEP calculation for the T f F mutation in the negatively charged 5′-d(CAFGGCCATG) DNA decamer surrounded by Na+ counterions and a 21 Å water sphere. e Average of the 80 ps trajectory from the last window of the FEP calculation for the T f F mutation in the 5′-CAFGGCCATG DNA decamer with neutral phosphates, surrounded by water droplet with an 18 Å radius. f 5′-CGCATFGTTACC DNA dodecamer.42 b

the current explicit approach should be followed by an evaluation of the macroscopic energy of moving the counterions to infinity and adding the macroscopic ionic atmosphere at the given ionic strength. At any rate, inaccuracies in the calculated ∆∆Gibind that stem from the imperfect description of the effects of the ionic atmosphere in aqueous solution can be expected to be less pronounced in the simulations involving a protein environment. This is because the phosphate charges become neutralized by forming close contacts with arginine and lysine residues of the protein, thus decreasing the local charge densities in the simulated DNA + protein system. Because the possible computational limitations described in the preceding two paragraphs are related to the high charge density present in the simulated system, we recalculated ∆∆GDNA for the A‚T f A‚F mutation for the neutral DNA model (Table 2). In this model, we removed all Na+ counterions from the simulated system and modified the phosphate charges in such a way that the DNA becomes neutral. (Such a procedure is similar in some respect to the one used in our simulations of ionized proteins38,40 where the calculations are carried out with neutral ionizable groups, and the effect of these groups is then treated on a macroscopic level.41) This simplification of the studied system improved considerably the sphere-size dependence of the calculated ∆∆GDNA. Also, the ∆∆Gibind (T f F) of 5.1 kcal/mol (obtained for the 4 ns simulation of the neutral DNA simulation in an 18 Å sphere) is in reasonable agreement with the observed ∆∆Gibind of 3.6 ( 1.7 kcal/mol. Although the neutral DNA model offers a chance to obtain more stable results, it is important to verify that this simplification does not lead to a significant change in the DNA structure. Because of the small simulation spheres, the DNA is allowed to change its geometry only in the vicinity of the mutated base pair. Therefore, the changes in the geometry of the A‚F base pair should reveal any artifacts associated with the neutral DNA model. In Table 3, we compare the average structure of the A‚F base pair obtained for the neutral DNA in an 18 Å water droplet with a charged DNA model in a 21 Å water droplet. We also included in this table intermolecular geometrical parameters for the A‚F base pair determined by NMR spectroscopy.42 Regardless of the simulation conditions, the A‚F base pair retained an overall structure similar to the WC A‚T base pair. A closer inspection of the calculated F4‚‚‚N6, N3‚‚‚N1 distances, which coincide with the hydrogen bonds in the isosteric A‚T base pair, reveals that phosphate neutralization does not lead to an increase in the intermonomer distance. In addition, both simulation protocols yield positively propeller-twisted base pair and similar nucleobase orientation with respect to their deoxyribose rings. Also, a comparison of the calculated and experimental geometries indicates that both simulation protocols provide realistic structural representation of the A‚F base pair. A notable

Figure 4. Comparison of the observed (X-ray diffraction, 2.5 Å resolution45) and calculated (average geometry from 80 ps trajectory with neutral phosphate and an 18 Å water sphere) geometry of the G‚T mismatch involving nucleosides G4 and T21 of the DickersonDrew B-DNA dodecamer.

difference between the calculated and NMR structures of the A‚F base pair involves the width of the minor groove, characterized by the distance between C1′ atoms. This difference may indicate that the parameter for the van der Waals repulsion term for fluorine atoms is too small (making the F4‚‚‚N6 distance too short and the C1′‚‚‚C1′ distance too long). However, it could also reflect differences in stacking interactions associated with different DNA sequences used in MD and NMR studies. 3.2. G‚C f G‚T Substitution. The investigation of a single mutation, however detailed, cannot yet establish computer simulations as a relible tool for calculations of the energetics of non-Watson-Crick base pairs, nor can it reveal all of the obstacles that may accompany such calculations in the DNA or the DNA-protein complexes. This is especially true for the conversion of A‚T f A‚F, which represents the simplest possible case, because the topology of the mutated base is unchanged during the mutation process. Technically more challenging are those situations in which one or more atoms are created from “nothing” or the existing atoms are anihilated into “nothing”. Two such cases, namely G‚C f G‚T and A‚T f A‚C transformations, which fall into the category of transition (pyrimidine f pyrimidine) mutations, are examined in the second part of this work. The G‚T base pair was identified as the most stable mispair of all possible non-WC base pairs in DNA duplex.43 Crystallographic studies44,45 have shown that the structure of the G‚T base pair is stabilized by two O‚‚‚H-N hydrogen bonds. This hydrogen bonding arrangement was enabled by a small shift of the thymine residue into the DNA major groove, resulting in a “wobble” mispair. The gradual transformation of C into T during the FEP simulation caused the G‚T mispair to adopt the wobble structure very similar to that observed by X-ray spectroscopy (Figure 4). The only difference between the crystal and calculated structures involves thymidine deoxyribose moiety that was calculated to be in the C3′-endo (A-DNA) conformation.

Free Energy of Nucleotide Mismatches in DNA

J. Phys. Chem. B, Vol. 104, No. 43, 2000 10097

TABLE 4: Sequence and Simulation Time Dependence of the DNA Destabilization by the Presence of a G‚T Mispair (kcal/mol)

simulation time (ns)

∆∆GDNAc

∆∆Gibind

∆∆Ghydrc

(

)

Negatively Charged 5′-CATGGCCATG 3′-GTGCCGGTAC DNA Decamer with Na+ Counterionsa 0.15 8.3 6.2 2.1 0.5 7.5 5.7 1.8 1.0 6.2 5.0 1.2 2.0 6.3 5.3 1.0 4.0 6.3 5.2 1.1 a DNA Decamer Neutral 5′-CATGGCCATG 3′-GTGCCGGTAC 8.3 6.2 2.1 7.8 5.7 2.1 7.3 5.0 2.3 6.8 5.3 1.5 6.9 5.2 1.7 3.6 ( 0.4

(

0.15 0.5 1.0 2.0 4.0 exptd

)

b Neutral 5′-CGCGAATTTGCG DNA Dodecamer 3′-GCGCTTAAGCGC 0.15 9.5 6.2 3.3 1.0 8.4 5.0 3.4 2.0 7.2 5.3 1.9 4.0 7.9 5.2 2.7 expte 3.7 ( 0.4

(

a

)

(

)

Relative to the 5′-CACGGCCATG DNA decamer containing 3′-GTGCCGGTAC the canonical G‚C base pair (Figure 3). The actual position of the mutated base in the DNA sequence is indicated by the bold letter. b Relative to the (5′-CGCGAATTCGCG) DNA dodecamer24 containing 2 the canonical G‚C base pair (Figure 3). The actual position of the mutated base in the DNA sequence is indicated by the bold letter. c Calculated for the 18 Å simulation sphere. The presented ∆∆G DNA and ∆∆Ghydr energies do not include intramolecular free-energy changes associated with the mutation process. This contribution (evaluated by 2 ns FEP calculations for the cytidine f thymidine mutation in the gas-phase) was 87.6 kcal/mol. The error of the calculated free energies was estimated as half of the difference between the free-energies calculated using the forward and reverse sampling. This error was found to be less than 1% of the reported free energy for all simulations presented in this table. Note, however, that the actual error should be determined by comparing different simulation conditions (see the text). d Calculated from the experimental thermodynamic parameters for nearest-neighbor DNA steps7 as follows. ∆∆Gibind ) ∆G°25 (GC f GT) ) ∆G°25 (G‚T) - ∆G°25 (G‚C) ) ∆G°25 (CG/GT step) + ∆G°25 (GT/TA step) - ∆G°25 (CG/GC step) - ∆G°25 (GT/CA step) ) -0.61 - 0.03 + 2.5 + 1.70 ) 3.6 kcal/mol. The 0.4 kcal/mol error range was obtained by using 0.1 kcal/mol uncertainty for each nearest neighbor parameter.7 e Calculated from the experimental thermodynamic parameters for nearest-neighbor DNA steps7 as follows. ∆∆Gibind ) ∆G°25 (GC f GT) ) ∆G°25 (G‚T) - ∆G°25 (G‚C) ) ∆G°25 (CG/GT step) + ∆G°25 (GA/TT step) - ∆G°25 (CG/GC step) - ∆G°25 (GA/CT step) ) -0.61 + 0.28 + 2.5 + 1.57 ) 3.7 kcal/mol.

It is however arguable whether the 2.5 Å resolution of the crystal structure is good enough for reliable determination of the sugar puckering. The wobble structure was reached even in the shortest 150 ps simulation, regardless of the DNA sequence or the phosphate charges used in the simulation. The energetic cost of mutating the G‚C base pair into the wobble G‚T pair is summarized in Table 4. The calculated ∆∆Gibind falls in a 1 to 3.4 kcal/mol range for simulation times between 150 and 4000 ps and for three different simulated systems. The corresponding experimental ∆∆Gibind amounts to 3.6 ( 0.2 and 4.3 ( 0.2 kcal/mol for the -ACA- and -TCTsequences, respectively.43 Unfortunately, these sequences differ from the -ACG- and -TCG- sequences studied in this work.

Figure 5. Comparison of the geometries of the protonated (A+‚C, wireframe-model) and neutral (A‚C, stick-model) adenine-cytosine base pairs. The coordinates of the heavy atoms of the protonated base pair were taken from ref 5. The atomic coordinates of the neutral base pair were determined by averaging over the 80 ps trajectory for the last window in the A‚T f A‚C free-energy perturbation calculation.

Therefore, we evaluated “experimental” ∆∆Gibind for -ACGand -TCG- sequences (Table 4) using the nearest neighbor parameters of Allawi and SantaLucia46 as 3.6 ( 0.4 and 3.7 ( 0.4 kcal/mol, respectively. Note that the nearest neighbor parameters were determined by the least-squares fit to a large set of experimental binding free energies for DNA oligomers,46,47 and as such they represent reliable estimates of ∆∆Gibind. ∆∆Gibind of 1.0 kcal/mol was calculated by a 4 ns FEP simulation of the fully charged DNA decamer involving the -ACG- sequence. Using the neutral DNA simulation for the same sequence, the calculated ∆∆Gibind improved to 1.7 kcal/ mol. The best agreement between the calculated and observed ∆∆Gibind was obtained for the -TCG- sequence that was a part of Dickerson-Drew B-DNA dodecamer, for which a newly refined structure with a 1 Å resolution was available.24 3.3. A‚T f A‚C Substitution. DNA melting thermodynamics6 and NMR experiments3,4,48 have shown that the A‚C mismatch in DNA contains protonated adenine at neutral pH. In addition, wobble A‚C mismatch containing N1-protonated adenine (Figure 5) is consistent with the crystal structure of this mispair.5 Because the melting temperatures of DNA molecules containing the A‚C mismatch at neutral pH reflect mainly the contribution of the protonated base pair, and DNA melting experiments for pH > 8 have not been reported, ∆∆Gibind for the neutral base pair has not been published. The FEP calculations presented in this work provide a straightforward means for the evaluation of this quantity. In fact, the calculation of ∆∆Gibind for the neutral A‚C mismatch by FEP simulations is significantly easier than for the protonated base pair. In this respect, the theoretical and experimental approaches nicely complement each other. Intuitively, the ∆∆Gibind for the neutral AC mispair should be significantly larger than the corresponding value for the protonated base pair. This is so because ∆∆Gibind reflects the difference in free energies of the duplex and single-stranded DNA molecules (Figure 3), and the neutral adenosine in singlestranded DNA is more stable than protonated adenosine (pKa of adenosine is 3.5). To provide a more quantitative experimental estimate of ∆∆Gibind (A‚T f neutral A‚C) we used the relationship

∆∆Gibind (A‚T f neutral A‚C) ) ∆∆Gibind (A‚T f A+‚C) + ∆∆Gibind (A+‚C f A‚C) (1) To estimate ∆∆Gibind (A‚T f A+‚C) we used a set of nearestneighbor DNA parameters for AC mispair at pH ) 5,7,49

10098 J. Phys. Chem. B, Vol. 104, No. 43, 2000

Floria´n et al. TABLE 5: Calculated Destabilization of a DNA Duplexa by the Presence of a Neutral A‚C Mispair simulation time (ns)

Figure 6. Thermodynamic cycle used for the evaluation of the ∆∆Gibind for the A+‚C f A‚C mutation in a duplex DNA. ∆GM ) 2.3RT(pKa pH) and ∆GDNA ) 2.3RT(pKa - pH) denote, respectively, free energies for the protonation of free adenosine (pKa ) 3.5) and adenosine embedded in the duplex DNA (pKa ) 7.1 ( 0.5). The latter pKa value was determined as an average of the 7.54 4 and 6.6 48 values reported by two different research groups.

obtaining ∆∆Gibind (A‚T f A+‚C) ) 2.4 ( 0.4 kcal/mol. In the next step, we determined the magnitude of ∆∆Gibind (A+‚C f A‚C) as 4.9 ( 1.1 kcal/mol using the thermodynamic cycle of Figure 6 and the pKa constants for N1-protonation of free adenine and adenine embedded in duplex DNA opposite to C.4,48 Thus, the overall ∆∆Gibind (A‚T f neutral A‚C) becomes 7.3 ( 1.5 kcal/mol. The large estimated ∆∆Gibind (A‚T f neutral A‚C) agrees qualitatively well with the results of the FEP calculations (Table 5). Here, again, the neutral DNA model provided more stable results with respect to the variations in the size of the simulation sphere, and both models showed a good stability with respect to the variations in the simulation time. While the calculations with the neutral DNA model overestimated ∆∆Gibind by about 3 kcal/mol, the results for the A‚T f neutral A‚C mutation in the negatively charged DNA enclosed in the 18 Å simulation sphere were in a perfect agreement with the corresponding experimental estimate. Also, in both cases the simulations yielded the reversedwobble AC base pair shown in Figure 5. This base pair arrangement was previously obtained by MD simulation4 and it is also consistent with the H NMR spectra observed at pH ) 8.9.4 However, given the large experimentally based ∆∆Gibind (A‚T f neutral A‚C), it is quite possible that the cytosine may be flipped out of the DNA duplex instead of forming the unstable reversed-wobble base pair. Such a conformation, although conceivable, was not sampled by our FEP simulation due to its limited length. Thus, as in the case of the A‚F base pair, the calculated ∆∆Gibind (A‚T f neutral A‚C) may be larger than its experimental counterpart because it reflects the stability of the conformer, which differs from the one observed experimentally. More probably, however, the discrepancies between the calculated and experimental ∆∆Gibind obtained in this work originate from the difficulties to reproduce accurately the energetics of stacking interactions between nucleobases using a relatively simple force field. In this respect, the most desirable force field improvement is its ability to mimic the electronic polarization of nucleobases. The polarizable force field might also lower calculated ∆∆Gibind by reducing the unfavorable electrostatic interactions between the mismatched bases. To estimate quantitatively the actual contribution of the atomic polarizabilities, we compared the results of the two 500 ps simulations that were carried out using the ENZYMIX force field with the induced dipoles turned off and on. These calculations indicated that the contribution of induced dipoles to ∆∆Gibind is quite small (-0.4 kcal/mol). A more detailed analysis of the dependence of ∆∆Gibind for base-base mismatches on the force field parameters will be needed to improve

0.15 0.5 1.0 2.0 4.0 0.15 0.5 1.0 2.0 4.0 0.15 1.0 2.0 4.0 0.15 0.5 1.0 2.0 4.0 pKa-based estimate

(kcal/mol)

simulation sphereb (Å)

∆∆GDNAc

Negatively charged DNA with 15 5.9 15 5.8 15 5.5 15 5.8 15 5.1 18 -1.0 18 -1.0 18 -0.5 18 -0.1 18 1.2 15 15 15 15 18 18 18 18 18

Neutral DNA 0.9 1.9 2.3 2.3 3.3 2.3 2.1 2.5 2.5

∆∆Ghydrc

∆∆Gibind

Na+

counterions -7.9 13.8 -7.8 13.6 -7.9 13.4 -8.3 14.1 -8.3 13.4 -8.0 7.0 -7.7 6.7 -7.4 6.9 -7.6 7.5 -7.8 6.6

-7.9 -7.9 -8.3 -8.3 -8.0 -7.7 -7.4 -7.6 -7.8

8.8 9.8 10.6 10.6 11.3 10.0 9.5 10.1 10.3 7.3 ( 1.5d

a Relative to the (5′-CATGGCCATG) DNA decamer containing 2 the canonical A‚T base pair22 (Figure 3). The actual position of the mutated base in the DNA sequence is indicated by the bold letter. b Radius of the sphere of water molecules surrounding the DNA duplex in the calculation of ∆∆GDNA and ∆∆Ghydr. c The presented ∆∆GDNA and ∆∆Ghydr energies do not include intramolecular free-energy changes associated with the mutation process. This contribution (evaluated by 2 ns FEP calculations for the isolated nucleosides) was -81.1 kcal/ mol. The statistical error of the calculated free energies is often estimated as half of the difference between the free-energies calculated using the forward and reverse sampling. This error was found to be less than 1% of the reported free energy for all simulations presented in this table, Note however, that the actual error should be determined by comparing different simulation conditions (see the text). d See eq 1, Figure 6 and footnote.49

the capabilities of free-energy calculations in predicting secondary DNA and RNA structures. 4. Concluding Remarks An important goal in understanding the fidelity of DNA replication is the elucidation of the source of the free energy enabling polymerases to discriminate Watson-Crick (WC) from non-WC base pairs. In the present study, we made a first step in this direction by calculating the changes in the binding free energies of DNA duplexes due to the formation of non-WC base pairs. Our FEP calculations for the G‚C f G‚T and A‚T f A‚F substitutions yielded ∆∆Gibind values within 2 kcal/ mol from their experimental counterparts. In addition, calculations involving the neutral DNA model provided stable results even for simulation times shorter than 500 ps. Further improvement of these results can be expected when a larger set of mispairs and computational protocols are compared. However, 2 kcal/mol accuracy was sufficient to estimate the ∆∆Gibind for the replacement of the A‚T base pair by the neutral reversedwobble A‚C base pair, for which the experimental value was not available. ∆∆Gibind for this mutation was found to be significantly larger than for the formation of the protonated A‚C base pair. Nevertheless, the neutral AC base pair may still play a role in DNA mutagenesis as only about one-half of ∆∆Gibind, i.e., about 3-4 kcal/mol, can be passively utilized by DNA

Free Energy of Nucleotide Mismatches in DNA polymerases. This is because the incoming nucleotide has only one stacking neighbor in the active site of the DNA polymerase. Although the protonated A‚C base pair has a smaller ∆∆Gibind, DNA polymerases may eliminate the protonated bases by electrostatic effects. For example, crystal structure of DNA polymerase β shows that the template base is directly interacting with the positively charged protein residue (Arg 283).50 Also, template protonation might be limited kinetically because the polymerase active site is not accessible to solvent water molecules that might serve as proton donors. In general, it is clear that the ∆∆Gibind values obtained from aqueous thermal denaturation studies of base pairs and mispairs, translated to ∼ 0 to 3 kcal/mol per nearest-neighbor doublet interaction35,43,51 are insufficient to account for high polymerase fidelity, exclusive of proofreading.10 It has been pointed out that the relatively small differences between the association free energies of W-C and non-W-C base pairs reflect an “enthalpy-entropy” compensation effect, whereby large enthalpic differences between matched and mismatched base pairs are counteracted by strongly correlated entropic differences, hence minimizing ∆∆Gibind values. Heuristic arguments have been made suggesting that geometrical constraints operative in the polymerase active cleft, accommodating W-C in preference to non-W-C base pairs, might suppress ∆∆S°bind differences, allowing the enzyme to exploit ∆∆H°bind values as a source of free energy sufficient to account for nucleotide insertion fidelity.52,53 It is not clear, however, whether this geometric selection principle can account for the entire energetics of polymerase fidelity. It is still possible that here, as well as in the case of other proposals about enzyme catalysis,54 the entropic effects are smaller than what is commonly assumed.11,55 What is needed for a quantitative elucidation of the origin of polymerase fidelity is an energy-based theoretical analysis. This paper is an attempt to begin filling this void. Acknowledgment. This work was supported by the NIH grants GM21422 (to M.F.G.) and GM24492 (to A.W.). References and Notes (1) Goodman, M. F.; Creighton, S.; Bloom, L. B.; Petruska, J. Crit. ReV. Biochem. Mol. Biol. 1993, 28, 83. (2) Sowers, L. C.; Shaw, B. R.; Veigl, M. L.; Sedwick, W. D. Mutat. Res. 1987, 177, 201. (3) Sowers, L. C.; Fazakerley, G. V.; Kim, H.; Dalton, L.; Goodman, M. F. Biochemistry 1986, 25, 3983. (4) Boulard, Y.; Cognet, J. A. H.; Gabarro-Arpa, J.; Le Bret, M.; Sowers, L. C.; Fazakerley, G. V. Nucleic Acids Res. 1992, 20, 1933. (5) Hunter, W. N.; Brown, T.; Kennard, O. Nucleic Acids Res. 1987, 15, 6589. (6) Brown, T.; Leonard, G. A.; Booth, E. D.; Kneale, G. J. Mol. Biol. 1990, 212, 437. (7) Allawi, H. T.; SantaLucia, J., Jr. Biochemistry 1998, 37, 9435. (8) Yu, H.; Eritja, R.; Bloom, L. B.; Goodman, M. F. J. Biol. Chem. 1993, 268, 15935. (9) Sowers, L. C.; Eritja, R.; Kaplan, B.; Goodman, M. F.; Fazakerly, G. V. J. Biol. Chem. 1988, 263, 14794. (10) Echols, H.; Goodman, M. F. Annu. ReV. Biochem. 1991, 60, 477. (11) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions; John Wiley & Sons: New York, 1991. (12) Moran, S.; Ren, R. X. F.; Kool, E. T. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 10506. (13) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M., Jr.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179. (14) Marelius, J.; Kolmodin, K.; Feierberg, I.; A° qvist, J. J. Mol. Graphics Model. 1999, 16, 213. (15) Cheatham T. E., III; Miller, J. L.; Fox, T.; Darden, T. A.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 4193. (16) Cheatham, T. E., III; Kollman, P. A. J. Mol. Biol. 1996, 259, 434. (17) Cubero, E.; Sherer, E. C.; Luque, F. J.; Orozco, M.; Laughton, C. A. J. Am. Chem. Soc. 1999, 121, 8653.

J. Phys. Chem. B, Vol. 104, No. 43, 2000 10099 (18) Jayaram, B.; Sprous, D.; Young, M. A.; Beveridge, D. L. J. Am. Chem. Soc. 1998, 120, 10629. (19) Pardo, L.; Pastor, N.; Weinstein, H. Biophys. J. 1998, 75, 2411. (20) Spackova, N.; Berger, I.; Egli, M.; Sponer, J. J. Am. Chem. Soc. 1998, 120, 6147. (21) Warshel, A.; Creighton, S. In Computer simulation of biomolecular systems; van Gunsteren, W. F., Wiener, P. K., Eds.; ESCOM: Leiden, 1989; pp 120. (22) Goodsell, D. S.; Kopka, M. L.; Cascio, D.; Dickerson, R. E. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 2930. (23) SYBYL 6.6; Tripos Inc., 1699 South Hanley Rd., St. Louis, Missouri, 63144. (24) Tereshko, V.; Minasov, G.; Egli, M. J. Am. Chem. Soc. 1999, 121, 470. (25) King, G.; Warshel, A. J. Chem. Phys. 1989, 91, 3647. (26) Sham, Y. Y.; Warshel, A. J. Chem. Phys. 1998, 109, 7940. (27) Lee, F. S.; Warshel, A. J. Chem. Phys. 1992, 97, 3100. (28) Valleau, J. P.; Torrie, G. M. In Modern Theoretical Chemistry; B. J. Berne, Ed.; Plenum: New York, 1977; Vol. 5. (29) Meirovitch, H. In ReViews in Computational Chemistry; Lipkowitz, K. B., Boyd, D. B.,Eds.; Wiley-VCH: New York, 1998; Vol. 12; pp 1. (30) Wolfenden, R. Science 1983, 222, 1087. (31) Shih, P.; Pedersen, L. G.; Gibbs, P. R.; Wolfenden, R. J. Mol. Biol. 1998, 280, 421. (32) Floria´n, J.; Warshel, A. J. Phys. Chem. B 1997, 101, 5583. (33) Floria´n, J.; Sponer, J.; Warshel, A. J. Phys. Chem. B 1999, 103, 884. (34) Cabani, S.; Gianni, P.; Mollica, V.; Lepori, L. J. Solution Chem. 1981, 10, 563. (35) Law, S. M.; Eritja, R.; Goodman, M. F.; Breslauer, K. J. Biochemistry 1996, 35, 12329. (36) One should realize, however, that a complete exploration of the available configuration space or a “complete” convergence is an unrealistic goal for any current simulation method because of computer time limitations. A realistic goal is to obtain sufficient convergence in the quantity under consideration without running infinitely long trajectories. This is why we have developped the SCAAS and LRF methods, which attempt to obtain reasonable convergence in electrostatic energies. As with any related scientific problem it is essential to examine whether one could reproduce the relevant observed quantity, while realizing that no simulation is perfect. In doing so, it is much more important to examine sensitivity to different boundary conditions and long-range treatments than to run one very long trajectory. (37) Alden, R. G.; Parson, W. W.; Chu, Z. T.; Warshel, A. J. Am. Chem. Soc. 1995, 117, 12284. (38) Lee, F. S., Chu, Z. T., Warshel, A. J. Comput. Chem. 1993, 14, 161. (39) Klein, B. J.; Pack, G. R. Biopolymers 1983, 22, 2331. (40) Sham, Y. Y.; Chu, Z. T.; Warshel, A. J. Phys. Chem. B 1997, 101, 4458. (41) Fuxreiter, M.; Warshel, A.; Osman, R. Biochemistry 1999, 38, 9577. (42) Guckian, K. M.; Krugh, T. R.; Kool, E. T. Nature Struct. Biol. 1998, 5, 954. (43) Aboul-ela, F.; Koh, D.; Tinoco, I., Jr.; Martin, F. H. Nucleic Acids Res. 1985, 13, 4811. (44) Brown, T.; Kennard, O.; Kneale, G.; Rabinovich, D. Nature 1985, 315, 604. (45) Hunter, W. N.; Brown, T.; Kneale, G.; Anand, N. N.; Rabinovich, D.; Kennard, O. J. Biol. Chem. 1987, 262, 9962. (46) Allawi, H. T.; SantaLucia, J., Jr. Biochemistry 1997, 36, 10581. (47) SantaLucia J., Jr.; Allawi, H. T.; Senevirante, P. A. Biochemistry 1996, 35, 3555. (48) Wang, C.; Gao, H.; Gaffney, B. L.; Jones, R. A. J. Am. Chem. Soc. 1991, 113, 5486. (49) ∆∆Gibind (A‚T f A+‚C) ) ∆Gi25 (A+‚C) - ∆G°25 (A‚T) ) ∆Gi25 (AC/TA step) + ∆Gi25 (CG/AC step) - ∆Gi25 (AT/TA step) - ∆Gi25 (TG/AC step) ) -0.26-0.14 + 1.12 + 1.72 ) 2.4 ( 0.4 kcal/mol. The error range was estimated by assuming 0.1 kcal/mol uncertainty for each nearest neighbor parameter. (50) Pelletier, H.; Sawaya, M. R.; Kumar, A.; Wilson, S. H.; Kraut, J. Science 1994, 264, 1891. (51) Petruska, J.; Goodman, M. F.; Boosalis, M. S.; Sowers, L. C. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 6252. (52) Goodman, M. F. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 10493. (53) Petruska, J.; Goodman, M. F. J. Biol. Chem. 1995, 270, 746. (54) Jencks, W. P. In AdVances in Enzymology and Related Areas of Molecular Biology; Meister, A. Ed.; J. Wiley & Sons: New York, 1975; Vol. 43, pp 219. (55) Villa`, J.; Strajbl, M.; Glennon, T. M.; Sham, Y. Y.; Chu, Z.-T.; Warshel, A. Proc. Natl. Acad. Sci. U.S.A., in press.