Unraveling Base Stacking Driving Forces in DNA - The Journal of

United States. J. Phys. Chem. B , 2016, 120 (26), pp 6010–6020. DOI: 10.1021/acs.jpcb.6b01934. Publication Date (Web): April 4, 2016. Copyright ...
0 downloads 11 Views 2MB Size
Article pubs.acs.org/JPCB

Unraveling Base Stacking Driving Forces in DNA Chi H. Mak* Department of Chemistry and Center of Applied Mathematical Sciences, University of Southern California, Los Angeles, California 90089, United States ABSTRACT: Base stacking is a key determinant of nucleic acid structures, but the precise origin of the thermodynamic driving force behind the stacking of nucleobases remains open. The rather mild stacking free energy measured experimentally, roughly a kcal/mol depending on the identity of the bases, is physiologically significant because while base stacking confers stability to the genome in its double helix form, the duplex also has to be unwound in order to be replicated or transcribed. A stacking free energy that is either too high or too low will overor understabilize the genome, impacting the storage of genetic information and also its retrieval. While the molecular origin of stacking driving force has been attributed to many different sources including dispersion, electrostatics, and solvent hydrogen bonding, here we show via a systematic decomposition of the stacking free energy using large-scale computer simulations that the dominant driving force stabilizing base stacking is nonhydrophobic solvent entropy. Counteracting this is the conformational entropic penalty on the sugar−phosphate backbone against stacking, while solvent hydrogen-bonding, charge−charge interactions, and dispersive forces produce only secondary perturbations. Solvent entropic forces and DNA backbone conformational strains therefore work against each other, leading to a very mild composite stacking free energy in agreement with experiments.



°C for a single pair of bases is typically stabilized by a free energy of no more than a kcal/mol.18,22,24,33 While each individual measurement is subject to uncertainties, this experimental consensus has been derived from a large number of different experiments subjected to extensive cross validations.16 The very moderate magnitude of this stacking free energy is physiologically significant, because, while base stacking confers stability to the genome in its double helix form, the duplex also has to be unwound in order to be replicated or transcribed.22 A stacking free energy that is either too high or too low will over- or understabilize the genome, impacting the storage of genetic information and also its retrieval. Direct as they may seem to be, experimental stacking data are often difficult to interpret. Since nucleobases are bounded to the sugar−phosphate backbone, experimentally measured stacking free energies do not simply pertain to the association of two naked bases in solution but are convoluted with the conformational entropy of the DNA polymer backbone. This factor is frequently overlooked when reconciling theoretical and experimental stacking data. A recent report suggests that the conformational entropy of the sugar−phosphate backbone when stacking one base onto another could incur a free energy cost as high as 2.5 kcal/mol at 37 °C.34,35 Disentangling the intrinsic thermodynamics of base stacking interaction strengths

INTRODUCTION Although base stacking is known to be a central determinant of nucleic acid structures, the precise origin of the molecular forces that stabilize stacking between nucleobases is still open. A large body of experimental data pertaining to the strength of base stacking is available.1−28 The experimental consensus is that the stacking free energy between two nucleobases is rather moderate, of the order of 1 kcal/mol depending on the identities of the stacked bases. A number of theoretical calculations have also been carried out to try to ascertain the physical origin of base stacking interactions. Since the magnitudes of stacking energies computed by quantum mechanical calculations29 between planar aromatic molecules in a vacuum are often much larger than what is observed experimentally, solvation must play a significant if not the dominant role in determining the stacking free energy. A molecular understanding of the driving forces behind base stacking must therefore come from a careful assessment of solvent effects. Solvation factors such as dispersion, electrostatic, and hydrophobic forces have all been suggested to play key roles in stacking,12,13,20−22,30−32 but the fundamental physical basis of DNA base stacking is still unclear. Water creates a complex solvent environment for the nucleobases. Water molecules have the ability to interact with themselves as well as with DNA bases via dispersion, electrostatic, as well as hydrogen bonding interactions. Individually, each of these forces is strong, but together, these factors counterbalance each other producing a composite effect that is experimentally found to be rather mildstacking at 37 © XXXX American Chemical Society

Special Issue: William M. Gelbart Festschrift Received: February 25, 2016 Revised: April 4, 2016

A

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

electrostatics, hydrogen bonding, or dispersion will either add to or subtract from this baseline. In this paper, we will show that baseline entropic effects in fact underpin the stacking free energy, while dispersion, electrostatics, and solvent hydrogen bonding exert secondary perturbations. Since entropy is frequently cited as the driving force behind hydrophobicity, it is important here to distinguish between “baseline” solvent entropic effects and hydrophobic solvent effects. Water molecules have the ability to make directional hydrogen bonds with each other, and because of this unusual property, water can form networks and cages around nonpolar solvent aggregates. Experimentally determined free energies of transfer of nonpolar solutes from nonpolar to polar solvents are found to be dominated by entropy,56,57 which is often identified as being the cause of the hydrophobic effect. The baseline solvent entropy effects described here are distinct from and exist independently of hydrophobicitya solute excludes solvent molecules from the volume it occupies regardless of whether the solvent molecules can make hydrogen bonds with themselves or not. These baseline solvent entropy effects result entirely from excluded volume interactions, and they are ubiquitous for any solute, polar or nonpolar, and in any solvent, hydrogen-bonding or non-hydrogen-bonding, but solvent entropy is also difficult to compute directly.60 Surprisingly, a systematic calculation of the different molecular forces (entropy, hydrogen bonds, electrostatic, and dispersion) driving base stacking in DNA has not been reported. While computational studies of base stacking have been carried out with a number of different water models,30,32,36,38−53,61 the significance of baseline solvent entropic effects has not been established. To derive a firstprinciple physical model for base stacking, it is important to be able to disentangle intrinsic entropic effects from the other free energy factors such as dispersion, electrostatics, and hydrophobic forces. Here, we show how a systematic perturbation of the solvent model can be used in computer simulations to unravel these intricate solvent effects. Origin and Magnitude of the Solvent-Mediated Stacking Free Energy. How large is the intrinsic entropic cost for the solvent when it must accommodate a solute molecule within it? Recent studies of the solvation of proteins in water and the driving force of hydrophobic assemblies55,62−69 suggest that the probability of observing a certain number of solvent molecules N in a cavity with volume V within the solvent is well described by a Gaussian when V is not too large

from the conformational free energy of the DNA backbone has not been easy. Computational estimates of base stacking free energies in water have come from classical molecular dynamics (MD) or Monte Carlo (MC) simulations, 31,32,36−53 primarily of dinucleoside monophosphates in water. Like experimentally derived stacking data, theoretical calculations are also subject to the same complications that affect the interpretation of stacking thermodynamics. Furthermore, MD-derived stacking free energies also depend on the parametrization of the force field. Since dispersion, electrostatic, and hydrogen bonding forces are individually strong, small variations in the force field parameters could lead to large variabilities in the computed stacking free energies. Consequently, even though supplying a reliable estimate for the magnitude of the stacking free energy should be a key goal for computational studies, ascertaining the physical origin of stacking interactions in DNA using theoretical calculations is perhaps even more pertinent to the understanding of stacking. Stacking forces22 can arise from either direct interactions between the bases, such as dispersion or electrostatics, or indirect solvent-mediated forces, such as hydrophobicity, solvation, and solvent entropic effects. Of the direct forces, dispersion is expected to be nearly neutral in terms of its effects on base stacking. While exceptions exist,54 dispersions are generally nonspecific interactions, and the London forces between two stacked bases are likely similar compared to the interactions between the individual unstacked bases and the surrounding solvent molecules. On the other hand, electrostatic interactions between two nucleobases are specific because they depend on the atomic partial charges. However, since water molecules have strong dipole moments and are easily reorientable, they can interact almost as favorably with the partial charges on the bases as the bases can with themselves. Because of this, the net change in direct electrostatic interactions when two bases are being unstacked may also be close to being neutral. Indirect electrostatic effects arising from the hydrogen-bonding capacity of solvent water molecules are even more difficult to assess. Recent statistical mechanical theories suggest that solvent effects that are due to the hydrogen-bonding capability of water, often loosely referred to as “hydrophobic forces”, have a significant entropic component on molecular length scales.55 While entropy is often implicated as the reason for hydrophobic aggregation,56,57 solvating any solute, polar or nonpolar, takes an entropic toll on any solvent. A solute must exclude solvent molecules from the volume it occupies regardless of whether the solvent molecules can make hydrogen bonds with themselves or not. If one base excludes a certain number of solvent molecules from the volume it occupies, two stacked bases exclude approximately twice as much solvent. When examining the free energy of the association of two nucleobases, the key question is whether the change in solvent entropy of the stacked pair relative to two separated bases results in net stabilization or destabilization. While many other factors such as electrostatics, dispersion forces, and hydrogen bonds also contribute to the stacking free energy, intrinsic solvent entropy sets a baseline. These baseline solvent entropic effects on stacking are expected to be robust against the details of the solvent model in a dense liquid whose structure is controlled primarily by excluded volume interactions.58,59 Having established the baseline effects coming from solvent entropy, the inclusion of other solvent forces such as

PV (Ν) = (2πχV )−1/2 exp[−(N − ⟨N ⟩V )2 /2χV ]

(1)

where ⟨N⟩V = ρV is the equilibrium value of N and ρ is the number density of the solvent, and χV = ⟨(δN )2 ⟩V = ρV + ρ2

∫V dr ∫V dr′[g(|r − r′|) − 1] (2)

gives the second moment in the fluctuations of N, where g(r) is the radial distribution function of the liquid solvent. The free energy of observing no solvent molecule in the cavity then gives the approximate solvation free energy ΔG ≈ kBT (ρV )2 /2χV + kBT (ln 2πχV )/2

(3)

where kB is Boltzmann’s constant and T is absolute temperature. Since [g(r) − 1] cuts off quickly beyond the correlation B

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

⎧ ε[(r /r )12 − 2(r /r )6 + 1] for r < r = 21/6σ 0 0 0 u wca(r ) = ⎨ ⎩0 otherwise

length of the liquid, the free energy ΔG is roughly proportional to V. This gives ΔS /kB = −ΔG /kBT ≈ 7⟨N ⟩V





(4)

(5)

when physical parameters appropriate for water are used in the formula. The magnitude of this entropic force is not insignificant. At 37 °C, the exclusion of just one solvent molecule would amount to approximately 4 kcal/mol of free energy. For the stacking of two bases, it is not the entropy in eq 4 but the change ΔSstack between two stacked bases versus two individually solvated bases that controls stacking stability. If the volume of one solvated base is V, their total volume should be approximately twice when two bases are stacked against each other. ΔSstack is consequently expected to be nearly neutral too, and because of this, a precise quantification of solvent entropy effects is particularly important for understanding the stability of nucleobase stacking. Depending on the precise number of water molecules excluded by two stacked bases, a difference in just one solvent molecule is sufficient to either stabilize or destabilize the stack, given the experimentally derived stacking free energy is of the order of just ∼1 kcal/mol. To accurately assess solvent entropic effects on base stacking in nucleic acids, we design a set of computer experiments to isolate the contributions of solvent entropy from other factors such as electrostatics, hydrogen bonding, dispersion, or hydrophobic forces. To do this, we perform large-scale Monte Carlo calculations of the free energy of association between two nucleobases in solution using a series of solvent models that contain physical features of varying complexity. By a systematic decomposition of these features, we are able to develop a detailed understanding of the role of the solvent on base stacking. Using a separate conformational entropy calculation for the DNA backbone, we also derive independent estimates for the conformational effects of the sugar−phosphate backbone on the stacking free energy between sequenceneighbor bases. Together, these two sets of results provide a clear picture of how solvent entropic effects stabilize base stacking while backbone conformational entropy destabilizes it, producing the small but physiologically significant DNA base stacking free energy that is typically seen in experiments. On the basis of these findings, we can formulate a revised physical chemical framework for understanding base stacking driving forces in DNA.

uwca projects out the repulsive branch in the van der Waals (VDW) interactions uvdw, frequently represented by a LennardJones potential with well depth parameter ε and zero-energy distance σ. The remainder, uatt = uvdw − uwca, then constitutes the attractive branch. Structural properties of a WCA liquid are predominantly athermal, and its thermodynamics is controlled entirely by entropy. Perturbation theory shows that the repulsive branch of the interparticle potential alone describes the fundamental structure of a dense liquid rather well,58 and a WCA liquid is therefore a good baseline model for understanding the entropic effects of the solvent. The WCA parameters ε = 0.1521 kcal/mol and σ = 3.150 Å for the solvent were chosen to match the TIP3P model for water,70 but unlike TIP3P water, the baseline solvent has no attractive dispersion interactions with each other or with the solute base molecules. Purine and pyrimidine bases (adenine, cytosine, guanine, or thymine without ribose or phosphate) are represented by rigid planar molecules with their ideal molecular structures.71 Within this baseline model, the WCA potential was also used to represent base−base and base−solvent interactions. The interaction between a solvent molecule i with each base atom j was computed with the standard combination rules: εij = (εi × εi)1/2 and σij = (σi + σi)/2. Base−base interactions were computed analogously. Atomic parameters for the bases were taken from Amber ff99.72 All partial charges were turned off. Because the bases interact only through WCA interactions, there is no direct dispersion force between them except sterics; therefore, if two bases in the solvent exhibit any attractive free energy between them in the simulation, this effective attractive interaction must have arisen through solvent entropy. Because a solvent with purely excluded volume has no cohesive energy, the system was contained in a NPT simulation with an external pressure to produce the same number density as liquid water (0.995 g/cm3 at 37 °C). The free energies of two stacked bases were evaluated as a function of their separation by umbrella sampling using a large ensemble of independent configurations (∼107). This zero-order solvent model enables us to establish precise theoretical estimates for the baseline nonhydrophobic solvent entropic effects on nucleobase stacking. Beyond the baseline WCA model, simulations on a series of solvent/solute models with increasing levels of molecular complexity were carried out to unravel solvent effects. (1) WCA+HB: To the baseline model, hydrogen bonding was added to generate a WCA+HB model. Hydrogen bonding in solvent water molecules was modeled by adding embedded TIP3P partial charges to the WCA potential. In contrast to regular TIP3P water, solvents in the WCA+HB model have no attractive dispersion interactions with each other, but interactions resulting from the TIP3P partial charges were sufficient to stabilize the solvent at a physical external pressure of 1.013 bar with approximately the same liquid density as the baseline solvent. (2) WCA+HB_CHG: Next, electrostatic interactions between the bases were reintroduced into the model by restoring atomic partial charges to the two bases according to Amber ff99. To maintain overall charge neutrality, excess partial charges were reassigned to the N9 (in purines) or



METHODS

Base and Solvent Models. To calculate stacking free energy, we simulated two purine or pyrimidine bases (adenine, cytosine, guanine, or thymine without ribose-phosphate) solvated within as many as 80,000 solvent molecules within a cubic box of sizes up to (134 Å)3 using NPT Monte Carlo calculations. Simulations were carried out with a number of different solvent models having increasing molecular complexity to ascertain the physical origin of stacking forces. First, a baseline solvent model was used to establish intrinsic solvent entropy effects. This zero-order model consists of solvent molecules interacting only via excluded volume interactions tuned to the physical diameter of water, in the form of a Weeks−Chandler−Andersen (WCA) potential:59 C

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B N1 (in pyrimidines) atom on each base. This constitutes the WCA+HB+CHG model. (3) VDW+HB+CHG: Finally, full van der Waals interactions among all solute and solvent atoms were reintroduced by adding back all attractive interactions uatt to the model, forming the VDW+HB+CHG model. The final VDW+HB+CHG model has all the interactions typically present in a full force field calculation. Simulations. Independent computer simulations were performed on the four different models (WCA, WCA+HB, WCA+HB+CHG, VDW+HB+CHG) using constant NPT Monte Carlo (MC). Most of the simulations were carried out with 10,000 solvent molecules in a cubic box with periodic boundary conditions. System size convergence was verified by repeating some simulations using up to 80,000 solvent molecules. Standard Metropolis MC73 was used to sample the positions and orientations of the solvent molecules and the bases, with an auxiliary constant pressure move74 every 10 MC passes generating volume fluctuations, where a MC pass is defined as every particle having undergone one trial move on average. Translation and rotation displacements were adjusted to yield approximately 40% acceptance. For the WCA model, all simulations were carried out at constant temperature T = 310 K = 37 °C. In reduced units, this temperature corresponds to T* = kBT/ε = 4.05, where ε is the Lennard-Jones parameter for water. In order to produce a physical density that matches the density of liquid water of 0.995 g/cm3 at 37 °C, the reduced number density of the solvent ρ* = ρσ3 was empirically calibrated by varying the reduced pressure P* = Pσ3/ε in the simulation. A previous study of the WCA liquid75 reported that the reduced pressure at T* = 4.05 at the solid−liquid coexistence line was approximately P* ≈ 77, with a liquid branch density ρl* = 1.20 and a solid branch density ρs* = 1.25. In our simulations, we found empirically that P* ≈ 50 produced the proper target liquid density ρ* ≈ 1.03 at T* = 4.05, which when assuming the molecular weight of water of 18.0 g/mol translates to a physical density for liquid water of 0.994 g/cm3. For the three models with charges (WCA+HB, WCA+HB +CHG, and VDW+HB+CHG), the additional electrostatic interactions were computed using the fast multipole method (FMM).76 Far-field interactions were evaluated by an eighthorder particle-cell (single-sided) solid harmonic multipolar expansion, with an infinite-periodic-image full Ewald sum carried out using cell−cell (double-sided) multipoles.77 With FMM electrostatics, the simulations were substantially more CPU-expensive compared to the baseline model. Typically, 4 × 106 MC passes were used to converge the free energy profiles. Unless stated otherwise, all models described below were simulated at T = 310 K, using an external pressure which produced similar densities to the baseline solvent model. Stacking Free Energy Calculations. Prior to each free energy calculation, the two bases were solvated in the solvent by extensive equilibration. For stacking calculations, the bases were positioned parallel to each other with the expected stacked geometry observed for two sequence-neighbor residues in a standard B-form helix shown in Figure 1A. The standard umbrella sampling technique78 was used to accumulate histograms for their separation, with the bases constrained in the simulation to stay parallel with respect to each other. Because the probability of different base−base separations varies over several orders of magnitude over the distance range from 3 to 15 Å for stacked bases, umbrella window sizes of 1 Å were employed with every window having at least 50% overlap

Figure 1. Four base−base geometries studied in the stacking free energy simulations. (A) X|Y: Two bases X, Y positioned parallel to each other with the expected stacked geometry for two sequenceneighbor residues in a standard B-form helix. (B) X/Y: A clamshell geometry where one of the two bases is tilted 30° away from the plane. (C) X||Y: Two stacked bases sliding against each other while remaining parallel to each other. (D) X:Y: Two bases in their Watson−Crick side-by-side geometry.

with the previous one, and simulations were repeated with 0.5 Å windows to verify convergence. The logarithms of the histograms from the different windows were aligned for maximal overlaps. For each of the free energy profiles, 4−24 million MC passes were used to accumulate the histograms. In addition to (A) parallel stacked bases X|Y (Figure 1A), base pairs in other geometries were also simulated. These include: (B) X/Y base pairs tilted away from their parallel geometries at a 30° angle from each other (the “clamshell”, Figure 1B), (C) two stacked bases sliding over each other X||Y (Figure 1C), and (D) A:T or G:C couples in their “paired” side-by-side geometries with their Watson−Crick hydrogen bonding donor and acceptor atoms directed at each other (Figure 1D). For the X:Y pair simulations, the two bases were constrained to remain coplanar throughout (but there were no explicit hydrogen-bonding interactions or electrostatics between them). Because free energy variations in the paired configurations were much milder compared to the other geometries, paired simulations were typically carried out using just two overlapping 7 Å wide umbrella windows, spanning base−base separations from 2 to 15 Å, typically with a total of 48 million MC passes for each free energy profile.



RESULTS AND DISCUSSION A. Base Stacking Is Stabilized by Solvent Entropic Forces. The free energy ΔGG|G(h) of a parallel stacked G|G pair as a function of their separation h in the WCA solvent is shown in Figure 2. MC results from simulations using two different umbrella window sizes (1 and 0.5 Å) are shown as black and orange solid lines. The free energy profile for every window is plotted individually in Figure 2. Each window overlaps with the last half of the previous window and the first half of the next, allowing us to reliably align the free energy profiles among all the windows. The window-to-window variations between the overlapping profiles provide an estimate for the statistical error, shown to the right of the curve. The plotted ΔG(h) values are referenced to two infinitely separated bases, i.e., ΔG(∞) ≡ 0. These results were generated from MC simulations with 10,000 solvent molecules. Under constant D

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

kcal/mol, is weaker than G|G, since cytosines are smaller than guanines. The ratio ΔGC|Cmin to ΔGG|Gmin is roughly 2−3. While the free energy profiles for G|G and C|C stacks both show a minimum at 3.4 Å, another more prominent feature is the peak at ∼5.1 Å, which is highlighted by the upside-down blue triangle in Figure 2. Two bases interacting with dispersion forces or with charge−charge interactions should not have a maximum in their free energy profile, because these interactions decay monotonically with distance. A maximum in the free energy profile between two bases is therefore a distinctive signature of solvent entropic effects. For G|G, the uphill free energy from the minimum to this peak produces a substantial barrier to unstacking, ΔG* ∼ 7.2 kcal/mol. This barrier exists because, as the two bases begin to separate, there is a range of distances larger than the minimum distance of 3.4 Å where excess solvent is excluded from the interface between the two bases because solvent molecules cannot reinfiltrate a space narrower than their diameter allows, and this leads to the unstacking barrier ΔG*. However, as the gap between the two bases continues to grow, at some point, it becomes large enough to allow solvent molecules to refill the cavity between the two bases. At this point, the free energy falls again, leading to the second minimum in ΔGG|G(h) at h = 6.2 Å. A similar behavior is exhibited by the C|C stack but with a lower free energy barrier of ΔG*C|C ∼ 4.9 kcal/mol. From this distance on out, ΔGX|Y(h) for both G|G and C|C show oscillations with similar periodicities. The presence of this rather large free energy barrier to unstacking is physically significant because it kinetically protects stacked bases from being unstacked. Conversely, there is also a kinetic barrier in the reverse direction; i.e., collapsing two separated bases into stacking position must also surmount an entropic barrier in order to drive solvent molecules out of the interface between the bases. Alternations between excess solvent being expelled followed by solvent returning to refill the interface between the two bases produce the characteristic oscillations in ΔG(h) exhibited in Figure 2. However, unlike the first peak, the subsequent oscillations are not likely to be physically as significant because there are many more ways to unstack the bases as their separation increases without them being necessarily parallel to each other. At small separations, steric interactions between the two planar base molecules preclude rotational motions, and the peak near h = 5.1 Å would much more closely correspond to the free energy barrier actually encountered when two bases unstack from each other. The computed free energy profile between two stacked A|A and A|T are compared to G|G and C|C in Figure 3A. All pairs show similar oscillations as a function of separation, but the magnitudes of the minima and maxima vary. In general, the stability of the base stack correlates with the size of the bases, with G|G and A|A being the most stable, C|C the least stable, and A|T in between. The free energy profile between two guanines in the clamshell geometry is shown in Figure 3B. Two bases in this geometry have essentially zero stability in a WCA solvent. Figure 3C shows the free energy profile for two guanines sliding against each other. Unlike separating two bases perpendicularly (Figures 2 and 3A), unstacking two bases in a WCA solvent by sliding encounters no additional free energy barrier. This is because no excess solvent molecules are excluded by the bases when sliding against each other. Finally, Figure 3D shows free energy profiles between A:T (solid lines) and G:C (dashed lines) in the Watson−Crick geometry. Because the bases were uncharged in these calculations, the free

Figure 2. Free energy profiles ΔG(h) for stacked bases as a function of their separation h along the direction perpendicular to the rings in a WCA solvent. MC umbrella sampling results for a G|G stack are shown as solid lines for two independent simulations, both with 10,000 solvent moleculesresults using 24 overlapping 1 Å windows are shown in black and those using 24 overlapping 0.5 Å windows in yellow. Statistical errors, estimated from the window-to-window variations, are indicated by the error bar to the right. System size convergence was verified by a simulation with 80,000 solvent molecules, indicated by the purple open circles. MC results for parallel stacked C|C bases are shown as the green dashed lines. ΔG(h) for both G|G and C|C are referenced to infinite separation with ΔG(∞) ≡ 0. The G|G stack is stable relative to two separated G by ΔG of −3.8 kcal/mol, whereas the stability of the C|C stack is somewhat lower at −2.6 kcal/mol. The minimum separation between the two bases is approximately 3.4 Å for both (indicated by the red triangle). When two stacked bases are separated in the perpendicular direction, the solvent imposes a significant entropic barrier ΔG* (indicated by the upside-down blue triangle), approximately 7.2 kcal/ mol for G|G and 4.9 kcal/mol for C|C.

pressure, the system size fluctuates. The average dimension of the simulation box over all of these simulations was approximately (66.9 Å)3. This size was large enough such that the periodic images of the two bases did not interact directly, and simulations 8 times larger (with 80,000 solvent molecules) shown as the purple open circles in Figure 2 confirm this. The ΔGG|G(h) profile for stacked G|G pairs (solid black and solid orange lines in Figure 2) shows a global minimum at hmin = 3.4 Å, which is consistent with the typical rise per step between two stacked bases in a B-form helix. This feature is highlighted by a red triangle in Figure 2. The minimum value ΔGG|Gmin is approximately −3.8 kcal/mol relative to two separated, individually solvated bases. A stacking free energy of this magnitude derived from solvent entropic effects alone is significant, given that experimental estimates for stacking stability are ∼1 kcal/mol.14,22,24,33 These results illustrate that, in the presence of solvent entropic forces alone, the stacking of two guanines against each other is stable, and its free energy is similar to experimental measurements. We will see below that the ∼3 kcal/mol difference between simulations and experiments is attributable to the conformational entropic penalty on the DNA backbone required for two sequence-neighbor bases to stack. Figure 2 also shows the computed ΔGC|C(h) profile for C|C stacks as the green dashed line. The stacking of two cytosines is also stable under solvent entropic forces alone. The minimum base−base separation in a C|C stack is similar to G|G. Not surprisingly, the stability of the C|C stack, with ΔGC|Cmin = −2.6 E

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

Figure 3. Free energy profiles ΔG for two bases as a function of their separation calculated in a WCA solvent in the alternate geometries depicted in Figure 1. (A) Stacked pairs: A|A (red line), C|C (green dashes), G|G (orange line), and A|T (blue dots). All show similar solvent-entropy-induced oscillations with distance. (B) Free energy profile ΔG(d) for two G, with one of them tilted 30° away from the stacked geometry, as a function of the separation between them (black solid line). ΔG for two parallel G|G (orange line in A) are shown as the dashed line for comparison. The clamshell G/G pair with a 30° tilt is thermodynamically unstable. (C) MC results for the free energy profile ΔG(y) when two parallel stacked G||G slide across each other. In contrast to G|G, sliding incurs no excess free energy barrier. (The complete unstacking of two G in this geometry by sliding alone requires a translation of one base across the other by more than 12 Å, which is physically untenable when the bases are bonded to the sugar− phosphate backbone.) (D) Free energy profiles ΔG(r) as a function of distance r between the bases for the A:T pair (solid lines) and G:C pair (dashed line) in the side-by-side base-pair. Since no base pairing hydrogen bonds or electrostatics were used in these simulations, the observed negative free energy is produced entirely by the solvent.

purple curve in Figure 4 labeled WCA+HB, is only weakly perturbed. The most stable distance between the two bases remains ∼3.4 Å, and the minimum free energy is roughly unchanged at −3.6 kcal/mol. The solvent-induced excess free energy barrier to unstacking is shifted to a slightly larger distance at ∼5.5 Å, while the height of this barrier in a hydrogen-bonding solvent is also somewhat muted compared to the WCA solvent. Overall, the distinctive oscillatory behavior in the free energy profile produced by baseline solvent entropic effects persists even in a hydrogen-bond-forming solvent. The excess free energy barrier upon unstacking, a key signature of solvent entropic forces, is also clearly present. The perturbation to the stacking free energy created by the addition of hydrogen bonding to the solvent over the baseline WCA model defined as ΔΔG = ΔGwca+hb − ΔGwca is shown in the inset in Figure 4, where ΔΔG is plotted as the purple dashed line. ΔΔG(h) is close to zero at the minimum-freeenergy distance of 3.4 Å, confirming that solvent hydrogen bonding does not alter the intrinsic thermodynamic stability of base stackingsolvent entropic effects remain the dominant physical force. However, since hydrogen bonding produces a different solvent structure, the solvent-induced maximum is shifted to larger distance, and ΔΔG(h) exhibits mild variations within the range −2.7 to +1.3 kcal/mol because the distance dependence of the free energy profile is modified by hydrogen

energy variations as a function of pairing distance in Figure 3D reflect the contribution of solvent entropy alone to the stability of hydrogen-bonded base pairs. Notice that, while the hydrogen-bonding distance in Watson−Crick base pairs is typically 2.7−2.9 Å, the free energy profiles in Figure 3D are repulsive at these distances because hydrogen bonding is absent in the WCA-only model. However, the oscillations in the free energy profiles in Figure 3D characteristic of baseline solvent entropic effects are similar to those observed previously in a simulation of A:T and G:C base pairs in water with full electrostatics and dispersion interactions turned on.79 This suggests solvent entropic effects are not only relevant for the stability of base stacking but are at least partially responsible for base pairing forces too. B. Solvent Hydrogen Bonding Produces Minor Perturbations to Stacking Free Energy. In the presence of a hydrogen-bonding solvent, the salient features of the free energy profile between two stacked bases produced by baseline solvent entropic effects remain remarkably robust. This is demonstrated in Figure 4. The bottom curve (orange, labeled WCA) shows the same solvent-entropy-driven free energy profile between two parallel guanines G|G shown in Figure 2. When hydrogen bonding was turned on in the solvent model by restoring TIP3P partial charges to the WCA water molecules, the resulting free energy profile, shown as the F

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

interaction between the two bases over the WCA+HB model is plotted as the green dotted line in the inset of Figure 4, clearly showing the net repulsive effect introduced by adding partial charges to the bases. However, while this ΔΔG seems to suggest that the direct electrostatic interaction between the two guanines is causing the instability, the situation is actually more complicated. Within a hydrogen-bonding dielectric solvent environment, electrostatic forces between the two bases are renormalized by the solvent in a complex manner. In fact, when the direct Coulombic interactions between the two bases are computed using their atomic partial charges (green solid line in the inset of Figure 4) and compared against ΔΔG, neither the total magnitude nor the distance dependence of it could reproduce the simulated results in the green dotted line regardless of whether a constant dielectric or an exponential screening length is applied to the Coulomb interactions. The ΔΔG resulting from adding charge−charge interactions between the bases shows an almost linear decay with distance up to approximately 6 Å. This indicates that, while the perturbation ΔΔG suggests there is electrostatic repulsion between the two bases, the magnitude of the resulting forces could not have been accurately determined without a free energy calculation involving the full solvent. Even though the charge−charge interactions between the two bases perturb the base stacking free energy sufficiently strongly to alter its sign changing it from attractive to repulsive, the salient features of the free energy profile between two stacked bases produced by baseline solvent entropic effects remain remarkably robust. The WCA+HB+CHG free energy profile in Figure 4 (solid green line in the main panel) shows characteristic solvent-entropy-driven features in its free energy profile similar to the baseline WCA model (solid orange line), with the same distinctive oscillatory pattern. The excess free energy barrier to unstacking, a key signature of baseline solvent entropic effects in both the WCA and WCA+HB models, is also clearly visible in the WCA+HB+CHG free energy profile though its magnitude has been modified by electrostatic forces. While the magnitude and the sign of the net electrostatic interaction between the bases are necessarily dependent on the partial charge assignments and solvent model used in the Amber ff99 force field parametrization, the physical effects produced by intrinsic solvent entropic effects appear to be strongly conserved across all solvent models. D. The Solvent Heavily Tampers Dispersion Interactions between Stacked Bases. The solid red line in Figure 4 shows the free energy profile for two stacked guanines when attractive VDW forces are fully restored into both the solvent and the solute molecules. This VDW+HB+CHG model contains full dispersion and electrostatic interactions among all solvent and solute atoms. With the addition of attractive dispersion forces, the VDW+HB+CHG free energy profile reverts to being attractive. The overall free energy profile for the VDW+HB+CHG model (solid red line, main panel Figure 4) is remarkably consistent with the WCA only model (solid orange line), with the positions of their peaks and valleys almost matching perfectly while the magnitudes are only mildly perturbed by the sum of the other complex molecular interactions. Overall, the complex collective effects of solvent hydrogen bonding, electrostatic, and attractive dispersion forces appear to largely cancel each other, producing a net effect on the base stacking free energy that is very close to neutral. Considering that each of the individual atomistic interactions is rather strong, this net cancellation among all the atomic forces

Figure 4. Free energy profile between two stacked G in four different solvent models as a function of separation. The orange line (bottom) shows ΔG(h) in a WCA solvent (same as orange lines in Figures 1 and 2A) with only repulsive excluded volume interactions. Adding hydrogen bonds to the solvent by restoring TIP3P charges to the solvent molecules changes the free energy profile to the one shown in purple (WCA+HB, second to the bottom). Adding electrostatics between the two bases by restoring their partial charges changes the free energy profile to the one shown in green (WCA+HB+CHG, second from the top). Finally, restoring attractive branches (ATT) back into all the van der Waals interactions between solvent−solvent, solute−solute, and solute−solvent changes ΔG(h) to the one shown in red (VDW+HB+CHG, top). The free-energy scales on the vertical axis have been color-coded to match each of the four free energy profiles. Inset: Perturbations ΔΔG(h) with the addition of each type of interaction to the model in the order WCA → HB → CHG → ATT are shown in the inset. The purple dashed line shows ΔGwca+hb − ΔGwca. The green dotted line shows ΔGwca+hb+chg − ΔGwca+chg. The short-dashed red line shows ΔGvdw+hb+chg − ΔGwca+hb+chg. For comparison, direct electrostatic interaction between the two bases without solvent is shown as the solid green line and direct van der Waals interaction as the red solid line.

bonding among solvent molecules. Overall, however, the qualitative as well as quantitative effects of solvent hydrogen bonding on the stacking free energy are surprisingly minor. C. Electrostatic Interactions between Stacked Bases Are Overall Repulsive. When electrostatic interactions between two stacked guanines due to their atomic partial charges are turned on, the free energy profile becomes net repulsive. This is shown as the solid green line in Figure 4, labeled WCA+HB+CHG. The free energy perturbation ΔΔG = ΔGwca+hb+chg − ΔGwca+hb produced by the charge−charge G

DOI: 10.1021/acs.jpcb.6b01934 J. Phys. Chem. B XXXX, XXX, XXX−XXX

Article

The Journal of Physical Chemistry B

backbone, but a reliable estimate for the free energy costs associated with this backbone entropic penalty has to date been unavailable. An analytical formula for calculating the conformational volume of the sugar−phosphate backbone on a DNA has been reported recently. 34,35,80 Given a configuration of the nucleobases (i.e., their positions), this BCV/-c34,35,80 formulation of the backbone entropy is formally equivalent to an exact enumeration of all the sugar−phosphate backbone conformations consistent with this base configuration, derived from an analytical inverse kinematic solution of the torsional angle backbone conformational volume (BCV) of the entire DNA backbone, properly accounting for steric collisions (-c) among the backbone atoms by removing the sterically offending conformations numerically. The BCV/-c formulation enables the free energy of two adjacent nucleobases linked by the sugar−phosphate structure to be accurately computed via a Monte Carlo sampling of the DNA backbone conformational space.35 Figure 5 shows the computed free energy profiles for a

in terms of the effect they produce on two stacked bases is remarkable. Perhaps even more astounding is the fact that the dominant physical driving force behind base stacking in this highly complex solvent environment remains to be intrinsic solvent entropic effectsthis feature is strongly pervasive across all solvent models. The perturbation to the free energy, ΔΔG = ΔGvdw+hb+chg − ΔGwca+hb+chg, produced by the restoration of attractive VDW interactions is shown in the inset in Figure 4 as the shortdashed red line. ΔΔG confirms that the attractive dispersion interactions between the two bases recover the stacking stability lost when charge−charge was added. However, this again would be an oversimplified interpretation of the results, since dispersion forces, like electrostatics, are also renormalized by the solvent in a highly intricate manner. The thin solid red line in the inset in Figure 4 shows that, if the direct attractive dispersion interactions were to be used to estimate the net attractive force between two stacked bases, their VDW attraction would have been much lower at the minimum distance of h = 3.4 Å instead of ∼4.5 kcal/mol according to the calculated ΔΔG. Furthermore, direct VDW interaction between the two bases has a very different distance dependence too. ΔΔG shows that, on close approach, the two bases have dispersion interactions between them that are stronger than each one has with the solvent when they are individually solvated. However, this dispersion interaction also turns off at much shorter distances (∼4.5 Å) compared to direct van der Waals forces, since the solvent molecules make up for the lost interbase van der Waals attraction by refilling the space between the two bases as they separate from each other. The overall stability in the base stack recovered by the addition of attractive dispersion interactions back into the force field is not the result of direct van der Waals forces between the two bases. Instead, these dispersion interactions have been heavily tampered by the solvent. E. DNA Backbone Conformational Entropy Destabilizes Base Stacks. The computed free energy profiles in Figure 4 suggest that two stacked guanines are stabilized relative to two individually solvated G by a free energy of ∼ −4 kcal/mol (orange solid line, Figure 4). While we believe the order of magnitude of these estimates is accurate since the universal driving force behind stacking is solvent entropy, its precise computed value is sensitive to the force field parameters. In the VDW+HB+CHG model where the full ff99 force field has been used, the computed stabilization free energy is ∼ −2 kcal/mol (red solid line, main panel Figure 4). Therefore, while these results have demonstrated definitively that solvent entropy is the physical origin of stacking driving force, our estimates of ∼ −2 to −4 kcal/mol for the stacking free energy can only be regarded as an approximation. Despite this, there is still a discrepancy between these estimates and the experimentally measured values, which are somewhat smaller at ∼ −1 kcal/ mol. To understand this difference, it is important to recognize that experimentally determined stacking free energies contain contributions from the polymeric structure of the DNA. Since the nucleobases are chemically linked to the sugar−phosphate backbone, experimentally measured thermodynamics data are necessarily a convolution between intrinsic solvent-mediated stacking free energy with the conformational free energy of the DNA backbone. Since the stacking of two sequence-neighbor bases forces the DNA into a restricted subset of conformations, base stacking exacts a conformational entropic penalty on the

Figure 5. DNA backbone entropic contribution to the free energy for a G|G pair (orange solid line) and a C|C pair (green dashed line) in an equilibrium ensemble of conformations of the 6-nt sequence (dT)2(dX)2(dT)2 (X = G or C) as a function of the base−base separation estimated from the C5−C5 distance (for X = G) or N3−N3 distance (for X = C) from a conformational Monte Carlo sampling of the DNA sugar−phosphate backbone. Stick drawings show typical conformations for two G at large and small separations from the simulations. Configurations with base distances ∼