6108
J. Phys. Chem. B 2000, 104, 6108-6111
A Model for Simulating Dynamics of DNA Denaturation Karen Drukker and George C. Schatz* Chemistry Department, Northwestern UniVersity, 2145 Sheridan Road, EVanston, Illinois 60208-3113 ReceiVed: February 10, 2000; In Final Form: May 5, 2000
We present a simplified model of DNA for computer simulation of melting dynamics. This model is simple enough to allow simulation times up to the microsecond range for DNA duplexes with tens of base pairs, while its level of detail is sufficient to describe individual base pairs and individual hydrogen bond formation and breakage. We have applied this model to the simulation of the melting of two model B-DNA decamers. The DNA melting profiles are similar to those seen in experiment, as are the differences between duplexes containing only AT pairs versus those with only CG pairs.
Introduction Modeling the structural and dynamical properties of DNA provides an important challenge to theoretical chemistry. In past work, there has been a great deal of interest (and success) in atomistic level descriptions of DNA structure using traditional force fields,1,2 and there have also been a variety of simple models developed which replace the DNA by elastic rods, strings of beads, or continuum mechanical models3-5 that are appropriate for describing DNA coiling. What these descriptions have trouble with is DNA melting, wherein the oligomers become unraveled over a range of temperatures. In this letter we present the first results of a new model that lies between the atomistic and continuum limits, which appears capable of describing the melting of short duplexes (tens of base pairs) in a semiquantitative way. We have been motivated to undertake this study because DNA melting is an important feature of a new class of biological sensors that has been developed recently by Mirkin and coworkers.6 In these sensors, short (12 bases) single-stranded oligonucleotides that are attached to gold nanoparticles hybridize to complementary single stranded fragments of the DNA that one wishes to detect. This results in nanoparticle aggregate formation that produces a color change due to electromagnetic coupling between the particles.7 This aggregation can be reversed by heating above the melting temperature, and the melting of the DNA-gold aggregates is found to occur over a much narrower range of temperatures than it does for unbound DNA.8 As a result, the spectroscopic signatures of this melting transition are remarkably sensitive to the degree of mismatch between bases on the synthesized oligonucleotides and the analyte DNA. This means that the variation of melting temperature with duplex base-pair composition for duplexes with 12 (up to 72) base pairs is important to sensor function. Although there are many statistical mechanical models for describing DNA melting temperatures in free duplexes,9-16 the DNA-gold aggregates represent a very different physical situation, for which a detailed microscopic model is desirable. In addition, a microscopic model can be used to determine structural information about melting that is not contained in the statistical mechanical models. Although atomistic simulations are very useful for understanding DNA duplex structural properties, to simulate melting at this level is more than a formidable task. The most extensive atomistic simulations done recently have involved the treatment
of small DNA duplexes including explicit solvent.17,18 Even though ever increasing time scales are feasible in atomistic simulations, the time scale considered in this type of calculations (1-10 nanosecond range) is still less than is needed to provide realistic information about melting. To make the study of melting properties of DNA more tractable, and to study a large number of trajectories for much longer times, we have developed a simplified model of DNA which reduces a 10 base pair duplex to just 40 sites. Here we use this model in conjunction with Langevin dynamics19 to study the melting behavior of DNA duplexes at several temperatures. The model is not limited to these applications, however, and could, for example, be used with an explicit solvent in molecular dynamics simulations. The Model Previously, simplified models of DNA have been developed with varying levels of detail.4,20,21 The model that is closest to ours is that of Zhang and Collins.22 In their model, each base is represented by several interaction sites (for Lennard-Jones, hydrogen bonds, etc.) with different spatial locations and the backbone (phosphate plus sugar) is represented by a single site. Motion of each base pair is restricted to two dimensions, so the model is not useful for describing three-dimensional motions. Our model is simpler but more general. All interactions (see next paragraph) are calculated based only on the location of the centers of mass of bases and backbone groups, i.e., all interaction sites are located on the center of mass of a base or backbone group. Motion is fully three-dimensional, and the masses involved are easily derived from the chemical formulas for each base and backbone. The model geometry is illustrated in Figure 1. The form and equations of the force field are generally the same as those used by Zhang and Collins.22 Differences are that our model is fully three-dimensional, has a more reduced number of sites, has angle-dependent hydrogen bonding interactions, and has different values for the force constants. The potential energy V is a sum of various contributions:
V ) VLJ + Vvib + VH + Vbend + Vtors
(1)
Here the first two terms are two-body interactions and represent Lennard-Jones interactions between nonbonded sites and harmonic vibration between bonded sites, respectively. The next two terms, VH and Vbend are three-body terms, representing
10.1021/jp000550j CCC: $19.00 © 2000 American Chemical Society Published on Web 06/10/2000
Letters
J. Phys. Chem. B, Vol. 104, No. 26, 2000 6109 Charges are currently missing in the model, and the values for the model parameters have been chosen to give reasonable melting behavior at a specific salt concentration (0.1 M) that is commonly used in experiments.6 This makes the model impractical to use in its current form for simulations under different salt concentrations. Here, we are mainly interested in the relative stability (melting behavior) of A-T and C-G duplexes, so this defect is of secondary importance. We are also not including an explicit solvent at this stage. Instead, the solvent is treated in a stochastic way by Langevin dynamics.27 Calculations
Figure 1. Model geometry: d(adenine-thymine) decamer in its ideal helical conformation (left) and a snapshot of its configuration during a simulation in which this decamer is stable (right). The adenine sites are portrayed as light large circles and the thymine sites as dark small ones.
hydrogen bonding and bending interactions. Vtors is the torsional potential energy, which is a four-body term. Depending on the nature of the basesadenine (A), thymine (T), cytosine (C), or guanine (G)sdifferent values are used for the model parameters. All backbone sites are identical. The bending and torsion interactions are designed to keep the DNA duplex helical,22 but we find that this alone does not give the duplexes enough stability. Additional stiffness is added along the backbone in the form of next-nearest neighbor harmonic potentials. Each base has the possibility to form hydrogen bonds to conjugate bases. The hydrogen bonds are the key ingredients of the model. The hydrogen bonding parameters were taken from literature values.22,23 Only the equilibrium length of the hydrogen bonds has been changed in our model to be consistent with the sites being located at the center of mass of each base. In our model, a base can form one or more hydrogen bonds with a conjugate base. A and T both have a single acceptor and a single donor site, so two hydrogen bonds are formed in a conjugate AT pair. The ideal number for a CG pair, on the other hand, is three, since C has one donor site and two acceptor sites, matching the two donor and one acceptor sites of G. Hydrogen bonding is allowed to occur only between conjugate bases, but the number of conjugate base interactions is not restricted, i.e., an A can interact with any number of T’s. The hydrogen bonding potential is strongly anisotropic, however, favoring linear backbone-donor-acceptor geometries. This assumed form restricts strong hydrogen bonding interaction to conjugate pairs that are in close contact. These restrictions lead to duplex structures which are stable for microseconds at low temperature and which match, at least approximately, the known structures of measured duplexes in nature.24 Further details of the model will be presented in a more comprehensive publication,25 where also the occurrence of base pair mismatches will be considered. The salt concentration of the solvent is known to influence the stability and geometry of DNA duplexes to a large extent.26
We performed numerous simulations to describe duplex melting. Here melting is defined (see further below) in terms of the fraction of stable duplexes which persist at a given temperature. For each temperature, 50 trajectories were integrated for 75 ns each. In principle, for these short sequences a simulation time on the order of a few microseconds is attainable, but 75 ns proved to be sufficient to accurately determine thermal stability. (A single 75 ns trajectory for a duplex decamer takes slightly less than 2 h computation time on a single node of an IBM SP2.) All trajectories were started from different initial configurations that were stored periodically during equilibration runs at the appropriate temperatures in the absence of stochastic forces. In addition, different initial seeds for the stochastic forces were used in the Langevin equations of motion. In an ideal helical decamer structure, each base hydrogen bonds fully to its conjugate partner, yielding two hydrogen bonds for A-T pairs and three for C-G pairs. This makes the ideal number of hydrogen bonds for an A-T decamer 20, and that for a decamer of C-G 30. For the simulations discussed here, the onset of melting has been defined as the moment at which the number of hydrogen bonded base pairs drops (and stays) below half of its ideal value. A trajectory has been labeled “melted” when the number of hydrogen bonded base pairs stays below three for more than 15 ns. The time at which melting is considered to be complete is then defined as the last instant at which the number of base pairs drops below the critical number of three. Results Figure 2 depicts the fraction of d(A)d(T) decamer duplexes that melted for the given temperatures. The derivative with respect to temperature of the best curve through the data points is also shown, and this is approximately described as a Gaussian in the transition region. The maximum of this Gaussian curve determines the melting temperature, while its full width at halfmaximum gives what we will call in this paper the melting range. For the d(A)d(T) decamer we estimate the melting temperature to be 290 K (17 degrees Celsius), and the range is 18 degrees. This temperature is close to estimates for small duplexes with similar base pair composition and salt concentrations of 0.1 M.28 At each temperature, many trajectories (2040%) showed considerable periods of instability without satisfying the melting criterion, i.e., their hydrogen bonding structure recuperated within 15 ns. These percentages do not seem to depend on temperature. In trajectories displaying considerable instability, the hydrogen bonding structure either recovers fully in its original configuration, or partially after a configurational adjustment. The latter is feasible since in the d(A)d(T) decamer all bases on one strand are complimentary to the ones on the other strand, allowing strands to shift with respect to each other when hydrogen bonds are broken. Hydrogen bonds may then be re-formed.
6110 J. Phys. Chem. B, Vol. 104, No. 26, 2000
Figure 2. Melting curve (fraction of trajectories (f) melted at a given temperature) for the d(adenine)d(thymine) decamer and its derivative with respect to temperature. The circles are the data points, and the dotted line serves as a guide to the eye, and is used for the calculation of the derivative (dashed line). The derivative curve (dashed line) has been normalized.
The time elapsed between the start of a trajectory and the onset of melting decreases with increasing temperature, as expected. This average time drops from over 30 ns for the lowest temperatures considered here, to about 23 ns in the intermediate temperature region, to 10 ns for the highest temperature. One should note, however, that in all cases the standard deviation of the melting times is large (10-15 ns). A slow initiation of melting does not necessarily mean that completion of melting also requires a long time, and vice versa. Some helices fall apart extremely fast after melting has initiated, others linger for a long time before denaturing completely. For the temperatures considered here, the average time elapsed between the onset of melting and the instant of melting completion is 5 ( 2 ns and does not seem to depend on temperature. If one compares the hydrogen bonds associated with bases at the end of the helix to those of bases located more internally in the strands, one sees that the hydrogen bonds at the end are on average slightly longer (a few percent) and much less stable. At the lowest temperature considered here, for example, only an average of about 70% of the ideal number of hydrogen bonds are formed in the two base pairs at each end of the d(A)-d(T) decamer, while for the other 6 “internal” base pairs more than 90% of the bonds are, on average, intact. The hydrogen bonding angle (backbone-donor-acceptor angle) is small in all cases (about 5 degrees), but base pairs at the end these angles measure about one half a degree more, while their fluctuations are up to 50% larger than for the ‘internal’ base pairs. These observations are clearly signs of fraying of the ends of the helix as frequently observed in other DNA models29,22 and which in nature is of importance in, e.g., DNA-protein interactions.30,31 Figure 3 shows the calculated melting curve for the d(C)d(G) decamer. The melting temperature is estimated to be 333 ( 4 K, i.e., 60 °C. There are a few things to note regarding the calculated melting curves. First, the melting temperature for d(C)d(G) is considerably higher (by 43 degrees) than for the d(A)d(T) counterpart. This increased stability with respect to A-T is as expected, since it is known that for long DNA helices the melting temperature rises linearly from 77 to 100 °C, as the C-G content is increased from 20% to 78%.32 In addition, the observed melting temperature in our simulations matches the predictions of thermodynamic models.28 Second, the temperature interval over which melting occurs is similar to what
Letters
Figure 3. Melting curve for the d(cytosine)d(guanine) decamer. See previous figure for annotation.
we find for d(A)d(T) duplexes. This is again similar to what is seen in experiment. Third, a remarkable difference is the virtual absence of trajectories that show considerable instabilities without truly melting. These were abundant in the A-T case, while for the d(C)d(G) decamer these are only rarely observed within the range of temperatures investigated here. The time at which the d(C)d(G) decamer starts to melt after initiation of a trajectory decreases with temperature from over 40 ns to a little over 20 ns. The time necessary for the helix to fall apart completely after melting has been initiated does not seem to depend on temperature, in agreement with what was found for the d(A)d(T) decamer. So, even though the d(C)d(G) structure is stable for longer times than the d(A)d(T) duplex at temperatures within their respective melting range, the time between the onset of melting and final unraveling is roughly the same. Apparently the extra hydrogen bond of each C-G pair keeps the helix stable longer, but once hydrogen bonds begin to break it seems a largely cooperative process. This effect could be caused, or amplified, by the fact that in the model the hydrogen bonding donor and acceptor sites are collapsed onto the center of mass of each base. This creates, e.g., the same angle dependency of the hydrogen bonding forces for both donor sites in guanine. The percentage of hydrogen bonds at the lowest temperature considered for the d(C)d(G) decamer is the same as that for the most stable d(A)d(T) decamer. About 70% of the ideal number of hydrogen bonds at the ends of the helix are, on average, intact, while that number is about 90% for the “internal” hydrogen bonds. Since the average percentages are the same for C-G as for A-T, the absolute number of hydrogen bonds is 1.5 times higher in the C-G helix, which stabilizes the helical structure at sufficiently low temperature. The hydrogen bonds formed between bases at the end of the C-G helix are not significantly longer than those formed between more internally located bases. The hydrogen bonding angle is slightly larger, however, and fluctuates to a larger extent, in the same fashion as seen for A-T. Hence the fraying at the ends of a model C-G helix is much less severe and mostly manifests itself in the hydrogen bonding angles. Conclusion The results of our calculations suggest that this model is able to describe realistically many aspects of DNA melting, including melting temperatures and ranges and the dependence of melting
Letters temperature on DNA base composition. Admittedly, the model is extremely simple, using just 40 sites to describe a 10 base pair duplex, and omitting ions and explicit solvent. However, the structures that we have obtained are both accurate and stable at low temperature, and we have obtained realistic melting temperatures and ranges using hydrogen bonding parameters from the literature that are easily connected to atomistic force fields. Preliminary results have been presented here. However, a more extensive study is forthcoming,25which demonstrates that the influence of duplex length, base-pair composition, and the presence and location of single base pair mismatches on melting is properly described by our model. Moreover, the model can be easily generalized to describe melting in duplexes that link gold nanoparticles. Hence, the model is promising as a tool for understanding the behavior of recently developed biosensors based on DNA-linked gold nanoparticle aggregates.6 In the long run, this model may be useful for studying DNA mechanical properties, the properties of partially unwound DNA, DNA supercoiling, transformations between different DNA structural forms, and the binding of proteins and drugs to DNA. Acknowledgment. This work was supported by ARO Grant DAAG55-97-1-0133 and by NSF Grant CHE-9871903. We thank James Storhoff and Chad Mirkin for many helpful discussions. References and Notes (1) Beveridge, D. L.; DiCapua, F. M. Annu. ReV. Biophys. Biophys. Chem. 1989, 18, 431. (2) Kollman, P. Chem. ReV. 1993, 93, 2395. (3) Marko, J. F. Physica A 1997, 244, 263. (4) Martino, J. A.; Olson, W. K. Biophys. J. 1998, 74, 2491. (5) Jian, H.; Schlick, T.; Vologodskii, A. J. Mol. Biol. 1998, 284, 287. (6) Mirkin, C. A.; Letsinger, R. L.; Mucic, R. C.; Storhoff, J. J. Nature 1996, 382, 607.
J. Phys. Chem. B, Vol. 104, No. 26, 2000 6111 (7) Lazarides, A. A.; Schatz, G. C. J. Phys. Chem. B 2000, 104, 460. (8) Storhoff, J. J.; Lazarides, A. A.; Mirkin, C. A.; Letsinger, R. L.; Mucic, R. C.; Schatz, G. C. J. Am. Chem. Soc. 2000, 122, 4640. (9) Poland, D. Biopol. 1974, 13, 1859. (10) Fixman, M.; Freire, J. J. Biopolymers 1977, 16, 2693. (11) Breslauer, K. J.; Frank, R.; Blocker, H.; Marky, L. A. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 3746. (12) Sugimoto, N.; Nakano, S.; Yoneyama, M.; Honda, K. Nucl. Acids. Res. 1996, 24, 4501. (13) Chalikian, T. V.; Vo¨lker, J.; Plum, G. E.; Breslauer, K. J. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 7853. (14) Owczarzr, R.; Vallone, P. M.; Gallo, F. J.; Paner, T. M.; Lane, M. J.; Benight, A. S. Biopolymers 1997, 44, 217. (15) Chalikian, T. V.; Vo¨lker, J.; Plum, G. E.; Breslauer, K. J. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 7853. (16) Campa, A.; Giansanti, A. Phys. ReV. E 1998, 58, 3585. (17) Young, M. A.; Ravishanker, G.; Beveridge, D. L. Biophys. J. 1997, 73, 2313. (18) Konerding, D. E.; Cheatham, T. E.; Kollman, P. A.; James, T. L. J. Biomol. NMR 1999, 22, 119. (19) Allen, M. P.; Tildesley, D. J. Computer Simulations of Liquids; Clarendon Press: Oxford, 1987. (20) Gil Montoro, J. C.; Abascal, J. L. F. J. Chem. Phys. 1995, 103, 8273. (21) Gil Montoro, J. C.; Abascal, J. L. F. J. Chem. Phys. 1998, 109, 6200. (22) Zhang, F.; Collins, M. A. Phys. ReV. E 1995, 52, 4217. (23) Chen, Y. Z.; Prohofsky, E. W. Phys. ReV. E 1993, 47, 2100. (24) Mathews, C. K.; van Holde, K. E. Biochemistry; The Benjamin/ Cummings Publishing Company: Redwood City, California, 1990. (25) Drukker, K.; Schatz, G. C. J. Chem. Phys., submitted. (26) Korolev, N.; Lyubartsev, A. P.; Nordenskio¨ld, L. Biophys. J. 1998, 75, 3041. (27) Sprous, D.; Young, M. A.; Beveridge, D. L. J. Phys. Chem. B 1998, 102, 4658. (28) Steger, G. Nucleic Acids Res. 1994, 22, 2760. (29) Olmsted, M. C.; Anderson, C. F.; Record, M. T. Biopolymers 1991, 31, 1593. (30) Choo, Y. Nucleic Acids Res. 1998, 26, 554. (31) Jelesarov, I.; Crane-Robinson, C.; Privalov, P. L. J. Mol. Biol. 1999, 294, 981. (32) Stryer, L. Biochemistry, Freeman, W. H.: New York, 1995.