J. Phys. Chem. B 2008, 112, 4113-4122
Structures and Energetics of Base Flipping of the Thymine Dimer Depend on DNA Sequence
Lauren L. O’Neil and Olaf Wiest*
Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana 46556-5670
Received: October 24, 2007; In Final Form: December 20, 2007
The cis,syn-cyclobutane pyrimidine dimer (CPD) is a photoinduced DNA lesion leading to a significant distortion of the DNA structure. Its repair by DNA photolyase requires a flip of the damaged base into an extrahelical position. This base flip is expected to be sequence-dependent, but the structures and energetics as a function of the bases 3′ and 5′ to the CPD lesion are unknown. Eight-nanosecond MD simulations of four different hexadecamer duplexes with the CPD were performed for the flipped-in and flipped-out structures. Analysis of these results indicates clear sequence-dependent differences. Significant disruptions of the base pairs to the 3′ side of the CPD are observed for the flipped-out structures with adjacent A-T pairs, whereas those with G-C pairs adjacent show no such distortions. The conformational spaces occupied by these two duplexes are significantly different. The structural differences correlate well with the free energy differences for base flipping calculated using the previously established 2D potential of mean force (PMF) method. The energy differences for base flipping in duplexes containing A, T, G, and C pairs adjacent to the CPD were found to be 6.25-6.5, 5.25-5.5, 7.25-7.5, and 6.5-6.75 kcal/mol, respectively. These energy differences of up to 2 kcal/mol should be large enough to be detected experimentally using sensitive probes.
* Corresponding author.

Introduction

Base flipping is the conformational change of a DNA base from its base-stacked, hydrogen-bonded, intrahelical position to a solvent-exposed, extrahelical position. The exposure of flipped-out bases to the aqueous environment increases their accessibility to proteins and small molecules. It is therefore no surprise that most DNA repair and modification reactions involve base flipping of one or more bases. Crystal structures of DNA/enzyme complexes in which there is a flipped-out base have been reported for M.HaeI, M.HaeIII, hOGG1, and T4 endonuclease.1-4

The mechanism by which DNA repair and modification enzymes recognize bases has been a consistent source of debate and has been studied previously.5-7 The two major pathways that have been set forth are (1) base flipping followed by binding, termed the “passive” mechanism, and (2) simultaneous base flipping and binding, termed the “active” mechanism. For the DNA repair enzyme human 8-oxoguanine DNA glycosylase (hOGG1), the temporal coupling of base flipping and enzyme recognition was studied.5 hOGG1 must recognize and repair DNA damage that occurs in very low numbers compared to the number of undamaged bases. It was determined that the enzyme uses “fast sliding” along the DNA backbone and interrogates possible damaged bases in an additional pocket on the surface, allowing for the nonproductive recognition events to be fast and to require low activation energies.6 A recent study by Stivers and co-workers on the recognition mechanism of uracil DNA glycosylase (UNG) showed that the enzyme recognizes a transient, high-energy species in which a modified thymine base is partially flipped out of the duplex.7 Therefore, UNG follows the passive recognition mechanism and relies on DNA base-pair breathing.

As DNA bases are involved in hydrogen bonding and base stacking when intrahelical, it is expected that the energy required to undergo base flipping is high. Studies of spontaneous base flipping of DNA bases using imino proton-exchange measurements by NMR spectroscopy have reported the equilibrium constant for base flipping of a guanine base in a GCGC tetramer to be 3.3 × 10⁻⁷.8 This corresponds to a free energy difference of ∼9 kcal/mol. Computational studies of base flipping have primarily focused on undamaged DNA using potential of mean force (PMF) calculations.9 The potentials of mean force for base flipping of adenine and its base-pairing partner thymine were computed by Giudice et al., who reported the ∆G values to be 15 and 13 kcal/mol, respectively.10 In a later study, the energetics of base flipping of cytosine and guanine bases were computed, and the ∆G values for those processes were found to be 16 kcal/mol for both bases.11 The base flipping of the cytosine residue in the GCGC sequence recognized by M.HhaI was studied using a center-of-mass pseudodihedral coordinate.12 The free energy required for flipping of the cytosine base was determined to be 15.3 and 17.6 kcal/mol for flipping through the major and minor grooves, respectively. For the base-pairing partner guanine, the base flipping free energy was determined to be 21.3 and 18.7 kcal/mol for flipping through the major and minor grooves, respectively. Although the free energy required for base flipping appears to be systematically overestimated compared to the limited available experimental data, it is clear that a distinct sequence dependence of base flipping is obtained in these calculations. Undamaged DNA bases have optimal hydrogen bonding and base stacking in the double helix, making base flipping an unlikely process. However, damaged DNA bases do not have optimal structural characteristics, which might decrease the energy required to undergo base flipping.
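The step from the NMR-derived equilibrium constant to a free energy of roughly 9 kcal/mol follows the standard relation ∆G = -RT ln K. The short sketch below reproduces that arithmetic. The temperature of 298.15 K is an assumption (the text does not state the temperature of the exchange measurements), and the function name is purely illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def free_energy_from_equilibrium(k_eq, temperature_k=298.15):
    """Return dG = -RT ln K in kcal/mol.

    temperature_k defaults to 298.15 K, which is an assumed value,
    not one taken from the article.
    """
    if k_eq <= 0:
        raise ValueError("equilibrium constant must be positive")
    return -R_KCAL * temperature_k * math.log(k_eq)


# Equilibrium constant quoted in the text for guanine in a GCGC tetramer
dg_flip = free_energy_from_equilibrium(3.3e-7)
print(f"dG(flip) = {dg_flip:.1f} kcal/mol")
```

Under this assumed temperature the result lies just below 9 kcal/mol, consistent with the rounded value quoted in the text.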
One such type of DNA damage is the cis,syn-cyclobutane pyrimidine dimer (CPD), which is formed by a photochemical [2+2] reaction between two adjacent thymines in DNA, as shown in Figure 1. It is known to induce skin cancer and is thus particularly relevant for human health.13 The CPD lesion, also known as the thymine dimer, introduces significant disruption into the DNA helix. The results from a 500-ps molecular dynamics (MD) simulation performed by Miaskiewicz et al. showed a disruption of the hydrogen bonding of the 5′ thymine to its complementary adenine base.14 The hydrogen-bond distance was increased to 2.5 Å (N-H-N), and the observed hydrogen-bond angle was 125°. The X-ray crystal structure also confirms the disruption of base pairing of the thymine dimer with the complementary adenine bases. The distance between the O4 atom of the 5′ thymine and the N6 hydrogen of the complementary adenine base was found to be 2.49 Å, and normal hydrogen-bonding patterns for the 3′ thymine were observed.15 In fact, the authors of that study stated that the X-ray structure is “in remarkably good agreement” with the computed values. Another interesting feature of this structure is that the tilt and roll of the thymine bases that make up the thymine dimer are such that the base stacking is disrupted (i.e., the bases are not parallel).

Finally, the inclusion of a thymine dimer in a DNA duplex has been found to induce a bend or kink in the DNA helix. The kink angle calculated from an X-ray crystal structure of a thymine-dimer-containing DNA decamer was 30°.15 In comparison, the kink angles of both 10-mer and 48-mer DNA strands that had undergone energy minimization were found to be 27°.16,17 The average structure resulting from an 800-ps MD simulation of a thymine dimer in a 12-mer DNA strand was shown to have a kink angle of 22.3°. In comparison, the kink angle calculated for the average structure of an undamaged DNA duplex from an 800-ps MD simulation was found to be 8.2°.18 The experimental and computational measurements of the kink angles of thymine-dimer-containing DNA sequences show excellent agreement, which is encouraging for the computational study of structure and dynamics of these DNA sequences.

DNA photolyases, enzymes with between 454 and 614 amino acid residues that contain two noncovalently bound cofactors, repair the thymine dimer by electron-transfer-induced cycloreversion under irradiation with visible light.19,20 A crystal structure of A. nidulans DNA photolyase in complex with a model of the photodamaged DNA substrate shows that the lesion site is flipped out into the enzyme active site.21 This was also suggested by earlier experimental work, computational models, and the crystal structure of E. coli DNA photolyase without bound substrate.22-25 The changes in the DNA structure due to the thymine dimer have been postulated to aid in the recognition of the thymine dimer by DNA photolyases, possibly by destabilizing the DNA structure in comparison to that of undamaged DNA.15 This destabilization could allow for the thymine dimer to undergo spontaneous base flipping, leading to a passive mechanism of base flipping and damage recognition.

A recent computational study on flipping of the thymine dimer in a duplex of the sequence 5′-GCACGAATTAAGCACG-3′ (where TT signifies the thymine dimer) showed that the energy required for this process, at ∼6.25 kcal/mol, is much lower than that for undamaged bases (∼15 kcal/mol, depending on the identity of the base).26 Given the significant distortion of the duplex not just for the flipping base but also for the bases adjacent to it, a dependence of the equilibrium constant for flipping on the identities of the bases 3′ and 5′ to the flipping base can be expected. This is demonstrated in the flipping of a DNA base opposite an abasic site. Numerous NMR studies have demonstrated that the base pairs to the 5′ and 3′ sides of the abasic site have an influence on base flipping.27-35 These observations are most likely due to the collapse of the helix that follows base flipping, which allows base stacking of the bases adjacent to the abasic site. A recent study of base flipping in DNA sequences containing an abasic site using fluorescence detection of small-molecule binding to the flipped-out base also reported the observation of sequence dependence.36 A computational study of the base flipping of the target sequence for M.HhaI reported possible sequence dependence based on hydrogen bonding of the flipping base to adjacent bases.12 Sequence dependence was also reported in a study of the rates of base flipping in G-C-containing sequences.8 This effect was attributed to a number of structural properties, such as minor and major groove widths and hydration. However, to the best of our knowledge, no systematic computational study of the sequence dependence of base flipping for the CPD has been published. This article reports the calculation of the potential of mean force for various DNA sequences that contain a thymine dimer.

Figure 1. Formation of the cis,syn-cyclobutane pyrimidine dimer (CPD).

TABLE 1: DNA Sequences Studied
duplex   sequence
Aᵃ       5′-GCACGAATTAAGCAGC-3′
         3′-CGTGCTTAATTCGTGC-5′
G        5′-GCACGGGTTGGGCAGC-3′
         3′-CGTGCCCAACCCGTGC-5′
C        5′-GCACGCCTTCCGCAGC-3′
         3′-CGTGCGGAAGGCGTCG-5′
T        5′-GCACGTTTTTTGCAGC-3′
         3′-CGTGCAAAAAACGTCG-5′
ᵃ Previously reported in reference 24.

10.1021/jp7102935 CCC: $40.75 © 2008 American Chemical Society. Published on Web 03/12/2008
In previous work, a novel method was used to calculate the 2D PMF for the base flipping of the thymine dimer in the sequence 5′-GCACGAATTAAGCACG-3′/3′-CGTGCTTAATTCGTGC-5′, where TT represents the thymine dimer.26 This method has now been applied to the sequences listed in Table 1. The goal of this work was to study the sequence dependence of the structures of the base-flipped conformations and the energetics of the base-flipping process. The structures of the flipped-in and flipped-out states of all duplexes were determined and were found to exhibit sequence-dependent differences. A detailed analysis of the structures of the flipped-out conformations below is followed by a discussion of the 2D potentials of mean force calculated for the DNA sequences studied and the implications for enzymatic recognition.

Methods

All molecular dynamics simulations were performed using the Amber 8 suite of programs and the Cornell et al. force field with the adjustments added by Wang et al.37-39 The parameters for the thymine dimer were those used by Spector et al.18 The DNA structures, both flipped-in and flipped-out, were prepared using Insight II.40 The flipped-out structures were prepared by first removing the bonds between the 5′ and 3′ ends of the thymine dimer and the phosphate backbone and then manually rotating the base(s) of interest out of the DNA helix. The bonds between the thymine dimer and the phosphate backbone were then replaced.26 The DNA was neutralized using Na+ counterions and solvated using the TIP3P water model as provided in xleap. The solvent box extended 8 Å beyond the DNA structure in each direction. The total system sizes are listed in Table 2.

The systems were subjected to three rounds of minimization with 60 000 steps per round. The first 100 steps were performed using the steepest-descent method, and the remaining steps using the conjugate-gradient method. The first round of minimization was performed with constraints placed on the DNA heavy atoms to allow the water box and hydrogen atoms to equilibrate. A second round of minimization with constraints on the two base pairs to both the 5′ and 3′ sides of the dimer, but not on the adenine residues opposite the dimer, was conducted. Finally, a third round of minimization with no constraints was performed.

The minimized system was then equilibrated in the constant-volume, isothermal (NVT) ensemble for 20 ps with constraints of 10 kcal/(mol Å²) on the DNA. The system was then heated to the final temperature of 300 K over 100 ps in the constant-pressure, isothermal (NPT) ensemble with constraints of 10.0 kcal/(mol Å²) on the two base pairs to both the 5′ and 3′ sides of the dimer, but not on the adenine residues opposite the dimer. Isotropic position scaling with a relaxation time of 2 ps was used to maintain a pressure of 1 atm, and Langevin dynamics with a collision frequency of 1.0 ps⁻¹ was used to maintain the temperature at 300 K. Over the subsequent 300 ps, the constraints on the bases to the 5′ and 3′ sides of the dimer were removed, starting from the residues farthest from the dimer and moving closer. After this period of equilibration (420 ps), production MD simulations (8 ns) were run.

TABLE 2: Sizes of Systems
duplex   structure     DNA residues   Na+   H2O
A        flipped-in    32             30    3513
         flipped-out   32             30    4410
T        flipped-in    32             30    3504
         flipped-out   32             30    3778
C        flipped-in    32             30    3703
         flipped-out   32             30    4009
G        flipped-in    32             30    3602
         flipped-out   32             30    4477
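The bookkeeping behind the setup above can be checked with a few lines of arithmetic. The sketch below assumes that each 16-mer strand carries one negatively charged phosphate per internucleotide linkage and no terminal phosphates (an assumption, although it reproduces the 30 Na+ ions in Table 2), and it uses the 0.002-ps time step stated in the Methods. The function names are hypothetical helpers, not part of Amber.

```python
def neutralizing_counterions(strand_lengths):
    """Number of monovalent cations needed to neutralize the DNA.

    Assumes one -1 phosphate per internucleotide linkage and no terminal
    phosphates, so a strand of n nucleotides carries a charge of -(n - 1).
    """
    return sum(length - 1 for length in strand_lengths)


def md_steps(duration_ps, timestep_ps=0.002):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps / timestep_ps)


na_ions = neutralizing_counterions([16, 16])   # two 16-mer strands
equilibration_ps = 20 + 100 + 300              # NVT + heating + restraint release
production_steps = md_steps(8000.0)            # 8-ns production run

print(na_ions, equilibration_ps, production_steps)
```

The three numbers correspond to the Na+ count in Table 2, the 420 ps of equilibration quoted in the text, and the number of 2-fs steps in one 8-ns production run.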
After 8 ns of simulation of the flipped-out structure, fraying of the 3′ bases was observed, which is typical for these lower-stability structures. Thus, no attempt was made to extend the simulation to longer timescales. All calculations used SHAKE to constrain covalent bonds to hydrogen, which allowed for the use of a 0.002-ps time step. Long-range electrostatic interactions were treated using the particle mesh Ewald (PME) method with a long-range cutoff of 10 Å applied to the Lennard-Jones interactions.41 Periodic boundary conditions were used in all calculations.

CURVES analysis, including the global bend and base-pair parameters (buckle, open, propeller), was performed on 100 structures that were output every 80 ps from the trajectory files of the MD simulations.42 All simulations were analyzed using the ptraj module of Amber 9. It was confirmed, using the hbond module of ptraj, that water molecules filled the “hole” left by the thymine dimer in the flipped-out structure for all sequences studied, as has been previously shown.26 The values of two pseudodihedral angles, defined as a dihedral angle connecting the glycosidic nitrogen and the 1′ carbon atom of either the 5′ or 3′ thymine of the thymine dimer and the 1′ carbon atoms of the base pair immediately adjacent to that thymine (see Figure 5), were used to define the base-flipping coordinate and were output as a function of time for the 8-ns production runs. These data were used to create a histogram, the modes of which were used to define the flipped-in and flipped-out states.

The details of the computation of the 2D PMF were presented in a previous report.26 Umbrella sampling was performed using constrained [0.05 kcal/(mol deg)] harmonic potentials with the values of the pseudodihedral for each window as shown in Figures 2 and 3. Two different sampling schemes were used because of the sequence-dependent differences in the values of the pseudodihedral angles of the flipped-out state. To access the region of the phase space that was needed for duplexes G and C, windows were added to the sampling scheme in Figure 2 to create the sampling scheme shown in Figure 3. The windows added to the sampling scheme are shown by the triangles. The sampling scheme shown in Figure 2 was used to calculate the potentials of mean force for duplexes A and T, whereas the scheme shown in Figure 3 was used for duplexes G and C. The step sizes of the 5′ and 3′ pseudodihedral angles were set to 5° and 6°, respectively. For the sampling scheme shown in Figure 2, the windows at (70°, 4°), (95°, -26°), and (125°, -62°), where the angles are in (5′, 3′) pseudodihedral pairs, were started from the last frame of the equilibrium simulations, which had been stripped of water molecules and ions, resolvated and neutralized, and equilibrated for 100 ps (NVT) or 500 ps (NPT). Two additional windows of that type, at (125°, -20°) and (90°, -62°), were added to the sampling scheme shown in Figure 3. Each of these windows was then equilibrated for 3 ns. The results of the equilibrations of these windows were then used to start the equilibrations of neighboring windows, which were done independently. In this way, the risk of hysteresis was minimized because each simulation was independent of the one before it. Each window was equilibrated for 100 ps and sampled for 2.5 ns. The total simulation times used to generate the 2D potentials of mean force were 159.5 ns (duplex T) and 258.9 ns (duplexes C and G). The values of the pseudodihedral angles were saved every 0.2 ps. The unbiased free energy was obtained by using the two-dimensional weighted histogram analysis method (WHAM) as implemented by Grossfield.43-45 The convergence criterion was 0.0001, and the data were placed into bins of 0.5° and 0.6° for the 5′ and 3′ pseudodihedral angles, respectively. Convergence of the free energy curves computed using this method was demonstrated previously.26

Figure 2. Sampling scheme used for duplexes A and T. The open squares show the flipping coordinate in which each window was equilibrated for 100 ps and then sampled for 2.5 ns. The sampling for each of these windows was started from one of the extensively equilibrated structures. The large filled squares show the extensively equilibrated structures in which each window was equilibrated for 3 ns and then sampled for 2.5 ns. The filled points show the sampling extensions used to move high-energy artifacts away from the flipping coordinate; each window was equilibrated for 100 ps and then sampled for 2.5 ns.

Figure 3. Sampling scheme used for duplexes G and C. For further details, see Figure 2.

Results and Discussion

Structures of Flipped-in and Flipped-out Conformations. The proper calculation of the potential of mean force (PMF) for the base flipping of the thymine dimer requires the careful identification of starting and ending points for this process because these points define the energy differences studied here. The definitions of the flipped-in and flipped-out states of the thymine-dimer-containing DNA were based on the results of unconstrained molecular dynamics (MD) simulations of the two states. The average structures over 8 ns of simulation of the flipped-in conformations of the DNA sequences are shown in Figure 4. The values of the 5′ and 3′ pseudodihedral angles, defined in Figure 5, were saved every 1 ps, and the results were used to create histograms with 1° bin widths. The 5′ and 3′ pseudodihedral angles of the flipped-in and flipped-out states were defined as the modes of the respective histograms for all sequences studied.

Figure 4. Average structures (8 ns) of flipped-in conformations of thymine-dimer-containing DNA sequences studied: (a) duplex A, (b) duplex T, (c) duplex C, and (d) duplex G.
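The state definition described above (a pseudodihedral angle per frame, then the mode of a histogram with 1° bins) can be sketched as follows. This is an illustrative reimplementation with NumPy, not the ptraj workflow used in the study. The atom ordering passed to `dihedral_deg` (N1, C1′ of the flipping base, C1′ of the adjacent base, C1′ of that base's pairing partner) is an assumption about how the four points are ordered, and the synthetic angle series at the end is made-up data for demonstration only.

```python
import numpy as np


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points in 3D."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Components of the outer bonds perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


def histogram_mode(angles_deg, bin_width=1.0):
    """Center of the most populated histogram bin.

    Periodic wrap-around at +/-180 degrees is not handled here; angles are
    treated as plain numbers.
    """
    angles = np.asarray(angles_deg, dtype=float)
    lo = np.floor(angles.min() / bin_width) * bin_width
    n_bins = int(np.floor((angles.max() - lo) / bin_width)) + 1
    edges = lo + bin_width * np.arange(n_bins + 1)
    counts, edges = np.histogram(angles, bins=edges)
    i = int(np.argmax(counts))
    return 0.5 * (edges[i] + edges[i + 1])


# Synthetic example: 8000 frames (one per ps) scattered around 71 degrees
rng = np.random.default_rng(0)
synthetic_angles = rng.normal(71.0, 3.0, size=8000)
print(histogram_mode(synthetic_angles))
```

Note that this sketch returns the bin center, whereas the tabulated values in the article are whole degrees; which edge or label of the bin is reported is a presentation choice.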
Although these pseudodihedral angles do not necessarily describe the pathway taken for the interconversion of the flipped-in and flipped-out forms, they can be used to define the PMF for the process. The histograms created for the flipped-in states of duplexes A, G, C, and T are shown in Figure 6, and the values are listed in Table 3. The narrow distributions of the flipped-in states indicate a low degree of conformational flexibility. As expected, there is little deviation in the values of the pseudodihedral angles in the flipped-in state for the sequences studied.

Figure 5. Two-dimensional pseudodihedral coordinate used to define the flipping process. Each pseudodihedral angle connects the N1 and C1′ atoms of the flipping base with the C1′ atom of the adjacent base and the C1′ atom of the base-pairing partner of the adjacent base.

Figure 6. Normalized probability histograms of the 5′ (dark) and 3′ (light) pseudodihedral angles over 8-ns equilibrium MD simulations of flipped-in DNA sequences for duplexes A (blue), C (green), G (purple), and T (orange). Data were saved every picosecond. Histograms were created using 1° bin widths.

TABLE 3: Values of Pseudodihedral Angles for Sequences Studiedᵃ
                            pseudodihedral angle (deg)
DNA sequence   system        5′        3′
duplex A       flipped-in    71        4
               flipped-out   122       -65
duplex Tᵇ      flipped-in    78        1
               flipped-out   191       -44, -47
duplex C       flipped-in    74, 75    4
               flipped-out   125       -24
duplex G       flipped-in    75        -3
               flipped-out   124       -18
ᵃ Values given are the modes of histograms created using the values of the pseudodihedral angles output from the 8-ns production MD simulations. Data were saved every 1 ps.
ᵇ Values from 8-ns unconstrained MD simulation.

However, the values of the pseudodihedral angles of the flipped-out states of the DNA sequences showed some interesting differences, as shown in Figure 7 and listed in Table 3. The average structures of the flipped-out conformations of the DNA sequences are shown in Figure 8. The 5′ pseudodihedral was located at 122-125° for duplexes A, G, and C. A large difference, ranging from ∼120° to ∼190°, in the 5′ pseudodihedral was encountered in the simulation of duplex T, even though the simulations started from the same initial coordinates. As this result was unexpected, it was investigated further. The average structure of the flipped-out conformation of duplex T (Figure 8b) shows a large-scale kink in the DNA that is also accompanied by a loss of base stacking between the adenines opposite the thymine dimer (A24 and A25). As previously defined by Norberg and Nilsson, the distances between glycosidic nitrogens in the base-stacked and non-base-stacked conformations are 4.5 and 9 Å, respectively.46 The distance between the glycosidic nitrogens (N9) of A24 and A25 over the simulation time is shown in Figure 9. The adenines opposite the thymine dimer are stacked from 0 to 4 ns, as indicated by the average N9-N9 distance of 4.6 Å. However, from 4 to 8 ns, the average N9-N9 distance increases to 7.0 Å and is quite variable, up to 8.5 Å. This indicates that, during the later time period, the adenine bases are not stacked, which presumably causes the large-scale kink in the DNA structure at that point. However, neither the average structure nor snapshots from the simulation show spontaneous base flipping. The average global bend of duplex T from 0 to 4 ns was found to be 43.8°, whereas that from 4 to 8 ns was found to be significantly higher, 54.1°. This phenomenon was not observed for any of the other DNA sequences studied, as evidenced by the average glycosidic nitrogen distances between ∼4 and 5 Å in duplexes C, G, and A (see Supporting Information).

In an attempt to confirm the conformation of duplex T in which A24 and A25 are not well stacked, an MD simulation was run as a control using values of the 5′ and 3′ pseudodihedral angles that were constrained to 188° and -43°. At the end of the simulation time, 500 ps, the 5′ and 3′ pseudodihedral angles were 178.9° and -39.2°. This structure was then subjected to 500 ps of unconstrained MD simulation, after which the 5′ and 3′ pseudodihedral angles were found to be 118.5° and -56°. These values are similar to the pseudodihedral angles of the flipped-out state of duplex A. Unconstrained simulations of 8-ns duration led only to a flipping of the CPD back into the duplex. Judging from the large difference in flipped-out conformations of duplex T and the other sequences, the unexpected loss of base stacking between the adenines opposite the thymine dimer, and the formation of a flipped-out conformation similar to the conformations of the other sequences after additional simulations, it can be concluded that the simulation of duplex T samples a region of phase space that constitutes a local minimum. This local minimum includes the highly kinked conformation in which there is a loss of base stacking between the two adenines across from the thymine dimer. This local minimum is not encountered for the other sequences studied, most notably duplex A, which is similar in sequence. However, the actual flipped-out structure was found to be very similar to the structure observed for duplex A, and the same sampling scheme was used for duplex T as was used for duplex A.

As discussed in more detail below, the sequences with A-T bases adjacent to the thymine dimer show high deformability of the base pairs as compared to those with G-C base pairs adjacent to the dimer. These structural deformations might disrupt the stacking between the orphaned adenine bases across from the thymine dimer and the bases adjacent to them. One possible hypothesis is that, if stacking in the AAAAAA sequence across from the thymine dimer in duplex T is disrupted, then the two central adenines break stacking interactions in order to stack more efficiently with their neighboring adenine bases (i.e., stacking of three adjacent adenine bases, AAA-AAA, versus two adjacent bases, AAAA-AA).
For the TTAATT sequence across from the thymine dimer in duplex A, the breaking of stacking interactions between the central adenines would allow for more efficient stacking with the adjacent thymine bases. Because purine-pyrimidine stacking interactions are not as favorable as the purine-purine interactions present in duplex T, this structure will be less stabilized by base stacking.

Sequence-dependent differences were also found in the 3′ pseudodihedral angles, as listed in Table 3. The values for duplexes A and T were found to be -65° and -56°, respectively, whereas those for duplexes G and C were found to be -18° and -24°, respectively. These results indicate that there is a sequence-dependent difference in the structures of the flipped-out conformations of the DNA sequences, particularly to the 3′ side of the thymine dimer. To evaluate whether the structural difference was due to differences in hydrogen bonding between base pairs adjacent to the thymine dimer, we calculated the average distances between hydrogen-bond donors and acceptors over the 8-ns simulation times (see Supporting Information). There were no noticeable differences in the hydrogen-bond distances between sequences. However, the average distances might not necessarily be best suited to show the subtle structural differences that are observed. As previously discussed, there were also no noticeable differences in the distances between glycosidic nitrogens of adjacent bases that could be used to evaluate differences in base stacking. Therefore, a more detailed structural analysis was needed to determine the cause of the sequence-dependent structural differences. The sequence-dependent structural differences observed in the unconstrained MD simulations are most likely due to bases or base pairs to the 3′ side of the thymine dimer, as it is in the 3′ pseudodihedral angles that these differences become apparent.
Because the structural differences seem to depend on the identity of the base pairs adjacent to the thymine dimer rather than the individual bases, the base-pair propeller, buckle, and opening parameters were calculated using CURVES.42 A propeller configuration is the twisting of bases along an axis that is perpendicular to the helical axis, as shown in Figure 10. A propeller value of 0° for a base pair indicates that the two bases in the pair share a common plane, whereas a large positive or negative value indicates that the two bases have twisted away from one another and no longer share a common plane. The propeller twists of the two base pairs to the 5′ side of the thymine dimer (bases 6-27 and 7-26) for all sequences studied are shown in Figure 11. Although the values of the propeller twist for the bases 5′ to the thymine dimer are higher than expected for undamaged DNA, typically below ±15°, these values are not unexpected.42,47,48 Higher values of propeller twist can be expected on the basis of the deformation of the structure caused by both the inclusion of and the base flipping of a thymine dimer, as has been previously demonstrated.16-18 The buckle and opening parameters of the 5′ base pairs are similar to the propeller parameters (see Supporting Information).

The propeller twists of the base pairs to the 3′ side of the thymine dimer (10-23 and 11-22) exhibit sequence-dependent differences, as shown in Figure 12. These differences are also reflected in the buckle and opening parameters of these base pairs (see Supporting Information). The base pairs 10-23 and 11-22 in duplexes A and T exhibit high degrees of propeller twist that also show high variability over the course of the simulation. However, for duplexes G and C, this large propeller twist is not observed. For duplex C, the propeller twist exhibits variability from 0 to 2 ns, after which it more closely resembles that of duplex G. These results indicate that the conformations of A-T base pairs to the 3′ side of the thymine dimer are more likely to undergo deformations than are those of G-C pairs, which is most likely the ultimate cause of the differences observed in the 3′ pseudodihedral angles of the flipped-out conformations. This might be due to the inherently weaker interactions of A-T pairs, with only two hydrogen bonds, as compared to G-C pairs, with three hydrogen bonds. A number of other possibilities, including base stacking or non-Watson-Crick hydrogen-bonding patterns, were proposed by Nelson et al.49 This effect has, to the best of our knowledge, not been previously observed, although this might be due to the lack of long-timescale computational investigations of such deformed DNA structures.

Figure 7. Normalized probability histograms of the 5′ (dark) and 3′ (light) pseudodihedral angles over 8-ns equilibrium MD simulations of flipped-out DNA sequences. Left: Duplex A (gray), duplex T (blue). Right: Duplex G (gray), duplex C (blue). Data were saved every picosecond. Histograms were created using 1° bin widths. Note that the distributions have different scales than the distributions shown in Figure 6.

Figure 8. Average structures (8 ns) of flipped-out conformations of the thymine-dimer-containing DNA sequences studied: (a) duplex A, (b) duplex T, (c) duplex C, and (d) duplex G.

Figure 9. Distance between glycosidic nitrogens (N9) of adenines opposite the thymine dimer (A24, A25) over the simulation time (8 ns). Data were saved every picosecond.

Figure 10. Base-pair parameters as defined in CURVES: (a) propeller, (b) buckle, and (c) opening.

Figure 11. Propeller values (degrees) over the simulation time (8 ns) for base pairs 6-27 and 7-26 of the flipped-out conformations of duplexes A, G, C, and T. Data were generated using CURVES analysis of Protein Data Bank structures saved every 80 ps.
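Several observations in this section rest on comparing time windows of a single trajectory rather than whole-trajectory averages, for example the N9-N9 distance in duplex T averaging 4.6 Å over 0-4 ns and 7.0 Å over 4-8 ns. A minimal sketch of such a windowed average is given below. The function name and the synthetic distance series are illustrative assumptions; real input would be a per-frame distance series exported from the trajectory analysis.

```python
import numpy as np


def window_means(values, times_ns, edges_ns):
    """Mean of `values` within each half-open time window [start, stop)."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times_ns, dtype=float)
    means = []
    for start, stop in zip(edges_ns[:-1], edges_ns[1:]):
        mask = (times >= start) & (times < stop)
        means.append(float(values[mask].mean()))
    return means


# Synthetic stand-in for an N9-N9 distance series saved every picosecond:
# stacked (about 4.6 A) for the first 4 ns, unstacked (about 7.0 A) afterwards.
rng = np.random.default_rng(1)
times_ns = np.arange(8000) * 0.001
distances = np.where(times_ns < 4.0, 4.6, 7.0) + rng.normal(0.0, 0.3, size=times_ns.size)

first_half, second_half = window_means(distances, times_ns, [0.0, 4.0, 8.0])
print(f"0-4 ns: {first_half:.2f} A, 4-8 ns: {second_half:.2f} A")
```

A single 8-ns average of this series would fall between the two states and hide the transition, which is the reason for splitting the trajectory into windows.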
Energetics of Base Flipping. The 3′ pseudodihedral angles of the flipped-out conformations of the A-T-containing sequences, duplexes A and T, differ from those of the G-C-containing sequences, duplexes G and C. Computing the PMF for the two groups therefore requires sampling different regions of phase space. The original sampling scheme, shown in Figure 2, was modified to include the relevant regions of the 2D phase space as shown in Figure 3. The computed potentials of mean force for duplexes C, G, and T are shown in Figures 13 and 14, and the respective ∆Gflip energies are listed in Table 4. The energy required to undergo base flipping for duplex A, 6.25-6.5 kcal/mol, was reported previously.26 As shown in Figure 14, the energy required for base flipping in duplex T was found to be 5.25-5.5 kcal/mol. The flipped-out states of duplexes A and T were found to be structurally similar, but the energetics of base flipping differ by ∼1 kcal/mol. The significance of this difference depends on the error associated with the methods used to calculate the respective energy surfaces, which have previously been shown to have a high degree of accuracy within the limits of adequate sampling of the phase space.43,44 Given the similarities of the systems studied and the identical sampling parameters used, it can be expected that the relative energy differences as a function of the sequence will be fairly reliable. This combination of structural similarity and energetic differences was also observed for duplexes G and C. As shown in Figure 13, the energy required for the thymine dimer to undergo base flipping in duplex C was calculated to be 6.5-6.75 kcal/mol, whereas the energy required for base flipping in duplex G was found to be 7.25-7.5 kcal/mol.
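The ∆Gflip values are read from the PMF surface as the free-energy difference between the flipped-in and flipped-out locations, and each reported range is 0.25 kcal/mol wide, which matches the contour spacing of the maps. A minimal sketch of that bookkeeping follows, assuming the PMF is available as a dictionary keyed by (5′, 3′) pseudodihedral grid points. The function names and the toy surface are hypothetical, and this is not the WHAM calculation itself.

```python
import math


def delta_g_flip(pmf, flipped_in, flipped_out):
    """Free-energy difference (kcal/mol) between two grid points of a PMF.

    pmf: dict mapping (5' angle, 3' angle) tuples to free energy in kcal/mol.
    """
    return pmf[flipped_out] - pmf[flipped_in]


def contour_bin(energy, step=0.25):
    """Return the (lower, upper) bounds of the contour level containing energy."""
    lower = math.floor(energy / step) * step
    return lower, lower + step


# Toy surface with made-up numbers, for illustration only (not from the paper).
toy_pmf = {(0, 0): 0.0, (40, -20): 3.1}
dg = delta_g_flip(toy_pmf, flipped_in=(0, 0), flipped_out=(40, -20))
low, high = contour_bin(dg)
```

Reporting the bounds of the contour level rather than a single number reflects the resolution at which the surface is drawn.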
There is a clear sequence dependence of the energy required for the thymine dimer to flip out of the DNA duplex, as evidenced by the ∼2 kcal/mol difference in the energies for the sequences studied. The trend in the computed energies required for base flipping is duplex T < duplex A ≤ duplex C < duplex G. As previously discussed, there is also a structural difference in the flipped-out states of the sequences studied, particularly with respect to A-T and G-C base pairs adjacent to the thymine dimer. Sequences in which A-T pairs flank the thymine dimer show deformability of the base pairs to the 3′ side of the dimer that is not observed for flanking G-C base pairs. The relationship between the deformation of the flanking base pairs and the energy required to undergo base flipping can thus be understood using a combination of the data obtained from the unconstrained MD simulations and the PMF results. The sequences in which deformations of the base pairs to the 3′ side of the thymine dimer are observed, duplexes A and T, were found to require less energy to undergo base flipping than sequences in which no deformations are observed, duplexes G and C. Therefore, one possible explanation is that the distortions of 3′ base pairs are a result of the base flipping of the thymine dimer and that, for the weaker hydrogen bonding in A-T base pairs, these distortions are a quite prominent feature of the flipped-out DNA structures. For G-C base pairs, which have more hydrogen bonds and more favorable base stacking interactions than A-T pairs, the energy required to deform the 3′ base pairs is greater, and the conformations and energetics of the flipped-out structures reflect this increase. The more rigid G-C base pairs are more difficult to distort than A-T base pairs. This distortion to the 3′ side of the thymine dimer lowers the energy required to undergo base flipping by ∼0.5-2 kcal/mol, depending on the exact sequence context. Although these arguments about the sequence dependence of the base-flipping process are intuitive based on the known properties of the base pairs, the data gathered in this study allow a quantitative analysis of the structure and energetics of the flipping process, as well as the conformations of the flipped-out structures.

Figure 12. Propeller values (degrees) over the simulation time (8 ns) for base pairs 10-23 and 11-22 of the flipped-out conformations of (left) duplex A (10-23, red; 11-22, pink) and duplex T (10-23, dark blue; 11-22, light blue) and (right) duplex C (10-23, red; 11-22, pink) and duplex G (10-23, dark blue; 11-22, light blue). Data were generated using CURVES analysis of Protein Data Bank structures saved every 80 ps.

Figure 13. Contour maps of the potentials of mean force for the base flipping of thymine dimer in duplexes (a) C and (b) G. Free energy as a function of 5′ and 3′ pseudodihedral angles. Each color represents a 0.25 kcal/mol change in energy. The points indicated on the chart with crosshairs are also labeled with the corresponding pseudodihedral pairs (5′, 3′).

Figure 14. Contour map of the potential of mean force for the base flipping of thymine dimer in duplex T. For more details, see Figure 13.

TABLE 4: Summary of Energetics of Base Flipping

Locations of the conformations are given as (5′, 3′) pseudodihedral angles.

| duplex | ∆Gflip (kcal/mol) | flipped-in (5′, 3′) | flipped-out (5′, 3′) |
| --- | --- | --- | --- |
| T | 5.25-5.5 | 79, -1 | 110, -56 |
| A(a) | 6.25-6.5 | 70, 10 | 111, -56 |
| C | 6.5-6.75 | 71, 8 | 118, -13 |
| G | 7.25-7.5 | 70, 10 | 117, -13 |

(a) Previously reported in ref 26.
The observation of the sequence dependence of the energetics of base flipping of the thymine dimer by computational methods warrants a full exploration of this phenomenon using experimental methods. Using a sensitive experimental technique, such as fluorescence detection of a small-molecule probe for base flipping,36,50 the energy difference between sequences, ∼0.5-2 kcal/mol, would be easily discernible. The sequence-dependent differences in base flipping energies might also have implications for the recognition of the thymine dimer by DNA photolyase. The complex between damaged DNA and DNA photolyase has been shown to include the thymine dimer flipped out of the duplex into the active site of the enzyme. It has not yet been shown whether the thymine dimer is recognized by the enzyme after spontaneous base flipping or whether base flipping is induced by the enzyme. In either case, the differences in energy for base flipping of various sequences will play a role in the enzymatic repair of the thymine dimer. Experimental studies of CPD repair by CPD photolyase have so far not been able to detect a sequence dependence, possibly because of the detection limits of studies in such a relatively complex system.19
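To give a sense of scale for these energy differences, a free-energy gap can be converted into a ratio of equilibrium flipped-out populations through the Boltzmann relation K = exp(-∆G/RT). The sketch below assumes a simple two-state (flipped-in/flipped-out) equilibrium and a temperature of 298.15 K, neither of which is specified in this section. The function names are hypothetical, and the result describes populations only, not flipping rates.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def flipped_out_fraction(dg_flip, temp_k=298.15):
    """Equilibrium fraction of flipped-out duplex in a two-state model.

    dg_flip: free energy of flipping in kcal/mol (flipped-out minus flipped-in).
    """
    k_eq = math.exp(-dg_flip / (R_KCAL * temp_k))
    return k_eq / (1.0 + k_eq)


def population_ratio(ddg, temp_k=298.15):
    """Ratio of equilibrium constants for two sequences that differ by ddg."""
    return math.exp(ddg / (R_KCAL * temp_k))


# Differences at the two ends of the range discussed in the text.
for ddg in (0.5, 2.0):
    print(f"ddG = {ddg} kcal/mol -> population ratio ~ {population_ratio(ddg):.1f}")
```

Under these assumptions, a difference of 0.5 kcal/mol corresponds to roughly a 2-fold change in the flipped-out population, and a difference of 2 kcal/mol to roughly a 30-fold change.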
Conclusions

Using unconstrained MD simulations, the starting and ending points, flipped-in and flipped-out, of thymine-dimer-containing DNA sequences were identified. These simulations revealed sequence-dependent differences in the flipped-out structures. The 5′ pseudodihedral of duplex T was found to be significantly different from those of the other sequences, which was attributed to the loss of base stacking between the adenine bases opposite the thymine dimer. Upon further investigation, it was concluded that the simulation of duplex T sampled a local minimum of the potential energy surface in which the non-base-stacked conformation exists and that the flipped-out state is similar to that of duplex A. Sequence-dependent differences in the 3′ pseudodihedral, approximately -60 for duplexes A and T and approximately -20 for duplexes G and C, were also observed. Deformations of the base pairs 3′ to the thymine dimer of duplexes A and T are apparent from plots of the base-pair buckle, propeller, and opening parameters. These deformations were not observed for the simulations of duplexes G and C. The previously reported method used to calculate the PMF for base flipping of the thymine dimer26 was modified to include the regions of phase space necessary to describe the flipped-out conformations of duplexes G and C. Sequence-dependent differences in the energetics of base flipping were also observed. The energies required for the thymine dimer to undergo base flipping in duplexes A, T, G, and C were found to be 6.25-6.5, 5.25-5.5, 7.25-7.5, and 6.5-6.75 kcal/mol, respectively. The energies required for base flipping in duplexes A and T, in which base-pair deformations are observed, are lower than those for duplexes G and C, in which no deformations are observed. This might be due to the differences in hydrogen bonding and base stacking of A-T and G-C base pairs, with G-C pairs being more difficult to deform.
The results presented suggest that the deformation of base pairs 3′ to the thymine dimer is a prominent feature of the flipped-out structures and that the more-difficult-to-deform G-C base pairs require more energy to undergo base flipping. Experimental validation of these results, including the sequence-dependent energetic differences, should be possible using a sufficiently sensitive method. The enzymatic recognition of the thymine dimer, which depends on the mechanism of recognition, should depend on the sequence-dependent energy differences described herein and warrants further study.

Acknowledgment. We acknowledge the generous allocation of computer resources by the Center for Research Computing at the University of Notre Dame and many helpful discussions with Dr. Alan Grossfield (University of Rochester Medical Center). L.L.O. is the recipient of a Grace fellowship from the University of Notre Dame.

Supporting Information Available: Analysis of flipped-in and flipped-out structures for all sequences studied, including RMSd, global bend, hydrogen-bonding analysis, base-stacking analysis, and snapshots of additional constrained and unconstrained simulations of duplex T. CURVES analysis, including buckle and opening parameters, of all DNA structures and detailed sampling schemes for the calculation of the potentials of mean force. This material is available free of charge via the Internet at http://pubs.acs.org.

References and Notes

(1) Klimasauskas, S.; Kumar, S.; Roberts, R. J.; Cheng, X. Cell 1994, 76, 357-369.
(2) Reinisch, K. M.; Chen, L.; Verdine, G. L.; Lipscomb, W. N. Cell 1995, 82, 143-153.
(3) Banerjee, A.; Yang, W.; Karplus, M.; Verdine, G. L. Nature 2005, 434, 612-618.
(4) Vassylyev, D. G.; Kashiwagi, T.; Mikami, Y.; Ariyoshi, M.; Iwai, S.; Ohtsuka, E.; Morikawa, K. Cell 1995, 83, 773-782.
(5) Banerjee, A.; Santos, W. L.; Verdine, G. L. Science 2006, 311, 1153-1157.
(6) Blainey, P. C.; van Oijen, A. M.; Banerjee, A.; Verdine, G. L.; Xie, X. S. Proc. Natl. Acad. Sci. U.S.A. 2006, 103, 5752-5757.
(7) Parker, J. B.; Bianchet, M. A.; Krosky, D. J.; Friedman, J. I.; Amzel, L. M.; Stivers, J. T. Nature 2007, 449, 433-437.
(8) Dornberger, U.; Leijon, M.; Fritzsche, H. J. Biol. Chem. 1999, 274, 6957-6962.
(9) Priyakumar, U. D.; MacKerell, A. D., Jr. Chem. Rev. 2006, 106, 489-505.
(10) Guidice, E.; Várnai, P.; Lavery, R. Chem. Phys. Chem. 2001, 11, 673-677.
(11) Guidice, E.; Várnai, P.; Lavery, R. Nucleic Acids Res. 2003, 31, 1434-1443.
(12) Banavali, N. K.; MacKerell, A. D., Jr. J. Mol. Biol. 2002, 319, 141-160.
(13) Friedberg, E. C. DNA Repair; W. H. Freeman & Co.: New York, 1985; Chapters 1-5, 2-1, and 2-2.
(14) Miaskiewicz, K.; Miller, J.; Cooney, M.; Osman, R. J. Am. Chem. Soc. 1996, 118, 9156-9163.
(15) (a) Park, H.; Zhang, K.; Ren, Y.; Nadji, S.; Sinha, N.; Taylor, J.-S.; Kang, C. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 15965-15970. It should be mentioned that NMR and gel shift studies indicate a smaller kink angle: (b) McAteer, K.; Jing, Y.; Kao, J.; Taylor, J. S.; Kennedy, M. A. J. Mol. Biol. 1998, 282, 1013-1032. (c) Lee, J. H.; Choi, Y. J.; Choi, B. S. Nucleic Acids Res. 2000, 28, 1794-1801. (d) Husain, I.; Griffith, J.; Sancar, A. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 2558-2562. (e) Wang, C. I.; Taylor, J. S. Chem. Res. Toxicol. 1993, 6, 519-523. (f) Wang, C. I.; Taylor, J. S. Proc. Natl. Acad. Sci. U.S.A. 1991, 88, 9072-9076.
(16) Kim, S.-H.; Pearlman, D. A.; Holbrook, S. R.; Pirkle, D. Prog. Clin. Biol. Res. 1985, 172, 143-152.
(17) Pearlman, D. A.; Holbrook, S. R.; Pirkle, D. H.; Kim, S.-H. Science 1985, 227, 1304-1308.
(18) Spector, T. I.; Cheatham, T. E., III; Kollman, P. A. J. Am. Chem. Soc. 1997, 119, 7095-7104.
(19) Sancar, A. Chem. Rev. 2003, 103, 2203-2238.
(20) Harrison, C. B.; O’Neil, L. L.; Wiest, O. J. Phys. Chem. A 2005, 109, 7001-7012.
(21) Mees, A.; Klar, T.; Gnau, P.; Hennecke, U.; Eker, A. P. M.; Carell, T.; Essen, L.-O. Science 2004, 306, 1789-1793.
(22) Christine, K. S.; MacFarlane, A. W., IV; Yang, K.; Stanley, R. J. J. Biol. Chem. 2002, 277, 38339-38344.
(23) Sanders, D. B.; Wiest, O. J. Am. Chem. Soc. 1999, 121, 5127-5134.
(24) Antony, J.; Medvedev, D. M.; Stuchebrukhov, A. A. J. Am. Chem. Soc. 2000, 122, 1057-1065.
(25) Park, H.-W.; Kim, S.-T.; Sancar, A.; Deisenhofer, J. Science 1995, 268, 1866-1872.
(26) O’Neil, L. L.; Grossfield, A.; Wiest, O. J. Phys. Chem. B 2007, 111, 11843-11849.
(27) Cuniasse, Ph.; Fazakerley, G. V.; Guschlbauer, W.; Kaplan, B. E.; Sowers, L. C. J. Mol. Biol. 1990, 213, 303-314.
(28) Singh, M. P.; Hill, G. C.; Péoc’h, D.; Rayner, B.; Imbach, J.-L.; Lown, J. W. Biochemistry 1994, 33, 10271-10285.
(29) Coppel, Y.; Berthet, N.; Colombeau, C.; Garcia, J.; Lhomme, J. Biochemistry 1997, 36, 4817-4830.
(30) Wang, K. Y.; Parker, S. A.; Goljer, I.; Bolton, P. H. Biochemistry 1997, 36, 11629-11639.
(31) Gelfand, C. A.; Plum, G. E.; Grollman, A. P.; Johnson, F.; Breslauer, K. J. Biochemistry 1998, 37, 7321-7327.
(32) Berger, R. D.; Bolton, P. H. J. Biol. Chem. 1998, 273, 15565-15573.
(33) Barsky, D.; Foloppe, N.; Ahmadia, S.; Wilson, D. M., III; MacKerell, A. D., Jr. Nucleic Acids Res. 2000, 28, 2613-2626.
(34) Hoehn, S. T.; Turner, C. J.; Stubbe, J. Nucleic Acids Res. 2001, 29, 3413-3423.
(35) Chen, J.; Dupradeau, F.-Y.; Case, D. A.; Turner, C. J.; Stubbe, J. Biochemistry 2007, 46, 3096-3107.
(36) O’Neil, L. L.; Wiest, O. Org. Biomol. Chem. 2008, 6, 485-492.
(37) Case, D. A.; Darden, T. A.; Cheatham, T. E., III; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Merz, K. M.; Pearlman, D. A.; Crowley, M.; Walker, R. C.; Zhang, W.; Wang, B.; Hayik, S.; Roitberg, A.; Seabra, G.; Wong, K. F.; Paesani, F.; Wu, X.; Brozell, S.; Tsui, V.; Gohlke, H.; Yang, L.; Tan, C.; Mongan, J.; Hornak, V.; Cui, G.; Beroza, P.; Mathews, D. H.; Schafmeister, C.; Ross, W. S.; Kollman, P. A. AMBER 9; University of California, San Francisco, 2006.
(38) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, D. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179-5197.
(39) Wang, J.; Cieplak, P.; Kollman, P. A. J. Comput. Chem. 2000, 21, 1049-1074.
(40) Insight II; Accelrys Software Inc.: San Diego, CA, 2005.
(41) Darden, T.; York, D.; Pedersen, L. G. J. Chem. Phys. 1993, 98, 10089-10092.
(42) Lavery, R.; Sklenar, H. J. Biomol. Struct. Dyn. 1988, 6, 63-91.
(43) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A. J. Comput. Chem. 1992, 13, 1011-1021.
(44) Kumar, S.; Rosenberg, J. M.; Bouzida, J.; Swendsen, R. H.; Kollman, P. A. J. Comput. Chem. 1995, 16, 1339-1350.
(45) Grossfield, A. http://dasher.wustl.edu/alan/ (accessed Aug 29, 2006).
(46) Norberg, J.; Nilsson, L. J. Am. Chem. Soc. 1995, 117, 10832-10840.
(47) El Hassan, M. A.; Calladine, C. R. J. Mol. Biol. 1996, 259, 95-103.
(48) Mukherjee, S.; Bansal, M.; Bhattacharyya, D. J. Comput.-Aided Mol. Des. 2006, 20, 629-645.
(49) Nelson, H. C. M.; Finch, J. T.; Bonaventura, F. L.; Klug, A. Nature 1987, 330, 221-226.
(50) O’Neil, L. L.; Wiest, O. J. Am. Chem. Soc. 2005, 127, 16800-16801.