Biasing Simulations of DNA Base Pair Parameters with Application to

Dec 15, 2016 - An accurate and efficient implementation of the six DNA base pair parameters as order parameters for enhanced sampling simulations is p...
1 downloads 12 Views 4MB Size
Article pubs.acs.org/jcim

Biasing Simulations of DNA Base Pair Parameters with Application to Propellor Twisting in AT/AT, AA/TT, and AC/GT Steps and Their Uracil Analogs Alfredo Peguero-Tejada and Arjan van der Vaart* Department of Chemistry, University of South Florida, 4202 East Fowler Avenue, CHE 205, Tampa, Florida 33620, United States ABSTRACT: An accurate and efficient implementation of the six DNA base pair parameters as order parameters for enhanced sampling simulations is presented. The parameter definitions are defined by vector algebra operations on a reduced atomic set of the base pair, and correlate very well with standard definitions. Application of the model is illustrated by umbrella sampling simulations of propeller twisting within AT/AT, AA/TT, and AC/GT steps and their uracil analogs. Strong correlations are found between propeller twisting and a number of conformational parameters, including buckle, opening, BI/BII backbone configuration, and sugar puckering. The thymine methyl group is observed to notably alter the local conformational free energy landscape, with effects within and directly upstream of the thymine containing base pair.



the flexibilities of normal,18 damaged,19 and methylated DNA,18,20 and into DNA bending21−23 and its role in binding.24,25 Here we will describe an efficient and robust way to calculate base parameters and their spatial derivatives for use in biasing simulations. This method prevents the expense of least-squares fitting procedures such as ideal base pair overlays and base pair plane fitting which are part of the standard base parameter definitions.17 To accomplish this, straightforward vector algebra computation of atomic coordinates is used, without reference to idealized base pairs. The calculations are based on the FREEHELIX16 method but differ in three important aspects: they do not require a least-square fit for the vectors normal to the plane of the base pair, they use weights for the centers, and they employ a linear transformation to optimize agreement with the standard definitions. To illustrate the method, a detailed study of propeller twisting in AA/TT, AT/AT, and AC/GT steps is presented. Propeller twisting describes the twisting of bases about their long axes within a base pair (Figure 1A), and is particularly important for DNA structure and flexibility. Early X-ray crystallography showed that most bases are propeller twisted at negative angles;26 a surprising finding at a time when base pairs were thought to be planar. Statistical studies of DNA structures have shown that propeller twisting is highly sequence dependent, with particularly strong propeller twisting for AA/ TT and AT/AT steps.27 AG steps were excluded from this study due to lack of data.27 Since propeller twisting is closely correlated to the slide step parameter, which describes the

INTRODUCTION The flexibility of DNA is important for protein−DNA recognition, and DNA is often bent, kinked, or otherwise deformed in protein−DNA complexes.1,2 These deformations play a significant role in protein−DNA binding thermodynamics3,4 and often serve biological purposes, for example in DNA packaging,5 looping,6 and repair.7 A quantitative understanding of the local and global flexibility of DNA as a function of sequence, and the energetic costs of DNA deformations would help in explaining how recognition and binding are achieved, and aid the mechanistic understanding of these processes. In principle, these insights can be readily obtained from molecular dynamics (MD) simulations,2,8,9 but only a limited amount of time and configurational space can be sampled in normal MD simulations. To enlarge the amount of sampled space without increasing the overall simulation time, biasing methods can be used to enhance sampling along preselected order parameters.10−12 By employing reweighting techniques,13−15 thermodynamic properties can be readily calculated from such biasing simulations, and flexibilities and energetic aspects of the deformations can be quantified through calculated potentials of mean force. Since DNA bases are planar and rigid, the shape of DNA is well-described by the relative orientation of bases within a base pair and the relative orientation of adjacent bases.16,17 Of these, the base parameters describe the rigid body rotations and translations of bases within a base pair, and the step parameters describe the rigid body rotations and translations of adjacent base pairs.17 Step parameters have previously served as order parameters in biasing simulations of DNA,18 yielding important insights into © XXXX American Chemical Society

Received: October 27, 2016 Published: December 15, 2016 A

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Journal of Chemical Information and Modeling

Article



METHODS Base Pair Definitions. Base pair parameters are calculated using three atomic coordinates per base. For purines these are the C8, C6, and N3 atoms, while for pyrimidines the C6, C4, and C2 atoms are used. This choice corresponds to our previously developed scheme to calculate step parameters.18 In the following, the main strand and its bases will be denoted as strand A and base a, respectively, while strand B and base b will be used for the complementary strand. We will first define the local axes (Pa, Pb, La, and Lb) associated with the bases, and then the parameter axes (P, L, and S) associated with each base pair; both are needed for the base pair parameter calculation. The axes are illustrated in Figure 1C. The three base atoms define the plane of the base, which in turn determines the local Pa and Pb axes that are perpendicular to the plane. Local perpendicular axes are calculated by taking the cross product between two vectors forming the base plane. For purines these vectors are from N3 to C6 and from N3 to C8, while for pyrimidines the vectors were from C2 to C4 and from C2 to C6. The results of these cross products are then normalized and given the proper sense to ensure 5′ to 3′ directionality along the main strand. The perpendicular axis (P) for the base pair is defined as

Figure 1. Base pair parameters. (A) Translational and rotational base parameters, with arrows pointing in direction of positive sense. The primary strand base is shown in light gray; the minor groove faces the reader, and the stagger axis points 5′ to 3′ relative to the primary strand. (B) Schematic representation of thymine−methyl clash occurring in the AX/XT step. (C) C:G base pair overlaid with nonnormalized local axes for each base (Pa, Pb, La, and Lb) and the parameter axes of the base pair (P, L, and S). Atoms C6 and C8 on the far left and right of the base pair correspond to edge atoms of base a (Ea) and base b (Eb), respectively. Ca and Cb indicate the center locations used in current model. (D) AT and CG base pairs with preoptimized (dark gray) and optimized (light gray) base center locations. Trio atom base representations are in black. (E) Schematic representation of the effect that negative propeller on the 6th base pair has on buckle of the adjacent base pairs. 5th and 7th base pairs buckle away from the 6th corresponding to the positive cup in the 5th and 6th step.

P=

Pa + Pb ∥Pa + Pb∥

(1)

This normalized sum of perpendicular base plane vectors gives equal weight to the base orientations on each strand. Local long axes are defined for each base along a line containing its “center”, C, and edge atom, E. Definitions for base centers were chosen to maximize agreement with the formal base step definitions17 as implemented in 3DNA30 and will be discussed below. Edge atoms are C8 and C6 for purines and pyrimidines, respectively. Local long axes then follow from La =

Ea − Ca , ∥Ea − Ca∥

and

Lb =

Cb − Eb ∥E b − C b∥

(2)

The local long axes point from base b to base a. The long axis (L) is constructed from the local axes and weights each equally:

relative translation of an adjacent base pair along its long axis, there is also a strong correlation between propeller twisting and base rigidity.27 Zero and mildly negative propeller angles (≥−10°) are thought to allow more movement of the base steps, while strongly negative propeller angles (≤−15°) would lock the base steps in place.27,28 The strongly negative propeller for AA/TT and AT/AT steps are thought to be due to thymine methyl groups, which avoid steric clashes with the sugar phosphate backbone by propeller twisting (Figure 1B).29 A similar steric clash happens for the AC step, although observed propeller angles are much smaller for AC/GT than AA/TT and AT/AT steps.27−29 To further investigate the strongly negative propeller angle in AA/TT and AT/AT steps, its relation to methyl clashing, and its effect on other structural features, we present free energy simulations of propeller twisting in DNA strands containing central AC/GT, AA/TT, and AT/AT steps, as well as simulations on their methyl-less analogs with central AC/GU, AA/UU, and AU/AU steps. AG/CT steps were excluded because of the lack of experimental data to which to compare the simulations.27 Detailed analyses of the free energy cost of the deformations will be presented to explain the origins of propeller twisting and its geometrical effects in the studied steps.

L=

La + L b ∥La + L b∥

(3)

The S axis completes the right-handed reference frame and points from the minor groove to the major groove: S=

L×P ∥L × P∥

(4)

The translational parameters are calculated as projections of total displacement between the base centers (D) onto each of the parameter axes. Total displacement is oriented from base b to base a and calculated as

D = Ca − C b

(5)

The intermediate translational parameters are then defined as shear′ = D·S

(6)

stretch′ = D·L

(7)

stagger′ = D·P

(8)

A prime is used to indicate that these are intermediate values that will be further transformed to maximize the agreement B

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling Table 1. Optimal atomic weights and linear transformation constants purine atom weights A−T C−G

β α

C8

C6

−0.151331 −0.163626

1.01405 1.00573

pyrimidine atom weights N3

C6

0.137291 −0.932684 0.157906 −0.926932 Linear Transformation Constants

C4

C2

1.15968 1.23805

0.773008 0.688895

shear

stretch

stagger

buckle

propeller

opening

0.9936 −0.0001

1.0052 −2.0543

1.0094 −0.0034

1.0030 −0.0026

1.0152 0.1603

0.9921 −1.3628

different parameter values via umbrella sampling32 with the following harmonic biasing potential:

with 3DNA. The intermediate rotational parameters are obtained from projections of the local base axes onto the planes perpendicular to the parameter axes. These effectively correspond to the dihedral angle rotations of the local axes about the parameter axes: ⎛ P ×S P ×S ⎞ buckle′ = arccos⎜ a · b ⎟ ⎝ ∥Pa × S∥ ∥Pb × S∥ ⎠

⎛ P ×L P ×L ⎞ propeller′ = arccos⎜ a · b ⎟ ⎝ ∥Pa × L∥ ∥Pb × L∥ ⎠ ⎛ L ×P L ×P ⎞ · b opening′ = arccos⎜ a ⎟ ⎝ ∥La × P∥ ∥L b × P∥ ⎠

k (θ − θ0)2 (12) 2 where θ is the instantaneous parameter value calculated during simulation, θ0 the desired value, and k the force constant. A total of six double stranded DNA sequences were simulated, which differed in the identity of the central step (i.e., the central sixth and seventh base pairs); the sequences will be named after these (Table 2). The initial structure for the AT/AT sequence E bias =

(9)

(10)

Table 2. Simulated Sequences and Naming Conventiona name

sequence

(11)

AT/AT

In these definitions, positive buckle is associated with the bases forming a cup that points upstream relative the primary strand, positive propeller is associated with rotation of the major groove sides of both bases toward their respective upstream directions, and positive opening is associated with an in-base pair plane rotation of the bases such that the major groove sides on both bases rotate away from each other (Figure 1A). To distinguish propeller twisting from base step twisting, we will use the terms propeller and propeller twisting interchangeably. Center Definitions and Linear Transformation. A logical choice for the center of the base (Ca and Cb) would be the geometric center of the three selected base atoms. However, since all six parameters are implicitly a function of the center vectors, a weighted average of atomic positions to define the base center considerably improved the agreement with 3DNA. The simplex downhill method31 was chosen to find the atomic weights which maximized the sum of correlations between the model and 3DNA definitions across all six parameters. The weight vector was expanded to treat A−T and C−G base pairs separately; thus a total of 12 weights (6 for each base pair type) were optimized. The training set for this optimization consisted of 3004 base pair coordinates obtained from a survey of 235 experimental protein−DNA complexes in the protein data bank, which was also used for the development of step parameters.18 The optimized center definitions are shown in Table 1, and illustrated in Figure 1D. While the center definitions maximized the correlation with 3DNA (which is invariant upon linear transformation of the training data), a linear transformation was used to further minimize absolute deviations from 3DNA values: parameter = β·parameter′ + α. The optimized values of α and β are also shown in Table 1. Umbrella Sampling Simulations. The above parameter definitions and their analytical derivatives were used to bias DNA base pairs toward conformations corresponding to

AU/AU

5′-CGCGAATTCGCG-3′ 3′-GCGCTTAAGCGC-5′ 5′-CGCGAAUTCGCG-3′ 3′-GCGCTUAAGCGC-5′ 5′-CGCGAAATCGCG-3′ 3′-GCGCTTTAGCGC-5′ 5′-CGCGAAATCGCG-3′ 3′-GCGCTUUAGCGC-5′ 5′-CGCGAACTCGCG-3′ 3′-GCGCTTGAGCGC-5′ 5′-CGCGAACTCGCG-3′ 3′-GCGCTUGAGCGC-5′

AA/TT AA/UU AC/GT AC/GU a

The central step consisting of the 6th and 7th base pair is indicated in bold.

was taken from the crystal structure of the Dickerson dodecamer (pdb entry 4C6439); coordinates for the sixth and seventh base pair of the other sequences were obtained from 3DNA30 while retaining the same 3DNA parameters of the original system. The analogs were solvated in a rectangular box of 0.19 M KCl in TIP3P33 water with 15 Å margins. After equilibration of the solvent for 0.2 ns while keeping the DNA fixed, the systems were heated from 150 to 300 K over a period of 3 ns in increments of 30 K while harmonically restraining the heavy DNA atoms with a force constant of 1 kcal mol−1 Å−2. Restraints were then gradually reduced to 0.5, 0.25, and 0.1 kcal mol−1 Å −2 in successive 1 ns simulations at 300 K. This was followed by a 20 ns equilibration simulation at 300 K without any restraints. After equilibration, two series of one-dimensional umbrella simulations at 300 K were run for each sequence, in which either the propeller angle of the sixth or seventh base pair was biased. Biasing was performed in the range of −40° to +10° in increments of 5° using 20 ns of simulation per umbrella window. All simulations were performed with an in-house modified version of the CHARMM34 program and the CHARMM force field.35 Free energy surfaces at 300 K and associated error bars were calculated from MBAR;15 as shown C

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling in the figures, errors varied between 0.1 and 0.5 kcal/mol. Free energy curves shown for the fifth and sixth base pair were calculated from biasing simulations on the sixth base pair, while free energy curves shown for the seventh and eighth base pair were calculated from biasing simulations on the seventh base pair. The leapfrog integrator was used with a time step of 2 fs, snapshots were saved every 1 ps, SHAKE36 was applied to all covalent bonds involving hydrogen atoms, the temperature was controlled with the Nosé−Hoover thermostat,37 and longrange electrostatic interactions were handled by the particle mesh Ewald method.38

secondary strand (i.e., having purine vs pyrimidine on strand A) amounts to simply negating the sign of shear, the two separated distributions of the intermediate model were brought to overlap continuously in the optimized model (Figure 2). For all systems, methylation of the uracils within the step consistently destabilized positive propeller angles, thereby stabilizing negative propeller angles (Figure 3 graphs for sixth



RESULTS AND DISCUSSION While R2 values between the intermediate base parameters (eqs 6−11) and those of 3DNA were all above 0.84, correlations were significantly improved by using weighted centers and linear transformations. These increased all R2 values to 0.98 or above, with the strongest improvement for the translational parameters. Shear, in particular, showed a clear bias in the intermediate model depending on whether purine or pyrimidine was present on the primary strand. This bias was reflected by the presence of two bands in the intermediate shear model graph (shown in inset of Figure 2). This problem was

Figure 3. Free energy curves and error bars as a function of propeller twisting on base pairs 5 through 8. Curves in black for thymine systems, and curves in green for uracil systems.

and seventh base pair). This stabilization was subtle, since the difference in free energy between the uracil and thymine systems was less than 1.0 kcal for most propeller values. In the AC/GT system the thymine methyl of the central step only occurred in the sixth base pair and not the seventh, yet the stabilization was slightly more pronounced in the seventh base pair and not existent in the fifth (Figure 3). Considering that the central step methyl group of this sequence is on strand B, this suggests that the thymine methyl group has a predominantly upstream effect on the local DNA conformation. The effect wanes quickly, however, as no influence was seen in the propeller values of the eighth base pair (Figure 3). It would then be expected that propeller population shifts between the AU/AU and AT/AT steps would be slightly stronger, since in both base pairs, the replacement of uracils with thymines corresponds to methylation within the current base pair as well as methylation immediately downstream on the other strand. Figure 3 shows that this is indeed the case for the sixth base pair; the strongest energetic effect of methylation was on the sixth base pair of AT/AT system. Despite the dodecamer’s palindromic structure, however, the seventh base pair of the AT/AT system exhibits a smaller energetic effect due to methylation. The free energy curves of the AA/TT and AA/UU systems further illustrate the upstream effect of uracil methylation. AA/TT has two central step thymine methyl groups on the sixth and seventh base pair of the same strand, and its population shift is greater on the seventh base pair. The tilting of the free energy curves to favor more negative propeller angles upon methylation of uracil supports Hunter’s thymine methyl clash hypothesis.29 This states that the source of strongly negative propeller values in AX/XT steps is due to the steric clashing between the methyl group and the 5′ neighboring sugar (Figure 1B), which can be avoided by several conformational maneuvers within the dinucleotide step. One of the proposed maneuvers to avoid the clash is a more negative propeller angle in either base pair.29 Our results above indicate, however, that these modes of relieving the clash are not equal, and that a more negative propeller in the base pair upstream of

Figure 2. Correlation of model parameters with 3DNA. The inset graphs for each of the six parameters show correlations of the intermediate model values (that is, definitions before linear transformation and optimization of atomic weights) with 3DNA. The larger graphs show the correlations of the final model values (that is, after using optimal weights and applying linear transformations) with 3DNA. Unoptimized and (optimized) R2 values: shear 0.887 [left] 0.883 [right] (0.995), stretch 0.845 (0.980), stagger 0.906 (0.994), buckle 0.980 (0.995), propeller 0.978 (0.995), and opening 0.931 (0.993). Two correlations were calculated for the intermediate shear values due to the presence of systematic bias, which was resolved in the final model. The left and right bands for intermediate shear correspond to base pairs with purine or pyrimidine on the primary strand, respectively.

fixed by picking the base center locations such that they are approximately collinear with the C8 purine and C6 pyrimidine atoms, and approximately equidistant to their respective edge atoms at vanishing 3DNA base pair parameter values. In this way, when the 3DNA shear value is 0 Å, the long parameter axis (L) will be collinear with the displacement vector (D), which will in turn be perpendicular to the short axis (S), yielding a vanishing shear value in the model as desired. Such center placement was achieved naturally in the optimization procedure. Since exchanging the labels of the primary and D

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

and twist are more important for relieving the methyl clash than buckle. The muted importance of buckling for relief of the methyl clash was further confirmed by correlation trends between buckling and propeller twisting, which showed that buckling is more important for uracil than for thymine containing systems. All sequences showed strong correlations between the propeller on either base pair and buckling within the central step; Figure 1E shows a schematic of the effect that propeller twisting has on the buckle of adjacent base pairs. Such correlations have previously been described for CG/CG steps,40 for which strongly negative propeller causes minor groove interstrand clashes which can be alleviated by buckling movements in the positive cup direction. Here, a similar stabilization was seen for AT/AT, AA/TT, and AC/GT steps. The buckle by propeller free energy curves of Figure 6 show a clear tendency to buckle

the thymine is more effective than a more negative propeller on the base pair that contains the thymine. Additional conformational changes suggested to relieve the clash are the reduction of twist, and the introduction of buckle in the direction of a negative cup in any base pair (Figure 1E).29 Here, this negative cup would correspond to more negative buckle values in the sixth base pair and more positive values in the seventh. If a change in twist or buckle can avoid the thymine methyl clash, the free energy of the thymine containing system in these coordinates should be higher than that for the uracil containing system. This energy shift is then expected because removal of the methyl group on thymine would stabilize the otherwise sterically hindered conformation. For twist, this was indeed observed, confirming that a reduced twist relieves the clash (Figure 4). Figure 4 shows that the

Figure 4. Free energy curves and error bars as a function of shift, slide, twist, and tilt of the middle step. Curves in black for thymine systems, and curves in green for uracil systems.

Figure 6. Free energy curves and error bars as a function of buckle on each base pair, shown at particular values of propeller on the other base pair within the step. For example, “6th bp” indicates buckle on the 6th base pair as a function of propeller on the 7th base pair, while colors indicate the particular value of propeller, with propeller between −10 and 0° shown in black, between −20 and −10° in green, between −30 and −20° in light green, and between −40 and −30° in yellow. Arrows indicate the direction of increasingly negative propeller. The two columns on the far-right show the moving averages of buckle as propeller on the other base pair changes; curves in black for thymine systems, and curves in green for uracil systems.

thymine containing systems had higher free energy than the uracil containing systems at large positive twist, by values up to 2 kcal/mol. The greatest destabilization of the thymine was seen in the AT/AT step, due to the sugar−methyl clashes that occurred on both strands at high twist. For buckle, however, the differences in free energy between the uracil and thymine systems were minimal (Figure 5). While the AT/AT step

away from the other base pair within the step as the neighboring base pair adopts a more negative propeller angle. While the trends were similar, average buckle calculations indicate that the thymine systems realize lower positive cup than the uracil systems across all propeller values (the two farright columns of Figure 6). Thus, the thymine systems used less buckling to stabilize the minor groove interstrand clashes that were caused by strongly negative propeller angles than the uracil containing systems. This lower buckling movement reflects the trade-off present in preventing both the same strand clash at mildly negative propeller and positive cup, and the interstrand clash at strongly negative propeller and negative cup. The end result is that systems with thymine realize stronger negative propeller angles and have slightly lower positive cup conformations than systems with uracil due to the relatively greater importance of avoiding the same-strand thymine methyl clash. On the other hand, uracil systems consistently have greater positive cup buckling movements across all propeller angles of the neighboring base pair. An implication of this trend is that on average, strands with uracil instead of thymine expose their interior more through buckle motions. The only exception was found for buckling of the

Figure 5. Free energy curves and error bars as a function of buckle of base pairs 6 and 7. Curves in black for thymine systems, and curves in green for uracil systems.

showed a slight destabilization for more positive buckle in the sixth base pair and more negative in the seventh base pair compared to its uracil analogue, the AA/TT step only had a slight destabilization for its seventh base pair, and no destabilization was observed for the AC/GT step. These observations indicate that for the studied sequences, propeller E

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling seventh base pair as a function of the sixth base pair propeller angle in the AC/GT system, where the methyl dependence was unclear. This irregularity can be explained by the known resistance of guanine to buckle movements.40 Opening of the central base pairs toward the major groove was strongly correlated with the propeller angles (Figure 7).

Figure 8. Free energy curves and error bars as a function of sugar puckering on the 7th base of strand B, shown at particular values of propeller of both base pairs within the step. For example, “6th bp” indicates sugar puckering on the 7th base on strand B as a function of propeller on the 6th base pair. Colors indicate the particular value of propeller, with propeller between −10 and 0° shown in black, between −20 and −10° in green, between −30 and −20° in light green, and between −40 and −30° in yellow.

Figure 7. Free energy curves and error bars as a function of opening of each base pair, shown at particular values of propeller on the same base pair within the step. For example, “6th bp” indicates opening on the 6th base pair as a function of propeller on the 6th base pair, while colors indicate the particular value of propeller, with propeller between −10 and 0° shown in black, between −20 and −10° in green, between −30 and −20° in light green, and between −40 and −30° in yellow.

For the AT/AT, AU/AU, AA/TT, and AA/UU systems, the position of the minimum did not shift, but more negative propeller angles reduced the free energy cost of positive opening, establishing an asymmetric free energy landscape. For AC/GT and AC/GU a shift in the position of the minimum was observed at strongly negative propeller values, as well as a reduction in the cost of positive opening. At smaller propeller angles, the CG base pair is seen to have greater curvature than that of AT base pairs which may be attributed to the greater number of hydrogen bonds which must be broken upon opening. The stabilization of higher opening with increasing propeller magnitude can be rationalized by the loss of base pair hydrogen bonding which accompanies out of plane base pair rotation. This loss of hydrogen bonds facilitates increasing separation of the bases within the base pair. The stabilization of opening is generally enhanced when the methyl group is present. Opening toward the minor groove does not appear to be affected by propeller motions. Since opening of base pairs can facilitate DNA bending, localized propeller movements may initiate DNA bending. More negative propeller values were found to stabilize the C3′ endo sugar puckering conformation within the same base pair by 1−2 kcal/mol (illustrated in the third and fourth column of Figure 8 for the seventh base pair). In contrast, the C3′ endo conformation was generally destabilized by more negative propeller twisting on base pair downstream (first and second column of Figure 8). Thus, puckering of the base is affected by both the propeller angle within the same base pair and the adjacent downstream base pair in opposite senses. Steps with uracil tend to create a shallow basin for the intermediate O1′ endo conformation (second and third column of Figure 9). The effect of methylation on sugar puckering has a predominantly upstream influence, with almost no effects on the fifth base pair. The C3′ endo conformation is more stable in the presence of the downstream methyl group because the C2′

Figure 9. Free energy plots and error bars for sugar puckering of strand B for base pairs 5 through 8. Curves in black for thymine systems, and curves in green for uracil systems.

atom flips to the opposite side of the sugar plane and reduces the effective twist within the step which would otherwise cause the steric clash between the C2′ atom and the methyl group (Figure 10).

Figure 10. Strand B middle step bases illustrating C3′ and C2′ sugar puckering conformations on the 7th base. Thymine C5 methyl atom on 6th base and C2′ atom on 7th base shown in black.

The BI backbone conformation is also stabilized by the presence of a thymine methyl group immediately downstream (Figure 11). This is supported by the fact that only in the AA/ TT system, which has two C5 methyl groups on strand B, the BI conformation for both the seventh and eighth base pairs were stabilized, while for the AT/AT and AC/GT systems BI stabilization only occurred for the seventh step. Similar to sugar puckering, free energy curves for the BI/BII backbone conformation show opposite correlations between the two F

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling

Such simulations can be used to quantify the energetic effect of base pair deformations in bare, protein or ligand-bound DNA. Applications to propeller twisting in AT/AT, AA/TT, and AC/GT steps and their uracil containing analogs show strong correlations between the propeller angle and several other conformational parameters. Conformational populations and preferences are strongly affected by propeller twisting and the presence of the thymine methyl group. The methyl group has a predominantly upstream influence on local DNA conformation. It was found to stabilize strongly negative propeller twisting within the thymine containing base pair and also in the upstream base pair, but downstream effects were negligible. Despite the positive correlation between propeller angles of adjacent base pairs, the free energy analysis demonstrated that the energetic effects of propeller twisting are asymmetric and depend on which base pair within the step is being propeller twisted. These differences are particularly apparent for propeller induced conformational changes in the DNA backbone. Increased propeller motions in turn stabilized sugar puckering and BI/BII minima in opposing directions depending on which base pair within the step was distorted. The methyl group was found to stabilize the north sugar puckering conformation due to its avoidance of clashing with the 5′ neighboring C2′ atom and was found to stabilize the BI backbone conformation as well. The free energy analyses qualitatively agree with Hunter’s proposed maneuvers to avoid the thymine methyl clash,29 but also provide quantitative insights as to the relative importance of the conformational stabilizations. Twist and propeller relieve the clash more effectively than buckle. Propeller proves to have strong correlations with buckle and opening, the two other base pair rotational parameters. Buckle movements corresponding to positive cup within the step were found to stabilize adjacent propeller rotations, with the thymine methyl group decreasing the buckling. While it is likely that changes in buckle toward the positive cup direction are driven by the enthalpic stabilization of avoiding the minor groove interstrand clash, the opening of the base pair as propeller becomes more negative may possibly be driven by favorable enthalpic interactions with the solvent or by the entropic gain associated with sampling more opening angle values. Both propeller rotation and the thymine methyl group alter conformational backbone minima populations and the correlation between the two could possibly allow for a concerted mechanism for backbone conformational regulation driven by methylation and executed via propeller-twisting.

Figure 11. Free energy curves and error bars as a function of ε−ζ of strand B for base pairs 5 though 8. Curves in black for thymine systems, and curves in green for uracil systems.

adjacent propeller angles (Figure 12). More negative propeller angles of the base pair upstream of the O3′ atom tend to

Figure 12. Free energy curves and error bars as a function of ε−ζ on the 7th base of strand B shown at particular values of propeller of both base pairs within the step. For example, “6th bp” indicates ε−ζ of the 7th base pair on strand B as a function of propeller on the 6th base pair, while colors indicate the particular value of propeller. Propeller between −10 and 0° shown in black, between −20 and −10° in green, between −30 and −20° in light green, and between −40 and −30° in yellow.

stabilize BI, while higher propeller downstream of O3′ tend to stabilize the BII conformation. Interestingly, BI backbone stabilization has also been seen in C5 methylation of cytosine.41 Comparison of the free energy curves for shift, slide, tilt, and twist for AT/AT and AU/AU consistently show a stabilization and no shifts in the position of the minimum upon methylation. In contrast, the AA/TT, AA/UU, AC/GT, and AC/GU systems show shifts in position of the minimum upon methylation (Figure 4). This may be due to the “unbalanced” methylation on both strands of the DNA: the AT/AT step is methylated on both strands, while AA/TT and AC/GT are methylated on one strand. If so, this may suggest that selective methylation of uracil bases can be made to intentionally stabilize or shift conformational populations of RNA.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: +1-813-974-8762. ORCID

Arjan van der Vaart: 0000-0002-8950-1850 Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS Computer time was provided by USF Research Computing, sponsored in part by NSF MRI CHE-1531590.



CONCLUSION The presented model allows for efficient umbrella sampling simulations of base pair parameters. This enables much wider sampling of base pair parameters than in normal, unbiased DNA, and accurate calculation of potentials of mean force.



REFERENCES

(1) Peters, J. P.; Maher, L. J. DNA Curvature and Flexibility in Vitro and in Vivo. Q. Rev. Biophys. 2010, 43, 23−63.

G

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX

Article

Journal of Chemical Information and Modeling (2) van der Vaart, A. Coupled Binding-Bending-Folding: The Complex Conformational Dynamics of Protein-DNA Binding Studied by Atomistic Molecular Dynamics Simulations. Biochim. Biophys. Acta, Gen. Subj. 2015, 1850, 1091−1098. (3) Jen-Jacobson, L.; Engler, L. E.; Jacobson, L. A. Structural and Thermodynamic Strategies for Site-Specific DNA Binding Proteins. Structure 2000, 8, 1015−1023. (4) Hogan, M.; Austin, R. Importance of DNA Stiffness in Protein− DNA Binding Specificity. Nature 1987, 329, 263−266. (5) Luijsterburg, M. S.; White, M. F.; van Driel, R.; Dame, R. T. The Major Architects of Chromatin: Architectural Proteins in Bacteria, Archaea and Eukaryotes. Crit. Rev. Biochem. Mol. Biol. 2008, 43, 393− 418. (6) Matthews, K. S. DNA Looping. Microbiol. Rev. 1992, 56, 123− 136. (7) Mol, C. D.; Parikh, S. S.; Putnam, C. D.; Lo, T. P.; Tainer, J. A. DNA Repair Mechanisms for the Recognition and Removal of Damaged DNA Bases. Annu. Rev. Biophys. Biomol. Struct. 1999, 28, 101−128. (8) Cheatham, T. E.; Case, D. A. Twenty-Five Years of Nucleic Acid Simulations. Biopolymers 2013, 99, 969−977. (9) Orozco, M.; Noy, A.; Perez, A. Recent Advances in the Study of Nucleic Acid Flexibility by Molecular Dynamics. Curr. Opin. Struct. Biol. 2008, 18, 185−193. (10) Spiriti, J.; Kamberaj, H.; Van Der Vaart, A. Development and Application of Enhanced Sampling Techniques to Simulate the LongTime Scale Dynamics of Biomolecular Systems. Int. J. Quantum Chem. 2012, 112, 33−43. (11) Mitsutake, A.; Mori, Y.; Okamoto, Y. Enhanced Sampling Algorithms. Methods Mol. Biol. 2013, 924, 153−195. (12) Zuckerman, D. M.; Rees, D. C.; Dill, K. A.; Williamson, J. R. Equilibrium Sampling in Biomolecular Simulations. Annu. Rev. Biophys. 2011, 40, 41−62. (13) Ferrenberg, A. M.; Swendsen, R. H. Optimized Monte Carlo Data Analysis. Phys. Rev. Lett. 1989, 63, 1195−8. (14) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.; Rosenberg, J. M. The Weighted Histogram Analysis Method for FreeEnergy Calculations on Biomolecules. I. The Method. J. Comput. Chem. 1992, 13, 1011−1021. (15) Shirts, M. R.; Chodera, J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129, 124105. (16) Dickerson, R. E. DNA Bending: The Prevalence of Kinkiness and the Virtues of Normality. Nucleic Acids Res. 1998, 26, 1906−1926. (17) Olson, W. K.; Bansal, M.; Burley, S. K.; Dickerson, R. E.; Gerstein, M.; Harvey, S. C.; Heinemann, U.; Lu, X. J.; Neidle, S.; Shakked, Z.; Sklenar, H.; Suzuki, M.; Tung, C. S.; Westhof, E.; Wolberger, C.; Berman, H. M. A Standard Reference Frame for the Description of Nucleic Acid Base-Pair Geometry. J. Mol. Biol. 2001, 313, 229−237. (18) Karolak, A.; van der Vaart, A. Enhanced Sampling Simulations of DNA Step Parameters. J. Comput. Chem. 2014, 35, 2297−2304. (19) Karolak, A.; van der Vaart, A. Molecular Dynamics Simulations of 5-Hydroxycytosine Damaged DNA. J. Phys. Chem. B 2016, 120, 42− 48. (20) Karolak, A.; van der Vaart, A. BII Stability and Base Step Flexibility of N-6-adenine Methylated GATC Motifs. Biophys. Chem. 2015, 203, 22−27. (21) Spiriti, J.; Kamberaj, H.; De Graff, A.; Thorpe, M. F.; van der Vaart, A. Adaptive Umbrella Sampling on Roll Angles Indicates that DNA Bending through Large Angles is Aided by Ionic Screening. J. Chem. Theory Comput. 2012, 8, 2145−2156. (22) Spiriti, J.; van der Vaart, A. DNA Bending through Roll Angles Is Independent of Adjacent Base Pairs. J. Phys. Chem. Lett. 2012, 3, 3029−3033. (23) Ma, N.; van der Vaart, A. Anisotropy of B-DNA Groove Bending. J. Am. Chem. Soc. 2016, 138, 9951−9958. (24) Spiriti, J.; van der Vaart, A. DNA Binding and Bending by Sac7d is Stepwise. ChemBioChem 2013, 14, 1434−1437.

(25) Chung, Y.-H.; van der Vaart, A. RevErba Preferentially Deforms DNA by Induced Fit. ChemBioChem 2014, 15, 643−646. (26) Drew, H. R.; Wing, R. M.; Takano, T.; Broka, C.; Tanaka, S.; Itakura, K.; Dickerson, R. E. Structure of a B-DNA Dodecamer Conformation and Dynamics. 1. Proc. Natl. Acad. Sci. U. S. A. 1981, 78, 2179−2183. (27) El Hassan, M. A.; Calladine, C. R. Propeller-Twisting of BasePairs and the Conformational Mobility of Dinucleotide Steps in DNA. J. Mol. Biol. 1996, 259, 95−103. (28) ElHassan, M. A.; Calladine, C. R. Conformational Characteristics of DNA: Empirical Classifications and a Hypothesis for the Conformational Behaviour of Dinucleotide Steps. Philos. Trans. R. Soc., A 1997, 355, 43−100. (29) Hunter, C. A. Sequence-Dependent DNA-Structure - The Role of Base Stacking Interactions. J. Mol. Biol. 1993, 230, 1025−1054. (30) Lu, X.-J.; Olson, W. K. 3DNA: A Versatile, Integrated Software System for the Analysis, Rebuilding and Visualization of ThreeDimensional Nucleic-Acid Structures. Nat. Protoc. 2008, 3, 1213− 1227. (31) Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P. Numerical Recipes 3rd ed.: The Art of Scientific Computing. Cambridge university press, 2007. (32) Torrie, G. M.; Valleau, J. P. Non-Physical Sampling Distributions in Monte-Carlo Free-Energy Estimation - Umbrella Sampling. J. Comput. Phys. 1977, 23, 187−199. (33) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926−35. (34) Brooks, B. R.; Brooks, C. L., III; MacKerell, A. D., Jr.; Nilsson, L.; Petrella, R. J.; Roux, B.; Won, Y.; Archontis, G.; Bartels, C.; Boresch, S.; Caflisch, A.; Caves, L.; Cui, Q.; Dinner, A. R.; Feig, M.; Fischer, S.; Gao, J.; Hodoscek, M.; Im, W.; Kuczera, K.; Lazaridis, T.; Ma, J.; Ovchinnikov, V.; Paci, E.; Pastor, R. W.; Post, C. B.; Pu, J. Z.; Schaefer, M.; Tidor, B.; Venable, R. M.; Woodcock, H. L.; Wu, X.; Yang, W.; York, D. M.; Karplus, M. CHARMM: The Biomolecular Simulation Program. J. Comput. Chem. 2009, 30, 1545−1614. (35) Hart, K.; Foloppe, N.; Baker, C. M.; Denning, E. J.; Nilsson, L.; MacKerell, A. D. Optimization of the CHARMM Additive Force Field for DNA: Improved Treatment of the BI/BII Conformational Equilibrium. J. Chem. Theory Comput. 2012, 8, 348−362. (36) Ryckaert, J.-P.; Ciccotti, G.; Berendsen, H. J. C. Numerical Integration of the Cartesian Equations of Motion of a System with Constraints: Molecular dynamics of n-Alkanes. J. Comput. Phys. 1977, 23, 327−341. (37) Martyna, G. J.; Klein, M. L.; Tuckerman, M. Nose-Hoover Chains - the Canonical Ensemble Via Continuous Dynamics. J. Chem. Phys. 1992, 97, 2635−2643. (38) Darden, T.; York, D.; Pedersen, L. Particle Mesh Ewald: An N log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089−10092. (39) Lercher, L.; McDonough, M. A.; El-Sagheer, A. H.; Thalhammer, A.; Kriaucionis, S.; Brown, T.; Schofield, C. J. Structural Insights Into How 5-Hydroxymethylation Influences Transcription Factor Binding. Chem. Commun. 2014, 50, 1794−1796. (40) Poner, J.; Kypr, J. Base Pair Buckling Can Eliminate the Interstrand Purine Clash at the CpG Steps in B-DNA Caused by the Base Pair Propeller Twisting. J. Biomol. Struct. Dyn. 1990, 7, 1211− 1220. (41) Rauch, C.; Trieb, M.; Wellenzohn, B.; Loferer, M.; Voegele, A.; Wibowo, F. R.; Liedl, K. R. C5-Methylation of Cytosine in B-DNA Thermodynamically and Kinetically Stabilizes BI. J. Am. Chem. Soc. 2003, 125, 14990−14991.

H

DOI: 10.1021/acs.jcim.6b00660 J. Chem. Inf. Model. XXXX, XXX, XXX−XXX