J. Phys. Chem. 1995, 99, 17926-17935
17926
Elasticity Theory and Numerical Analysis of DNA Supercoiling: An Application to DNA Looping Timothy P. Westcott, Irwin Tobias, and Wilma K. Olson* Department of Chemistry, Rutgers, The State University of New Jersey, New Brunswick, New Jersey 08903 Received: April 21, 1995; In Final Form: August 4, 1995@
A DNA polymer with 1000 base pairs (bp’s) is modeled as an elastic rod at the base pair level. The elastic theory of rods is used to express the free energy of a double helix that has been deformed by stresses. After including a Lagrange multiplier in the energy expression to constrain the ends of the rod, an expression for the equilibrium configuration of the rod is obtained through the use of the appropriate Euler-Lagrange equations. The resulting set of differential equations is simplified to a set of nonlinear algebraic equations by discretizing the rod into individual elements. Because each element can have its own physical characteristics, base sequence effects can be taken into account. The methods developed are applied to DNA loops that are either nicked ( i e . , torsionally relaxed) or unnicked ( i e . , supercoiled). Small changes in the orientations and displacements of the ends of the loops can cause large changes in the overall configuration of the DNA. The nicked DNA shows a greater propensity to change configuration than the same unnicked DNA. DNA loops that contain regions of intrinsic curvature require less elastic energy for loop formation and facilitate conversion between different looped configurations.
Introduction
In DNA, each strand of the double helix winds about a common axis. This common axis can also coil in a helical fashion, thus forming so-called supercoiled DNA. The supercoiled structure of DNA is related to its biological function; that is, the specific configuration adopted by DNA can facilitate or hinder biological processes such as replication, transcription, and A variety of different manifestations of supercoiling arise in biological systems. One occurs in chromatin when the DNA double helix wraps around a protein core to form a left-handed superhelix. Another occurs when a closed circular plasmid winds about itself to form an interwound structure, in much the same way a twisted telephone cord wraps around itself. In naturally occurring DNA, there exists, on average, one superhelical turn for every sixteen complete turns of the roughly tenfold B-DNA helix.5 Supercoiled DNA is generally underwound with respect to its relaxed counterpart; that is, supercoiling has the effect of untwisting the right-handed DNA double helix. The enormous size of biologically relevant supercoiled DNA currently prohibits conformational study at the atomic level, and until recently, most of what was known about supercoiled DNA was the result of either mathematical arguments based on topology or the approximation of DNA configurations by a selected group of geometric objects.lO,llThese objects, which include line segments, perfect helices, semicircles, and simple sinusoids, were cut and pasted together to form a single DNA shape. More recently, DNA has been modeled as a straight, thin, isotropic, elastic rod with a circular cross section.l2.l3 In this
* To whom correspondence should be addressed. Abstract published in Advance ACS Absrructs, November 1, 1995.
case, the energy of the rod is the sum of the bending energy, Eg, and the twisting energy, ET, as follows:
1 L E, -k E, = - A S K ( s ) ~ds 2
0
+ &(ATw)’ L
(1)
where K ( S ) is the curvature as a function of arc length, L is the total contour length, A and C are the bending and twisting constants, respectively, and ATw is the difference between the total twist and the total twist intrinsic to DNA. This formulation is equivalent to treating DNA as if it were a guitar string. This chain is naturally straight and has no preferential direction of bending. This model can incorporate neither intrinsic bends nor sequence effects, which are known to be important in DNA.I4-l6 Rods that have preferential directions of bending, and thus act less like a guitar string, have been examined.” Using the theory of anisotropic elastic rods,I8 the energy can be written as
+ -CK,*] 1 ds 2
(2)
where A I and A2 are bending constants in the two principal directions of bending, C is the twisting constant defined above, K I and ~2 are the components of curvature in each of the two principal directions of bending, and ~3 is the twist density. The components of curvature, ~1 and ~ 2 are , related to the curvature K by the following equation: K2
= (KI)’ f
(K2)*
(3)
Finally, the inclusion of intrinsic bends or sequence effects leads to the following energy expression which has been used
0022-3654/95/2099-17926$09.00/0 0 1995 American Chemical Society
+.
J. Phys. Chem., Vol. 99, No. 51, 1995 17927
Elastic Theory of DNA Looping previously: 19
Site b
1 5c(K3 - K!)~]ds
(4)
where K: and K: are the components of intrinsic curvature (i.e., the shape that a rod would adopt when it is relaxed in the absence of any forces) in each of the bending directions and K! is the intrinsic twist density (i.e., the twist density that a rod would adopt when it is relaxed in the absence of any forces). The twist density is related to ATw as follows:
ATw =
zh 1
L
(K3
- K!)
ds
Protein 2
Protcin 1
Site
d Protein 1
Site a
L Protein 2
Figure 1. DNA looping event. Protein 1 binds to the DNA at site a, and protein 2 binds at site b. These two proteins also bind to each other to form a complex, and the DNA between sites a and b forms a loop.
The justification for the use of the energy expression in eq 4 in terms of elasticity theory will be treated later in this paper. In terms of DNA, the energy expression in eq 4 is justified because it corresponds well with the data obtained in the Nucleic Acid Database.20,21Since this data is at the base pair level, specific properties of the rod can be treated to that resolution. By choosing an appropriate representation of the DNA curve, it is possible to divide the rod into segments of any given length; for example, each segment could represent one base pair. Each base pair can then have its own intrinsic bend and bending stiffnesses in each principal direction as well as its own intrinsic Figure 2. Local orthonormal coordinate system (dl, d2, d3), imbedded twist and twisting stiffness; thus, base sequence can be explicitly in the plane of each base pair. The vectors dl and d2 span a plane that treated. Although this energy expression substantially improves contains the base pair. The vector dl points in the direction of the major upon energy expressions that have previously been used to treat groove and is parallel to the short axis of the base plane. The vector d2 DNA, it still does not include stretching, which can be neglected is orthogonal to dl and parallel to the long axis of the base plane, with sufficient accuracy in a first-order theory of rods.22*23 pointing in the direction of the leading strand of the double helix. The However, it may be important to include a stretching term vector d3 is perpendicular to the base plane and points along the axis coupled to an untwisting term which is known to occur when of the DNA double helix in the 5’-3’ direction of the leading backbone strand. The tangent to the elastic rod is coincident with dl. a planar drug or a protein side group intercalates between adjacent base Also, cross terms in the energy expression uncrossed states. Interwound DNA supercoils can be divided (e.g., terms of the form KIK~) have been taken to be zero although into two parts: central regions of strand interwinding involving some pairs of components have been seen to be correlated.20 segments that are sequentially distant along the chain and outer Terms treating long-range self-contact are also not considered, regions that are not in close contact with any other part of the since they are not relevant to the systems considered here. DNA but are simply connections to regions of self-contact. Even though the majority of past studies have considered only These outer regions constitute the DNA loops. closed, circular DNA, the theoretical study of DNA tertiary structure is not limited to this case. It is also possible to study Computational Methods linear DNA fragments as long as both ends have been constrained in some manner. Consider the case where we have Description of Rod. Each base pair of the DNA double helix two proteins, protein 1 and protein 2. Protein 1 is bound to the is described by a local orthonormal coordinate system (dl, d2, DNA at site a, and protein 2 is bound at site b. If these two d3). The vectors dl and d2 span a plane that contains the base proteins also bind to each other to form a complex, then the pair, as shown in Figure 2. The vector dl points in the direction DNA between sites a and b forms a loop (Figure 1). This is of the major groove and is parallel to the short axis of the base the picture of looping as it is understood in the ara, gal, lac, plane. The vector d2 is orthogonal to dl and parallel to the and de0 operons and in phage systems where the loop is long axis of the base plane, pointing in the direction of the involved in the regulation of the initiation of tran~cription.~~*~*leading strand of the double helix. The vector d3 is perpenThis looping can be studied in the same fashion as that used dicular to the base plane and points along the axis of the DNA for closed circular DNA as long as the protein complex forms double helix in the 5’-3’direction of the leading backbone strand. a rigid link with the DNA, keeping the ends of the loop in a The tangent to the central axis of the elastic rod is coincident fixed orientation with respect to each with d3. DNA loops are found in chromatin as well as in the The local coordinate system (dl, d2, d3) is related to an interwound form of supercoiled DNA. When circular DNA orthonormal global coordinate frame (el, e2, e3) by the three wraps around a protein core to form chromatin, the part of DNA Euler angles-& v. The global coordinate system is rotated that is not bound to the protein forms the loop. This type of by an angle q5 around the e3 axis until el and e;! line up with the dotted lines in Figure 3. This new coordinate system is rotated loop has been observed to undergo a transition when the H5 by an angle 8 about the new el axis until the new e2 vector protein is added to nucleosomes that are initially devoid of H5.31 lines up with the dashed line in Figure 3 and e3 is coincident The linking number is seen to decrease from about -1.1 to about with d3. Note that we need only the first two Euler angles, 4 -1.6 with the addition of H5. This change is seen as a configurational change of the loop between crossed and and 8, to define d3, which is coincident with the tangent. Thus,
e,
17928 J. Phys. Chem., Vol. 99, No. 51, I995
Westcott et al. Derivation of Equilibrium Equations. The DNA double helix is modeled as a thin, inextensible elastic rod using the elastic energy, E, described in eq 4. We wish to minimize E with the constraint
that is, the fixed vector connecting the beginning of the rod to the end of the rod must be R = Riel R2e2 R3e3. In the case of closed circular DNA, RT = (0, 0, 0), where the superscript T denotes the transpose of the vector. We include this constraint as a Lagrange multiplier (A) in the original expression for the energy as
+
Figure 3. Three Euler angles 8, 4, and q that rotate the global coordinate system (el, e2, e3) into the local coordinate system (dl, d2, d3). First, the global system is rotated about e3 by 4 until it lines up with the dotted lines. Then the new system is rotated about the new el axis by 8 until the new e2 vector reaches the dashed line. Finally, the resulting system is rotated about the new e3 axis by until it is coincident with the local coordinate system.
the trajectory of the DNA axis through space can be defined with just two angles, # and 8. which are a function of arc length. Finally, in order to define the twist state of the DNA, the new coordinate system is rotated by an angle $J about the new e3 axis until the resulting system is coincident with the local coordinate system (dl, d2, d3). In matrix form the combination of these three rotations can be written as
+
where
Evaluation of eqs 6 and 9 yields expressions for K I , ~
2 ~, 3 and ,
d3 in terms of the Euler angles; that is,
+ #’sin 8 sin v = -8’ sin ly + #’ sin 8 cos ly = #‘ 8 + v‘ d3= (sin 8 sin #)el - (sin 8 cos #)e2 + (cos 8)e, K~
K~
where [TIis defined in Chart 1. Three Euler angles for every point along the curve are given by eq 6; thus, the configuration of the DNA double helix is represented by a set of e,@,and $J values. These Euler angles can, in turn, be related to the angular rate of rotation of the local coordinate frame, K,22 by ddi -- K x d , i = 1,2,3 ds
= 8’ cos ly
K3
COS
Inserting these equations along with the global axis components of A = illel L2e2 &e3 in eq 12 yields the following expression:
+
(7)
r = iAl(8’ 2
The angular rate of rotation in eq 7 can be written out in terms of its components as
cos ly
+
+ #’sin
8 sin II, - K!)’
+
1 5A2(-S sin q!~ #‘ sin 8 cos v - K;)’ 1 2
-c(#’COS e where ~1 and ~2 are the curvature components and ~3 is the twist component. The respective magnitudes of the rotations about dl, d2, and d3 are K I , ~ 2 and , ~ 3 .If the rotation is only about the d3 axis, the tangent of the rod, then the rod is completely described by a twisting without any bending. If the rotation is about some combination of the dl and d2 axes only, then the rod is said to be in a pure bending state. Substitution of eqs 6 and 8 into eq 7 yields the following expression:
+ +
+ ly’ - K!)” +
A, sin 8 sin # - A2 sin 8 cos # + A3 cos 8 Note that Ai are the global components of A, whereas Ki are the local components of K . The set of Euler angles which minimize the preceding integral (eq 11) must satisfy the following Euler-Lagrange equations:
cos II, sin 8 s i n v 0 -sin v sin 8 c o s 0
o
v
cos8
where e’, @,’ and $J’ are the derivatives of the Euler angles with respect to arc length. We thus have expressions for K I , ~ 2 and , ~3 in terms of the Euler angles.
CHART 1
cos # cos II, - cos 8 sin # sin -cos # sin v - cos 8 sin # cos sin 8 sin #
v v
This approach was also used by Maddocks for the simpler case of eq 2.17 Applying the three Euler-Lagrange equations, while assuming that the intrinsic curvatures and the bending and twisting constants do not vary throughout the rod, leads to the
v + cos 8 cos # sin 111 sin 8 sin v + cos 8 cos # cos 11, sin 8 cos v
sin # cos -sin # sin ly -sin 8 cos #
cos e
Westcott et al.
17930 J. Phys. Chem., Vol. 99, No. 51, 1995 Applying the Newton-Raphson method, we obtain equations which, in matrix form, look like
x2,1
x2,2
‘2,3
... 0
0
‘3,2
‘3,3
0
0 0
0 0
where
‘aF, aF, aF, ’I
_ _ _ 84, aw,
x.lJ =
aG, aG, aG, ---, aeJ
’41 aqj
6, sin 4, cos 0, cos 4, sin 0, -sin 0, cos 4, -sin 0, sin 4, 0
[-;OS
w, =
0
0
aH, aH, aH,
_ _ _
F
,aej
n
R, - C s i n ei sin 4i
el
Oj sin sin cos 4, 0 -COS e, cos 4, sin e, sin#, o -sin e, 0 0 COS
1
i=O n
-Gi , P = R2 + &in 6Jicos@, -Hi -Fi
i=O n
R3 h
+
+
The (3n 3) x (3n 3) supermatrix on the left-hand side of eq 18 contains all of the partial derivatives and is the analog of a,j, in eq 17 while the (3n 3) x 1 vector to the left of the equal sign in eq 18 contains all of the corrections to Bi, 4;. q;,and A and thus is analogous to A x j in eq 17. The (3n 3) x 1 vector on the right side of eq 18 is analogous to -pi in eq 17, and therefore is simply the negative of the algebraic equations which we are solving. The matrices Q,, Wi, L, and P are partitioned from other parts of the supermatrix with a line since they arise from the added constraint A, which keeps the end of the rod in a fixed position in space. What remains after the elimination of these partitioned parts of the matrix is a system of equations that yields the equilibrium configuration of an unconstrained rod. When the end of the rod is to be constrained, the entire supermatrix is needed and the end is positioned at RT = (R1,R2, R3) in the P matrix. Also, the rod is assumed to start at the origin. In all the calculations that follow (for a 1000 base pair (bp) DNA chain with one nodal element per bp), the calculation of one equilibrium structure requires 5-50 iterations to achieve the desired accuracy (0.000 001). In every case, the entire set of iterations required to converge to a solution takes no more than 5 s on a Silicon Graphics Indigo Extreme, with one iteration taking approximately 0.1 s of CPU time. Additional accuracy can be obtained with very little extra effort, since the NewtonRaphson method displays quadratic convergence near a root. Because the components of the matrix in eq 18 lie almost exclusively along the diagonal, the computer solution of the corrections to the Euler angles involves the order of N calculations to execute. Thus, even very large DNA chains can be treated at the base pair level since the time it takes to carry out one iteration increases only linearly with N .
+
+
CCOS Oi i=O
Results Nicked DNA Loop Flipping. Computations were first carried out on a 1000 bp nicked, intrinsically straight DNA that is assumed to behave isotropically (A, = A2 = A). In nicked DNA, the ends are allowed to rotate freely about the tangent to the double-helical axis. For this case, the three differential equations in eq 14 simplify to the following two equations:
fie,4,e’,4’,er’,4r’)= A(W - +r2 COS e sin e) A I cos 0 sin 4 + A 2 cos 8 cos 4 + A3 sin 6 = 0 (19a) g(e,4,e’,4’,e”,4’f)= A(@’’ sin2 e + 4‘0’ sin 28) AI sin 8 cos 4 - A, sin 8 sin 4 = 0 (19b) Although the energy of the DNA depends upon the number of base pairs in the chain, the equilibrium configuration does not depend upon the length of the DNA but instead upon the ratio, v , of the end-to-end separation, IRI, to the contour length, L. For most of the computer experiments presented below, the ends of the loop are separated by a distance which is equal to one hundredth of the entire contour length. The equilibrium shape of the DNA will not depend upon the twisting stiffness, C , since it does not appear in the two differential equations above. Also note that the bending stiffness can be divided out of these two equations (the Ai are replaced by ci, where ci = &/A, and the ci are determined in lieu of the dJ, yielding expressions that are independent of the bending stiffness. The elastic energy, however, does depend upon the bending stiffness, as can be seen from eq 1, 2, or 4. The following values reflect the overall bending and twisting stiffnesses of DNA in a solution of dilute
J. Phys. Chem., Vol. 99, No. 51, 1995 17931
Elastic Theory of DNA Looping Y
4
G
Figure 4. Some definitions concerning loop flipping. The neutral plane (shaded) is the plane containing both end points and the midpoint of the DNA chain. The angle a is the angle between the vector connecting both end points and the tangent at the beginning of the curve. The angle B is the angle between the tangent at the beginning of the curve and its projection onto the neutral plane, while y is the angle between the neutral plane and the negative tangent at the midpoint of the curve.
NaCl :33 A , = A, = A = 2.70 x
erg cm,
C = 2.04 x 10-19ergcm There are several definitions regarding loop flipping which need to be mentioned now. The neutral plane is the plane containing both end points and the midpoint of the DNA chain (k.,the shaded plane parallel to the XZ plane in Figure 4). The angle a is the angle between the vector connecting both end points and the tangent at the beginning of the curve. Changing a changes the orientation of the ends of the DNA loop. The angle p is the angle between the tangent at the beginning of the curve and its projection onto the neutral plane, while y is the angle between the neutral plane and the unit vector negative to the tangent at the midpoint of the'curve, as is shown in Figure 4. Thus, y gives information about the degree to which the loop is in either the crossed or the uncrossed state. In the following examples, the angle a is varied and the resulting values of y are determined. The tangents at the endpoints are changed, yielding a structure that is described by the direction of the tangent at the midpoint of the DNA chain. An example of the loop-flipping transition that occurs between crossed and uncrossed states can be seen in Figure 5. For this lo00 bp DNA chain, the end-to-end separation is fixed at 170 A, corresponding to q = 0.05. Views along two perpendicular directions are shown side-by-side. The left view shows how a is changed, and the right view shows the resulting y. In Figure 5a, cos a is 0.6, and the DNA chain is nearly contained within the neutral plane in the uncrossed form. Decreasing cos a to 0.4 or 0.2 (Figure 5b and c) does not change the overall configuration very much. However, between cos a = 0.0 and cos a = -0.4 (Figure 5d-f), the DNA loop undergoes a transition from the uncrossed state to a collapsed arrangement in which distant segments of the chain cross over one another. Finally, changing cos a from -0.4 to -0.6 (Figure 5f and g) does not significantly change the overall shape of the DNA loop. It is the change from an uncrossed to a crossed shape in Figure 5d-f (or vice versa) which is referred to here as loop flipping. The effect of the choice of /l on the transition between crossed and uncrossed states can be seen in Figure 6. In this case, q is chosen to be 0.01. The transition is symmetric about cos a = -0.03. Had there been no separation between the points at the ends of the curve, the transition would be symmetric about cos a = 0. Also, as /l approaches zero, the plot of cos a versus
Figure 5. Perpendicular views of equilibrium configurations of a lo00 bp, naturally straight, nicked DNA loop with an end-to-end separation of 170 A (v = 0.05), /?= lo", and cos a equal to (a) 0.6, (b) 0.4, (c) 0.2, (d) 0.0,(e) -0.2, (f) -0.4, and (g) -0.6. The left view makes it easy to see the change in a, and the right view makes it easy to see that in y.
cos y in the vicinity of the transition becomes sharper. In fact, in the limit as /l goes to zero, the curve becomes a step function that varies between f l at cos a = -0.03.23 Increasing /3 to 20" makes the loop flipping a much more gradual transition. In general, the loop remains in either the crossed or uncrossed form except for a small region of rapid flipping. Both the crossed and uncrossed configurations of the lo00 bp DNA loop with q = 0.01 have very similar energies even though they have very different end conditions, as can be seen in the inset of Figure 6. The crossed configuration has a higher elastic energy by a margin slightly less than O.lkT, but in between these two forms, there is a peak in the energy at cos a = -0.03. This energy maximum occurs very close to the value of cos a, at which the loop is midway between its uncrossed and crossed states (Le., where cos y = 0). Even though the elastic energies for the uncrossed and crossed forms are very similar regardless of the choice of /l and even though the peak energies all occur at the same cos a,the heights of their peak intensities are different. In general, the smaller 6 and the flatter the loop is, the higher the elastic energy. For example, when /l = 2", the peak energy is approximately 0.7kT greater than that of either the crossed or the uncrossed form, but when p = 20°, the peak energy is only about 0.2kT greater than that of either the crossed or uncrossed form. The reader should be warned not to interpret the insets of Figures 6-9 as a dynamic event. Each point along the plotted
Westcott et al.
17932 J. Phys. Chem., Vol. 99, No. 51, 1995 1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
*
rv?
8
0
v)
8
-0.2
cog a cos a
at ligation = 0.866 at ligation = 0.707
C O S at ~
ligation = 0.500
+ + -
-e
I
0.2 0
-0.2
-0.4
--0.4
f .
-
-0.6
I
-0.6
I. 2.
-0.8 , I
-1 -0.8
..
.,I
n:
n
n:
I
.
-0.8 "I
0..
-0.6
-0.4
-0.2
0.2
0
-1 0.4
0.8
0.6
cos a Figure 6. Loop-flipping transitions (cos y is plotted as a function of cos a)for /3 = 2", 5". 10". and 20" for nicked DNA loops of IO00 bp with an end-to-end separation of 34 A. In the inset, the elastic energy, in units of kT, is plotted as a function of cos a over the same range for the same DNA. The ordinate ranges from 2.6kT to 3.5kT. 1
I
1
1
q=O.Ol q = 0.05 q=O.lO
0.8
+-
-
-8-
~ = 0 . 1 5-x -
0.6 0.4
* v?
8
0.2
Ol
-0.2 -0.4
-0.8
-o.6 -1
Q
x
.
\
?
- - A
.
1
t
-0.8
-0.4
-0.2
0
0.2
0.4
0.6
0.8
cos a Figure 7. Loop-flipping transitions (cos y is plotted as a function of cos a)for q = 0.01,0.05,0.10, and 0.15 for naturally straight, nicked DNA loops of IO00 bp with /3 = 10". In the inset, the elastic energy, in units of kT, is plotted as a function of cos a over the same range for the same DNA. The ordinate ranges from 2.0kT to 3.4kT.
curve represents an equilibrium structure with distinct end conditions. Global fluctuations of the DNA are not considered here although it should be noted that the bending and twisting constants are measures of local fluctuations. The methods developed in this paper can and will be applied to dynamics in the future. In order to determine the effect of loop proportions (ie., q) on the loop flipping transition, lo00 bp nicked DNA chains with /3 = 10" and q taking on values 0.0 1,0.05,0.10, and 0.15 were studied. Figure 7 shows these transitions for each q. Each transition follows the same general form, the flipping being equally sharp in every case but crossing the abscissa at different values of cos a, unlike the plots in Figure 6. As q changes from 0.01 to 0.15, the value of cos a at the transition state (cos y = 0) decreases from -0.03 to -0.5 and does so in a linear fashion.
-0.8 -0.6 -0.4 -0.2
and 0.5. In the inset, the elastic energy, in units of kT, is plotted as a function of cos a over the same range for the same DNA. The ordinate ranges from 2.6kT to 4.6kT.
The plots of the elastic energy versus cos a also show this shift in transition point (inset of Figure 7). On the right side of the figure (the side corresponding to the uncrossed configuration of the loop where cos y < 0), the energy minimum deepens and shifts slightly to the left as q is increased from 0.01 to 0.15. On the left side (corresponding to the crossed configuration of the loop where cos y > 0), the energy minimum increases in magnitude and flattens out. This rise and flattening are a result of the energy maximum shifting to the left. The energy maximum is correlated with the point where the DNA loop interconverts between the crossed and uncrossed form. Unnicked DNA Loop Flipping. Computations were next carried out on an intrinsically straight, unnicked, lo00 bp DNA that bends in an isotropic fashion. For this case, all three differential equations in eq 14 need to be used because we also need to keep track of how the DNA is twisting. Also, the values for the bending and twisting stiffnesses (A and C) are needed to determine not only the elastic energies but also the equilibrium configurations themselves. The values for A and C are the same as those used in the preceding calculations (AI = A2 = A, since the DNA is assumed to behave isotropically). When the DNA is not nicked, the ends of the loop are held tightly in place so that the DNA cannot rotate about the doublehelical axis at either end. Therefore, in terms of loop flipping, the configuration at which the DNA is ligated (i.e., converted from nicked to unnicked) or, equivalently, the configuration at which the twist is relaxed should make a difference. Figure 8 shows the loop transitions for a 10oO bp DNA chain with /3 = 10" and q = 0.01. This DNA loop was ligated at three different values of cos a: 0.866, 0.707, and 0.5, corresponding to a values of 60°, 45", and 30", respectively. What we see is a large delay in the crossover point between open and collapsed loops from what is seen in the corresponding nicked DNA. Also, within the range of a ligation values studied (3Oo-6O0), the DNA is not very sensitive to the point of ligation since the three graphs are virtually identical. When the elastic energy of this DNA is plotted against cos a (inset of Figure 8), the uncrossed configuration persists as an energy minimum at about cos a = 0.6, but the crossed minimum '
I
-0.6
0 0.2 0.4 0.6 0.8 I cos a Figure 8. Loop-flipping transitions (cos y is plotted as a function of cos a)for naturally straight, unnicked DNA loops of IO00 bp with p = IO" and q = 0.01. The DNA is resealed at cos a = 0.866, 0.707, -1
Elastic Theory of DNA Looping
J. Phys. Chem., Vol. 99, No. 51, 1995 17933
Discussion
1
0.8
0.6 0.4
0.2 r
8
0
-0.2 -0.4
-0.6 -0.8 -1
-0.8
-0.6
-0.4
-0.2
0 0.2 cos a
0.4
0.6
0.8
Figure 9. Loop-flipping transitions (cos y is plotted as a function of cos a) for nicked DNA loops of 1000 bp with /3 = 10" and 77 = 0.01 which have a 200 bp curved region inserted into the central portion of the DNA chain. The 200 bp insert is curved by a total of O', 90', and 180'. In the inset, the elastic energy, in units of kT, is plotted as a function of cos a over the same range for the same DNA. The ordinate ranges from 1.2kT to 3.4kT.
completely disappears. There is a small bump in the curve that corresponds to what was once the energy maximum at the transition state between the uncrossed and crossed configurations. Also, note that, as was the case for the transition plots, the three plots of elastic energy are virtually identical. DNA Loops with Curved Regions. The computations described in this section correspond to nicked 1000 bp DNA chains that are assumed to behave isotropically with p = 10" and q = 0.01, Curved segments of 200 bp are positioned symmetrically about the midpoint of the otherwise naturally straight chain. Thus, there are 400 bp of intrinsically straight DNA on either side of the curved region. The inserted segments are curved in a plane describing circular arcs of 90" or 180" in their unstressed state. Because the twist density is not con~ t a n twe , ~ need ~ to use all three differential equations in eq 14 to solve for equilibrium configurations. The loop flipping transitions for the DNA's with curved regions are shown in Figure 9 along with the transition for the corresponding naturally straight DNA. All three DNA chains seem to behave similarly except for a slight shift to a more gradual transition for the DNA with the 180" curved segment. The plots of the elastic energies do not show such similarities. In fact, they are all quite different (inset of Figure 9). As was seen earlier in Figure 6, the straight DNA exhibits energy minima at cos a = f 0 . 6 and a maximum at cos a = -0.03 that is approximately 0.7kT greater than the energy of either the crossed or uncrossed states. When the 90" bend is inserted, the energy of every configuration drops by about 1.5kT. The minima still persist although they move closer together to cos a RZ h0.4. Also, the energy maximum is only -0.lkT greater than the minima regions. After the central 200 bp are curved more strongly in an arc of 180", the maximum completely disappears, and the minima merge into a single broad minimum centered where the transition state was once a maximum. In other words, there is no longer a barrier between the crossed and the uncrossed forms. Additionally, the energy of every configuration increases by approximately 1.OkT with respect to the energies found when the central 200 bp is curved by a total of only 90".
Consider a protein complex that binds simultaneously to two sites, a and b, that are not ordinarily juxtaposed along the primary DNA sequence. With the formation of a protein complex, sites a and b are brought into close proximity with respect to the three-dimensional structure, and the DNA chain between the two binding sites is transformed into a loop. As illustrated in this work, the protein complex can control whether the loop will be in an open uncrossed state or it will collapse into a crossed state. In its crossed state, the protein complex brings into close contact two other sites, c and d, which lie in between sites a and b. In the absence of any protein, sites c and d would normally have little or no influence on each other. Thus, it is the protein complex which controls which regions of DNA will influence each other. The control that the protein complex has on the overall DNA structure manifests itself in how the protein orients the DNA at sites a and b. Small changes in the orientation of the ends of the DNA loops ( i e . , at sites a and b) can bring about large changes in the overall shape of the loop itself. The loop can be in a crossed configuration with sites c and d in close contact or in an uncrossed state, yielding some finite separation between these two sites. There are several mechanisms to control the flipping of a DNA loop. These mechanisms include (1) changing the value of a, ( 2 ) changing the value of /3, (3) changing the value of q, (4) nicking or ligating the DNA, or (5) introducing curved segments in the naturally straight DNA that comprises the loop. A change in the value of a corresponds to a change in the ends of the DNA loop ( i e . , the angle between the tangents at the ends of the loop). This change is the primary way we effected the transformation from an uncrossed to a crossed loop and vice versa. Figure 5d shows an uncrossed loop with cos a = 0.0, and Figure 5f shows the corresponding crossed loop after cos a has been change to -0.4. From this data, we see that a only needs to vary by about 24" (when p is 10") in order for one loop type to be converted to the other. Thus, the angle a is very useful for loop control. A change in the angle /3 alone cannot effect a change between the open and collapsed states of the DNA loop. A loop that is in the crossed configuration cannot be converted into its corresponding uncrossed state simply by changing p. This point is illustrated in Figure 6, where all the transition plots cross the cos y = 0 axis at the same point. Above this line (cos y > 0), the loop is in the crossed state, and below it (cos y < 0), the loop is in the uncrossed state. Changing ,L?, however, can change the degree to which the loop is crossed or uncrossed. Also, if 3!, is decreased to 2", the change in a that is needed to convert one loop type to the other decreases to about 5". Increasing the magnitude of /3, however, makes it easier for the protein complex to convert the DNA loop back and forth between crossed and uncrossed forms. We saw in the inset of Figure 6 that there is an elastic energy barrier separating the two forms. The elastic energy of the open loop rises as cos a is decreased, meaning that it takes energy to make this change. As the loop reaches its crossover point, the energy reaches a maximum and then begins to fall. The height of the energy maximum gives an indication of how difficult it is to transform the loop between its two forms. When p = 2", the energy barrier is about 0.7kT for a 1000 bp, naturally straight DNA chain, but when p = 20°, the energy ban-ier is only 0.2kT. Therefore, a protein-induced increase in p might make it easier to interconvert forms, but a decrease in p would tend to inhibit interconversion.
Westcott et al.
17934 J. Phys. Chem., Vol. 99, No. 51, 1995 The length of the separation of the ends of the loops is also an important factor in the control of loop flipping. Increasing 7 (where 7 is the end-to-end separation divided by the total contour length), increases the end-to-end separation and vice versa. Figure 7 shows the influence of 7 on the loop-flipping transition. Consider the case where cos a is fixed at -0.2 and 11 is 0.15, a value corresponding to a relatively large end-toend separation in a 1000 bp DNA. At this point, cos y = -0.83, the large negative value indicating that the loop is uncrossed. Now, consider a process that brings the ends of the loop closer together until 7 is 0.01 while keeping a constant. At this point, cos y = 0.69, indicating that the loop has flipped completely over. A similar type of experiment can be carried out by nicking a DNA loop with cos a = -0.2. As seen in Figure 8, the unnicked form will still assume an uncrossed state (Le., cos y < 0) when cos a = -0.2, but when this DNA chain is nicked, the loop flips over into its crossed configuration (see Figure 6). Control of loop formation and loop flipping can be affected by the presence of curved segments within the otherwise naturally straight loop. A 200 bp curved segment placed in the midsection of an otherwise naturally straight 1000 bp DNA loop does not significantly change the transition between open and closed forms, but it does bring about a large difference in the elastic energy. First, the elastic energy of every configuration of the loop with a planar 90" bend is at least 0.6kT lower than that of the corresponding configuration in the naturally straight DNA or the DNA with the 180" bend (see the inset of Figure 9), meaning that it is easier to form a loop with the 90" bend. Second, as the magnitude of the bending increases, the central maximum lowers and levels out until it disappears when the curved segment bends through 180". This disappearance removes the barrier that normally impedes interconversion between the crossed and uncrossed forms. Finally, anisotropic bending is not explored here, but we have investigated the effects of such anisotropy elsewhere.35 What we observe is a repeating variation in the local helical parameters of the DNA. Correlation of parameters is observed. For instance, the bending of DNA into a groove is accompanied by an untwisting of the DNA. These local variations closely mimic the bending tendencies in linear DNA predicted by Zhurkin and co-workers on the basis of the conformational energies of isolated base pairs.36
Conclusions A fast and accurate method for calculating equilibrium structures of supercoiled DNA has been developed. A chain of 1000 base pairs was studied at the base pair level, but even larger systems could be treated with no additional effort and with an increase in computing time that varies only linearly with additional base pairs. Inclusion of self-contact forces, which we are now considering, would, however, make the computational difficulty scale quadradically with additional base pairs unless a speedup technique such as the fast multipole method3' is used to treat the painvise interactions. The methods developed in this paper have been applied to DNA looping. Small changes in the orientations and positions of the ends of DNA loops can lead to large overall changes in the configuration of the loop itself. The shape of the loop and the transitions between those shapes also depend on whether the DNA is nicked or not nicked and whether there are regions of curved DNA in the loop. The methods developed have not been used to their full potential. In the future, we will treat DNA more realistically as an anisotropic elastic rod, and we will consider specific base sequences.
Acknowledgment. We are grateful to Mr. Scott C. Pedersen for valuable discussions and for his generous technical assistance with computing. This research has been funded by the U.S. Public Health Service under research grant GM34809. T.P.W. is a predoctoral trainee supported in part by a grant from the U.S. Public Health Service (Molecular Biophysics Training Grant GM08319) and by the U.S. Department of Education (National Needs in Chemistry Program). Appendix: Relation of Equilibrium Equations to Other Work First, eq 13a can be written as
-ar aK3 + A . 2 (AI) aK,
ae
Now note that X / a K i = Mi, where Mi is a component of the resultant moment of the rod, M = Midl M2d2 M3d3, with
+
+
Then, the terms of eq A1 can be rearranged to read
where the local derivative of an arbitrary vector, V = Vldl V2d2 V3d3, with respect to arc length is defined as
+
+
Using eqs 6 and 9, one can verify that the following relations are true:
(-45)
Inserting eqs A4 and A5 into eq A3 yields
Applying the same reasoning to eqs 13b and 13c, the following are also obtained:
J. Phys. Chem., Vol. 99, No. 51, 1995 17935
Elastic Theory of DNA Looping Combining eqs A6a, A6b, and A6c yields eq 15, which is the equation of the equilibrium of an elastic rod obtained by Dill.
References and Notes (1) Bauer, W. R.; Crick, F. H. C.; White, J. H. Sci. Am. 1980, 243, 118-133. (2) Fisher, L. M. Nature (London) 1984, 307, 686-687. (3) Wasserman, S. A,; Cozzarelli, N. R. Science 1986,232,951-959. (4) Liu, L. F.; Wang, J. C. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 7024-7021. ( 5 ) Kanaar, R.; Cozzarelli, N. R. Curr. Opin. Struct. Biol. 1992, 2, 369-379. (6) Dunaway, M.; Ostrander, E. A. Nature (London) 1993,361, 746748. (7) Travers, A. A.; Schwabe, J. W. R. Curr. Opin. Strucr. Biol. 1993, 3, 898-900. (8) Hermoso. J. M.; Freire, R.; Bravo, A.; Gutierrez, C.; Serrano, M.; Salas, M. Biophys. Chem. 1994, 50, 183-189. (9) Bazett-Jones, D. P.; Lablanc, B.; Herfort, M.; Moss, T. Science 1994, 264, 1134- 1137. (10) Camerini-Otero, R. D.; Felsenfeld, G. Proc. Natl. Acad. Sci. U.S.A. 1978, 75, 1708-1712. (11) White, J. H.; Cozzarelli, N. R.; Bauer, W. R. Science 1988, 241, 323-327. (12) Hao, M. H.; Olson, W. K. Macromolecules 1989,22, 3292-3303. (13) Schlick, T.; Olson, W. K.; Westcott, T.; Greenberg, J. P. Biopolymers 1994, 34, 565-597. (14) Trifonov, E. N. CRC Crit. Rev. Biochem. 1986, 19, 89-106. (15) Crothers, D. M.; Drak, J.; Kahn, J. D.; Levene, S. D. Methods Enzymol. 1992, 212, 3-29. (16) Hagerman, P. J. Biochim. Biophys. Acta 1992, 1131, 125-132. (17) Maddocks, J. H. Arch. Ration. Mech. Anal. 1984, 85, 3 11-354. (18) Landau, L. D.; Lifshitz, E. M. Theory of Elasticity; Pergamon Press: Oxford, 1986. (19) Yang, Y.; Tobias, I.; Olson, W. K. J. Chem. Phys. 1993, 98, 16731686.
(20) Gorin, A. A.; Zhurkin, V. B.; Olson, W. K. J. Mol. Biol. 1995, 247, 34-48. (21) Berman, H. M.; Olson, W. K.; Beveridge, D. L.; Westbrook, J.; Gelbin, A.; Demeny, T.; Hsieh, S. H.; Srinivasan, A. R.; Schneider, B. Biophys. J. 1992, 63, 751-759. (22) Dill, E. H. Arch. Hist. Exact Sci. 1992, 44, 1-23. (23) Coleman, B. D.; Dill, E. H.; Lembo, M.; Lu, Z.; Tobias, I. Arch. Ration. Mech. Anal. 1993, 121, 339-359. (24) Jain, S . C.; Tsai, C. C.; Sobell, H. M. J. Mol. Biol. 1977, 114, 317-331. (25) Kim, Y.; Geiger, J. H.; Hahn, S.; Sigler, P. B. Nature 1993, 365, 512-520. (26) Kim, J. L.; Nikolov, D. B.; Burley, S. K. Nature 1993, 365, 520527. (27) Schleif, R. Annu. Rev. Biochem. 1992, 61, 199-223. (28) Matthews, K. S. Microbiol. Revs. 1992, 56, 123-136. (29) Tobias, I.; Coleman, B. D.; Olson, W. K. J. Chem. Phys. 1994, 101, 10990-10996. (30) Zhang, P.; Tobias, I.; Olson, W. K. J. Mol. Biol. 1994, 242, 271290. (31) Zivanovic, Y.;Duband-Goulet, I.; Shultz, P.; Stofer, E.; Oudet, P.; Prunell, A. J. Mol. Biol. 1990, 214, 479-495. (32) Press, W. H.; Flannery, B. P.; Teukolsky, S. A,; Vetterling, W. T. Numerical Recipies; Cambridge University Press: Cambridge, 1986. (33) Hagerman, P. J. Annu. Rev. Biophys. Biophys. Chem. 1988, 17, 265-286. (34) Tobias, I.; Olson, W. K. Biopolymers 1993, 33, 639-646. (35) Olson, W. K.; Babcock, M. S.; Gorin, A.; Liu, G.; Marky, N. L.; Martino, J. A.; Pedersen, S. C.; Srinivasan, A. R.; Tobias, I.; Westcott, T. P.; Zhang, P. Biophys. Chem. 1995, 55, 7-29. (36) Zhurkin, V. B.; Lysov, Y. P.; Ivanov, V. Nucleic Acids Res. 1979, 6, 1081-1096. (37) Greengard, L. Science 1994, 265, 909-914. JP951132V