Dissociation Free-Energy Profiles of Specific and Nonspecific DNA

May 28, 2013 - Also, the free-energy profiles were found to be correlated with changes in the number of protein–DNA contacts and that of surface wat...

Article

Dissociation Free-Energy Profiles of Specific and Nonspecific DNA–Protein Complexes
Yoshiteru Yonetani and Hidetoshi Kono
J. Phys. Chem. B, Just Accepted Manuscript • DOI: 10.1021/jp402664w • Publication Date (Web): 28 May 2013
Downloaded from http://pubs.acs.org on June 10, 2013


The Journal of Physical Chemistry B is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


The Journal of Physical Chemistry

TOC Graphics


ACS Paragon Plus Environment


Dissociation Free-Energy Profiles of Specific and Nonspecific DNA–Protein Complexes

Yoshiteru Yonetani*, Hidetoshi Kono*

Molecular Modeling and Simulation Group, Quantum Beam Science Directorate, Japan Atomic Energy Agency, 8-1-7 Umemidai, Kizugawa, Kyoto 619-0215, Japan

*Corresponding Authors [email protected]; [email protected]

Keywords: specific binding, non-specific binding, free energy, molecular dynamics, target search, dissociation


ABSTRACT

DNA-binding proteins recognize DNA sequences with at least two different binding modes, specific and nonspecific. Experimental structures of such complexes provide a static view of binding. However, static information alone cannot reveal the mechanisms of target-site search and recognition, because it does not clarify the transition between the bound and unbound states. What is the difference between specific and nonspecific binding? Here we performed adaptive biasing force molecular dynamics simulations, starting from the specific and nonspecific structures of DNA–Lac repressor complexes, to investigate the dissociation process. The resultant free-energy profiles showed that the specific complex has a sharp, deep well consistent with tight binding, whereas the nonspecific complex has a broad, shallow well consistent with loose binding. The difference in well depth, ~5 kcal/mol, was in fair agreement with the experimentally obtained value and was found to come mainly from the protein conformational difference, particularly in the C-terminal tail. The free-energy profiles were also correlated with changes in the number of protein–DNA contacts and in the number of surface water molecules. The derived spatial distributions of the protein around the DNA indicate that large dissociation events are rare at both specific and nonspecific sites. Comparison of the free-energy barrier for sliding [~8.7 kcal/mol, Furini et al., J. Phys. Chem. B 114, 2238 (2010)] with that for dissociation (at least ~16 kcal/mol) calculated in this study suggests that sliding is strongly preferred over dissociation.


1. Introduction

DNA-binding proteins regulate gene expression by binding to specific DNA sequences. Their recognition of the target sequences involves at least two different binding modes, specific and nonspecific. These binding modes are known to be essential in the target-site search process, in which proteins find their target sites among a vast amount of genomic DNA (~10^6–10^9 bp). The current consensus view of how the target site is found is as follows [1-3]. In a cell, a DNA-binding protein moves around in the three-dimensional space of the solution; when it encounters DNA, it binds to the DNA regardless of the sequence. The protein then begins a one-dimensional sliding search along the DNA for its target. If, while sliding, the protein encounters the target sequence, it forms a more stable complex that recognizes the sequence; this is specific binding. Otherwise, the protein continues the one-dimensional search or dissociates from the DNA at some point and resumes three-dimensional diffusion in the solution. This target search process has yet to be fully elucidated [4-5] because it cannot be observed directly, but in the currently accepted view, both specific and nonspecific binding are regarded as key factors in the process.

Molecular structures of both specific and nonspecific protein–DNA complexes have been solved for Lac repressor [6-8], EcoRV [9], and BamHI [10] by NMR spectroscopy or X-ray crystallographic analysis. For example, the Lac repressor adopts two different conformations at the specific and nonspecific sites [6, 8]. These molecular structures let us speculate on how a DNA-binding protein searches for its target site. The C-terminal tail, which is disordered in nonspecific binding, is considered to play an important role in efficient movement. Still, these structures provide only static views of the bound states. Further understanding of the mechanism of the target-site search and of sequence recognition requires information about transitions from the bound states. What is the difference between specific and nonspecific binding? To address this question,
various experimental studies have been performed. Using an NMR technique, Iwahara and Clore et al. [11-12] observed a protein homeodomain sliding on nonspecific DNA. Their results indicated that the protein at nonspecific sites retains almost the same form as in the specific complex. By combining single-molecule diffusion data with theoretical considerations, Blainey et al. [13] deduced that nonspecifically bound proteins undergo rotation-coupled sliding along the helical axis of DNA. Kalodimos et al. [14] analyzed the dissociation of the Lac repressor protein bound to a specific DNA site at the residue level with H/D exchange measurements. More detailed pictures of such intermediate states of DNA-binding proteins searching for their target sites would be valuable; however, it is currently difficult to obtain such information from experimental measurements alone.

Thus, in the present study, we take an alternative approach, computer simulation. By performing all-atom molecular dynamics (MD) simulations starting from each of the specific and nonspecific complex structures of DNA–Lac repressor, we investigated the dissociation process in terms of structural and free-energy changes. NMR analysis [6] has shown that the Lac repressor headpiece (the DNA-binding domain, composed of 62 amino acid residues) adopts different conformations depending on whether it is bound to specific or nonspecific DNA. The major difference lies in the molecular packing at the protein–DNA interface, particularly in the C-terminal region [6]; the specific complex is tightly packed, but the nonspecific complex is loosely packed. The present MD simulations, starting from each complex structure, yielded both structural and free-energy information, which clarifies the following two aspects.

The first is binding energetics, that is, binding specificity and sequence recognition. What structural differences are responsible for the energetic difference between the specific and nonspecific complexes? Such understanding of the energetics is important when considering the mechanism of target-site search and recognition; because the specific complex is the final state and the nonspecific complex is an intermediate state, the energetic balance between the two states is a key
factor that determines the efficiency of the target-site search and the stability of the final product [15-16]. Second, the dissociation free-energy profile provides insights into the spatial distribution and kinetics of the protein around specific and nonspecific DNA sites. Such information enables us to consider how a protein moves away from specific and nonspecific binding sites.

A closely related work has recently been reported by Furini et al. [17]. They carried out all-atom MD free-energy calculations on the same Lac repressor–DNA complex as studied here, but for a different movement, sliding. Sliding and dissociation are both considered to be major rate-limiting processes during the target search. Since this study, together with the study by Furini et al. [17], provides free-energy profiles of both essential processes, we can now address the kinetics of the Lac repressor protein bound to DNA. Other related works report dissociation free-energy profiles of different DNA-binding molecules, SRY protein [18] and small minor-groove binders [19-20], but these studies did not examine the difference between specific and nonspecific binding. Villa et al. [21] studied a multi-domain Lac repressor complex with DNA using a multiscale MD approach. That study addressed the essential dynamics for Lac repressor function, though the associated free-energy change was not examined.

Free-energy calculation for molecular processes such as protein–DNA dissociation is not a trivial task in current all-atom MD simulations because protein–DNA dissociation is a rare event and the free-energy surface is very rough. Conventional MD simulations cannot sample such rare events within the accessible time scale. Thus, various technical improvements have been proposed [22]; of these, we have chosen a promising method, the adaptive biasing force (ABF) method proposed by Darve and Pohorille et al. [23-24]. In the ABF MD simulation, a biasing force is added to the molecular system along the reaction coordinate so as to drive rare events such as dissociation of protein–DNA complexes. The biasing force is adjusted automatically according to the gradient of the free-energy surface (i.e., the average force).

In this study, we simulated the dissociation of the Lac repressor from specific and nonspecific binding sites and obtained the accompanying structural and free-energy changes with the ABF method [23-24]. We found that the free-energy profiles for specific and nonspecific binding have distinct features; the specific complex has a sharp, deep energy well, but the nonspecific complex has a broad, shallow well, which reflects the tight and loose binding of the specific and nonspecific complexes, respectively. The difference in well depths is ~5 kcal/mol, which is in good agreement with the experimentally obtained value. This energetic difference can be explained in terms of structural changes observed in the dissociation process. In addition, the spatial distributions of the protein around DNA derived from the free-energy profiles indicate that large dissociation events are rare at both specific and nonspecific sites. Quantitatively, the free-energy barrier for dissociation (at least ~16 kcal/mol) calculated in this study is about double that for sliding (~8.7 kcal/mol) obtained in the previous study of Furini et al. [17], indicating that sliding is strongly preferred over dissociation.
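To give a feel for what the two barriers quoted above imply, the following rough sketch compares their Boltzmann factors at the simulation temperature of 300 K. It is an Arrhenius-type estimate that assumes equal prefactors for sliding and dissociation, which is an assumption of this illustration and not a result of the study; the function name `rate_ratio` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def rate_ratio(barrier_fast, barrier_slow, temperature=300.0):
    """Ratio of Boltzmann factors exp(-dG/RT) for two barriers (kcal/mol).

    Assumes both processes share the same kinetic prefactor, so this is
    only an order-of-magnitude guide.
    """
    return math.exp((barrier_slow - barrier_fast) / (R_KCAL * temperature))

# Barriers quoted in the text: ~8.7 kcal/mol (sliding), at least ~16 kcal/mol (dissociation).
sliding_vs_dissociation = rate_ratio(8.7, 16.0)
print(f"sliding favored by a factor of about {sliding_vs_dissociation:.1e}")
```

Under these assumptions the factor is on the order of 10^5, which is one way to read the statement that sliding is strongly preferred over dissociation.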

2. Details of MD simulation

2.1 System setup

DNA–Lac repressor complex systems were set up with the LEaP module in AMBER10, using the NMR structures PDB 1L1M [8] for the specific complex and PDB 1OSL [6] for the nonspecific complex. In the PDB data, two DNA-binding domains, each consisting of 62 amino acid residues, form a dimer. As shown in Figure 1, we considered half of the dimer; the other half was discarded. In the specific complex, the dimer structure is asymmetric. We chose the left half in view of its stronger affinity [25]. We included DNA fragments of 19 and 18 bp in the present specific and nonspecific complex systems, respectively. In the specific complex, the PDB data include 23 bp of DNA, but 2 bp at either end were deleted to make the two conditions similar. The specific and nonspecific
DNAs differ in sequence and conformation [6]. The specific DNA is deformed significantly, but it relaxed into a typical straight form of DNA during the MD equilibration as a result of the deletion of half of the Lac dimer. The specific and nonspecific complexes show no difference in the amino acid sequence of the Lac repressor, but the conformations are different, particularly in the C-terminal region, where an α-helix (H4) is formed in the specific complex and intrudes deeply into the DNA minor groove. In contrast, that region is disordered in the nonspecific complex.

The Amber force fields ff99SB [26] for protein and bsc0 [27] for nucleic acids were used. Amino acid residues of the Lac repressor were set to their protonation states at pH 7. Cys52, which had been introduced as an S–S linker for dimerization in the experimental work [6], was protonated here. In total, 20,103 and 16,526 TIP3P [28] water molecules were placed around the specific and nonspecific protein–DNA complexes, respectively, which ensures a water shell of at least 10 Å in the direction of the dsDNA helical axis (z axis in Fig. 1) and 20 Å in the x and y directions. Then, a three-dimensional periodic boundary condition was imposed. The resultant MD box was 100 Å × 88 Å × 86 Å for the specific complex system and 81 Å × 87 Å × 89 Å for the nonspecific complex system. These boxes are sufficiently large that the Lac repressor has no contact with any periodic image of the DNA even in the dissociated state. In addition, K+ counterions were added for neutralization: 33 K+ ions in the specific system and 31 K+ ions in the nonspecific system. This concentration is about 0.1 M, which is near the physiological ionic concentration of ~0.15 M.
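The quoted ion concentration can be checked with a short calculation from the ion counts and box dimensions given above. This sketch divides by the full box volume, including the space occupied by the protein and DNA, so it should be read as a lower-bound estimate; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1.0e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L

def molar_concentration(n_ions, box_angstrom):
    """Concentration in mol/L of n_ions in a rectangular box given in Å."""
    a, b, c = box_angstrom
    volume_liters = a * b * c * LITERS_PER_CUBIC_ANGSTROM
    return n_ions / AVOGADRO / volume_liters

# Ion counts and box sizes as stated in the text.
specific_molar = molar_concentration(33, (100.0, 88.0, 86.0))
nonspecific_molar = molar_concentration(31, (81.0, 87.0, 89.0))
print(f"specific: {specific_molar:.3f} M, nonspecific: {nonspecific_molar:.3f} M")
```

Both systems come out at roughly 0.07 to 0.08 M on this basis, consistent with the stated value of about 0.1 M once the solute volume is taken into account.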

2.2 Conventional MD simulations

Using the sander module in AMBER10 [29], we first carried out energy minimizations for the specific and nonspecific complex systems, and then carried out conventional MD simulations at 300 K and 1 atm for 1 ns. During the initial 400 ps, the protein and DNA atoms were restrained
at the initial positions. The equations of motion were solved numerically with a time step of 2 fs, while constraining covalent bonds involving hydrogen atoms. The temperature and pressure were controlled by the weak coupling algorithm [30]. Electrostatic interactions were calculated with the particle mesh Ewald method [31], whereas van der Waals interactions were calculated with a cutoff of 9.0 Å.

Another restraint was imposed on the Cα–Cα distances of the Lac repressor, except for the C-terminal region, throughout the MD simulations in order to prevent any accidental protein deformation. The C-terminal region (residues 50–62) was excluded from this restraint because this region takes different conformations depending on the situation, as shown experimentally [6]. This Cα–Cα restraint followed a scheme employed by Higo et al. [32], but the parameters were set differently. The restraint was realized by placing half-harmonic potentials at both ends of a 1 Å flat region, with a force constant of 20 kcal/(mol Å^2). Also, note that all Cα–Cα pairs were restrained in our study, whereas only pairs more than three residues apart in sequence were restrained in the study of Higo et al. [32].
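The flat-bottomed Cα–Cα restraint described above can be sketched as a simple energy function. Two details are assumptions of this illustration and are not stated in the text: the 1 Å flat region is taken to be centered on the reference distance, and the energy is written as k·Δ² without a factor of 1/2 (the convention commonly used for AMBER restraint force constants). The function name is hypothetical.

```python
def flat_bottom_restraint(distance, reference, flat_width=1.0, k=20.0):
    """Restraint energy in kcal/mol for a distance in Å.

    Zero inside a flat region of width `flat_width` (assumed centered on
    `reference`), half-harmonic walls with force constant `k` outside it.
    """
    half = 0.5 * flat_width
    lower, upper = reference - half, reference + half
    if distance < lower:
        return k * (lower - distance) ** 2
    if distance > upper:
        return k * (distance - upper) ** 2
    return 0.0

inside = flat_bottom_restraint(5.3, 5.0)    # within the flat region
stretched = flat_bottom_restraint(6.0, 5.0)  # 0.5 Å beyond the upper wall
compressed = flat_bottom_restraint(4.0, 5.0) # 0.5 Å beyond the lower wall
```

With these settings a pair can fluctuate freely by up to 0.5 Å either way, while a 0.5 Å excursion beyond either wall costs 5 kcal/mol, which is what keeps the restrained part of the protein from deforming.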

2.3 ABF MD simulations

After the system equilibration runs, we carried out ABF MD simulations. In this study, we implemented the ABF algorithm [23-24] in the sander module of AMBER10 [29]. In the ABF simulation, an additional unphysical force was added as a biasing force to induce protein–DNA dissociation. All other MD settings, such as temperature and pressure control and distance restraints, were the same as in the conventional MD simulations above. ABF simulation has been used in various free-energy studies, such as conformational changes of peptides and proteins [33-34] and molecular association and dissociation [35-36], where an atom–atom or group–group distance or the RMSD was employed as the reaction coordinate along which the biasing force was applied. In the present study, as the reaction coordinate ξ, we employed the distance between
the first principal axis of the DNA (helical axis) and the center of mass of the Lac repressor, shown in Figure 1c. This distance ξ is expressed as

$$\xi = \left| \left( \mathbf{r}_{\mathrm{protein}} - \mathbf{r}_{\mathrm{DNA}} \right) - \left[ \mathbf{v} \cdot \left( \mathbf{r}_{\mathrm{protein}} - \mathbf{r}_{\mathrm{DNA}} \right) \right] \mathbf{v} \right| , \tag{1}$$

where v is the unit vector along the first principal axis of the DNA, and r_DNA and r_protein are the centers of mass of the DNA and the Lac repressor, respectively. To obtain the principal vector v, we first calculated the center of mass of each of the 12 bp to which the Lac repressor was bound, and then calculated the principal axis v from these 12 centers of mass.

The ABF simulations were performed according to the procedure presented by Darve and Pohorille [24]. The average force F(ξ) was evaluated as

$$F(\xi) = \left\langle \frac{d}{dt} \left( m_\xi \frac{d\xi}{dt} \right) \right\rangle_\xi , \tag{2}$$

where

$$m_\xi^{-1} = \sum_{k=1}^{n} m_k^{-1} \left[ \left( \frac{\partial \xi}{\partial x_k} \right)^2 + \left( \frac{\partial \xi}{\partial y_k} \right)^2 + \left( \frac{\partial \xi}{\partial z_k} \right)^2 \right] ,$$

n is the number of atoms included in the system, and k denotes the index of the atoms. The biasing force −F(ξ) was applied to the systems throughout the ABF simulations. This force was updated every MD step by evaluating the instantaneous value in the brackets of Eq. (2). The bin size of ξ was set to 0.1 Å. It should be noted that we scaled down the biasing force −F(ξ) by the damping factor N/Ndamp until the number of conformation samples N exceeded a threshold, Ndamp [24]. This is because evaluation using a small number of samples can yield a completely incorrect value of F(ξ). Once the sample number N reached Ndamp, the biasing force −F(ξ) was applied in full.

The ABF and free-energy calculations consist of three stages: dissociation, sampling, and post-processing. In the first stage, dissociation of the Lac repressor from the DNA was induced by performing 8 ns ABF simulations. The distance ξ was initially 12 Å for specific cases and 14 Å
for nonspecific cases, but it gradually increased and finally reached 29 Å. The dissociation speed can be adjusted by the parameter Ndamp [33], and six different settings of Ndamp (100, 5,000, 10,000, 20,000, 50,000, and 100,000) were examined; as a result, Ndamp = 5,000 was used in the next stage (see Results).

In the second stage, conformation-sampling ABF runs were carried out within partitioned windows, each with a width of 1 Å. We picked a starting structure for each window from the trajectories obtained in the dissociation stage with Ndamp = 5,000, and then carried out an ABF calculation in each window. Such window runs were realized by imposing half-harmonic potentials with a spring constant of 100 kcal/(mol Å^2) at both ends. The sampling ABF runs were carried out in parallel and continued until the resultant average force F(ξ) converged, that is, for 24 ns for each 1 Å window and 32 ns in some cases. In this sampling stage, Ndamp was set to 100,000; that is, the biasing force was added more gradually than in the dissociation stage with Ndamp = 5,000.

The third stage is post-processing, where the average force F(ξ) obtained from the sampling runs was processed to obtain the free-energy profile G(ξ) by thermodynamic integration,

$$G(\xi_2) - G(\xi_1) = \int_{\xi_1}^{\xi_2} F(\xi) \, d\xi . \tag{3}$$
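The three ingredients of this section, the reaction coordinate of Eq. (1), the N/Ndamp ramp on the biasing force, and the integration of Eq. (3), can be sketched with NumPy as follows. This is not the authors' sander implementation. The principal axis is taken here from an unweighted singular value decomposition of the base-pair centers, the integral uses the trapezoid rule and follows the sign of Eq. (3) as written, and all function names and the toy data are assumptions of this illustration.

```python
import numpy as np

def principal_axis(points):
    """Unit vector along the first principal axis of a set of 3D points."""
    centered = points - points.mean(axis=0)
    # The first right singular vector points along the direction of largest spread.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def reaction_coordinate(r_protein, r_dna, v):
    """Eq. (1): distance of the protein center of mass from the DNA axis."""
    d = r_protein - r_dna
    return float(np.linalg.norm(d - np.dot(v, d) * v))

def damping_factor(n_samples, n_damp):
    """Scale on the biasing force: N/Ndamp below the threshold, 1 above it."""
    return min(1.0, n_samples / n_damp)

def integrate_mean_force(xi, mean_force):
    """Eq. (3) by the trapezoid rule, with the profile set to 0 at xi[0]."""
    xi = np.asarray(xi, dtype=float)
    f = np.asarray(mean_force, dtype=float)
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(xi)
    return np.concatenate(([0.0], np.cumsum(steps)))

# Toy geometry (made-up data): 12 base-pair centers stacked along z, 3.4 Å apart.
bp_centers = np.array([[0.0, 0.0, 3.4 * i] for i in range(12)])
v = principal_axis(bp_centers)
r_dna = bp_centers.mean(axis=0)
r_protein = r_dna + np.array([12.0, 0.0, 5.0])
xi_value = reaction_coordinate(r_protein, r_dna, v)

# Toy mean force F = 2*xi on a 0.1-wide grid; its integral is xi^2.
grid = np.linspace(0.0, 1.0, 11)
profile = integrate_mean_force(grid, 2.0 * grid)
```

In the toy geometry the offset of 5 Å along the axis is projected out, so only the 12 Å perpendicular component contributes to ξ. The sign of the singular vector is arbitrary, but Eq. (1) is unchanged when v is replaced by −v.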

3. Results

3.1 Verification of the ABF results

We first examined the dissociation processes with various speeds. Figure 2 shows the process for the specific complex. In the initial state, the protein–DNA distance was ξ = 12 Å, where the Lac repressor was bound to the DNA tightly. As time proceeded, the Lac repressor gradually dissociated from the DNA, and protein–DNA contacts were almost lost when ξ reached 29 Å.
Different Ndamp values gave different dissociation speeds, as in the previous ABF study [33]. A very rapid dissociation with Ndamp = 100 led to an artificial collapse of the protein–DNA interface. On the other hand, with larger values of Ndamp (20,000 to 100,000), the process was too slow to reach complete dissociation (ξ = 29 Å) within a reasonable computational time (Fig. 2). The dissociation speeds with Ndamp = 5,000 to 10,000 produced almost the same dissociation process, in which dissociation starts at the C-terminal helix H4, followed by the remaining helices including the recognition helix H2 (the details are discussed later). In the nonspecific complex, almost the same process was obtained in the same range of Ndamp. Therefore, we decided to use the dissociation trajectories with Ndamp = 5,000 for the subsequent sampling stage. It should be mentioned that the ABF-derived dissociation processes were consistent with those obtained in H/D exchange experiments [14]. That study suggested that protein–DNA dissociation starts at the C-terminal region, followed by dissociation of the recognition helix H2, as seen in our ABF simulations.

Next, we carried out the ABF sampling runs. We checked the convergence of the average force F(ξ) against the number of conformation samples N (see Fig. 3a). The average force values for ten different bins (ξ = 14.5 to 15.5 Å) show that good convergence is achieved at a sample number of N = 400,000 to 500,000. However, with fewer samples, such as N < 200,000, the average force did not converge well. Thus, we concluded that at least 400,000 samples are required per 0.1 Å bin; the corresponding trajectory length is 8 ns per 1 Å window (= 400,000 × (1 Å / 0.1 Å) × 2 fs). Yet, because the resultant distribution of ξ was not completely uniform, the sampling run was continued until every bin contained at least 400,000 samples (see the inset in Fig. 3b). The resultant trajectory length was 24 to 32 ns for each window.
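The sampling-time estimate in the preceding paragraph is simple arithmetic and can be reproduced as follows. The sketch assumes one force sample per MD step, as in the formula quoted above, and perfectly uniform sampling over the bins of a window; the function names are illustrative.

```python
def window_time_ns(samples_per_bin, window_width=1.0, bin_width=0.1, timestep_fs=2.0):
    """Minimum trajectory length (ns) if every bin is sampled equally often."""
    n_bins = round(window_width / bin_width)
    total_steps = samples_per_bin * n_bins
    return total_steps * timestep_fs / 1.0e6  # fs to ns

def mean_samples_per_bin(time_ns, window_width=1.0, bin_width=0.1, timestep_fs=2.0):
    """Average number of samples per bin for a trajectory of the given length."""
    n_bins = round(window_width / bin_width)
    total_steps = time_ns * 1.0e6 / timestep_fs
    return total_steps / n_bins

minimum_ns = window_time_ns(400_000)          # uniform-sampling lower bound
average_at_24ns = mean_samples_per_bin(24.0)  # what a 24 ns window collects on average
```

The first call reproduces the 8 ns figure. The second shows that a 24 ns window collects about 1.2 million samples per bin on average, three times the threshold, which reflects how uneven the distribution of ξ must be for the least-visited bin to only just reach 400,000.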

3.2 Free-energy profiles for dissociation of specific and nonspecific complexes

Using the average force F(ξ) obtained from the sampling runs, we calculated the free-energy profiles as a function of the protein–DNA distance ξ (Fig. 4a). The figure shows a clear difference between the specific and nonspecific cases. The specific complex has a sharp, deep well, whereas the nonspecific complex has a broad, shallow well. These results demonstrate the expected feature that specific binding is tight and nonspecific binding is loose. The energy minimum in the specific case lies closer to the DNA axis, at ξ ~ 12.0 Å, whereas in the nonspecific case there were two minima, at ξ = 14.0 and 16.7 Å. The minimum for specific binding is about 5 kcal/mol deeper than the two minima of nonspecific binding.

Based on the free-energy profiles, the dissociation process can be characterized at three points: SI, SII, and SIII for specific binding, and NI, NII, and NIII for nonspecific binding (Fig. 5). In the specific case, at the first point SI (ξ = 12.0 Å), the C-terminal helix H4 is in tight contact with the DNA in the minor groove. This state corresponds to the NMR structure [8]. As ξ is increased, this C-terminal contact becomes slightly loose [SII (ξ = 15.7 Å)], but the α-helix is still maintained. At that point, the free-energy gradient substantially decreased. During the subsequent gentle slope [SII (ξ = 15.2 Å) to SIII (ξ = 26.0 Å)], the C-terminal region completely dissociated from the DNA. Within this process, the α-helix was destroyed. In addition, the remaining protein–DNA contacts gradually disappeared, and the regions around helices H2 and H3 finally dissociated at point SIII.

Next, we consider the structural changes in the nonspecific case, NI–NII in Fig. 5. The first point NI (ξ = 14.0 Å) corresponded to the NMR experimental structure [6], where the C-terminal region had some contacts with DNA, but the contacts were not tight, unlike in the specific case. At the next point NII (ξ = 16.7 Å), the C-terminal contacts had almost disappeared; however, the free energy was still almost the same. This indicates that whether or not the nonspecific C-terminal contacts are formed does not affect the free energy. This means that the entropy gain from fluctuation of the C-terminal region compensated for the loss in enthalpy caused by losing the protein–DNA complex structure. This description is consistent with the NMR result [6] and indicates that the C-terminal region forms various structures. From point NII, dissociation proceeded further in the same way as in the specific case; the remaining protein–DNA contacts gradually disappeared, and then final dissociation occurred around helices H2 and H3 after point NIII (ξ = 26.0 Å).

Compared with the C-terminal region, the H1-H3 domain shows smaller structural differences between the bound states SI and NI. However, the H1-H3 domain also has tight and loose contacts in the specific and nonspecific complexes, respectively. Such differences can be observed in the residue contacts shown later. The differences also appear in the fluctuations of the RMSD and orientation of the H1-H3 domain, which are shown in Figs. S1 and S2 (see Supporting Information).

3.3 Correlation with the number of protein–DNA contacts and of surface water molecules

We found that the above free-energy changes are well correlated with the number of protein–DNA contact atoms Ncontact and the number of water molecules at the DNA surface Nwater, which are shown in Figs. 4b and 4c, respectively. In the figure, we can see that as ξ increases, the free energy increases, the number of contact atoms Ncontact decreases, and the number of surface water molecules Nwater increases. In particular, the slopes of the profiles are also correlated; for example, Ncontact in the specific complex decreases significantly at ξ ~ 12.0 to 15.0 Å, and in the same region the free energy shows a significant change as well. One exceptional region is observed in the nonspecific case. Around the minima NI (ξ = 14.0 Å) and NII (ξ = 16.7 Å), despite the change in Ncontact, no significant change in the free energy is observed. This suggests that the related nonspecific contacts, which mainly form in the C-terminal region, are quite different in quality from those in the specific complex; that is, the contribution of the nonspecific contacts is almost equal to that of the entropy gained by the dissociation and thus contributes little to the
binding free energy, as seen in the previous section. A similar characterization can be made with another quantity, the solvent-accessible surface area. As shown in the inset in Fig. 4b, it showed almost the same tendency.

It is reasonable to assume that as the number of protein–DNA contacts increases, the number of surface water molecules decreases, and vice versa. In fact, Figure 4c confirms that the change in the number of water molecules at the DNA surface Nwater is correlated with changes in Ncontact and the free energy (Fig. 4). It has been shown that hydration water plays an essential role in the modulation of specific and nonspecific protein–DNA binding [37-38]. Here we can provide molecular pictures of the hydration. In Fig. 6, we show typical molecular configurations observed at specific SI (ξ = 12.0 Å) and nonspecific NI (ξ = 14.0 Å) and NII (ξ = 16.7 Å), which correspond to the free-energy minima (Fig. 4a). In specific binding, at SI (ξ = 12.0 Å), Nwater decreased by 50 (the value denotes the number of water molecules sequestered from the DNA surface; see the caption of Fig. 4). On the other hand, nonspecific binding has ~10 more surface water molecules at NI (ξ = 14 Å), and the number increases by a further 10 at NII (ξ = 17 Å). The changes in the number of water molecules are observed mainly in the DNA minor groove to which the C-terminal region binds, as indicated by the circles in Fig. 5. This is the molecular-level correlation between the surface water and the protein–DNA contacts.

3.4 Residue-level analysis of the protein–DNA contacts

Residue-level analysis provides a more detailed characterization. We calculated the probability that each protein residue makes any contact with DNA atoms. The analysis was made at three representative points for each complex, specific SI–SIII and nonspecific NI–NIII (Fig. 7). This analysis can tell us which contacts are essential for the free-energy changes. The change of the contact probabilities from SI to SII (indicated by arrows in the figure) accounts for a large increase in free energy. In the process from SI to SII, the contact probabilities are reduced mainly for residues located in the C-terminal H4 helix (Asn50, Ala53, Gln54, Leu56, Ala57, Lys59, Gln60, and Ser61). In addition, the residues Thr5, Leu6, Tyr7, Tyr17, Gln18, Asn25, and Tyr47 in the H1-H3 domain (indicated by arrows in Fig. 7) also have lowered contact probabilities. This is clearly seen in the molecular structure in Fig. 8. These residues bind tightly to the DNA at the SI point, but the binding loosens toward the SII point. In contrast, in the nonspecific case (Fig. 7), the corresponding residues do not show such high contact probabilities even at point NI (ξ = 14.0 Å). As a result, the free-energy change from NI to NII is small.

From the comparison between the specific and nonspecific cases, we conclude that the main factor for the formation of the deep free-energy well in the specific complex (i.e., binding specificity) is the contacts made by the C-terminal residues (Asn50, Ala53, Gln54, Leu56, Ala57, Lys59, Gln60, and Ser61) and by the residues Thr5, Leu6, Tyr7, Tyr17, Gln18, Asn25, and Tyr47 in the H1-H3 domain. Figure 8 indeed shows that these residues are located around the specific recognition sites (colored blue in Fig. 8). In particular, our simulation shows that three residues, Tyr7, Tyr17, and Gln18, make direct hydrogen bonds to the specific bases of DNA, as was also pointed out in previous NMR studies.6, 8

When protein–DNA dissociation proceeds further (i.e., SIII for specific or NIII for nonspecific), there is no marked difference in contact probability between the specific and nonspecific cases (Fig. 7). Parts of the helices H2 and H3 (Thr19, Arg22, Lys33, and Thr34) and their neighbors remain in contact with the DNA until complete dissociation (see also the molecular structure in Fig. 5).
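A per-residue contact probability of the kind analyzed here can be sketched as the fraction of trajectory frames in which any atom of the residue lies within a distance cutoff of any DNA atom. The 4.0 Å cutoff, the frame layout, and the function names below are assumptions made for illustration; the contact criterion actually used in the study is the one given in its Methods, and the coordinates are toy values.

```python
import math

def in_contact(residue_atoms, dna_atoms, cutoff):
    """True if any residue atom is closer than `cutoff` to any DNA atom."""
    return any(
        math.dist(a, b) < cutoff
        for a in residue_atoms
        for b in dna_atoms
    )

def contact_probability(frames, cutoff=4.0):
    """Fraction of frames in which the residue contacts the DNA.

    `frames` is a list of (residue_atoms, dna_atoms) pairs, each a list
    of (x, y, z) coordinates in angstroms. The default cutoff is an
    assumed, illustrative value.
    """
    if not frames:
        raise ValueError("no frames given")
    hits = sum(1 for res, dna in frames if in_contact(res, dna, cutoff))
    return hits / len(frames)

# Toy frames (assumption): one residue atom and one DNA atom per frame.
dna = [(0.0, 0.0, 0.0)]
toy_frames = [
    ([(3.0, 0.0, 0.0)], dna),   # 3.0 A  -> contact
    ([(0.0, 3.5, 0.0)], dna),   # 3.5 A  -> contact
    ([(6.0, 0.0, 0.0)], dna),   # 6.0 A  -> no contact
    ([(1.0, 1.0, 1.0)], dna),   # ~1.7 A -> contact
]
p_toy = contact_probability(toy_frames)
```

Evaluating such a probability separately over the frames sampled near SI, SII, and SIII (or NI to NIII) would give the kind of per-residue comparison summarized in Fig. 7.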

4. Discussion

4.1 Binding energetics of the specific and nonspecific complexes

With all-atom ABF MD simulations, we have obtained dissociation free-energy profiles of the specific and nonspecific complexes of the DNA–Lac repressor. The obtained free-energy profiles distinctly capture the two binding modes. In nonspecific binding, the protein is loosely bound to DNA, whereas in specific binding, the protein is tightly bound. In particular, the difference in the free-energy well depth, ~5 kcal/mol per monomer (Fig. 4a), agrees well with experimental data on binding energetics.6, 39-42 We also found that the free-energy changes are correlated with the numbers of atom contacts and of surface water molecules. Residue-level analysis showed that the difference between specific and nonspecific binding, ~5 kcal/mol, mainly comes from the interaction of the C-terminal tail with the DNA and from some other residues interacting with specific recognition sites.

Slutsky and Mirny et al.15-16 have studied the balance of specific and nonspecific binding theoretically. Because nonspecific binding must be weak for efficient movement, whereas specific binding must be strong for sequence recognition, they estimated that the energy gap would be a few kcal/mol and considered that the nonspecific and specific states must take different conformations to produce such an energy gap. Our result provides a concrete example supporting this theoretical consideration,15-16 because the conformational difference between the nonspecific and specific complexes of Lac-DNA, particularly the disordered versus helical C-terminal region, is indeed responsible for the energetic gap of ~5 kcal/mol. Similar disordered tails are found in many other DNA-binding proteins,43-44 where they may also serve as regulators of switching between specific and nonspecific binding.

A further question regarding the C-terminal tail arises: what factors trigger the helix formation on the specific sequence? We can suppose the following two possibilities. One is that specific contact with the recognition sequence (e.g., hydrogen bonding) triggers the helix formation, and the other is that the specific shape of DNA triggers the helix formation. Indeed, the specific DNA region is bent and the minor groove is widened. This conformation appears essential for binding and helix formation of the C-terminal tail. Although the present MD simulations cannot determine which factor is important, this could be clarified by another type of simulation (e.g., an MD-based conformational search of the C-terminal tail with a DNA whose conformation is specific but whose base composition is not).

A closely related MD study45 was recently published, which addressed DNA-Lac repressor specific and nonspecific energetics by MM/PBSA binding free-energy analysis using structures without the C-terminal tail. Although direct comparison is difficult because of the differences in method and in system setting (i.e., the C-terminal region was deleted in that study45), we can find a consensus in that the C-terminal region is important. The MM/PBSA analysis45 showed no superior binding strength of the specific complex, which implies that the C-terminal contacts are required for the binding specificity. Our result demonstrates that the behavior of the C-terminal tail contributes to the dominant stability of the specific complex.

4.2 Protein spatial distribution around the specific and nonspecific DNA sites

From the dissociation free-energy profiles, we can assess the spatial distribution, P(ξ), of the Lac repressor protein around DNA, which was derived from $P(\xi_1)/P(\xi_2) = e^{-[G(\xi_1) - G(\xi_2)]/k_B T}$. The distribution P(ξ) (Fig. 4a, inset) suggests that the protein at the specific site almost always stays around the SI state (ξ = 12 Å). That is, the specific protein–DNA contacts are always maintained. This is quite reasonable because specific binding must be considerably stable to recognize the specific DNA sequence. On the other hand, the nonspecific P(ξ) apparently has a wider distribution around ξ = 13 to 18 Å; however, this wide distribution is due to fluctuation of the C-terminal tail, and no global disruption occurs even at the nonspecific site.
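The relation between P(ξ) and G(ξ) above can be turned into a normalized distribution on a discrete ξ grid. The sketch below assumes uniformly spaced bins, no additional Jacobian weighting, and a rounded k_BT of 0.6 kcal/mol near 300 K; the function name and the three-point profile are illustrative placeholders, not the profile of Fig. 4a.

```python
import math

KBT_KCAL_PER_MOL = 0.6  # assumed rounded value of k_B*T near 300 K

def boltzmann_distribution(free_energies, kbt=KBT_KCAL_PER_MOL):
    """Normalized P(xi) from a free-energy profile G(xi) in kcal/mol.

    Uses P(xi_1)/P(xi_2) = exp(-[G(xi_1) - G(xi_2)] / kBT); the profile
    minimum is subtracted first so the exponentials stay well scaled.
    """
    if not free_energies:
        raise ValueError("empty profile")
    g_min = min(free_energies)
    weights = [math.exp(-(g - g_min) / kbt) for g in free_energies]
    total = sum(weights)
    return [w / total for w in weights]

# Placeholder profile (assumption): three bins spaced by one kBT.
g_demo = [0.0, 0.6, 1.2]
p_demo = boltzmann_distribution(g_demo)
```

Because the weights fall off exponentially, a well that is several kcal/mol deep concentrates nearly all of the probability at its minimum, which is why a sharp, deep profile gives a narrow P(ξ) and a broad, shallow one gives a wide P(ξ).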

Our calculation indicates that once a protein encounters DNA, dissociation is rare, irrespective of whether the binding is specific or nonspecific, implying that the protein will use the same binding interface and orientation for binding to nonspecific DNA as for binding to specific DNA. This DNA-binding mode has indeed been reported for a small DNA-binding protein, the homeodomain of HOXD9, by Iwahara and Clore et al.11-12

4.3 Dissociation versus sliding: Insight into the kinetics of DNA binding

Another insight from our free-energy calculation concerns the kinetics of DNA binding. It is currently thought that DNA-binding proteins search for the target site through two different processes: sliding on the DNA surface (1D search) and dissociation from DNA (3D search); however, the detailed mechanism (such as the ratio of the two processes) is not yet elucidated and is the subject of considerable debate.1, 46 Here, we consider a Lac repressor protein bound to a nonspecific site of DNA. Which process, dissociation or sliding, is more likely to occur? The result presented in Fig. 4a shows that the free-energy barrier for dissociation, $\Delta G^{\ddagger}_{\mathrm{dissociation}}$, is at least ~16 kcal/mol [see the nonspecific case: G(ξ = 29 Å) − G(ξ = 14 Å)]. Furini et al.17 showed that the free-energy barrier for sliding, $\Delta G^{\ddagger}_{\mathrm{sliding}}$, is ~8.7 kcal/mol, where one barrier-crossing event corresponds to a 1-base translocation of the Lac repressor along the DNA.17 The comparison between $\Delta G^{\ddagger}_{\mathrm{dissociation}}$ and $\Delta G^{\ddagger}_{\mathrm{sliding}}$ clearly shows that sliding is much more likely to occur than dissociation. How are they different in terms of barrier-crossing rates? A rough estimate using transition state theory provides

$$\frac{k_{\mathrm{sliding}}}{k_{\mathrm{dissociation}}} = \frac{v\, e^{-\Delta G^{\ddagger}_{\mathrm{sliding}}/k_B T}}{v'\, e^{-\Delta G^{\ddagger}_{\mathrm{dissociation}}/k_B T}} = e^{-(8.7 - 16)/k_B T} = 1.9 \times 10^{5}, \tag{4}$$

assuming v = v′ and room temperature (T = 300 K). This means that 1-base sliding is 190,000 times more likely to occur than one dissociation. Although the free-energy barrier will be the most dominant factor for the rate constants, an accurate comparison requires more careful consideration of the coefficients v and v′. Therefore, Eq. (4) gives only a tentative value for the ratio of the rates of the two processes, but such an estimate has important implications for understanding the mechanisms of the target search process in DNA-binding proteins.1

The comparison above was made for the same DNA-binding domain of the Lac repressor, but we should mention one difference between the calculations of Furini et al.17 and ours: we included all 62 amino acid residues in the calculation, whereas Furini et al. deleted C-terminal residues 47–62. However, this difference probably would not affect the result fundamentally because the nonspecific C-terminal tail makes little contribution to the free energy, as we have shown (see Figs. 4a and 7).

Presumably, the ratio $1.9 \times 10^{5}$ [from Eq. (4)] is the lower bound of $k_{\mathrm{sliding}}/k_{\mathrm{dissociation}}$; the actual value is thought to be much larger. This is because the above estimate using Eq. (4) was made with $\Delta G^{\ddagger}_{\mathrm{sliding}}$ = 8.7 kcal/mol,17 but other experimental13 and theoretical16, 47 studies suggest that the free-energy barrier is still lower, $\Delta G^{\ddagger}_{\mathrm{sliding}}$ ~ 1 kcal/mol.3, 48 These data were not for the Lac repressor, but experimental diffusion constants49 for Lac repressor sliding also suggest lower barriers, $\Delta G^{\ddagger}_{\mathrm{sliding}}$ = ~4–7 kcal/mol (Fig. 6 in Ref. 17). For example, if we use $\Delta G^{\ddagger}_{\mathrm{sliding}}$ = 1 kcal/mol instead of 8.7 kcal/mol, we obtain $k_{\mathrm{sliding}}/k_{\mathrm{dissociation}} = 7.2 \times 10^{10}$; the ratio between the rates becomes larger than the one obtained by Eq. (4). The value of $\Delta G^{\ddagger}_{\mathrm{sliding}}$ = 8.7 kcal/mol reported by Furini et al.17 is relatively high, probably because their MD simulation employed a regular helical axis as a reaction coordinate. Actual protein sliding will be slightly perturbed, however, thereby reducing the free-energy barrier.
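As a rough numerical check of Eq. (4), the ratio can be evaluated directly from the two barriers. The sketch below assumes equal prefactors (v = v′), as in the text, and uses k_BT ≈ 0.6 kcal/mol, a rounded room-temperature value chosen here because it reproduces the quoted figures; the function name `rate_ratio` is illustrative and not part of the original work.

```python
import math

def rate_ratio(dg_sliding, dg_dissociation, kbt=0.6):
    """k_sliding / k_dissociation from transition state theory.

    Barriers and kbt are in kcal/mol. Equal prefactors (v = v') are
    assumed, so the ratio reduces to a single Boltzmann factor.
    kbt = 0.6 is an assumed rounded value for T = 300 K.
    """
    return math.exp(-(dg_sliding - dg_dissociation) / kbt)

# Barrier values quoted in the text (kcal/mol).
DG_DISSOCIATION = 16.0   # lower bound from the nonspecific profile
DG_SLIDING_MD = 8.7      # sliding barrier from Furini et al.
DG_SLIDING_LOW = 1.0     # lower estimate discussed in the text

ratio_md = rate_ratio(DG_SLIDING_MD, DG_DISSOCIATION)    # about 1.9e5
ratio_low = rate_ratio(DG_SLIDING_LOW, DG_DISSOCIATION)  # about 7.2e10
```

With the unrounded value k_BT ≈ 0.596 kcal/mol at 300 K, the first ratio comes out near 2 × 10^5, so the rounding affects only the leading digit; the exponential dependence on the barrier difference is what drives the many-orders-of-magnitude change between the two cases.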

The current discussion does not mean that dissociation is always a trivial process in the target-site search of DNA-binding proteins. Though dissociation is less likely to occur than sliding, dissociation certainly happens and plays an important role in biological processes, as shown previously.50 For example, if there are DNA structural defects or obstacles such as other DNA-binding proteins or histone proteins, protein-bound states become more unstable, and dissociation accordingly occurs more readily.51 Condensation of DNA also helps protein dissociation. Such a mechanism is discussed as intersegment transfer52 or the monkey bar mechanism.44 Ionic conditions also affect the dissociation.40, 53 Our calculations were carried out at ~0.1 M, which is close to a physiological concentration of 0.15 M. Therefore, dissociation is usually unlikely to occur at a dilute concentration of DNA, but it is more likely to occur in a condensed condition, such as inside a nucleus.

4.4 Comparison with the experimental binding free energies

As shown, the free-energy difference between the specific and nonspecific complexes was ~5 kcal/mol (Fig. 4a), which agrees well with the experimental data of a few kcal/mol.6, 39-42 However, we must be careful in making further comparisons of the absolute binding free energies. Figure 4a shows that the free-energy difference between the bound and dissociated states is at least ~21 kcal/mol for the specific case and ~16 kcal/mol for the nonspecific case. These values do not include all the components of the absolute binding free energies.54 For example, the translational and orientational entropy of the protein,54-55 which arises from the difference in the number of configurations between the bound and dissociated states, is not included. This component amounts to 5–10 kcal/mol,55-57 if we assume the standard-state concentration at which the binding free energy is usually defined. Taking this reduction of 5–10 kcal/mol into account, we obtain a free energy of 11–16 kcal/mol for specific binding and 6–11 kcal/mol for nonspecific binding. These values fall in a range similar to the binding free energies obtained experimentally for the Lac repressor DNA-binding domain (8–15 kcal/mol).25, 39, 58 To develop a more quantitative comparison with the experimental data, a more sophisticated approach designed for that purpose56 and careful consideration of other factors such as ionic conditions will be required.
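The arithmetic behind these corrected ranges is a simple interval subtraction, sketched below. The helper names are illustrative; the well depths (~21 and ~16 kcal/mol), the 5–10 kcal/mol entropic term, and the 8–15 kcal/mol experimental range are the values quoted in the text, and the overlap test is only a crude consistency check, not a substitute for a proper standard-state calculation.

```python
def corrected_range(well_depth, correction=(5.0, 10.0)):
    """Subtract an uncertain correction (low, high) from a well depth.

    All values are in kcal/mol. Returns (lowest, highest) corrected
    binding free-energy magnitude.
    """
    low_corr, high_corr = correction
    return (well_depth - high_corr, well_depth - low_corr)

def ranges_overlap(a, b):
    """True if two closed intervals (low, high) share any value."""
    return a[0] <= b[1] and b[0] <= a[1]

specific = corrected_range(21.0)      # (11.0, 16.0)
nonspecific = corrected_range(16.0)   # (6.0, 11.0)
experimental = (8.0, 15.0)            # range quoted for the DNA-binding domain

specific_ok = ranges_overlap(specific, experimental)
nonspecific_ok = ranges_overlap(nonspecific, experimental)
```

Both corrected intervals overlap the experimental range, which is the sense in which the text describes the values as falling in a similar range.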

4.5 Computational limitations

We here discuss possible effects of the Cα-Cα restraints imposed on the Lac H1-H3 domain. The NMR study59 showed that the structural change in the H1-H3 domain between the bound and free states is not large. For the intermediate dissociation process, no experimental structure is available, and our MD calculations assume that no conformational changes occur. If this is true, the resultant free-energy profiles are little influenced by the Cα-Cα restraints introduced here. On the other hand, if the protein H1-H3 domain experiences some conformational changes in the dissociation process, the free-energy barriers will be lower than those we obtained in this study. However, our free energies appear reasonable even quantitatively, implying that the effect of the restraints is not significant in this system. To address such possibilities more clearly, fully flexible simulations, which require far more computational resources, will be needed.

5. Concluding Remarks

With all-atom ABF molecular dynamics simulations, we have investigated the dissociation process and calculated the free-energy profiles of the specific and nonspecific DNA–Lac repressor complexes. The dissociation starts at the C-terminal region, which is consistent with the previous suggestion from experimental H/D exchange data. The obtained free-energy profiles show that the specific complex has a sharp, deep well, whereas the nonspecific complex has a broad, shallow well. This explains well the observation that specific complexes are tightly bound and nonspecific complexes are loosely bound. In particular, the free-energy difference of ~5 kcal/mol between the well depths of the specific and nonspecific complexes agreed with the experimentally obtained value, which has important implications for future approaches to dissecting specific and nonspecific complexes by free-energy calculations.

We also found that the free-energy changes are well correlated with the number of protein–DNA atom contacts and that of surface water molecules. More detailed residue-level analysis found that the specific C-terminal contacts, as well as some other specific contacts, account for the energetic difference associated with the binding specificity, ~5 kcal/mol. This result verifies the importance of the C-terminal region in the specific-to-nonspecific modulation, which has been proposed by Kalodimos et al.7 Also, the present result is consistent with a theoretical account by Slutsky and Mirny15-16 in that conformational changes are responsible for the essential energy gap between specific and nonspecific complexes.

The free-energy profiles provided information about the Lac repressor spatial distribution around specific and nonspecific DNA sites. The resultant distribution P(ξ) shows that a small collapse, corresponding to fluctuation of the C-terminal tail, occurs at the nonspecific site; nonetheless, no large dissociation occurs at either the specific or the nonspecific site. This result is consistent with the physical explanation of the Iwahara-Clore view11-12 that a DNA-binding protein, while sliding on the DNA, uses the same interface and orientation for binding as those used in specific binding.

Our dissociation free-energy profiles were compared with the sliding free-energy profiles previously obtained for the same Lac repressor. This comparison provided insight into the kinetics of DNA-binding proteins: sliding is much preferred to dissociation. Frequent occurrence of dissociation is usually not expected and requires some additional conditions such as DNA deformation, obstacles, approach of another part of DNA, and higher ionic concentrations.

Recently, alternative computational approaches using coarse-grained44, 60 or stochastic47 simulations, which can cover a wider range of protein movement around DNA, have been performed. The all-atom MD simulations performed here are more restricted as to the spatial region explored (owing to the high computational expense), but the results are more realistic. Therefore, all-atom MD results will be useful for verifying the results of coarse-grained or stochastic simulations.

Supporting Information

RMSD (Fig. S1) and orientation (Fig. S2) of the Lac H1-H3 domain. This material is available free of charge via the Internet at http://pubs.acs.org.

ACKNOWLEDGMENT

This work was supported by Grants-in-Aid for Scientific Research (Nos. 21107532 and 23114723) from the Ministry of Education, Culture, Sports, Science and Technology in Japan. A part of this research has also been funded by MEXT Strategic Programs for Innovative Research, Computational Life Science and Application in Drug Discovery and Medical Development.

REFERENCES

1. Halford, S. E., An End to 40 Years of Mistakes in DNA–Protein Association Kinetics? Biochem. Soc. Trans. 2009, 37, 343-348.
2. Halford, S. E.; Marko, J. F., How Do Site-Specific DNA-Binding Proteins Find Their Targets? Nucleic Acids Res. 2004, 32, 3040-3052.
3. Zakrzewska, K.; Lavery, R., Towards a Molecular View of Transcriptional Control. Curr. Opin. in Struct. Biol. 2012, 22, 160-167.
4. Riggs, A. D.; Bourgeois, S.; Cohn, M., The Lac Repressor-Operator Interaction. 3. Kinetic Studies. J. Mol. Biol. 1970, 53, 401-417.
5. von Hippel, P. H.; Berg, O. G., Facilitated Target Location in Biological Systems. J. Biol. Chem. 1989, 264, 675-678.
6. Kalodimos, C. G.; Biris, N.; Bonvin, A. M. J. J.; Levandoski, M. M.; Guennuegues, M.; Boelens, R.; Kaptein, R., Structure and Flexibility Adaptation in Nonspecific and Specific Protein-DNA Complexes. Science 2004, 305, 386-389.
7. Kalodimos, C. G.; Boelens, R.; Kaptein, R., Toward an Integrated Model of Protein-DNA Recognition as Inferred from NMR Studies on the Lac Repressor System. Chem. Rev. 2004, 104, 3567-3586.
8. Kalodimos, C. G.; Bonvin, A. M. J. J.; Salinas, R. K.; Wechselberger, R.; Boelens, R.; Kaptein, R., Plasticity in Protein-DNA Recognition: Lac Repressor Interacts with Its Natural Operator O1 Through Alternative Conformations of Its DNA-Binding Domain. EMBO J. 2002, 21, 2866-2876.
9. Winkler, F. K.; Banner, D. W.; Oefner, C.; Tsernoglou, D.; Brown, R. S.; Heathman, S. P.; Bryan, R. K.; Martin, P. D.; Petratos, K.; Wilson, K. S., The Crystal Structure of EcoRV Endonuclease and of Its Complexes with Cognate and Non-Cognate DNA Fragments. EMBO J. 1993, 12, 1781-1795.
10. Viadiu, H.; Aggarwal, A. K., Structure of BamHI Bound to Nonspecific DNA: A Model for DNA Sliding. Mol. Cell 2000, 5, 889-895.

11. Iwahara, J.; Clore, G. M., Detecting Transient Intermediates in Macromolecular Binding by Paramagnetic NMR. Nature 2006, 440, 1227-1230.
12. Iwahara, J.; Zweckstetter, M.; Clore, G. M., NMR Structural and Kinetic Characterization of a Homeodomain Diffusing and Hopping on Nonspecific DNA. Proc. Natl. Acad. Sci. USA 2006, 103, 15062-15067.
13. Blainey, P. C.; Luo, G.; Kou, S. C.; Mangel, W. F.; Verdine, G. L.; Bagchi, B.; Xie, X. S., Nonspecifically Bound Proteins Spin While Diffusing Along DNA. Nature Struct. Mol. Biol. 2009, 16, 1224-1229.
14. Kalodimos, C. G.; Boelens, R.; Kaptein, R., A Residue-Specific View of the Association and Dissociation Pathway in Protein-DNA Recognition. Nature Struct. Biol. 2002, 9, 193-197.
15. Mirny, L.; Slutsky, M.; Wunderlich, Z.; Tafvizi, A.; Leith, J.; Kosmrlj, A., How a Protein Searches for Its Site on DNA: The Mechanism of Facilitated Diffusion. J. Phys. A: Math. Theor. 2009, 42, 434013.
16. Slutsky, M.; Mirny, L. A., Kinetics of Protein-DNA Interaction: Facilitated Target Location in Sequence-Dependent Potential. Biophys. J. 2004, 87, 4021-4035.
17. Furini, S.; Domene, C.; Cavalcanti, S., Insights into the Sliding Movement of the Lac Repressor Nonspecifically Bound to DNA. J. Phys. Chem. B 2010, 114, 2238-2245.
18. Bouvier, B.; Lavery, R., A Free Energy Pathway for the Interaction of the SRY Protein with Its Binding Site on DNA from Atomistic Simulations. J. Am. Chem. Soc. 2009, 131, 9864-9865.
19. Vargiu, A. V.; Ruggerone, P.; Magistrato, A.; Carloni, P., Dissociation of Minor Groove Binders from DNA: Insights from Metadynamics Simulations. Nucleic Acids Res. 2008, 36, 5910-5921.
20. Mukherjee, A.; Lavery, R.; Bagchi, B.; Hynes, J. T., On the Molecular Mechanism of Drug Intercalation into DNA: A Simulation Study of the Intercalation Pathway, Free Energy, and DNA Structural Changes. J. Am. Chem. Soc. 2008, 130, 9747-9755.
21. Villa, E.; Balaeff, A.; Schulten, K., Structural Dynamics of the Lac Repressor-DNA Complex Revealed by a Multiscale Simulation. Proc. Natl. Acad. Sci. USA 2005, 102, 6783-6788.

22. Chipot, C.; Pohorille, A., Free Energy Calculations: Theory and Applications in Chemistry and Biology. Springer: Berlin, 2007.
23. Darve, E.; Pohorille, A., Calculating Free Energies Using Average Force. J. Chem. Phys. 2001, 115, 9169-9183.
24. Darve, E.; Rodríguez-Gómez, D.; Pohorille, A., Adaptive Biasing Force Method for Scalar and Vector Free Energy Calculations. J. Chem. Phys. 2008, 128, 144120.
25. Kalodimos, C. G.; Folkers, G. E.; Boelens, R.; Kaptein, R., Strong DNA Binding by Covalently Linked Dimeric Lac Headpiece: Evidence for the Crucial Role of the Hinge Helices. Proc. Natl. Acad. Sci. USA 2001, 98, 6039-6044.
26. Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C., Comparison of Multiple Amber Force Fields and Development of Improved Protein Backbone Parameters. Proteins 2006, 65, 712-725.
27. Pérez, A.; Marchán, I.; Svozil, D.; Sponer, J.; Cheatham III, T. E.; Laughton, C. A.; Orozco, M., Refinement of the AMBER Force Field for Nucleic Acids: Improving the Description of α/γ Conformers. Biophys. J. 2007, 92, 3817-3829.
28. Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L., Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926-935.
29. Case, D. A.; Darden, T. A.; Cheatham III, T. E.; Simmerling, C. L.; Wang, J.; Duke, R. E.; Luo, R.; Crowley, M.; Walker, R. C.; Zhang, W.; Merz, K. M.; Wang, B.; Hayik, S.; Roitberg, A.; Seabra, G.; Kolossváry, I.; Wong, K. F.; Paesani, F.; Vanicek, J.; Wu, X.; Brozell, S. R.; Steinbrecher, T.; Gohlke, H.; Yang, L.; Tan, C.; Mongan, J.; Hornak, V.; Cui, G.; Mathews, D. H.; Seetin, M. G.; Sagui, C.; Babin, V.; Kollman, P. A. AMBER 10, University of California, San Francisco: 2008.
30. Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R., Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81, 3684-3690.
31. Darden, T.; York, D.; Pedersen, L., Particle Mesh Ewald - an N·log(N) Method for Ewald Sums in Large Systems. J. Chem. Phys. 1993, 98, 10089-10092.

32. Higo, J.; Nishimura, Y.; Nakamura, H., A Free-Energy Landscape for Coupled Folding and Binding of an Intrinsically Disordered Protein in Explicit Solvent from Detailed All-Atom Computations. J. Am. Chem. Soc. 2011, 133, 10448-10458.
33. Hénin, J.; Fiorin, G.; Chipot, C.; Klein, M. L., Exploring Multidimensional Free Energy Landscapes Using Time-Dependent Biases on Collective Variables. J. Chem. Theory Comput. 2010, 6, 35-47.
34. Lee, E. H.; Hsin, J.; Mayans, O.; Schulten, K., Secondary and Tertiary Structure Elasticity of Titin Z1Z2 and a Titin Chain Model. Biophys. J. 2007, 93, 1719-1735.
35. Hénin, J.; Pohorille, A.; Chipot, C., Insights into the Recognition and Association of Transmembrane α-Helices. The Free Energy of α-Helix Dimerization in Glycophorin A. J. Am. Chem. Soc. 2005, 127, 8478-8484.
36. Xu, J.; Crowley, M. F.; Smith, J. C., Building a Foundation for Structure-Based Cellulosome Design for Cellulosic Ethanol: Insight into Cohesin-Dockerin Complexation from Computer Simulation. Protein Sci. 2009, 18, 949-959.
37. Fried, M. G.; Stickle, D. F.; Smirnakis, K. V.; Adams, C.; MacDonald, D.; Lu, P., Role of Hydration in the Binding of Lac Repressor to DNA. J. Biol. Chem. 2002, 277, 50676-50682.
38. Sidorova, N. Y.; Rau, D. C., Differences in Water Release for the Binding of EcoRI to Specific and Nonspecific DNA Sequences. Proc. Natl. Acad. Sci. USA 1996, 93, 12272-12277.
39. Botuyan, M. V.; Keire, D. A.; Kroen, C.; Gorenstein, D. G., 31P Nuclear Magnetic Resonance Spectra and Dissociation Constants of Lac Repressor Headpiece Duplex Operator Complexes: The Importance of Phosphate Ester Backbone Flexibility in Protein-DNA Recognition. Biochemistry 1993, 32, 6863-6874.
40. Ha, J.-H.; Capp, M. W.; Hohenwalter, M. D.; Baskerville, M.; Record Jr., M. T., Thermodynamic Stoichiometries of Participation of Water, Cations and Anions in Specific and Non-Specific Binding of Lac Repressor to DNA. J. Mol. Biol. 1992, 228, 252-264.
41. Kao-Huang, Y.; Revzin, A.; Butler, A. P.; O'Conner, P.; Noble, D. W.; von Hippel, P. H., Nonspecific DNA Binding of Genome-Regulating Proteins as a Biological Control Mechanism: Measurement of DNA-Bound Escherichia Coli Lac Repressor in Vivo. Proc. Natl. Acad. Sci. USA 1977, 74, 4228-4232.

42. Frank, D. E.; Saecker, R. M.; Bond, J. P.; Capp, M. W.; Tsodikov, O. V.; Melcher, S. E.; Levandoski, M. M.; Record Jr., M. T., Thermodynamics of the Interactions of Lac Repressor with Variants of the Symmetric Lac Operator: Effects of Converting a Consensus Site to a Nonspecific Site. J. Mol. Biol. 1997, 267, 1186-1206.
43. Crane-Robinson, C.; Dragan, A. I.; Privalov, P. L., The Extended Arms of DNA-Binding Domains: A Tale of Tails. Trends in Biochem. Sci. 2006, 31, 547-552.
44. Vuzman, D.; Azia, A.; Levy, Y., Searching DNA via a “Monkey Bar” Mechanism: The Significance of Disordered Tails. J. Mol. Biol. 2010, 396, 674-684.
45. Furini, S.; Barbini, P.; Domene, C., DNA-Recognition Process Described by MD Simulations of the Lactose Repressor Protein on a Specific and a Non-Specific DNA Sequence. Nucleic Acids Res. 2013, 41, 3963-3972.
46. Elf, J.; Li, G.-W.; Xie, X. S., Probing Transcription Factor Dynamics at the Single-Molecule Level in a Living Cell. Science 2007, 316, 1191-1194.
47. Chen, C.; Pettitt, B. M., The Binding Process of a Nonspecific Enzyme with DNA. Biophys. J. 2011, 101, 1139-1147.
48. Gorman, J.; Greene, E. C., Visualizing One-Dimensional Diffusion of Proteins Along DNA. Nature Struct. Mol. Biol. 2008, 15, 768-774.
49. Wang, Y. M.; Austin, R. H.; Cox, E. C., Single Molecule Measurements of Repressor Protein 1D Diffusion on DNA. Phys. Rev. Lett. 2006, 97, 048302.
50. Gowers, D. M.; Wilson, G. G.; Halford, S. E., Measurement of the Contributions of 1D and 3D Pathways to the Translocation of a Protein Along DNA. Proc. Natl. Acad. Sci. USA 2005, 102, 15883-15888.
51. Gorman, J.; Plys, A. J.; Visnapuu, M.-L.; Alani, E.; Greene, E. C., Visualizing One-Dimensional Diffusion of Eukaryotic DNA Repair Factors Along a Chromatin Lattice. Nature Struct. Mol. Biol. 2010, 17, 932-938.
52. Zandarashvili, L.; Vuzman, D.; Esadze, A.; Takayama, Y.; Sahu, D.; Levy, Y.; Iwahara, J., Asymmetrical Roles of Zinc Fingers in Dynamic DNA-Scanning Process by the Inducible Transcription Factor Egr-1. Proc. Natl. Acad. Sci. USA 2012, 109, E1724-E1732.

53. Barkley, M. D.; Lewis, P. A.; Sullivan, G. E., Ion Effects on the Lac Repressor-Operator Equilibrium. Biochemistry 1981, 20, 3842-3851.
54. Gilson, M. K.; Given, J. A.; Bush, B. L.; McCammon, J. A., The Statistical-Thermodynamic Basis for Computation of Binding Affinities: A Critical Review. Biophys. J. 1997, 72, 1047-1069.
55. Swanson, J. M. J.; Henchman, R. H.; McCammon, J. A., Revisiting Free Energy Calculations: A Theoretical Connection to MM/PBSA and Direct Calculation of the Association Free Energy. Biophys. J. 2004, 86, 67-74.
56. Woo, H.-J.; Roux, B., Calculation of Absolute Protein–Ligand Binding Free Energy from Computer Simulations. Proc. Natl. Acad. Sci. USA 2005, 102, 6825-6830.
57. Deng, Y.; Roux, B., Computations of Standard Binding Free Energies with Molecular Dynamics Simulations. J. Phys. Chem. B 2009, 113, 2234-2246.
58. Romanuka, J.; Folkers, G. E.; Biris, N.; Tishchenko, E.; Wienk, H.; Bonvin, A. M. J. J.; Kaptein, R.; Boelens, R., Specificity and Affinity of Lac Repressor for the Auxiliary Operators O2 and O3 Are Explained by the Structures of Their Protein-DNA Complexes. J. Mol. Biol. 2009, 390, 478-489.
59. Slijper, M.; Bonvin, A. M. J. J.; Boelens, R.; Kaptein, R., Refined Structure of Lac Repressor Headpiece (1-56) Determined by Relaxation Matrix Calculations from 2D and 3D NOE Data: Change of Tertiary Structure upon Binding to the Lac Operator. J. Mol. Biol. 1996, 259, 761-773.
60. Terakawa, T.; Kenzaki, H.; Takada, S., p53 Searches on DNA by Rotation-Uncoupled Sliding at C-Terminal Tails and Restricted Hopping of Core Domains. J. Am. Chem. Soc. 2012, 134, 14555-14562.
61. DeLano, W. L. The PyMOL Molecular Graphics System, DeLano Scientific: San Carlos, CA, 2002.

Figure 1. Simulation systems used in this study. The Lac repressor is bound to the specific and nonspecific DNA. These initial structures were constructed from NMR data.6,8 Helices H1–H3 are colored yellow. The C-terminal region (residues 50–62), colored red, takes different conformations: α-helix (specific) and disordered (nonspecific). The schematic illustration (at right) shows the reaction coordinate ξ, defined as the distance between the principal axis of the DNA (its helical axis) and the center of mass of the Lac repressor. All molecular graphics in this article were created using PyMOL.61
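The reaction coordinate ξ described in this caption can be illustrated with a short numerical sketch. The code below is not the authors' implementation: it assumes the DNA axis can be approximated by the first principal component of the DNA atom coordinates, and the function names (`principal_axis`, `center_of_mass`, `xi`) and the toy coordinates are hypothetical.

```python
import numpy as np

def principal_axis(coords):
    """Return (centroid, unit vector) of the best-fit line through coords (N x 3)."""
    centroid = coords.mean(axis=0)
    # The first right-singular vector of the centered coordinates points
    # along the direction of largest variance (assumed here to be the DNA axis).
    _, _, vt = np.linalg.svd(coords - centroid, full_matrices=False)
    return centroid, vt[0]

def center_of_mass(coords, masses):
    """Mass-weighted mean position."""
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()

def xi(dna_coords, protein_coords, protein_masses):
    """Perpendicular distance from the protein center of mass to the DNA axis."""
    origin, axis = principal_axis(dna_coords)
    r = center_of_mass(protein_coords, protein_masses) - origin
    # Subtract the component of r along the axis; what remains is perpendicular.
    return float(np.linalg.norm(r - np.dot(r, axis) * axis))

# Toy system (hypothetical): rings of "DNA atoms" stacked along z,
# and two equal-mass "protein atoms" whose center of mass sits at x = 15.
z_levels = np.arange(0.0, 60.0, 3.0)
ring = np.array([[10.0, 0.0], [-10.0, 0.0], [0.0, 10.0], [0.0, -10.0]])
dna = np.array([[x, y, z] for z in z_levels for x, y in ring])
protein = np.array([[14.0, 0.0, 20.0], [16.0, 0.0, 30.0]])
masses = np.array([12.0, 12.0])
xi_value = xi(dna, protein, masses)
```

In this toy geometry the axis coincides with z, so ξ is simply the radial offset of the protein center of mass. For real, bent DNA a single straight axis is only an approximation.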

Figure 2. Trajectories of the protein–DNA distance ξ for various values of Ndamp in the ABF dissociation stage. The protein–DNA dissociation speed was adjusted by the parameter Ndamp (see Sections 2.4 and 3.1). The results for the specific-complex system are shown. Note that because a harmonic-potential boundary was set at ~33.5 Å, ξ with Ndamp = 100 was pushed back by the potential.
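The push-back at the boundary mentioned in this caption can be pictured with a one-sided harmonic wall. The sketch below assumes a half-harmonic form that acts only beyond the boundary. The force constant `k` is a hypothetical placeholder, since the caption gives no value, and `wall_force` is an illustrative name.

```python
def wall_force(xi, xi_max=33.5, k=10.0):
    """Restoring force from U = 0.5 * k * (xi - xi_max)**2, applied only for xi > xi_max.

    xi_max is the boundary position in Å (from the caption);
    k is an assumed force constant, not a value from the paper.
    """
    if xi <= xi_max:
        return 0.0                 # inside the allowed range: no bias
    return -k * (xi - xi_max)      # beyond the wall: push back toward smaller xi

# Forces at a few distances (Å): zero inside the range, negative beyond the wall.
forces = [wall_force(x) for x in (30.0, 33.5, 34.5)]
```

With such a wall, a trajectory that reaches the boundary quickly feels an increasingly strong force toward smaller ξ, which fits the behavior described for Ndamp = 100.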

Figure 3. Average force against sampling number (a) and sampling number at each bin (b). In (a), the average force F(ξ) is monitored for 10 different bins within ξ = 14.5–15.5 Å. F(ξ) changes as the number of conformation samples N increases, and converges at around N = 500,000 (arrow). The analysis in (a) was performed for the specific-complex system. (b) Number of conformation samples finally stored at individual bins, which were used for the free-energy calculations.
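The two quantities plotted in this figure, the average force as a function of the number of samples and the number of samples per bin, can be sketched as follows. This is an illustrative reconstruction and not the ABF code used in the study. The function names and the toy data are hypothetical, and the 0.1 Å bin width is inferred from "10 different bins within ξ = 14.5–15.5 Å".

```python
import numpy as np

def running_mean(samples):
    """Mean of the first N samples for N = 1 .. len(samples)."""
    samples = np.asarray(samples, dtype=float)
    return np.cumsum(samples) / np.arange(1, samples.size + 1)

def bin_average_force(xi, force, edges):
    """Per-bin sample counts and mean force from instantaneous (xi, force) pairs."""
    xi = np.asarray(xi, dtype=float)
    force = np.asarray(force, dtype=float)
    nbins = len(edges) - 1
    idx = np.digitize(xi, edges) - 1               # bin index of each sample
    inside = (idx >= 0) & (idx < nbins)            # drop samples outside the grid
    counts = np.bincount(idx[inside], minlength=nbins)
    sums = np.bincount(idx[inside], weights=force[inside], minlength=nbins)
    # Empty bins keep a mean of zero instead of dividing by zero.
    mean = np.divide(sums, counts, out=np.zeros(nbins), where=counts > 0)
    return counts, mean

# Toy data (hypothetical): three samples falling into two 0.1 Å bins.
edges = np.array([14.5, 14.6, 14.7])
counts, mean_force = bin_average_force([14.55, 14.56, 14.65], [1.0, 3.0, 5.0], edges)
history = running_mean([1.0, 2.0, 3.0])
```

Plotting `running_mean` for the force samples of a single bin against N gives a curve of the kind shown in panel (a). The `counts` array corresponds to panel (b).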

Figure 4. Free-energy profiles G(ξ) (a), number of protein atoms in contact with DNA Ncontact (b), and number of water molecules at the DNA surface Nwater (c) as a function of the protein–DNA distance ξ. In (a), the free energy at ξ = 29 Å is set to zero. The representative points, specific SI–SIII and nonspecific NI–NIII, are discussed in the text. In the inset of (a), probabilities P(ξ) evaluated from G(ξ) are shown. The inset in (b) shows the protein–DNA interface area, calculated as the decrease from the sum of the solvent-accessible surface areas of the protein and DNA. Ncontact and Nwater, obtained by averaging over each 1 Å window, are marked at the center of each window. A water molecule is counted in Nwater when its oxygen is less than 3.5 Å from any DNA atom. Nwater at ξ = 29 Å is set to zero.
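As a rough illustration of how a probability P(ξ) can be obtained from a free-energy profile G(ξ), the sketch below applies plain Boltzmann weighting on a uniform grid. Several choices here are assumptions and are not taken from the caption: G is taken to be in kcal/mol, the temperature is set to 300 K, no Jacobian correction for the radial coordinate is applied, and the function names and the toy profile are hypothetical.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def shift_to_reference(xi, g, xi_ref=29.0):
    """Shift G so that its value at the grid point nearest xi_ref is zero."""
    return g - g[np.argmin(np.abs(xi - xi_ref))]

def probability(xi, g, temperature=300.0):
    """P(xi) proportional to exp(-G/kT), normalized so that sum(P) * dxi = 1."""
    # Subtracting the minimum avoids overflow and does not change the normalized result.
    weights = np.exp(-(g - g.min()) / (KB * temperature))
    dxi = xi[1] - xi[0]            # uniform grid spacing assumed
    return weights / (weights.sum() * dxi)

# Toy profile (hypothetical): G rising linearly toward the dissociated state.
xi_grid = np.linspace(10.0, 29.0, 20)
g_profile = shift_to_reference(xi_grid, 0.5 * xi_grid)
p_profile = probability(xi_grid, g_profile)
```

Because P depends exponentially on G, a free-energy difference of a few kcal/mol between the bound and dissociated states already concentrates nearly all of the probability near the minimum.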

Figure 5. Snapshots in the dissociation process. Conformations at points SI–SIII (specific) and NI–NIII (nonspecific) are shown (each point is denoted in Figure 4a).

Figure 6. Surface water molecules at the representative points, specific SI (ξ = 12 Å), nonspecific NI (ξ = 14 Å), and NII (ξ = 16.7 Å). Green spheres indicate water oxygens located within 3.5 Å of DNA atoms. The main difference is observed at the C-terminal region, highlighted by dashed circles.
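The 3.5 Å criterion used to select the surface water molecules can be expressed as a short distance test. The sketch below is a minimal brute-force version for a single snapshot. It assumes plain Cartesian coordinates without periodic-boundary handling, and `count_surface_waters` and the toy coordinates are hypothetical.

```python
import numpy as np

def count_surface_waters(water_oxygens, dna_atoms, cutoff=3.5):
    """Number of water oxygens closer than cutoff (Å) to at least one DNA atom."""
    # Pairwise squared distances, shape (n_water, n_dna).
    diff = water_oxygens[:, None, :] - dna_atoms[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    # A water counts once, however many DNA atoms it is close to.
    return int(np.count_nonzero(d2.min(axis=1) < cutoff ** 2))

# Toy snapshot (hypothetical): two DNA atoms and four water oxygens.
dna_atoms = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
water_oxygens = np.array([
    [3.0, 0.0, 0.0],    # 3.0 Å from the first DNA atom: counted
    [0.0, 3.6, 0.0],    # 3.6 Å from the first DNA atom: not counted
    [5.0, 0.0, 3.4],    # 3.4 Å from the second DNA atom: counted
    [20.0, 0.0, 0.0],   # far from both: not counted
])
n_water = count_surface_waters(water_oxygens, dna_atoms)
```

For a full solvated system the pairwise array becomes large, so a neighbor search such as a k-d tree would be the more practical choice. The counting criterion stays the same.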

Figure 7. Contact-residue analysis for the representative points, specific SI–SIII and nonspecific NI–NIII. The contact probability of each residue (vertical axis) was calculated for each 1 Å window, counting a residue as in contact when at least one of its atoms is within 2.5 Å of any DNA atom. Arrows denote the changes from SI to SII and from NI to NII (see Sec. 3.4).
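The per-residue contact probability defined in this caption can be sketched as the fraction of snapshots in a window for which a residue satisfies the 2.5 Å criterion. The code below is an illustrative version, not the authors' analysis script. The array layout, the function name, and the toy frames are assumptions.

```python
import numpy as np

def residue_contact_probability(protein_frames, dna_frames, residue_ids, cutoff=2.5):
    """Fraction of frames in which each residue has >= 1 atom within cutoff of any DNA atom.

    protein_frames: (n_frames, n_protein_atoms, 3) coordinates in Å
    dna_frames:     (n_frames, n_dna_atoms, 3) coordinates in Å
    residue_ids:    (n_protein_atoms,) residue number of each protein atom
    """
    residues = np.unique(residue_ids)
    hits = np.zeros(residues.size)
    for prot, dna in zip(protein_frames, dna_frames):
        diff = prot[:, None, :] - dna[None, :, :]
        d2 = np.einsum("ijk,ijk->ij", diff, diff)
        atom_near = (d2 < cutoff ** 2).any(axis=1)     # per protein atom
        for i, res in enumerate(residues):
            # A residue is in contact if any of its atoms is near the DNA.
            hits[i] += atom_near[residue_ids == res].any()
    return residues, hits / len(protein_frames)

# Toy window (hypothetical): two frames, one DNA atom at the origin,
# three protein atoms belonging to residues 1, 1, and 2.
residue_ids = np.array([1, 1, 2])
dna_frames = np.zeros((2, 1, 3))
protein_frames = np.array([
    [[2.0, 0.0, 0.0], [10.0, 0.0, 0.0], [10.0, 0.0, 0.0]],   # only residue 1 in contact
    [[10.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 2.4, 0.0]],    # residues 1 and 2 in contact
])
residues, contact_prob = residue_contact_probability(protein_frames, dna_frames, residue_ids)
```

In practice the frames passed in would be those whose ξ falls inside a given 1 Å window, so that one probability per residue is obtained for each window.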

Figure 8. Snapshots of the SI and SII states. Shown are the residues responsible for the large free-energy change (see Sec. 3.4). DNA is drawn in surface representation, and the sites recognized by the protein are colored blue.

TOC Graphics
