Insights into the Sliding Movement of the Lac Repressor

Jan 22, 2010 - of Medical Surgery and Bioengineering, UniVersity of Siena, Siena, Italy, and ... Chemistry Laboratory, Department of Chemistry, UniVer...
0 downloads 0 Views 2MB Size
2238

J. Phys. Chem. B 2010, 114, 2238–2245

Insights into the Sliding Movement of the Lac Repressor Nonspecifically Bound to DNA Simone Furini,*,†,‡ Carmen Domene,§ and Silvio Cavalcanti† Department of Electronics, Computer Science and Systems, UniVersity of Bologna, Bologna, Italy, Department of Medical Surgery and Bioengineering, UniVersity of Siena, Siena, Italy, and Physical and Theoretical Chemistry Laboratory, Department of Chemistry, UniVersity of Oxford, Oxford OX1 3QZ, U.K. ReceiVed: July 10, 2009; ReVised Manuscript ReceiVed: December 17, 2009

The Lac repressor finds its DNA binding sequences with an association rate 2 orders of magnitude higher than what is expected for a random diffusive process. This experimental data stimulated numerous theoretical and experimental studies, which led to the facilitated diffusion model. In facilitated diffusion, the Lac repressor binds nonspecifically to DNA. This nonspecific binding is followed by an exploration of the DNA molecule in a reduced space. Single-molecule imaging confirmed that the Lac repressor may move along the DNA molecule; however, it is still under debate whether the LacI movement proceeds through sliding, with a continuous close contact between the protein and DNA, or through hopping between adjacent binding sites. We have investigated the one-dimensional sliding movement of the Lac repressor along nonspecific DNA by full-atomistic molecular dynamics simulations and free-energy calculations based on the umbrella sampling technique. The computed free-energy profile along a helical trajectory was periodic, with periodicity equal to the distance between successive nucleotides and an energy barrier between successive minima of 8.7 ( 0.7 kcal/mol. The results from the molecular simulations were subsequently used in a Langevin dynamics framework to estimate the diffusion coefficient of the Lac repressor sliding along nonspecific DNA. The computed diffusion coefficient is close to the lower limit of the experimental range. Introduction A distinctive feature of living organisms is their ability to adapt to a changing environment. The transcription of different genes in diverse environmental conditions is one of the molecular mechanisms underlying adaptation, and transcription factors (TFs) are the protein molecules regulating this process. TFs operate by binding to specific sequences of DNA, named operator sites. Typical operator sites are 10-20 base pairs long, which leads to the question of how TFs localize these relatively short target sequences in the much longer genomic DNA. Since the seminal work by Jacob and Monod,1 the Lac repressor, LacI, has served as a model system for studying protein-DNA interactions and gene regulation. Here, molecular dynamics (MD) simulations are used to investigate the mechanism used by LacI to find its operator sites. LacI is a homotetrameric protein, and each monomer is composed of 360 amino acids.2,3 Four functional domains are defined in the monomer, (i) the DNA binding domain, amino acids 1-50, (ii) the hinge region, amino acids 51-60, (iii) the core domain, amino acids 61-340, and (iv) the tetramerization domain, amino acids 341-360.4,5 The protein binds to the operator site as a dimer;6 the simultaneous binding of the two LacI dimers to different operator sites stabilizes the DNA-LacI complex.7,8 The search by LacI for its operator sites does not proceed through a simple diffusion and collision strategy. According to the three-dimensional diffusion coefficient of LacI, ∼108 nm2/s, a diffusion and collision strategy would result in an association rate of 108 M-1 s-1, the so-called diffusion limit.9 * To whom correspondence should be addressed. Address: viale Mario Bracci, 12 53100, Siena, Italy. E-mail: [email protected]. Tel: +390577585730. † University of Bologna. ‡ University of Siena. § University of Oxford.

Unexpectedly, an experimental measurement of the association rate between LacI and its operator sites provided a value of around 1010 M-1 s-1, 2 orders of magnitude higher than the diffusion limit.10 This value was later confirmed by experiments with a rapid footprinting procedure, which measured the occupancy of single operator sites.11 Association rates faster than the diffusion limit are not ubiquitous among DNA binding proteins,12–18 but they were also found for the integration host factor19 and for the Gal repressor,20 a protein that belongs to the same family of LacI. A possible explanation for these high association rates is facilitated diffusion. The general idea behind facilitated diffusion is that the two molecules bind to each other in a nonspecific way, thus reducing the dimensionality of the search space and boosting the association rates. In the context of TFs, facilitated diffusion requires a nonspecific binding between the TF and the DNA. Then, DNA exploration may proceed by three different strategies, (i) one-dimensional sliding movement, (ii) hopping between adjacent binding positions or (iii) translocation between distant regions of DNA, resulting from the simultaneous binding of the TF to two separate segments of the DNA molecule. A more recent hypothesis to explain association rates higher than the diffusion limit is subdiffusive motion of TFs in the crowded cellular environment.21 The gene that codifies for the TF is usually close to the TF operator site. Therefore, a subdiffusive motion would keep the TF adjacent to the operator site for a continuous period of time, and it could boost the association rates. However, singlemolecule imagining of the Lac repressor in the Escherichia coli cytoplasm revealed a diffusive motion22 which seems to exclude subdiffusive motions as the origin of the high association rates. The hypothesis of facilitated diffusion of LacI along DNA is supported by both theoretical and experimental analyses.23–25 Wang et al. measured the one-dimensional diffusion coefficient and the average length of the sliding events by single-molecule

10.1021/jp906504m  2010 American Chemical Society Published on Web 01/22/2010

Sliding Movement of the Lac Repressor

J. Phys. Chem. B, Vol. 114, No. 6, 2010 2239

Figure 1. Atomic System. Surface of the DNA molecule: nucleotide bases in gray; the backbone of DNA strand D1 in red; and the backbone of DNA strand D2 in blue. Residues 1-62 of LacI are shown in cartoon representation, and a green line is used to highlight the helical trajectory of LacI around the DNA molecule. Labels indicate the R-helixes H1-H3 of the LacI monomer. The figure was generated with the software VMD.66 Nucleotide sequences of the DNA strands are shown in a schematic picture at the bottom, with the same color scheme of the molecular cartoon (red for the D1 strand and blue for the D2 strand).

imaging, using LacI attached to a green fluorescent protein.26 These experimental values, used in the theoretical framework developed by Halford and Marko,9 give an association rate in agreement with the experimental data.11 Even if this agreement is somehow questioned by the high variability of the measured diffusion coefficients, these experiments undoubtedly showed that LacI moves along the DNA molecule. As previously outlined, a prerequisite for facilitated diffusion is nonspecific binding between the TF and the DNA molecule. A structure for LacI bound to nonspecific DNA was obtained by nuclear magnetic resonance (NMR),27 using an engineered version of the protein where only the first 62 amino acids were included. Compared to the LacI structure bound to an operator site,28 if LacI is bound to nonspecific DNA, the DNA molecule is in the B-form, and the hinge regions of LacI are disordered. The recognition helix, H2, is close to the DNA major groove (Figure 1), even if it does not establish any atomic interactions with the base edges. A helical trajectory of LacI would keep the helix H2 in this orientation with respect to the major groove, thus in an optimal position to recognize an operator site. Helical trajectories of TFs around DNA were first hypothesized by Shurr.29 Recently, Bagchi et al. calculated theoretically the diffusion coefficient for LacI along a helical trajectory,30 obtaining a value (1.8 × 106 bp2/s) close to the experimental data (2 × 103-1.1 × 106 bp2/s).22,26 Here, the sliding movement of LacI along the DNA molecule is investigated using a multiscale model. The first step was to calculate the local diffusion coefficient of LacI along nonspecific DNA and the potential of mean force of the sliding movement by classical MD simulations. Second, these data were used in a Langevin dynamics framework, where it is possible to simulate longer time scales. Stochastic simulations on the millisecond time scale allowed calculation of the diffusion coefficient of the sliding movement, which was compared to experimental data.26

MD simulations have already been used to analyze the atomic interactions between proteins and DNA.31–35 Consistency between the simulated and the experimental data proves that the force fields currently in use provide an adequate description for these molecular systems.36 MD simulations of LacI helped in understanding the allosteric changes of the protein in the core domain.37,38 However, little is known about the behavior of the DNA binding domain. The DNA binding domain of LacI presents a helix-turn-helix structure, which is conserved in a plethora of DNA binding proteins.39–41 Thus, the analysis of LacI sliding along DNA will provide more general information about the atomic interactions between proteins and DNA. Methods System Definition. The molecular structures of LacI and DNA were defined according to the NMR data by Kalodimos et al., Protein Data Bank entry 1OSL.27 Only residues 1-62 of a single LacI monomer were included in our model (Figure 1). The model for the DNA fragment was composed of 20 nucleotides. The structure of the four nucleotides missing in the NMR data was modeled in a perfect B-form helix. The nucleotide sequences of the two DNA strands, D1 and D2, were 5′-CGATAAGATATCTTATCGCG-3′ and 5′-CGCGATAAGATATCTTATCG-3′, respectively (Figure 1). The DNA axis was aligned to the z-axis, and the molecule was centered in the xy plane. An ester bond was defined between nucleotide 1 and nucleotide 20 of each strand in order to simulate an ideally infinite DNA molecule in MD runs with periodic boundary conditions. The missing atoms in the protein structure were built using the software psfgen of NAMD.42 Default protonation states were used for all of the amino acids, in agreement with the prediction of the PROPKA algorithm43 at neutral pH. Histidines were protonated at the delta position. N and C termini were respectively acetylated and amidated. The system was solvated by ∼8000 water molecules. Sodium and chloride ions were

2240

J. Phys. Chem. B, Vol. 114, No. 6, 2010

added to neutralize the system, reaching an ion concentration of 0.5 M. The total size of the molecular system was ∼26000 atoms. Molecular Dynamics. MD simulations were run using NAMD2.644 and the CHARMM-27 all-atom force field with CMAP correction.45 The TIP3 model was used for water molecules.46 The time step was set to 2 fs, SETTLE algorithm,47 and the coordinates were saved every 10 ps. Long-range electrostatic interactions were treated by the Particle Mesh Ewald algorithm,48 and a 10 Å cutoff was used for the van der Waals interactions. Langevin dynamics controlled the temperature at 300 K, with a damping factor of 5 ps-1. The Nose´-Hoover Langevin pressure control was used to keep a pressure of 1 bar; the piston period was 200 fs, and the damping time scale was 100 fs.49,50 Harmonic restraints were applied to all heavy atoms (force constant 10 kcal · mol-1 · Å-2). The energy was first minimized by 5000 steps of steepest descent followed by 50 ps of MD. Harmonic restraints were then applied to backbone atoms only. The force constant of the restraints was set to 20 kcal · mol-1 · Å-2 and gradually reduced to 0 in a 500 ps MD trajectory. Afterward, 17 ns of unrestrained MD was simulated. The umbrella sampling technique51 was used to compute the potential of mean force (PMF) for the sliding of the LacI monomer along the DNA double helix. Only residues 1-46 of LacI were included in the atomic model used for the umbrella sampling studies. Residues 47-62 are highly mobile and unlikely to be involved in nonspecific DNA recognition.27 The last snapshot of the MD simulation described previously was used as the initial condition for the umbrella sampling simulations. Water molecules were added to fill the space left by the part of the protein removed (residues 47-62). An equilibrating period of 100 ps followed, where harmonic restraints were applied to backbone atoms. The computation of the PMF was confined to a 30 Å long helical trajectory, which covers ∼3 nucleotides. This region was divided into 120 windows, and 2 MD simulations of 1200 ps were run for each window, starting from different initial conditions. The starting configurations were generated by rigid movements of the protein along a helical trajectory surrounding the DNA (Figure 1). The separation along the z-axis between the helix loops was equal to the DNA periodicity (34 Å), and the helical radius was equal to the mean distance along the MD trajectory between the DNA axis and the center of mass of LacI (residues 1-46). In each starting configuration, water molecules within 1.4 Å of the protein were displaced. The center of mass of the protein was restrained to the window center by a harmonic potential with a force constant of 10 kcal · mol-1 · Å-2, and its coordinates were saved at each time step. The first 200 ps of each simulation were discarded as equilibration, and the remaining 1000 ps of MD were used for the analysis. The unbiased PMF was computed using the weighted histogram analysis method,52 recombining the biased distributions of the position of the center of mass of the protein in the threedimensional space. Umbrella sampling trajectories were divided into data sets of 200 ps, and the free-energy profile was computed for each data set. The potential of mean force was defined as the mean value of the different data sets. The standard deviation provided an estimate of the error. The same atomic system used for the umbrella sampling simulations was also used to estimate the local diffusion coefficient of the sliding movement. MD trajectories of 6 ns were simulated, starting from three different initial conditions (corresponding to minima in the PMF profile). In these simulations, no restraint was applied to the center of mass of

Furini et al. the protein, and the position of LacI along the helical trajectory was saved at each time step. Then, the local diffusion coefficient, Dlocal, was estimated as

Dlocal )

MSD 2·t

(1)

where MSD is the mean square displacement along the helical trajectory and t is the simulation time. The diffusion coefficient was estimated using the first 2 ps of the MSD curve in each MD simulation. The local diffusion coefficient was computed as the average value of the three MD simulations; the error was estimated by the standard deviation. Langevin Dynamics. The sliding of LacI along nonspecific DNA takes place on the millisecond time scale. Nowadays, MD simulations of molecular systems with this complexity are limited to a time scale of a few hundred nanoseconds. Langevin dynamics simulations are a possible strategy to simulate longer time scales. In Langevin dynamics, the sliding motion of LacI along the helical trajectory can be described using

M

kBT dx d2x dPMF )+ A(t) 2 Dlocal dt dx dt

(2)

where x is the position along the helical trajectory, M is the mass of the moving particle, kB is the Boltzmann constant, T is the temperature (300 K), and A(t) is a random thermal force. The local diffusion coefficient and the potential of mean force are the inputs of this stochastic equation. If Dlocal and PMF are computed by MD simulations, as has been done here, the trajectories obtained by the numerical integration of eq 2 are directly linked to the atomistic characteristics of the molecular system. The discretized form of eq 2 used for the numerical integration is

{

xi+1 ) xi + Vi∆t

(

Vi+1 ) Vi 1 -

)

kBT ∆t dPMF ∆t MDlocal M dx

|

+ξ xi

kBT M



2∆t Dlocal

(3)

where Vi and xi are, respectively, the position and velocity at time step i, and ξ is Gaussian noise with a mean value of 0 and a mean square deviation of one. Once the Langevin trajectories are obtained, the diffusion coefficient for the sliding movement can be estimated using the MSD along the helical trajectory. The slope of the MSD at short lag times gives the local diffusion coefficient, while at longer lag times, the slope changes because of the barriers in the PMF, and it gives the effective diffusion coefficient, which is the one measured experimentally53 (see inset of Figure 5B). A formula analogous to eq 1 was used to estimate the effective diffusion coefficient in this second part of the MSD curve. The percentage of error in the estimate of the effective diffusion coefficient was calculated as 1/N1/2, with N equal to the number of transitions between PMF minima observed in the stochastic trajectory. Results Atomic Interactions between LacI and Nonspecific DNA. The LacI monomer (residues 1-62) proved to be stable on the nonspecific DNA in a 17 ns MD trajectory in explicit solvent

Sliding Movement of the Lac Repressor

J. Phys. Chem. B, Vol. 114, No. 6, 2010 2241

Figure 2. Root mean square deviation (rmsd) of backbone atoms along the MD simulation. The rmsd was computed with respect to the first frame of the trajectory. Red and blue lines show the rmsd of the backbone atoms of the entire protein and that of residues 1-46, respectively. Most of the protein mobility concerns residues 47-62 of the hinge region, while the part of the protein in closer contact with DNA, residues 1-46, is more stable.

TABLE 1: H-Bonds between LacI Monomer and DNA donor

acceptor

H-bond stabilitya

Thr19 Ser21 Thr34 Ser16 Ser31 Leu6 Arg22 Tyr17 Lys2 Ser16

D1:Thy4 D2:Cyt14 D1:Ade3 D1:Thy4 D1:Ade3 D2:Thy13 D1:Ade3 D2:Thy13 D2:Thy13 D1:Ade5

96% 87% 85% 85% 71% 67% 65% 52% 49% 32%

a The stability of the H-bonds is estimated as the percentage of the MD trajectory when donor and acceptor atoms are closer than 3.0 Å and the angle donor-hydrogen-acceptor is between 150 and 210°.65

(Figure 2). Despite the fact that part of the DNA molecule was not present in the NMR data and was manually modeled (see Methods section), the root mean square deviation (rmsd) of the DNA backbone was ∼2 Å. The rmsd of the DNA backbone atoms reached a plateau in the first 2 ns, and the DNA showed an almost perfect B-form structure along the entire MD simulation. The protein backbone rmsd was ∼3 Å; however, if the rmsd was calculated using just amino acids 1-46, the value decreased to ∼1.2 Å and reached a plateau after 4-5 ns. Thus, most of the protein movement was confined to the C-terminal of the LacI monomer (residues 47-62). The presence of hydrogen bonding between the LacI monomer and the DNA molecule was examined. Donor and acceptor atoms were considered to be part of a H-bond if their distance was below 3.0 Å, and the donor-hydrogen-acceptor angle was between 150 and 210°. Table 1 lists the stable H-bonds found in the MD trajectory. The stability was estimated as the percentage of time the donor and the acceptor atoms fulfilled the above criteria of distance and directionality. Most of the residues found to be involved in H-bonds coincide with those proposed by Kalodimos et al;27 these are Leu6, Ser16, Tyr17, Thr19, Ser21, Ser31, and Thr34. H-bonds between DNA and residues Tyr7, Gln18, Asn25, and Val30, which were observed by NMR, were unstable in the MD simulation and vanished in the first 2 ns of the trajectory. In contrast, H-bonds between DNA and residues Lys2 and Arg22 were revealed during the MD simulation but

Figure 3. Potential of mean force (PMF) for LacI sliding along the DNA molecule. Bottom ticks of the x-axis show the length of the LacI movement (center of mass of residues 1-46) along the helical trajectory surrounding the DNA molecule. Top ticks of the x-axis show the position along the helical trajectory of the bases D1:A6-D2:T15 and D1:G7-D2:C14, respectively. The average PMF is shown by a continuous line, and gray shading is used for the standard deviation. The average energy barrier between minima is 8.7 ( 0.7 kcal/mol; this average value was calculated considering both forward and backward transitions between the three minima.

were not observed in the NMR structure. The differences in H-bonds observed in the MD simulations and in the NMR data can be explained by the different conditions employed. The NMR data were obtained under slightly acidic conditions, while here it was preferred to simulate the system at neutral pH, which is closer to the experimental conditions used to visualize LacI one-dimensional movements on DNA.22,26 The residues involved in hydrogen bonding are mainly located along the H2 helix and at the N-termini of H1 and H3 helices (Figure 1). The donor atom of all of the H-bonds belongs to the LacI residue, and the acceptor atom is an oxygen atom of the DNA backbone (more specifically, the oxygen of the phosphodiester group for all of the H-bonds, except the oxygen, O5, of the deoxyribose in the H-bond with Tyr17). Besides hydrogen bonding, the LacI-DNA complex is also stabilized by electrostatic interactions between the negatively charged backbone of DNA and the positively charged residues of the protein. The residues contributing most to the electrostatic attraction are Arg22 and Lys2, showing an average distance between the charged side chains and the DNA backbone of 2.38 and 2.90 Å, respectively. The other positively charged residues are at an average distance to the DNA backbone of 5.1 Å for Arg51, 6.7 Å for Lys59, 8.2 Å for Lys37, 9.5 Å for Lys33, and 9.8 Å for Arg35. All of the negatively charged residues face the protein side opposite the DNA and are farther than 10 Å from the DNA backbone atoms. Potential of Mean Force of the LacI Sliding Movement. The potential of mean force (PMF) for the sliding movement of LacI along the helical trajectory around the DNA molecule is shown in Figure 3. The maxima of the PMF correspond to configurations where the center of mass of LacI is aligned with the base planes of nucleotides D1:A6-D2:T15 and D1:G7-D2: C14, respectively (see Figure 1 for the definition of the nucleotide sequences in the DNA strands D1 and D2). In contrast, the minima are localized halfway between successive maxima. The energy values are equal, within the numerical errors, in the two maxima and in the three minima. The alignment of the maxima with the base planes of the DNA strands does not have any direct physical meaning; the x-values of Figure 3 are just positions of the center of mass of LacI along

2242

J. Phys. Chem. B, Vol. 114, No. 6, 2010

Furini et al.

Figure 4. Representative structures of minima (A) and maxima (B) of the PMF. A monomer of LacI is shown in cartoon representation, and the structure of DNA is shown schematically. Licorice representation is used for residues Ser31 and Thr34 and for the backbone atoms of the nucleotide D1:A3-D1:A5. In (A), the H-bonds between LacI and the DNA are highlighted with black dotted lines. The average lengths of the H-bonds with Ser31 and Thr34 in the PMF minima are 1.8 and 1.7 Å, respectively. The average distances between the same couples of atoms in the PMF maxima are 5.3 and 4.3 Å, respectively.

the helical trajectory. What it is physically relevant is the periodicity of the PMF. A periodic PMF is consistent with the idea of a protein sliding along a nonspecific DNA segment. Nonspecific binding excludes strong interactions with the DNA bases, and interactions are expected to occur with the backbone of the DNA only. Therefore, since the DNA backbone has a periodic structure, a periodic PMF profile was anticipated. From the analysis of the H-bonding between LacI and the DNA strands during the simulation, a correlation emerged between their number and their stability and the PMF profile. Overall, in the PMF minima, there are, on average, two H-bonds more than in the PMF maxima. In particular, in the minima configurations, residues Ser31 and Thr34 make very stable H-bonds with an oxygen atom from the DNA phosphodiester group. Both residues form a H-bond with the same nucleotide, as shown in Figure 4A. In the three minima of the PMF, the nucleotide binding to Ser31 and Thr34 moves from D1:A3 (first minimum) to D1:T4 and finally to D1:A5 (last minimum). In the configurations corresponding to the PMF maxima, the side chains of residues Ser31 and Thr34 fall halfway with respect to two successive phophodiester groups (the average distance between the hydrogen atom in the hydroxyl group and oxygen atom in the phosphodiester group is ∼5 Å), and they cannot form a hydrogen bond with the DNA strand (Figure 4B). One-Dimensional Diffusion Coefficient for LacI Sliding. MD simulations estimated a local diffusion coefficient of (4.5 ( 0.2) × 107 bp2/s. This local diffusion coefficient only describes the LacI movement in the region sampled by the MD trajectories. In the time scale explored by the MD simulations presented here, the LacI protein remains close to its initial position, that is, it does not cross any of the energetic barriers between local minima of the PMF profile. Thus, the local diffusion coefficient cannot provide any information about the sliding movement of LacI along the DNA. Combining the local diffusion coefficient data with the free-energy profile, the Langevin dynamics simulations provide the effective diffusion coefficient along the helical trajectory, which can be compared with the experimental data on LacI diffusion. The PMF profile was approximated by a perfect sinusoid in the Langevin dynamics simulations. Several stochastic trajectories were generated, changing the amplitude of this sinusoid and hence the energy barrier between local minima of the PMF profile. Each trajectory was simulated until at least 50 transitions

Figure 5. Langevin dynamics along the helical trajectory. (A) Sample trajectory of a Langevin dynamics simulation with an energy barrier between energy minima of 8.7 kcal/mol. (B) MSD of the trajectory shown in (A). The slope of the MSD versus time was used to estimate the effective diffusion coefficient using eq 1. Since the effective diffusion coefficient is about 5 orders of magnitude lower than the local diffusion coefficient, (7.8 ( 1.1) × 102 and (4.5 ( 0.2) × 107 bp2/s, respectively, the slope of the MSD at short lag times is appreciable only in the inset.

between energy minima were sampled. Figure 5A shows part of the stochastic trajectory obtained with an energy barrier of 8.7 kcal/mol. Transitions between energy minima are almost instantaneous compared to the time spent by the system in the local minima. Energy barriers hamper LacI diffusion along the DNA molecule; thus, the effective diffusion coefficient of the sliding movement is much lower than the local diffusion coefficient (Figure 5B). The height of the energy barrier is a major determinant of the effective diffusion coefficient (Figure 6). Considering the range of the energy barrier computed using the umbrella sampling strategy, 8.7 ( 0.7 kcal/mol, the value of the effective diffusion coefficient ranges from (2.4 ( 0.3) × 102 bp2/s at 9.4 kcal/mol to (2.5 ( 0.3) × 103 bp2/s at 8.0 kcal/mol, which is close to the lower limit of the experimental range, 2 × 103 bp2/s to 1.1 × 106 bp2/s.26 A decrease of the energy barrier to 4 kcal/ mol results in a diffusion coefficient of (1.0 ( 0.1) × 106 bp2/ s, already close to the upper limit of the experimental range (Figure 6). Discussion The Lac repressor finds its operator sites at a rate higher than the diffusion limit,10,11 through a process called facilitated diffusion. The theoretical foundations of facilitated diffusion have been widely investigated,9 and the theoretical models are also supported by experimental data at singlemolecule level.54 What it is still missing is a description of how facilitated diffusion proceeds at the atomic level. In particular, it is still uncertain if LacI slides along a helical trajectory (with an intimate contact between the recognition helix H2 and the DNA major groove preserved) or whether it explores the DNA by successive jumps to adjacent binding

Sliding Movement of the Lac Repressor

Figure 6. Effective diffusion coefficient at different barrier heights. The gray region of the plot highlights the experimental range of the diffusion coefficients.26 Cross-markers are used for data points at 8.7, 8.0, and 9.4 kcal/mol, corresponding to the average value of the energy barrier computed by the umbrella sampling simulations plus/minus a standard deviation.

sites. Here, we have investigated the atomic details of facilitated diffusion by MD simulations, focusing on (i) the atomic interactions between the DNA binding domain of LacI and nonspecific DNA and (ii) the characteristics of the LacI sliding movement along a helical trajectory around the DNA molecule. A single monomer of the LacI DNA binding domain was included in the model system, whereas the NMR structure shows two monomers of LacI symmetrically oriented on the DNA molecule. The atomic interactions in the left and right monomer are thought to be identical.27 Therefore, simulating a single monomer should be sufficient to provide a complete picture of how LacI binds to nonspecific DNA. Moreover, the experimental NMR structure refers to an engineered version of LacI, truncated up to the first 62 amino acids at the N-terminal, and with the mutation V52C. Since the LacI dimer is stabilized by interactions between the core domains (residue 61-340), a protein truncated to the first 62 amino acids would not form a dimer. Dimer formation was therefore achieved through a disulfide bond between the cysteine residues introduced at position 52, which is part of the hinge region. Hence, this disulfide bond limits the mobility of both monomers in this area. In contrast, the hinge regions of wildtype LacI are highly mobile when the repressor is not bonded to an operator site,55,56 and, as a consequence, the helix-turnhelix domain of one monomer has a certain degree of freedom with respect to the equivalent domain of the second monomer. The high mobility of the DNA binding domain of LacI is reflected by the fact that crystallization of this part of the protein is achieved only when the repressor binds to an operator site.5,56–59 Thus, simulation of a single LacI monomer is likely to accurately represent the wild-type protein, considering that the helix-turn-helix domains of the two monomers are somehow independent. Moreover, if the two monomers are independent during the sliding motion, each step of this movement is associated with the energy barrier

J. Phys. Chem. B, Vol. 114, No. 6, 2010 2243 encountered by a single monomer, which justifies the approach adopted in the Langevin dynamics simulations. The classical MD simulation confirmed that the LacI monomer is stabilized on nonspecific DNA by electrostatic interactions and a highly organized H-bond network. As first suggested by Pabo and Sauer,60 the atomic interactions between a TF and nonspecific DNA may properly orientate the DNA binding domain. Indeed, helices H1 and H3 are both in optimum positions to interact with the DNA backbone. This in turn keeps the recognition helix H2 in an optimal position to interact with the base edges in the DNA major groove. The H-bonds between the N-terminal of helices H1 and H3 and the DNA backbone were very stable in the MD simulation and showed a correlation with the free-energy profile of the LacI sliding movement along DNA. The energy minima were 8.7 kcal/mol lower in energy than the maxima, and on average, two extra H-bonds involving the DNA backbone were observed in the minima. H-bonds between protein side chains and the DNA backbone may play an important stabilizing role in the nonspecific bonding. Indeed, ab initio calculations showed that hydrogen bonds between protein side chains and the DNA backbone may stabilize a protein-DNA complex by tens of kcal/mol.61 Accordingly, breaking the H-bonds with Ser31 and Thr34 is likely to represent the major energetic barrier encountered by LacI along the helical trajectory movement. The energy barriers in the PMF profile hamper the diffusion of LacI along the helical trajectory. Langevin dynamics were employed to evaluate if the PMF profile and the local diffusion coefficient estimated by MD trajectories render consistent data with that observed in experiments on LacI diffusion. The values of the effective diffusion coefficient computed through these stochastic simulations were close to the lower limit of the experimental range.26 However, it should be emphasized that the free energy is a difficult quantity to calculate in MD simulations. An error of 1 kcal/ mol in the computed energetic barrier may change the effective diffusion coefficient by almost an order of magnitude (Figure 6). Moreover, the multiscale model that we have adopted to compute the effective diffusion coefficient did not include any arbitrary parameter. The parameters of the Langevin dynamics equation were calculated by MD simulations; thus, the computed values of the diffusion coefficient are directly linked to the atomistic structure of the molecular system. In this context, the partial overlap between the calculated values of the diffusion coefficient and the experimental range is remarkable. Indeed, a recent paper by Blainey et al.62 (published after this Article was submitted) provided experimental evidence to support the hypothesis that proteins follow a helical trajectory while diffusing along DNA. The estimate of the diffusion coefficient based on the multiscale model suffers from two main shortcomings. (i) The PMF profile was assumed to be perfectly symmetric, while small differences in the values of the energy minima are expected due to interactions between the protein and specific DNA bases. As a first-order approximation, we chose to model the PMF profile as perfectly periodic because the length of the DNA segment analyzed by umbrella sampling is too short to provide complete statistics for the energy variability in the free-energy minima. If the variability of the energy minima is included in the model, the estimate of the diffusion coefficient should decrease.63 (ii) The second main shortcoming in the estimation of the LacI diffusion coefficient is that the model considered only residues 1-46

2244

J. Phys. Chem. B, Vol. 114, No. 6, 2010

of a single LacI monomer, while the experimental data were obtained using an entire LacI dimer, engineered with a fluorescent probe.26 Even if the atomic interactions between the protein and the DNA are mainly confined to the helixturn-helix motif that is present in our model, inclusion of the entire protein is likely to affect the characteristics of the diffusion process. In summary, the simulated sliding movement of LacI nonspecifically bound to DNA is slower than that measured experimentally. The values of the diffusion coefficient calculated by the multiscale model partly overlap with the experimental range, suggesting that the simulated molecular trajectory is indeed possible but probably not sufficient to achieve the fast diffusive motion observed experimentally. The experimental diffusion coefficient of LacI along DNA spans a 2 orders of magnitude range.26 A possible explanation for this high variability is the presence of multiple diffusion mechanisms. If the protein-DNA complex exists in different configurations, as suggested by the NMR data,27 each configuration will result in a different energy barrier along the diffusion pathway and thus in a different diffusion coefficient. For instance, in cases where the interactions between the repressor protein and the DNA are weak, the energy barriers for the sliding motion should not be substantial, resulting in higher diffusion coefficients. The presence of multiple structures for the protein-DNA complex can also explain the difference between our estimate of the energy barrier and the value measured by Blainey et al.62 Recent theoretical work employing coarse-grained models by Givaty and Levy showed that the diffusion of TFs along DNA proceeds faster when the recognition helix of the TF is not in intimate contact with the DNA major groove and that these loose protein-DNA complexes are needed to produce the fast diffusive motions observed experimentally.64 The results presented here are in agreement with this conclusion, providing a detailed atomistic description of the sliding movement. This description may prove useful in interpreting structural and functional data on DNA binding domains of proteins belonging to the helix-turn-helix family. Abbreviations Used. TF, transcription factor; MD, molecular dynamics; PMF, potential of mean force; MSD, mean square displacement. Acknowledgment. The work has been performed under the HPCEUROPA2 project (project number: 228398) with the support of the European Commission Capacities Area - Research Infrastructures Initiative. C.D. thanks The Royal Society for a University Research Fellowship. References and Notes (1) Jacob, F.; Monod, J. J. Mol. Biol. 1961, 3, 318. (2) Farabaugh, P. Nature 1978, 274, 765. (3) Beyreuther, K.; Adler, K.; Geisler, N.; Klemm, A. Proc. Natl. Acad. Sci. U.S.A. 1973, 70, 3576. (4) Kaptein, R.; Zuiderweg, E.; Scheek, R.; Boelens, R.; van Gunsteren, W. J. Mol. Biol. 1985, 182, 179. (5) Friedman, A.; Fischmann, T.; Steitz, T. Science 1995, 268, 1721. (6) Chen, J.; Matthews, K. J. Biol. Chem. 1992, 267, 13843. (7) Oehler, S.; Eismann, E. R.; Kra¨mer, H.; Mu¨ller-Hill, B. EMBO J. 1990, 9, 973. (8) Pfahl, M.; Gulde, V.; Bourgeois, S. J. Mol. Biol. 1979, 127, 339. (9) Halford, S. E.; Marko, J. F. Nucleic Acids Res. 2004, 32, 3040. (10) Riggs, A.; Bourgeois, S.; Cohn, M. J. Mol. Biol. 1970, 53, 401. (11) Hsieh, M.; Brenowitz, M. J. Biol. Chem. 1997, 272, 22092. (12) Nobbs, T.; Szczelkun, M.; Wentzell, L.; Halford, S. J. Mol. Biol. 1998, 281, 419. (13) Erskine, S. G.; Baldwin, G. S.; Halford, S. E. Biochemistry 1997, 36, 7567.

Furini et al. (14) Gottlieb, P.; Wu, S.; Zhang, X.; Tecklenburg, M.; Kuempel, P.; Hill, T. J. Biol. Chem. 1992, 267, 7434. (15) Hoopes, B.; LeBlanc, J.; Hawley, D. J. Biol. Chem. 1992, 267, 11539. (16) Kleinschmidt, C.; Tovar, K.; Hillen, W.; Porschke, D. Biochemistry 1988, 27, 1094. (17) Kim, J.; Takeda, Y.; Matthews, B.; Anderson, W. J. Mol. Biol. 1987, 196, 149. (18) Brunner, M.; Bujard, H. EMBO J. 1987, 6, 3139. (19) Dhavan, G.; Crothers, D.; Chance, M.; Brenowitz, M. J. Mol. Biol. 2002, 315, 1027. (20) Wallis, R.; Leung, K.; Pommer, A.; Videler, H.; Moore, G.; James, R.; Kleanthous, C. Biochemistry 1995, 34, 13751. (21) Golding, I.; Cox, E. C. Phys. ReV. Lett. 2006, 96, 098102. (22) Elf, J.; Li, G.-W.; Xie, X. S. Science 2007, 316, 1191. (23) Berg, O. G.; Winter, R. B.; Von Hippel, P. H. Biochemistry 1981, 20, 6929. (24) Winter, R. B.; Berg, O. G.; Von Hippel, P. H. Biochemistry 1981, 20, 6961. (25) Winter, R. B.; Von Hippel, P. H. Biochemistry 1981, 20, 6948. (26) Wang, Y. M.; Austin, R. H.; Cox, E. C. Phys. ReV. Lett. 2006, 97, 048302. (27) Kalodimos, C. G.; Biris, N.; Bonvin, A. M. J. J.; Levandoski, M. M.; Guennuegues, M.; Boelens, R.; Kaptein, R. Science 2004, 305, 386. (28) Kalodimos, C.; Bonvin, A.; Salinas, R.; Wechselberger, R.; Boelens, R.; Kaptein, R. EMBO J. 2002, 21, 2866. (29) Shurr, J. M. Biophys Chem 1975, 9, 413. (30) Bagchi, B.; Blainey, P. C.; Xie, X. S. J. Phys. Chem. B 2008, 112, 6282. (31) Kamberaj, H.; van der Vaart, A. Biophys. J. 2009, 96, 1307. (32) Ramachandran, S.; Temple, B. R.; Chaney, S. G.; Dokholyan, N. V. Nucleic Acids Res. 2009, gkp029. (33) Tsai, A. G.; Engelhart, A. E.; Hatmal, M. m. M.; Houston, S. I.; Hud, N. V.; Haworth, I. S.; Lieber, M. R. J. Biol. Chem. 2009, 284, 7157. (34) Ahmad, R.; Brandsdal, B. O.; Michaud-Soret, I.; Willassen, N.-P. Proteins: Struct., Funct., Bioinf. 2009, 75, 373. (35) Zhu, Y.; Wang, Y.; Chen, G. J. Phys. Chem. B 2009, 113, 839. (36) Mackerell, A. J.; Nilsson, L. Curr. Opin. Struct. Biol. 2008, 18, 194. (37) Flynn, T. C.; Swint-Kruse, L.; Kong, Y.; Booth, C.; Matthews, K. S.; Ma, J. Protein Sci. 2003, 12, 2523. (38) Swint-Kruse, L.; Zhan, H.; Matthews, K. S. Biochemistry 2005, 44, 11201. (39) Wintjens, R.; Rooman, M. J. Mol. Biol. 1996, 262, 294. (40) Aravind, L.; Anantharaman, V.; Balaji, S.; Babu, M.; Iyer, L. FEMS Microbiol. ReV. 2005, 29, 231. (41) Santos, C. L.; Tavares, F.; Thioulouse, J.; Normand, P. FEMS Microbiol. ReV. 2009, 33, 411. (42) Kale, L.; Skeel, R.; Bhandarkar, M.; Brunner, R.; Gursoy, A.; Krawetz, N.; Philips, J.; Shinozaki, A.; Varadarajan, K.; Schuten, K. J. Comput. Phys. 1999, 151, 283. (43) Li, H.; Robertson, A. D.; Jensen, J. H. Proteins: Struct., Funct., Bioinf. 2005, 61, 704. (44) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kale, L.; Schulten, K. J. Comput. Chem. 2005, 26, 1781. (45) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; JosephMcCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. J. Phys. Chem. B 1998, 102, 3586. (46) Jorgensen, W. L.; Chandresekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926. (47) Miyamoto, S.; Kollman, P. A. J. Comput. Chem. 1992, 13, 952. (48) Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. J. Chem. Phys. 1995, 103, 8577. (49) Feller, S. E.; Zhang, Y. H.; Pastor, R. W.; Brooks, B. R. J. Chem. Phys. 1995, 103, 4613. (50) Martyna, G. J.; Tobias, D. J.; Klein, M. L. J. Chem. Phys. 1994, 101, 4177. (51) Torrie, G. M.; Valleau, J. P. Chem. Phys. Lett. 1974, 28, 578. (52) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.; Rosenberg, J. M. J. Comput. Chem. 1992, 13, 1011. (53) Chiu, S. W.; Novotny, J. A.; Jakobsson, E. Biophys. J. 1993, 64, 98. (54) Gorman, J.; Greene, E. C. Nat. Struct. Mol. Biol. 2008, 15, 768. (55) Wade-Jardetzky, N.; Bray, R. P.; Conover, W. W.; Jardetzky, O.; Geisler, N.; Weber, K. J. Mol. Biol. 1979, 128, 259. (56) Lewis, M.; Chang, G.; Horton, N. C.; Kercher, M. A.; Pace, H. C.; Schumacher, M. A.; Brennan, R. G.; Lu, P. Science 1996, 271, 1247. (57) Bell, C.; Lewis, M. Nat. Struct. Biol. 2000, 7, 209.

Sliding Movement of the Lac Repressor (58) Bell, C. E.; Lewis, M. J. Mol. Biol. 2001, 312, 921. (59) Bell, C. E.; Barry, J.; Matthews, K. S.; Lewis, M. J. Mol. Biol. 2001, 313, 99. (60) Pabo, C. O.; Sauer, R. T. Annu. ReV. Biochem. 1992, 61, 1053. (61) Mukherjee, S.; Majumdar, S.; Bhattacharyya, D. J. Phys. Chem. B 2005, 109, 10484. (62) Blainey, P. C.; Luo, G.; Kou, S. C.; Mangel, W. F.; Verdine, G. L.; Bagchi, B.; Xie, X. S. Nat. Struct. Mol. Biol. 2009, 16, 1224.

J. Phys. Chem. B, Vol. 114, No. 6, 2010 2245 (63) Slutsky, M.; Mirny, L. A. Biophys. J. 2004, 87, 4021. (64) Givaty, O.; Levy, Y. J. Mol. Biol. 2009, 385, 1087. (65) Luscombe, N. M.; Laskowski, R. A.; Thornton, J. M. Nucleic Acids Res. 2001, 29, 2860. (66) Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graphics 1996, 14, 33.

JP906504M