Lysine Side-Chain Dynamics in the Binding Site of Homeodomain

Publication Date (Web): April 17, 2018

Lysine Side-Chain Dynamics in the Binding Site of Homeodomain/DNA Complexes as Observed by NMR Relaxation Experiments and Molecular Dynamics Simulations Jamie M. Baird-Titus, Mahendra Thapa, Thomas Doerdelmann, Kelly A. Combs, and Mark Rance Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.8b00195 • Publication Date (Web): 17 Apr 2018 Downloaded from http://pubs.acs.org on April 17, 2018

Lysine Side-Chain Dynamics in the Binding Site of Homeodomain/DNA Complexes as Observed by NMR Relaxation Experiments and Molecular Dynamics Simulations

Jamie M. Baird-Titus†#, Mahendra Thapa‡⊥#, Thomas Doerdelmann§¶, Kelly A. Combs§ and Mark Rance§*

† Department of Chemistry and Physical Sciences, Mount St. Joseph University, Cincinnati, OH
‡ Department of Physics, University of Cincinnati, Cincinnati, OH
§ Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH
# These authors contributed equally

* Corresponding author Dept. of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, Ohio 45267. Phone: 513-558-0066. E-mail: [email protected]

Funding

This work was supported by a grant from the National Institutes of Health (GM063855). Funding for the NMR facility was provided by NIH grants RR19077 and RR027755.

Abbreviations: NMR: nuclear magnetic resonance; PITX2: Pituitary homeobox protein 2; Antp: Antennapedia; RMSD: root-mean-square deviation; MD: molecular dynamics; HSQC: heteronuclear single quantum coherence; HISQC: heteronuclear in-phase single quantum coherence; NVT: constant Number, Volume and Temperature; NPT: constant Number, Pressure and Temperature; FTIR: Fourier transform infrared

Abstract

An important but poorly characterized contribution to the thermodynamics of protein-DNA interactions is the loss of entropy that occurs from restricting the conformational freedom of amino acid side chains. The effect of restricting the flexibility of several side chains at a protein-DNA interface may be comparable in many cases to the other factors that determine the binding thermodynamics, and may therefore play a key role in dictating the binding affinity and/or specificity. Because the entropic contributions, including the presence and influence of side-chain dynamics, are especially difficult to estimate based on structural information, it is important to pursue experimental and theoretical studies that can provide direct information regarding these issues. We report on studies of a model system, the homeodomain−DNA complex, focusing on the Lys50 class of homeodomains where a key lysine residue in position 50 was shown previously to be critical for binding site specificity. NMR methodology was employed for determining the dynamics of lysine side-chain amino groups via 15N relaxation measurements in the Lys50-class homeodomains from the Drosophila protein Bicoid and the human protein Pitx2. In the case of Pitx2, complexes with both a consensus and a non-consensus DNA binding site were examined. NMR-derived order parameters indicated moderate to substantial conformational freedom for the lysine NH3+ group in the complexes studied. To complement the experimental NMR measurements, molecular dynamics simulations were performed for the consensus complexes to gain further, detailed insights regarding the dynamics of the Lys50 side chain and other important residues in the protein-DNA interface.

Introduction

Fundamental cellular activities such as the transcription, replication, recombination and repair of genes require the non-covalent interaction of DNA and DNA-binding proteins. The underlying molecular recognition processes governing protein-DNA interactions are complex and not yet fully understood, despite the great number of structural, functional and thermodynamic studies that have been reported. A large number of molecular mechanisms affect the interactions of proteins and DNA, including hydrogen bonding, dehydration of surfaces, reorganization of counter-ion atmospheres, conformational changes such as coupled protein folding (and unfolding), electrostatic effects, and changes in dynamics. While significant progress has been made in quantifying some of these mechanisms, others require greater attention, in particular the characterization of molecular dynamics. The current body of research makes it clear that protein/DNA complexes employ a variety of thermodynamic strategies for achieving an appropriate binding free energy and for effecting the desired structural and functional properties. In regard to the contributions made by molecular dynamics, it is of particular interest to characterize the flexibility of amino acid side chains in the protein-DNA interface.1

A potentially important but too often neglected contribution to the thermodynamics of protein-DNA interactions is the loss of entropy that can occur from restricting the conformational freedom of amino acid side chains. In this context, entropic effects can arise from two sources, the first being changes in the number of conformations (rotamers) populated and the second being a restriction in the width of potential energy wells (a restriction in the range of dihedral angles). The sum of these two effects is termed configurational entropy.2 The summed effect of restricting the flexibility of several side chains at a protein-DNA interface may be
comparable in many cases to other factors that determine the binding thermodynamics,3 and may therefore play a key role in dictating the binding affinity and/or specificity. A significant challenge is to develop a detailed understanding of the conformational dynamics of amino acid side chains in protein/DNA complexes. Because the entropic contributions, including the presence and influence of side-chain conformational dynamics, are especially difficult to estimate based on structural information alone, it is very important to pursue experimental and theoretical studies that can provide direct information regarding these issues. For example, studies of the TFIIIA/DNA complex4 suggested that some lysine side chains involved in conferring sequence specificity fluctuate between contacts with different bases rather than making fixed contacts; NMR linewidth observations indicated that these lysine side chains exhibit motions on a microsecond-millisecond timescale, while evidence from NOE data indicated faster motion of another lysine side chain. We made similar observations, in terms of line-broadening effects for the key lysine in position 50 (vide infra) in our NMR studies of the Pitx2/DNA and Bicoid/DNA homeodomain complexes.5, 6 Another observation of relevance to the present work is the line-broadening observed for the side-chain resonances of the almost invariant asparagine in position 51 of homeodomains in complexes with DNA, which has been interpreted to indicate the presence of motions on the microsecond-millisecond timescale.5-8 Side-chain 15N relaxation measurements indicated the presence of time-dependent fluctuations in studies of several protein/DNA complexes.3, 9 These NMR studies provide strong evidence for the presence of conformational fluctuations at protein-DNA interfaces, in contrast to the prior perception of a rather well defined set of specific contacts between the protein side chains and the DNA. 
Such conformational flexibility at the protein-DNA interface may have arisen as a result of the thermodynamic advantage that flexibility confers on molecular recognition processes. It seems likely that a fine balance between rigidity and flexibility needs to occur in order to realize the specificity and affinity required of protein-DNA interactions.

A valuable model protein system for studying protein-DNA interactions is the homeodomain, as it represents one of the key families of eukaryotic DNA-binding motifs. Homeodomains consist of an N-terminal arm and three helices, including a helix-turn-helix DNA-binding motif (helices 2 and 3), and they bind to short DNA fragments consisting of six base pairs, predominantly the sequence TAATXY (where X and Y can be A, C, G, or T).10 The amino acids that control the DNA binding specificity of homeodomains are located mostly in the recognition helix (helix 3) and in the N-terminal arm.11-17 The principal contacts with the DNA bases involve amino acids usually in positions 2 or 3 and 5 of the N-terminal arm as well as residues at positions 47, 50, 51, 54 and 55 of the recognition helix18, 19 (the canonical homeodomain sequence is numbered from 1 to 60). These protein-DNA contacts appear to represent the major interactions responsible for nucleotide-specific binding, although surrounding residues can also affect specificity through indirect effects.18 Biochemical and genetic studies indicate that residue 50 is particularly important in determining the differential specificity of homeodomain-DNA recognition.14, 15, 17, 20-22 Although residue 50 is one of several amino acids that interact with the DNA binding site in a sequence-specific manner, it is the only residue that has clearly been shown to be involved in the discriminative recognition of distinct classes of DNA sequences.23 Homeodomain proteins are often classified according to the identity of residue 50, due to the key role that this residue plays in DNA binding specificity.
While glutamine is the most common residue found at position 50, a number of other residues also occur, including cysteine, serine, and lysine.23 Much attention has been focused on the consequences of lysine being located at position 50, largely due to the fact that the most dramatic
examples of altered DNA specificity occur when a lysine is either introduced or replaced at position 50.11, 15, 24, 25 The tightest and most specific binding occurs when lysine is present at position 50.26 Although there had been much interest expressed earlier in studying the structural consequences of a lysine in position 50,26 the first and only experimentally determined structures of a wild-type Lys50 homeodomain were reported from our group for the Bicoid6 and Pitx25 homeodomain/DNA complexes. The homeodomain of the Drosophila protein Bicoid has been the most extensively studied member of the Lys50 class of homeodomains (also commonly referred to as the K50 class). Bicoid is responsible for embryonic anterior structure development and recognizes DNA sequences present in enhancer elements of a wide variety of Bicoid-responsive genes. The consensus sequence to which Bicoid binds is TAATCC.10, 27 Pituitary homeobox protein 2 (Pitx2) is a Bicoid-related homeodomain transcription factor that was originally identified to be involved in Rieger syndrome,28 and is present in many tissues of developing vertebrate embryos as well as adult vertebrates. Many genes have been identified that are regulated by Pitx2.29 Pitx2 is another member of the Lys50 class of homeodomain proteins. Like Bicoid, Pitx2 recognizes the consensus DNA binding site TAATCC; Pitx2 also recognizes a number of non-consensus sites, including TAAGCT, TAAGCC, AAATCC, TCATCC, TTATCC, and CAATCC.30-34 Comparisons of data obtained from X-ray crystallographic and NMR studies of homeodomain/DNA complexes provide compelling reasons for studying the dynamic properties of these complexes. 
Crystallographic studies of homeodomains have indicated that there are several conserved and relatively stable contacts at the homeodomain-DNA interface.24, 35 In at least a few cases, such as for the crystal structures of the Antennapedia35 and even-skipped36 homeodomains, multiple conformations were observed for the Gln50 side chain. On the other hand, NMR studies of the Antennapedia homeodomain/DNA complex,37, 38 and subsequent molecular dynamics simulations,39, 40 have stressed the dynamic, fluctuating nature of the protein-DNA interactions. The Wüthrich group hypothesized39 that Antp achieves specificity through a fluctuating network of short-lived contacts that allows it to recognize DNA without the entropic cost that would result if side chains were immobilized upon DNA binding. The availability of structural data for Lys50 class homeodomain/DNA complexes and an investigation of the internal dynamics of such complexes by modern NMR methods will provide valuable new insights on these critical issues.

In the present work we have employed advanced NMR methodology41 for determining the dynamics of lysine side-chain amino groups via 15N relaxation measurements and applied it to the study of the key Lys50 residue in Pitx2/DNA and Bicoid/DNA complexes. In the case of Pitx2, complexes with both consensus and non-consensus DNA binding sites have been examined. To complement the experimental NMR measurements, molecular dynamics simulations have also been carried out on the consensus complexes in order to gain further, detailed insights regarding the dynamics of the Lys50 side chain as well as other important residues located in the protein-DNA interface. Our results indicated some differences in the Lys50 side-chain dynamics between Bicoid and Pitx2, with the Bicoid Lys50 side chain existing predominantly in one rotameric state on the picosecond-nanosecond timescale, whereas the Pitx2 Lys50 side chain exhibits somewhat greater conformational flexibility on the fast timescale, for both the consensus and non-consensus complexes. For both homeodomains, fluctuating hydrogen bond contacts are observed for the Lys50 NH3+ group with multiple bases in the DNA binding site.

Materials and Methods

Sample Preparation

Proteins were expressed from a pET-28 expression vector as a His6-TEV-GS-Pitx2 homeodomain-EFIVTD fusion protein or a His6-TEV-MSS-Bicoid homeodomain fusion protein in Escherichia coli BL21 Star (DE3) cells (Invitrogen). Expression conditions were as described previously.5 Cells were harvested by centrifugation at 2500g for 15 min. Harvested cells were resuspended in 137 mM NaCl, 2.7 mM KCl, 10 mM sodium phosphate dibasic, 2 mM potassium phosphate monobasic at pH 7.4 (PBS) with the addition of 10 mM imidazole. Resuspended cells were lysed by passing through a chilled French Press twice at 12,000 psi. The lysate was cleared by centrifugation at 25,000g for 30 min at 4 °C. Cleared lysate was applied to a pre-equilibrated 5 mL HisTrap HP column (GE Healthcare). The column was washed with 10 column volumes (c.v.) of PBS + 10 mM imidazole. An additional wash step of 10 c.v. with PBS + 100 mM imidazole was then performed. Target fusion protein was eluted with 10 c.v. of PBS + 500 mM imidazole. Protein concentrations were estimated via A278 (ε278 = 18350 cm−1 M−1) and the eluate was adjusted to contain 10% (v/v) glycerol, 5 mM β-mercaptoethanol and 5 mM EDTA. In-house produced TEV protease was added for fusion tag removal at a 1:25 ratio and cleavage was performed at 4 °C over 4 hours. Cleaved protein was then loaded on a 1 mL HiTrap SP FF (GE Healthcare) cation exchange column, washed with washing buffer (10 mM NaH2PO4, 400 mM NaCl, pH 7.0), and eluted with buffer containing a higher salt concentration (10 mM NaH2PO4, 1 M NaCl, pH 7.0). Purity was determined to be >98% by SDS-PAGE. The eluted homeodomain was then dialyzed overnight at 4 °C into 10 mM NaH2PO4, 150 mM Na2SO4, 1 mM EDTA, pH 7.0.

Protein/DNA Complex Formation

Hairpin-forming DNA sequences were used to form the protein/DNA complexes. The DNA was purchased from Integrated DNA Technologies (www.idtdna.com). Two different DNA sequences were used: 5’-GCTAATCCGCTTGCGGATTAGC-3’ (consensus) and 5’-GCTAAGCTGCTTGCAGCTTAGC-3’ (non-consensus). The DNA molecules were dissolved in nuclease-free water, heated to 95 °C for 15 min, and allowed to cool slowly to room temperature. Two-dimensional NMR spectra were run on the annealed DNA sample and indicated that a very homogeneous structure had been formed. A slight excess of the annealed DNA was added to either the purified Bicoid or Pitx2 homeodomains that had been dialyzed into 10 mM NaH2PO4, 1 mM EDTA, 0.02% NaN3, pH 7.0. The protein/DNA solutions were concentrated in an Amicon Ultra-15 spin filter (Millipore) with a molecular weight cut-off of 3 kDa. Samples were placed into 5 mm coaxial NMR tubes and stored at 4 °C. The final concentration of each sample was between 0.9 and 1.5 mM.

NMR spectroscopy

15N R1, R2,ini and R(4NzHzHz) relaxation rate constants and heteronuclear 1H-15N NOE values were measured for the homeodomain-DNA complexes using pulse sequences developed by Esadze et al.41 to probe the dynamics of lysine 15NH3+ groups; the pulse sequence code was written in-house. (Note that Iwahara’s group recently reported sensitivity-improved versions of their pulse sequences.42) The pulse sequence for the R2,ini experiment was modified from the original version of Esadze et al.41 in the following manner. The proton 180° pulses that are part of spin alignment elements bracketing the proton cw periods had the phase lists φ5 and φ7, instead of the proton 90° pulse immediately preceding (in the case of φ5) or immediately following (in the case of φ7) the 180° pulse, and these 90° pulses were applied with phase x.43 Data were recorded on Varian VNMRS 800 MHz and Inova 600 MHz spectrometers; the 800 MHz instrument was equipped with a 5 mm HCN triple resonance cryoprobe, while a conventional 5 mm HCN probe was employed on the 600 MHz instrument. Measurements were made at a temperature of 295 K. Two-dimensional 1H-15N HISQC-type spectra44, 45 were recorded in the process of determining the relaxation data. The R1 rate constants were determined using ten values of the parametric delay T: 10, 90, 190, 310, 440, 600, 800, 1050, 1400, and 2000 ms, with duplicate data recorded for the 10, 310 and 1050 ms time points in order to evaluate experimental errors in the peak intensities. The R2,ini rate constants were determined using eight time points: 16, 32, 48, 64, 80, 96, 112, and 128 ms, with duplicate data obtained for the 16, 64 and 112 ms time points. The effective field strength of the CPMG sequence in the R2,ini experiments was 122 Hz. The R(4NzHzHz) rate constants were obtained using eight time points: 3.9, 6.8, 14.7, 24, 35.3, 50, 70.7, and 106 ms, with duplicate data recorded for the 3.9, 24 and 70.7 ms time points. Recycle delays (including 5 s saturation periods) in the heteronuclear NOE experiments were 10 s for the 600 MHz data and 15 s for the 800 MHz data. The CPMG relaxation dispersion experiments were performed as described elsewhere,41, 43 using a 100 ms constant-time period for the CPMG element and employing CPMG field strengths, νCPMG, varying between 10 and 120 Hz.
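As a rough illustration of how a rate constant can be extracted from a series of relaxation delays like those listed above, the following Python sketch fits a single-exponential decay, I(T) = I0 exp(−R T), by linear least squares on ln I, and estimates the intensity noise from duplicate time points. It is not the Curvefit program or the in-house software used in this work; the mono-exponential model, the function names, and the synthetic intensities (I0 = 100, R = 1.5 s−1) are assumptions made purely for illustration.

```python
import math

def fit_decay_rate(delays_s, intensities):
    """Least-squares line through ln(I) versus T; returns (rate in s^-1, I0).

    Assumes a mono-exponential decay and strictly positive intensities.
    """
    n = len(delays_s)
    logs = [math.log(i) for i in intensities]
    mean_t = sum(delays_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)

def duplicate_noise(pairs):
    """Intensity standard deviation estimated from duplicate measurements.

    For n pairs with differences d_i, sigma^2 is estimated as sum(d_i^2) / (2 n).
    """
    n = len(pairs)
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2.0 * n))

# R1 delays quoted in the text, converted from ms to s
R1_DELAYS_S = [d / 1000.0 for d in (10, 90, 190, 310, 440, 600, 800, 1050, 1400, 2000)]

# Synthetic, noise-free intensities (hypothetical I0 and rate, illustration only)
TRUE_RATE, TRUE_I0 = 1.5, 100.0
synthetic = [TRUE_I0 * math.exp(-TRUE_RATE * t) for t in R1_DELAYS_S]

rate, i0 = fit_decay_rate(R1_DELAYS_S, synthetic)
noise = duplicate_noise([(10.0, 12.0), (5.0, 5.0), (8.0, 6.0)])  # made-up duplicates
```

With real data a nonlinear fit weighted by the duplicate-derived uncertainty would normally be preferred, since the logarithm distorts the noise at long delays where the intensities are small.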
Measurements of 15N R1 and R2 rate constants (Pitx2 and Bicoid complexes) and heteronuclear 1H-15N NOEs (Pitx2 complexes) were performed using standard experimental procedures46, 47 on the 600 MHz spectrometer for the protein backbone amide groups in the homeodomain-DNA complexes in order to determine the value for the molecular rotational correlation time τc and to obtain estimates of the generalized order parameter S2NH for the backbone 15N−H bond vectors; these measurements were done similarly to those reported previously for the Pitx2 homeodomain in the absence of DNA.48 All NMR data were processed with NMRPipe and spectral peak intensities extracted with NMRDraw,49 and the relaxation rate constants were extracted using the Curvefit program from the Palmer group50 or software written in-house.

Determination of order parameters and correlation times

Order parameters S2axis for the symmetry axis of the lysine 15NH3+ groups, τi correlation times for reorientation of the symmetry axis and τf correlation times for bond rotation around the symmetry axis were determined from the 15NH3+ group 15N relaxation parameters as described by Esadze et al.,41 using a Mathematica notebook they very helpfully provided in the Supporting Information section of their paper. Errors in the computed dynamics parameters were estimated by a Monte Carlo approach. Order parameters S2NH and the molecular rotational correlation time τc for the Pitx2 homeodomain backbone 15N−H vectors were determined from the 600 MHz relaxation data using the Modelfree program,50 as described elsewhere.48 The value of τc for the Bicoid/DNA complex was estimated from the trimmed mean of the R2/R1 ratio using the approach of Tjandra et al.51 as implemented in the r2r1_diffusion program from the Palmer group (http://www.palmer.hs.columbia.edu/software/r2r1_diffusion.html).

Molecular dynamics simulations

For the Pitx2 and Bicoid homeodomain/consensus DNA calculations, the models closest to the average structure, as determined by the program THESEUS52 from the NMR-derived ensembles of 17 structures5 for Pitx2 (PDB code 2LKX) and 20 structures for Bicoid6 (PDB code 1ZQ3), were chosen as starting structures for the homeodomains; these turned out to be model 1 in the PDB file for Pitx2 and model 16 for Bicoid. For our earlier studies where we determined the solution structures of the Pitx2 and Bicoid homeodomains bound to the DNA consensus sequence,5, 6 the DNA duplex was double-stranded. However, we subsequently found it more convenient to use single-stranded DNA
designed to form a hairpin. We chose to use the CTTG hairpin construct that has been shown to be highly stable.53 All of the NMR data reported in this paper were collected using complexes with the hairpin DNA construct. For our MD studies, we used the same hairpin DNA sequence that was employed for the NMR relaxation measurements. To generate a hairpin DNA structure to use in the simulations, we started with the crystal structure of a 32 base CTTG hairpin bound to a bacterial protelomerase54 (PDB code 4F41). The DNA was shortened to 22 bases. We then used the tleap module of the AMBER molecular dynamics package55 to convert the bases from the original sequence to the appropriate bases for the DNA sequence we desired; this substitution was done in an iterative fashion where we replaced two base pairs at a time and performed an energy minimization and short MD run to equilibrate the DNA structure at each step. Once the desired DNA structure had been obtained, we used the program Chimera56 to manually dock the homeodomains with the DNA hairpin in an orientation matching that of the NMR-derived structures that employed double-stranded DNA. At this point, extensive energy minimization and MD simulations were performed to equilibrate the structures of the complexes; in this process, restraints were employed for the highly conserved intermolecular contacts that have been observed in numerous crystal and NMR structures of homeodomain/DNA complexes. The specific distance restraints were Asn51 HD21 – A3 N7, Asn51 OD1 – A3 H62, Tyr25 HH – G6 OP1, Arg5 HH12 – T1 O2, and Arg5 HH22 – T1 O2 (employing the atom naming convention used in AMBER, 3-letter codes for amino acid residues, 1-letter codes for DNA nucleotides, and the numbering schemes indicated below in Figures 1 and 2). These restraints were removed for all subsequent stages of the simulations.
Using the tleap module of the AMBER version 16 software suite, the homeodomain/DNA complexes were solvated with approximately 6000 TIP3P57 water molecules in a truncated octahedral box sized to accommodate a minimum water shell thickness of 1.2 nm. The systems were neutralized by adding an appropriate number of K+ ions (13 for Pitx2 and 11 for Bicoid). The ff14SB force field58 in the AMBER 16 package was employed for the proteins and the bsc0 force field59 was used for the DNA. The sander module within AMBER was used for the energy minimization stages within the equilibration period while the pmemd.cuda code60, 61 was employed on a graphics processing unit for all non-minimization stages. The systems were extensively equilibrated in a multi-step process,62 starting with 5000 steps of energy minimization (2500 steps of steepest descent, followed by 2500 steps of conjugate gradient); all non-water and non-hydrogen atoms were restrained during this minimization step. A second minimization step was performed in which all restraints were removed. The systems were then equilibrated for 50 ps in the canonical NVT (constant Number, Volume and Temperature) ensemble63 during which time the temperature was ramped from 50 to 295 K during the first 30 ps; harmonic restraints of 10 kcal mol−1 Å−2 on the solute atoms were applied. Particle-Mesh-Ewald periodic boundary conditions64 and a cutoff of 0.9 nm for the nonbonded interactions were employed; the SHAKE algorithm65 was applied to constrain bonds to hydrogen atoms, and a time step of 1 fs was used. The next two stages in the equilibration process were each 100 ps in length, using NPT (constant Number, Pressure and Temperature) conditions, and the solute restraints were decreased in two steps, to 1.0 and 0.1 kcal mol−1 Å−2. Langevin dynamics66, 67 was used to regulate the temperature, with a collision frequency of 1 ps−1, and the pressure was regulated at 1 bar by isotropic position scaling, with a pressure relaxation time of 2 ps. The final stage of the equilibration process was a 5 ns calculation under NPT conditions with a 2 fs time step.
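The heating stage and step counts of the equilibration schedule described above can be tallied with a little arithmetic. The Python sketch below assumes a linear temperature ramp (the text gives only the end points and the 30 ps duration) and uses hypothetical helper names; it is a bookkeeping aid, not AMBER input.

```python
def ramp_temperature(t_ps, t_start=50.0, t_end=295.0, ramp_ps=30.0):
    """Target temperature (K) at time t_ps, assuming a linear ramp followed by a hold."""
    if t_ps >= ramp_ps:
        return t_end
    return t_start + (t_end - t_start) * (t_ps / ramp_ps)

def n_steps(length_ps, dt_fs):
    """Number of integration steps for a stage of the given length and time step."""
    return round(length_ps * 1000.0 / dt_fs)

nvt_steps = n_steps(50, 1)          # 50 ps NVT heating stage at 1 fs
final_npt_steps = n_steps(5000, 2)  # final 5 ns NPT stage at 2 fs
midpoint_temperature = ramp_temperature(15.0)  # halfway through the 30 ps ramp
```

Under these assumptions the heating stage spans 50,000 steps and the final NPT stage 2,500,000 steps.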
The production runs were performed in the NVT ensemble and the temperature was regulated using the weak coupling algorithm68 and a weak coupling time constant of 20 ps to the heat bath; structure coordinates were saved every picosecond. Hydrogen mass repartitioning69 was employed during the production runs, which allowed a time step of 4 fs to be used instead of 2 fs. A set of 10 production runs of duration 100 ns each was performed for each homeodomain/DNA complex. Each run was initiated from the same starting structure, generated as described above for the Bicoid and Pitx2 complexes. For each of the 10 runs, the initial velocities were selected from a Maxwell distribution at a temperature of 50 K, and the spatial distribution of the charge-compensating K+ ions was randomized using the randomizeions command within the CPPTRAJ program70 (K+ ions were specified to be no closer than 6 Å from solute atoms and at least 4 Å from each other). Performing multiple MD runs starting from the same structure but using different initial velocities followed the strategy recommended by Caves et al.71

Autocorrelation functions72 for the lysine Cε−Nζ bond vectors were calculated from the MD trajectories using the CPPTRAJ program70 in the AMBER suite of software; rotational tumbling of the complex was removed prior to the calculation of the autocorrelation functions by performing an RMSD fit of each frame of the trajectory to the initial frame; the backbone atoms of the helical regions of the homeodomains and the backbone atoms of the DNA were used in determining the RMSD fits. The autocorrelation functions were calculated individually for each of the 100 ns simulation runs for each homeodomain/DNA complex and the resulting 10 data sets were averaged together. Lysine side-chain amino group S2axis order parameters were obtained as the plateau value of the averaged autocorrelation functions. The backbone S2NH order parameters for the Pitx2/DNA complex were determined using the following expression:73


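$$S^2_{NH} = \frac{3}{2}\left[\langle x^2\rangle^2 + \langle y^2\rangle^2 + \langle z^2\rangle^2 + 2\langle xy\rangle^2 + 2\langle xz\rangle^2 + 2\langle yz\rangle^2\right] - \frac{1}{2} \qquad (1)$$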

where x, y and z are the Cartesian components of the relevant bond vector, normalized to a bond vector length of unity, and the angle brackets ⟨ ⟩ denote averages over the specified segment of the MD trajectory. The trajectories were split into blocks of 10 ns (approximately the value of the molecular rotational correlation time), eq. (1) was applied to each block, and the results were averaged together. The S2NH results were not very sensitive to the exact block length chosen. The angular order parameter, Sangle, for a given dihedral angle74, 75 was calculated as the normalized amplitude of the vector sum of the two-dimensional unit vectors with phase equal to the dihedral angle, summed over all frames of the ten trajectories for the Bicoid and Pitx2 complexes. The order parameter was normalized to the number of frames considered, which was one million for each complex. Hydrogen bond analysis of the MD trajectories was performed using the CPPTRAJ program. Hydrogen bonds were identified using a distance cutoff of 3.5 Å (distance between the donor and acceptor heavy atoms) and an angle limit of 100°. Structures were visualized with the program Chimera.56
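The block-averaged evaluation of eq. (1) can be illustrated with a short sketch. This is not the authors' analysis script; the function names, the toy trajectories and the block length of 10 frames are hypothetical and chosen only for illustration. With coordinates saved every picosecond, a 10 ns block would correspond to 10,000 frames.

```python
import math

def s2_eq1(vectors):
    """Evaluate eq. (1) for a sequence of (x, y, z) bond vectors."""
    n = len(vectors)
    sums = {"xx": 0.0, "yy": 0.0, "zz": 0.0, "xy": 0.0, "xz": 0.0, "yz": 0.0}
    for vx, vy, vz in vectors:
        # Normalize each bond vector to unit length, as required by eq. (1).
        norm = math.sqrt(vx * vx + vy * vy + vz * vz)
        x, y, z = vx / norm, vy / norm, vz / norm
        sums["xx"] += x * x
        sums["yy"] += y * y
        sums["zz"] += z * z
        sums["xy"] += x * y
        sums["xz"] += x * z
        sums["yz"] += y * z
    avg = {key: value / n for key, value in sums.items()}
    bracket = (avg["xx"] ** 2 + avg["yy"] ** 2 + avg["zz"] ** 2
               + 2.0 * (avg["xy"] ** 2 + avg["xz"] ** 2 + avg["yz"] ** 2))
    return 1.5 * bracket - 0.5

def block_averaged_s2(vectors, frames_per_block):
    """Apply eq. (1) to consecutive blocks and average the block results.

    Frames left over after the last complete block are ignored.
    """
    n_blocks = len(vectors) // frames_per_block
    if n_blocks == 0:
        raise ValueError("trajectory is shorter than one block")
    values = [
        s2_eq1(vectors[i * frames_per_block:(i + 1) * frames_per_block])
        for i in range(n_blocks)
    ]
    return sum(values) / n_blocks

# Toy trajectories (hypothetical): a rigid vector and a vector that jumps
# between two perpendicular orientations with equal populations.
rigid = [(0.0, 0.0, 2.0)] * 40
two_site = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)] * 20

print(block_averaged_s2(rigid, 10))
print(block_averaged_s2(two_site, 10))
```

The rigid vector gives the upper limit of the order parameter, while the equally populated two-site jump between perpendicular orientations gives a much lower value, as expected from eq. (1).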

Results and Discussion

Homeodomain/DNA complexes

Previously our group had determined the first and, to date, only structures of native Lys50-class homeodomains; specifically, we determined solution structures for Bicoid/DNA6 (PDB code 1ZQ3) and Pitx2/DNA5 (PDB code 2LKX) complexes, with the DNA containing the consensus binding site TAATCC. These structures agree well with those of other homeodomain/DNA complexes present in the Protein Data Bank76 (www.rcsb.org), exhibiting


the typical three-helix topology for the homeodomain. Figure 1 shows one structure from the NMR ensemble for the Pitx2/TAATCC DNA complex. Helix 3 of the homeodomain binds in the major groove of the DNA duplex while the flexible N-terminal tail wraps around to make several contacts within the minor groove and with phosphates in the DNA backbone. For ease of presentation we use the single-letter nucleotide designations and the numbering system T1A2A3T4C5C6 / G6G5A4T3T2A1 for the DNA consensus binding site; standard three-letter codes are used when referring to amino acid residues. Some of the key hydrogen bonds between the protein and DNA are indicated in the Figure, including the highly conserved interaction between Asn51 and A3, hydrogen bonds from the Lys50 NH3 group to G5, interactions between the side chains of Arg3 and Arg5 with bases in the minor groove, and a hydrogen bond from the Tyr25 OH group to the phosphate backbone of the DNA.

[Figure 1 image: structure of the complex, with helix 1, helix 2 and helix 3 of the homeodomain labeled.]

Figure 1: NMR-derived structure of the Pitx2/DNA complex, with the consensus TAATCC binding site shown in blue (PDB code 2LKX; model 1 in the ensemble of NMR structures). Some key intermolecular hydrogen bonds, involving side-chain groups in Arg3, Arg5, Tyr25, Lys50 and Asn51, are indicated in red; further details of the specific atoms involved in the hydrogen bonds are provided in the text.

Subsequent to our original structural studies, we found it more convenient to work with DNA hairpins instead of double-stranded DNA. We chose to use a hairpin containing the loop sequence CTTG, which was shown to provide high stability.53 A schematic of the DNA hairpin employed for the NMR relaxation measurements reported here, containing the consensus binding site, is shown in Figure 2. Also shown in Figure 2 is a depiction of the modeled 3D structure of this hairpin. This structure was obtained by starting with the DNA component in the crystal structure of a complex with a protelomerase TelA mutant54 (PDB code 4F41), truncating the DNA to 22 bases, and iteratively mutating the bases of this DNA sequence to the desired sequence shown in Figure 2 and performing a molecular dynamics refinement of the structure in each iteration. A comparison of the modeled hairpin structure to the DNA duplex from the NMR-derived structure of the Bicoid/DNA complex indicated a maximum likelihood RMSD of 0.27 Angstroms for the backbone of the TAATCC binding site, as determined using the THESEUS program. The Pitx2 and Bicoid homeodomains were previously determined to bind with high affinity the double-stranded DNA duplex (TAATCC consensus sequence) employed in our original structural studies (Kd of 2.6 ± 0.4 nM for Pitx25 and 0.43 ± 0.03 nM for Bicoid6). Although we did not repeat the affinity measurements for the hairpin DNA construct, we would anticipate very similar results, based on using the same TAATCC binding site and the similarity of the experimentally determined duplex and computationally modeled hairpin structures. As one experimental piece of evidence that the homeodomains bind the hairpin DNA in a very similar fashion as for the duplex DNA, we observed that the agreement in the 2D 1H-15N HSQC spectra of the two types of


complexes is excellent. This includes the highly shifted (from the free homeodomain values) 1H and 15N chemical shift values observed for the two side-chain NH2 signals from the Asn51 residue. Such Asn51 shifts are very characteristic for all reported NMR structural studies of homeodomain-DNA complexes.

We concluded that the homeodomains were fully DNA-bound, due to: (i) the expected high binding affinity; (ii) the absence of a second set of resonances corresponding to free homeodomain; and (iii) the fact that in the low salt conditions employed for the samples of the homeodomain/DNA complexes, the homeodomains by themselves are poorly soluble at the concentrations used for NMR measurements.
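Point (i) can be made concrete with the standard expression for 1:1 binding. The sketch below uses the Kd values quoted above for the duplex DNA; the sample concentrations are not stated in this section, so the 0.5 mM value for both protein and DNA is purely a hypothetical placeholder, and the function name is illustrative.

```python
import math

def fraction_protein_bound(protein_total, dna_total, kd):
    """Fraction of protein in a 1:1 complex (all inputs in the same units).

    Solves [PL]^2 - (Pt + Lt + Kd)[PL] + Pt*Lt = 0 for the physical root.
    """
    s = protein_total + dna_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * dna_total)) / 2.0
    return complex_conc / protein_total

KD_PITX2 = 2.6e-9    # M, reported for Pitx2 with the duplex DNA
KD_BICOID = 0.43e-9  # M, reported for Bicoid with the duplex DNA
ASSUMED_CONC = 0.5e-3  # M, hypothetical NMR sample concentration (assumption)

for name, kd in (("Pitx2", KD_PITX2), ("Bicoid", KD_BICOID)):
    bound = fraction_protein_bound(ASSUMED_CONC, ASSUMED_CONC, kd)
    print(f"{name}: fraction bound = {bound:.4f}")
```

Under these assumed concentrations, which are many orders of magnitude above either Kd, the calculated bound fraction exceeds 99% for both homeodomains, in line with the qualitative argument in the text.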

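[Figure 2 image: schematic of the CTTG DNA hairpin, with the binding-site base pairs labeled T1/A1, A2/T2, A3/T3, T4/A4, C5/G5 and C6/G6, alongside the modeled 3D structure of the hairpin.]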

Figure 2: Left: schematic of the CTTG DNA hairpin, containing the TAATCC consensus DNA binding site (indicated by grey box). The binding site residues are numbered for ease of discussion in the text. Right: DNA hairpin structure (see text for details on how this structure was determined).

NMR measurement of lysine side-chain 15NH3+ relaxation parameters

The NMR measurements were done on samples that were placed in coaxial NMR tubes, as suggested by Iwahara et al.45 No D2O was added to the NMR buffer, but rather to the thin


space between the outer and inner tubes, for the NMR lock signal. This meant that only signals from 15NH3+ groups were detected as there was no population of 15NHD2 or 15NH2D groups. The presence of the latter groups would lead to overlapping peaks from 15NH3+ and 15NH2D groups, thus complicating the data analysis. In the case of the Pitx2 homeodomain, there are 3 lysines (Lys50, Lys55 and Lys58), while in the Bicoid homeodomain there are 4 lysines (Lys37, Lys46, Lys50 and Lys57). (The amino acid sequences for the Bicoid and Pitx2 homeodomains are provided in Figure S1 in the Supporting Information section.) For all three homeodomain/DNA complexes that were studied, only one lysine 15NH3+ signal was detected, corresponding to Lys50. A contour plot encompassing the 15NH3+ region of a 2D HISQC spectrum of the Pitx2/TAATCC complex is shown in Figure S2 in Supporting Information, along with one-dimensional 15N and 1H cross-sections through the single observed peak. NMR signals from lysine 15NH3+ groups are usually difficult to detect under typical sample conditions for studies of protein/DNA complexes, due to rapid hydrogen exchange with the solvent water.45, 77 The rate of this exchange can be reduced by lowering the temperature and pH of the sample, and under such conditions Esadze et al. were able to detect signals from multiple lysine residues in their studies.41 In our case, however, we wished to work under conditions closer to physiological pH and temperature. Also, we were focused on studying the dynamics of the key Lys50 residue, so the fact that only the signal from Lys50 was detected was not a significant drawback.
Regarding the assignment strategy for the lone lysine 15NH3+ resonance detected for each complex, coherence transfer methodology was not successful, due to exchange broadening observed previously for the aliphatic portion of the Lys50 side chain.5, 6 Given this situation, the following logic was followed in assigning the observed resonances to Lys50. Under the temperature and pH


conditions we used, we would expect to observe significant signals only for those lysine 15NH3+ groups that were substantially protected from solvent exchange. Of all the lysine residues in the Pitx2 and Bicoid complexes, only Lys50 is in common between the two homeodomains, and is buried in the protein-DNA interface, with its side-chain amino group forming extensive hydrogen bonds with base atoms in the major groove. For Pitx2, Lys55 and Lys58 are located in helix 3 and are fully exposed to the bulk solvent. For Bicoid, Lys37 is at the C-terminal end of helix 2, Lys46 is near the beginning of helix 3 and Lys57 is near the end of helix 3, and all are fully exposed to the bulk solvent. Thus, all of the lysine 15NH3+ groups except for Lys50 in Pitx2 and Bicoid would be expected to exhibit rapid hydrogen exchange with the bulk solvent and therefore be difficult to observe in the NMR spectra. As an independent check on this reasoning for the assignment of the Lys50 resonances, a mutant was generated of the Pitx2 homeodomain with a Lys55Arg substitution and a complex formed with the TAATCC DNA hairpin. A standard 2D 1H-15N HSQC spectrum of this complex was highly superimposable with a spectrum of the wild-type sample, as shown for the backbone residues in Figure S3a, with small shifts seen mainly for residues in helix 3 in the vicinity of Lys/Arg55. The wild-type Lys55 backbone peak and the tentatively assigned mutant Arg55 peak are labeled in the overlaid spectra. Figure S3b shows the overlay of the arginine Nε-Hε side-chain region of the 2D spectra, with the expected additional signal for Arg55 labeled. Most importantly, the 2D HISQC spectrum of the mutant sample, shown in Figure S3c as an overlay with the wild-type spectrum, again revealed one and only one signal in the lysine side chain region, proving unambiguously that the observed signal could not arise from Lys55 in the case of Pitx2. 
Because only one lysine 15NH3+ signal was observed, it was conceivable to use one-dimensional versions of the pulse sequences for all of the relevant NMR experiments, employing 15N frequency selective pulses to focus in on the region around 33 ppm where the lysine side-chain signals appear. However, to eliminate further any possibility of contamination by residual signals from the backbone 15NH groups, we ran all experiments as 2D versions and set the 15N spectral width such that no folded signals would overlap the 15NH3+ region. In our experience, running the experiments as 2D versions provides an extra level of reliability in peak intensity measurements, for example by reducing baseline-related issues. With only one detected lysine 15NH3+ signal in each spectrum, we were able to place the 15N transmitter on-resonance.

Frequency selectivity for the 15NH3+ region was achieved using simple rectangular 15N 180° pulses instead of the typical broadband inversion pulses in the refocused INEPT pulse sequence elements at the beginning and end of the pulse sequence; the rf power for these selective pulses was reduced to 3.6 kHz (600 MHz data) or 3.3 kHz (800 MHz data) from the typical value of ~6.25 kHz. An excellent presentation of the theoretical foundation for 15N spin relaxation measurements of 15NH3+ groups has been given by Iwahara and co-workers.41, 78 The 15NH3+ group can be represented by an AX3 spin system and all of the spin physics that have previously been worked out for 13CH3 methyl groups is directly applicable to 15N relaxation of 15NH3+ groups. Relaxation is dominated by auto- and cross-correlated dipolar interactions as the 15N CSA for lysine 15NH3+ groups is small.79 The measurement of 15N longitudinal relaxation rate constants, R1, of 15NH3+ groups is straightforward using an appropriate pulse sequence. As is the case for 13CH3 methyl groups, more complex behavior is expected for 15N transverse relaxation of 15NH3+ groups due to strong dipolar cross-correlation effects among the 15N-1H dipoles, which leads to bi-exponential signal decay. However, with the pulse sequence reported by Esadze et al.41 and limiting data collection to approximately the first 30% of the transverse relaxation


signal decay, an R2,ini rate constant is measured that can be employed in the determination of the 15NH3+ group dynamics. The heteronuclear 1H-15N NOE experiment is implemented with important modifications to the manner in which the proton magnetization is saturated.41, 80, 81 In order to make use of the NOE data it is necessary to measure the relaxation of the longitudinal three-spin order term 4NzHzHz; the pulse sequence for this experiment is similar to that for the R1 experiment.

Table 1: Measured relaxation parameters for the Lys50 side-chain 15NH3+ amino groups; data were collected on 600 and 800 MHz spectrometers.

| Parameter | Field (MHz) | Pitx2/TAATCC | Pitx2/TAAGCT | Bicoid/TAATCC |
|---|---|---|---|---|
| R1 (s-1) | 600 | 1.51 ± 0.02 (1.51) a | 2.29 ± 0.11 (2.22) | 1.91 ± 0.11 (1.89) |
| R1 (s-1) | 800 | 1.29 ± 0.01 (1.28) | 1.83 ± 0.03 (1.88) | 1.52 ± 0.02 (1.54) |
| R2,ini (s-1) | 600 | 2.90 ± 0.20 (2.95) | 5.25 ± 0.51 (5.30) | 3.72 ± 0.54 (3.85) |
| R2,ini (s-1) | 800 | 2.81 ± 0.08 (2.76) | 5.09 ± 0.11 (5.04) | 3.71 ± 0.16 (3.57) |
| NOE | 600 | -2.783 ± 0.034 (-2.845) | -1.920 ± 0.054 (-1.947) | -2.41 ± 0.07 (-2.53) |
| NOE | 800 | -2.637 ± 0.011 (-2.604) | -1.549 ± 0.015 (-1.533) | -2.216 ± 0.026 (-2.144) |
| R(4NzHzHz) (s-1) | 600 | 27.0 ± 1.1 | 21.3 ± 1.3 | 45.7 ± 2.7 |
| R(4NzHzHz) (s-1) | 800 | 31.9 ± 0.3 | 19.3 ± 0.3 | 43.8 ± 1.0 |

a The values in parentheses are back-calculated from the derived order parameters and correlation times presented in Table 2.
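One simple way to gauge how well the back-calculated values in parentheses reproduce the measurements is to express each difference in units of the reported uncertainty. The sketch below is only an illustration of that arithmetic using the R1, R2,ini and NOE entries transcribed from Table 1; it is not the fitting procedure used in the paper, and the names are hypothetical.

```python
# (measured, uncertainty, back-calculated) triples transcribed from Table 1.
COMPLEXES = ("Pitx2/TAATCC", "Pitx2/TAAGCT", "Bicoid/TAATCC")
TABLE_1 = {
    ("R1", 600): [(1.51, 0.02, 1.51), (2.29, 0.11, 2.22), (1.91, 0.11, 1.89)],
    ("R1", 800): [(1.29, 0.01, 1.28), (1.83, 0.03, 1.88), (1.52, 0.02, 1.54)],
    ("R2,ini", 600): [(2.90, 0.20, 2.95), (5.25, 0.51, 5.30), (3.72, 0.54, 3.85)],
    ("R2,ini", 800): [(2.81, 0.08, 2.76), (5.09, 0.11, 5.04), (3.71, 0.16, 3.57)],
    ("NOE", 600): [(-2.783, 0.034, -2.845), (-1.920, 0.054, -1.947), (-2.41, 0.07, -2.53)],
    ("NOE", 800): [(-2.637, 0.011, -2.604), (-1.549, 0.015, -1.533), (-2.216, 0.026, -2.144)],
}

def normalized_residuals(table):
    """Return (measured - back-calculated) / uncertainty for every entry."""
    residuals = {}
    for (parameter, field), triples in table.items():
        for complex_name, (measured, error, calculated) in zip(COMPLEXES, triples):
            residuals[(parameter, field, complex_name)] = (measured - calculated) / error
    return residuals

residuals = normalized_residuals(TABLE_1)
for key, value in sorted(residuals.items()):
    print(key, round(value, 2))
```

Entries with a magnitude near or below 1 agree within the stated error; larger magnitudes flag the parameters that the fitted model reproduces least well.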

Using the pulse sequences developed by Esadze et al.,41 we measured 15N R1, R2,ini and R(4NzHzHz) rate constants and the heteronuclear 1H-15N NOE for the Lys50 15NH3+ group in three homeodomain/DNA complexes: Bicoid/TAATCC, Pitx2/TAATCC and Pitx2/TAAGCT.


Measurements were done on both 600 MHz and 800 MHz NMR instruments. The resulting data are presented in Table 1; the numbers in parentheses are the back-calculated values determined from the numerical fitting procedures described below. Example R1 and R2,ini relaxation decay curves are shown in Figure 3 for the Pitx2/TAATCC complex; the data for R2,ini at 800 MHz are omitted from Figure 3b because the rate constants at 600 and 800 MHz are nearly the same, so the two fitted curves would be difficult to distinguish if overlaid in one plot. The range of values appearing in Table 1 is generally consistent with data for ubiquitin.41 While it is clear there are variations in the relaxation data for the three different complexes, specific interpretation of these differences in terms of 15NH3+ group dynamics is facilitated by further analysis to extract order parameters and motional correlation times.
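Extracting a rate constant from a relaxation decay curve amounts to fitting a single exponential, I(t) = I0 exp(-R t), to peak intensities measured at a series of relaxation delays. The sketch below uses a linear least-squares fit to ln I, which is a simplification chosen for brevity; the fitting software actually used is not described here, and a weighted nonlinear fit is the more common choice for noisy data. The delays and amplitude are hypothetical, and the decay is simulated rather than measured.

```python
import math

def fit_decay_rate(delays, intensities):
    """Fit I(t) = I0 * exp(-R * t) by linear least squares on ln(I).

    Returns (R, I0). Assumes all intensities are positive.
    """
    logs = [math.log(value) for value in intensities]
    n = len(delays)
    t_mean = sum(delays) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in delays)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)

# Hypothetical relaxation delays (s) and a noiseless decay simulated with the
# 600 MHz R1 value reported in Table 1 for Pitx2/TAATCC (1.51 s^-1).
delays = [0.0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0]
simulated = [100.0 * math.exp(-1.51 * t) for t in delays]

rate, amplitude = fit_decay_rate(delays, simulated)
print(f"fitted rate = {rate:.3f} s^-1, amplitude = {amplitude:.1f}")
```

For the R2,ini measurement the same single-exponential form applies only because data collection is restricted to the initial part of the otherwise bi-exponential decay.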

Figure 3: Plots of (a) R1 and (b) R2,ini relaxation decay data and fitted exponential functions for the Lys50 side-chain 15NH3+ group in the Pitx2/TAATCC complex. In (a), the solid line is the best fit for the 600 MHz data and the dashed line is for the 800 MHz data. In (b), only the 600 MHz data is shown as the 800 MHz R2,ini relaxation rate constant is nearly the same and thus the two data sets would be difficult to distinguish.

In addition to the relaxation measurements indicated above, we also carried out 15N relaxation dispersion experiments at 600 MHz on the Lys50 15NH3+ groups of all three homeodomain/DNA complexes. Such measurements are valuable in probing for the presence of motions on a slow (millisecond-microsecond) timescale.82 As described by Esadze et al.,41 R2,ini measurements were performed as a function of the CPMG field strength, νCPMG, and the results are shown in Figure 4. The red horizontal lines in the Figure panels give a visual reference to the R2,ini values reported in Table 1, with the error limits being indicated by the grey bands. The absence of a significant CPMG field dependence of the R2,ini values for the Bicoid/TAATCC and Pitx2/TAATCC consensus complexes and the relatively slow relaxation rates indicate the absence of significant, slow timescale dynamics for the Lys50 15NH3+ group in these complexes. In contrast, there appears to be a very small CPMG field dependence of R2,ini for the Pitx2/TAAGCT non-consensus DNA complex, in addition to an overall elevation in the R2,ini relaxation rate constants. These observations provide an indication of the presence of slow timescale motion for the Lys50 side-chain amino group in the non-consensus complex. Due to the relatively small dispersion observed in the 600 MHz data, we did not pursue measurements at 800 MHz and did not try to quantitate any exchange parameters from just single-field relaxation dispersion data.


Figure 4: Plots of 15N R2,ini relaxation dispersion data for the Lys50 15NH3+ group in the two Pitx2/DNA complexes – (a) TAATCC and (b) TAAGCT binding sites, and (c) the Bicoid/DNA complex (TAATCC). The horizontal red lines provide a visual reference to the value of the conventional R2,ini relaxation rate constants from Table 1 and the grey bands indicate the error limits in these values.

Lysine 15NH3+ order parameters and motional correlation times

The contributions to the dynamics of lysine side chains considered in the analysis here are the rotation of the 15NH3+ group about its symmetry axis, the reorientation of this symmetry axis with respect to a molecule-based frame of reference,41 and the overall tumbling of the protein-DNA complex. The symmetry axis is equivalent to the Cε−Nζ bond vector. In the model considered here, these motions are considered to be uncoupled from each other, as elaborated


upon elsewhere.83, 84 The rotation of the 15NH3+ group is characterized by a correlation time τf, while the symmetry axis motion is characterized by an order parameter S2axis and a correlation time τi. (The order parameter ranges from 0 to 1, corresponding to the amplitude of reorientational motion in a molecular reference frame on a picosecond-nanosecond timescale, with 0 resulting from no restriction and 1 to complete immobilization). In the classic paper by Kay and Torchia on 13C methyl group relaxation in macromolecules,85 their eq. 15 was derived assuming that τf Lys mutant homeodomain-DNA complex, with the DNA containing the TAATCC binding site preferred by Lys50 homeodomains. Their structural data indicated several direct hydrogen bonds from the Lys50 amino group to the O6 and N7 atoms of G5 and G6, and an alternative conformation of the Lys50 side chain allowed for a hydrogen bond with the O4 atom of T4. Our NMR results and MD simulations indicate that this is a fluctuating network of hydrogen bonds, rather than fixed patterns from a few static conformations. Our MD data also supports the prevailing view that water molecules are very important mediators in the hydrogen bonding network present in the homeodomain-DNA interface.39, 40, 89, 100, 101 In addition to the hydrogen bonding network between the Lys50 side-chain amino group and the bases of the DNA binding site, it is apparent from the MD simulations that hydrophobic interactions between the Cε carbon of the Lys50 side chain and the DNA are also important; in the present context a hydrophobic interaction is defined as being present between two carbon atoms that are separated by 4.5 angstroms or less.40 In the case of Pitx2, there is a hydrophobic contact with the G6 C5 and C6 atoms in 95% and 93% of the MD trajectory frames, respectively,


and with the G5 C5 and C6 atoms in 37% and 78% of the frames, respectively. For Bicoid, the data are quite similar to those of Pitx2 for the contacts between Lys50 Cε and the G6 C5 and C6 atoms, with contacts present in 95% and 87% of the frames, respectively. Contacts with the G5 C5 and C6 atoms are observed in 7% and 25% of the trajectory frames, representing a distinct difference from the Pitx2 results. The observations of hydrophobic contacts between Lys50 and the G5 and G6 bases of the DNA are in agreement with results from MD simulations reported by Gutmanas and Billeter40 (the identity of the atoms involved in the contacts was not reported in their paper).

Lysine-DNA Salt Bridges

Although experimental determination of order parameters for the Cε−Nζ bond vector of lysine residues other than Lys50 in the Pitx2/DNA and Bicoid/DNA complexes was not possible, as the 15NH3+ resonances for these residues were not observed, the MD-derived autocorrelation functions are included in Figure 5. For Pitx2, the S2axis order parameter for Lys58 is quite low (0.102 ± 0.009), indicating a high degree of conformational freedom, whereas the order parameter for Lys55 (0.615 ± 0.033) is indicative of some moderate restriction of the bond vector motion. Considering the surface-exposed location of Lys55, initially it seemed surprising that its order parameter was higher than that for Lys50, which is buried in the protein-DNA interface. However, upon further analysis, the MD simulations revealed frequent salt bridge contacts between the Lys55 15NH3+ group and the phosphate backbone of the DNA at the positions of T1 and A2. Such salt bridge contacts, also referred to as ion pair contacts,102 occur frequently between the side chains of arginine and lysine residues in proteins and the phosphate backbone of DNA.103 The dynamic properties of salt bridge contacts in the context of protein-DNA interactions have been the subject of much interest, both in terms of experimental and


theoretical studies.77, 78, 86, 90, 104-109 Shown in Figure S7a is a plot of the probability density of the distance from the Nζ atom of Pitx2 Lys55 to the closest phosphate oxygen of T1 and A2; the peak centered at ~2.8 Å is indicative of a salt bridge104, 105 that is present for ~85% of the total duration of the MD simulations. Despite this high population of salt bridge interactions involving the Lys55 NH3+ group, the overall flexibility of the Lys55 side chain is greater than that for Lys50, as indicated by the larger number of rotamer states for Lys55 than for Lys50 (vide supra) and the lower angular order parameters for Lys55 compared to Lys50 (vide infra). Similar findings were reported by the Palmer group110 in a combined NMR/MD study of arginine side chain dynamics in E. coli ribonuclease H; for example, the side chain of Arg27 exhibited greater conformational entropy than that for Arg75, despite having a significantly higher S2 order parameter for the Nε-Hε bond vector. This observation was explained in part by the involvement of the Arg27 NεHε group in a salt bridge with the carboxylate of Glu6. For the case of Bicoid, the MD simulations indicate that the additional lysine residues all exhibit substantial conformational freedom, with order parameters of 0.245 ± 0.018 for Lys37, 0.430 ± 0.024 for Lys46 and 0.306 ± 0.010 for Lys57. A salt bridge is observed between the side chains of Lys37 and Glu15 (occupancy of ~60% duration of the total MD simulation), which is reminiscent of the network observed in the Engrailed homeodomain,111 with an additional salt bridge observed between Lys37 and Asp33 of ~10% duration. Despite its low S2axis order parameter, a salt bridge is observed between Lys57 and the DNA phosphate backbone involving G5 and G6 for ~70% of the total duration of the MD simulation, as indicated by the distance probability density for this salt bridge that is shown in Figure S7b. 
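The occupancies quoted for salt bridges and hydrophobic contacts are, in essence, the fraction of trajectory frames in which a distance falls below a cutoff. The sketch below illustrates that bookkeeping with a few made-up coordinates. The 4.5 Å carbon-carbon cutoff for hydrophobic contacts is the one stated in the text; the 4.0 Å N-O cutoff for a salt bridge is an assumption made here for illustration, since the cutoff applied to the distance distributions is not given in this section. All function and variable names are hypothetical.

```python
import math

HYDROPHOBIC_CUTOFF = 4.5  # angstroms, carbon-carbon, as defined in the text
SALT_BRIDGE_CUTOFF = 4.0  # angstroms, N-O; assumed value for illustration

def closest_distance(atom, partners):
    """Distance from one atom to the nearest of several partner atoms."""
    return min(math.dist(atom, partner) for partner in partners)

def occupancy(distances, cutoff):
    """Fraction of frames in which the distance is within the cutoff."""
    return sum(1 for d in distances if d <= cutoff) / len(distances)

# Made-up frames: an N-zeta position and two phosphate oxygen positions.
frames = [
    ((0.0, 0.0, 0.0), [(2.8, 0.0, 0.0), (0.0, 5.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(3.0, 0.0, 0.0), (0.0, 5.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(6.0, 0.0, 0.0), (0.0, 5.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(2.9, 0.0, 0.0), (0.0, 5.0, 0.0)]),
]

distances = [closest_distance(nz, oxygens) for nz, oxygens in frames]
print(distances)
print(occupancy(distances, SALT_BRIDGE_CUTOFF))
```

A histogram of the same per-frame distances, normalized to unit area, gives a probability density of the kind plotted in Figure S7.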
The secondary, broad peak seen in Figure S7b at a distance of ~4.5 Å is due mainly to conformations in which the Lys57 side chain forms a hydrogen bond with the backbone carbonyl oxygen of Arg53 and, at other


times, with the DNA G6 backbone O3’ atom. Lys46 has a sparsely populated (~20%) salt bridge to the phosphate backbone outside the TAATCC binding site, and it is also somewhat restricted sterically by proximity to the side chain of Leu31.

Dynamics of Additional Protein Side Chain – DNA Contacts

Aside from the side chain of residue 50, there are several other homeodomain side chain – DNA contacts that are important for binding. In the recognition helix, the side chains of residues in positions 47, 50, 51 and 54 are normally involved in the interface with the DNA major groove. The N-terminal arm makes contact with the minor groove, most commonly via residues in positions 2, 3 and 5. The molecular dynamics simulations provide detailed information regarding the conformational properties of the side chains of these residues in the homeodomain – DNA interface. Position 47 of the homeodomain is commonly occupied by a valine or isoleucine residue, and makes a hydrophobic contact with the major groove of the DNA. In the case of Pitx2, residue 47 is a valine, and makes contact with the base of T4 in the consensus DNA site. The MD simulation indicates that this hydrophobic contact is quite stable. There is a hydrophobic contact between the Val47 Cγ1 and T4 C7 atoms ~90% of the time in the MD trajectories, and between Val47 Cγ2 and T4 C7 atoms ~84% of the time. In order to display the side chain dihedral angle variation over the course of the ten simulations in a single plot, the dihedral angles were binned to obtain the distribution. A plot of the Pitx2 Val47 χ1 dihedral angle distribution, shown in the top panel of Figure S8, reveals that the side chain of Pitx2 Val47 predominantly populates the t rotameric state97, with very small populations in the p and m states. In the case of Bicoid, there is an isoleucine at position 47.
Similar to the results seen for Pitx2 Val47, there are nearly constant hydrophobic contacts between the Bicoid Ile47 methyl groups and the T4 C7 atom (~83% of the


time for each methyl group). The plots of the χ1 and χ2 dihedral angle distributions, shown in the bottom two panels of Figure S8, reveal that there are significant populations of the mt and mm rotameric states, indicating that there is much greater conformational freedom for the Bicoid Ile47 methyl groups than for the case of Pitx2 Val47, even though the hydrophobic contacts with T4 C7 are similarly conserved. As mentioned previously, the residue at position 51 of the homeodomain is nearly invariant, being an asparagine whose side chain forms hydrogen bond contacts with the floor of the DNA major groove.10 The direct hydrogen bonds observed between Asn51 and A3 are Asn51 Nδ2 – A3 N7 and Asn51 Oδ1 – A3 N6. For Pitx2 these two hydrogen bonds are observed in approximately 60% and 35%, respectively, of the frames of the MD trajectories, while the populations in the case of Bicoid are notably higher, at ~90% and ~60%, respectively. Water-bridged hydrogen bonds also are quite prevalent. In addition to the water bridge between Lys50 Nζ and Asn51 Oδ1, water-bridged hydrogen bonds are present (more than approximately 10% of the time) between the Asn51 Oδ1 and A2 OP2, A2 N7, A3 N7, and T4 O4 atoms for Pitx2. In the case of Bicoid, the predominant (> ~10% of the trajectory frames) water-bridged hydrogen bonds involve bridging between Asn51 Oδ1 and A2, A3 and A4 H61/H62. Plots of the Asn51 χ1 - χ2 dihedral angle distributions for Pitx2 and Bicoid are shown in Figure S9, and reveal a distinct difference between the two homeodomains. In the case of Bicoid, the χ1 dihedral is mainly in the m state97, while for Pitx2, both the t and m conformations are populated, at a ratio of approximately 1:2.6. The ptm dihedral designations do not apply for χ2 of Asn residues due to different physical constraints for segments involving sp3 – sp2 hybridized atoms112.
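Rotamer populations such as the t:m ratio quoted above can be obtained by assigning each sampled dihedral angle to the nearest of the three staggered states. The sketch below assumes the usual convention in which p, t and m denote angles near +60°, 180° and -60°, with 120°-wide windows centered on those values; the exact binning used for Figures S8 and S9 is not specified here, so the window boundaries are an assumption, and the angle series is made up.

```python
def rotamer_state(chi_degrees):
    """Assign a dihedral angle to the p (+60), t (180) or m (-60) state."""
    chi = (chi_degrees + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    if 0.0 <= chi < 120.0:
        return "p"
    if -120.0 <= chi < 0.0:
        return "m"
    return "t"

def rotamer_populations(chi_series):
    """Fractional population of each rotameric state in a series of angles."""
    counts = {"p": 0, "t": 0, "m": 0}
    for chi in chi_series:
        counts[rotamer_state(chi)] += 1
    total = len(chi_series)
    return {state: count / total for state, count in counts.items()}

# Made-up chi1 angles in degrees; 185 is equivalent to -175 after wrapping.
example_chi1 = [-65.0, -70.0, 178.0, -175.0, 62.0, -60.0, 185.0]

print([rotamer_state(chi) for chi in example_chi1])
print(rotamer_populations(example_chi1))
```

For two-dihedral designations such as mt or mm, the same assignment would be applied to χ1 and χ2 of each frame and the two labels concatenated.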
The Asn51 χ2 distribution is noticeably broader for Pitx2 than for Bicoid, with distinctly different average positions for the t and m χ1 states in the case of Pitx2. The


differences observed in the Asn51 dihedral angle distributions for the Pitx2 and Bicoid complexes are consistent with the significant differences pointed out above in the populations of the hydrogen bonds observed between Asn51 and A3. Position 54 of the Pitx2 homeodomain is an alanine, and its side chain makes no direct contact with the DNA. Similar to the case of other homeodomains with an alanine in position 54, such as engrailed, paired and Pax3, there is a cavity in the homeodomain – DNA interface in the vicinity of the Ala54 methyl group which is filled with a network of water molecules.24, 113 In the case of Bicoid, the residue at position 54 is an arginine. The MD simulations indicate the presence of a salt bridge between the Arg54 guanidinium group and the phosphate backbone at position G5 of the DNA. This salt bridge exists in >95% of the frames of the MD simulations, as shown in Figure S10. In addition, there is a hydrophobic contact between the Arg54 Cδ methylene group and the base of G5 in >95% of the frames of the MD simulations, the base of A4 in ~35% of the frames. The close contact between the Arg54 side-chain Cδ methylene group and the deoxyribose unit of G5 was observed in our previous NMR structural studies of the Bicoid/TAATCC complex.6 As expected, the N-terminal arm of both the Pitx2 and Bicoid homeodomains interacts with the DNA principally through the side chains of arginine residues in positions 2, 3 and 5. The most stable interaction is between the guanidinium group of Arg5 and the T1 O2 atom in the minor groove. In contrast to the stable interaction between the Arg5 side chain and the DNA minor groove, the interactions of Arg2 and Arg3 with the DNA are more variable and transient in nature. For both Pitx2 and Bicoid, Arg2 contacts the DNA mainly via salt bridges between the arginine side chain and backbone oxygen atoms in the two residues immediately outside (in the 3’ direction) of the ATTAGG anti-sense strand of the DNA. 
The interactions of the guanidinium
group of Arg3 were observed to be qualitatively different between Pitx2 and Bicoid. In the case of Pitx2, the most frequent interactions are salt bridges with the phosphate group of T4 and, to a lesser extent, A3. In the case of Bicoid, the interaction of Arg3 with the DNA is dominated by hydrogen bonding to the O2 atom of T2 and the N3 atom of A3.

In addition to the most commonly observed homeodomain-DNA interactions discussed above, other significant interactions were also observed in the MD simulations. In the case of Pitx2, Arg44 has salt bridge interactions with the phosphate groups of A3 and T4 in ~80% of the frames of the MD simulations. The guanidinium group of Arg53 interacts essentially 100% of the time via salt bridges with the phosphate groups of G6 and the adjacent cytosine and via a hydrogen bond with the backbone oxygen of Arg24. There is also a cation-π interaction114 between the Arg53 and Tyr25 residues. Arg57 exhibits salt bridge interactions with G5 and G6 in ~90% of the frames of the MD simulations. Arg59 interacts via salt bridges with the phosphate groups of T1 and the adjacent cytosine in ~70% of the MD trajectory frames. In the case of additional DNA contacts in the Bicoid complex, Arg53 exhibits the same interactions as described above for Pitx2. Arg54 exhibits essentially continuous salt bridge interactions with the phosphate group of G5, as noted above. Arg55 has salt bridges to the phosphate groups of T1 and A2 in ~85% of the frames of the MD simulations.

Angular order parameters for Arg and Lys side-chain dihedral angles

The focus in all of the above discussion was on the dynamics and interactions of the lysine 15NH3+ and arginine guanidinium groups. To gain some additional insights, the MD simulation data can also be mined for information regarding the dynamics of the side-chain dihedral angles of these residues.
A convenient and concise way of characterizing such dynamics is to calculate a so-called angular order parameter, Sang, for each side-chain dihedral angle, as
described in the Methods section.74, 75 This order parameter ranges from 0 to 1, where a value of 0 indicates essentially complete disorder, while a value of 1 indicates an immobilized state. The angular order parameters were computed for the χ1-χ4 angles using all frames of the ten 100 ns trajectories for each homeodomain complex, and the results for the Pitx2 and Bicoid complexes are shown in Figures S11 and S12, respectively. One of the main, general observations is that considerable side-chain mobility can be maintained despite the presence of significant salt bridge and hydrogen bond interactions. A similar conclusion was drawn by Iwahara’s group in their NMR and MD studies of the zinc finger protein Egr-1.106 Most of the side chains exhibit substantial dihedral angle variation, with several exceptions. For both Pitx2 and Bicoid, the side chains of the highly conserved Arg52 and Arg53 residues are tightly constrained. In the case of Arg52, there are very stable salt bridge interactions with Glu17. The salt bridge and hydrogen bond interactions constraining Arg53 were pointed out above. The Arg31 side chain in Pitx2 is also highly constrained, due to a network of salt bridge interactions involving Arg31-Glu42-Arg46. The Glu17-Arg52 and Arg31-Glu42 pairings are frequently observed in homeodomains.111, 115 As might be anticipated from the experimental Lys50 S2axis order parameter data, the Lys50 side chain in Pitx2 exhibits somewhat greater dihedral angle variation than does the same side chain in the Bicoid complex.

Dynamics of Val45 in the Bicoid/TAATCC complex

The analysis above has focused on the dynamics of amino acid side chains in the homeodomain-DNA interface. The hydrophobic core of the homeodomain motif has generally been found to be well packed, with little conformational flexibility expected.
Interestingly, a report on conformational heterogeneity in a Bicoid homeodomain/DNA complex, as observed by vibrational spectroscopy, was recently published by the Romesberg group.116 The Bicoid
homeodomain was selectively labeled with deuterium at five residue positions (Leu16, Leu21, Leu31, Val45 and Lys50) and Fourier transform infrared (FTIR) absorption measurements were recorded for the C-D bonds in the side chains of the labeled residues. The authors noted that the complex absorption lineshapes for the Val45 and Lys50 residues were consistent with there being conformational heterogeneity in these local environments. This observation for Lys50 is certainly consistent with the experimental NMR and MD simulation results reported here, and with previously reported results for Bicoid Lys50 dynamics.6 Conversely, the result for Val45 was perhaps somewhat surprising, given that Val45 is buried within the hydrophobic core of the Bicoid homeodomain. However, examination of the MD simulation results for Bicoid provides some evidence for at least the possibility of some limited conformational heterogeneity in the Val45 side chain. In one of the ten 100 ns simulations of the Bicoid/DNA complex, an excursion from the t rotamer97 to the m rotamer was observed, which lasted approximately 40 ns, as shown in Figure S13. In half of the other 100 ns trajectories, infrequent, short (1-2 ns) jumps from the t rotamer to the m and p rotamer states were observed. The Val45 dihedral angle transitions observed in the MD trajectory between rotamer states appeared to be permitted by general breathing motions of the N-terminal portion of helix 3, rather than any correlated dihedral angle transitions of surrounding amino acid side chains. Consistent with this behavior, 180° aromatic ring flips were observed for Phe49 several times during the course of the ten 100 ns simulations – that aromatic ring is close in space to the Val45 side chain. 
Whether the very limited conformational heterogeneity as observed in the MD simulations could contribute significantly to the observed absorption lineshape changes in the FTIR study is an open question - Romesberg and co-workers note that “further experiments are required to unambiguously associate these changes with conformational heterogeneity of the protein.”
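The dihedral-angle analyses discussed in the two preceding sections can be illustrated with a short sketch. It assumes the commonly used circular-statistics form of the angular order parameter (the length of the mean unit vector of the dihedral angle; the Methods section remains the authoritative definition for this work) and a simple three-bin assignment of χ1 to the p (near +60°), t (near 180°) and m (near −60°) rotamer states. The function names, the bin boundaries at 0° and ±120°, and the frame spacing `dt_ns` are illustrative assumptions, not the analysis code used for the reported results.

```python
import numpy as np

def angular_order_parameter(angles_deg):
    """Length of the mean unit vector of a dihedral angle series.

    Returns ~1 for a fixed angle and ~0 for a uniformly spread angle.
    """
    a = np.radians(np.asarray(angles_deg, dtype=float))
    return float(np.hypot(np.cos(a).mean(), np.sin(a).mean()))

def classify_rotamer(chi1_deg):
    """Assign each chi1 value to 'p', 't' or 'm' (assumed 120-degree bins)."""
    # Wrap angles into the interval [-180, 180).
    chi = (np.asarray(chi1_deg, dtype=float) + 180.0) % 360.0 - 180.0
    labels = np.full(chi.shape, "t", dtype="<U1")   # default: trans
    labels[(chi >= 0.0) & (chi < 120.0)] = "p"      # gauche+ (near +60)
    labels[(chi >= -120.0) & (chi < 0.0)] = "m"     # gauche- (near -60)
    return labels

def longest_excursion(labels, state, dt_ns):
    """Duration (ns) of the longest uninterrupted stay in `state`."""
    best = run = 0
    for lab in labels:
        run = run + 1 if lab == state else 0
        best = max(best, run)
    return best * dt_ns
```

Applied to a per-frame χ1 series for a residue such as Val45, `classify_rotamer` followed by `longest_excursion(labels, "m", dt_ns)` would report the length of the longest visit to the m rotamer, while `angular_order_parameter` summarizes the overall spread of the same angle in a single number.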


Comparison of Lys50 dynamics in Pitx2 versus Bicoid homeodomain/DNA complexes

The main goal of the present work was to examine the side-chain dynamics of the key Lys50 residue in homeodomain/DNA complexes. The experimental observations from the NMR studies indicated a significant difference between the Pitx2 and Bicoid homeodomains in the S2axis order parameter for the symmetry axis of the Lys50 15NH3+ group. This experimental observation was supported by the analysis of the MD simulation data. In looking for an explanation for the observed difference, one obvious possibility is the effect of an arginine at position 54 in Bicoid, versus an alanine in Pitx2. The Bicoid Arg54 residue leads to both salt bridge interactions with the DNA phosphate backbone and hydrophobic interactions between the aliphatic segment of the Arg54 side chain and the DNA bases nearby. There is also the difference of having valine in position 47 of Pitx2, versus isoleucine for Bicoid. However, one cannot rule out more subtle, global effects arising from other differences in amino acid composition. A very interesting study was recently reported117 in which a homeodomain was constructed by using a consensus sequence, and the biophysical characteristics were compared to those for the wild-type engrailed homeodomain. Remarkably, the consensus homeodomain binds the same target DNA sequence as the engrailed homeodomain with 100-fold higher affinity, even though all of the amino acid residues that directly contact the DNA are identical between the two homeodomains, and the other sequence positions most highly conserved within the homeodomain family are also identical. The sequence differences, 30 residues, are nearly all located on the protein surface.
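Frame-occupancy figures of the kind quoted for the Arg54 salt bridge (e.g. ">95% of the frames") amount to counting the trajectory frames in which a geometric criterion is met. The sketch below shows that bookkeeping for per-frame donor-acceptor distances; the 4.0 Å cutoff and the function names are assumptions chosen for illustration and are not the criteria used in this work.

```python
import numpy as np

def contact_occupancy(distances, cutoff=4.0):
    """Fraction of frames in which a single atom-pair distance is within cutoff.

    distances: 1D sequence of per-frame distances (angstroms).
    cutoff: assumed contact threshold (angstroms), hypothetical default.
    """
    d = np.asarray(distances, dtype=float)
    return float((d <= cutoff).mean())

def any_pair_occupancy(distance_matrix, cutoff=4.0):
    """Fraction of frames in which at least one of several pairs is in contact.

    distance_matrix: array of shape (n_frames, n_pairs), e.g. each
    guanidinium nitrogen paired with each phosphate oxygen.
    """
    d = np.asarray(distance_matrix, dtype=float)
    return float((d <= cutoff).any(axis=1).mean())
```

Counting a frame as occupied when any one of the equivalent atom pairs is within the cutoff, as in `any_pair_occupancy`, avoids under-reporting a salt bridge whose partner atoms interchange during the simulation.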

Conclusions

A major goal of structural biology is to elucidate the mechanisms behind the specificity and affinity of intermolecular interactions involving proteins. Protein/DNA complexes are of
particular interest because an understanding of the principles guiding recognition and binding affinity could suggest new solutions to a number of biological problems that are associated with the regulation of DNA transcription. In the present work, we have employed state-of-the-art NMR methodology and molecular dynamics simulations to characterize the dynamics of a lysine side chain that plays a key role in DNA binding site recognition in the Lys50 class of homeodomains. Experimental measurements involving two members of the Lys50 class, Bicoid and Pitx2, indicate a range of flexibility of the Lys50 side chain, as reported by the NMR relaxation data for the 15NH3+ amino group. For Bicoid, the founding member of the Lys50 class, the NMR-derived S2axis order parameter for the Lys50 Cε−Nζ bond vector indicates a relatively modest degree of flexibility on the picosecond-nanosecond timescale, consistent with significant occupation of a single rotameric state for the Lys50 side chain. In contrast, the data for the Pitx2 homeodomain indicate a substantial amount of conformational freedom for the Lys50 amino group on the picosecond-nanosecond timescale. The measured order parameter for the Lys50 amino group in the Pitx2 complex with a non-consensus DNA binding site was very similar to the value determined for the case of the TAATCC consensus site. The correlation time reporting on the rotation of the 15NH3+ group about its symmetry axis, on the order of several hundred picoseconds in all cases studied, is consistent with the Lys50 amino group being involved in several hydrogen bonds, including direct and water-bridged interactions with base atoms in the DNA binding site.
The MD data indicate that the Lys50 side-chain amino group in both Bicoid and Pitx2 interacts significantly with base G5 in the consensus binding site, whereas there are significant differences between the two homeodomains in the intermolecular contacts made by Lys50 with neighboring bases T4 and G6. Our results concerning the dynamics of the Lys50 side chain support the view that a fluctuating network of contacts exists between the
Lys50 amino group and the DNA binding site, and indicate that entropic contributions of such side chains need to be considered in any thermodynamic analysis of site-specific recognition in protein−DNA interactions. Our MD simulations also provided some insights regarding the potential of entropic contributions of other amino acid residues located at the protein−DNA interface. Most notably, well conserved arginine residues in the N-terminal arm of the homeodomain display a wide range of conformational freedom in their side chains, with the Arg5 side chain being quite restricted by numerous interactions with the DNA minor groove, whereas Arg2 and Arg3 display a high degree of flexibility in their interactions with the DNA backbone and minor groove. As has been shown in a number of earlier studies, the presence of water molecules in the protein-DNA interface is very important in forming bridging interactions between the protein and DNA as well as in intramolecular contacts.

ACKNOWLEDGMENTS

We would like to thank Profs. Eric Johnson, Doug Kojetin, and Arthur Palmer for helpful discussions and Walter Chazin for the suggestion of using single-stranded DNA. Molecular graphics and analyses were performed with the UCSF Chimera package. Chimera is developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco (supported by NIGMS P41-GM103311).

ASSOCIATED CONTENT

Supporting Information
Figure S1 gives the primary amino acid sequences of the Bicoid and Pitx2 homeodomains; Figures S2-S13 show results from the NMR experiments and the MD simulations. This material is available free of charge via the Internet at http://pubs.acs.org.

Author present address
¶ T.D.: DePuy Synthes, 1302 Wrights Lane East, West Chester, PA 19380
⊥ M.T.: Department of Physics, California State University, Chico, 400 West First Street, Chico, CA 95929-0202

Author contributions
T.D. and A.C. prepared the samples, J.M.B.-T., M.T. and M.R. performed the MD simulations, M.R. recorded and analyzed the NMR relaxation data, J.M.B.-T. did the structural work, M.R. designed the project, and J.M.B.-T. and M.R. prepared the manuscript.

Funding sources
This research was funded by NIH grants GM063855 (M.R.) and HL007382 (J.M.B.-T.), an American Heart Association Fellowship (T.D.), and a Fight for Sight Summer Student Fellowship (T.D.). Funding for the NMR facility was provided by NIH grants RR19077 and RR027755. The authors declare no competing financial interests.


References

[1] Caro, J. A., Harpole, K. W., Kasinath, V., Lim, J., Granja, J., Valentine, K. G., Sharp, K. A., and Wand, A. J. (2017) Entropy in molecular recognition by proteins, Proceedings of the National Academy of Sciences of the United States of America 114, 6563-6568.
[2] Karplus, M., Ichiye, T., and Pettitt, B. M. (1987) Configurational entropy of native proteins, Biophysical journal 52, 1083-1085.
[3] Berglund, H., Baumann, H., Knapp, S., Ladenstein, R., and Hard, T. (1995) Flexibility of an arginine side chain at a DNA-protein interface, J Am Chem Soc 117, 12883-12884.
[4] Foster, M. P., Wuttke, D. S., Radhakrishnan, I., Case, D. A., Gottesfeld, J. M., and Wright, P. E. (1997) Domain packing and dynamics in the DNA complex of the N-terminal zinc fingers of TFIIIA, Nature structural biology 4, 605-608.
[5] Chaney, B. A., Clark-Baldwin, K., Dave, V., Ma, J., and Rance, M. (2005) Solution structure of the K50 class homeodomain PITX2 bound to DNA and implications for mutations that cause Rieger syndrome, Biochemistry 44, 7497-7511.
[6] Baird-Titus, J. M., Clark-Baldwin, K., Dave, V., Caperelli, C. A., Ma, J., and Rance, M. (2006) The solution structure of the native K50 Bicoid homeodomain bound to the consensus TAATCC DNA-binding site, J Mol Biol 356, 1137-1151.
[7] Qian, Y. Q., Otting, G., Billeter, M., Muller, M., Gehring, W., and Wuthrich, K. (1993) Nuclear magnetic resonance spectroscopy of a DNA complex with the uniformly 13C-labeled Antennapedia homeodomain and structure determination of the DNA-bound homeodomain, J Mol Biol 234, 1070-1083.


[8] Gruschus, J. M., Tsao, D. H., Wang, L. H., Nirenberg, M., and Ferretti, J. A. (1997) Interactions of the vnd/NK-2 homeodomain with DNA by nuclear magnetic resonance spectroscopy: basis of binding specificity, Biochemistry 36, 5372-5380.
[9] Slijper, M., Boelens, R., Davis, A. L., Konings, R. N., van der Marel, G. A., van Boom, J. H., and Kaptein, R. (1997) Backbone and side chain dynamics of lac repressor headpiece (1-56) and its complex with DNA, Biochemistry 36, 249-254.
[10] Wilson, D. S., Sheng, G., Jun, S., and Desplan, C. (1996) Conservation and diversification in homeodomain-DNA interactions: a comparative genetic analysis, Proceedings of the National Academy of Sciences of the United States of America 93, 6886-6891.
[11] Ades, S. E., and Sauer, R. T. (1994) Differential DNA-binding specificity of the engrailed homeodomain: the role of residue 50, Biochemistry 33, 9187-9194.
[12] Ades, S. E., and Sauer, R. T. (1995) Specificity of minor-groove and major-groove interactions in a homeodomain-DNA complex, Biochemistry 34, 14601-14608.
[13] Damante, G., Fabbro, D., Pellizzari, L., Civitareale, D., Guazzi, S., Polycarpou-Schwartz, M., Cauci, S., Quadrifoglio, F., Formisano, S., and Di Lauro, R. (1994) Sequence-specific DNA recognition by the thyroid transcription factor-1 homeodomain, Nucleic Acids Res 22, 3075-3083.
[14] Hanes, S. D., and Brent, R. (1991) A genetic model for interaction of the homeodomain recognition helix with DNA, Science 251, 426-430.
[15] Percival-Smith, A., Muller, M., Affolter, M., and Gehring, W. J. (1990) The interaction with DNA of wild-type and mutant fushi tarazu homeodomains, The EMBO journal 9, 3967-3974.


[16] Pomerantz, J. L., and Sharp, P. A. (1994) Homeodomain determinants of major groove recognition, Biochemistry 33, 10851-10858.
[17] Treisman, J., Gonczy, P., Vashishtha, M., Harris, E., and Desplan, C. (1989) A single amino acid can determine the DNA binding specificity of homeodomain proteins, Cell 59, 553-562.
[18] Kornberg, T. B. (1993) Understanding the homeodomain, J Biol Chem 268, 26813-26816.
[19] Noyes, M. B., Christensen, R. G., Wakabayashi, A., Stormo, G. D., Brodsky, M. H., and Wolfe, S. A. (2008) Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites, Cell 133, 1277-1289.
[20] Hanes, S. D., and Brent, R. (1989) DNA specificity of the bicoid activator protein is determined by homeodomain recognition helix residue 9, Cell 57, 1275-1283.
[21] Schier, A. F., and Gehring, W. J. (1992) Direct homeodomain-DNA interaction in the autoregulation of the fushi tarazu gene, Nature 356, 804-807.
[22] Mathias, J. R., Zhong, H., Jin, Y., and Vershon, A. K. (2001) Altering the DNA-binding specificity of the yeast Matalpha 2 homeodomain protein, J Biol Chem 276, 32696-32703.
[23] Gehring, W. J., Affolter, M., and Burglin, T. (1994) Homeodomain proteins, Annual review of biochemistry 63, 487-526.
[24] Fraenkel, E., Rould, M. A., Chambers, K. A., and Pabo, C. O. (1998) Engrailed homeodomain-DNA complex at 2.2 A resolution: a detailed view of the interface and comparison with other engrailed structures, J Mol Biol 284, 351-361.


[25] Grant, R. A., Rould, M. A., Klemm, J. D., and Pabo, C. O. (2000) Exploring the role of glutamine 50 in the homeodomain-DNA interface: crystal structure of engrailed (Gln50 -> Ala) complex at 2.0 A, Biochemistry 39, 8187-8192.
[26] Tucker-Kellogg, L., Rould, M. A., Chambers, K. A., Ades, S. E., Sauer, R. T., and Pabo, C. O. (1997) Engrailed (Gln50 -> Lys) homeodomain-DNA complex at 1.9 A resolution: structural basis for enhanced affinity and altered specificity, Structure 5, 1047-1054.
[27] Driever, W., Thoma, G., and Nusslein-Volhard, C. (1989) Determination of spatial domains of zygotic gene expression in the Drosophila embryo by the affinity of binding sites for the bicoid morphogen, Nature 340, 363-367.
[28] Semina, E. V., Reiter, R., Leysens, N. J., Alward, W. L., Small, K. W., Datson, N. A., Siegel-Bartelt, J., Bierke-Nelson, D., Bitoun, P., Zabel, B. U., Carey, J. C., and Murray, J. C. (1996) Cloning and characterization of a novel bicoid-related homeobox transcription factor gene, RIEG, involved in Rieger syndrome, Nature genetics 14, 392-399.
[29] Espinoza, H. M., Ganga, M., Vadlamudi, U., Martin, D. M., Brooks, B. P., Semina, E. V., Murray, J. C., and Amendt, B. A. (2005) Protein kinase C phosphorylation modulates N- and C-terminal regulatory activities of the PITX2 homeodomain protein, Biochemistry 44, 3942-3954.
[30] Dave, V., Zhao, C., Yang, F., Tung, C. S., and Ma, J. (2000) Reprogrammable recognition codes in bicoid homeodomain-DNA interaction, Molecular and cellular biology 20, 7673-7684.
[31] Espinoza, H. M., Cox, C. J., Semina, E. V., and Amendt, B. A. (2002) A molecular basis for differential developmental anomalies in Axenfeld-Rieger syndrome, Human molecular genetics 11, 743-753.


[32] Green, P. D., Hjalt, T. A., Kirk, D. E., Sutherland, L. B., Thomas, B. L., Sharpe, P. T., Snead, M. L., Murray, J. C., Russo, A. F., and Amendt, B. A. (2001) Antagonistic regulation of Dlx2 expression by PITX2 and Msx2: implications for tooth development, Gene expression 9, 265-281.
[33] Hjalt, T. A., Amendt, B. A., and Murray, J. C. (2001) PITX2 regulates procollagen lysyl hydroxylase (PLOD) gene expression: implications for the pathology of Rieger syndrome, The Journal of cell biology 152, 545-552.
[34] Kozlowski, K., and Walter, M. A. (2000) Variation in residual PITX2 activity underlies the phenotypic spectrum of anterior segment developmental disorders, Human molecular genetics 9, 2131-2139.
[35] Fraenkel, E., and Pabo, C. O. (1998) Comparison of X-ray and NMR structures for the Antennapedia homeodomain-DNA complex, Nature structural biology 5, 692-697.
[36] Hirsch, J. A., and Aggarwal, A. K. (1995) Structure of the even-skipped homeodomain complexed to AT-rich DNA: new perspectives on homeodomain specificity, The EMBO journal 14, 6280-6291.
[37] Billeter, M. (1996) Homeodomain-type DNA recognition, Progress in biophysics and molecular biology 66, 211-225.
[38] Billeter, M., Qian, Y. Q., Otting, G., Muller, M., Gehring, W., and Wuthrich, K. (1993) Determination of the nuclear magnetic resonance solution structure of an Antennapedia homeodomain-DNA complex, J Mol Biol 234, 1084-1093.
[39] Billeter, M., Guntert, P., Luginbuhl, P., and Wuthrich, K. (1996) Hydration and DNA recognition by homeodomains, Cell 85, 1057-1065.


[40] Gutmanas, A., and Billeter, M. (2004) Specific DNA recognition by the Antp homeodomain: MD simulations of specific and nonspecific complexes, Proteins 57, 772-782.
[41] Esadze, A., Li, D. W., Wang, T., Bruschweiler, R., and Iwahara, J. (2011) Dynamics of lysine side-chain amino groups in a protein studied by heteronuclear 1H-15N NMR spectroscopy, J Am Chem Soc 133, 909-919.
[42] Nguyen, D., Lokesh, G. L. R., Volk, D. E., and Iwahara, J. (2017) A Unique and Simple Approach to Improve Sensitivity in 15N-NMR Relaxation Measurements for NH(3)(+) Groups: Application to a Protein-DNA Complex, Molecules 22, 1355; doi:10.3390/molecules22081355
[43] Hansen, D. F., Vallurupalli, P., and Kay, L. E. (2008) An improved (15)N relaxation dispersion experiment for the measurement of millisecond time-scale dynamics in proteins, J Phys Chem B 112, 5898-5904.
[44] Bax, A., Ikura, M., Kay, L. E., Torchia, D. A., and Tschudin, R. (1990) Comparison of Different Modes of 2-Dimensional Reverse-Correlation NMR for the Study of Proteins, J Magn Reson 86, 304-318.
[45] Iwahara, J., Jung, Y. S., and Clore, G. M. (2007) Heteronuclear NMR spectroscopy for lysine NH3 groups in proteins: Unique effect of water exchange on N-15 transverse relaxation, J Am Chem Soc 129, 2971-2980.
[46] Skelton, N. J., Palmer, A. G., Akke, M., Kordel, J., Rance, M., and Chazin, W. J. (1993) Practical Aspects of 2-Dimensional Proton-Detected N-15 Spin Relaxation Measurements, J Magn Reson Ser B 102, 253-264.


[47] Farrow, N. A., Muhandiram, R., Singer, A. U., Pascal, S. M., Kay, C. M., Gish, G., Shoelson, S. E., Pawson, T., Forman-Kay, J. D., and Kay, L. E. (1994) Backbone Dynamics of a Free and a Phosphopeptide-Complexed Src Homology-2 Domain Studied by N-15 NMR Relaxation, Biochemistry 33, 5984-6003.
[48] Doerdelmann, T., Kojetin, D. J., Baird-Titus, J. M., Solt, L. A., Burris, T. P., and Rance, M. (2012) Structural and biophysical insights into the ligand-free Pitx2 homeodomain and a ring dermoid of the cornea inducing homeodomain mutant, Biochemistry 51, 665-676.
[49] Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J., and Bax, A. (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes, Journal of biomolecular NMR 6, 277-293.
[50] Palmer, A. G., Rance, M., and Wright, P. E. (1991) Intramolecular Motions of a Zinc Finger DNA-Binding Domain from Xfin Characterized by Proton-Detected Natural Abundance C-13 Heteronuclear NMR Spectroscopy, J Am Chem Soc 113, 4371-4380.
[51] Tjandra, N., Feller, S. E., Pastor, R. W., and Bax, A. (1995) Rotational diffusion anisotropy of human ubiquitin from N-15 NMR relaxation, J Am Chem Soc 117, 12562-12566.
[52] Theobald, D. L., and Wuttke, D. S. (2006) THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures, Bioinformatics 22, 2171-2172.
[53] Antao, V. P., Lai, S. Y., and Tinoco, I. (1991) A Thermodynamic Study of Unusually Stable RNA and DNA Hairpins, Nucleic Acids Res 19, 5901-5905.
[54] Huang, W. M., DaGloria, J., Fox, H., Ruan, Q. R., Tillou, J., Shi, K., Aihara, H., Aron, J., and Casjens, S. (2012) Linear Chromosome-generating System of Agrobacterium tumefaciens C58: Protelomerase Generates and Protects Hairpin Ends, J Biol Chem 287, 25551-25563.
[55] Case, D. A., Cheatham, T. E., Darden, T., Gohlke, H., Luo, R., Merz, K. M., Onufriev, A., Simmerling, C., Wang, B., and Woods, R. J. (2005) The Amber biomolecular simulation programs, J Comput Chem 26, 1668-1688.
[56] Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., and Ferrin, T. E. (2004) UCSF Chimera - A visualization system for exploratory research and analysis, J Comput Chem 25, 1605-1612.
[57] Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W., and Klein, M. L. (1983) Comparison of Simple Potential Functions for Simulating Liquid Water, J Chem Phys 79, 926-935.
[58] Maier, J. A., Martinez, C., Kasavajhala, K., Wickstrom, L., Hauser, K. E., and Simmerling, C. (2015) ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB, J Chem Theory Comput 11, 3696-3713.
[59] Perez, A., Marchan, I., Svozil, D., Sponer, J., Cheatham, T. E., 3rd, Laughton, C. A., and Orozco, M. (2007) Refinement of the AMBER force field for nucleic acids: improving the description of alpha/gamma conformers, Biophysical journal 92, 3817-3829.
[60] Gotz, A. W., Williamson, M. J., Xu, D., Poole, D., Le Grand, S., and Walker, R. C. (2012) Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1. Generalized Born, J Chem Theory Comput 8, 1542-1555.
[61] Le Grand, S., Gotz, A. W., and Walker, R. C. (2013) SPFP: Speed without compromise-A mixed precision model for GPU accelerated molecular dynamics simulations, Comput Phys Commun 184, 374-380.


[62] Shang, Y., and Simmerling, C. (2012) Molecular dynamics applied in drug discovery: the case of HIV-1 protease, Methods Mol Biol 819, 527-549.
[63] Allen, M. P., and Tildesley, D. J. (1987) Computer simulation of liquids, Clarendon Press; Oxford University Press, Oxford, England; New York.
[64] Darden, T., York, D., and Pedersen, L. (1993) Particle Mesh Ewald - an N.Log(N) Method for Ewald Sums in Large Systems, J Chem Phys 98, 10089-10092.
[65] Ryckaert, J. P., Ciccotti, G., and Berendsen, H. J. C. (1977) Numerical-Integration of Cartesian Equations of Motion of a System with Constraints - Molecular-Dynamics of n-Alkanes, J Comput Phys 23, 327-341.
[66] Izaguirre, J. A., Catarello, D. P., Wozniak, J. M., and Skeel, R. D. (2001) Langevin stabilization of molecular dynamics, The Journal of Chemical Physics 114, 2090-2098.
[67] Cerutti, D. S., Duke, R., Freddolino, P. L., Fan, H., and Lybrand, T. P. (2008) Vulnerability in Popular Molecular Dynamics Packages Concerning Langevin and Andersen Dynamics, J Chem Theory Comput 4, 1669-1680.
[68] Berendsen, H. J. C., Postma, J. P. M., Gunsteren, W. F. v., DiNola, A., and Haak, J. R. (1984) Molecular dynamics with coupling to an external bath, The Journal of Chemical Physics 81, 3684-3690.
[69] Hopkins, C. W., Le Grand, S., Walker, R. C., and Roitberg, A. E. (2015) Long-Time-Step Molecular Dynamics through Hydrogen Mass Repartitioning, J Chem Theory Comput 11, 1864-1874.
[70] Roe, D. R., and Cheatham, T. E. (2013) PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data, J Chem Theory Comput 9, 3084-3095.


[71] Caves, L. S., Evanseck, J. D., and Karplus, M. (1998) Locally accessible conformations of proteins: multiple molecular dynamics simulations of crambin, Protein science : a publication of the Protein Society 7, 649-666.
[72] Case, D. A. (2002) Molecular dynamics and NMR spin relaxation in proteins, Accounts of chemical research 35, 325-331.
[73] Henry, E. R., and Szabo, A. (1985) Influence of Vibrational Motion on Solid-State Line-Shapes and NMR Relaxation, J Chem Phys 82, 4753-4761.
[74] Detlefsen, D. J., Thanabal, V., Pecoraro, V. L., and Wagner, G. (1991) Solution structure of Fe(II) cytochrome c551 from Pseudomonas aeruginosa as determined by two-dimensional 1H NMR, Biochemistry 30, 9040-9046.
[75] Hyberts, S. G., Goldberg, M. S., Havel, T. F., and Wagner, G. (1992) The solution structure of eglin c based on measurements of many NOEs and coupling constants and its comparison with X-ray structures, Protein science : a publication of the Protein Society 1, 736-751.
[76] Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., and Bourne, P. E. (2000) The Protein Data Bank, Nucleic Acids Res 28, 235-242.
[77] Anderson, K. M., Esadze, A., Manoharan, M., Bruschweiler, R., Gorenstein, D. G., and Iwahara, J. (2013) Direct observation of the ion-pair dynamics at a protein-DNA interface by NMR spectroscopy, J Am Chem Soc 135, 3613-3619.
[78] Zandarashvili, L., Esadze, A., and Iwahara, J. (2013) NMR Studies on the Dynamics of Hydrogen Bonds and Ion Pairs Involving Lysine Side Chains of Proteins, Adv Protein Chem Str 93, 37-80.


[79] Sarkar, S. K., Hiyama, Y., Niu, C. H., Young, P. E., Gerig, J. T., and Torchia, D. A. (1987) Molecular-Dynamics of Collagen Side-Chains in Hard and Soft-Tissues - a Multinuclear Magnetic-Resonance Study, Biochemistry 26, 6793-6800.
[80] Ferrage, F., Cowburn, D., and Ghose, R. (2009) Accurate Sampling of High-Frequency Motions in Proteins by Steady-State N-15-{H-1} Nuclear Overhauser Effect Measurements in the Presence of Cross-Correlated Relaxation, J Am Chem Soc 131, 6048-+.
[81] Ferrage, F., Piserchio, A., Cowburn, D., and Ghose, R. (2008) On the measurement of N-15{H-1} nuclear Overhauser effects, J Magn Reson 192, 302-313.
[82] Palmer, A. G., 3rd. (2014) Chemical exchange in biomacromolecules: past, present, and future, J Magn Reson 241, 3-17.
[83] Showalter, S. A., Johnson, E., Rance, M., and Bruschweiler, R. (2007) Toward quantitative interpretation of methyl side-chain dynamics from NMR by molecular dynamics simulations, J Am Chem Soc 129, 14146-14147.
[84] Xue, Y., Pavlova, M. S., Ryabov, Y. E., Reif, B., and Skrynnikov, N. R. (2007) Methyl rotation barriers in proteins from 2H relaxation data. Implications for protein structure, J Am Chem Soc 129, 6827-6838.
[85] Kay, L. E., and Torchia, D. A. (1991) The Effects of Dipolar Cross Correlation on 13C Methyl-Carbon T1, T2, and NOE Measurements in Macromolecules, J Magn Reson 95, 536-547.
[86] Zandarashvili, L., and Iwahara, J. (2015) Temperature dependence of internal motions of protein side-chain NH3(+) groups: insight into energy barriers for transient breakage of hydrogen bonds, Biochemistry 54, 538-545.


[87] Tsui, V., Radhakrishnan, I., Wright, P. E., and Case, D. A. (2000) NMR and molecular dynamics studies of the hydration of a zinc finger-DNA complex, J Mol Biol 302, 1101-1117.
[88] Mackerell, A. D., Jr., and Nilsson, L. (2008) Molecular dynamics simulations of nucleic acid-protein complexes, Current opinion in structural biology 18, 194-199.
[89] Duan, J., and Nilsson, L. (2002) The role of residue 50 and hydration water molecules in homeodomain DNA recognition, Eur Biophys J 31, 306-316.
[90] Iurcu-Mustata, G., Van Belle, D., Wintjens, R., Prevost, M., and Rooman, M. (2001) Role of salt bridges in homeodomains investigated by structural analyses and molecular dynamics simulations, Biopolymers 59, 145-159.
[91] Babin, V., Wang, D., Rose, R. B., and Sagui, C. (2013) Binding polymorphism in the DNA bound state of the Pdx1 homeodomain, PLoS computational biology 9, e1003160.
[92] Flader, W., Wellenzohn, B., Winger, R. H., Hallbrucker, A., Mayer, E., and Liedl, K. R. (2003) Stepwise induced fit in the pico- to nanosecond time scale governs the complexation of the even-skipped transcriptional repressor homeodomain to DNA, Biopolymers 68, 139-149.
[93] Yang, S. Y., Yang, X. L., Yao, L. F., Wang, H. B., and Sun, C. K. (2011) Effect of CpG methylation on DNA binding protein: molecular dynamics simulations of the homeodomain PITX2 bound to the methylated DNA, Journal of molecular graphics & modelling 29, 920-927.
[94] Lipari, G., and Szabo, A. (1982) Model-Free Approach to the Interpretation of Nuclear Magnetic-Resonance Relaxation in Macromolecules .1. Theory and Range of Validity, J Am Chem Soc 104, 4546-4559.


[95] Lipari, G., Szabo, A., and Levy, R. M. (1982) Protein Dynamics and NMR Relaxation - Comparison of Simulations with Experiment, Nature 300, 197-198.
[96] Kuster, D. J., Liu, C., Fang, Z., Ponder, J. W., and Marshall, G. R. (2015) High-resolution crystal structures of protein helices reconciled with three-centered hydrogen bonds and multipole electrostatics, PLoS One 10, e0123146.
[97] Lovell, S. C., Word, J. M., Richardson, J. S., and Richardson, D. C. (2000) The penultimate rotamer library, Proteins 40, 389-408.
[98] Scouras, A. D., and Daggett, V. (2011) The Dynameomics rotamer library: amino acid side chain conformations and dynamics from comprehensive molecular dynamics simulations in water, Protein science : a publication of the Protein Society 20, 341-352.
[99] Moreland, R. T., Ryan, J. F., Pan, C., and Baxevanis, A. D. (2009) The Homeodomain Resource: a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family, Database : the journal of biological databases and curation 2009, bap004.
[100] Janin, J. (1999) Wet and dry interfaces: the role of solvent in protein-protein and protein-DNA recognition, Structure 7, R277-279.
[101] Schwabe, J. W. (1997) The role of water in protein-DNA interactions, Current opinion in structural biology 7, 126-134.
[102] Iwahara, J., Esadze, A., and Zandarashvili, L. (2015) Physicochemical Properties of Ion Pairs of Biological Macromolecules, Biomolecules 5, 2435-2463.
[103] Pabo, C. O., and Sauer, R. T. (1992) Transcription factors: structural families and principles of DNA recognition, Annual review of biochemistry 61, 1053-1095.


[104] Ma, L., Sundlass, N. K., Raines, R. T., and Cui, Q. (2011) Disruption and formation of surface salt bridges are coupled to DNA binding by the integration host factor: a computational analysis, Biochemistry 50, 266-275.
[105] Chen, C., Esadze, A., Zandarashvili, L., Nguyen, D., Pettitt, B. M., and Iwahara, J. (2015) Dynamic equilibria of short-range electrostatic interactions at molecular interfaces of protein–DNA complexes, The journal of physical chemistry letters 6, 2733-2737.
[106] Esadze, A., Chen, C., Zandarashvili, L., Roy, S., Pettitt, B. M., and Iwahara, J. (2016) Changes in conformational dynamics of basic side chains upon protein-DNA association, Nucleic Acids Res 44, 6961-6970.
[107] Etheve, L., Martin, J., and Lavery, R. (2016) Protein-DNA interfaces: a molecular dynamics analysis of time-dependent recognition processes for three transcription factors, Nucleic Acids Res 44, 9990-10002.
[108] Etheve, L., Martin, J., and Lavery, R. (2016) Dynamics and recognition within a protein-DNA complex: a molecular dynamics study of the SKN-1/DNA interaction, Nucleic Acids Res 44, 1440-1448.
[109] Nguyen, D., Hoffpauir, Z. A., and Iwahara, J. (2017) Internal Motions of Basic Side Chains of the Antennapedia Homeodomain in the Free and DNA-Bound States, Biochemistry 56, 5866-5869.
[110] Trbovic, N., Cho, J. H., Abel, R., Friesner, R. A., Rance, M., and Palmer, A. G., 3rd. (2009) Protein side-chain dynamics and residual conformational entropy, J Am Chem Soc 131, 615-622.


[111] Torrado, M., Revuelta, J., Gonzalez, C., Corzana, F., Bastida, A., and Asensio, J. L. (2009) Role of conserved salt bridges in homeodomain stability and DNA binding, J Biol Chem 284, 23765-23779.
[112] Hintze, B. J., Lewis, S. M., Richardson, J. S., and Richardson, D. C. (2016) Molprobity's ultimate rotamer-library distributions for model validation, Proteins 84, 1177-1189.
[113] Birrane, G., Soni, A., and Ladias, J. A. (2009) Structural basis for DNA recognition by the human PAX3 homeodomain, Biochemistry 48, 1148-1155.
[114] Wintjens, R., Lievin, J., Rooman, M., and Buisine, E. (2000) Contribution of cation-pi interactions to the stability of protein-DNA complexes, J Mol Biol 302, 395-410.
[115] Clarke, N. D. (1995) Covariation of residues in the homeodomain sequence family, Protein science : a publication of the Protein Society 4, 2269-2278.
[116] Adhikary, R., Tan, Y. X., Liu, J., Zimmermann, J., Holcomb, M., Yvellez, C., Dawson, P. E., and Romesberg, F. E. (2017) Conformational Heterogeneity and DNA Recognition by the Morphogen Bicoid, Biochemistry 56, 2787-2793.
[117] Tripp, K. W., Sternke, M., Majumdar, A., and Barrick, D. (2017) Creating a Homeodomain with High Stability and DNA Binding Affinity by Sequence Averaging, J Am Chem Soc 139, 5051-5060.


For Table of Contents Use Only
Lysine Side-Chain Dynamics in the Binding Site of Homeodomain/DNA Complexes as Observed by NMR Relaxation Experiments and Molecular Dynamics Simulations
Jamie M. Baird-Titus, Mahendra Thapa, Thomas Doerdelmann, Kelly A. Combs, and Mark Rance*


[Figure: structure image with labels helix 1, helix 2, helix 3]

[Figure: hairpin DNA schematic; labeled base pairs C6/G6, C5/G5, T4/A4, A3/T3, A2/T2, T1/A1]

[Figure: panels a and b; intensity (0.0 to 1.0) versus time (s); panel a spans 0.0 to 2.0 s, panel b spans 0.00 to 0.12 s]


[Figure: side-chain dihedral angles χ1, χ2, χ3, χ4 (degrees, 0 to 360) versus time (ns, 0 to 100)]

[Figure: K50 side-chain rotamer states tttt, tttm, ttpp, tttp; other labels: N51 and DNA bases A3, T4, A4, G5, G6]
