
Bioconjugate Chem. 2000, 11, 335−344


Molecular Characterization of a Tetramolecular Complex between dsDNA and a DNA-Binding Leucine Zipper Peptide Dimer by Mass Spectrometry

Leesa J. Deterding,*,† Juergen Kast,‡ Michael Przybylski,‡ and Kenneth B. Tomer†

Laboratory of Structural Biology, National Institute of Environmental Health Sciences, National Institutes of Health, P.O. Box 12233, Research Triangle Park, North Carolina 27709, and University of Konstanz, Faculty of Chemistry, P.O. Box 5560M732, D-78434 Konstanz, Germany.

Received September 22, 1999; Revised Manuscript Received December 10, 1999

The characterization of sequence-specific noncovalent complexes of the GCN4 peptides and dsDNA using mass spectrometry is reported. The GCN4 peptides belong to a class of proteins which bind to sequence-specific dsDNA and are important in the regulation of gene transcription in yeast. These proteins contain a bZIP structural motif which consists of a basic DNA-binding domain and a leucine zipper dimerization domain. The protein dimers specifically bind double-stranded DNA containing the binding element 5′-ATGA(C/G)TCAT-3′ to form a tetramolecular noncovalent complex. We report the detection of such a specific tetramolecular complex by electrospray ionization mass spectrometry. Under conditions necessary for observation of the tetramolecular complex, no ions were detected for the GCN4 peptide dimer or the GCN4 monomer with dsDNA. These observations indicate that the specific interaction of the dsDNA with the protein dimer stabilizes the biologically significant noncovalent complex in the gas phase. Complexes were observed for various lengths of both blunt-ended and cohesive-ended double-stranded DNA containing the specific recognition sequence. The binding specificity of the complex was verified with the use of control DNA not containing the recognition sequence and control peptides not known to bind DNA specifically. Additionally, limited proteolysis of GCN4 peptide-DNA complexes, combined with mass spectrometric determination of the products and compared with identical experiments on noncomplexed peptides, was used to probe interactions of specific amino acids with the DNA. The ability to observe these complexes by mass spectrometry and to probe the specific interactions involved opens the door to applying this analytical technique to other structural biological problems, including the study of transcription processes and the determination of the specific binding regions between dsDNA and proteins.

INTRODUCTION

The study of noncovalent complexes formed in biological systems has been the target of much research. These interactions play a critical role in the functions and regulation of biomolecules. Some examples of noncovalent interactions which are of biological relevance include protein-DNA, receptor-ligand, DNA-drug, protein-drug, and antibody-antigen complexes. Understanding these interactions is key to understanding biological functions such as gene regulation, antibody recognition, pharmacological response, and anti-viral applications.

In the present study, we are investigating the applicability of MS-based techniques to the characterization of GCN4:DNA complexes. GCN4 is a sequence-specific DNA-binding protein containing a bZIP structural motif (Landschulz et al., 1988; Abel and Maniatis, 1989). GCN4 is important in the regulation of gene transcription and is responsible for the general control of amino acid synthesis in yeast (Penn et al., 1983; Thireos et al., 1984; Hinnebusch, 1984). The “bZIP” motif consists of a “basic” DNA-binding domain and a “leucine zipper” dimerization domain. The leucine zipper consists of about 30 amino acids containing a heptad repeat of leucine residues (Hope and Struhl, 1986; Hope and Struhl, 1987; Hurst, 1996; Struhl, 1987). The leucine zippers of two proteins interact through parallel coiled coils and form protein dimers (O’Shea et al., 1989a,b; Rasmussen et al., 1991; Oas et al., 1990), while the basic DNA-binding region is thought to be disordered in the absence of DNA and to form α helices when bound to DNA. While alterations of the leucine residues in the zipper region lead to loss of dimerization and DNA binding (Vogt and Bos, 1989; Landschulz et al., 1989; Turner and Tjian, 1989; Gentz et al., 1989; Schuermann et al., 1989), alterations in the basic region prevent DNA binding but do not inhibit dimerization (Vogt and Bos, 1989; Landschulz et al., 1989; Turner and Tjian, 1989; Gentz et al., 1989). DNA recognition by GCN4 and related proteins was first proposed as a scissors grip model (Vinson et al., 1989), followed by the proposal of an induced helical fork model (O’Neill et al., 1990). Subsequent X-ray crystallographic studies, using smaller peptide models of the whole protein, GCN4-p1 (residues 249-281), showed a parallel, two-stranded coiled coil of α helices (O’Shea et al., 1991). The crystal structure for a GCN4 bZIP peptide (residues 226-281) complexed with a specific DNA site showed a pair of continuous α helices which diverge toward the N termini, allowing them to scissor grip the major groove of the DNA (Ellenberger et al., 1992).

* To whom correspondence should be addressed. Phone: (919) 541-3009. Fax: (919) 541-0220. E-mail: [email protected].
† National Institute of Environmental Health Sciences.
‡ University of Konstanz.

10.1021/bc990123c CCC: $19.00 © 2000 American Chemical Society Published on Web 04/29/2000

336 Bioconjugate Chem., Vol. 11, No. 3, 2000

Mass spectrometry is increasingly being applied to the characterization of tertiary structures and noncovalent complexes. Since the first report in 1991 (Ganem et al., 1991), numerous noncovalent complexes, including protein-protein, protein-ligand, and oligonucleotide dimers and duplexes, have been observed using ESI in combination with mass spectrometry (Przybylski et al., 1996; Loo, 1997). Recent papers have reported the mass spectrometric analyses of nonspecific DNA duplexes (Gale et al., 1994; Bayer et al., 1994; Ding and Anderegg, 1995; Przybylski et al., 1995; Gale and Smith, 1995; Triolo et al., 1997) and trimeric and tetrameric complexes (Gale et al., 1994; Przybylski et al., 1995; Gale and Smith, 1995; Triolo et al., 1997; Baca and Kent, 1992; Schwartz et al., 1994; Siuzdak et al., 1994; Wendt et al., 1995; Miranker et al., 1996; Cheng et al., 1996a; Pócsfalvi et al., 1997), and one report showed the analysis of a sequence-specific trimeric complex between DNA and a protein (Cheng et al., 1996b). There have been very few reports of a successful mass spectrometric analysis of a sequence-specific noncovalent tetramolecular complex not involving four identical subunits or metal ions. A tetrameric noncovalent complex of two molecules of the vitamin D receptor DNA binding domain (VDR DBD) with dsDNA containing two vitamin D response elements, in which the two VDR DBDs bound to the dsDNA independently, was reported by Veenstra et al. (1998). Potier et al. reported the MS observation of a 1:1 complex of the trp apo-repressor (TrpR) homodimer and dsDNA containing the TrpR recognition site, and of complexes of this complex with tryptophan (Potier et al., 1998). A preliminary report of this study of the noncovalent complexes of the GCN4 family of peptides with double-stranded DNA using electrospray ionization has also appeared (Deterding et al., 1998).
Mass spectrometric methods alone, however, are not applicable to the direct characterization of the interactions involved in specific complex formation. Instead, MS analysis has been combined with protein-specific reactions, such as covalent modification of individual amino acid residues or enzymatic cleavage (Suckau et al., 1990), for probing protein tertiary structures (Brockerhoff et al., 1992; Polverino de Laureto et al., 1995; Zappacosta et al., 1996; Kriwacki et al., 1997) and protein-nucleic acid complexes (Cohen et al., 1995). In these studies, the identification of proteolytic peptides generated by “limited proteolysis” over a time course provides additional information. Thus, either exposed regions of tertiary structures or structural differences of a protein in two forms, e.g., a nonbound and a receptor-bound state, can be monitored. In this paper, we report our results of the characterization of protein-DNA complexes using direct MS analyses and MS analysis combined with limited proteolysis of the GCN4:DNA complex.

EXPERIMENTAL PROCEDURES

DNA Samples. The oligonucleotides (desalted) were purchased from Genosys Biotechnologies, Inc. (The Woodlands, TX) and were used without further purification. Sequences of the synthetic oligonucleotides are shown in Table 1. All of the oligonucleotides contain, as duplexes, the GCN4 DNA-binding element 5′-ATGA(C/G)TCAT-3′ (underlined nucleic acids) (Hope and Struhl, 1985; Hill et al., 1986) except for the blunt-ended 24-bp (control DNA) which was used as a control. Ammonium acetate was obtained from Mallinckrodt, Inc. (Paris, KY). All solutions were prepared using 18-MΩ water (Hydro Service and Supplies, Research Triangle Park, NC).

Deterding et al.

Table 1. Oligonucleotide Sequences

§ Control DNA.

Oligonucleotide stock solutions (2 nmol/µL) were prepared using 50 mM ammonium acetate buffer, pH 8.0. Equimolar concentrations of the complementary strands of oligonucleotides were combined, heated to 70 °C, and cooled slowly to room temperature over 2.5 h to facilitate double-stranded annealing. Prior to mass spectrometric analysis, 1 µL of the double-stranded DNA (dsDNA) solution (2 nmol/µL) was added to 49 µL of 50:50 water:acetonitrile for negative ion ESI or 49 µL of 90:10 water:methanol for positive ion ESI, resulting in a 40 pmol/µL solution.

Proteins. The GCN4 proteins, GCN4-bR and GCN4-61, were prepared on an Applied Biosystems-430A or a semi-automated ABIMED synthesizer using the Fmoc protection strategy as described previously (Wendt et al., 1995). Stock solutions of the proteins (1 nmol/µL) were prepared in 50 mM ammonium acetate buffer, pH 8.

Complex Preparation. The trimolecular and tetramolecular complexes were formed by first diluting 1 µL of the dsDNA solution (2 nmol/µL) to 2 µL with 50 mM ammonium acetate buffer. This 2 µL solution of dsDNA (1 nmol/µL) was then incubated with 2 µL of the protein solution (1 nmol/µL) at room temperature for approximately 5 min. For mass spectrometric analysis, 2 µL of the noncovalent complex solution was diluted to 50 µL with 90:10 water:methanol, resulting in a 20 pmol/µL solution.

Limited Proteolysis Experiments of GCN4-bR and GCN4-bR-13-bp dsDNA Complex. For both nonbound GCN4-bR and GCN4-bR complexed with a 1.25-fold excess of 13-bp dsDNA, endoproteases Lys-C, Glu-C, and Asp-N (Boehringer Mannheim, Mannheim, Germany) were used. All digests were performed in ammonium acetate (0.1 M, pH 6.5) at 20 °C using an enzyme:substrate ratio of 1:30. Because of the dynamics of the complex, quantitative formation of the complex prior to digestion has to be ensured. Cohen et al. proposed the use of an initial excess of DNA high enough to shift the equilibrium in solution toward the side of the complex (Cohen et al., 1995).
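As a quick check on the concentrations quoted in the sample and complex preparation steps above, the dilution arithmetic can be sketched as follows. This is only an illustrative calculation: the function and variable names are hypothetical, and it assumes that volumes are additive and that the 20 pmol/µL figure refers to the dsDNA (equivalently, the 1:1 complex) rather than to the sum of all components.

```python
def diluted_conc_pmol_per_ul(stock_nmol_per_ul, aliquot_ul, final_volume_ul):
    """Concentration (pmol/uL) after bringing an aliquot of stock to a final volume."""
    amount_pmol = stock_nmol_per_ul * 1000.0 * aliquot_ul  # nmol -> pmol
    return amount_pmol / final_volume_ul


# dsDNA alone: 1 uL of 2 nmol/uL stock added to 49 uL of solvent (50 uL total).
dsdna_only = diluted_conc_pmol_per_ul(2.0, 1.0, 1.0 + 49.0)

# Complex: 2 uL dsDNA (1 nmol/uL) mixed with 2 uL protein (1 nmol/uL).
mix_volume_ul = 2.0 + 2.0
dsdna_in_mix_nmol_per_ul = (1.0 * 2.0) / mix_volume_ul  # 0.5 nmol/uL after mixing

# 2 uL of the mixture is then diluted to 50 uL.
complex_conc = diluted_conc_pmol_per_ul(dsdna_in_mix_nmol_per_ul, 2.0, 50.0)

print(f"dsDNA alone: {dsdna_only:.1f} pmol/uL")
print(f"complex:     {complex_conc:.1f} pmol/uL")
```

Under these assumptions the two results reproduce the 40 pmol/µL and 20 pmol/µL values stated in the text.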
On the basis of their calculations, a 1.25-fold excess of the previously formed DNA double strand over the peptide subject to this study was chosen to ensure that most of the GCN4-bR was initially bound to the DNA. The 100 mM salt concentration was chosen because it is sufficient to enable specific interactions to occur while excluding nonspecific interactions. Aliquots of the reaction mixtures were taken at different digestion times ranging from 1 min to 48 h, and reactions were stopped by diluting with a 4-fold excess of 0.1% TFA. Diluted aliquots of the reaction mixtures were analyzed by MALDI-MS without further purification.

ESI/MS of Protein−DNA Tetramolecular Complex

Figure 1. Amino acid sequences of the (A) GCN4-61 peptide and (B) GCN4-bR peptide. Cleavage points for the various enzymes studied in the limited proteolysis experiments are indicated in panel B. The A, D, and L peptide fragments correspond to enzymatic cleavages involving endoprotease Glu-C, Asp-N, and Lys-C, respectively.

Electrospray Ionization Mass Spectrometry. A Micromass Platform II (Altrincham, U.K.) single-quadrupole mass spectrometer equipped with an ESI interface and an extended mass range quadrupole (4000 Da) was used for all analyses of the noncovalent complexes. The temperature of the ESI source of the mass spectrometer was maintained at 35 °C. Spraying was achieved using nitrogen as the nebulizing and drying gas. Samples were infused into the mass spectrometer at 5 µL/min using a pressure injection vessel (Tomer et al., 1994). The ESI probe was operated at -3 kV for the analysis of the oligonucleotide dimers and at +3 kV for the analysis of the DNA-protein complexes. To obtain spectra of the noncovalent complexes, a cone voltage of ±75 to ±125 V was necessary. Full scan mass spectra were acquired in the continuum-data acquisition mode. For all analyses, a mass range of 300-4000 Da was scanned at 3.9 s/scan.

Matrix-Assisted Laser Desorption Mass Spectrometry. All MALDI-MS analyses were carried out on a Bruker Biflex (Bruker-Franzen, Bremen, Germany) linear time-of-flight mass spectrometer using a UV nitrogen laser (337 nm), a dual microchannel plate detector, and the X-MASS data processing system. An accelerator voltage of 20 kV was applied, and spectra were calibrated with insulin as external standard. Previously identified fragments derived from GCN4 peptides were used for additional internal recalibration, if necessary. Samples were diluted with 0.1% trifluoroacetic acid (TFA) to a final concentration of about 10⁻⁵ M. For mass analyses, 1 µL of this solution was mixed with 1 µL of a saturated solution of α-cyano-4-hydroxycinnamic acid (HCCA) in 25% formic acid/2-propanol (2:1, v/v).

RESULTS

To study sequence-specific noncovalent complexes between proteins and DNA, two GCN4 peptides were complexed with dsDNA: GCN4-bR, a peptide model of GCN4 similar to the peptide model GCN4-bR1 (Talanian et al., 1990), and the carboxy-terminal peptide GCN4-61, which contains both the basic DNA-binding region and the leucine zipper of GCN4. Figure 1 shows the amino acid sequences of the GCN4 peptides. The sequence of GCN4-bR contains the basic DNA-binding region and a disulfide bond in place of the leucine zipper. A Gly-Gly-Cys-Val linker is added at the carboxyl terminus to provide a flexible linker in the disulfide-bonded dimer. It has been shown that the GCN4-bR1 peptide retains sequence-specific DNA-binding activity (Talanian et al., 1990). GCN4-61, amino acid residues 221-281 of GCN4, corresponds to the entire bZIP structural motif of GCN4 and has been reported to be capable of dimerization and sequence-specific DNA binding (Hope and Struhl, 1986).

The positive ion ESI deconvoluted spectra of the GCN4 peptides are shown in Figure 2. For both GCN4-bR (Figure 2A) and GCN4-61 (Figure 2B), the deconvoluted spectra show primarily the molecular ion of the respective peptides. For GCN4-61, additional ions (labeled with an asterisk) are observed which are due to truncated synthetic peptides in the sample. It should be noted that, under these conditions, no ions corresponding to the GCN4-61 dimer are observed. Only monomeric GCN4-61 is observed.

The electrospray ionization mass spectra of the dsDNA samples revealed that the complementary strands of the oligonucleotides were successfully annealed. For example, the positive ion and negative ion ESI mass spectra of the blunt-ended 24-bp dsDNA are shown in Figure 2, panels C and D, respectively. Ions with odd charge states (7 and/or 9), which are specific to the duplex oligonucleotides (D ions), could be detected, verifying the formation of the dsDNA from the single-stranded oligonucleotides (S ions). The broadness of the ions in these spectra is most likely due to counterion association with the oligonucleotides (Saenger, 1984; Light-Wahl et al., 1993; Bayer et al., 1994). To obtain the ESI spectra of the DNA duplexes, a cone voltage of ca. 75-100 V was necessary. For cone voltages less than 75 V, only ions from the single oligonucleotides were observed.
This phenomenon is in agreement with several literature reports for the analysis of duplex oligonucleotides (Bayer et al., 1994; Ding and Anderegg, 1995; Triolo et al., 1997). The higher cone voltages may be necessary to efficiently desolvate the duplex. Triolo and co-workers (Triolo et al., 1997) have suggested that this finding indicates good gas-phase stability of the duplex ions.

Figure 2. (A) Positive ion ESI deconvoluted mass spectrum of GCN4-bR (average of 28 scans) and (B) positive ion ESI deconvoluted mass spectrum of GCN4-61 (average of five scans). Ions designated with an asterisk (*) are due to truncated synthetic peptides. ESI mass spectra of annealed blunt-ended 24-bp dsDNA acquired under (C) negative ion conditions (average of 12 scans) and (D) positive ion conditions (average of 95 scans). Ions designated as D are due to the dsDNA and ions designated as S are due to the single-stranded oligonucleotides.

The positive ion ESI mass spectrum of the trimolecular complex of the blunt-ended 13-bp dsDNA and GCN4-bR (Mr = 15896.4) is shown in Figure 3A. Molecular ions for the noncovalent complex are observed at ca. m/z 2000 and 2300, together with molecular ions due to the GCN4-bR peptide in the lower m/z range of 700-1400. The noncovalent complex ions correspond to the +8 and +7 charge states, respectively. No ions corresponding to a complex of GCN4-bR with single-stranded DNA or to the DNA dimer alone were observed. A cone voltage of 80 V was necessary to observe the molecular ions of the trimeric complex in the mass spectrum. The relatively broad peaks of the noncovalent complex are most likely due to incomplete desolvation and/or salt adducts of the molecular ions, as observed by others (Huang et al., 1993; Cheng et al., 1996b). The broadness of the peptide ions in the low mass/charge region is due to fragmentation induced by the high cone voltage as well as oxidation of the peptide as a result of the incorporation of the disulfide linker, which can be readily oxidized.

The ESI mass spectrum of the tetramolecular complex of the blunt-ended 13-bp dsDNA with the GCN4-61 noncovalent dimer (Mr = 22135.6) is shown in Figure 3B. The ions observed at m/z 2461, 2213, 2015, and 1848 correspond to the +9, +10, +11, and +12 charge states of the noncovalent complex (C series ions). In addition, ions are observed in the low m/z range which correspond to the GCN4-61 monomer peptide (P series ions). Additional molecular ion species are observed which correspond to the truncated synthetic peptides noted in Figure 2B. A cone voltage of 85 V was necessary to detect the ions for the tetramolecular complex. The low relative abundance fragment ions at ca. m/z 1880 and 2145 correspond to the loss of one peptide chain from the tetramolecular complex due to the high cone voltage (F ions). Similar to the electrospray mass spectrum of the GCN4-61 peptide alone (Figure 2B), only GCN4-61 monomeric ions were detected. No ions were observed for the GCN4-61 dimer.

Figure 3. Positive ion ESI mass spectra of the noncovalent complex of the 13-bp dsDNA with (A) GCN4-bR (average of 40 scans) and (B) GCN4-61 dimer (average of 94 scans). Ions designated as P are due to the peptides while ions designated as C are due to the trimolecular and tetramolecular noncovalent complexes. The fragment ions designated as F are due to the loss of one peptide chain from the tetramolecular complex, while the ions designated with an asterisk (*) are due to complexes involving the truncated synthetic peptides.

A slightly longer strand of dsDNA complexed with the GCN4 peptides was also evaluated (Figure 4). The positive ion ESI mass spectrum of blunt-ended 24-bp dsDNA and GCN4-bR (Mr = 22695.8) was acquired using a cone voltage of 125 V and is shown in Figure 4A. Molecular ions for the trimolecular complex (C series ions) are observed in the m/z range 2500-3300 and correspond to the +7, +8, and +9 charge states of the complex. Ions due to the GCN4-bR peptide alone (P series ions) are observed in the low m/z range. The low relative abundance ion of m/z 2980 corresponds to the +5 charge state of the 24-bp dsDNA (D+5 ion). This spectrum is similar to that observed for the 13-bp dsDNA complex (Figure 3A) in that no ions were observed for the 24-bp DNA dimer alone or for GCN4-bR with a 24-bp single strand. In addition, the peaks are relatively broad due to incomplete desolvation and/or salt adducts. The positive ion ESI spectrum for the tetramolecular complex of the blunt-ended 24-bp dsDNA and the GCN4-61 dimer (Mr = 28935) is shown in Figure 4B. A cone voltage of 85 V was necessary to observe the tetramolecular complex ions. The ions of m/z 2895, 2634, and 2424 correspond to the +10, +11, and +12 charge states, respectively, of the noncovalent complex (C series ions). An additional ion in the higher m/z range, at m/z 2110, is also observed and corresponds to the +7 ion of the 24-bp dsDNA dimer (D+7 ion). Ions of the GCN4-61 monomer are observed in the lower m/z range, with the +6 charge state ion being the highest in relative abundance (P series ions). No ions were observed for the GCN4-61 dimer, the GCN4-61 monomer complexed with dsDNA, or any other nonspecific complex.

Figure 4. Positive ion ESI mass spectra of the noncovalent complex of the blunt-ended 24-bp dsDNA with (A) GCN4-bR (average of 54 scans) and (B) GCN4-61 dimer (average of 52 scans). Ions designated as P are due to the peptides, ions designated as C are due to the trimolecular and tetramolecular noncovalent complexes, and ions designated as D are due to the dsDNA.

Figure 5. Positive ion ESI mass spectra of the noncovalent complex of the cohesive-ended 24-bp dsDNA with (A) GCN4-bR (average of 86 scans) and (B) GCN4-61 dimer (average of 70 scans). Ions designated as P are due to the peptides, ions designated as C are due to the trimolecular and tetramolecular noncovalent complexes, and the ion designated as D is due to the dsDNA.

To determine if the end of the double-stranded DNA has any effect on the stability of the noncovalent complexes during electrospray ionization, cohesive-ended dsDNAs complexed with the GCN4 peptides were analyzed. Cohesive-ended DNA is DNA which has short single-stranded sequences at the ends. The positive ion ESI spectra of the noncovalent complexes of the cohesive-ended 24-bp dsDNA (Figure 5) and 30-bp dsDNA (Figure 6) with GCN4-bR and with GCN4-61 show data very similar to those for the previous complexes using blunt-ended dsDNAs. Ions due to the noncovalent complexes with cohesive-ended dsDNA (C series ions) are observed in the mass spectra acquired at a cone voltage of 95 V.
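The charge-state assignments quoted above for the GCN4-61 tetramolecular complexes can be cross-checked against the stated Mr values with a short calculation. The sketch below is illustrative only: the function name and data layout are hypothetical, and it assumes that each positive charge arises from attachment of one proton (mass 1.00728 Da) with no salt adducts or residual solvent.

```python
PROTON_MASS = 1.00728  # Da


def mz_positive(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion for a neutral mass M and charge z."""
    return (neutral_mass + charge * PROTON_MASS) / charge


# Mr values and peak positions as quoted in the text (charge -> reported m/z).
reported = {
    "13-bp dsDNA + GCN4-61 dimer": (22135.6, {9: 2461, 10: 2213, 11: 2015, 12: 1848}),
    "24-bp dsDNA + GCN4-61 dimer": (28935.0, {10: 2895, 11: 2634, 12: 2424}),
}

for name, (mr, peaks) in reported.items():
    for z, observed in peaks.items():
        predicted = mz_positive(mr, z)
        print(f"{name} +{z}: predicted {predicted:.1f}, reported {observed}")
```

Under these assumptions the predicted positions fall within about 1% of the reported m/z values. Small offsets of this size would not be surprising given the peak broadening from incomplete desolvation and salt adducts noted in the text.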


Figure 6. Positive ion ESI mass spectra of the noncovalent complex of the cohesive-ended 30-bp dsDNA with (A) GCN4-bR (average of 64 scans) and (B) GCN4-61 dimer (average of 114 scans). Ions designated as P are due to the peptides, and ions designated as C are due to the trimolecular and tetramolecular noncovalent complexes.
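Charge states in ESI spectra such as these can, in principle, be assigned from the peak positions alone, because adjacent members of a charge-state series differ by one charge. The following sketch applies that standard relation to the C series ions quoted in the text for the 13-bp dsDNA/GCN4-61 complex (m/z 2461, 2213, 2015, and 1848). It is a simplified illustration, not the deconvolution procedure used by the instrument software: the function names are hypothetical, and charging by protons only is assumed.

```python
PROTON_MASS = 1.00728  # Da


def charge_from_adjacent(mz_high, mz_low):
    """Charge of the higher-m/z peak, assuming the lower-m/z peak carries one more proton."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def neutral_mass(mz, charge):
    """Neutral mass implied by an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON_MASS)


# C series ions from the text, in order of decreasing m/z.
peaks = [2461.0, 2213.0, 2015.0, 1848.0]

# Assign a charge to every peak that has a lower-m/z neighbor, then extend by one.
charges = [charge_from_adjacent(hi, lo) for hi, lo in zip(peaks, peaks[1:])]
charges.append(charges[-1] + 1)

masses = [neutral_mass(mz, z) for mz, z in zip(peaks, charges)]
mean_mass = sum(masses) / len(masses)

print("charges:", charges)
print(f"mean neutral mass: {mean_mass:.0f} Da")
```

With these inputs the inferred charges are +9 through +12, matching the assignments in the text, and the averaged neutral mass lies within 0.5% of the stated Mr of 22135.6.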

Specificity of ESI Mass Spectrometric Detection of Complexes. A variety of control experiments was conducted to verify that the observations in these spectra are due to specific interactions between the GCN4 peptides and the dsDNA. A non-DNA-binding protein, bovine pancreatic trypsin inhibitor (BPTI), was mixed with the dsDNAs to see if a nonspecific trimolecular complex could be detected. Low relative abundance ions (