Anal. Chem. 2003, 75, 6437-6448
Affinity Capture of Specific DNA-Binding Proteins for Mass Spectrometric Identification
Mariana Yaneva and Paul Tempst*
Molecular Biology Program, Memorial Sloan-Kettering Cancer Center, and Weill Graduate School of Medical Sciences, Cornell University, New York, New York 10021
We describe a general approach for affinity microcapture of site-specific, nucleic acid-binding proteins. The major difficulties in developing this method into a widely applicable protocol derived from the need for a massive enrichment and the inadvertent, extensive binding of nonspecific proteins to the bait. On the basis of a detailed analysis, we propose (i) a one-step fractionation of crude extracts on P11 phosphocellulose, followed by (ii) a discrete series of positive/negative selections on wild-type and site-mutated ligand DNA in a magnetic microparticulate format, with cobalt magnets, concatamerized and biotinylated ligands, selective salt conditions, and improved competitor DNAs. We also present rules for determining the precise number and order of selections. The approach and protocol allowed isolation of four low-abundance transcription factors and repressors from 2 × 10⁹ cultured leukemia cells. Captured proteins were 10 000-20 000-fold enriched from the nuclear extract, in a form and amounts that permitted facile MALDI-TOF and TOF/TOF MS-based protein identification. This is 1-2 orders of magnitude better than many previous efforts and in a fraction of the time (∼1 factor/week). The method can be applied to any protein that binds DNA, including those with modest to low affinity, and bridges functional biochemical studies on replication, transcriptional regulation, and DNA repair with the analytical power of mass spectrometry-based proteomics.
Targeted proteomics denotes the examination of subsets of the proteome. We seek to apply this concept to the biochemical analysis of genome function, integrity, and dynamics. Transcription, replication, recombination, and DNA repair all involve the action of sequence-specific, DNA-binding proteins. These interact with additional proteins, some also binding DNA directly and others tethered to it through the partner.
In this way, large functional complexes are formed and similarly anchored to specific DNA sites.1-3 Biochemical analysis requires purification, identification, reconstitution, and functional characterization of the components. Targeted proteomics can contribute to this by generating a catalog of physical interactions that may help define the functional modules. Many of the auxiliary proteins, enzymes, and complexes have important, more general functions and are present in the nucleus in small to moderate amounts. By contrast, those binding at select sites, such as transcription factors (TFs) to specific promoters, represent only 0.01-0.001% of the total cellular protein.4 Historically, purification of TFs by conventional chromatographic methods has been very difficult, often requiring nuclear extract from hundreds to several thousand liters of cultured cells to achieve a 10 000-100 000-fold enrichment and still recover enough protein for chemical and functional analysis.4-6 The inherent capacity of these proteins to bind nucleic acids has long been exploited as a means to purification, initially by coupling total calf thymus DNA to solid supports7 and later by immobilizing selected DNA fragments or synthetic oligonucleotides for sequence-specific affinity chromatography.8,9 Attachment is done either through chemical means, usually by CNBr activation,4,5 or by utilizing the biotin-streptavidin system.8,9 Still, sequence-specific DNA affinity chromatography has rarely, if ever, succeeded in purifying human TFs to homogeneity directly from crude extract, a shortcoming related to (i) the massive enrichment factor that is required and (ii) the presence of large numbers of abundant, nonspecific DNA- and RNA-binding proteins in the nucleus. Therefore, multicolumn chromatography is still very much the norm with an affinity step at the end, usually in the presence of nonspecific competitor DNAs such as Escherichia coli DNA or poly(dI-dC).7,10 Most protocols remain laborious and very time-consuming and generally end with large-volume preparations, impeding downstream mass spectrometric analysis.
* To whom correspondence should be addressed at Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021. Phone: (212) 639-8923. E-mail: [email protected].
(1) Carey, M.; Smale, S. T. Transcriptional Regulation in Eukaryotes; CSHL Press: Cold Spring Harbor, NY, 2000.
(2) Malik, S.; Roeder, R. G. Trends Biochem. Sci. 2000, 25, 277-283.
(3) Naar, A. M.; Lemon, B. D.; Tjian, R. Annu. Rev. Biochem. 2001, 70, 475-501.
10.1021/ac034698l. Published on Web 11/01/2003. © 2003 American Chemical Society
Subsequent concentration of the sample by evaporation or precipitation frequently leads to a large or total loss of the proteins. Although the use of mass spectrometry (MS) has made protein identification several orders of magnitude faster and more sensitive,11,12 the classical purification protocols have been adapted slowly or not at all, resulting in virtually zero overall savings in time and effort. Instead, recent methods and technology research in the proteomics field has been largely directed at mass spectrometric hardware, software, quantitation issues, and mixed analysis, in the hope of overcoming current limitations in sample preparation and homogeneity.11,12 While laudable, this approach can be costly and in some cases prohibitively sophisticated for all but a few expert laboratories. Real micro-biochemical methods are therefore needed to match the analytical ones and to provide a bridge between classical molecular biology/biochemistry and mass spectrometry-based protein identification and proteomics. The ideal method starts with the smallest possible amount of cells or tissue and follows a fast and efficient protocol to prepare protein(s) in amounts, form, and final volume fully compatible with popularly used MALDI-TOF-based peptide mass fingerprinting or an alternative identification technique. This has been tried by one-step capture from crude extract with direct MS analysis on a commercial SELDI-TOF system.13,14 However, these efforts did not progress beyond detecting a protein with an apparent molecular mass somewhere in the range of what was expected for the analyte;14 positive identification was either not made or, in the case of an abundant bacterial protein, required off-chip purification for standard mass fingerprinting.13 A more pragmatic alternative is off-line capture on magnetic particles coated with double-stranded DNA that contains specific binding sequences for the factor(s) of interest. In its early configuration, the technique was used for purification of yeast TFs without, however, any attempt to identify the captured proteins.15 This method also exists in a one-step, microassay format with detection by Western blot,16 rather than MS.
(4) Kerrigan, L. A.; Kadonaga, J. T. In Current Protocols in Molecular Biology; Ausubel, F. M., et al., Eds.; John Wiley and Sons: New York, 1993; Unit 12.10, pp 1-11.
(5) Kadonaga, J. T. Methods Enzymol. 1991, 208, 10-23.
(6) Andrews, N. C.; Erdjument-Bromage, H.; Davidson, M. B.; Tempst, P.; Orkin, S. H. Nature 1993, 362, 722-728.
(7) Alberts, B.; Herrick, G. Methods Enzymol. 1971, 21, 198-217.
(8) Kasher, M. S.; Pintel, D.; Ward, D. C. Mol. Cell. Biol. 1986, 6, 3117-3127.
(9) Blanks, R.; McLaughlin, L. W. Nucleic Acids Res. 1988, 16, 10283-10299.
(10) Gadgil, H.; Jurado, L. A.; Jarrett, H. W. Anal. Biochem. 2001, 290, 147-178.
More recent applications with successful identification still required the inclusion of four chromatographic steps before final capture and MS readout.17 There is scant evidence in the published literature that one-step capture with MS-based identification would, in fact, be feasible at all as a routine method. Nordhoff and colleagues identified RXRα and PPARγ in this way but only after ectopic expression in yeast or after induced expression in cultured mouse fibroblasts.18 Finally, Masternak et al. discovered a novel RFX-associated protein after a single round of DNA affinity purification yielded about 500 ng (∼15 pmol) for mass spectrometric analysis,19 an uncommonly large amount by today’s standards. All in all, there is currently no robust, generally applicable method for affinity capture of specific DNA-binding proteins.
(11) Yates, J. R., 3rd. Trends Genet. 2000, 16, 5-8.
(12) Aebersold, R.; Mann, M. Nature 2003, 422, 198-207.
(13) Forde, C. E.; McCutchen-Maloney, S. L. Mass Spectrom. Rev. 2002, 21, 419-439.
(14) Bane, T. K.; LeBlanc, J. F.; Lee, T. D.; Riggs, A. D. Nucleic Acids Res. 2002, 30, 69.
(15) Gabrielsen, O. S.; Huet, J. Methods Enzymol. 1993, 218, 508-530.
(16) Kumar, V. N.; Bernstein, L. R. Anal. Biochem. 2001, 299, 203-210.
(17) Schweppe, R. E.; Melton, A. A.; Brodsky, K. S.; Aveline, L. D.; Resing, K. A.; Ahn, N. G.; Gutierrez-Hartmann, A. J. Biol. Chem. 2003, 278, 16863-16872.
(18) Nordhoff, E.; Krogsdam, A. M.; Jørgensen, H. F.; Kallipolitis, B. H.; Clark, B. F. C.; Roepstorff, P.; Kristiansen, K. Nat. Biotechnol. 1999, 17, 884-888.
(19) Masternak, K.; Barras, E.; Zufferey, M.; Conrad, B.; Corthals, G.; Aebersold, R.; Sanchez, J. C.; Hochstrasser, D. F.; Mach, B.; Reith, W. Nat. Genet. 1998, 20, 273-277.
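The figures quoted in the introduction can be checked with a little arithmetic. The following is only an illustrative sketch (the function names are hypothetical and not part of any published protocol): it shows why a factor that makes up 0.01-0.001% of total cellular protein calls for a 10 000-100 000-fold enrichment, and what molecular mass is implied by 500 ng corresponding to ∼15 pmol.

```python
def fold_enrichment_for_purity(abundance_percent: float) -> float:
    """Fold enrichment needed to bring a protein from the given share of
    total protein (in percent) to essentially 100% purity."""
    if not 0 < abundance_percent <= 100:
        raise ValueError("abundance must be in (0, 100] percent")
    return 100.0 / abundance_percent


def implied_mass_kda(amount_ng: float, amount_pmol: float) -> float:
    """Molecular mass in kDa implied by a protein mass (ng) and amount (pmol).

    1 ng/pmol = 1e-9 g / 1e-12 mol = 1000 g/mol = 1 kDa, so the ratio
    ng/pmol is numerically the mass in kDa.
    """
    if amount_ng <= 0 or amount_pmol <= 0:
        raise ValueError("amounts must be positive")
    return amount_ng / amount_pmol


if __name__ == "__main__":
    for pct in (0.01, 0.001):
        fold = fold_enrichment_for_purity(pct)
        print(f"{pct}% of total protein -> about {fold:,.0f}-fold enrichment")
    print(f"500 ng at 15 pmol -> about {implied_mass_kda(500, 15):.0f} kDa")
```

The calculation assumes ideal recovery; real purifications lose material at every step, which is one reason the starting amounts cited above are so large.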
Here, we report a fast, simple, and highly efficient method for protein capture on specific DNA-magnetic beads and identification by MALDI-TOF MS. It involves a single-step fractionation of crude extracts on phosphocellulose and a discrete series of positive/negative selections and ends with purified protein, concentrated in a small volume, that allows successful MS analysis. After appropriate optimizations, this method can be applied to any factor, of human or other origin, that binds DNA, including those with modest to low affinity.
EXPERIMENTAL SECTION
Abbreviations: DNA, deoxyribonucleic acid; ds, double-stranded; EMSA, electrophoretic mobility shift assay; MALDI-TOF, matrix-assisted laser desorption/ionization time-of-flight; MS, mass spectrometry; mut, mutated; NB4, human myeloid leukemia cell line; NE, nuclear extract; P11, phosphocellulose; PCR, polymerase chain reaction; PMF, peptide mass fingerprinting; ppm, parts per million; SDS, sodium dodecyl sulfate; ss, single-stranded; TF, transcription factor; wt, wild type.
Materials. Tween-20 was purchased from Fisher Scientific (Pittsburgh, PA); NP-40 was from Sigma (St. Louis, MO); acrylamide, bis(acrylamide), and Coomassie Blue R250 were from Bio-Rad (Hercules, CA); PVDF Immobilon-P membranes were from Millipore (Bedford, MA). Poly(dI:dC) was purchased from Amersham Pharmacia (Piscataway, NJ). Oligo(dI:dC), 30 bp in length, was custom-synthesized by Integrated DNA Technologies (IDT, Coralville, IA). Regular and 5′-biotinylated (with 6-carbon linker), single-stranded oligonucleotides were either custom-synthesized by IDT or purchased in double-stranded form from Santa Cruz Biotechnology (Santa Cruz, CA). Antibodies to PU.1, Fos, Jun, and RARα were also from Santa Cruz. Rabbit anti-GABPα polyclonal antibodies were custom-raised by Pocono Rabbit Farm (Canadensis, PA) against KLH-coupled, synthetic peptides (Yaneva et al., unpublished).
Cell Culture.
Human promyelocytic leukemia NB4 cells were grown at 37 °C and 5% CO2 in RPMI-1640 medium (Life Technologies, Rockville, MD) supplemented with 10% fetal calf serum (Sigma), nonessential amino acids, and penicillin and streptomycin at 5 µg/mL each. Cell cultures were passaged twice a week to maintain the cell density between 0.5 × 10⁶ and 1.5 × 10⁶ cells/mL.
Preparation of Nuclear Extracts (NE) and Chromatography on Phosphocellulose (P11). All procedures were performed at 4 °C. Nuclear extracts from NB4 cells were prepared as described previously20 and were then fractionated on a phosphocellulose P11 (Whatman, Clifton, NJ) column equilibrated with buffer D (20 mM HEPES pH 7.9, 0.2 mM EDTA, 0.5 mM DTT, 0.01% NP-40, 0.2 mM PMSF, and 10% glycerol) containing 75 mM NaCl. As a rule, approximately 3 mg of NE was loaded onto 1 mL of P11 resin. Bound proteins were eluted either with a linear gradient of 0.075-0.85 M NaCl or stepwise with 0.1, 0.3, 0.5, and 0.85 M NaCl, in the same buffer. Gradient or stepwise, the entire elution was done with 20 column volumes at a flow rate of 0.4 mL/min. Fractions containing the respective DNA-binding activities, as monitored by EMSA (see below), were pooled, dialyzed overnight at 4 °C against 50 volumes of buffer D containing 0.1 M NaCl, and then bound directly to DNA-magnetic beads, constructed with concatamerized oligonucleotides (see below) and equilibrated with the same buffer.
(20) Dignam, J. D.; Martin, P. L.; Shastry, B. S.; Roeder, R. G. Methods Enzymol. 1983, 101, 582-598.
Concatamerization of DNA Oligonucleotides. Multimers of DNA-binding sites were generated by a self-priming PCR method21 using two complementary, direct repeats of single-stranded oligonucleotides with either wild-type or mutant versions of specific binding sites (Table 1), as required. Only the forward, ss oligonucleotides were biotinylated at the 5′ end, using a 6-carbon spacer arm (IDT). PCRs (50-µL volume) contained 460 ng of each oligonucleotide, 8 µM of dNTPs (Roche Molecular Biochemicals, Indianapolis, IN), and 2 units of “Vent” polymerase (New England BioLabs, Beverly, MA) in 10 mM KCl, 10 mM (NH4)2SO4, 3.5 mM MgSO4, 0.1% Triton X-100, and 20 mM Tris-HCl pH 8.8 (at 25 °C). Cycling conditions were optimized for each pair of oligonucleotides. For GABP-binding concatamers, the PCR conditions were as follows: 95 °C for 2 min followed by 14 cycles at 95 °C for 1 min, 55 °C for 1 min, and 72 °C for 3 min. Conditions for PURα: 95 °C for 2 min followed by 9 cycles at 95 °C for 1 min, 55 °C for 1 min, and 72 °C for 3 min. Conditions for AP.1 and PU.1: 92 °C for 2 min followed by 14 cycles at 95 °C for 1 min, 55 °C for 1 min, and 72 °C for 3 min. Conditions for RARα: 95 °C for 2 min followed by 9 cycles of 95 °C for 1 min, 55 °C for 1 min, and 72 °C for 5 min. PCR products were purified using a QIAquick kit (Qiagen, Valencia, CA) and analyzed by electrophoresis in agarose gel/Tris-borate-EDTA buffer. Each PCR yielded about 3-5 µg of DNA, ranging between 200 bp and 5-10 kb in length.
Table 1. Sequences of the Wild-Type (wt) and Mutant (mut) Double-Stranded DNA Oligonucleotides Used for Concatamerizations, Immobilization on Magnetic Beads, and Capture of GABP (Yaneva et al., Unpublished), PU.1,22 AP.1,27 PURα (Kippenberger et al., Unpublished), and RARα,37 with Mutations Boxed
Preparation of Double-Stranded (ds) DNA-Magnetic Beads. The 5′-biotinylated ds oligonucleotides or concatamerized DNA, (DNA)n, were attached to M280 magnetic beads coated with streptavidin (Dynal Biotech, Oslo, Norway), using a KilobaseBINDER kit according to the manufacturer’s instructions. The efficiency of concatamer binding to those beads depends on the length of DNA.
According to the manufacturer’s specifications, binding of 1000-bp DNA is on the order of 2-5 µg/mg of beads; lower and higher molecular weight (DNA)n bind at up to ∼12 µg/mg. For quantitations, small aliquots of ds oligonucleotides or (DNA)n were end-labeled with T4 polynucleotide kinase (New England Biolabs) and γ-32P-ATP (PerkinElmer Life Sciences, Boston, MA) and then used as a tracer to monitor final attachment. This varied from 3 to 9 µg of concatamers/mg of beads (details in Table 2). To prepare ds oligonucleotides, complementary ss oligonucleotides were mixed in a 1:1 molar ratio (50 µM each) in 10 mM Tris-HCl, pH 8.0, 10 mM MgCl2, and 100 mM KCl, heated at 88 °C for 3 min, and then gradually cooled (10 min at 65 °C; 1 min at 55 °C; 10 min at 37 °C; 5 min at room temperature) and stored at -20 °C. About 200 pmol of ds oligonucleotides was attached per mg of beads.
(21) Hemat, F.; McEntee, K. Biochem. Biophys. Res. Commun. 1994, 205, 475-481.
Protein Binding to DNA-Magnetic Beads. All procedures were carried out at 4 °C. The beads with attached concatameric DNA were first washed in DNA-binding solution (20 mM HEPES, pH 7.9, 0.1 M KCl, 0.2 mM EDTA, 0.5 mM DTT, 0.01% NP-40, 10% glycerol). The binding buffer also contained oligo(dI:dC) and poly(dI:dC) at 0.1 mg/mL each, except where indicated. If necessary, the protein fractions after P11 chromatography were dialyzed against 50 volumes of the binding buffer, mixed with the DNA-beads, and incubated for at least 3 h at 4 °C with rotation on a LabQuake shaker (Labindustries, Berkeley, CA). For collection of the beads from large volumes of nuclear extracts or column fractions, high-powered cobalt magnetic disks (catalog no. CR3035275, Edmund Scientific, Tonawanda, NY) were placed underneath 15-mL or 50-mL Falcon tubes during centrifugation in a GS-6KR rotor (Beckman, Fullerton, CA) for 5 min at 700g. For smaller volumes, in Eppendorf tubes, magnetic disks were hand-held. After protein binding, beads were washed with binding buffer (≈1 mL/mg of beads) and then three times in the presence of E. coli ds and ss DNA, at 0.1 mg/mL each, as a nonspecific competitor. Single-stranded E. coli DNA was prepared by heating ds DNA in boiling water for 20 min and quick chilling on ice.
Finally, the beads were eluted with 50 µL of binding buffer, containing 0.5 M NaCl, for 15 min on ice. After elution, the beads were suspended directly in 50 µL of Laemmli sample loading buffer, and after being heated for 3 min at 95 °C, the eluates were subjected to analysis by SDS gel electrophoresis and Western blotting, at room temperature. Gels were always stained with Coomassie Blue R250. Silver staining of the sodium dodecyl sulfate gels prior to mass spectrometric analysis is not advisable as silver binds to traces of competitor DNA left in the final preparations.
Electrophoretic Mobility Shift Assay (EMSA). EMSA was performed essentially as described previously,22 with minor modifications. In brief, prebinding of nuclear extract (5 to 10 µg/mL), or respective protein fraction, to the poly(dI-dC) was carried out at 25 °C for 10 min in buffer containing 4% glycerol, 1 mM MgCl2, 0.5 mM EDTA, 0.5 mM DTT, 25 mM NaCl, 10 mM Tris-HCl (pH 7.5), and 0.05 mg/mL poly(dI-dC)·poly(dI-dC). For the competition experiments, a 5-100-fold molar excess of unlabeled wild-type or mutated oligonucleotides was included in the incubation mixture. Probe (3.5 fmol, ∼2 × 10⁴ cpm) was then added to the above reaction, which was mixed and incubated at 25 °C for 20 min. One microliter of 10× gel loading buffer, containing 250 mM Tris-HCl (pH 7.5), 0.2% bromophenol blue, 0.2% xylene cyanol, and 40% glycerol, was added to the reaction, which was then loaded onto a 6% native gel (prerun for 90 min at 100 V) in 0.5× nondenaturing Tris-borate-EDTA buffer. Electrophoresis was performed at 25 °C and 100 V for about 3.5 h. The gel was then transferred onto Whatman paper, vacuum-dried, and exposed to Hyperfilm (Amersham Pharmacia) for the desired period of time at -80 °C and with an intensifier screen.
(22) Ma, Y.; Qin, S.; Tempst, P. J. Biol. Chem. 1998, 273, 8727-8740.
Table 2. Affinity Capture of GABP, PU.1, AP.1, and PURα on DNA-Specific Magnetic Beads for Identification by MS (a)
captured   cell culture   no. of cells   P11 fraction   P11 protein   capture   (+)DNA/mg of
protein    (L)            (10⁹)          (M)            (mg)          scheme    beads (µg)
GABP       0.9            1.4            0.1            3.5           +-+       5 (×2)
RARα       4.5            6.8            0.3            22            +---+     6 (×2)
AP.1       1.1            1.7            0.5            5             -+        9
PURα       1.5            2.3            0.5            7             --+       3
PU.1       1.9            2.9            0.85           7             ---+      8
(a) RARα was identified by Western blotting. Human NB4 myeloid leukemia cell cultures were grown to 1.5 × 10⁶ cells/mL; NE protein yields were on average 1 mg/10⁸ NB4 cells. Amounts of ds DNA attached to beads were calculated as described under Experimental Section.
Western Blotting. Protein solutions were separated by electrophoresis in 4-15% gradient polyacrylamide sodium dodecyl sulfate gels and the proteins transferred onto a PVDF membrane in Tris-glycine-methanol buffer. The respective antibodies were added to the membranes at concentrations of 1 µg/mL in PBS/0.05% Tween-20, for 2 h at room temperature. Anti-mouse or anti-rabbit horseradish peroxidase-conjugated antibodies (Santa Cruz) and an enhanced chemiluminescence kit (Pierce; Rockford, IL) were used for visualization of the immune complexes.
Mass Spectrometry (MS). Gel-resolved proteins were digested with trypsin, the mixtures fractionated on a Poros 50 R2 RP micro-tip, and resulting peptide pools analyzed by matrix-assisted laser-desorption/ionization reflectron time-of-flight (MALDI-reTOF) MS using a Bruker UltraFlex TOF/TOF instrument (Bruker Daltonics; Bremen, Germany), as described.23,24 Selected experimental masses (m/z) were then taken to search a nonredundant protein database (NR; ∼1.4 × 10⁶ entries; National Center for Biotechnology Information; Bethesda, MD), utilizing the PeptideSearch (Matthias Mann, Southern Denmark University, Odense, Denmark) algorithm. A molecular weight range twice the predicted weight was covered, with a mass accuracy restriction better than 40 ppm and a maximum of one missed cleavage site allowed per peptide. Mass spectrometric sequencing of selected peptides was done by MALDI-TOF/TOF (MS/MS) analysis on the same prepared samples, using the UltraFlex instrument in “LIFT” mode.
Fragment ion spectra were then taken to search the NR database using the MASCOT MS/MS Ion Search program (Matrix Science Ltd.; London, U.K.).25 Any identification thus obtained was verified by comparing the computer-generated fragment ion series of the predicted tryptic peptide with the experimental MS/MS data. In general, positive identifications were made on the basis of a Mascot MS/MS score ≥74 (p < 0.05) for a single peptide, or a score ≥40 for each of two or more peptides (combined score ≥80). Alternatively, a Mascot MS/MS score ≥40 for a single peptide was combined with a peptide mass fingerprinting (PMF) result that yielded ≥15% sequence coverage.
(23) Erdjument-Bromage, H.; Lui, M.; Lacomis, L.; Grewal, A.; Annan, R. S.; McNulty, D.; Carr, S. A.; Tempst, P. J. Chromatogr., A 1998, 826, 167-181.
(24) Winkler, G. S.; Lacomis, L.; Philip, J.; Erdjument-Bromage, H.; Svejstrup, J. Q.; Tempst, P. Methods 2002, 26, 260-269.
(25) Perkins, D. N.; Pappin, D. J. C.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551-3567.
RESULTS AND DISCUSSION
We sought to develop a protocol for the microisolation and handling of unidentified transcription factors (TFs), characterized only by the binding to a specific DNA-recognition site, all the way from crude nuclear extracts to a downstream, facile mass spectrometric identification. The ideal method should allow preparation of TFs from the minimum amount of tissue or number of cultured cells, in the fewest possible steps, and yield a small (i.e., ≤30-50 µL), concentrated sample for gel electrophoretic fractionation and visualization. The entire procedure should only require tools and reagents commonly available in the average molecular biology or biochemistry laboratory. Two enabling concepts can be considered up front. Affinity capture may in principle allow a single-step isolation by taking advantage of the highly specific factor-nucleic acid interactions. Sample volume and handling losses may be kept to a minimum by immobilizing the protein(s) of interest throughout the purification process onto the smallest amount of solid-phase support, operated in batch mode for simplicity. Low-micrometer (≤3 µm) diameter magnetic particles bearing ligand (e.g., ds DNA) potentially satisfy both requirements. Magnetic microparticles also have the advantage of facile collection from relatively large volumes, with resultant analyte concentration, and of minimal losses during washings and transfers. To accommodate larger volumes, we selected strong, 6-mm-diameter, cobalt magnetic disks (see Experimental Section) that can be used in either manual pull-downs or in combination with centrifugation.
Nonspecific Nucleic Acid-Binding Proteins.
At the outset of this study, we typically generated magnetic particles liganded with specific ds DNA sequences by annealing two synthetic oligonucleotides, one of which was biotinylated at the 5′-end, and that were then bound to commercially available, streptavidin-coated beads. In a first, comparative experiment, three different particles were prepared, each with a specific DNA sequence (21-bp long) containing the known binding site for one of three common TFs, namely AP.1, NFκB, and Sp1 (Figure 1A).26-28 These beads were separately incubated with nuclear extract from human leukemia NB4 cells under standard binding conditions (NaCl and Mg2+ concentrations; poly(dI:dC) competitor) commonly used in EMSA.1,22 After extensive washing, bound proteins were eluted with Laemmli sample buffer and analyzed by SDS gel electrophoresis and Coomassie Blue staining. The profiles in each lane looked nearly identical, regardless of the difference in DNA sequences attached to the beads (Figure 1B), suggestive of a high degree of nonspecific binding. In support of this view, several major bands from the gel were excised and identified by MS as mostly nonspecific RNA or DNA-binding proteins, some with preference for binding to free DNA ends (as indicated on the right in Figure 1B).29 This showed that, under conditions where specific bindings were observed in EMSA (data not shown), most of the DNA-binding sites on the beads were occupied nonspecifically. Poly(dI:dC) competitor at a concentration of 0.1 mg/mL, already high, was clearly insufficient to prevent these unwanted interactions in a chromatographic setting. Raising the concentration any further leads to viscosity-induced anomalies and potential loss of specific protein.
(26) Kadonaga, J. T.; Tjian, R. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 5889-5893.
(27) Lee, W.; Mitchell, P.; Tjian, R. Cell 1987, 49, 741-752.
(28) Lenardo, M. J.; Baltimore, D. Cell 1989, 58, 227-229.
Figure 1. Capture of nuclear proteins on specific oligonucleotides. (A) Sequences of double-stranded DNA oligonucleotides, with specific binding sites for AP.1,27 NFκB,28 and Sp1,41 immobilized on magnetic particles. Factor binding sites are shown in bold. (B) Electrophoretic profile of the proteins captured on magnetic beads with DNA specific for AP.1 (lane 2), NFκB (lane 3), and Sp1 (lane 4). A 1 mg amount of DNA-beads, containing approximately 100 pmol of oligonucleotides, was incubated with 36 mg of NB4 nuclear extract in the presence of 0.1 mg/mL poly(dI:dC) as competitor. After wash and elution, proteins were separated on a 10% SDS gel and stained with Coomassie Blue. Lane 1: molecular weight markers. Selected proteins were identified by MALDI-TOF MS as indicated on the right.
Figure 2. Effect of oligo(dI:dC) competitor and DNA concatamerization on the capture of AP.1. (A) Nuclear extracts were incubated with monomeric, AP.1-specific DNA oligonucleotides bound to magnetic beads in the absence (lane 2) or the presence of a 100-fold molar excess of oligo(dI:dC) (lanes 3 and 4); alternatively, beads contained concatamerized oligonucleotides (lane 4). All incubation mixtures contained 0.1 mg/mL poly(dI:dC) as well. The proteins eluted from the beads were analyzed on a 10% SDS gel and stained with Coomassie Blue. Lane 1: molecular weight markers. (B) Western blot of the proteins analyzed in (A) using a mixture of anti-Fos and anti-Jun antibodies at 1 µg/mL each. Lane 1: molecular weight markers.
Concatamerized DNA-Binding Sites and Oligo(dI:dC) Competitor. From the premise that several major, nonspecifically binding proteins (e.g., DNA-PK, Ku autoantigen, PARP) have high affinities for DNA breaks or ends, we reasoned that the resultant background could be reduced by using (i) beads with DNA affinity ligands of a lower ends-to-binding site ratio and, conversely, (ii) competitor DNA of a higher ends-to-weight ratio. The latter option
was implemented by including 30-bp-long, custom-synthesized oligo(dI:dC) in the binding reaction, in addition to the standard polymeric competitor.30 A 0.1 mg/mL oligo(dI:dC) concentration was empirically found to be the highest that did not increase viscosity and did not visibly interfere with TF-DNA binding in EMSA (data not shown). As for the first option, long (≥1000 bp) double-stranded concatamers of 5′-biotinylated, specific oligonucleotides were produced in a self-priming PCR, effectively increasing ligand density on the beads without added DNA ends. It should be noted that PCR-based concatamerization reactions are sequence specific and must be carefully optimized first during pilot experiments. In binding experiments with AP.1-DNA beads (see Figure 1A), the effect of oligo(dI:dC) as competitor was evaluated and protein binding to single DNA sites vs concatamers compared. As anticipated, the nonspecific protein bands were reduced in the presence of oligo(dI:dC) (Figure 2A; lanes 2 and 3) and did not visibly increase when concatamerized DNA instead of oligonucleotide was used as affinity ligand on the beads (Figure 2A, lane 4). Western blot analysis of the bound proteins demonstrated that both subunits of AP.1 (i.e., Jun and Fos) were successfully captured and that their levels were significantly higher on beads with concatamerized binding sites (Figure 2B, lane 4). However, the amount of AP.1 was insufficient for identification as several contaminating, more abundant proteins still precluded unbiased MS detection (data not shown). We concluded that, despite the improvements, direct capture from crude nuclear extracts would not be successful under the current conditions, not even of a relatively abundant TF such as AP.1. At least one enrichment step will be needed before capture on specific DNA-beads.
(29) Yaneva, M.; Kowalewski, T.; Lieber, M. R. EMBO J. 1997, 16, 5098-5112.
(30) West, R.; Yaneva, M.; Lieber, M. Mol. Cell. Biol. 1998, 18, 5908-5920.
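The ends-to-binding-site argument made above for concatamerized ligands can be put in numbers. The sketch below is illustrative only and rests on assumptions not stated in the protocol: a 21-bp repeat unit (the oligonucleotide length used in Figure 1), exactly one binding site per repeat, and one free DNA end per bead-bound duplex (the biotinylated end being occupied by streptavidin). The function name is hypothetical.

```python
def sites_per_free_end(ligand_bp: int, repeat_bp: int = 21, free_ends: int = 1) -> float:
    """Number of specific binding sites offered per free DNA end.

    ligand_bp : length of the bead-bound duplex in base pairs
    repeat_bp : length of one repeat unit carrying a single binding site
                (21 bp is an assumption taken from the Figure 1 oligonucleotides)
    free_ends : DNA ends per duplex that remain accessible (assumed to be 1)
    """
    if repeat_bp <= 0 or free_ends <= 0:
        raise ValueError("repeat_bp and free_ends must be positive")
    sites = ligand_bp // repeat_bp  # whole repeats that fit into the ligand
    if sites < 1:
        raise ValueError("ligand is shorter than one repeat unit")
    return sites / free_ends


if __name__ == "__main__":
    for length in (21, 1000):
        ratio = sites_per_free_end(length)
        print(f"{length}-bp ligand: {ratio:.0f} binding site(s) per free end")
```

Under these assumptions a 1000-bp concatamer presents dozens of specific sites for every end that could attract an end-binding protein, whereas a monomeric oligonucleotide presents only one.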
Chromatography on P11 Phosphocellulose. Ion exchangers (DEAE-cellulose, BioRex70) or Heparin-Sepharose, Phenyl-Sepharose, and gel-filtration columns (Sephacryl S300) have been used as a prefractionation step for affinity purification of nucleic acid-binding proteins.5,10 However, we chose to fractionate nuclear extracts on phosphocellulose P11 as it provides a unique combination of ion-exchange and affinity properties. P11 chromatography is a simple, popular procedure that has been used as a first step in many proven TF purification schemes and that allows enrichment of multiple factors, cofactors, and corepressors in separate fractions.31-33 In this way, we could conceivably also perform parallel affinity captures of several different factors, each from a separate P11 fraction but all derived from a single batch of nuclear extract. NB4 nuclear extracts (75 mg of total protein in 40 mL of buffer D) were applied to a P11 column (22-mL bed volume) and extensively washed, and bound proteins were then eluted with either a linear gradient (Figure 3A) or stepwise increase of NaCl concentration from 0.1 to 0.85 M, always in a 200-mL total elution volume. As monitored by specific EMSA, we detected the presence of GABP in the 0.1 M fraction, RARα in the 0.3 M fraction, AP.1 and PUR in the 0.5 M fraction, and PU.1 activity in the 0.85 M fraction (Figure 3B and data not shown). The choice of these particular factors was directed at covering the entire P11 elution range and was also determined by prior observations in our laboratory while purifying several factors (later identified as PU.1, GABP, and PUR) involved in the regulation of the myeloid defensin promoter or by scrutinizing published TF purification schemes.27,34,35 Note that, initially, every other 1-mL column fraction was assayed for each of the five factors. On the basis of the distribution of the various binding activities, fractions were arbitrarily combined into four pools, as indicated in Figure 3A.
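To relate column fractions to salt concentration in the gradient elution described above, one can interpolate linearly between the start and end NaCl concentrations. The sketch below assumes an ideal linear gradient from 0.1 to 0.85 M delivered over the 200-mL elution volume and collected as 1-mL fractions; it ignores column dead volume and mixing, the function names are hypothetical, and the pool boundaries of Figure 3A are not reproduced.

```python
def nacl_molar_at_volume(volume_ml: float, start_m: float = 0.1,
                         end_m: float = 0.85, total_ml: float = 200.0) -> float:
    """NaCl concentration (M) of an ideal linear gradient after volume_ml."""
    if not 0 <= volume_ml <= total_ml:
        raise ValueError("volume must lie within the gradient")
    return start_m + (end_m - start_m) * volume_ml / total_ml


def fraction_midpoint_molar(fraction_number: int, fraction_ml: float = 1.0) -> float:
    """NaCl concentration (M) at the midpoint of a numbered fraction (1-based)."""
    if fraction_number < 1:
        raise ValueError("fractions are numbered from 1")
    midpoint_ml = (fraction_number - 0.5) * fraction_ml
    return nacl_molar_at_volume(midpoint_ml)


if __name__ == "__main__":
    # Every other fraction was assayed initially; print a coarser sample here.
    for number in range(1, 201, 40):
        print(f"fraction {number:3d}: {fraction_midpoint_molar(number):.3f} M NaCl")
```

A lookup of this kind only gives nominal values; in practice the conductivity of pooled fractions would be measured before dialysis.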
(31) Dignam, J. D.; Lebovitz, R. M.; Roeder, R. G. Nucleic Acids Res. 1983, 11, 1475-1489.
(32) Ge, H.; Roeder, R. G. Cell 1994, 78, 513-523.
(33) Roeder, R. G. Trends Biochem. Sci. 1996, 21, 327-335.
(34) Yeh, W. C.; Hou, J.; McKnight, S. L. Methods Enzymol. 1996, 274, 101-112.
(35) Haas, S.; Thatikunta, P.; Steplewski, A.; Johnson, E. M.; Khalili, K.; Amini, S. J. Cell Biol. 1995, 130, 1171-1179.
For reasons of simplicity and time efficiency, all subsequent P11 columns were batchwise eluted with 0.1, 0.3, 0.5, and 0.85 M salt (5 × column volumes each) for use in further protocol development. As the model DNA-binding factors were present in different pools of significantly varying ionic strength, we investigated the effects of salt concentration on the in vitro binding to their cognate sequences as a prelude to the choice and optimization of subsequent purification steps. Also, low Mg2+ concentrations would reduce the probability of degradation of DNA concatamers attached to the beads by endogenous nucleases. We noticed that all five chosen proteins bound equally well to their cognate DNA in the presence or absence of Mg2+ (data not shown). Consequently, all further DNA captures were performed in the absence of magnesium. As for the salt effects, selectively illustrated for GABP and PU.1 in Figure 4, we observed a striking correlation between the salt concentration at which a given protein eluted from the P11 column and the stability of the DNA-protein complex in the presence of salt as measured by EMSA. Proteins eluting at ≤300 mM NaCl formed less stable complexes with DNA than proteins that eluted at ≥500 mM. For example, the formation
of GABP-DNA complex was adversely affected by the presence of 500 mM as compared to 50 mM KCl, whereas high salt had no such destabilizing effect on the PU.1-DNA complex (Figure 4). In some cases (e.g. AP.1 and PUR), the presence of high salt caused a slight change in the mobility of the DNA-protein complexes (data not shown), but the DNA-binding specificity was preserved as confirmed in subsequent capturing experiments.
Effect of High/Low Salt on DNA-Affinity Capture. In effect, these findings allowed us, as a rule, to perform specific DNA-affinity capture on beads, in the presence of 0.5 M NaCl, of any protein that had previously been eluted from a P11 column at high (≥500 mM) salt. We speculated that, under such high salt conditions, the number of nonspecific, low-affinity binding contaminants might be significantly reduced, a characteristic that should definitely be taken advantage of whenever possible. On the other hand, capture of proteins that bind weakly to a P11 column (eluting at ≤300 mM NaCl), such as GABP, requires a different strategy. To this end, the 0.1 M salt cut from the P11 column (Figure 3A; Figure 3B,C, lanes 2) was incubated with concatamerized GABP-DNA beads under conditions as optimized up to that point, washed, eluted with 0.5 M NaCl, and analyzed by standard gel electrophoresis (Figure 3C, lane 3) and by EMSA with a GABP-specific probe (Figure 3B, lane 3). While the binding activity had clearly been enriched by this affinity step, more than a dozen protein bands were still visible on the Coomassie-stained gel, the majority of which were identified as nonspecific nucleic acid binding proteins or various housekeeping proteins. No GABP could be detected by MS-based methods, making further purification necessary. Similarly, “high salt” (i.e. 
in the presence of 0.5 M NaCl) affinity capture of proteins from the P11 0.85 M salt cut on PU.1-DNA beads also failed to produce sufficiently homogeneous PU.1 factor detectable by standard peptide MALDI-TOF mass fingerprinting. It appeared, therefore, that direct DNA-affinity capture from P11 fractions may be generally inadequate to purify specific TFs for facile mass spectrometric identification. It is imperative that any additional purification steps also be carried out in a microcapture format.
Negative Selection Enhances DNA-Affinity Capture Specificity. As nonspecific protein binding appeared to be the major obstacle for successful TF affinity capture, the problem could be reduced to the adequate removal of such interfering proteins. Conceptually, the suitable medium could be an immobilized DNA sequence that binds many or all of the nonspecific proteins but not the specific one. We envisioned this could be readily accomplished by using DNA ligands consisting of minimally modified TF binding sites, for instance, by one or more point mutations, just enough to abolish specific factor binding in vitro. The mutated DNA sequences were derived from pilot competition EMSA experiments performed in the presence of an excess of unlabeled, mutant DNA probes. Point mutations that fully abolished competition were then selected. Preclearing of protein solutions with beads loaded with such DNA will be further referred to as “negative selection”. Sequences for negative selections for the capture of GABPs, PU.1, and PURs were established in our laboratory (Yaneva et al., unpublished; Kippenberger et al., unpublished), as listed in Table 1; mutant DNAs for capture of AP.1 and RARα were taken from the published literature.27,36,37
Figure 3. Partial purification of GABP on phosphocellulose P11 and DNA-affinity beads. (A) Elution profile of P11 column chromatography. Nuclear extract (75 mg) was applied to a P11 column (22-mL bed volume) and eluted with a 200-mL linear gradient of NaCl as described in the Experimental Section. The fractions in the “0.1 M” pool contained GABP DNA-binding activity. These fractions were combined and applied to GABP-DNA affinity beads. Other fractions were all pooled as indicated (0.3 M; 0.5 M; 0.85 M; flow through = FT). (B) DNA-binding activities of GABP-containing fractions. EMSA using a GABP-specific, ds oligonucleotide was performed as described in the Experimental Section. Key: lane 1, free DNA probe; lane 2, NB4 nuclear extract; lane 3, pooled “0.1 M” fractions from P11 column; lane 4, proteins eluted from the DNA-affinity beads. The two arrows on the right indicate the positions of the DNA/GABPα (bottom) and DNA/GABPα+β (top) complexes (Yaneva et al., unpublished). (C) Electrophoretic profile of eluted proteins. Proteins were analyzed by electrophoresis in 4-15% SDS gels and stained with Coomassie Blue. Lanes 1-3 correspond to lanes 2-4 of panel B. The proteins in lane 3 identified by MS are indicated at right.
(36) Perez, A.; Kastner, P.; Sethi, S.; Lutz, Y.; Reibel, C.; Chambon, P. EMBO J. 1993, 12, 3171-3182.
(37) Leid, M.; Kastner, P.; Lyons, R.; Nakshatri, H.; Saunders, M.; Zacharevsky, T.; Chen, J.-Y.; Staub, A.; Garnier, J. M.; Mader, S.; Chambon, P. Cell 1992, 68, 377-395.
The utility of this approach is best illustrated by the subsequent successful efforts on AP.1 and PU.1 isolation. Starting with the 0.5 M P11 fraction, AP.1 was captured on “positive” (i.e. wild type) DNA-beads in the presence of 500 mM NaCl, after one round of negative selection. Each of the four visible Coomassie-stained bands (Figure 5A; lane 7) was identified by MS (MALDI-TOF and TOF/TOF) as either Fos or Jun proteins (i.e. an “AP.1” heterodimer) or breakdown products thereof (Figure 5C,D); experimental details are given in the figure legends. Western blot analysis with mixed anti-Fos and -Jun antibodies on a fraction of the final sample indicated that nearly all of the protein of interest was recovered (Figure 5B). Similarly, PU.1 protein was captured from the 0.85 M P11 fraction but not until a more extensive removal of nonspecific proteins was done by three consecutive rounds of negative selection (Figure 6A,B). MS-assisted identification was very straightforward at that point (Figure 6C). The additional
Figure 4. Effect of salt on the stability of DNA-protein complexes. DNA binding was measured in EMSA using crude nuclear extracts and DNA oligonucleotides specific for GABP (lanes 1-3) or for PU.1 (lanes 4-6). Proteins and DNA were incubated in the presence of either 50 mM (lanes 2 and 5) or 500 mM KCl (lanes 3 and 6) prior to electrophoresis. Lanes 1 and 4: free DNA probes. Faster migrating DNA-protein complexes in lanes 5 and 6 were typically observed due to partial proteolysis of PU.1 protein in crude extracts.22
negative selections were required in this case because of the apparently low PU.1 concentration in the NB4 cell nuclear extract and P11 fraction. As visualized by Western blotting, PU.1 enrichment throughout the entire procedure was quite dramatic (Figure 6B; lane 9). PUR proteins were also captured under the same conditions from the 0.5 M P11 fraction using two rounds of negative selection. Both subunits (see Figure 7, lane 3) were then positively identified by MS (data not shown).
“Positive/Negative/Positive” Affinity Selections. The capture of lower affinity binders, such as GABP and RARα, was done using a different sequence of positive/negative selections. These proteins could be readily eluted from wild-type DNA-magnetic beads in 0.5 M salt, which was then removed by dialysis; this was followed by one or more rounds of negative selection, as necessary. The advantage of a front-end, positive selection is the sizable reduction of protein mass, all the while retaining the TF of interest, which enables subsequent rounds of capture on a microscale (i.e. Eppendorf tubes). A final round of GABP purification was then also carried out by positive capture, and the proteins were eluted from the beads in 0.5 M NaCl for gel electrophoresis (Figure 7A, lane 2) and identification by MS (data not shown). RARα, easily detected by EMSA and Western blotting in the 0.3 M NaCl fraction of P11, was taken through three rounds of negative selection before capture on “positive” DNA-beads. Again, Western blot analysis clearly demonstrated that the protein was not bound to the mutant but only to the wild-type DNA-beads, from which it could be readily eluted in 0.5 M salt (Figure 7B). However, RARα was not found by MALDI-TOF MS analysis in the final preparation (Figure 7A,B; compare lanes A4 and B10). 
We concluded that for successful capture of RARα, additional purifications, or further scaling up of the present procedure, would be necessary because the protein level seemed to be unusually low. By contrast, Nordhoff et al. reported MS identification of a different nuclear receptor, RXRα, isolated from crude extracts of
hormone-induced mouse 3T3-F442A cells.18 This discrepancy may be related to genetic differences between mouse fibroblasts and human myeloid leukemia NB4 cells. Indeed, two forms of RARα exist in NB4 cells, namely the free protein and an oncogenic fusion, PML-RARα.36 To our knowledge, neither the wild-type nor the fusion protein has ever been purified. It also should be noted that conventional biochemical procedures have rarely allowed purification of specific TFs to homogeneity in so few steps and from such a small number of cells.10 The four out of five factors, totaling six proteins out of seven, that were positively identified here by MS after capture on DNA-magnetic beads were derived from only 1.5 to 3 × 10⁹ human cells (Table 2). Taken together, these results suggest a wider applicability of the procedure, with RARα and PML-RARα being among the exceptions.
General Rules for DNA-Affinity Capture of Site-Specific Proteins. On the basis of the results reported here, we propose the following empirical rules for future identifications of TFs by MS, as illustrated in Figure 8. After initial fractionation of nuclear extract on a P11 column, it is critical to determine whether the protein of interest will form a stable complex with its target DNA in the presence of at least 500 mM salt. This is generally the case for all proteins that elute from P11 at or higher than 500 mM salt (e.g. AP.1, PURα, PU.1, and others). Binding affinities may also be known from prior molecular characterization of the promoter of interest. Affinity and salt tolerance, often directly linked, are the primary determinants of the order in which positive and negative selections are then carried out: “negative/positive” for high-affinity, high-salt binders; “positive/negative/positive” for low-affinity, low-salt binding TFs. 
Addition of a positive step prior to the negative selections serves a dual purpose: (i) the protein mass is drastically reduced; (ii) protein eluates are very concentrated, enabling a further microcapture format. This is not possible for high-affinity binders as approximately 1 M salt is required for elution, in most cases resulting in some degree of denaturation, which complicates both assaying and subsequent affinity captures. Consequently, the first positive selection must be omitted. This can be compensated, in part, by the option to include 0.5 M salt throughout the entire selection process, thereby also increasing capturing stringency, albeit by a different mechanism. The two options appear to be mutually exclusive. Regardless of the selection sequence, the number of negative selections on mutant DNA-magnetic beads always depends on the abundance of the TF in the respective fraction. Low-abundance species require more nuclear extract and proportionally more rounds of negative selection than the more abundant ones, for example, one round for GABP and AP.1, two for PURα, three for PU.1, but more than three for RARα (Figure 8). All capturing schemes end with a positive selection, including extensive washing of the beads with E. coli competitor ds and ss DNA. Final elution is done with 0.5 M NaCl for low-affinity binders or by boiling in Laemmli gel-loading buffer for the others, followed by gel electrophoresis and identification by the mass spectrometric method of choice. The purification scheme generates proteins in amounts and form that are readily compatible with MALDI-TOF-based peptide mass fingerprinting. On a more molecular biological note, the precise DNA sequence to which a protein of interest binds both in vivo and in
Figure 5. Affinity capture of AP.1 on specific DNA-magnetic beads. (A) Protein analysis. Proteins from the P11 “0.5 M” fraction (lane 2) were incubated with 1 mg of beads derivatized with (mutAP1-DNA)n, i.e., negative selection; unbound (lane 3, “NB” = not bound) and bound (lane 4, ‘beads’; boiled in Laemmli buffer) fractions are shown, after analysis on a 4-15% SDS gel and staining with Coomassie Blue. Unbound protein was then incubated with 1 mg of (wtAP1-DNA)n beads; the unbound (lane 5, wt-NB), wash (lane 6), and terminally bound (lane 7, beads) fractions are indicated. All bindings, washes, and elutions were performed as described in the Experimental Section, in the presence of 500 mM KCl. The arrows on the right, labeled A-D, denote bands that were identified by MS; band A is Jun (C), band B is Fos (D), and bands C and D are breakdown products of Jun and Fos. (C) is an MS/MS spectrum of a peptide selected from band A for further identification by sequencing, and (D) is a peptide from band B also selected for MS/MS. (B) Western blot analysis. An SDS gel with 1-5% aliquots of the same protein fractions as shown in (A) was transferred to a PVDF membrane and incubated with a mixture of anti-Fos and anti-Jun antibodies (see Experimental Section). (C) MALDI-TOF/TOF (MS/MS) identification of a Jun tryptic peptide. Protein band A (A; lane 7) was digested with trypsin; peptides were processed over a RP-microtip and analyzed by MALDI-TOF MS (not shown), and the data were used for peptide mass fingerprinting (PMF).24 Jun was independently identified by TOF/TOF (MS/MS) analysis of a peptide observed as a peak at m/z = 1746.866. Fragment ion spectra were taken for a MASCOT MS/MS ion search of the NR database and retrieved a tryptic peptide sequence, GASTFKEEPQTVPEAR ([MH]+ = 1746.844; ∆ = 13 ppm) with a Mascot score of 53. b- and y-fragment ions are indicated. (D) MALDI-TOF/TOF (MS/MS) identification of a Fos tryptic peptide. 
Protein band B (A; lane 7) was processed as described for (C). In this case, Fos was identified by PMF and also by TOF/TOF analysis of a selected tryptic peptide at m/z = 1960.966: LQAETEELEEEKSGLQK ([MH]+ = 1960.972; ∆ = 3 ppm) with a Mascot score of 74.
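The mass errors quoted in the Figure 5 legend follow from the usual relative-error formula, (observed - theoretical)/theoretical × 10^6. The short Python sketch below only re-derives the two quoted values from the m/z figures given in the legend; the function name `ppm_error` is a hypothetical helper introduced for illustration and is not part of the original analysis.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# m/z values as quoted in the Figure 5 legend.
jun_ppm = ppm_error(1746.866, 1746.844)  # Jun peptide, legend quotes 13 ppm
fos_ppm = ppm_error(1960.966, 1960.972)  # Fos peptide, legend quotes 3 ppm

print(f"Jun peptide: {abs(jun_ppm):.1f} ppm")
print(f"Fos peptide: {abs(fos_ppm):.1f} ppm")
```

The legend reports unsigned, rounded values, so the comparison is made on the absolute error.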
Figure 6. Affinity capture of PU.1 on specific DNA-magnetic beads. (A) Protein analysis. Proteins from the P11 “0.85 M” fraction (lane 2) were incubated with 1 mg of beads derivatized with (mutPU1-DNA)n, i.e. negative selection. The bound (lane 3, “mut1-B”; beads boiled in Laemmli buffer) fraction is shown, after analysis on a 4-15% SDS gel and staining with Coomassie Blue. Clearance of the unbound fraction was repeated two more times; bound proteins are shown after each cycle (lanes 4 and 5, “mut2-B, 3-B”), as is the unbound fraction after the third negative selection (lane 6, “mut3-NB”). Unbound protein was then incubated with 1 mg of (wtPU1-DNA)n beads; the unbound (lane 7, wt-NB), wash (lane 8), and terminally bound (lane 9, beads) fractions are indicated. All bindings, washes, and elutions were performed in the presence of 500 mM KCl. Arrows on the right point to the bands in lane 9 that were all identified by MS as PU.1 (C). (B) Western blot analysis. An SDS gel with 1-5% aliquots of the same protein fractions as shown in (A) was transferred to a PVDF membrane and incubated with anti-PU.1 antibodies. The arrows on the right point to the immunoreactive bands that were all identified as PU.1 protein by MS. (C) MALDI-TOF MS-based peptide mass fingerprinting of PU.1. Protein bands marked with arrows in lane 9 (A) were digested with trypsin, peptides were processed over a RP-microtip and analyzed by MALDI-TOF MS, and all three were identified by peptide mass fingerprinting as PU.1 (or fragments), with 17% sequence coverage and average mass error of 9 ppm for the best match. CAL indicates the position of internal calibrants. The protein was confirmed by TOF/TOF (MS/MS) analysis (not shown) of two peptides, observed at m/z = 1369.724 and 1497.816, and identified as tryptic peptides LTYQFSGEVLGR and KLTYQFSGEVLGR, respectively.
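As a plausibility check on the two PU.1 peptides named in the Figure 6 legend, the singly protonated monoisotopic mass of an unmodified peptide can be computed by summing residue masses and adding one water and one proton. The Python sketch below is a minimal illustration under stated assumptions: it uses standard monoisotopic residue masses, assumes unmodified peptides, and the names `MONO`, `WATER`, `PROTON`, and `mh_plus` are hypothetical and not taken from the original work, which relied on database search software.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N- and C-termini
PROTON = 1.00728   # charge carrier for a singly protonated ion


def mh_plus(sequence: str) -> float:
    """Monoisotopic [M+H]+ of an unmodified peptide (one-letter codes)."""
    return sum(MONO[residue] for residue in sequence) + WATER + PROTON


for peptide in ("LTYQFSGEVLGR", "KLTYQFSGEVLGR"):
    print(peptide, f"{mh_plus(peptide):.3f}")
```

Under these assumptions the calculated values come out near 1369.71 and 1497.81, within roughly 10 ppm of the observed peaks at m/z 1369.724 and 1497.816, and the two peptides differ by exactly one lysine residue, as expected for a missed tryptic cleavage.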
vitro should, of course, always be known in advance. Using highly sensitive and specific assays, it is especially critical to determine minimal mutations in the sequence to completely abolish factor
Figure 7. Affinity capture of GABPα, PURα, RARα, AP.1, and PU.1 on specific DNA-magnetic beads. (A) Protein analysis. Proteins were captured from the following P11 fractions (derived from crude nuclear extract): GABP, “0.1 M”; RARα, “0.3 M”; AP.1, “0.5 M”; PURα, “0.5 M”; PU.1, “0.85 M”. The first two pools were initially incubated with wild type (positive) beads. Bound proteins were eluted and desalted and protein solutions cleared on mutant (negative) beads, one round for GABP and three rounds for RARα. The 0.5 and 0.85 M pools were immediately precleared on negative beads, one round for AP.1, two for PURα, and three for PU.1. All cleared fractions were subjected to one more round of positive selection; beads were then washed and eluted with salt (GABP; RARα) or boiled in Laemmli sample buffer (others) and analyzed by electrophoresis in a 4-15% SDS gel and stained with Coomassie Blue, as indicated. The circles to the right of the gel lanes indicate protein bands that have been identified by MS. Closed circles mark proteins that were confirmed by MS as those predicted to bind to specific DNA baits; open circles indicate those that were identified as false positives after “RARα-DNA”-mediated capture. (B) Western blot analysis of proteins captured on (RARα-DNA)n-beads. An SDS gel with 1-5% aliquots of intermediate and final protein fractions of the RARα capture was transferred to a PVDF membrane and incubated with anti-RARα polyclonal antibodies (1 µg/mL): lane 1, crude nuclear extract; lanes 2-8, eluted (EL) and bound (B) proteins during three rounds of negative selection on mutant beads; lanes 9-11, positive selection on wild-type (DNA)n-beads; lane 9, nonbound proteins (NB); lane 10, eluted (EL); lane 11, proteins remaining on the beads (B), recovered by boiling in Laemmli sample buffer. Positions of molecular weight markers are indicated on the right; positions of RARα and of the oncogenic fusion protein PML-RARα are also shown.
binding; this information will be used in the negative selections as described above. Traditional assays include DNA footprinting, transient expression of a promoter fused to a reporter gene, and EMSA. In addition, EMSA is used to optimize the binding with
Figure 8. DNA-specific, affinity microcapture diagram of nucleic factors. The sequence of positive (wt; filled circles) and negative (mut; open circles) selections performed on 0.1, 0.3, 0.5, and 0.85 M salt elution P11 fractions is shown. The (DNA)n-coated magnetic beads were prepared as described under the Experimental Section and in the text; corresponding, specific DNA sequences (wt; or mut) are listed in Table 1. The boxes cluster all the steps that were carried out at a similar salt concentration of either 0.05 or 0.5 M, as indicated. Arrows mark the steps where, and what, competitor DNA was added to the incubation mixtures; competitor is carried along with unbound fractions but is freshly added to any bead eluates before the next selection cycle. Elution, when done, was with 0.5 M KCl; tightly bound proteins were recovered from the beads by boiling in gel loading buffer. The proteins indicated on the bottom of the diagram (with exception of RARα) were identified by MALDI-TOF MS. NE denotes nuclear extract; NB denotes not bound.
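The selection rules summarized in Figure 8 can be written down as a small decision function. The Python sketch below is illustrative only: the names `plan_capture` and `CapturePlan`, and the treatment of 300 and 500 mM as hard cutoffs, are assumptions made for this example rather than part of the published protocol. The number of negative rounds is passed in as an argument because, as stated in the text, it depends on TF abundance and is determined experimentally.

```python
from dataclasses import dataclass
from typing import Tuple

HIGH_SALT_MM = 500  # P11 elution at or above this: high-affinity, high-salt binder
LOW_SALT_MM = 300   # P11 elution at or below this: low-affinity, low-salt binder


@dataclass(frozen=True)
class CapturePlan:
    steps: Tuple[str, ...]   # ordered selections on wt ("positive") or mut ("negative") beads
    binding_salt_molar: float
    final_elution: str


def plan_capture(p11_elution_mm: float, negative_rounds: int) -> CapturePlan:
    """Order of selections for a factor eluting from P11 at the given NaCl (mM)."""
    if negative_rounds < 1:
        raise ValueError("at least one round of negative selection is expected")
    negatives = ("negative",) * negative_rounds
    if p11_elution_mm >= HIGH_SALT_MM:
        # "negative/positive" scheme, 0.5 M salt throughout, recovery by boiling
        return CapturePlan(negatives + ("positive",), 0.5, "boil in Laemmli buffer")
    if p11_elution_mm <= LOW_SALT_MM:
        # "positive/negative/positive" scheme at low salt, elution with 0.5 M salt
        return CapturePlan(("positive",) + negatives + ("positive",), 0.05, "0.5 M NaCl")
    raise ValueError("no rule is stated for elution between 300 and 500 mM")


# P11 elution salt (mM) and negative rounds as reported for the identified factors.
reported = {"GABP": (100, 1), "AP.1": (500, 1), "PUR": (500, 2), "PU.1": (850, 3)}
for factor, (salt_mm, rounds) in reported.items():
    print(factor, plan_capture(salt_mm, rounds).steps)
```

The sketch deliberately raises an error for factors eluting between 300 and 500 mM, since the text gives no rule for that range; salt tolerance would have to be tested by EMSA in such a case.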
respect to salt concentration, divalent metal ions, and competitor DNA requirements.1 As a rule, salt and competitor DNA should be as high as possible. Conversely, divalent cations should be avoided to protect the DNA on the beads from degradation by endonucleases.10 If absolutely necessary (e.g., in the case of Zn-finger proteins), the lowest concentration of metal that still maintains specific protein-DNA complex in EMSA should be determined.
Prospects for DNA Affinity Capture of Protein Complexes. The studies and general protocol described so far are primarily aimed at capturing single TFs on single DNA-binding sites. What if longer DNA promoter sequences were used as affinity ligands? Provided that separate binding sites for different TFs map within this sequence, and in the absence of any steric hindrance, one might expect that all will bind, possibly together with additional interacting proteins such as coactivators or repressors that do not make direct contact with DNA.1-3 In earlier studies, we were able to capture GABP and PU.1 simultaneously on an immobilized, concatamerized 120-bp stretch derived from the human defensin-1 gene promoter (Yaneva et al., manuscript in preparation). However, this could only be achieved from unfractionated nuclear extract as the two TFs are fully resolved on a P11 column (Figure 8), in effect precluding this elementary step. Not surprisingly, the resulting preparation was too impure for direct MALDI-TOF-based identification, and the presence of both proteins could only be confirmed by Western blotting. It seems unlikely, therefore, that
use of the protocol described in this report can be extrapolated to the general capture of multiple factors or complexes for straightforward MS analysis. Only in those selected cases where different factors would either cofractionate on P11 (or other resins) and/or stability of the complex would confer chromatographic comigration can we expect to cocapture proteins on DNA with reasonable efficiency and purity. Furthermore, gene transcription is regulated by multiple factors whose assembly is highly cooperative. Formation of these complexes takes place only at particular gene promoters and requires precise spatial interactions between the binding sites, the bound activators, and the coactivators involved in their regulation, a process that may be difficult to reproduce in vitro. For example, Drewett et al. were unable to attain quantitative formation of ternary complexes on immobilized target DNA.38 Two bona fide binding proteins, SRF and Elk-1, were identified by MS but only after prior overexpression in the cells from which they were captured; additionally bound proteins were all nonspecific. A more recent attempt to capture the yeast RNA polymerase II preinitiation complex on DNA-magnetic beads also demonstrated the low efficiency of the reconstitution process. Only an estimated 10% of the preinitiation complexes reconstituted on immobilized DNA templates were functional.39 Predominantly nonspecific proteins were captured on both control and experimental beads, requiring the use of a highly sophisticated, quantitative/comparative analysis technique to identify relevant proteins.40 A faithful and highly efficient reconstitution of the human transcriptional machinery that is suitable for MS analysis, especially in the context of the chromatin fiber, remains one of the unrealized challenges at present.
CONCLUSIONS
Today, mass spectrometry accounts for essentially all de novo protein identifications. For the most part, applications fall into one of two categories, either of the systems biology variety or standard identification of proteins isolated by traditional means such as multicolumn chromatography. These purification schemes can be very costly and time-consuming, often for the cell culturing alone. Yet, they typically yield only one or a few purified proteins, which can be identified within 1 day or so by a proteomics facility or by any laboratory with access to, and elementary expertise in the use of, a MALDI-TOF mass spectrometer. Biochemists and molecular biologists could take greater advantage of the much improved protein identifying capabilities and throughput if robust, efficient methods were available to scale down and accelerate the purification process, an effective bridge, as it were, between traditional life sciences research and the analytical power of proteomics. This has been particularly the case in the field of nucleic acid biochemistry, including the study of DNA-binding proteins that function in transcriptional regulation, replication, and maintenance of chromosomal integrity. Here, we describe an efficient, much accelerated method for affinity capture of transcription factors on specific DNA-magnetic particles, to yield final preparations in a form and amounts that are compatible with standard MALDI-TOF MS-based protein identification. 
(38) Drewett, V.; Molina, H.; Millar, A.; Muller, S.; Von Hesler, F.; Shaw, P. E. Nucleic Acids Res. 2001, 29, 479-487.
(39) Ranish, J. A.; Yudkovsky, N.; Hahn, S. Genes Dev. 1999, 13, 49-63.
(40) Ranish, J. A.; Yi, E. C.; Leslie, D. M.; Purvine, S. O.; Goodlett, D. R.; Eng, J.; Aebersold, R. Nat. Genet. 2003, 33, 349-355.
(41) Jackson, S. P.; Tjian, R. Proc. Natl. Acad. Sci. U.S.A. 1989, 86, 1781-1785.
A major obstacle to developing this approach into a widely applicable protocol was the inadvertent, extensive binding of nonspecific proteins to the bait. This problem was addressed at two levels. First, a single-batch fractionation on P11 serves to increase the relative concentration of the TF(s) of interest, removes several of the abundant nonspecific DNA-binding proteins, and may resolve multiple factors in different fractions which can subsequently be used for parallel affinity capture. Second, several counteracting measures were implemented. Protein fractions are
incubated with beads carrying concatamerized DNA-binding sites, and in the presence of short oligo(dI:dC) competitor, resulting in higher specific-ligand density and displacement of proteins that preferentially bind to DNA nicks and ends. Magnetic beads are collected from larger volumes on high-powered cobalt magnets. Preclearance with mutant DNA-beads (negative selection) greatly reduces the background of nonspecific proteins in the final preparation. The number of negative selections depends on the abundance and binding kinetics of the respective TFs and is determined experimentally. In the case of low-affinity binders, negative selection is preceded by a single positive selection, which results in much reduced protein mass and volume. Affinity capture of high-affinity binding TFs, on the other hand, can be performed in high salt, which increases the stringency of the procedure. This approach and protocol allowed capture of four distinct protein factors from comparatively small numbers of cultured human blood cells, easily permitting standard MALDI-TOF-based peptide mass fingerprinting.
A targeted proteomic analysis of this nature should always be preceded, and later also validated, by a thorough molecular biological analysis of the system under study. Proteomics is not a standalone science, and the field would benefit greatly if more projects were initiated by the traditional, hypothesis-driven research groups. The best way to popularize it among biologists and clinical scientists is by bringing a large portion of proteomic research activities into their own laboratories, leaving only the final read-out (i.e. protein identifications) to specialized facilities. The approach and protocol that are presented herein should be considered in that context. 
In fact, the benefits go well beyond convenience and the time and cost savings of having to culture fewer cells, in that we can now initiate related studies in cases of limited source material, such as specific tumor tissues.
ACKNOWLEDGMENT
The authors are greatly indebted to Arpi Nazarian and Hediye Erdjument-Bromage for all protein identifications by peptide mass fingerprinting and TOF/TOF mass spectrometric analysis. We thank Margaret McGarvey for technical assistance, Serena Kippenberger for sharing unpublished data, and Lynne Lacomis for help with preparing the figures. This work was supported by Developmental Funds from NCI Grant P30 CA08748.
Received for review June 26, 2003. Accepted September 24, 2003.
AC034698L