Anal. Chem. 1999, 71, 3894-3900
Technical Notes
Rapid Profiling of Induced Proteins in Bacteria Using MALDI-TOF Mass Spectrometric Detection of Nonporous RP HPLC-Separated Whole Cell Lysates
Daniel B. Wall,† David M. Lubman,*,† and Shannon J. Flynn‡
Department of Chemistry, The University of Michigan, Ann Arbor, Michigan 48109-1055, and Center for Microbial Ecology, Michigan State University, 540 Plant and Soil Sciences Building, East Lansing, Michigan 48824-1325
A method for rapid profiling of water-soluble proteins from whole cell lysates has been developed using matrix-assisted laser desorption/ionization (MALDI) time-of-flight mass spectrometry (TOFMS) following separation by reversed-phase high-performance liquid chromatography (RP HPLC). Rapid separation of proteins from cell lysates was achieved using columns packed with C18 nonporous (NP) silica beads. Using this method, the water-soluble proteins of E. coli whole cell lysates were separated in under 15 min. Two columns in series at different temperatures were used to provide high loadability without loss of separation efficiency. The nonporous packing in the columns provided for high recovery. Eluting fractions were collected and analyzed by MALDI-TOFMS to determine the molecular weights and peptide maps of the proteins. These methods provided for the rapid screening and identification of proteins from E. coli, where the response of E. coli to L-arabinose induction was studied. In this work, it is demonstrated that NP RP HPLC with MALDI-TOFMS detection may serve as a rapid means of detecting and identifying changes in bacterial protein expression due to external stimuli.

The development of analytical tools for rapid analysis and identification of expressed protein profiles in cells and tissues is currently an important area in biological research.1,2 A particularly important problem is the development of such methodology for profiling changes in protein expression in bacteria due to various external stimuli.1-3 The rapid profiling of bacterial protein expression currently finds use in diverse applications, ranging from the response of bacteria to external stimuli for environmental applications1 to the response of bacteria to pharmaceutical agents,1,2 antibiotics,1,2 and disinfectants.2,3 Such protein profiling methods must provide rapid identification of changes in the level of protein expression or the presence of new proteins. These methods must be able to provide such information on many proteins together so that changes in cell mechanism that involve the expression of several proteins can be monitored. Further, such methods must provide a means for identification of these expressed proteins.

The technique generally used to monitor cellular protein expression has been 2-D polyacrylamide gel electrophoresis (2D-PAGE).4,5 This method can separate hundreds of proteins, and the 2-D spot pattern provides a reference in which changes in the pattern are indicative of changes in protein expression. The relevant information obtained from 2-D gels is determined from differences observed in protein spot positions or intensities between gels, which can now be conveniently obtained from image analysis.5 The 2-D gel method, though, only provides a separation of the cell components and an approximate molecular weight, while exact identification must still be provided by Edman sequencing or mass spectrometric methods.6-16 Alternative methods to 2D-PAGE are desirable since the method is labor intensive and slow and often the reproducibility is poor. In addition, the method is not readily amenable to automation.

† The University of Michigan.
‡ Michigan State University.
(1) Kahn, P. Science 1995, 270, 369-371.
(2) Cash, P. Anal. Chim. Acta 1998, 372, 121-145.
(3) Dukan, S.; Turlin, E.; Biville, F.; Bolbach, G.; Touati, D.; Tabet, J. C.; Blais, J. C. Anal. Chem. 1998, 70, 4433-4440.
(4) O’Farrell, P. H. J. Biol. Chem. 1975, 250, 4007-4021.
(5) Strahler, J. R.; Kuick, R.; Hanash, S. M. In Protein Structure. A Practical Approach; Creighton, T., Ed.; IRL Press: Oxford, England, 1989; pp 231-266.
(6) Aebersold, R.; Leavitt, J.; Saavedra, R. A.; Hood, L. E.; Kent, S. B. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 6970-6974.
(7) Henzel, W. J.; Billeci, T. M.; Stults, J. T.; Wong, S. C.; Grimley, C.; Watanabe, C. Proc. Natl. Acad. Sci. U.S.A. 1993, 90, 5011-5015.
(8) Liang, X.; Bai, J.; Liu, Y. H.; Lubman, D. M. Anal. Chem. 1996, 68, 1012-1018.
(9) Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M. Anal. Chem. 1996, 68, 850-858.
(10) Courchesne, P. L.; Luethy, R.; Patterson, S. D. Electrophoresis 1997, 18, 369-381.
(11) O’Connell, K. L.; Stults, J. T. Electrophoresis 1997, 18, 349-359.
(12) Eckerskorn, C.; Grimm, R. Electrophoresis 1996, 17, 899-906.
(13) Schuhmacher, M.; Glocker, M. O.; Wunderlin, M.; Przybylski, M. Electrophoresis 1996, 17, 848-854.
(14) Packer, N. H.; Pawlak, A.; Kett, W. C.; Gooley, A. A.; Redmond, J. W.; Williams, K. L. Electrophoresis 1997, 18, 452-460.
(15) Loo, R. R.; Stevenson, T. I.; Mitchell, C.; Loo, J. A.; Andrews, P. C. Anal. Chem. 1996, 68, 1910-1917.
(16) Cohen, S. L.; Chait, B. T. Anal. Biochem. 1997, 247, 257-267.
10.1021/ac990120t CCC: $18.00 © 1999 American Chemical Society. Published on Web 07/30/1999
3894 Analytical Chemistry, Vol. 71, No. 17, September 1, 1999
An alternative to 2D-PAGE for the profiling of proteins from bacteria is the use of liquid chromatography separation methods. HPLC provides a highly reproducible means of separating and isolating proteins in the liquid phase. In previous work, a traditional RP C18 column was used to separate proteins from whole cell lysates of bacteria17-20 and from human lymphocyte nuclei.21 The protein fractions eluting from the column could be analyzed by either on-line electrospray (ESI)-MS or off-line matrix-assisted laser desorption/ionization (MALDI). In recent work,22-27 the use of nonporous C18-coated silica-based packing materials has been shown to provide distinct advantages in the separation of proteins compared to porous packing materials. The separation of complex protein mixtures in cells can be accomplished in a third of the time and with much improved resolution relative to separations using porous packing material. Nonporous reversed-phase (NP RP) high-performance liquid chromatography (HPLC) has recently been used to separate over 100 protein peaks up to 30 kDa from human erythroleukemia cell lysates in under 30 min.28 In addition, the use of nonporous packing eliminates porosity and therefore minimizes loss of protein to irreversible adsorption within the pores. This provides much enhanced protein recovery, a greater speed of separation, and a decrease in chemical noise,29 especially for mass spectrometric detection, as recently shown in the work of Banks and Gulcicek.24

In this work, we use RP HPLC columns packed with NP C18 silica beads to rapidly separate water-soluble proteins in Escherichia coli whole cell lysates. In this case, we examine the E. coli response to L-arabinose induction as a model system that has been previously well characterized by protein and DNA analysis.30-37 It is shown that, by using a two-column separation with relatively short columns held at different temperatures, rapid separation can be achieved with high efficiency and high recovery of proteins in the liquid phase for MALDI-MS analysis. The eluting fractions were collected and analyzed by MALDI time-of-flight mass spectrometry (TOFMS) to determine the molecular weights of the proteins and for peptide mapping of the tryptic and Glu-C digests. The peptide maps generated were then entered into the MS-Fit program, along with molecular weight information, to search the NCBInr and OWL databases for the identity of target proteins. Although often more than one protein was present in each fraction, these methods provided for rapid detection and identification of proteins from E. coli lysates. Six of the L-arabinose-induced proteins were detected and identified from the L-arabinose-induced E. coli. It is shown that NP RP HPLC with MALDI-TOFMS detection may serve as a rapid means of detecting and identifying changes in bacterial protein expression due to external stimuli.

(17) Liang, X.; Zheng, K.; Qian, M. G.; Lubman, D. M. Rapid Commun. Mass Spectrom. 1996, 10, 1219-1226.
(18) Opitek, G. J.; Lewis, K. C.; Jorgenson, J. W. Anal. Chem. 1997, 69, 1518-1594.
(19) Griffin, P. R.; MacCoss, M. J.; Eng, J. K.; Blevins, R. A.; Aaronson, J. S.; Yates, J. R. Rapid Commun. Mass Spectrom. 1995, 9, 1546-1541.
(20) Opiteck, G. J.; Ramirez, S. M.; Jorgenson, J. W.; Mosely, M. A. Anal. Biochem. 1998, 258, 344-361.
(21) Nilsson, C. L.; Murphy, C. M.; Ekman, R. Rapid Commun. Mass Spectrom. 1997, 11, 610-612.
(22) Nimura, N.; Itoh, H.; Kiroshita, T. J. Chromatogr. 1991, 585, 207-211.
(23) Itoh, H.; Nimura, N.; Kiroshita, T.; Nagae, N.; Nomura, M. Anal. Biochem. 1991, 199, 7-10.
(24) Banks, J. F.; Gulcicek, E. E. Anal. Chem. 1997, 69, 3973-3978.
(25) Barder, T.; Wohlman, P.; Thrall, C.; DuBois, P. D. LC-GC 1997, 15, 918-925.
(26) Chen, Y.; Wall, D.; Lubman, D. M. Rapid Commun. Mass Spectrom. 1998, 12, 1994-2003.
(27) Chong, B. E.; Lubman, D. M.; Rosenspire, A.; Miller, F. Rapid Commun. Mass Spectrom. 1998, 12, 1986-1993.
(28) Tonella, L.; Walsh, B. J.; Sanchez, J.; Keli, O.; Wilkins, M. R.; Tyler, M.; et al. Electrophoresis 1998, 19, 1960-1971.
(29) Nimura, N.; Itoh, H. Mol. Biotechnol. 1996, 5, 11-16.
(30) Englesberg, E. J. Bacteriol. 1961, 81, 996.
(31) Englesberg, E.; Anderson, R. L.; Weinberg, R.; Lee, N.; Hoffee, P.; Huttenhauer, G.; Boyer, H. J. Bacteriol. 1962, 84, 137.
(32) Scripture, J. B.; Voelker, C.; Miller, S.; O’Donnell, R. T.; Polgar, L.; Rade, J.; Horazdovsky, B. F.; Hogg, R. W. J. Mol. Bacteriol. 1987, 197, 37.
(33) Reeder, T.; Schleif, R. J. Bacteriol. 1991, 173, 7765.
(34) Lee, N.; Gielow, W.; Martin, R.; Hamilton, E.; Fowler, A. Gene 1986, 47, 231.
(35) Patrick, J.; Lee, N. J. Biol. Chem. 1968, 16, 4312.
(36) Miyada, C. G.; Horwitz, A. H.; Cass, L. G.; Timko, J.; Wilcox, G. Nucleic Acids Res. 1980, 8, 5267.
(37) Maiden, M. C.; Jones-Mortimer, M. C.; Henderson, P. J. J. Biol. Chem. 1988, 263, 8003.

EXPERIMENTAL SECTION
Sample Preparation for RP HPLC. The E. coli K12 (DH5α) used harbored the pBAD-GFP expression vector, a derivative of the pBAD18 vector,38 containing the cycle three mutant form of the green fluorescent protein39 (GFP). The cells were prepared as in previous work40 and ultimately produced as cell pellets, which were stored at -80 °C. The cell pellets were then resuspended in 1 mL of 50 mM tris (ICN, ultrapure) on ice and vortexed. All water used from this point forward was filtered to Milli-Q purity (Millipore Corp.). The cell suspension was sonicated, on ice, twice for 30 s. The mixture was inspected visually to ensure that the cloudy solution became clear, indicating effective cell disruption. This solution was then centrifuged at 28 400g (Hermle Z252M Labnet centrifuge) for 10 min. The supernatant containing the water-soluble proteins was collected and placed on ice.
This solution was mixed in a 1:1 ratio with a solution of 8 M urea (ICN, ultrapure) and 20 mM tris for a final sample of proteins in 4 M urea and 35 mM tris. This solution was injected into the RP HPLC system for separation. The volume of injection was typically from 500 to 1000 µL. The amount of protein in the protein extract was determined by use of the dotMETRIC protein assay (Geno Technology, Inc., St. Louis, MO). This assay is capable of ±5% mass accuracy, and the results are independent of the type of proteins present.

RP HPLC Separation. Separations were performed at a flow rate of 1.0 mL/min on two short analytical (4.6 × 53 mm, 4.6 × 14 mm) RP HPLC columns containing 3- and 1.5-µm C18 (ODSI) nonporous silica beads (Micra Scientific Inc.), respectively. The columns were connected in series starting with the 4.6 × 53 mm column followed by the 4.6 × 14 mm column. The longer first column was placed in a Timberline column heater and maintained at 65 °C while the shorter second column was placed just outside the heater. Sample eluting from the first column (65 °C) was transported directly to the second column (25 °C) by a 10 cm section of 0.01-in.-i.d. stainless steel tubing. The second column, at a lower temperature, exhibited relatively longer retention times than the first column. The separations were performed using water/acetonitrile (0.1% TFA) gradients. The gradient profile used was as follows: (1) 0-30% acetonitrile (solvent B) in 3 min, (2) 30-50% B in 10 min, (3) 50-100% B in 0.5 min, (4) 100% B for 2 min, and (5) 100-0% B in 0.5 min. The starting point of this profile was 1.58 min due to a 1.58-min dwell time. The acetonitrile was 99.93+% HPLC grade from Sigma and the TFA was from 1-mL sealed glass ampules (Sigma). The first column was heated in the Timberline column heater, which was equipped with a precolumn solvent heater to ensure that both the solvent and column were at the designated temperature. Gradient times were minimized to achieve rapid separations while still providing adequate resolving power to separate large numbers of bacterial proteins. The HPLC instrument used was a Beckman model 127s/166 system. Peaks were detected by absorbance at 214 nm in a 15-µL analytical flow cell. Protein standards used for estimation of loading on the first and second columns were from Sigma Corp. and included cytochrome c (bovine), lysozyme (chicken egg), carbonic anhydrase, and bovine serum albumin. The amount of protein loaded on the first column was estimated by use of calibration curves that correlate peak area to mass of protein injected as well as by dotMETRIC analysis of the cell extract solution. Protein loading on the second column was significantly less than on the first column and was determined as explained in the Results and Discussion section.

(38) Guzman, L.; Medlin, D.; Carson, M. J.; Beckwith, J. J. Bacteriol. 1995, 177, 4121.
(39) Crameri, A.; Whitehorn, E. A.; Tate, E.; Stemmer, P. C. Nature Biotechnol. 1996, 14, 315.
(40) Chong, B. E.; Wall, D. B.; Lubman, D. M.; Flynn, S. J. Rapid Commun. Mass Spectrom. 1997, 11, 1900-1908.

MALDI-TOFMS of RP HPLC Protein Isolates.
MALDI-TOFMS was used to determine molecular weights of collected proteins as well as the masses of peptides from protein digests as described in previous work.26 Collection of proteins used for molecular weight information was performed directly onto the MALDI-TOFMS probe tips, which were precoated with nitrocellulose (NC) (Immobilon-NC Pure, Millipore) to assist desalting of the samples.41-43 Typically 2 µL of NC at 18 mg/mL in acetone was used to coat the tip. A 3-µL aliquot of 0.1% TX-100 detergent (ICN) was added to the tip just prior to collection of the protein fractions. The collection volume was limited to 40-50 µL of effluent per tip. At a flow rate of 1 mL/min, this translates to 3-s collection windows, making the timing of the collection critical. Most peaks were from 6 to 12 s wide and only the most concentrated 3-s part of the peak was collected. A 5-µL aliquot of the MALDI matrix α-cyano-4-hydroxycinnamic acid (α-CHCA) (Sigma) in a saturated solution of acetone (1% TFA) was then mixed with the collected fraction (total volume, 50 µL). The sample was air-dried and then introduced into the TOF instrument for molecular weight determination.

After MALDI-MS analysis was used to determine molecular weights for selected peaks from the RP HPLC separation, the proteins were collected a second time for enzymatic digestion. The fractions were collected into 1.5-mL polypropylene tubes, and the window of collection was expanded to cover the entire peak, which was then dried in a Centrivap concentrator Speedvac (Labconco Corp., Kansas City, MO) for 2 h at 45 °C. A 1 M NH4HCO3 (ICN) solution was added that resulted in a solution of 50-100 mM NH4HCO3. 1,4-Dithiothreitol (DTT, ICN) was then added to a final concentration of 1 mM. After vortexing, 1 µg of enzyme was added. The solution was vortexed a final time and then placed in a 37 °C warm room for 18 h. The enzymes used were either trypsin (Promega, TPCK treated), which cleaves at the carboxy side of the arginine and lysine residues, or Glu-C (Promega), which in 50-100 mM NH4HCO3 solution cleaves at the carboxy side of the glutamic acid residues. These whole digests were then used for identification of the proteins by MALDI-MS.

RESULTS AND DISCUSSION
Figure 1 shows a separation of the water-soluble proteins from an E. coli K12 whole cell lysate using the two-column nonporous reversed-phase HPLC setup. These nonporous silica bead packed columns have the advantage in this work of eliminating nonspecific adsorption and of providing rapid separation of proteins. One of the advantages of using two columns in series is that, since each column is held at a different temperature, the selectivity for each separation will be different.44-47 The first separation in the longer column is performed at 65 °C in order to maximize separation efficiency and protein recovery. The second separation in the shorter column was slowed by being performed at a lower temperature. Proteins that coelute under the conditions of the first column then adsorb to the second column and experience a second separation under the new set of conditions therein. The significantly lower sample loading of the second column provides for a more efficient separation.

(41) Bai, J.; Liu, Y.-H.; Cain, T. C.; Lubman, D. M. Anal. Chem. 1994, 66, 3423.
(42) Preston, L. M.; Murray, K. K.; Russel, D. H. Biol. Mass Spectrom. 1993, 22, 544.
(43) Liu, Y.-H.; Bai, J.; Liang, X.; Lubman, D. M. Anal. Chem. 1995, 67, 3482.
(44) Wolcott, R. G.; Dolan, R. W. LC-GC 1998, 16, 1080-1083.
(45) Chen, H.; Horvath, Cs. J. Chromatogr. 1995, 705, 3.
(46) Ooma, B. LC-GC 1996, 14, 306.
(47) Antia, F.; Horvath, Cs. J. Chromatogr. 1988, 435, 1.

Figure 1. Discontinuous nonporous RP HPLC separations of whole cell protein lysates from L-arabinose-induced (A) and uninduced (B) E. coli K12 bacteria.

Table 1. L-Arabinose-Induced E. coli Protein Profile
fraction no.  MW predicted  MW exper       U (also Un)a  name and description
1             14 856.3      14 646         U9            30S ribosomal subunit protein S9
1             19 725.2      19 506         U1            insertion element IS 150 ORF A
2             11 564.4      11 564                       50S ribosomal subunit protein L21
2             20 301.7      20 604         U3            50S ribosomal subunit protein L5
3             14 284.3      14 396         U4            YFID
4             22 243.6      22 424         U3            protein L3
4             36 423.8      36 502         U15           reverse transcriptase
5             26 884.5      26 759                       green fluorescent protein GFP
5             33 384.1      33 318                       arabinose operon regulatory protein AraC
5             35 541.1      35 659                       L-arabinose binding periplasmic protein AraF
6-8           26 867.3      26 938                       GFPuv
6-8           55 018.5      53 876                       L-arabinose transport ATP binding protein AraG
6-8           55 041.9      53 876                       high-affinity ribose transport protein
6-8           35 556.1      35 800                       L-arabinose binding preprotein (AA -23 to 306) AraFGH
10            26 460.8      26 930         U11           CsoB gene
10            19 869.5      20 050         U12           colicin E4
10            28 738.3      28 720                       tryptophan synthase α subunit
11            26 433.5      26 668         U11           phosphate regulon transcriptional regulatory protein PhoB
16            56 103.3      57 200                       operon araA
a Proteins labeled Un were also found in the protein profile from the uninduced E. coli (Table 2, fraction n).

Figure 2. MALDI-MS of proteins in fractions 5 (A) and 10 (B) from the RP HPLC separation of the L-arabinose-induced E. coli whole cell lysate.

Another advantage of using two columns connected in series is that the amount of protein that can be loaded on the first column can be quite high without loss of separation efficiency. This is because the efficiency of the separation will ultimately depend on the low loading of the second column. The amount of protein loaded for these analytical-scale separations, as determined by the dotMETRIC assay, was from 290 to 360 µg. At any given time during the separation, the second column will only be loaded by a small fraction of the proteins from the initial injection. This is because only a small fraction of the proteins will be eluting from the first column to the second column at any given time, and these proteins only remain on the second column for a short period of time before eluting again. To determine the amount of protein that is loaded onto the second column at any given time, it is necessary to determine the average time that the typical protein spends adsorbed to the second column. Once this is known, the amount of protein that may build up on the second column at any given time can also be found. This time is determined by comparing chromatograms from separations where only the 4.6 mm × 53 mm column (65 °C) was used with those where the 4.6 mm × 14 mm column (25 °C) was added in series. The increase in retention time observed with the addition of the second column may be attributed to two factors. One factor is the increase in dwell time, which in this case is 16 s. The second factor is the time that the typical protein spends adsorbed to the second column. For cytochrome c and bovine serum albumin, the extra time spent adsorbed to the second column was 15 and 35 s, respectively. If we assume that 290 µg of protein is loaded initially and that this mass is separated over 12 min (720 s), then time windows of 15 and 35 s equate to (15/720) × 290 µg and (35/720) × 290 µg, respectively. Therefore, the typical loading of the second column ranges from 6 to 14 µg at any given time.
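The loading estimate above can be reproduced in a few lines. The sketch below assumes, as the text does, that the injected mass elutes uniformly over the 12-min separation; the function name `second_column_load_ug` is illustrative only and is not taken from the original work.

```python
def second_column_load_ug(total_load_ug, separation_s, residence_s):
    """Mass sitting on the second column at any instant.

    Assumes the injected protein leaves the first column at a constant
    rate over the whole separation (the simplification made in the text).
    """
    return total_load_ug * residence_s / separation_s


TOTAL_LOAD_UG = 290.0      # protein injected, from the dotMETRIC assay
SEPARATION_S = 12 * 60     # 12-min separation window

# Extra time each standard spends adsorbed to the second column (s)
residence_times_s = {"cytochrome c": 15, "bovine serum albumin": 35}

loads_ug = {
    name: second_column_load_ug(TOTAL_LOAD_UG, SEPARATION_S, t)
    for name, t in residence_times_s.items()
}

for name, load in loads_ug.items():
    print(f"{name}: {load:.1f} ug on the second column")
```

The two results bracket the 6 to 14 µg range quoted above.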
Under the conditions used in this work, 290 µg of water-soluble bacterial proteins could be separated into at least 60 peaks in less than 15 min, allowing for adequate amounts of protein in each peak for molecular weight and proteolytic digest analyses. The average amount of protein per peak then is 290 µg/60 peaks or 4.8 µg of protein/peak, assuming 100% recovery. The actual percent recovery of proteins for this two-column system was mass dependent. The following percent recoveries were observed. For cytochrome c (12.4 kDa), lysozyme (14.4 kDa), carbonic anhydrase (29 kDa), and BSA (67 kDa), the percent mass recoveries were 94.5, 97.2, 86.3, and 61.2%, respectively. Using a linear fit to these data to generate an equation that relates percent recovery to molecular weight (percent recovery = 1.041 - MW × 6.367 × 10^-6, with MW in Da and recovery expressed as a fraction; R = 0.97, SD = 0.016, and N = 4), it is possible to determine the average amount of protein recovered from a whole cell lysate. Assuming that the average protein was 40 kDa, the average percent recovery would be ~79%. Therefore, if 290 µg of protein is injected, 228 µg will be recovered, providing ~3.8 µg of protein/peak. The amount of material that can be loaded onto and recovered by the nonporous separations represents a significant advantage over the 2-D gel, which in preparatory mode typically provides for the loading of only 200 µg of protein. The RP HPLC system used in this work is considered to be analytical in scale, not preparatory, and so significant increases in loading could be obtained with a shift to larger diameter columns. The high loadability and recovery of this system is particularly relevant to proteomics as it relates to the amount of protein in a given fraction that is available for molecular weight, digest, and detailed sequencing analyses.

Table 2. L-Arabinose Uninduced E. coli Protein Profile
fraction no.  MW predicted  MW exper       I (also In)a  name and description
1             19 725.2      19 296         I1            insertion element IS150 ORF A
2             26 228.9      26 298                       sugar fermentation stimulation protein
2             24 160.9      24 264                       PAA4 ECOLI.RESOLVASE
3             22 243.6      22 260         I4            protein L3
3             16 018.6      15 810                       50S ribosomal subunit protein L13
3             20 301.7      20 316         I2            50S ribosomal subunit protein L5
4             14 284.3      14 283         I3            YFID
4             25 578.2      25 388                       pspA protein
4             25 493.1      25 388                       phage shock protein A
5             37 660.9      15 000-40 000                peptide transport system ATP binding protein SapD
6             29 028.9      29 060                       adhesin F41
6             29 000.9      29 060                       Fim41a protein precursor (AA -22 to 255)
6             23 581.4      23 182                       gutQ gene product (function unknown) (AA 1-223)
6             19 400.0      19 810                       NADH dehydrogenase
7             19 422.2      19 000                       OraA
7             17 688.3      18 000                       histone-like protein Hlp-1 precursor
7             17 772.4      18 000                       cationic outer membrane protein precursor
8             34 186.3      15 000-40 000                motility protein B
9             14 856.3      14 922         I1            30S ribosomal subunit protein S9
9             29 860.6      29 844                       50S ribosomal subunit protein L2
10            27 891.3      28 231                       ECOLI.DNA replication protein DNAC
10            15 563.2      15 279                       traM gene product
11            26 460.8      26 804         I10           CsoB gene
11            26 433.5      26 804         I11           phosphate regulon transcriptional regulatory protein PhoB
11            21 152.5      21 224                       RES4 ECOLI.recombinase
12            33 366.9      33 147                       ATP phosphoribosyltransferase
12            22 121.7      22 098                       colicin A 20 000 fragment
12            19 869.5      20 000         I10           colicin E4
13            29 860.6      29 726                       50S ribosomal subunit protein L2
14            23 630.5      23 630                       TraW
15            36 423.8      36 730         I4            reverse transcriptase
16            43 313.8      43 780                       elongation factor EF-Tu (duplicate gene)
a Proteins labeled In were also found in the protein profile from the induced E. coli (Table 1, fraction n).

In this work, NP RP HPLC was used to rapidly separate proteins from whole cell lysates of E. coli grown in rich medium and in rich medium supplemented with L-arabinose to detect the expression of the Ara proteins. To demonstrate induction of the ara genes, these cells were transformed with a plasmid, pBAD-GFP, that carries a copy of the GFP gene under the control of the arabinose promoter. The process required to metabolize L-arabinose results in the production of specific proteins associated with this process. The result, as shown in the NP RP HPLC separation of bacterial whole cell lysates from the induced (Figure 1a) versus the uninduced sample (Figure 1b), is an alteration of the protein profiles. Some similarities are present, but new protein peaks also appear in the induced form, the most notable being fractions 5-8 and 16 (Figure 1a). The appearance of many similar peaks is expected since these protein profiles are both from E. coli K12 (DH5α). The appearance of new peaks was due to proteins expressed for the catabolism of L-arabinose. In Figure 1a, the very large peak at a retention time of 7.0-7.1 min is composed mostly of the GFP marker that is expressed in these bacteria upon induction of L-arabinose catabolism. In addition, several large peaks, labeled by arrows in the chromatogram, indicate the expression of Ara proteins needed for L-arabinose catabolism. Fraction 5 contains AraC, AraF, and GFP; fractions 6-8 contain AraG, AraFGH, and GFP; and fraction 16 contains AraA.

The ara regulon that these E. coli harbor contains nine genes that encode the proteins necessary for L-arabinose catabolism48 as well as the GFP gene.39 The araA, araB, and araD genes code for the proteins that convert L-arabinose to D-xylulose 5-phosphate. The araE and araFGH genes encode four proteins for the low- and high-affinity uptake, respectively, of arabinose into the cell. The araC gene codes for the protein that regulates the expression of at least six of the ara genes. The araJ gene codes for a protein that has no known function. The proteins targeted in this study were L-arabinose induced and water soluble. These proteins include AraC, a cytosolic homodimer protein that regulates ara expression; AraA, AraB, and AraD, which provide for the catabolism of L-arabinose in the cell; AraFGH, a precursor to AraF; AraF, a periplasmic high-affinity L-arabinose membrane transport system protein; and AraG, an inner membrane-associated L-arabinose transport ATP binding protein. AraH is a hydrophobic protein, not a water-soluble protein, and so was not targeted. AraE is a transmembrane protein that previous experiments have failed to identify, and AraJ is a hydrophobic protein in low abundance.

(48) Neidhardt, F. C.; Ingraham, F. L.; Low, B.; Magasanik, B.; Schaecter, M. In Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology; Umberger, H. E., Ed.; American Society for Microbiology: Washington DC, 1987; pp 1473-1481.

Previous work with 2-D gels of arabinose-induced cytoplasm and membrane fractions has been able to show the expected synthesis of AraB, AraA, AraD, AraC, AraF, and AraG proteins. The araE encoded protein was not identified and no other arabinose-induced proteins were seen.49 In this work using the two-column HPLC method, we were able to identify four out of the six Ara proteins observed in the gel work (AraA, AraC, AraF, AraG) as well as one AraF precursor protein not seen in the gel work (AraFGH). Of the normally expressed proteins, only nine were found common to both the induced and uninduced protein profiles. This is because the induced cells' translational machinery was overloaded with the task of making the L-arabinose-induced proteins.

In Figure 1, over 60 peaks are observed in each of the induced and uninduced E. coli cell lysate chromatograms. On the basis of results from the 16 peaks that were collected and analyzed by MALDI-MS to determine the identity of the proteins therein, each peak contained on average at least two identifiable proteins, most of which were under 40 kDa.
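The recovery and per-peak figures given earlier in this section, together with the two-proteins-per-peak observation above, can be collected into one short calculation. This is a sketch under stated assumptions: the slope of the linear fit is taken as 6.367 × 10^-6 per Da, which is the scaling that reproduces the ~79% recovery quoted for a 40 kDa protein, and the 40 kDa average mass is the text's own assumption. All variable and function names are illustrative.

```python
# Linear recovery model quoted in the text (fraction recovered vs. MW in Da).
# ASSUMPTION: slope is per dalton (6.367e-6); this scaling reproduces the
# ~79% recovery stated for a 40 kDa protein.
INTERCEPT = 1.041
SLOPE_PER_DA = 6.367e-6


def fractional_recovery(mw_da):
    """Estimated fraction of protein recovered from the two-column system."""
    return INTERCEPT - SLOPE_PER_DA * mw_da


injected_ug = 290.0        # protein injected
n_peaks = 60               # peaks resolved per chromatogram
proteins_per_peak = 2      # average found in the 16 analyzed fractions
average_mw_da = 40_000     # assumed average protein mass

recovery = fractional_recovery(average_mw_da)
recovered_ug = injected_ug * recovery
ug_per_peak = recovered_ug / n_peaks
min_proteins_observable = n_peaks * proteins_per_peak

print(f"recovery at 40 kDa: {recovery:.1%}")
print(f"recovered: {recovered_ug:.0f} ug, about {ug_per_peak:.1f} ug per peak")
print(f"proteins observable: at least {min_proteins_observable}")
```

Changing `average_mw_da` shows how strongly the per-peak amount depends on the assumed mass distribution of the lysate.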
On this basis, if 60 peaks can be resolved and collected, at least 120 proteins under 40 kDa can ultimately be observed and identified from the separation.27 This compares favorably with 2D-PAGE in a similar mass range, wherein approximately 120 proteins from an E. coli whole cell lysate, with molecular masses between 8 and 37 kDa and isoelectric points from 4 to 10, have been identified after separation by 2D-PAGE.28 In addition, the resolution of the HPLC separation in this mass range is far superior to that of the 2-D gel. To completely isolate individual proteins for sequencing purposes, an additional separation could be performed using a modified gradient, but this was not required for the purpose of this study. Also, the conditions of the first and second columns could be further optimized to maximize the peak capacity and provide for a more efficient separation. The run-to-run reproducibility of the current setup for HPLC separations is such that each peak could be repeatedly detected within 6 s or ~50% of the average peak width. The reproducibility is 0.83% of the total separation time, assuming a separation time of 12 min. The repeatability is especially significant for fraction collection and detection by MALDI-MS, where multiple runs and fraction collections may be performed on the same peak for digest analysis or further LC/ESI-MS experiments.

(49) Kolodrubetz, D.; Schleif, R. J. Bacteriol. 1981, 148, 472-479.

The MALDI-MS spectra of several of the proteins collected from the HPLC separation are shown in Figure 2. Figure 2a shows the MALDI-MS of fraction 5 and Figure 2b the MALDI-MS of fraction 10, both collected from the induced E. coli. Several peaks are observed in each spectrum due to the presence of multiply charged species and to dimers that often result from the use of the α-CHCA matrix. Also, many fractions appeared to contain more than one protein. The experimentally determined molecular weights of each protein for the induced and uninduced E. coli are tabulated in Tables 1 and 2, respectively. These molecular weights are obtained by direct MALDI-MS analysis of the eluting fractions from a particular RP HPLC separation. The ability to perform MALDI-MS on large proteins collected directly from the liquid phase provides a relatively easy method for molecular weight sizing with minimal sample preparation and excellent mass accuracy relative to 2D-PAGE.

To identify selected proteins, peptide maps were generated by enzymatic digestion using trypsin for the uninduced E. coli proteins and both trypsin and Glu-C digestion for the induced E. coli proteins. The protein digests were then analyzed by MALDI-MS, and an MS-Fit database search of the digest peaks observed was performed using the NCBInr and OWL databases. The search used the molecular weight of the protein, the species, and the enzymatic peptide mass maps to identify the proteins. This was performed, for example, for a tryptic digest of fraction 5 from the induced form of E. coli. The database search in this case shows that the digest products of three proteins are present in this fraction isolated by HPLC separation. The proteins identified in this peak are GFP, AraC, and AraF. The masses of these proteins had been determined by MALDI-MS of this fraction before digestion as 26 759, 33 318, and 35 659 Da, respectively. Tables 1 and 2 list the results of the MALDI-MS analysis of selected liquid fractions collected after separation of whole cell lysates by NP RP HPLC.
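The peptide mass mapping step can be illustrated with a toy example. What follows is only a sketch of the general idea (in-silico digestion followed by mass matching); it is not the MS-Fit algorithm. The two candidate sequences are made up, and the 0.5 Da tolerance, the simulated 0.1 Da measurement offset, and all function names are hypothetical choices for illustration. The cleavage rule is the simple one stated in the Experimental Section (carboxy side of Lys/Arg for trypsin, Glu for Glu-C).

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def digest(sequence, cleave_after="KR"):
    """Cleave on the carboxy side of the given residues (trypsin: K, R; Glu-C: E)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def count_matches(observed, sequence, cleave_after="KR", tol_da=0.5):
    """Number of observed masses explained by a candidate's digest."""
    theoretical = [mh_plus(p) for p in digest(sequence, cleave_after)]
    return sum(any(abs(o - t) <= tol_da for t in theoretical) for o in observed)


# Hypothetical candidate sequences (not real database entries).
candidates = {
    "candidate_1": "MKTAYIAKQRLEWSGHK",
    "candidate_2": "MSGERTLLDKAVFQR",
}

# Simulated peak list: candidate_1 peptides with a 0.1 Da measurement offset.
observed = [mh_plus(p) + 0.1 for p in digest(candidates["candidate_1"])]

scores = {name: count_matches(observed, seq) for name, seq in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

A real search would also restrict candidates by intact molecular weight and species, as described above, and would have to tolerate missed cleavages and mixtures of several proteins in one fraction.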
In each case the molecular weight, as determined by MALDI-MS, is tabulated along with the tentative identity of the protein. In the case of the induced E. coli protein fractions, both trypsin and Glu-C digestion were performed on each fraction to increase the confidence of protein identification. An important goal of the work is to demonstrate the capabilities of the NP RP HPLC method for rapid separation of whole cell lysates with MALDI-MS detection for monitoring changes in cells due to external conditions. In this case, the aim is to detect and identify the arabinose-induced proteins that appear when E. coli are grown on LB with arabinose. The L-arabinose-induced proteins identified from the induced cell lysate are listed in Table 1 and include GFP, AraA, AraC, AraF, AraFGH, and AraG. The molecular weights are given both as the predicted molecular weight, calculated from the sequence of the genes, and as the experimental molecular weight of the proteins, determined in this work by MALDI-MS. As shown in Figure 1b and Table 2, these Ara protein peaks are absent from the uninduced spectrum. In addition, the detection of the GFP protein marker is an important indicator to confirm induction. Using the method described herein, we can confirm the identity and production of Ara proteins in response to growth on arabinose medium.
CONCLUSION
NP RP HPLC has been demonstrated as a means of rapidly separating proteins from whole cell lysates of bacterial cells. The proteins are collected in liquid fractions that are highly compatible with immediate MALDI-MS analysis and enzymatic digestion. In this work, whole cell lysates were separated on-line using two nonporous columns at different temperatures in order to provide high loadability and optimal separation efficiency for complex protein mixtures. The eluting proteins were collected and their molecular weights determined by MALDI-TOFMS; more than one protein was often present in each fraction. The fractions were then digested by trypsin or Glu-C, and the resulting peptide map was used in conjunction with molecular weight information for MS-Fit database searching in order to identify the proteins present. The method was used to identify six of the L-arabinose-induced proteins expressed in E. coli: AraA, AraC, AraFGH, AraF, AraG, and GFP. In addition, 18 normally expressed proteins from the induced protein profile and 33 from the uninduced protein profile were identified. It is shown that NP RP HPLC with MALDI-TOFMS detection may serve as a rapid means of detecting and identifying normally expressed proteins and changes in bacterial protein expression due to external stimuli.
ACKNOWLEDGMENT
We thank Micra Scientific, Inc. for the generous donation of an NP RP HPLC column and Tim Barder of Micra Scientific, Inc. for helpful suggestions during this work. We also thank Craig Criddle of Stanford University for bacterial samples and helpful suggestions. In addition, we thank James Tiedje from the Center for Microbial Ecology at Michigan State University for encouraging this work. We gratefully acknowledge partial support of this work by the National Science Foundation under Grant DEB912006 to the Center for Microbial Ecology at Michigan State University, by the U.S. Army ERDEC under contract DAAD05-98-9-0796, and by the National Institutes of Health under Grant R01GM49500.
Received for review February 3, 1999. Accepted June 8, 1999. AC990120T