Anal. Chem. 2001, 73, 978-986
Quantitative Proteomic Analysis Using a MALDI Quadrupole Time-of-Flight Mass Spectrometer

Timothy J. Griffin,*,† Steven P. Gygi,‡ Beate Rist,§ and Ruedi Aebersold†
Department of Molecular Biotechnology, University of Washington, Box 357730, Seattle, Washington 98195-7730

Alexander Loboda
MDS Sciex, 71 Four Valley Drive, Concord, Ontario, L4K 4V8, Canada

Alexandra Jilkine, Werner Ens, and Kenneth G. Standing
Department of Physics and Astronomy, University of Manitoba, Winnipeg, Manitoba, R3T 2N2, Canada
* Corresponding author: (e-mail) [email protected]; (fax) (206) 732-1299 (Institute for Systems Biology).
† Current address: Institute for Systems Biology, 4225 Roosevelt Way, NE Suite 200, Seattle, WA 98105.
‡ Current address: Harvard Medical School, Department of Cell Biology, 240 Longwood Ave., Boston, MA 02115.
§ Current address: BioVisioN GmbH and Co. KG, Hannover, Germany.
978 Analytical Chemistry, Vol. 73, No. 5, March 1, 2001

We describe an approach to the quantitative analysis of complex protein mixtures using a MALDI quadrupole time-of-flight (MALDI QqTOF) mass spectrometer and isotope coded affinity tag reagents (Gygi, S. P.; et al. Nat. Biotechnol. 1999, 17, 994-9.). Proteins in mixtures are first labeled on cysteinyl residues using an isotope coded affinity tag reagent, the proteins are enzymatically digested, and the labeled peptides are purified using a multidimensional separation procedure, with the last step being the elution of the labeled peptides from a microcapillary reversed-phase liquid chromatography column directly onto a MALDI sample target. After addition of matrix, the sample spots are analyzed using a MALDI QqTOF mass spectrometer, by first obtaining a mass spectrum of the peptides in each sample spot in order to quantify the ratio of abundance of pairs of isotopically tagged peptides, followed by tandem mass spectrometric analysis to ascertain the sequence of selected peptides for protein identification. The effectiveness of this approach is demonstrated in the quantification and identification of peptides from a control mixture of proteins of known relative concentrations and also in the comparative analysis of protein expression in Saccharomyces cerevisiae grown on two different carbon sources.

As the sequencing of the human genome nears completion, the focus of biological research on the molecular level is being shifted toward the functional analysis of the gene sequences being discovered. Essential to the functional analysis of biological systems is the ability to identify the proteins expressed in a particular cell or tissue and to quantify changes in their abundance resulting from external (e.g., environmental, pharmacological) or
internal (e.g., genetic, developmental) perturbations of the cell or organism. To this end, mass spectrometry has proven to be the most effective technology for the characterization of gene products at the protein level.2-6 Traditionally, methods for the routine quantification and identification of proteins contained in complex mixtures involved the separation of proteins by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), followed by their mass spectrometric (MS) analysis, most often using either nanoelectrospray tandem mass spectrometry (MS/MS)7,8 or reversed-phase microcapillary liquid chromatography (RP-µLC)9 in conjunction with electrospray (ESI) MS/MS analysis.10-16 More recently, alternative mass spectrometric approaches to determine quantitative profiles of complex protein mixtures have been developed that employ stable isotope labeling of proteins followed by mass spectrometric analysis.1,3,17-19 One such approach involves the specific labeling of proteins postisolation using a class of chemical reagents we have termed isotope-coded affinity tags (ICATs).1 In this method, the proteins in two samples (e.g., the proteins expressed by a cell under two different physiological conditions) are labeled separately on the side chains of their reduced cysteinyl residues using one of two isotopically different, but chemically identical sulfhydryl-reactive ICAT reagents (one being an isotopically “light” reagent, d(0), the other being a “heavy” reagent containing eight deuterium atoms on its carbon backbone, d(8)). The labeled protein mixtures are combined and enzymatically digested, and the labeled peptides are isolated by affinity chromatography, using the affinity tag (biotin group) that is part of the ICAT reagents. The selected peptides are analyzed by RP-µLC ESI tandem mass spectrometry, operated with automated, data-dependent ion selection for collision-induced dissociation (CID)20,21 and with dynamic exclusion.16,22-24 As the pairs of peptides labeled with the d(0) and d(8) versions of the ICAT reagent are chemically identical, they serve as mutual internal standards for accurate protein quantification. The relative quantity of each protein present in the two biological samples is therefore determined by measuring the relative signal intensities of pairs of isotopically labeled, concurrently eluting peptides using an initial MS scan. The identification of the proteins is accomplished by switching the instrument to MS/MS mode in which it selects peptides for CID. The CID mass spectra are then automatically correlated with sequence databases to identify the protein from which the selected peptide originated.25,26 This procedure, while automated and robust, has the limitation that, with current instrumentation operated in automated LC-MS/MS mode, peptide selection cannot be based on the abundance ratio. Therefore, during the analysis of complex protein mixtures, potentially large numbers of proteins are identified that do not show a quantitative change under the conditions tested and may thus be of limited interest for that particular experiment.

Recently, a prototype mass spectrometer has been described,27,28 consisting of a matrix-assisted laser desorption/ionization (MALDI) source, an analytical quadrupole (Q) and a collision cell (q), and an orthogonal injection time-of-flight (TOF) detector, to form what has been termed a MALDI QqTOF mass spectrometer. This instrument is capable of operating in either the MS mode, to acquire a TOF spectrum of intact peptides, or alternatively in MS/MS mode, where a precursor ion can be selected in the analytical quadrupole for CID in the collision cell.

(1) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-9.
(2) Yates, J. R., 3rd. Trends Genet. 2000, 16, 5-8.
(3) Gygi, S. P.; Aebersold, R. In Proteomics: A Trends Guide; Mann, M., Blackstock, W., Eds.; Elsevier: London, 2000; pp 32-7.
(4) Kuster, B.; Mann, M. Curr. Opin. Struct. Biol. 1998, 8, 393-400.
(5) Pandey, A.; Mann, M. Nature 2000, 405, 837-46.
(6) Arnott, D.; Shabanowitz, J.; Hunt, D. F. Clin. Chem. 1993, 39, 2005-10.
(7) Wilm, M.; Shevchenko, A.; Houthaeve, T.; Breit, S.; Schweigerer, L.; Fotsis, T.; Mann, M. Nature 1996, 379, 466-9.
(8) Wilm, M.; Mann, M. Anal. Chem. 1996, 68, 1-8.
(9) Deterding, L. J.; Moseley, M. A.; Tomer, K. B.; Jorgenson, J. W. J. Chromatogr. 1991, 554, 73-82.
(10) Hunt, D. F.; Henderson, R. A.; Shabanowitz, J.; Sakaguchi, K.; Michel, H.; Sevilir, N.; Cox, A. L.; Appella, E.; Engelhard, V. H. Science 1992, 255, 1261-3.
(11) Davis, M. T.; Stahl, D. C.; Hefta, S. A.; Lee, T. D. Anal. Chem. 1995, 67, 4549-56.
(12) McCormack, A. L.; Schieltz, D. M.; Goode, B.; Yang, S.; Barnes, G.; Drubin, D.; Yates, J. R., 3rd. Anal. Chem. 1997, 69, 767-76.
(13) Gygi, S. P.; Rochon, Y.; Franza, B. R.; Aebersold, R. Mol. Cell. Biol. 1999, 19, 1720-30.
(14) Haynes, P. A.; Fripp, N.; Aebersold, R. Electrophoresis 1998, 19, 939-45.
(15) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., 3rd. Nat. Biotechnol. 1999, 17, 676-82.
(16) Shabanowitz, J.; Settlage, R. E.; Marto, J. A.; Christian, R. E.; White, F. M.; Russo, P. S.; Martin, S. E.; Hunt, D. F. In Mass Spectrometry in Biology and Medicine; Burlingame, A. L., Carr, S. A., Baldwin, M. A., Eds.; Humana Press: Totowa, NJ, 2000; pp 163-77.
(17) Oda, Y.; Huang, K.; Cross, F. R.; Cowburn, D.; Chait, B. T. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 6591-6.
(18) Pasa-Tolic, L.; Jensen, P.; Anderson, G.; Lipton, M.; Peden, K.; Martinovic, S.; Tolic, N.; Bruce, J.; Smith, R. J. Am. Chem. Soc. 1999, 121, 7949-50.
(19) Geng, M.; Ji, J.; Regnier, F. E. J. Chromatogr., A 2000, 870, 295-313.
(20) Hunt, D. F.; Yates, J. R., 3rd; Shabanowitz, J.; Winston, S.; Hauer, C. R. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 6233-7.
(21) Papayannopoulos, I. A. Mass Spectrom. Rev. 1995, 14, 49-73.
(22) Gatlin, C. L.; Eng, J. K.; Cross, S. T.; Detter, J. C.; Yates, J. R., 3rd. Anal. Chem. 2000, 72, 757-63.
(23) Figeys, D.; Aebersold, R. Electrophoresis 1997, 18, 360-8.
(24) Courchesne, P. L.; Jones, M. D.; Robinson, J. H.; Spahr, C. S.; McCracken, S.; Bentley, D. L.; Luethy, R.; Patterson, S. D. Electrophoresis 1998, 19, 956-67.
(25) Yates, J. R., 3rd. Electrophoresis 1998, 19, 893-900.
(26) Eng, J.; McCormack, A. L.; Yates, J. R., 3rd. J. Am. Soc. Mass Spectrom. 1994, 5, 976-89.
(27) Shevchenko, A.; Loboda, A.; Shevchenko, A.; Ens, W.; Standing, K. G. Anal. Chem. 2000, 72, 2132-41.
(28) Loboda, A. V.; Krutchinsky, A. N.; Bromirski, M.; Ens, W.; Standing, K. G. Rapid Commun. Mass Spectrom. 2000, 14, 1047-57.

10.1021/ac001169y CCC: $20.00 © 2001 American Chemical Society. Published on Web 02/03/2001
The resulting fragment ion spectra are recorded by the TOF component and searched against sequence databases to ascertain the sequence of the precursor peptide and, consequently, to identify the protein from which it is derived. The effectiveness of the MALDI QqTOF mass spectrometer in the sequence identification of peptides has been shown.27,28

Here we describe the use of the MALDI QqTOF mass spectrometer in conjunction with the ICAT technology for the concurrent quantification and identification of the components of complex protein mixtures. The advantages of using the MALDI QqTOF instrument for such quantitative proteomics experiments include simplified quantification based on the high mass resolution and simple mass spectra (essentially only singly charged ions are detected in the MS spectrum) provided by this instrument, the ability to choose precursor ions for CID based on the ratio of abundance of a specific analyte in the two samples compared, and the ability to constrain sequence database searches with the highly accurate precursor ion mass measured by TOF detection. Using multidimensional chromatography in conjunction with the labeling of complex protein mixtures using ICAT reagents and analysis by MALDI QqTOF mass spectrometry, we demonstrate the effectiveness of this approach to the quantitative analysis of proteins expressed in yeast cells grown on the two carbon sources, ethanol and galactose. The results presented here show this approach to be well suited for quantitative proteomics, with great potential for automated, high-throughput applications.

EXPERIMENTAL SECTION

Materials and Reagents. The ICAT reagents used were synthesized as previously described.1,29 For all chromatographic steps, HPLC grade acetonitrile and MilliQ water (Millipore, Bedford, MA) were used. The MALDI matrix 2,5-dihydroxybenzoic acid (DHB) was purchased from Aldrich (Milwaukee, WI).

(29) Gerber, S. A.; Scott, C. R.; Turecek, F.; Gelb, M. H. J. Am. Chem. Soc. 1999, 121, 1102-3.

Analysis of a Control Mixture of ICAT Labeled Proteins. Two mixtures containing the same five standard proteins at different concentrations (detailed here as µg/mL mixture 1, µg/mL mixture 2) were prepared. The proteins were purchased from Sigma (St. Louis, MO), and the names of these proteins are shown along with their abbreviated names as given in the Swiss-Prot annotated protein sequence database (http://www.expasy.ch/sprot/): rabbit glyceraldehyde-3-phosphate dehydrogenase (G3P_RABIT) (40, 20); rabbit phosphorylase b (PHS2_RABIT) (60, 20); chicken ovalbumin (OVAL_CHICK) (30, 60); bovine β-lactoglobulin (LACB_BOVIN) (10, 40); bovine α-lactalbumin (LCA_BOVIN) (10, 10). The proteins were denatured and reduced to generate free sulfhydryl groups by treatment with 50 mM Tris buffer, pH 8.5/6 M guanidine hydrochloride/5 mM tributylphosphine for 1 h at 37 °C. Cysteinyl residues in each mixture were independently biotinylated with a 5-fold molar excess of either the isotopically light, d(0), or heavy, d(8), form of the ICAT reagent.1 After the two mixtures were combined, excess ICAT reagent was removed by gel filtration using Econo-Pac 10 DG columns (Bio-Rad, Richmond, CA) in 50 mM Tris buffer, pH 8.5, with 0.1% SDS. The protein mixture was digested with porcine trypsin (Promega, Madison, WI) overnight at 37 °C. The solution was then passed over a monomeric avidin column (Pierce,
Rockford, IL), prepared by following the product instructions provided by the manufacturer. The column was washed with water, and the biotinylated peptides were eluted with 0.3% formic acid into fractions of ∼1 mL. Approximately 10 pmol of total protein of the digested standard protein mix was loaded onto a 200 µm i.d. × 18 cm fused-silica capillary column packed in-house with Monitor 5-µm spherical silica C18 resin of 100-Å pore size (Column Engineering, Ontario, CA). The HPLC solvents used consisted of 5% acetonitrile/0.4% acetic acid/0.005% heptafluorobutyric acid (HFBA) for solvent A and 80% acetonitrile/0.4% acetic acid/0.005% HFBA for solvent B. A binary gradient from 15 to 50% B over 85 min was used to elute the peptides using an Integral HPLC workstation (PE Biosystems, Framingham, MA) flowing at 4 µL/min across the capillary column. The eluent was postcolumn flow split, so that ∼20% of the flow was directed into a Mariner ESI-TOF mass spectrometer (PE Biosystems) and the remaining flow was manually fraction collected in 1-min time intervals into a Nunc Microwell (Fisher Scientific, Rockford, IL) microtiter plate. The flow into the ESI-TOF mass spectrometer was used to monitor the elution of the peptides during the gradient. The dried fractions contained in each microtiter well were redissolved and spotted onto a sample plate containing preformed DHB matrix crystals at 160 mg/mL in 1:3 acetonitrile/water, for subsequent analysis by MALDI QqTOF mass spectrometry.

Analysis of ICAT Labeled Proteins from Yeast. Logarithmically growing cells from Saccharomyces cerevisiae utilizing either 2% galactose or 2% ethanol as a carbon source in YP media were harvested as previously described13 and lyophilized. Protein samples were redissolved and labeled with the ICAT reagent using 2.5 mg of total protein from each cell state.
Proteins from cells using the ethanol carbon source were labeled with the d(0) form of the ICAT reagent and those from cells using the galactose carbon source with the d(8) form. To calculate the amount of reagent required, each protein was estimated to have an average of six cysteines and an average protein molecular weight of 50 000. A 5-fold molar excess of ICAT reagent, relative to the thus estimated molar amount of cysteines, was used. The protein samples were then combined and digested overnight with trypsin at 37 °C. After digestion, the peptides were diluted into strong cation exchange (SCX) buffer A (25% acetonitrile/10 mM KHPO4, pH 3.0) and loaded onto a 2.1 mm i.d. × 20 cm, 300-Å pore size, polysulfoethyl A SCX column (PolyLC, Columbia, MD). The peptides were eluted using a gradient from 0 to 25% buffer B (25% acetonitrile/350 mM KCl/10 mM KHPO4, pH 3.0) at a flow rate of 200 µL/min over a period of 30 min using an Integral HPLC workstation (PE Biosystems, Framingham, MA), with collection of the eluent in 1-min fractions. The elution of peptides was monitored by absorbance at 214 nm. To remove acetonitrile, each fraction was concentrated to a volume of ∼100 µL by vacuum centrifugation prior to avidin affinity purification. Three SCX fractions were chosen from time points at 7, 10, and 13 min during the elution, where there was a large absorbance signal in the UV trace. These fractions were loaded onto a monomeric avidin column (Pierce) that was packed by gravity flow into the tip of a glass Pasteur pipet and prepared as per the product instructions with an additional wash step using 30% acetonitrile/0.4% trifluoroacetic acid (TFA) to remove impurities
from the avidin support. The pH of the concentrated SCX fractions was adjusted to 7.2 by the addition of 1 M NH4HCO3 and diluted in 2× PBS to a total volume of ∼300 µL. The samples were loaded onto separate avidin columns and allowed to bind for 20 min. The bound ICAT-labeled peptides were then washed with 5 column volumes (1 column volume was ∼500 µL) of 2× PBS using gravity flow, followed by washing with an additional 5 column volumes of 1× PBS and finally washing with 6 column volumes of 20% methanol in 50 mM NH4HCO3. The bound peptides were eluted in 4 column volumes of 30% acetonitrile/0.4% TFA. The purified ICAT-labeled peptides were then lyophilized and redissolved in RP-µLC buffer A. Approximately 20% of each of the avidin-purified SCX samples was loaded onto a 300 µm i.d. × 0.5 cm, C18 precolumn (LC Packings, Amsterdam, The Netherlands) using a Switchos (LC Packings) single-pump module at a flow rate of 22 µL/min. The precolumn was then put in-line with a 75 µm i.d. × 12 cm analytical column, packed in-house with Monitor 5-µm spherical silica C18 resin of 100-Å pore size (Column Engineering). A binary gradient, using the same RP buffers as described above, was run from 10 to 35% buffer B over 60 min to elute the peptides, using a dual-pump Ultimate µLC system (LC Packings) operated at a flow rate of 250 nL/min across the column. The UV absorbance of the eluting peptides was monitored at 214 nm using an in-line nanoflow UV cell (LC Packings). Eluent fractions were spotted manually over 1-min time intervals onto a MALDI sample target, and to each of these spots, 0.5 µL of DHB at 160 mg/mL in 1:3 acetonitrile/water was immediately added and allowed to air-dry. Fractions were spotted at selected time periods that showed large UV absorbance signals during the gradient elution. Approximately 10 fractions were spotted onto the MALDI sample plate from each of the three SCX fractions, giving a total of 33 spots that were analyzed by MALDI QqTOF mass spectrometry.
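As a quick check of the reagent amounts described for the yeast samples, the arithmetic can be sketched in Python. The sketch uses only the estimates stated in this subsection (2.5 mg of total protein per cell state, an average protein molecular weight of 50 000, six cysteines per protein, and a 5-fold molar excess of reagent). The function name and its parameters are hypothetical and are not part of the original protocol.

```python
def icat_reagent_nmol(protein_mg, avg_mw=50_000.0, cys_per_protein=6, molar_excess=5.0):
    """Estimate the ICAT reagent needed (in nmol) for one protein sample.

    Hypothetical helper; the defaults are the estimates quoted in the text.
    """
    # mg -> g -> mol -> nmol of protein, using the assumed average molecular weight
    protein_nmol = protein_mg * 1e-3 / avg_mw * 1e9
    # each protein is assumed to carry the same average number of cysteines
    cysteine_nmol = protein_nmol * cys_per_protein
    # reagent is added in molar excess over the estimated cysteine content
    return cysteine_nmol * molar_excess


if __name__ == "__main__":
    print(f"{icat_reagent_nmol(2.5):.0f} nmol of ICAT reagent per 2.5-mg sample")
```

Under these assumptions, each 2.5-mg sample corresponds to roughly 50 nmol of protein and 300 nmol of cysteine, so about 1.5 µmol of ICAT reagent per sample.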
MALDI QqTOF Mass Spectrometric Analysis. All data were acquired on a prototype MALDI QqTOF mass spectrometer built at the University of Manitoba in collaboration with MDS Sciex (Concord, ON, Canada) that has been previously described.28 MS spectra were acquired at a laser repetition rate of 8 Hz in time periods of 10-60 s. MS/MS spectra were acquired at a 16-Hz laser repetition rate with spectrum acquisition time varied from 20 s to 5 min, depending on the peak intensity. The precursor ion selection window was set to 2 Da at m/z = 500 increasing to 4 Da at m/z = 3000. Argon was used as a cooling gas in q0 and as a collision gas in the collision cell q2. For each precursor ion, the collision energy determined by the potential difference between q0 and q2 was initially set using a rule of 0.05 V/Da. Then the collision energy was slightly adjusted to obtain a desired degree of fragmentation.

Database Searching of MS/MS Spectra. Peptide sequences were automatically identified by sequence database searching of the MS/MS spectra using the Sequest software.26 For the standard protein mix, the tandem mass spectra were searched against the full nonredundant protein database compiled at the Frederick Biomedical Supercomputing Center, Frederick, MD. Tandem mass spectra from the yeast sample were searched against a database containing all proteins derived from the open-reading frames contained in the S. cerevisiae genome. The mass window for the single-charged molecular ion of the precursor peptide being searched against was given a tolerance of ±0.07 Da deviation between the measured monoisotopic mass and the calculated monoisotopic mass. The a, b, y, and z ion series of the database peptides being searched against were included in the Sequest analysis. Those peptides showing a Sequest correlation score of at least 2.0 or a ∆ correlation score of at least 0.2 were considered for positive identification of the peptide sequence. All of these MS/MS spectra were manually checked to verify the validity of the Sequest results.

Figure 1. Quantitative protein analysis by MALDI QqTOF mass spectrometry. After labeling of separate protein mixtures with the d(0) and d(8) forms of the ICAT reagent, respectively, the combined, enzymatically digested, labeled peptides in the mixture are isolated by avidin affinity chromatography and further separated by RP-µLC, with spotting of the eluent onto a MALDI sample target. After addition of MALDI matrix, the peptides are analyzed by MALDI-QqTOF mass spectrometry, with an initial MS scan to quantify the proteins by comparison of signal intensities of the monoisotopic peaks for the d(0)- and d(8)-labeled forms of the peptide, followed by selection of peptides at specific m/z values for CID to identify the expressed proteins.

RESULTS AND DISCUSSION

Analysis of a Protein Control Mixture. Figure 1 shows the overall scheme for protein quantification and identification using the MALDI QqTOF instrument. Two separate mixtures of proteins are initially labeled with either the d(0) or d(8) forms of the ICAT reagent, combined, and digested with trypsin. The peptide mixture is then purified by avidin affinity chromatography and further separated by RP-µLC, with spotting of the eluent in discrete fractions onto a MALDI target and mixing with MALDI matrix. This is followed by MALDI QqTOF mass spectrometric analysis of each sample spot, using an initial MS scan to quantify the ICAT-labeled peptide pairs by comparison of the monoisotopic signal intensities of the d(0)- and d(8)-labeled forms of the peptide, followed by MS/MS analysis and sequence database searching to identify the proteins present. To first determine the ability of
MALDI QqTOF mass spectrometry to quantify and identify ICAT-labeled peptides using this approach, a control mixture of proteins of known identity and composition was analyzed. Two mixtures containing different molar amounts of each of five different proteins were labeled independently with the d(0) and d(8) forms of the ICAT reagents, mixed together, enzymatically digested, purified using avidin affinity chromatography, and separated by RP-µLC, with eluent fractions being collected and analyzed by MALDI QqTOF mass spectrometry. Table 1 shows the peptide sequences that were identified from the proteins contained in the mix, along with the relative intensity ratios (d(0)/d(8)) of the d(0)- and d(8)-labeled peptides measured in the initial MS scan and the expected d(0)/d(8) values in the sample. A representative result is given in Figure 2. A segment of the MS scan of a specific sample spot is shown in Figure 2A, with an ICAT-labeled peptide pair in which the d(0)- and d(8)-labeled peptides have observed monoisotopic m/z values of 1711.873 and 1719.890 for their single-charged molecular ions, respectively. Isotopic resolution of the peptides is evident, consistent with the resolving power of ∼10 000 (fwhm) of the MALDI QqTOF mass spectrometer.28 The d(8)-labeled peptide was selected for CID, and the resulting MS/MS spectrum is shown in Figure 2B. Database searching of this spectrum using Sequest26 matched this peptide to the sequence PTQLEEQC*HI from the bovine protein LACB_BOVIN, where the cysteine (C*) has been modified with the d(8) version
Table 1. MALDI QqTOF Analysis Results of a Control Protein Mixture(a)

protein      peptide sequence identified   M+H meas   M+H calc   error (ppm)   d(0)/d(8) meas   av     expected   % error
G3P_RABIT    VPTPNVSVVDLTC#R               2014.059   2014.072     6.3         0.40             0.40   0.50       20
             VPTPNVSVVDLTC*R               2022.158   2022.136   -11.0
LCA_BOVIN    C#EVFR                        1167.567   1167.591    20.3         1.01             1.01   1.00       1.0
             C*EVFR                        1175.627   1175.655    23.6
LACB_BOVIN   PTQLEEQC#HI                   1711.873   1711.840   -19.3         3.51             3.43   4.00       14.4
             PTQLEEQC*HI                   1719.890   1719.904     8.1
             FNPTQLEEQC#HI                 1972.969   1972.951    -9.0         3.07
             FNPTQLEEQC*HI                 1981.020   1981.015    -2.5
             LSFNPTQLEEQC#H                2059.967   2059.983     7.8         3.21
             LSFNPTQLEEQC*H                2068.014   2068.047    16.0
             C#MENSAEPEQSL                 1851.806   1851.818     6.5         3.91
             C*MENSAEPEQSL                 1859.838   1859.882    23.7
OVAL_CHICK   LPGFGDSIEAQC#GTSVNVH          2445.192   2445.180    -4.9         2.23             2.23   2.00       11.5
             LPGFGDSIEAQC*GTSVNVH          2453.236   2453.244     3.3
PHS2_RABIT   TC#AYTNHTVIPEALER             2332.103   2332.168    27.2         0.34             0.38   0.33       13.6
             TC*AYTNHTVLPEALER             2340.252   2340.232    -8.5
             VFADYEEYVKC#QER               2293.125   2293.089   -15.7         0.41
             VFADYEEYVKC*QER               2301.150   2301.153     1.3

(a) Identified peptides from each of the control proteins are shown, along with the measured mass of the single-charged molecular ion (M + H measured), the calculated value (M + H calculated) and the mass measurement error in ppm. The measured d(0)/d(8) value (meas d(0)/d(8)) and the average value for all peptides found for each protein (av d(0)/d(8)) is also shown, along with the expected values for these ratios and the percent error. Peptides identified that contain cysteines labeled with the d(0) form of the ICAT reagent are denoted with a C# in their sequence, while those labeled with the d(8) reagent are denoted with a C*.
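The derived columns of Table 1 can be recomputed with a short Python sketch, which also illustrates how a d(0)/d(8) pair might be picked from a peak list. Several points are assumptions rather than statements from the text: the sign convention for the mass error, (calc − meas)/calc, is inferred from the tabulated values; the 8.064 Da pair spacing is simply the difference between the calculated d(0) and d(8) masses in the table; the matching tolerance, the peak intensities in the usage example, and all function names are illustrative only. Peptides carrying more than one labeled cysteine are not handled.

```python
from statistics import mean

# Measured d(0)/d(8) ratios per protein, transcribed from Table 1.
MEASURED = {
    "G3P_RABIT": [0.40],
    "LCA_BOVIN": [1.01],
    "LACB_BOVIN": [3.51, 3.07, 3.21, 3.91],
    "OVAL_CHICK": [2.23],
    "PHS2_RABIT": [0.34, 0.41],
}
# Expected d(0)/d(8) ratios, also from Table 1.
EXPECTED = {
    "G3P_RABIT": 0.50,
    "LCA_BOVIN": 1.00,
    "LACB_BOVIN": 4.00,
    "OVAL_CHICK": 2.00,
    "PHS2_RABIT": 0.33,
}


def ppm_error(measured, calculated):
    """Signed mass error in ppm; the sign convention is inferred from Table 1."""
    return (calculated - measured) / calculated * 1e6


def percent_error(observed, expected):
    """Unsigned relative error of a ratio, in percent."""
    return abs(observed - expected) / expected * 100.0


def summarize(measured=MEASURED, expected=EXPECTED):
    """Return {protein: (average ratio, percent error of that average)}."""
    return {
        protein: (mean(ratios), percent_error(mean(ratios), expected[protein]))
        for protein, ratios in measured.items()
    }


def find_icat_pairs(peaks, spacing=8.064, tol=0.1):
    """Pair singly charged monoisotopic peaks separated by about `spacing` Da.

    peaks: iterable of (m/z, intensity). Returns (light m/z, heavy m/z, ratio).
    `spacing` and `tol` are assumed values, not instrument settings from the paper.
    """
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(ordered):
        for mz_heavy, int_heavy in ordered[i + 1:]:
            if abs((mz_heavy - mz_light) - spacing) <= tol:
                pairs.append((mz_light, mz_heavy, int_light / int_heavy))
    return pairs


if __name__ == "__main__":
    for protein, (avg, err) in summarize().items():
        print(f"{protein}: av d(0)/d(8) = {avg:.2f}, error = {err:.1f}%")
    # Intensities below are hypothetical; only the m/z values come from Figure 2A.
    print(find_icat_pairs([(1711.873, 351.0), (1719.890, 100.0), (1750.000, 40.0)]))
```

With the tabulated ratios, this sketch gives an average of about 3.43 and an error of about 14.4% for LACB_BOVIN, and a mean error over the five proteins of about 12.1%.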
of the ICAT reagent. Although this sample was digested with trypsin, for unknown reasons the proteins in the mix were also cleaved frequently at nontryptic cleavage sites. The large proportion of nonspecific cleavage products in this mix of proteins has been confirmed by ESI-MS/MS analysis of the sample using an ion trap mass spectrometer (data not shown). Inspection of the MS/MS spectrum shown in Figure 2B shows that the MS/MS analysis using the MALDI QqTOF mass spectrometer on ICAT-labeled cysteine-containing peptides produces y and b ion series, as well as a substantial number of a and z ions. Consequently, the a, b, y, and z ion series were included when searching these MS/MS spectra against the peptide databases using Sequest. The peaks labeled with an * are from fragmentation along the backbone of the ICAT reagent, and consequently, these fragment ions are consistently present in the MS/MS spectra of ICAT-labeled peptides. Additionally, due to the high mass accuracy of TOF detection, the mass window for the precursor peptide ions that the MS/MS spectra are to be searched against can be made very narrow, constraining the database space that has to be searched with the CID spectrum of a specific peptide and greatly increasing the confidence of protein identification.30-33 A mass window of ±0.07 Da was used in the database searches because it gives a mass tolerance that is slightly above the previously reported mass accuracy of the MALDI QqTOF mass spectrometer of ∼10 ppm28 for the full mass range of peptides analyzed in this study.

(30) Clauser, K. R.; Baker, P.; Burlingame, A. L. Anal. Chem. 1999, 71, 2871-82.
(31) Fenyo, D.; Qin, J.; Chait, B. T. Electrophoresis 1998, 19, 998-1005.
(32) Jensen, O. N.; Podtelejnikov, A.; Mann, M. Rapid Commun. Mass Spectrom. 1996, 10, 1371-8.
(33) Masselon, C.; Anderson, G. A.; Harkewicz, R.; Bruce, J. E.; Pasa-Tolic, L.; Smith, R. D. Anal. Chem. 2000, 72, 1918-24.

Figure 2. Quantification and identification of a peptide from a control protein mixture. (A) The relative quantities of the protein in the d(0)- and d(8)-labeled mixtures were obtained by comparing the peak intensities of the monoisotopic peaks at m/z values of 1711.873 and 1719.890, respectively. (B) The peak at an m/z value of 1719.890 was selected for MS/MS analysis, and the spectrum obtained was matched to the sequence PTQLEEQC*HI from the control protein LACB_BOVIN, where C* is modified with the d(8) form of the ICAT reagent.

Figure 3. Quantitative MALDI QqTOF mass spectrometric analysis of proteins expressed in yeast grown on ethanol or galactose carbon sources. Equal amounts of total soluble proteins from cells growing on ethanol or galactose were labeled with the d(0) and d(8) forms of the ICAT reagent, respectively, purified by multidimensional chromatography and analyzed by MALDI QqTOF mass spectrometry.

The results in Figure 2A show the manner in which peptide quantification can be done using the MALDI QqTOF mass
spectrometer, by calculating the relative ratios of the monoisotopic peaks of the d(0) and d(8) ICAT-labeled peptide pairs. The predominance of single-charged molecular ion peaks in MALDI analysis, along with the high mass accuracy of TOF detection, makes this measurement straightforward, by selecting peak pairs that are separated by ∼8 Da in the MS spectrum. This is in contrast to ESI-MS analysis, where the presence of multiple charge states for each peptide complicates quantification. For the peptides identified from this control mixture, the average error between the observed and expected d(0)/d(8) values is 12.1%, indicating that the relative quantities of proteins are accurately measured using the MALDI QqTOF mass spectrometer. In the case of LACB_BOVIN, four different peptides were identified from this protein, with d(0)/d(8) values ranging from 3.07 to 3.91, with a standard deviation of ±∼0.4. Furthermore, the average absolute mass accuracy of the MALDI QqTOF mass spectrometer for the peptides identified here is 12.2 ppm, consistent with previous reports.28 It should also be noted that we observed no decrease in the detection sensitivity of ICAT-labeled peptides relative to nonlabeled peptides when analyzed by MALDI QqTOF MS.

Quantitative Analysis of Induced Changes in Protein Expression in Yeast. After establishing the ability of the MALDI QqTOF mass spectrometer to both quantify and identify ICAT-labeled peptides, we applied the approach to the quantitative analysis of steady-state protein expression in the yeast S. cerevisiae grown on two different carbon sources. Figure 3 shows the overall scheme used in the experiment. Yeast cells were harvested in log phase from growth media containing either ethanol or galactose as the sole carbon source.
Equal amounts of total soluble proteins were isolated from each of these cell populations, labeled with either the d(0) (ethanol carbon source) or the d(8) (galactose carbon source) form of the ICAT reagent, and the samples were
combined and enzymatically digested with trypsin. The resulting peptide sample was then fractionated using multidimensional chromatography, which included an initial separation of peptides by SCX HPLC, with fraction collection of peptides eluting from the SCX column over 1-min time intervals, followed by affinity isolation of ICAT-labeled, cysteine-containing peptides from selected SCX fractions by avidin affinity chromatography and, finally, RP-µLC separation of the ICAT-labeled peptides, with spotting of eluting peptides directly onto a MALDI sample target for subsequent analysis by MALDI QqTOF mass spectrometry. Multidimensional chromatography involving the use of cation exchange HPLC and reversed-phase chromatography was previously shown to be an effective separation method for the analysis of protein mixtures by mass spectrometry,15,34,35 and the utility of the affinity isolation of cysteine-containing peptides was also demonstrated for the analysis of complex mixtures.36 Fractions were spotted onto the MALDI sample plates in 1-min time intervals in a total volume of ∼250 nL, and DHB MALDI matrix was immediately added to each spot and allowed to crystallize. Elution of the peptides from the RP-µLC column was monitored in-line by UV absorbance at 214 nm, and fractions were collected when the UV trace showed an abundance of eluting peptides.

(34) Gygi, S. P.; Corthals, G. L.; Zhang, Y.; Rochon, Y.; Aebersold, R. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 9390-5.
(35) Opiteck, G. J.; Lewis, K. C.; Jorgenson, J. W.; Anderegg, R. J. Anal. Chem. 1997, 69, 1518-24.
(36) Spahr, C. S.; Susin, S. A.; Bures, E. J.; Robinson, J. H.; Davis, M. T.; McGinley, M. D.; Kroemer, G.; Patterson, S. D. Electrophoresis 2000, 21, 1635-50.

Table 2 shows the results of the analysis of 33 sample spots collected from the RP-µLC separation of three different SCX fractions. In all, 59 proteins from S. cerevisiae were quantified and identified by database searching of MS/MS spectra obtained from
Analytical Chemistry, Vol. 73, No. 5, March 1, 2001
Table 2. Quantitative Analysis of Protein Expression in Yeast Utilizing either Ethanol or Galactose Carbon Sources^a

| ORF | gene | description | error (ppm) | CBI | Eth/Gal |
|---|---|---|---|---|---|
| YAR010C | TyA | Ty1 Gag-like TyA structural protein | -16.5 | 0.069 | 0.22 |
| YDL134C | PPH21 | protein serine/threonine phosphatase PP2A-1 | -12.0 | 0.103 | 0.73 |
| YCR083W | TRX3 | mitochondrial thioredoxin | -1.8 | 0.114 | 0.56 |
| YLR421C | RPN13 | putative proteasomal subunit | -10.2 | 0.149 | 0.61 |
| YHR037W | PUT2 | δ-1-pyrroline-5-carboxylate dehydrogenase (P5C dehydrogenase) | -8.6 | 0.171 | 1.25 |
| YDR214W | YDR214W | protein of unknown function | -14.2 | 0.179 | 0.44 |
| YDR047W | HEM12 | uroporphyrinogen decarboxylase | -2.8 | 0.182 | 1.22 |
| YBR019C | GAL10 | UDP-glucose 4-epimerase | -12.1 | 0.183 | 0.10 |
| YBR230C | YBR230C | protein of unknown function | 10.1 | 0.202 | 0.57 |
| YBR020W | GAL1 | galactokinase, first step in galactose metabolism* | 10.1 | 0.217 | 0.15 (0.16, 0.18, 0.10, 0.17) |
| YIR031C | DAL7 | malate synthase | 6.0 | 0.232 | 1.08 |
| YHR005C-A | MRS11 | essential component of the mitochondrial import machinery | 5.5 | 0.244 | 0.74 |
| YMR300C | ADE4 | amidophosphoribosyltransferase | -19.1 | 0.283 | 0.38 |
| YGR088W | CTT1 | catalase T (cytosolic)* | 12.6 | 0.306 | 0.35 (0.27, 0.42) |
| YNL134C | YNL134C | protein with similarity to C. carbonum toxD gene | -18.9 | 0.316 | 0.51 |
| YPR191W | QCR2 | ubiquinol cytochrome-c reductase core protein 2 | -4.2 | 0.326 | 1.31 |
| YBL015W | ACH1 | acetyl-CoA hydrolase | 17.2 | 0.327 | 0.90 |
| YOL030W | YOL030W | protein with similarity to Gas1p | 3.5 | 0.368 | 1.00 |
| YJL116C | NCA3 | regulation of synthesis of Atp6p and Atp8p | -19.3 | 0.377 | 0.24 |
| YDL126C | CDC48 | protein of the AAA family of ATPases | -6.3 | 0.406 | 0.95 |
| YLL026W | HSP104 | heat shock protein | -10.5 | 0.409 | 0.96 |
| YKR097W | PCK1 | phosphoenolpyruvate carboxykinase (ATP) | 8.9 | 0.423 | 1.84 |
| YDL066W | IDP1 | isocitrate dehydrogenase (NADP+), mitochondrial* | 15.4 | 0.436 | 0.78 (0.89, 0.68) |
| YPL154C | PEP4 | proteinase A (PrA/yscA/saccharopepsin) | 4.9 | 0.466 | 0.69 |
| YER110C | KAP123 | karyopherin-β involved in nuclear import of ribosomal proteins | 0.0 | 0.468 | 0.47 |
| YIL094C | LYS12 | homoisocitrate dehydrogenase | 1.7 | 0.478 | 1.04 |
| YBR221C | PDB1 | pyruvate dehydrogenase complex, E1-β subunit | -10.7 | 0.488 | 0.60 |
| YMR307W | GAS1 | 1,3-β-glucanosyltransferase* | 5.9 | 0.499 | 1.44 (1.54, 1.33) |
| YNL055C | POR1 | outer mitochondrial membrane porin | 8.3 | 0.499 | 0.62 |
| YOR204W | DED1 | ATP-dependent RNA helicase of DEAD box family | -11.7 | 0.529 | 0.29 |
| YDR432W | NPL3 | protein involved in 18S and 25S rRNA processing | -19.9 | 0.546 | 0.49 |
| YLR259C | HSP60 | mitochondrial chaperonin that cooperates with Hsp10p | 23.0 | 0.553 | 0.66 |
| YPL231W | FAS2 | fatty-acyl-CoA synthase, α chain | -21.0 | 0.553 | 0.64 |
| YGL245W | YGL245W | glutamyl-tRNA synthetase | 4.2 | 0.557 | 0.46 |
| YKR042W | UTH1 | protein involved in the aging process | -12.9 | 0.559 | 0.45 |
| YLR304C | ACO1 | aconitate hydratase (aconitase) | -17.3 | 0.582 | 0.72 |
| YDR502C | SAM2 | S-adenosylmethionine synthetase | -15.9 | 0.641 | 0.49 |
| YPL061W | ALD6 | cytosolic acetaldehyde dehydrogenase | -8.3 | 0.664 | 1.05 |
| YLR109W | AHP1 | alkyl hydroperoxide reductase* | 20.1 | 0.680 | 0.09 (0.08, 0.09) |
| YNL135C | FPR1 | FK506-binding protein | 0.5 | 0.680 | 1.61 |
| YBR025C | YBR025C | member of the GTP-binding protein family | 8.4 | 0.684 | 0.64 |
| YBL030C | PET9 | ADP/ATP carrier protein of the mitochondrial carrier family* | 6.8 | 0.685 | 0.39 (0.33, 0.45) |
| YDL055C | PSA1 | mannose-1-phosphate guanyltransferase | 11.6 | 0.718 | 0.67 |
| YLR058C | SHM2 | serine hydroxymethyltransferase* | 3.8 | 0.722 | 0.7 (0.67, 0.73) |
| YOR375C | GDH1 | glutamate dehydrogenase (NADP+) | 13.2 | 0.746 | 1.50 |
| YER091C | MET6 | homocysteine methyltransferase, methionine synthase* | 3.5 | 0.772 | 1.46 (1.32, 1.60) |
| YDR450W | RPS18A | ribosomal protein S18 | 9.0 | 0.810 | 0.23 |
| YNL301C | RPL18B | ribosomal protein L18 | -7.6 | 0.813 | 0.52 |
| YBR048W | RPS11B | ribosomal protein S11 | -2.5 | 0.824 | 0.30 |
| YFL039C | ACT1 | actin | 21.3 | 0.824 | 0.52 |
| YNL178W | RPS3 | ribosomal protein S3 | 3.6 | 0.863 | 0.84 |
| YLR249W | YEF3 | translation elongation factor EF-3A | -20.4 | 0.865 | 0.26 |
| YDR385W | EFT2 | translation elongation factor EF-2 | 16.8 | 0.887 | 0.62 |
| YBR031W | RPL4A | ribosomal protein L4 | -12.1 | 0.896 | 0.22 |
| YOL086C | ADH1 | alcohol dehydrogenase i | 12.5 | 0.913 | 0.56 |
| YGR254W | ENO1 | enolase | 1.6 | 0.930 | 0.23 |
| YKL060C | FBA1 | fructose-bisphosphate aldolase ii, sixth step in glycolysis | 3.6 | 0.935 | 0.23 |
| YLR044C | PDC1 | pyruvate decarboxylase isozyme 1 | -6.9 | 0.962 | 0.26 |
| YGR192C | TDH3 | glyceraldehyde-3-phosphate dehydrogenase | 15.0 | 0.988 | 0.38 |

^a The identity of the open reading frame (ORF) and gene name is given for each protein, as well as the description and codon bias index (CBI), as found at the Yeast Protein Database website (http://www.proteome.com/databases/index.html). The error between the measured and calculated mass for each single-charged peptide identified is also shown, as well as the measured d(0)/d(8) values (Eth/Gal). For those proteins denoted with an asterisk, at least two peptides were identified from the protein, and the mass measurement error and d(0)/d(8) values shown are an average for those peptides identified. The measured d(0)/d(8) values for each of the individual peptides identified for these proteins are shown in parentheses.
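The two derived quantities in Table 2 can be illustrated with a short sketch. It assumes that the mass error is expressed as (measured − calculated)/calculated in parts per million and that the spread quoted in the text for GAL1 is a sample standard deviation; neither convention is stated explicitly in the paper. The function names and the m/z values passed to `ppm_error` are hypothetical and serve only as an illustration; the four GAL1 ratios are taken from Table 2.

```python
from statistics import mean, stdev


def ppm_error(measured_mz: float, calculated_mz: float) -> float:
    """Relative mass error in parts per million (sign convention assumed)."""
    return (measured_mz - calculated_mz) / calculated_mz * 1e6


def summarize_ratios(ratios: list[float]) -> tuple[float, float]:
    """Mean and sample standard deviation of per-peptide d(0)/d(8) ratios."""
    avg = mean(ratios)
    # A single peptide has no spread; report 0.0 rather than raising an error.
    sd = stdev(ratios) if len(ratios) > 1 else 0.0
    return avg, sd


if __name__ == "__main__":
    # Hypothetical m/z values chosen only to show the scale of a 10 ppm error.
    print(f"{ppm_error(1000.010, 1000.000):.1f} ppm")

    # Individual GAL1 peptide ratios as listed in Table 2.
    gal1_ratios = [0.16, 0.18, 0.10, 0.17]
    avg, sd = summarize_ratios(gal1_ratios)
    print(f"GAL1 Eth/Gal = {avg:.2f} +/- {sd:.2f}")
```

Under these assumptions the GAL1 ratios give 0.15 with a spread of 0.04, which matches the values reported for galactokinase.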
the MALDI QqTOF mass spectrometer. Not every peptide peak that was detected in the MS scan of each sample spot was selected for MS/MS analysis, so these results do not represent a comprehensive analysis of these sample spots. Rather, a representative sampling was obtained, by selecting several peaks from each sample spot that represented a wide range of signal intensities and masses of single-charged precursor ions. These same three SCX fractions were also analyzed by RP-µLC ESI-MS/MS using
Figure 4. Representative results of the MALDI QqTOF mass spectrometric analysis of induced changes in protein expression in yeast. (A) A segment of the mass spectrum is shown for an ICAT-labeled peptide pair having monoisotopic m/z values of 2143.010 and 2151.042 for the d(0) (ethanol carbon source)- and d(8) (galactose carbon source)-labeled peptides, respectively. The measured d(0)/d(8) value is also shown. The d(8)-labeled peptide was selected for CID and identified as being from the galactose-induced protein Gal1. (B) A segment of the mass spectrum for an ICAT-labeled peptide pair having monoisotopic m/z values of 1854.913 and 1862.996 for the d(0)- and d(8)-labeled peptides, respectively, is shown, along with the d(0)/d(8) value. The d(0)-labeled peptide was selected for CID and identified as being from the glucose-repressed protein PCK1.
an ion trap mass spectrometer, and approximately 90% of the proteins identified by the MALDI QqTOF mass spectrometer and shown in Table 2 were also identified by this method. For all of these proteins, the quantitative results were in close agreement between the two instruments (data not shown). Figure 4A shows a representative result for a peptide identified to be from the protein galactokinase (GAL1), a kinase involved in the metabolism of galactose. As expected,37 this protein shows dramatically increased expression in the presence of galactose as the carbon source, with a d(0)/d(8) value of 0.10. In the data analyzed, a total of four peptides from the galactokinase protein were detected and quantified with an average ratio of 0.15 ± 0.04. Figure 4B shows the results for a peptide identified to be from the protein phosphoenolpyruvate carboxykinase (PCK1), which is glucose repressed38 and shows an increased expression on the ethanol carbon source. The d(0)/d(8) expression ratio of 1.84 measured here is consistent with previously determined values.1 In some instances, quantification may be confounded by overlap of isotope peaks from two different, closely eluting
(37) Lashkari, D. A.; DeRisi, J. L.; McCusker, J. H.; Namath, A. F.; Gentile, C.; Hwang, S. Y.; Brown, P. O.; Davis, R. W. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 13057-62.
(38) Yin, Z.; Smith, R. J.; Brown, A. J. Mol. Microbiol. 1996, 20, 751-64.
Figure 5. Quantification and identification of a low-abundance protein. (A) The MS scan of a 1-min fraction collected during the RP-µLC separation of ICAT-labeled peptides isolated from yeast grown on ethanol or galactose is shown. A pair of ICAT-labeled peaks at relatively low signal intensity in the spectrum is circled. (B) An expanded view of the region around this pair of peaks is shown. The d(0)/d(8) value was determined by comparing the signal intensities of the monoisotopic peaks at m/z values of 1705.810 and 1713.864. The d(8)-labeled peptide at m/z value of 1713.864 was chosen for CID and successfully identified as being from the protein TRX3, which has a CBI value of 0.114.
peptides, which are similar in mass. One possible remedy for this problem is to measure the intensities of peaks in the isotope distribution other than the monoisotopic peak, assuming that the overlap with the interfering peptide(s) may be different or absent for the other peaks of the target peptide isotope distribution. In the case of proteins in which multiple peptides have been identified, those peptides that do not show peak overlap can be used. In this study, only in two cases were peptides identified that could not be accurately quantified due to significant peak contamination. In cases where a very complex mixture shows significant amounts of peak overlap, increased separation of the peptides in the RP-µLC step may be achieved by employing a shallower elution gradient for RP-µLC or by introducing additional prefractionation steps such as protein fractionation by size34 prior to proteolysis of the ICAT-labeled sample.

Characterization of Low-Abundance Proteins. One important measure of any mass spectrometric approach to protein quantification and identification is the ability to analyze low-abundance proteins. It has been shown that the use of two-dimensional chromatography combined with ESI-MS/MS operated in data-dependent fragmentation mode provides sufficient sensitivity for the identification of low-abundance proteins within complex protein mixtures.34 We therefore examined the ability of the approach described here to both quantify and identify low-abundance proteins. Figure 5 shows the results from the analysis of an ICAT-labeled peak pair at relatively low signal intensity in the mass spectrum. Even at the low intensity of the signal indicated in Figure 5A, quantification of the detected peak pair was straightforward, giving a d(0)/d(8) ratio of 0.56. The MS/MS analysis of the peak at a mass-to-charge value of 1713.864 identified this peptide to be from the protein mitochondrial thioredoxin (TRX3). This is a low-abundance protein, as predicted by its codon bias index (CBI) of 0.114. The CBI for a gene is a measure of its propensity to preferentially use only one of the several possible codons to incorporate a specific amino acid into the polypeptide chain.39,40 More highly expressed proteins tend to use a select subset of all the possible codons, resulting in relatively high CBI values. It has been shown that conventional proteomic analyses employing separation of proteins from complete cell lysates by 2-D PAGE combined with ESI-MS/MS are only sensitive enough to identify proteins with CBI values generally greater than 0.2.34 In this study, we were able to identify eight proteins having CBI values below 0.2, with one identified protein having a CBI value of 0.069. Only one of these proteins (Gal10) is expected to be in high abundance when grown on both of the carbon sources used in this experiment, as it is a gene known to be repressed by glucose and induced by galactose.37 These data therefore suggest that the approach described here may prove effective for the analysis of low-abundance proteins.

Selective Protein Identification for High-Throughput Quantitative Analysis. The MALDI QqTOF mass spectrometer offers some unique benefits to high-throughput quantitative proteomic analysis using the ICAT strategy. With RP-µLC ESI-MS/MS methods, sample is being continually consumed and decisions as to which peptide peaks are to be selected for CID must be done “on the fly” in a limited amount of time.
Conversely, analysis by MALDI QqTOF mass spectrometry allows decisions as to which peptide peaks to identify by MS/MS analysis to be made without time limitations, because the peptides remain within each spot of matrix crystals during the analysis. Consequently, ICAT-labeled peak pairs identified in the initial MS scan as showing significant differential expression between the two biological conditions can be selectively identified by MS/MS analysis, while those peaks showing little or no differences in expression can be omitted from the MS/MS analysis, providing an efficient approach to quantitative analysis with increased throughput. RP-µLC ESI-MS/MS-based methods are unable to selectively identify those peptides showing significant differential expression because peptide peaks are selected for CID preferentially by peak intensity in the MS scan;16,22-24 thus, potentially a large proportion of those peptides that are identified will be from proteins showing no differential expression, thereby committing a large portion of analysis and database-searching time to proteins that may not be of biological interest to the system being analyzed. The ability to select, without time constraints, peaks for MS/MS analysis using the MALDI QqTOF mass spectrometer may also help in the identification of low-abundance proteins, as the data presented here indicate, because such low-intensity peaks are often missed in RP-µLC ESI-MS/MS analyses owing to the preferential identification of high-intensity peaks. Additionally, the amount of sample loaded in the RP-µLC fractionation step could be increased by using a larger capillary column with increased loading capacity, which would further increase the ability to detect and identify low-abundance proteins.

In conclusion, the results presented here show the effective analysis of ICAT-labeled peptides using a MALDI QqTOF mass spectrometer. Straightforward quantitative analysis is accomplished by comparison of peak intensities between the isotopically heavy and isotopically light labeled peptides; peptide peaks of interest can then be selected for sequence identification by MS/MS analysis. The amenability of the entire system to automation, from the multidimensional chromatography steps to the MALDI QqTOF mass spectrometric analysis and database searching of MS/MS spectra, makes this a promising general approach to quantitative proteomic analysis.
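The workflow summarized above (locate d(0)/d(8) peak pairs in the MS scan, take the ratio of their monoisotopic intensities, then pick only the differentially expressed pairs for MS/MS) can be sketched as follows. This is a minimal illustration, not the authors' software. It assumes singly charged peptides carrying one labeled cysteine, so that the pair spacing is eight times the deuterium-protium mass difference (about 8.05 Da). The 0.05 Da matching tolerance, the 1.5-fold selection threshold, and all function names are hypothetical choices. The m/z values of the first three pairs are those given in the captions of Figures 4 and 5; the intensities are invented so that the ratios reproduce the reported values (0.10, 1.84, 0.56), and the fourth pair is entirely made up to represent an unchanged protein.

```python
from dataclasses import dataclass

H_MASS = 1.007825  # monoisotopic mass of 1H, in u
D_MASS = 2.014102  # monoisotopic mass of 2H, in u
D8_SHIFT = 8 * (D_MASS - H_MASS)  # about 8.0502 u for one d(8) tag


@dataclass
class IcatPair:
    light_mz: float
    heavy_mz: float
    ratio: float  # d(0)/d(8) intensity ratio


def find_icat_pairs(peaks, tol=0.05):
    """Pair peaks whose m/z spacing matches one d(8) tag within `tol` Da.

    `peaks` is a list of (m/z, intensity) tuples for monoisotopic peaks.
    """
    peaks = sorted(peaks)
    pairs = []
    for i, (mz_light, int_light) in enumerate(peaks):
        for mz_heavy, int_heavy in peaks[i + 1:]:
            delta = mz_heavy - mz_light
            if delta > D8_SHIFT + tol:
                break  # peaks are sorted, so no later peak can match
            if abs(delta - D8_SHIFT) <= tol and int_heavy > 0:
                pairs.append(IcatPair(mz_light, mz_heavy, int_light / int_heavy))
    return pairs


def select_for_msms(pairs, fold=1.5):
    """Keep pairs whose ratio differs from 1 by at least `fold` in either direction."""
    return [p for p in pairs if p.ratio >= fold or p.ratio <= 1.0 / fold]


# m/z values from Figures 4 and 5; intensities are illustrative only.
PEAKS = [
    (2143.010, 100.0), (2151.042, 1000.0),  # Gal1 peptide, ratio 0.10
    (1854.913, 920.0), (1862.996, 500.0),   # PCK1 peptide, ratio 1.84
    (1705.810, 56.0), (1713.864, 100.0),    # TRX3 peptide, ratio 0.56
    (1500.000, 300.0), (1508.050, 300.0),   # hypothetical unchanged pair
]

if __name__ == "__main__":
    for pair in select_for_msms(find_icat_pairs(PEAKS)):
        print(f"{pair.light_mz:.3f}/{pair.heavy_mz:.3f}  d(0)/d(8) = {pair.ratio:.2f}")
```

Because the sample remains on the MALDI target, a filter of this kind could be applied after the MS scan and before any MS/MS acquisition. Peptides with more than one labeled cysteine would show spacings that are multiples of the single-tag shift and are not handled by this sketch.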
(39) Kurland, C. G. FEBS Lett. 1991, 285, 165-9.
(40) Garrels, J. I.; McLaughlin, C. S.; Warner, J. R.; Futcher, B.; Latter, G. I.; Kobayashi, R.; Schwender, B.; Volpe, T.; Anderson, D. S.; Mesquita-Fuentes, R.; Payne, W. E. Electrophoresis 1997, 18, 1347-60.
Received for review September 29, 2000. Accepted January 11, 2001.
ACKNOWLEDGMENT The authors thank Lyle Burton of MDS Sciex for his assistance in this work. T.J.G. was funded by an NIH Postdoctoral Genome Training Grant fellowship. Work at the University of Washington was supported by a grant from the Merck Genome Research Institute (MGRI) and a grant from the National (USA) Cancer Institute (1R33CA84698). Work at the University of Manitoba was supported by a grant from the Natural Sciences and Engineering Council of Canada and a grant from the National (USA) Institutes of Health (GM 59240).
AC001169Y