Top-Down Approaches for Measuring Expression Ratios of Intact

Anal. Chem. 2006, 78, 686-694

Top-Down Approaches for Measuring Expression Ratios of Intact Yeast Proteins Using Fourier Transform Mass Spectrometry

Yi Du,† Bryan A. Parks,† Seyoung Sohn,‡ Kurt E. Kwast,§ and Neil L. Kelleher*,†

Department of Chemistry, Department of Computer Science, and Department of Molecular & Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

The extension of quantitation methods for small peptides to ions above 5 kDa, and eventually to global quantitative proteomics of intact proteins, will require extensive refinement of current analytical approaches. Here we evaluate postgrowth Cys-labeling and 14N/15N metabolic labeling strategies for determination of relative protein expression levels and their posttranslational modifications using top-down mass spectrometry (MS). We show that intact proteins that are differentially alkylated with acrylamide (+71 Da) versus iodoacetamide (+57 Da) have substantial chromatographic shifts during reversed-phase liquid chromatography separation (particularly in peak tails), indicating a requirement for stable isotopes in alkylation tags for top-down MS. In the 14N/15N metabolic labeling strategy, we achieve 98% 15N incorporation in yeast grown 10 generations under aerobic conditions and determine 50 expression ratios using Fourier transform ion cyclotron resonance MS in comparing these cells to anaerobically grown control (14N) cells. We devise quantitative methods for top-down analyses, including a correction factor for accurate protein ratio determination based upon the signal-to-noise ratio. Using a database of 200 yeast protein forms identified previously by top-down MS, we verify the intact mass tag concept for protein identification without tandem MS. Overall, we find that top-down MS promises work flows capable of large-scale proteome profiling using stable isotope labeling and the determination of >5 protein ratios per spectrum. Significant progress has been made in proteomics for profiling whole cell lysates, organelles, or protein complexes and determining expression levels of thousands of proteins in a high-throughput fashion.1-4 In 1999, there was a major shift to quantitative analyses using stable-isotope labeling methods for determining relative * Corresponding author. E-mail: [email protected]. Fax: 217-244-8068. † Department of Chemistry. 
‡ Department of Computer Science.
§ Department of Molecular & Integrative Physiology.
(1) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., 3rd. Nat. Biotechnol. 1999, 17, 676-682.
(2) Washburn, M. P.; Wolters, D.; Yates, J. R., 3rd. Nat. Biotechnol. 2001, 19, 242-247.
(3) Shen, Y.; Tolic, N.; Zhao, R.; Pasa-Tolic, L.; Li, L.; Berger, S. J.; Harkewicz, R.; Anderson, G. A.; Belov, M. E.; Smith, R. D. Anal. Chem. 2001, 73, 3011-3021.

686 Analytical Chemistry, Vol. 78, No. 3, February 1, 2006

differences in protein expression levels between two cell states, typically using metabolic5 or postgrowth covalent6 labeling methods. Since then, these approaches have undergone continued improvement and now are increasing our understanding of the molecular mechanisms associated with environmental or genetic changes in cells and tissues. Traditionally, two-dimensional polyacrylamide gel electrophoresis has been used for semiquantitative proteome analyses, comparing the optical density of stained proteins in different conditions.7 A drawback to this approach is lack of separation for some proteins, such that one spot may contain several proteins, thus preventing quantitative determination. In recent years, a number of pairwise protein quantification methods have been developed, for example, using stable isotopes to differentially label samples. In brief, labeled and unlabeled samples (peptides or proteins) are mixed and analyzed by methods such as on-line reversed-phase liquid chromatography (RPLC)-mass spectrometry (MS) to quantify their relative expression levels based on the assumption that both species behave equivalently during the separation, ionization, and detection processes. Relative protein ratios are determined by comparing the area under the trace of ion current for each species in the pair, which have identical sequences but different masses due to the substitution of stable isotopes (e.g., 2H for 1H, 15N for 14N, or 13C for 12C).8 These approaches can be divided into two general categories based upon the labeling method.9 The first is chemical labeling, in which a derivatization reagent for chemical modification of the proteins or peptides is used either after cell growth6,10 or during tryptic digestion.11 The second is biological labeling, where cells are either grown in media enriched for stable isotope-containing anabolites5,12,13 or cell extracts are digested in a stable isotope(4) Li, L.; Masselon, C. D.; Anderson, G. A.; Pasa-Tolic, L.; Lee, S. 
W.; Shen, Y.; Zhao, R.; Lipton, M. S.; Conrads, T. P.; Tolic, N.; Smith, R. D. Anal. Chem. 2001, 73, 3312-3322. (5) Oda, Y.; Huang, K.; Cross, F. R.; Cowburn, D.; Chait, B. T. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 6591-6596. (6) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999. (7) Pandey, A.; Mann, M. Nature 2000, 405, 837-846. (8) MacCoss, M. J.; Matthews, D. E. Anal. Chem. 2005, 77, 294A-302A. (9) Ong, S. E.; Foster, L. J.; Mann, M. Methods 2003, 29, 124-130. (10) Kelleher, N. L.; Nicewonger, R. B.; Begley, T. P.; McLafferty, F. W. J. Biol. Chem. 1997, 272, 32215-32220. (11) Goodlett, D. R.; Keller, A.; Watts, J. D.; Newitt, R.; Yi, E. C.; Purvine, S.; Eng, J. K.; von Haller, P.; Aebersold, R.; Kolker, E. Rapid Commun. Mass Spectrom. 2001, 15, 1214-1221. 10.1021/ac050993p CCC: $33.50

© 2006 American Chemical Society Published on Web 12/23/2005

Figure 1. Overview of strategies employed for quantitative proteomics using top-down mass spectrometry. (A) Differential alkylation strategy. (B) 14N/15N-metabolic labeling strategy. (C) Quantitative determination of PTM occupancy by relative abundance measurements of differentially modified protein forms. Histone H4 with multiple acetylations from yeast (strain S288C) grown aerobically in 14N- versus 15N-labeled media.

containing environment.14 High-throughput analysis of differentially expressed proteins is possible using such approaches, typically by analyzing tryptic peptides by tandem MS, the so-called “bottom-up” approach.15 Alternatively, quantitative proteomics can be combined with top-down mass spectrometry15 to capture not only the differential expression of intact proteins but also their posttranslational modifications (PTMs). The approach is an extension of the notion founded in a 1999 study on a multiphosphorylated yeast kinase.5 With the availability of 100% sequence coverage for proteins from many organisms,16,17 top-down proteomics makes possible quantitative study of intact proteins and their PTMs,15 allowing for the direct comparison of cells in different states without tryptic digestion and subsequent peptide analyses. However, tailored methods for both sample preparation and data processing of intact proteins are needed to realize the potential of electrospray ionization (ESI)-Fourier transform ion cyclotron resonance (FTICR) MS for quantitative analyses. Quantitative FTICR-MS of high-mass ions created by ESI can have a precision of ∼5%18 and an accuracy of >98%.19 FTICR-MS also has a dynamic range of up to 3 orders of magnitude depending on the signal-to-noise ratio (S/N) of the

(12) (a) Washburn, M. P.; Ulaszek, R.; Deciu, C.; Schieltz, D. M.; Yates, J. R., 3rd. Anal. Chem. 2002, 74, 1650-1657. (b) Berger, S. J.; Lee, S. W.; Anderson, G. A.; Pasa-Tolic, L.; Tolic, N.; Shen, Y.; Zhao, R.; Smith, R. D. Anal. Chem. 2002, 74, 4994-5000.
(13) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Matthews, D. E.; Yates, J. R., 3rd. Anal. Chem. 2004, 76, 4951-4959.
(14) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2001, 73, 2836-2842.
(15) Kelleher, N. L.; Lin, H. Y.; Valaskovic, G. A.; Aaserud, D. J.; Fridriksson, E. K.; McLafferty, F. W. J. Am. Chem. Soc. 1999, 121, 806-812.
(16) Zabrouskov, V.; Giacomelli, L.; Van Wijk, K. J.; McLafferty, F. W. Mol. Cell. Proteomics 2003, 2, 1253-1260.
(17) Roth, M. J.; Forbes, A. J.; Boyne, M. T., 2nd.; Kim, Y. B.; Robinson, D. E.; Kelleher, N. L. Mol. Cell. Proteomics 2005, 4, 1002-1008.
(18) Hicks, L. M.; O’Connor, S. E.; Mazur, M. T.; Walsh, C. T.; Kelleher, N. L. Chem. Biol. 2004, 11, 327-335.
(19) Gordon, E. F.; Mansoori, B. A.; Carroll, C. F.; Muddiman, D. C. J. Mass Spectrom. 1999, 34, 1055-1062.

protein ions.20 These figures of merit become critical when implementing strategies for quantitative proteomics. Here, we evaluated two approaches for quantitative top-down proteomics (Figure 1, panels A and B). In the first, cysteine (Cys) residues were alkylated with either acrylamide or iodoacetamide and the resulting products were separated and identified using ESI-FTICR-MS. In the second, 14N/15N-metabolic labeling allowed for the determination of expression differences between two cell states using either tandem MS or prior knowledge of intact mass values21 for protein identification. Also, a correction factor was devised to obtain more accurate expression ratios for species of low abundance by accounting for their S/N. This report provides proof-of-concept for extending these capabilities for top-down proteomics by determining 50 expression ratios of yeast proteins from cells grown in anaerobic versus aerobic conditions.

EXPERIMENTAL SECTION

Differential Alkylation Strategy. Standard Protein Mixtures. Four protein standards (bovine α-lactalbumin, bovine β-lactoglobulin, chicken lysozyme C-1, bovine ribonuclease A; Sigma, St. Louis, MO) were used to evaluate the differential alkylation method. One milligram of each protein was individually dissolved in 1 mL of buffer containing 50 mM Tris (pH 8.5), 6 M guanidine hydrochloride, and 10 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and incubated at 37 °C for 1 h for denaturation and reduction. The protein solutions were split in half and alkylated with either iodoacetamide (10 mM) or acrylamide (20 mM) by incubating in the dark at room temperature for 1 h. The differentially alkylated proteins were combined at different ratios (see Supporting Information Table 1), and the four proteins were mixed, loaded onto a C4 Symmetry 300 column (4.6 × 50 mm; Waters Corp., Milford, MA), washed for 20 min with 95% H2O

(20) Senko, M. W.; Canterbury, J. D.; Guan, S.; Marshall, A. G. Rapid Commun. Mass Spectrom. 1996, 10, 1839-1844.
(21) Gomez, S. M.; Nishio, J. N.; Faull, K. F.; Whitelegge, J. P. Mol. Cell. Proteomics 2002, 1, 46-59.


Table 1. Comparison of Molecular Mass, Protein Ratios for the 14N/15N Pairs (anaerobic/aerobic), Anaerobic to Aerobic mRNA Levels, and Post-translational Modifications of Proteins Unambiguously Identified by FTICR-MS

gene name | ORF name | exptl Mr (Da) | theor Mr (Da) | protein ratio (a) | mRNA ratio (b) | ID method (c)         | PTMs (d)
ATP14     | YLR295C  | 10407.9-6     | 10407.3       | 0.25              | 0.51           | MS/MS                 | signal peptide
CMD1      | YBR109C  | 16046.7-10    | 16045.7       | 0.84              | 0.99           | MS/MS                 | Met-off, N-Ac
CPR1      | YDR155C  | 17416.0-11    | 17415.5       | 0.96              | 0.91           | IMT                   | N-Ac
HSP10     | YOR020C  | 11283.5-7     | 11283.1       | 0.51              | 0.70           | MS/MS                 | Met-off, N-Ac
RPL22A    | YLR061W  | 13562.6-8     | 13562.2       | 0.85              | 1.2            | MS/MS                 | none
RPL24A    | YGL031C  | 17613.7-11    | 17613.5       | 0.60              | 1.1            | MS/MS                 | Met-on
RPL24B    | YGR148C  | 17547.7-11    | 17547.5       | 0.71              | 1.1            | MS/MS                 | Met-on
RPL26B    | YGR034W  | 14103.7-8     | 14103.5       | 0.78              | 0.93           | MS/MS                 | none
RPL38     | YLR325C  | 8695.41-5     | 8695.32       | 0.89              | 1.2            | IMT                   | Met-off
RPS16A    | YMR143W  | 15758.8-10    | 15758.3       | 0.83              | 0.91           | IMT                   | Met-off, N-Ac
RPS20     | YHL015W  | 13818.1-8     | 13817.9       | 0.76              | 1.1            | IMT                   | Met-off, N-Ac
RPS28A    | YOR167C  | 7633.95-4     | 7633.86       | 0.78              | 1.1            | IMT                   | Met-on, N-Ac
RPS28B    | YLR264W  | 7606.92-4     | 7606.83       | 0.91              | 0.61           | IMT                   | Met-on, N-Ac
UBI4      | YLL039C  | 8556.50-5     | 8556.79       | 0.92              | 0.98           | IMT                   | Met-on
ANB1      | YJR047C  | 17323.8-11    | 17323.4       | N/A (e)           | 3.9            | MS/MS                 | N-Ac, Pi, hypusine
HYP2      | YEL034W  | 17501.3-11    | 17502.1       | N/A (e)           | 0.30           | intact mass value (f) | N-Ac, Pi, hypusine

(a) Protein expression ratios (14N anaerobic/15N aerobic) were calculated as described in Figure 2. (b) mRNA ratios (anaerobic/aerobic) are from previous cDNA microarray experiments.25 (c) IMT, intact mass tag; MS/MS, tandem mass spectrometry. (d) Met-on and Met-off refer to proteins expressed with or without the first methionine, N-Ac refers to N-terminal acetylation, and Pi refers to phosphorylation. Note that there is a 32-amino acid signal peptide in ATP14 and hypusine modifications of both ANB1 and HYP2. (e) 14N/15N ratios for these proteins could not be determined as one of the pair was expressed at a level below the detection limit. (f) Hyp2p was identified based on the 15N-labeled Mr intact mass value (assuming 98% 15N).
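The agreement between experimental and theoretical masses for the IMT-identified rows of Table 1 can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the isotope-offset suffix of each experimental value (for example the "-5" in 8556.50-5) is dropped, and the 50 ppm default is the threshold the text describes as reasonable for IMT identification.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed: float, theoretical: float, tol_ppm: float = 50.0) -> bool:
    """True if the observed mass matches the theoretical mass within tol_ppm."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# (gene, experimental Mr, theoretical Mr) for the rows of Table 1 identified by IMT.
# Isotope-offset suffixes on the experimental values are omitted here.
IMT_ROWS = [
    ("CPR1", 17416.0, 17415.5),
    ("RPL38", 8695.41, 8695.32),
    ("RPS16A", 15758.8, 15758.3),
    ("RPS20", 13818.1, 13817.9),
    ("RPS28A", 7633.95, 7633.86),
    ("RPS28B", 7606.92, 7606.83),
    ("UBI4", 8556.50, 8556.79),
]

if __name__ == "__main__":
    for gene, obs, theo in IMT_ROWS:
        flag = "ok" if within_tolerance(obs, theo) else "outside tolerance"
        print(f"{gene}: {ppm_error(obs, theo):+.1f} ppm ({flag})")
```

Under these assumptions every IMT row falls inside the 50 ppm window; a real workflow would also need to resolve which isotopic peak each reported mass refers to.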

and 5% CH3CN, and eluted with a linear gradient (5-95% CH3CN, 0.1% TFA over 20 min). One-milliliter fractions of the eluted proteins were collected manually based upon their peak intensities observed from the trace of absorbance at 220 nm and were lyophilized prior to FTMS analyses.

14N/15N Metabolic Labeling Strategy. (a) Yeast Growth Conditions. Liquid precultures of Saccharomyces cerevisiae strain JM43 (MATα leu2-3,112 his4-580 trp1-289 ura3-52 [ρ+])22 were grown at 28 °C with shaking (200 rpm) in minimal growth media23 containing yeast nitrogen base (Sigma), 5% glucose, 0.1% Tween 80 (a source of oleic acid), 20 mg/L ergosterol, 10 mM sodium succinate (pH 5.0), 0.016% silicon antifoam (SSG-TEA medium), and either 0.5% (15NH4)2SO4 (>98 atom percent excess; Sigma) (for aerobiosis) or 0.5% (NH4)2SO4 (for anaerobiosis).25 The medium was supplemented with 40 mg/L each of histidine, leucine, tryptophan, and uracil and the following vitamins: 1.6 mg/L thiamine, 2.4 mg/L nicotinamide, 1.6 mg/L pyridoxine, 3.2 mg/L calcium pantothenate, 2.0 mg/L inositol, and 0.016 mg/L biotin. 15N-Labeled histidine, leucine, and tryptophan (Spectra Stable Isotopes, Columbia, MD) were used for the aerobic cultures. Precultures were kept in midlog growth phase (20 kDa (Figure 4). Using a Mr value and the number of Cys residues as a constraint, >90% confidence in protein identifications is possible from a well-annotated yeast protein database (∼50 000 protein forms) at high mass accuracy. Of course, the level of confidence is much lower than that

(41) Kelleher, N. L.; Senko, M. W.; Siegel, M. M.; McLafferty, F. W. J. Am. Soc. Mass Spectrom. 1997, 8, 380-383.

Figure 4. Percentage of unique Cys-containing proteins in the yeast proteome as a function of the accuracy of protein mass determination. The percent uniqueness was calculated from the number of proteins with unique numbers of Cys residues within a certain mass accuracy (50, 100, and 200 ppm) divided by the total number of proteins within a 1000-Da mass window from 5 to 50 kDa.
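The uniqueness measure plotted in Figure 4 can be approximated with a brute-force count. This is a minimal sketch and not the authors' procedure: the function name, the tuple layout, and the toy proteins are hypothetical, and "unique" is taken to mean that no other protein with the same number of Cys residues lies within the stated ppm tolerance of the protein's mass.

```python
def percent_unique(proteins, tol_ppm, mass_range=None):
    """Percentage of proteins that are unique by (mass, Cys count).

    proteins   : list of (mass_da, n_cys) tuples
    tol_ppm    : mass accuracy in parts per million
    mass_range : optional (low, high) window in Da, e.g. a 1000-Da bin
    """
    if mass_range is not None:
        low, high = mass_range
        proteins = [p for p in proteins if low <= p[0] < high]
    if not proteins:
        return 0.0

    unique = 0
    for i, (mass_i, cys_i) in enumerate(proteins):
        window = mass_i * tol_ppm * 1e-6  # absolute tolerance in Da
        clash = any(
            j != i and cys_j == cys_i and abs(mass_j - mass_i) <= window
            for j, (mass_j, cys_j) in enumerate(proteins)
        )
        if not clash:
            unique += 1
    return 100.0 * unique / len(proteins)


# Toy, made-up entries: two proteins share a Cys count and differ by 0.3 Da.
TOY_PROTEINS = [(10000.0, 2), (10000.3, 2), (10000.3, 3), (12000.0, 2)]

if __name__ == "__main__":
    for tol in (10, 50, 100, 200):
        print(tol, "ppm:", percent_unique(TOY_PROTEINS, tol), "% unique")
```

The quadratic scan is adequate for a few thousand entries per mass bin; a sorted list with a sliding window would scale better for a full proteome.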

attainable from tandem MS with high mass accuracy, which can produce expectation values for database searching that are 6-50 orders of magnitude better, even when using extremely complicated databases such as the new human protein database.17 Metabolic Labeling Strategy and Top-Down Proteomics. Unlike the differential alkylation strategy, in which chromatographic shifts were observed, 14N- and 15N-containing proteins coeluted during RPLC separation (data not shown), which substantially reduced the quantitation error. Moreover, both protein forms are expected to behave similarly during the ionization process. Due to the incorporation of 15N, however, the isotopic distribution will be different for 15N- and 14N-containing proteins, especially at lower 15N incorporation efficiencies. As shown in a simulation of isotopic distributions for a 10-kDa protein with various percentages of 15N incorporated (Supporting Information Figure 2), as the percentage of 15N incorporation decreases, the isotopic distribution becomes much broader and the abundances of each isotopic peak decrease. These effects are even more dramatic for proteins with higher molecular mass. Therefore, a high level of 15N incorporation is critical for efficient 14N to 15N protein ratio determinations. Factors That Affect the Determination of Protein Ratios. Several factors, including the S/N, space charging, radial perturbation, and signal decay rate24,42,43 can influence protein ratio measurements in FTICR-MS. These can pose serious challenges for 14N/15N-metabolic labeling strategies, especially when the abundance of the two protein species is highly disparate. As described below, several measures were taken to minimize the impact of these factors on the protein ratio measurement. To investigate the influence of S/N on the protein ratio measurement, we carried out simulations to establish a model and devise a correction factor. 
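The broadening caused by incomplete 15N incorporation, described above, can be illustrated with a simple binomial model. This is only a sketch under stated assumptions: each nitrogen site is treated as independently labeled with the same probability, natural 13C and other isotopes are ignored, and the nitrogen count of 123 for a roughly 10-kDa protein is a hypothetical round figure.

```python
import math


def label_count_stats(n_nitrogen: int, incorporation: float):
    """Mean and standard deviation of the number of 15N atoms per molecule."""
    mean = n_nitrogen * incorporation
    sd = math.sqrt(n_nitrogen * incorporation * (1.0 - incorporation))
    return mean, sd


def most_likely_fraction(n_nitrogen: int, incorporation: float) -> float:
    """Fraction of molecules carrying the single most probable 15N count.

    A smaller value means the signal is spread over more isotopic peaks.
    """
    p = incorporation
    return max(
        math.comb(n_nitrogen, k) * p**k * (1.0 - p) ** (n_nitrogen - k)
        for k in range(n_nitrogen + 1)
    )


if __name__ == "__main__":
    n = 123  # hypothetical nitrogen count for a ~10-kDa protein
    for p in (0.98, 0.95, 0.90, 0.50):
        _, sd = label_count_stats(n, p)
        print(f"{p:.0%}: spread = {sd:.2f} Da, top peak = {most_likely_fraction(n, p):.3f}")
```

In this model the spread grows and the tallest peak shrinks as incorporation falls, which is the qualitative trend the text attributes to Supporting Information Figure 2.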
Since 14N/15N-labeled protein species are adjacent to each other and differentiated by 10-20 m/z, we assume the noise level is constant for both species during detection. Also, only additive white noise is considered, which models the random influence of indeterminate fluctuations in the system. Therefore, let IA = SA - N and IB = SB - N, where SA and SB (SA > SB) are signal intensities obtained from the

(42) Ong, S. E.; Kratchmarova, I.; Mann, M. J. Proteome Res. 2003, 2, 173-181.
(43) Gordon, E. F.; Muddiman, D. C. J. Mass Spectrom. 2001, 36, 195-203.

Figure 5. Effect of S/N on the signal intensity ratio for two protein species. (A) Simulation of two peak intensities (SA, SB) and additive noise (N) (signal intensities are IA = SA - N and IB = SB - N, respectively). (B) Change in the apparent ratio obtained from the peak intensities (SA/SB) as a function of increasing signal-to-noise (SB/N) ratios for the lower abundant species with different IA/IB ratios (10:1, 5:1, 2:1, and 1:1).

instrument with noise N and IA and IB are peak intensities without noise (Figure 5A) and thus reflect the true species abundances. When N > 0, the apparent ratio of signal intensities (SA/SB) does not reflect the true ratio of protein species abundances. To correct for this discrepancy, we devised a correction factor (γ) as described below. Let

$$\gamma = \frac{I_A/I_B}{S_A/S_B}$$

and thus

$$\gamma = \frac{S_B/N - S_B/S_A}{S_B/N - 1} \qquad (1)$$
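Equation 1 can be sketched as a small function. The names below are hypothetical and the code simply restates the algebra under the same additive-noise assumption (IA = SA - N, IB = SB - N); it is not the authors' ratio-determination software.

```python
def correction_factor(sb_over_n: float, sa_over_sb: float) -> float:
    """Correction factor gamma from eq 1.

    sb_over_n  : S/N of the lower-abundance species (S_B / N), must exceed 1
    sa_over_sb : observed intensity ratio (S_A / S_B)
    """
    if sb_over_n <= 1.0:
        raise ValueError("S_B/N must be greater than 1 for eq 1 to be defined")
    if sa_over_sb <= 0.0:
        raise ValueError("S_A/S_B must be positive")
    return (sb_over_n - 1.0 / sa_over_sb) / (sb_over_n - 1.0)


def corrected_ratio(sa: float, sb: float, noise: float) -> float:
    """Estimate I_A/I_B from observed intensities and a common noise level."""
    gamma = correction_factor(sb / noise, sa / sb)
    return gamma * sa / sb


if __name__ == "__main__":
    # Observed peaks of 550 and 100 on a noise level of 50 (illustrative numbers)
    # correspond to noise-free intensities of 500 and 50.
    print(corrected_ratio(550.0, 100.0, 50.0))
```

Because the factor divides by S_B/N - 1, it grows without bound as the weaker peak approaches the noise level, so ratios corrected at very low S/N should be treated with caution.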

The correction factor takes into account the S/N of the lower abundant species (SB/N) and the observed abundance ratio between the two species (SA/SB). If the species are of the same abundance (i.e., SA/SB = 1), or the noise is negligible, γ = 1. As the S/N decreases and the abundances of the two species become more disparate, the correction factor exerts a more substantive effect on the signal intensity ratio. To illustrate this effect, a simple simulation was run using values of IA = 500, 250, 100, and 50, and IB = 50, as a function of noise level (Figure 5B). When IA = IB, SA/SB is not influenced by SB/N and γ = 1. However, at low S/N ratios (e.g., SB/N < 5), the apparent ratio of SA/SB is spuriously low, especially when the IA to IB ratio is high. The influence of S/N on ratio measurements was previously discussed by Ong et al.42 Near the limit of detection, they found that a true 1:10 ratio of a peptide pair was observed to be 1:5.25, primarily due to background “noise” of the lower intensity member. Using an SB/N ratio of 2 and SA/SB ratio of 5.25, our formula provides a correction factor (γ) of 1.81, resulting in a final corrected ratio of 1:9.5, within 5% of the true ratio. Thus, by taking into account

the S/N, we can obtain a more accurate measurement of the ion pair ratio. This formula is a component of the algorithm we used for automatic ratio determination (Figure 2). However, since only the additive white noise was considered, it can only be applied to a linear system. In addition to S/N, it has been reported that both radial perturbations and different signal decay rates can result in a bias toward the higher abundance species during ion detection in FTICR-MS, especially when there are large differences in the abundance of the two ion species. This radial perturbation is due to the Coulombic repulsion between the two ion clouds as they interact during the excitation process. The differential signal decay phenomenon is attributed to the rates of loss of phase coherence for the two ion clouds, which results in the faster decay rates for the species of lower ion cloud density.43-45 Both of these effects become more serious as the length of the data collection period increases; therefore, to account for this effect, we truncated the time domain data before Fourier transformation. This procedure samples that earliest portion of the time domain transient that best reflects the ion abundance of the two species (vide infra).46 Thus, to measure the ratio between the protein pairs in complex mixtures, we used the nontruncated data for accurate protein Mr value determination given that it maintains high resolution and the truncated data for accurate protein expression ratio measurements. Application of 14N/15N-Metabolic Labeling Strategy to Yeast Proteome Analyses. To evaluate the suitability of using 14N/15N-metabolic labeling for quantitative analyses of intact proteins, we compared protein expression levels in yeast grown in the presence of (15NH4)2SO4 under aerobic conditions to those grown anaerobically in the presence of (14NH4)2SO4. 
Preliminary experiments with these different nitrogen sources revealed no difference (p > 0.05) in cellular growth rate (data not shown). After harvesting the final cultures (after 10 generations of growth) and isolating the proteins, equal masses of protein from the two conditions were mixed, fractionated by PF 2D, and analyzed by ESI-FTICR-MS as described above. The linearity of this strategy was established with r2 of 0.99 (Supporting Information Figure 3) using 1:3, 1:1, and 3:1 mixing of 14N/15N yeast lysate. Figure 6A shows the FTICR mass spectrum for ubiquitin (observed Mr = 8556.50-5 Da) identified using an intact mass tag (IMT)21 of 8556.79-5 Da from previous MS/MS experiments.26 The observed Mr value is within 50 ppm, a reasonable threshold for protein identification using the IMT approach. 15N incorporation in the aerobic sample was calculated to be 97.8% by dividing the 102.66-Da difference in the 14N/15N protein pair by the 105 nitrogen atoms in yeast ubiquitin. This is close to the theoretical maximum using (15NH4)2SO4 and 15N-amino acids with an atomic percentage of ≥98% 15N. Moreover, the percentage of 15N incorporation was consistent among different protein species and, thus, could be used for determining the number of nitrogen atoms in a protein, a modest constraint in database searching. Such a high percentage of 15N is required in order to minimize peak

(44) de Koning, L. J.; Kort, C. W. F.; Pinkse, F. A.; Nibbering, N. M. M. Int. J. Mass Spectrom. Ion Processes 1989, 95, 71-92.
(45) Mitchell, D. W.; Stephen, S. E. Int. J. Mass Spectrom. Ion Processes 1990, 96, 1-16.
(46) Farrar, T. C.; Elling, J. W.; Krahling, M. D. Anal. Chem. 1992, 64, 2770-2774.


Figure 6. Application of 14N/15N-metabolic labeling strategy to the quantitative profiling of the yeast proteome. (A) Three protein pairs were observed in one FTICR mass spectrum (50 scans). Each pair was indicated with the same colored dots for different charge states. The masses of the 14N-labeled protein species are listed beside the corresponding colored dots. Yeast ubiquitin was identified using an IMT with 97.8% 15N incorporation. (B) Five protein pairs were observed in one FTICR mass spectrum (50 scans). Each pair is indicated with the same colored bars for different charge states. The masses of the 14N-labeled protein species (ranging from 12.6 to 23.2 kDa) are listed beside their corresponding colored square.
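The 97.8% incorporation quoted for ubiquitin follows from a one-line calculation, sketched below. The function names are hypothetical. The default per-atom shift of 1.0 Da reproduces the arithmetic reported in the text (102.66 Da divided by 105 nitrogen atoms); passing the 15N minus 14N mass difference of about 0.99703 Da instead is offered as an optional refinement and is an assumption, not something the text states.

```python
N15_MINUS_N14_DA = 15.0001089 - 14.0030740  # ~0.99703 Da per substituted atom


def n15_incorporation(delta_mass_da: float, n_nitrogen: int,
                      per_atom_shift_da: float = 1.0) -> float:
    """Fractional 15N incorporation from the mass shift of a 14N/15N pair."""
    return delta_mass_da / (n_nitrogen * per_atom_shift_da)


def estimate_nitrogen_count(delta_mass_da: float, incorporation: float,
                            per_atom_shift_da: float = 1.0) -> int:
    """Nitrogen atom count implied by a mass shift at a known incorporation."""
    return round(delta_mass_da / (incorporation * per_atom_shift_da))


if __name__ == "__main__":
    simple = n15_incorporation(102.66, 105)
    refined = n15_incorporation(102.66, 105, N15_MINUS_N14_DA)
    print(f"nominal shift: {simple:.1%}, exact shift: {refined:.1%}")
    print("implied N atoms:", estimate_nitrogen_count(102.66, 0.978))
```

The second function runs the relationship in reverse, which is how a consistent incorporation level can serve as a constraint on the number of nitrogen atoms during database searching.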

broadening in the spectra and obtain an accurate ratio measurement, especially when using top-down approaches for intact protein analyses. In terms of 14N (anaerobic) versus 15N (aerobic) comparisons, the expression ratio for ubiquitin was 1:1.1. Two more protein pairs were also observed (indicated by different colored dots) from this mass spectrum with their 14N-labeled protein Mr values indicated. Overall, we were able to obtain up to eight protein ratios from a single FTICR mass spectrum. For example, the spectrum in Figure 6B shows five protein pairs (indicated by the colored bars) ranging in mass from 12 to 23 kDa. The ability to form multiple charge states creates the possibility that the 14N- and 15N-labeled protein species may not be ionized precisely in the same way. Therefore, if there is a greater relative portion of one species formed in one charge state, the remaining ions may be under-represented in other charge states, as observed for the 14N-labeled protein with Mr = 21 535.6 Da, whose ratios are not perfectly consistent for each charge state. However, by averaging all the charge states, the impact of this minor ionization effect is dampened. In total, we obtained 50 protein expression ratios from 26 mass spectra in comparing the anaerobic and aerobic samples. For those proteins for which we observed no substantial difference in expression between anaerobiosis and aerobiosis (46 in total),

Figure 7. FTICR mass spectrum for Atp14p (YLR295C). (A) Broadband spectrum showing a substantial difference in the expression ratio of the 14N and 15N protein species. The inset is the mass spectrum processed after five truncations. (B) Tandem MS with OCAD (45 scans). (C) ProSight PTM output using single protein mode. The first 32 amino acids (a known signal peptide) were truncated, resulting in 13 b-ions and 12 y-ions, which matched the predicted fragmented ions. The vertical lines with a hook toward the right represent y-ions, whereas those to the left represent b-ions.
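This section reports a mean anaerobic/aerobic ratio of 0.88 with its standard error for the unchanged proteins and notes that normalization is needed to offset a mixing bias. One simple way to do that is sketched below. It is an assumption, not the authors' documented procedure: ratios are divided by the mean of the pairs that change by less than 2-fold, the function names are hypothetical, and the input values are a handful of ratios borrowed from Table 1 purely for illustration.

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean (needs at least two values)."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate the SEM")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def is_changed(ratio: float, fold: float = 2.0) -> bool:
    """True if the ratio differs from 1:1 by at least `fold` in either direction."""
    return ratio >= fold or ratio <= 1.0 / fold


def normalize(ratios, fold: float = 2.0):
    """Divide every ratio by the mean ratio of the unchanged pairs."""
    unchanged = [r for r in ratios if not is_changed(r, fold)]
    center, _ = mean_and_sem(unchanged)
    return [r / center for r in ratios]


if __name__ == "__main__":
    example = [0.25, 0.84, 0.96, 0.85, 0.78]  # illustrative values only
    print(mean_and_sem(example[1:]))
    print(normalize(example))
```

A median would be a more outlier-resistant choice of center than the mean; either way the classification of a pair as changed should ideally be revisited after normalization.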

an average ratio of 0.88 ± 0.02 (SEM) (anaerobiosis to aerobiosis) was found. These results suggest a systematic error during proteome mixing, requiring normalization in order to obtain more accurate expression ratios.24 Of the 50 pairs, four were expressed at substantially different levels in the two conditions (≥2-fold difference). The FTICR mass spectrum for one such example is shown in Figure 7. As discussed above, we used the truncated data (Figure 7A inset) for protein ratio determination and the nontruncated data for protein mass determination and identification. The protein (Mr = 10 407.9-6 Da) was fragmented with OCAD (Figure 7B) and identified with ProSight PTM (Figure 7C) as Atp14p, a component of the mitochondrial ATP synthase H chain. The observed protein is 32 amino acids shorter than the translated primary sequence of the ATP14 gene, consistent with the removal of a known signal peptide.47 These results demonstrate the precision of top-down MS in characterizing the expression of a mature protein, which can be extremely challenging using bottom-up approaches. Four-fold (1:3.9) lower expression was observed for this protein under anaerobic conditions, as might be expected for a mitochondrial protein involved in oxidative phosphorylation. This level of expression correlates well with previous microarray analyses of the expression of its gene, which showed a 2-fold decrease in its mRNA level under steady-state anaerobiosis.25 In one FTICR mass spectrum, two protein species were observed that are apparently an anaerobic/aerobic protein pair, since they are adjacent to each other with close Mr values (Figure 8A). Both protein ions were fragmented together with OCAD (Figure 8C), and Anb1p was unambiguously identified as the 14N-containing species (anaerobic sample), with 9 b-ions and 26 y-ions consistent with N-terminal acetylation, a phosphorylation on Ser-1, a hypusine on Lys-50, and iodoacetamide alkylation of both cysteine residues (Figure 8D). Due to a limitation in the current software, the 15N-labeled species was not identified automatically. The observed mass of this species (17 501.3 Da) was 16.8 Da different from that anticipated for Anb1p (17 518.1 Da, Figure 8B). However, its mass matched (within 1 Da) the theoretical value for Anb1p’s 15N-labeled aerobic counterpart, namely, Hyp2p. These proteins form an anaerobic/aerobic isoform pair for the eukaryotic translation initiation factor eIF-5A (Figure 8B) and are 90% identical in amino acid composition.48,49 The results shown in Figure 8A are consistent with substantially reduced expression of the aerobic isoform under anaerobiosis and that of the anaerobic isoform under aerobiosis. Moreover, these results are consistent with previous Northern blot49 and cDNA microarray analyses,25 which show the two genes are reciprocally regulated by oxygen.48,50 Of the 50 protein ratios determined, 14 proteins were unambiguously identified by either IMT or MS/MS (Table 1). Most of these proteins have similar levels of expression under aerobic and

Figure 8. Aerobic and anaerobic isoforms of eukaryotic translation initiation factor eIF-5A. (A) Broadband scan showing the expression of 14N-Anb1p (anaerobic condition) and 15N-Hyp2p (aerobic condition). (B) Protein masses for the two gene products. (C) Tandem MS with OCAD (45 scans). (D) ProSight PTM output for Anb1p and characterization of its Ser-1 phosphorylation (+80 Da, circled), N-terminal acetylation (+42 Da, circled), and hypusine at Lys-50 (+87 Da, circled) together with two Cys alkylations from iodoacetamide (+57 Da, colored squares).

(47) Arselin, G.; Vaillier, J.; Graves, P. V.; Velours, J. J. Biol. Chem. 1996, 271, 20284-20290.
(48) Schwelberger, H. G.; Kang, H. A.; Hershey, J. W. J. Biol. Chem. 1993, 268, 14018-14025.
(49) Kwast, K. E.; Burke, P. V.; Staahl, B. T.; Poyton, R. O. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 5446-5451.
(50) Kwast, K. E.; Burke, P. V.; Poyton, R. O. J. Exp. Biol. 1998, 201, 1177-1195.


anaerobic conditions. A comparison of protein and mRNA expression ratios between anaerobic and aerobic conditions reveals that they are highly correlated. As shown in Table 1, several PTMs were observed including N-terminal acetylation, phosphorylation, hypusine, and signal peptide cleavage. Although no differences in PTMs were detected between the two conditions examined in this study, one of the primary advantages to top-down MS is the ability to easily detect and quantitate differences in PTM occupancies between two cell states (Figure 1C). CONCLUSIONS In this study, we evaluated two general strategies for measuring expression ratios of intact proteins using top-down mass spectrometry. Using a postgrowth covalent-labeling strategy, we show that the ratios of differentially alkylated proteins (either acrylamide or iodoacetamide) could be easily determined using ESI-FTICR-MS, and the number of Cys residues could be used as a constraint in database searching for protein identification without the need for MS/MS. However, due to chromatographic shifts, further development of alkylation tags with stable isotopes tailored for top-down MS is required. The concept of intact mass tags combined with stable isotope labeling can provide identification and quantitation of proteins and PTMs to the extent that they are visible in ESI-FTICR mass spectra of mixtures, with multiple proteins quantitatively profiled in a single mass spectrum. Therefore, it has the potential for high-throughput profiling of proteomes quantitatively at the intact protein level, which decreases the number of the analyses objects as compared to bottom-up approaches. With 100% sequence coverage, changes in PTM occupancy can also be semiquantitatively determined as depicted in Figure 1C. This approach also shares some pitfalls common to all stable isotope labeling-based mass spectrometric methods including a


S/N bias and dynamic range limitations. For FTICR MS, some care in spectral processing is required to eliminate effects that exaggerate protein ratios that are not near ∼1:1. Some manual interference is still required for data sampling and ratio verification, which leaves ample space for the development of more automated and reliable strategies for quantitative analyses. With a growing list of “intact mass tags” in hand, profiling a microbial proteome with minimal MS/MS will soon become feasible.26 Overall, we believe top-down approaches will become an important analytical strategy in quantitative proteomics that will help illuminate diverse molecular mechanisms in cell biology. ACKNOWLEDGMENT The authors thank Patricia V. Burke, Lihua Jiang, Yong-Bin Kim, and Paul Thomas for their assistance. The acid-labile analogue of SDS was a generous gift from Edward Bouvier and Reb Russell of the Waters Corp. The laboratory of N.L.K. received support from the National Science Foundation Career Award (CH 0134953), the National Institutes of Health (GM 067193), the Research Corporation (Cottrell Scholars Program), the Sloan Foundation, and the Henry and Lucille Packard Foundation. The laboratory of K.E.K. received support from the National Institutes of Health (GM 59826). The authors are also grateful to Jeff Chapman, John Hobbs, and Mark Lies of the Beckman Coulter for their assistance with the PF 2D system. SUPPORTING INFORMATION AVAILABLE Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org. Received for review June 4, 2005. Accepted November 7, 2005. AC050993P