Anal. Chem. 2000, 72, 1179-1185
Identification and C-Terminal Characterization of Proteins from Two-Dimensional Polyacrylamide Gels by a Combination of Isotopic Labeling and Nanoelectrospray Fourier Transform Ion Cyclotron Resonance Mass Spectrometry
Toshiyuki Kosaka,* Tomoko Takazawa, and Takemichi Nakamura
Biomedical Research Laboratories, Sankyo Company, Ltd., 2-58 Hiromachi 1-chome, Shinagawa-ku, Tokyo 140-8710, Japan
We propose a novel method for the identification and C-terminal characterization of proteins separated by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE). Proteins were digested in-gel in a buffer solution containing 50% 18O-labeled water, and the resulting mixtures of 18O/16O-labeled peptides were analyzed by nanoelectrospray Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS). The method was evaluated using horse skeletal muscle myoglobin as the model protein in an SDS gel. The high resolution of FT-ICR MS minimized the overlapping of peptide peaks and facilitated identification of the C-terminal peptide, which was recognized by its undisrupted isotope peak pattern. In addition, with its mass accuracy at the low-ppm level, FT-ICR MS can rapidly and reliably identify the in-gel-separated protein and determine its C-terminus by peptide mass fingerprinting alone. Therefore, this method should be applicable to routine and high-throughput proteome studies. Here, the method was applied to the analysis of rat liver proteins separated by 2D-PAGE. The C-termini of eight proteins were successfully identified out of 10 randomly picked Coomassie brilliant blue-stained spots. The feasibility and limitations of this approach are reported in this paper.
Owing to recent advances in analytical technology, large-scale proteome1,2 analysis has become a common and indispensable tool in such areas as functional genomics, pathophysiology, and drug-target discovery. Proteome analysis, in general, requires two experimental steps: (1) efficient separation of the expressed proteins in a biological system of interest and (2) effective identification of the proteins. For simultaneous separation of large numbers of proteins, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) is virtually the only method that is currently available.
The most sensitive and rapid way to identify gel-separated proteins is peptide mass fingerprinting by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS).3,4 In proteomic analysis, to study disease-related proteins or drug effects, protein expression patterns in 2D-PAGE are frequently compared between control samples and diseased or drug-treated cell or tissue samples.5-8 Furthermore, the study of posttranslational modifications such as phosphorylation and truncation to understand protein function is now a major focus of proteomic analysis. In this paper, to investigate posttranslational processing and/or truncation of proteins, we propose a novel method for the identification and C-terminal characterization of proteins separated by 2D-PAGE. It is widely known that a number of mature proteins are posttranslationally processed not only at the N-termini but also at the C-termini.9 Since current C-terminal sequencing technology is limited in terms of sensitivity and throughput, our knowledge about protein C-terminal processing is limited. However, there are several interesting and important questions. How often and how commonly are the C-termini of proteins processed? Are there any variations of C-terminal processing in various biological situations such as disease? How does the processing alter the protein structure and function? C-terminal processing influences 2D-PAGE separation: it reduces the protein's molecular weight, and processing of a C-terminal region that contains charged amino acids will shift the pI. Thus, the processed isoforms may be separable by 2D-PAGE.
* To whom correspondence should be addressed: (e-mail) kosaka@shina.sankyo.co.jp; (fax) +81-3-5436-8567.
(1) Wilkins, M. R.; Sanchez, J.-C.; Gooley, A. A.; Appel, R. D.; Humphery-Smith, I.; Hochstrasser, D. F.; Williams, K. Biotechnol. Genet. Eng. Rev. 1996, 13, 19-50.
(2) Kahn, P. Science 1995, 270, 369-370.
(3) Jensen, O. N.; Podtelejnikov, A.; Mann, M. Rapid Commun. Mass Spectrom. 1996, 10, 1371-1378.
(4) Jensen, O. N.; Podtelejnikov, A. V.; Mann, M. Anal. Chem. 1997, 69, 4741-4750.
(5) Benito, B.; Wahl, D.; Steudel, N.; Cordier, A.; Steiner, S. Electrophoresis 1995, 16, 1273-1283.
(6) Anderson, N. G.; Anderson, N. L. Electrophoresis 1996, 17, 443-453.
(7) Andersen, H. U.; Fey, S. J.; Larsen, P. M.; Nawrocki, A.; Hejnæs, K. R.; Mandrup-Poulsen, T.; Nerup, J. Electrophoresis 1997, 18, 2091-2103.
(8) Arnott, D.; O'Connell, K. L.; King, K. L.; Stults, J. T. Anal. Biochem. 1998, 258, 1-18.
(9) Thanos, D.; Maniatis, T. Cell 1995, 80, 529-532.
10.1021/ac991067b. Published on Web 02/17/2000. © 2000 American Chemical Society
Analytical Chemistry, Vol. 72, No. 6, March 15, 2000
Figure 1. (a) Broad-band nano-ES FT mass spectrum of a tryptic in-gel digest of myoglobin prepared in buffer containing 50% H2(18)O and (b) enlargement of the region including the C-terminal peptide, marked with an asterisk. Peaks attributable to autodigestion of trypsin are labeled with a T. The peaks at m/z 650.3140 were identified as the C-terminal peptide by their isotope peak pattern.
Several analytical methods for C-terminal characterization of proteins have previously been reported. One of them, considered to be an effective tool for C-terminal characterization, is ladder sequencing by carboxypeptidases and MALDI-TOF MS.10,11 However, this method is limited to small proteins and is not applicable to the analysis of proteins in 2D gels. Therefore, we propose a new approach that allows C-terminal characterization of proteins separated by 2D gels. The method consists of two steps. First, a protein is digested in-gel in a buffer containing 50% 18O-labeled water12 to label the C-termini of the proteolytic peptides, except for the peptide carrying the original protein C-terminus. Because the original C-terminal peptide remains unlabeled, it is distinguishable from the other proteolytic peptides.13-15 Second, the peptide mixture is analyzed by nanoelectrospray16 (nano-ES) Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometry.17 Since FT-ICR MS provides very high resolution and excellent mass accuracy, one should be
able to easily recognize the non-18O-labeled peptide from the mixture spectrum, and accurate mass fingerprinting helps confident protein and C-terminal identification. Nano-ES offers enough sensitivity to analyze proteins separated by 2D-PAGE. In addition, this is a potential method for high-throughput analysis since it is based on peptide mass fingerprinting, the quickest method for protein identification. In this paper, we investigated whether our method, which is a combination of isotope labeling and nano-ES FT-ICR mass spectrometry, would be applicable to identify the C-terminal of a protein in 2D-gel using a standard protein digest of horse myoglobin from SDS-PAGE. We also evaluated the feasibility and limitations of this approach, using protein spots from a 2D-PAGE of a rat liver cell lysate.
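The labeling step described above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes nominal-mass resolution, incorporation of at most one 18O atom per labeled peptide, and an 18O fraction of exactly 0.5. The function names and the `tolerance` value are hypothetical. It computes the natural isotope envelope of the myoglobin C-terminal peptide ELGFQG (C29H43N7O10) and the disrupted envelope that the same composition would show if it were labeled.

```python
# Natural isotope abundances at nominal mass offsets 0, +1, +2 (standard values).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425],
}


def convolve(a, b, n_peaks):
    """Combine two isotope distributions, keeping the first n_peaks offsets."""
    out = [0.0] * n_peaks
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < n_peaks:
                out[i + j] += x * y
    return out


def natural_envelope(formula, n_peaks=6):
    """Relative abundances of M, M+1, ... for an elemental composition."""
    env = [1.0] + [0.0] * (n_peaks - 1)
    for element, count in formula.items():
        for _ in range(count):
            env = convolve(env, ISOTOPES[element], n_peaks)
    return env


def labeled_envelope(env, fraction_18o=0.5):
    """Mix the envelope with a copy shifted by +2 (one 18O replacing 16O)."""
    shifted = [0.0, 0.0] + env[:-2]
    return [(1 - fraction_18o) * a + fraction_18o * b for a, b in zip(env, shifted)]


def looks_unlabeled(observed, formula, tolerance=0.2):
    """Compare the observed (M+2)/M ratio with the natural expectation."""
    expected = natural_envelope(formula, len(observed))
    return abs(observed[2] / observed[0] - expected[2] / expected[0]) < tolerance


# ELGFQG, the C-terminal tryptic peptide of horse myoglobin: C29H43N7O10
elgfqg = {"C": 29, "H": 43, "N": 7, "O": 10}
natural = natural_envelope(elgfqg)
labeled = labeled_envelope(natural)
print([round(x / natural[0], 2) for x in natural[:4]])  # undisrupted pattern
print([round(x / labeled[0], 2) for x in labeled[:4]])  # doublet-like pattern
```

In this simplified model the labeled peptide has an M+2 peak roughly as intense as its monoisotopic peak, whereas the unlabeled C-terminal peptide keeps a steadily decaying envelope, which is the visual cue the method relies on.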
(10) Thiede, B.; Wittmann-Liebold, B.; Bienert, M.; Krause, E. FEBS Lett. 1995, 357, 65-69.
(11) Patterson, D. H.; Tarr, G. E.; Regnier, W. D.; Martin, S. A. Anal. Chem. 1995, 67, 3971-3978.
(12) Kuster, B.; Mann, M. Anal. Chem. 1999, 71, 1431-1440.
(13) Rose, K.; Simona, M. G.; Offord, R. E.; Prior, C. P.; Otto, B.; Thatcher, D. R. Biochem. J. 1983, 215, 273-277.
(14) Rose, K.; Savoy, L.-A.; Simona, M. G.; Offord, R. E.; Wingfield, P. Biochem. J. 1988, 250, 253-259.
(15) Schnölzer, M.; Jedrezejewski, P.; Lehmann, W. D. Electrophoresis 1996, 17, 945-953.
(16) Wilm, M.; Mann, M. Anal. Chem. 1996, 68, 1-8.
(17) Marshall, A. G.; Hendrickson, C. L.; Jackson, G. S. Mass Spectrom. Rev. 1998, 1-35.
EXPERIMENTAL SECTION
Material and Sample Preparation. Myoglobin from horse skeletal muscle was purchased from Sigma (St. Louis, MO). Oxygen-18 enriched water (95 atom % 18O) was purchased from Aldrich (Milwaukee, WI). Trypsin of sequencing grade was purchased from Boehringer (Mannheim, Germany) and used without further purification. Rat livers were weighed and homogenized in a glass homogenizer with an amount of solubilizing solution (8.6 M urea, 5% 2-mercaptoethanol, 4% Triton X-100, 2.4% Ampholine pH 3.5-9.5) 9 times the liver weight (g). The homogenate was centrifuged at 100000g for 2 h, and the resulting supernatant was loaded onto Immobiline DryStrips (Pharmacia, Uppsala, Sweden). Electrophoresis was performed using a Pharmacia Multiphor II apparatus according to the manufacturer's instructions. Protein spots were visualized with a Coomassie brilliant blue (CBB) stain. To evaluate our method in terms of C-terminal characterization, we randomly selected 10 CBB-stained spots in the 10-80 kDa molecular mass range, which were observed in the 2D-PAGE of a cell lysate of a rat liver. Gel pieces of a myoglobin spot in the SDS gel and of the 10 spots in the 2D gel were excised and in-gel digested18 with trypsin in the presence of H2(18)O. Briefly, the gel pieces were washed with CH3CN/100 mM NH4HCO3 (1:1), dehydrated with CH3CN, and subsequently dried in a vacuum centrifuge. The gel pieces were rehydrated in 100 mM NH4HCO3/H2(18)O (1:1) containing trypsin (25 ng/µL) in an ice bath for 30 min. The supernatant was removed and replaced with the same buffer without trypsin. The enzymatic reaction proceeded overnight at 37 °C. Peptides were extracted with fresh 20 mM NH4HCO3 at room temperature and dried. Dried protein digests were dissolved in 0.1% TFA and desalted on ZipTip C18 pipet tips (Millipore, Bedford, MA). Eluates obtained with 50% CH3CN/0.1% TFA from the ZipTip C18 pipet tips were dried and redissolved in 3% AcOH and 50% MeOH for nano-ES analysis. To improve sequence coverage in nano-ES FT MS, a stepwise elution with 10, 20, 30, 40, and finally 50% CH3CN from the ZipTip pipet tips was performed and each fraction was analyzed.
Mass Spectrometry and Database Searching. An Apex II FT-ICR mass spectrometer with a 7-T superconducting magnet (Bruker Daltonics, Billerica, MA) equipped with a Wilm-Mann-type nano-ES source16 was used for the analysis. The mass spectrometer was calibrated externally using human angiotensin I and human adrenocorticotropic hormone fragment 18-39 as the standard peptides, and 16 scans were accumulated to obtain a 512K-point broad-band mass spectrum. The scan duty cycles varied between approximately 1 and 4 s, depending on the trapping time of the nano-ES-generated ions in a source hexapole ion guide. The MS-Fit software package19 was used as the peptide mass fingerprinting tool for protein identification and/or peptide assignment. The NCBInr and SwissProt protein databases were used for MS-Fit searches.
Figure 2. (a) Broad-band nano-ES FT mass spectrum of the tryptic in-gel digest of spot 9. (b) Enlargement of the region including the C-terminal peptide (the monoisotopic ion is marked with an asterisk). (c) Theoretical isotope distribution of the triply charged C-terminal peptide of spot 9.
(18) Shevchenko, A.; Wilm, M.; Vorm, O.; Mann, M. Anal. Chem. 1996, 68, 850-858.
(19) Clauser, K. R.; Baker, P.; Burlingame, A. L. In Proceedings of the 44th ASMS Conference on Mass Spectrometry and Allied Topics, Portland, OR, May 12-16, 1996; p 365.
RESULTS AND DISCUSSION
Recognition of the C-Terminal Peptide. To evaluate the feasibility of C-terminal peptide identification by a combination
of 18O-labeling and nano-ES FT-ICR MS, a standard protein digest was prepared and analyzed. Myoglobin from SDS-PAGE was in-gel digested with trypsin in buffer containing 50% H2(18)O. Figure 1a shows the broad-band nano-ES FT-ICR mass spectrum of the tryptic in-gel digest of myoglobin with 18O-labeling. Theoretically, the fragment attributable to the original protein C-terminus should be the only unlabeled peptide in the protein digest. Since FT-ICR MS offers very high resolution and good isotope patterns over a wide mass range, the isotope peak patterns of peptides with and without 18O-labeling can be clearly seen. The C-terminal peptide peak was readily identified by searching for the normal isotope pattern in the spectrum of the 50% 18O-labeled digest mixture. There was also no need to compare it with the control spectrum of a nonlabeled digest mixture. Figure 1b shows an enlarged part of the nano-ES FT-ICR mass spectrum of the tryptic in-gel digest of myoglobin with 18O-labeling. From the peak with an undisrupted isotope distribution, the C-terminal peptide (marked with *, m/z 650.3140) was recognized very easily.
Figure 3. (a) Broad-band nano-ES FT mass spectrum of the tryptic in-gel digest of spot 4 and (b) enlargement of the region including the C-terminal peptide. The monoisotopic ion of the C-terminal peptide is marked with an asterisk. The doubly charged ions (dotted) with 18O-labeling overlap with those of the C-terminal peptide.
Confident Identification of Proteins and C-Terminal Peptides. High mass accuracy at a low ppm level can be achieved on a routine basis with FT-ICR MS in the absence of internal standards. In database searching by peptide mass fingerprinting, high mass accuracy effectively reduces the chances of false positive matching and makes it possible to identify proteins confidently. The spectrum (Figure 1) of the myoglobin in-gel digest was calibrated externally, and the observed and calculated peptide masses are compared in Table 1. The mass accuracy was
Table 1. Calculated and Observed Masses of Horse Myoglobin Peptides from In-Gel Tryptic Digests(a)
m/z obsd(b) | charge | m/z calcd(c) | δ (ppm) | assigned peptide sequence(d)
650.3140 | | 650.3150 | -1.5 | (K)ELGFQG(-)
684.3723 | | 684.3721 | 0.3 | (K)FDKFK(H)
748.4341 | | 748.4357 | -2.2 | (K)ALELFR(N)
941.4711 | | 941.4733 | -2.3 | (K)YKELGFQG(-)
1086.5633 | | 1086.5618 | 1.4 | (K)HLKTEAEMK(A)
1271.6629 | | 1271.6636 | -0.5 | (R)LFTGHPETLEK(F)
636.3358 | 2 | 1271.6636 | 0.2 | (R)LFTGHPETLEK(F)
1360.7576 | | 1360.7589 | -0.9 | (K)ALELFRNDIAAK(Y)
680.8829 | 2 | 1360.7589 | -0.7 | (K)ALELFRNDIAAK(Y)
1378.8373 | | 1378.8422 | -3.6 | (K)HGTVVLTALGGILK(K)
689.9258 | 2 | 1378.8422 | 1.1 | (K)HGTVVLTALGGILK(K)
751.8398 | 2 | 1502.6698 | 1.3 | (K)HPGDFGADAQGAMTK(A)
803.9328 | 2 | 1606.8553 | 1.5 | (K)VEADIAGHGQEVLIR(L)
831.4287 | 2 | 1661.8539 | -2.6 | (R)LFTGHPETLEKFDK(F)
908.4537 | 2 | 1815.9030 | -1.9 | (-)GLSDGEWQQVLNVWGK(V)
618.6598 | 3 | 1853.9622 | 0.8 | (K)GHHEAELKPLAQSHATK(H)
927.4862 | 2 | 1853.9622 | 1.3 | (K)GHHEAELKPLAQSHATK(H)
646.3440 | 3 | 1937.0173 | -0.5 | (R)LFTGHPETLEKFDKFK(H)
969.0136 | 2 | 1937.0173 | 1.1 | (R)LFTGHPETLEKFDKFK(H)
661.3567 | 3 | 1982.0572 | -1.4 | (K)KGHHEAELKPLAQSHATK(H)
991.5338 | 2 | 1982.0572 | 1.3 | (K)KGHHEAELKPLAQSHATK(H)
(a) The corresponding nano-ES FT mass spectrum is shown in Figure 1. (b) Monoisotopic mass. (c) Monoisotopic (M + H)+. (d) Residues before/after peptide are in parentheses.
quite good over a wide range, within 3 ppm on average, even with an external calibration. This is better than the previously reported
Table 2. Calculated and Observed Peptide Masses from In-Gel Tryptic Digests of Spots 9 and 4 Observed in Rat Liver 2D-PAGE(a)
m/z obsd(b) | charge | m/z calcd(c) | δ (ppm) | assigned peptide sequence(d)
spot 9: heat shock cognate 71 kDa protein
774.4358 | | 774.4361 | -0.4 | (R)NTTIPTK(Q)
858.4597 | | 858.4573 | 2.8 | (R)GTLDPVEK(A)
509.2889 | 2 | 1017.5693 | 0.7 | (K)ITITNDKGR(L)
541.2891 | 2 | 1081.5682 | 2.0 | (K)LLQDFFNGK(E)
590.8156 | 2 | 1180.6214 | 1.7 | (K)VQVEYKGETK(S)
600.3419 | 2 | 1199.6748 | 1.0 | (K)DAGTIAGLNVLR(I)
626.3146 | 2 | 1251.6196 | 1.4 | (R)MVNHFIAEFK(R) 1 Met-Ox
627.3138 | 2 | 1253.6166 | 2.5 | (R)FEELNADLFR(G)
634.8327 | 2 | 1268.6560 | 1.2 | (K)MKEIAEAYLGK(T) 1 Met-Ox
652.3031 | 2 | 1303.5993 | -0.7 | (K)NSLESYAFNMK(A)
494.2574 | 3 | 1480.7549 | 1.1 | (R)ARFEELNADLFR(G)
740.8830 | 2 | 1480.7549 | 2.2 | (R)ARFEELNADLFR(G)
744.3561 | 2 | 1487.7018 | 1.7 | (R)TTPSYVAFTDTER(L)
522.6171 | 3 | 1565.8328 | 1.8 | (K)LLQDFFNGKELNK(S)
808.9007 | 2 | 1616.7882 | 3.3 | (K)SFYPEEVSSMVLTK(M)
816.8953 | 2 | 1632.7831 | -0.2 | (K)SFYPEEVSSMVLTK(M) 1 Met-Ox
564.5814 | 3 | 1691.7261 | 1.4 | (K)STAGDTHLGGEDFDNR(M)
846.3697 | 2 | 1691.7261 | 3.2 | (K)STAGDTHLGGEDFDNR(M)
596.6700 | 3 | 1787.9907 | 2.0 | (R)IINEPTAAAIAYGLDKK(V)
894.5025 | 2 | 1787.9907 | 3.6 | (R)IINEPTAAAIAYGLDKK(V)
607.9700 | 3 | 1821.8918 | 1.4 | (K)NQVAMNPTNTVFDAKR(L) 1 Met-Ox
613.3459 | 3 | 1838.0136 | 4.6 | (K)LDKSQIHDIVLVGGSTR(I)
991.5073 | 2 | 1981.9983 | 4.3 | (K)TVTNAVVTVPAYFNDSQR(Q)
925.4495 | 3 | 2774.3273 | 2.0 | (K)QTQTFTTYSDNQPGVLIQVYEGER(A)
1126.8372 | 3 | 3378.4895 | 1.9 | (K)LYQSAGGMPGGMPGGFPGGGAPPSGGASSGPTIEEVD(-) 2 Met-Ox
spot 4: cytochrome b5
738.4063 | | 738.4038 | 3.4 | (K)VYDLTK(F)
757.3073 | | 757.3078 | -0.7 | (R)LYMAED(-) 1 Met-Ox
593.8049 | 2 | 1186.5996 | 2.0 | (K)YYTLEEIQK(H)
1186.6025 | | 1186.5996 | 2.5 | (K)YYTLEEIQK(H)
476.9093 | 3 | 1428.7123 | -0.1 | (K)TYIIGELHPDDR(S)
714.8616 | 2 | 1428.7123 | 2.1 | (K)TYIIGELHPDDR(S)
756.3785 | 2 | 1511.7494 | -0.2 | (K)FLEEHPGGEEVLR(E)
504.5889 | 3 | 1511.7494 | 1.1 | (K)FLEEHPGGEEVLR(E)
548.6188 | 3 | 1643.8393 | 0.9 | (K)TYIIGELHPDDRSK(I)
822.4257 | 2 | 1643.8393 | 2.6 | (K)TYIIGELHPDDRSK(I)
735.9812 | 3 | 2205.9285 | -0.2 | (R)EQAGGDATENFEDVGHSTDAR(E)
1103.4710 | 2 | 2205.9285 | 2.6 | (R)EQAGGDATENFEDVGHSTDAR(E)
925.4235 | 4 | 3698.6595 | 3.0 | (K)FLEEHPGGEEVLREQAGGDATENFEDVGHSTDAR(E)
(a) The corresponding nano-ES FT mass spectra are shown in Figures 2 and 3. (b) Monoisotopic mass. (c) Monoisotopic (M + H)+. (d) ( ), residues before/after peptide; Met-Ox, oxidation of methionine.
delayed-extraction MALDI-TOF mass accuracy with internal calibration.3 In nano-ES FT-ICR MS, we can consistently obtain a mass accuracy of 5 ppm or better, even from the peptide maps of in-gel digest mixtures. Restricting the peptide m/z error tolerance to 5 ppm clearly improves the specificity of the database search. Therefore, in our method, one should be able to achieve a high level of confidence in the identification of a protein and its C-terminus by peptide mass fingerprinting alone. If there were a C-terminal truncation in the horse myoglobin, in other words, if the C-terminal peptide ion were observed at an m/z other than 650.3140, the ion should show a normal isotope pattern and be left as an unmatched mass. However, the protein should still be identifiable as horse myoglobin from the other peptide masses. Once the protein has been identified, fitting the accurate m/z value of the unmatched C-terminal peptide ion to the protein sequence would be the only step necessary to identify the C-terminus of the posttranslationally processed protein. In the case of horse myoglobin, the nano-ES FT-ICR mass spectrum of the in-gel tryptic digest mixture (Figure 1) covered 79% of the myoglobin sequence, and the C-terminal peptide was
characterized. It is unavoidable that some C-terminal peptides are not recovered from the gel by the in-gel digestion procedure. Further, when analyzing tryptic digests of a protein with a C-terminal lysine or arginine, the C-terminal peptide is not recognized.15 Therefore, one may not always be able to identify the C-terminal peptide, but should at least be able to achieve confident identification of the protein.
Application to Proteins Separated by 2D-PAGE. We investigated the application of our method to the C-terminal characterization and identification of proteins separated by 2D-PAGE. Ten spots (spots 1-10), which were observed in the 2D-PAGE of a cell lysate of a rat liver, were randomly selected and analyzed with our method. In a 2D-PAGE of complex protein mixtures, such as the expressed proteins of a tissue or a cell, comigrating or overlapping proteins are often observed. Even in such cases, the high mass accuracy of FT-ICR MS makes it possible to identify and characterize protein mixtures in a single spot. As a result of the analysis, the C-terminal peptides of seven spots could be identified from the peptide mixture eluted with 50% CH3CN from the ZipTip in one step. Here, we present the results for spot 4 (2D-PAGE of a cell lysate of a rat liver: experimental MW = 14 000, pI = 4.9) and spot 9 (MW = 68 000, pI = 5.5) in detail as examples. Figures 2a and 3a show broad-band nano-ES FT mass spectra of the in-gel tryptic digests with 18O-labeling of spot 9 and spot 4, respectively. The peaks corresponding to the unlabeled peptide were observed in these spectra (marked with *). The enlarged spectra of the unlabeled peptide regions of spot 9 and spot 4 are shown in Figures 2b and 3b, respectively. The peaks at m/z 1126.8372 of spot 9 can be identified as a triply charged ion without 18O-labeling. In the case of spot 4, the peaks at m/z 757.3073, which show the normal isotope pattern of a singly charged ion, overlap with the peaks at m/z 756.3785 (Figure 3b). The peaks at m/z 756.3785 in turn show the disrupted isotope pattern of a doubly charged ion due to 18O-labeling. Since one of the features of FT-ICR MS is its ultrahigh resolution, one can clearly characterize the C-terminal peptide even in such cases. Protein identification of spot 4 and spot 9 could be achieved by peptide mass fingerprinting using the m/z values observed in each FT mass spectrum. Database searching with a maximum mass error setting of 5 ppm gave a unique hit in the SwissProt database for each spot; i.e., spot 4 was cytochrome b5 and spot 9 was heat shock cognate 71 kDa protein. Table 2 shows the calculated peptide masses of these proteins relative to the observed peptide masses in the FT mass spectra. The mass accuracy was within 5 ppm. The peaks at m/z 757.3073 and 1126.8372 were identified as the C-terminal peptides of cytochrome b5 and heat shock cognate 71 kDa protein, respectively. The theoretical isotope distribution of m/z 1126.8372, corresponding to residues 610-646 of the heat shock cognate 71 kDa protein, is shown in Figure 2c. The distribution observed in the nano-ES FT-ICR mass spectrum (Figure 2b) agreed with the theoretical isotope distribution (Figure 2c). The observed peptides covered 51% of the cytochrome b5 sequence and 42% of the heat shock cognate 71 kDa protein sequence.
Figure 4. (a) Broad-band nano-ES FT mass spectrum of the tryptic in-gel digest of spot 3 eluted from the ZipTip desalting tip with 50% CH3CN. (b) Enlarged nano-ES FT mass spectra (m/z 879-885) of the 50% and (c) the 10% CH3CN fractions.
Figure 4a shows a broad-band nano-ES FT mass spectrum of an in-gel tryptic digest of spot 3 (MW = 20 000, pI = 5.5) with 18O-labeling, which was eluted with 50% CH3CN from the ZipTip in one step. No peak with a normal isotope pattern could be observed in this spectrum. However, database searching by peptide mass fingerprinting using the peptide m/z values observed in Figure 4a gave a unique hit. Spot 3 was phosphatidylethanolamine-binding protein, and the sequence coverage was 71%. In the absence of C-terminal truncation, the singly charged ion at m/z 881.4845 without 18O-labeling, corresponding to the C-terminal peptide of phosphatidylethanolamine-binding protein, would be expected in the nano-ES FT mass spectrum. Fractionation of the peptide mixtures prior to the nano-ES MS measurement could improve sequence coverage and increase the chance of observing a C-terminal peptide. Parts b and c of Figure 4 show the enlarged
Table 3. Summary of the C-Terminal Identification of 10 Rat Liver Proteins on 2D-Gel Using Nano-ES FT Mass Spectrometry
spot no. | protein identified(a)/sequence coverage | peptide without 18O-labeling obsd by nano-ES FT MS(b) | δ (ppm)(c) | assigned peptide sequence(d)
1 | ATP synthase D chain (P31399)/91% | 842.9253 (2) | -3.5 | (R)KYPYWPHQPIENL(-)
2 | superoxide dismutase (P07632)/56% | 1001.5452 (1) | -0.2 | (R)LAXGVIGIAQ(-)
3 | phosphatidylethanolamine-binding protein (P31044)/81% | 881.4868 (1) | 2.6 | (K)LHDQLAGK(-)
4 | cytochrome b5 (P00173)/51% | 757.3073 (1) | -0.7 | (R)LYMAED(-) 1 Met-Ox
5 | prohibitin (P24142)/58% | 1855.0397 (1) | 3.7 | (R)NITYLPAGQSVLLQLPQ(-)
6 | 3-hydroxyanthranilate 3,4-dioxygenase (3929397)/61% | not identified | | (K)PLG(-)(e)
7 | catechol-O-methyltransferase (P22734)/31% | 625.3068 (2) | -0.5 | (K)AIYQGPSSPDKS(-)
8 | ATP synthase β chain (P10719)/49% | 528.7517 (2) | -0.6 | (K)ADKLAEEHGS(-)
9 | heat shock cognate 71 kDa protein (P08109)/42% | 1126.8372 (3) | 1.9 | (K)LYQSAGGMPGGMPGGFPGGGAPPSGGASSGPTIEEVD(-) 2 Met-Ox
10 | heat shock protein 60 (P19227)/(51%) | not identified | | (K)DPGMGAMGGMGGGMGGGMF(-)(e)
(a) Accession numbers in the SwissProt or NCBInr database are in parentheses. (b) Monoisotopic mass; charge state in parentheses. (c) The difference between the observed mass of the nonlabeled peptide and the calculated mass of the C-terminal peptide. (d) Residues before/after peptide are in parentheses; Met-Ox denotes oxidation of methionine; X denotes carbamoylmethylcysteine. (e) The anticipated C-terminal peptide sequence from the protein identified by peptide mass fingerprinting.
nano-ES FT mass spectra (m/z 879-885) of the 50% and the 10% CH3CN fractions, respectively. The singly charged ion at m/z 881.4868 without 18O-labeling, the C-terminal peptide, was observed in the 10% CH3CN fraction, and the sequence coverage was improved to 81%. Because of the suppression effect in the nano-ES mass measurement of peptide mixtures, the C-terminal peptide was not observed when analyzing the single-step 50% CH3CN eluate from the ZipTip. The fractionation in the desalting step was effective for the identification of the C-terminal peptide.
In the case of spot 6 (MW = 33 000, pI = 5.5) and spot 10 (MW = 58 000, pI = 5.6), peptides without 18O-labeling were not observed. The problem here was the recovery of the C-terminal peptide from the gel and from ZipTip desalting. When the tryptic C-terminal peptide is very hydrophilic, the peptide fails to bind to the C18 resin of the ZipTip pipet tip. In addition, a very small C-terminal peptide is not detectable in FT-MS. On the other hand, when the tryptic C-terminal peptide is very large and hydrophobic, the recovery from the gel and ZipTip can also be low. Thus, different enzymes might be preferable for the identification of the C-terminal peptides of these spots. With regard to protein identification, all 10 spots analyzed in this study were successfully identified. The results of the protein identification and C-terminal characterization are summarized in Table 3.
CONCLUSION AND FURTHER PERSPECTIVES
In this study, we investigated a method for the identification and C-terminal characterization of a protein in a 2D gel by a combination of isotope labeling and nano-ES FT-ICR MS. Since FT-ICR MS can provide superb mass resolution and accuracy over a wide mass range, a single broad-band nano-ES FT-ICR mass spectrum of a 50% 18O-labeled digest was sufficient to distinguish the C-terminal peptide mass from a peptide mixture in many cases. Accurate mass fingerprinting of the protein digests permitted the identification of the proteins and their C-termini at a high level of confidence. This method is potentially compatible with high-throughput proteome analysis, as only a single mass spectrum of the digest and peptide mass fingerprinting are necessary for the identification of a protein and its C-terminus in ideal situations. The C-termini of 8 rat liver proteins were identified out of 10 proteins tested, which were separated by 2D-PAGE. The characterization of protein C-termini depends on two factors: (1) the recovery of the C-terminal peptide in the in-gel digestion and desalting steps; (2) the detection efficiency of the C-terminal peptide from the digest mixture in nano-ES FT MS. Improvement of the sample preparation procedure will be necessary to apply this method to global C-terminal characterization in proteome analysis. Optimization of the extraction procedures for in-gel digestion and the desalting step is now in progress.
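The fitting of an unmatched, unlabeled peptide mass to the identified protein sequence, as discussed for a hypothetical truncation of horse myoglobin, can be sketched as follows. This is an illustrative sketch only, not the MS-Fit algorithm. The function names are hypothetical, cleavage is assumed after every K or R with no modifications, and the sequence fragment is reconstructed from the overlapping peptides listed in Table 1. The constant `HYDROGEN` is an assumption: adding 1.007825 per charge appears to reproduce the calculated (M + H)+ values in Table 1.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
HYDROGEN = 1.007825  # assumption: mass added per charge (see lead-in)


def mh_plus(peptide):
    """Calculated monoisotopic (M + H)+ of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + HYDROGEN


def observed_mh(mz, charge):
    """Convert an observed m/z at a given charge state to (M + H)+."""
    return mz * charge - (charge - 1) * HYDROGEN


def ppm_error(observed, calculated):
    return (observed - calculated) / calculated * 1e6


def tryptic_peptide_ending_at(protein, end):
    """Peptide that ends at position `end` and starts after the previous K/R."""
    start = end - 1
    while start > 0 and protein[start - 1] not in "KR":
        start -= 1
    return protein[start:end]


def fit_c_terminus(protein, mz, charge, tol_ppm=5.0):
    """Try every possible C-terminus; return (peptide, end, ppm) matches."""
    observed = observed_mh(mz, charge)
    hits = []
    for end in range(len(protein), 0, -1):
        peptide = tryptic_peptide_ending_at(protein, end)
        error = ppm_error(observed, mh_plus(peptide))
        if abs(error) <= tol_ppm:
            hits.append((peptide, end, error))
    return hits


# C-terminal region of horse myoglobin, reconstructed from Table 1 peptides.
TAIL = "ALELFRNDIAAKYKELGFQG"
print(fit_c_terminus(TAIL, 650.3140, 1))  # intact C-terminus
print(fit_c_terminus(TAIL, 593.2935, 1))  # hypothetical loss of the final Gly
```

With a 5 ppm tolerance, each of the two example masses matches a single candidate C-terminus in this short fragment; on a full-length sequence more candidates would need to be screened, which is where the low-ppm accuracy matters.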
Received for review September 14, 1999. Accepted December 14, 1999. AC991067B