Anal. Chem. 1997, 69, 1315-1319
C-Terminal Sequence Analysis of Peptides and Proteins Using Carboxypeptidases and Mass Spectrometry after Derivatization of Lys and Cys Residues
Valentina Bonetto, Ann-Charlotte Bergman, Hans Jörnvall, and Rannar Sillard*
Department of Medical Biochemistry and Biophysics, Karolinska Institutet, S-171 77, Stockholm, Sweden
C-Terminal sequence analysis of peptides and proteins using carboxypeptidase digestion in combination with matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) is convenient for protein and peptide characterization. After a short digestion, a sequence of up to 20 residues can be identified, but the total number depends on the individual sequence. Due to the accuracy limits of the MALDI time-of-flight arrangement, the assignment of several residues with close mass values, including Lys/Glx, may remain ambiguous. We have used derivatization of lysine residues by guanidination to overcome the problem of Lys identification. The reaction is rapid and specific and results in full derivatization. In the case of Cys-containing peptides, problems arise from the fact that carboxypeptidases Y and P do not cleave peptides that contain nonderivatized cystine, cysteic acid, or (carboxymethyl)cysteine. Successful identification of Cys residues within the sequence is instead achieved by conversion of Cys to 4-thialaminine by (trimethylamino)ethylation. The two derivatizations of Lys and Cys side chains provide opportunities for proton attachment and therefore facilitate the analysis by MALDI-MS. This C-terminal sequence analysis method is also useful for large proteins after fragmentation with specific enzymes.

N-Terminal sequence analysis of peptides by sequencer-assisted Edman degradation is a routine method, but it is limited in the number of cycles it can sustain. In addition, many posttranslational modifications are not easily identifiable, and peptides that are N-terminally acetylated or otherwise blocked require special procedures before analysis (1) or cannot be analyzed at all. Therefore, it is of great interest to develop a versatile method for C-terminal sequence analysis as a complement to the N-terminal degradation and as a useful means for confirmation of DNA sequence data.
Several methods for C-terminal sequence analysis of proteins and peptides have been proposed (2-9). Both mass spectrometric and chemical procedures are available, but the approaches thus far have not been as successful as the Edman degradation. Enzymatic methods with carboxypeptidases are attractive because they are rapid and require comparatively inexpensive equipment for direct analysis of enzymatic digests with a variety of mass spectrometry (MS) techniques (7,8,10,11). After incomplete digestion, the sequence can be read in the correct order by calculating the differences between consecutive mass peaks. A convenient MS technique for this purpose is matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) because it tolerates relatively high concentrations of salts and other chemicals in the sample (12,13). Differences in the rates at which carboxypeptidases cleave individual amino acids can be largely overcome by using two or more carboxypeptidases in combination. A frequently used mixture consists of carboxypeptidases P (CPP) and Y (CPY), which are both nonspecific enzymes that complement each other, resulting in cleavages between virtually all amino acid residues (5,14,15). This method, therefore, has great potential for wide use.

However, a number of amino acid residues cannot be unequivocally identified by MALDI time-of-flight (MALDI/TOF) analysis because the MALDI/TOF arrangement has its limits in terms of resolution and accuracy of mass determination. Thus, e.g., Lys, Gln, and Glu have similar residue masses of 128.17, 128.13, and 129.11, respectively. To overcome this problem with the MALDI/TOF technique, we have modified Lys residues chemically prior to degradation with carboxypeptidases and analysis with MALDI-MS. Reaction with O-methylisourea results in the formation of homoarginine with a residue mass of 170.18 that is sufficiently distant from all other residue masses. In addition, we have also investigated several modifications of Cys residues to find one suitable for carboxypeptidase degradation and MALDI-MS of Cys-containing peptides. Combined, these two modifications of Lys and Cys residues increase the usefulness of MALDI-MS-assisted C-terminal sequence analysis.

* To whom correspondence should be addressed. Fax: +46 8 333 447. E-mail: [email protected].
(1) Bergman, T.; Gheorghe, M. T.; Hjelmqvist, L.; Jörnvall, H. FEBS Lett. 1996, 390, 199-202.
(2) Bailey, J. M.; Shively, J. E. Biochemistry 1990, 29, 3145-3156.
(3) Inglis, A. S. Anal. Biochem. 1991, 195, 183-196.
(4) Tsugita, A.; Takamoto, K.; Kamo, M.; Iwadate, H. Eur. J. Biochem. 1992, 206, 691-696.
(5) Nguyen, D. N.; Becker, G. W.; Riggin, R. M. J. Chromatogr. 1995, 705, 21-45.
(6) Cotter, R. J.; Cornish, T. J.; Cordero, M. In Mass Spectrometry in the Biological Sciences; Burlingame, A. J., Carr, S. A., Eds.; Humana Press: Totowa, NJ, 1996.
(7) Chait, B. T.; Chaudhary, T.; Field, F. H. In Methods in Protein Sequence Analysis; Walsh, K. A., Ed.; Humana Press: Clifton, NJ, 1987; pp 483-492.
(8) Rosnack, K. J.; Stroh, J. G. Rapid Commun. Mass Spectrom. 1992, 6, 637-640.
(9) Chait, B. T.; Wang, R.; Beavis, R. C.; Kent, S. B. Science 1993, 262, 89-92.
(10) Bradley, C. V.; Williams, D. H.; Hanley, M. R. Biochem. Biophys. Res. Commun. 1982, 104, 1223-1230.
(11) Schär, M.; Börnsen, K. O.; Gassmann, E. Rapid Commun. Mass Spectrom. 1991, 5, 319-326.
(12) Stults, J. T. Curr. Opin. Struct. Biol. 1995, 5, 691-698.
(13) Kallweit, U.; Börnsen, K. O.; Kresbach, G. M.; Widmer, H. M. Rapid Commun. Mass Spectrom. 1996, 10, 845-849.
(14) Thiede, B.; Wittmann-Liebold, B.; Bienert, M.; Krause, E. FEBS Lett. 1995, 357, 65-69.
(15) Woods, A. S.; Huang, A. Y.; Cotter, R. J.; Pasternack, G. R.; Pardoll, D. M.; Jaffee, E. M. Anal. Biochem. 1995, 226, 15-25.
S0003-2700(96)00896-7 CCC: $14.00 © 1997 American Chemical Society
Analytical Chemistry, Vol. 69, No. 7, April 1, 1997 1315

EXPERIMENTAL SECTION

Materials. (2-Bromoethyl)trimethylammonium bromide (BETA), O-methylisourea, α-cyano-4-hydroxycinnamic acid (CHCA), bovine insulin, insulin A and B chains with Cys residues modified to cysteic acid, somatostatin-14, and yeast alcohol dehydrogenase (YADH) were purchased from Sigma. CPY and CPP (analytical and sequencing grades) were purchased from Boehringer-Mannheim and Sigma, trypsin (sequencing grade) from Boehringer-Mannheim, and endoprotease Lys-C from Wako Biochemicals (Japan).

Peptide Isolation. Vasoactive intestinal polypeptide (VIP), secretin, and a 60-residue peptide with C-terminal cystine (PEC-60) were isolated from porcine intestine (16-18). Peptides from porcine brain extracts were obtained as side fractions during the isolation of galanin variant forms and endosulfine (19,20). The fraction eluted with 0.2 M NH4HCO3 from a carboxymethylcellulose column (19) was subjected to reverse phase HPLC on TSK ODS (7.8 mm × 300 mm) with an aqueous acetonitrile gradient containing 0.1% trifluoroacetic acid (TFA). One-minute fractions were collected, lyophilized, and checked for purity by capillary electrophoresis at pH 2.5 (21). Fraction 62 was found to be homogeneous, and an aliquot of it was used in the present study. For generation of peptides from a native protein, YADH was cleaved with Achromobacter Lys-C-specific protease at an enzyme-to-substrate ratio of 1:40 in 0.1 M NH4HCO3 and 0.6 M urea, pH 8.0, for 4 h at 37 °C. The peptides were purified by reverse phase HPLC.

Tryptic Digestion. Peptides were digested with trypsin at an enzyme-to-substrate ratio of 1:50 (w/w) in 0.1 M ammonium bicarbonate, pH 8, for 4 h at 37 °C. The fragments were purified by reverse phase HPLC.
Guanidination of Lys. The reaction was carried out as described (22) with some modifications. An aqueous solution of the peptide was mixed with an equal volume of 0.5 M O-methylisourea, adjusted to pH 10.5 with NaOH. The reaction was allowed to proceed for 4-5 h and was stopped with an equal volume of 1% aqueous TFA. Reverse phase HPLC on a Vydac C18 column was used for desalting after the reaction (solvents were A, 0.1% TFA/water, and B, 0.1% TFA in 80% acetonitrile/20% water; gradient, 0-60% B in 30 min).

Carboxymethylation and Oxidation of Cys. The peptide was carboxymethylated in 0.4 M Tris/HCl, 6 M guanidine/HCl, 2 mM EDTA, pH 8.15, by treatment under nitrogen in the dark with neutralized iodoacetic acid for 2 h at 37 °C, after reduction with dithiothreitol (DTT) for 2 h at 37 °C under nitrogen. Peptides were oxidized with performic acid (23).

(16) Mutt, V. Ark. Kemi 1959, 15, 69-74.
(17) Mutt, V. In Gut Hormones; Bloom, S. R., Ed.; Churchill Livingstone: Edinburgh, London, New York, 1978.
(18) Liepinsh, E.; Berndt, K. D.; Sillard, R.; Mutt, V.; Otting, G. J. Mol. Biol. 1994, 239, 137-153.
(19) Sillard, R.; Rökaeus, A.; Xu, Y.; Carlquist, M.; Bergman, T.; Jörnvall, H.; Mutt, V. Peptides 1992, 13, 1055-1060.
(20) Virsolvy-Vergine, A.; Salazar, G.; Sillard, R.; Denoroy, L.; Mutt, V.; Bataille, D. Diabetologia 1996, 39, 135-141.
(21) Bonetto, V.; Jörnvall, H.; Mutt, V.; Sillard, R. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 11985-11989.
(22) Kimmel, J. R. Methods Enzymol. 1967, 11, 584-589.
(Trimethylamino)ethylation of Cys. The peptides were (trimethylamino)ethylated (24) in 8 M urea, 0.2 M Tris/HCl, pH 8.5, in the presence of 0.5 M BETA, after reduction with DTT (2 h at 37 °C under nitrogen). The mixture was allowed to stand for 24-48 h under nitrogen at room temperature, and the peptides were purified by HPLC on a Vydac C18 column (solvents were A, 0.1% TFA/water, and B, 0.1% TFA in 80% acetonitrile/20% water; gradient, 0-60% B in 30 min).

Digestions with Carboxypeptidases. Peptides were dissolved in 50 mM sodium citrate buffer (pH 6.0) at 0.2 µg/µL. Approximately 500 pmol of peptide was used as starting material for the digestion. Stock solutions of CPY and CPP (0.05 µg/µL) were prepared in the same buffer, and the enzyme mixture was made from equal volumes of both stock solutions. Two microliters of the enzyme mixture was added to the peptide samples, and the reactions were allowed to proceed at room temperature. Aliquots of 2 µL were taken at intervals, and 2 µL of 1% TFA was then added to quench the reactions.

Mass Spectrometry. For mass spectrometry with a Finnigan Lasermat 2000, 0.5-µL samples of the different digestion mixtures were mixed on the MS target plate with 0.5 µL of saturated CHCA in 70% aqueous acetonitrile containing 0.1% TFA, and they were then left to dry at room temperature. Mass spectra were obtained from an average of 20-50 laser shot recordings at 337 nm, with bovine insulin or porcine VIP as internal standards.

RESULTS AND DISCUSSION

Nonderivatized Peptides. The C-terminally amidated peptides VIP and secretin and fragments of YADH were used as model peptides to test C-terminal sequence analysis with the mixture of CPY and CPP. Figure 1A shows a MALDI-MS spectrum of an aliquot taken from the digestion of VIP. The sample taken at 90 min reveals an 18-residue C-terminal sequence.
Acidification with 1% TFA to stop the reactions considerably improves the quality of the MALDI spectra, increasing the signal-to-noise ratio (25). It is evident that the C-terminal amide group of VIP does not block the analysis, in agreement with earlier reports (26). This is an advantage over chemical sequencing methods, where C-terminal amidation blocks analysis. Gaps in the sequence of successive mass peaks are sometimes noticed, as is exemplified by the lack of a peak in Figure 1A at the position corresponding to the peptide truncated by one residue. These gaps represent a problem when an unknown sequence is analyzed. However, low enzyme-to-substrate ratios (1:200-300 mol/mol) combined with long incubation times (a few hours) and repetitive analyses frequently reveal the missing value and, hence, the residue.

From porcine secretin, degraded with the CPY/CPP mixture, aliquots were taken at different time points for analysis by MALDI-MS. The sequence can be followed for 10-12 residues from the C-terminus after digestion for 10 min (Figure 1B). From the presence of additional mass peaks, it is evident that the secretin preparation contains an extra component (m/z = 2919.7), an N-terminally truncated form of secretin. C-Terminal degradation of the two peptides can be monitored simultaneously because of their different molecular masses, but because of the low content of the N-terminally truncated secretin, its sequence is only partially readable. The C-terminal sequence revealed by the singly charged ions can be partially confirmed by the doubly charged ions, as is the case for secretin.

To investigate the C-terminal sequence method for large proteins, fragments of YADH obtained with a Lys-C-specific endoprotease were subjected to degradation with the CPP/CPY mixture. The results with a 12-residue C-terminal sequence determination (Figure 1C) indicate that large proteins can also be analyzed via their fragments.

(23) Glazer, A. N.; DeLange, R. J.; Sigman, D. S. In Laboratory Techniques in Biochemistry and Molecular Biology; Work, T. S., Work, E., Eds.; Elsevier: Amsterdam, Oxford, New York, 1975.
(24) Itano, H. A.; Robinson, E. A. J. Biol. Chem. 1972, 247, 4819-4824.
(25) Bergman, A.-C.; Bergman, T. FEBS Lett. 1996, 397, 45-49.
(26) Breddam, K. Carlsberg Res. Commun. 1986, 51, 83-128.

Figure 1. C-terminal sequence analysis of underivatized peptides by MALDI-MS after digestion with CPY/CPP. (A) VIP, (1-10)YTRLRKQMAVKKYLNSILN-amide, digestion time 90 min. Labels indicate the molecular masses of peptides truncated C-terminally by the boxed number of residues. (B) Secretin, (1-15)DSARLQRLLQGLV-amide, digestion time 10 min. *, peaks obtained for a variant form of secretin truncated by one residue at the N-terminus; #, doubly charged ions. (C) YADH fragment 234-275 (234-258)VRANGTTVLVGMPAGAK, digestion time 15 min.

Figure 2. Guanidination of Lys to homoarginine by treatment with O-methylisourea.

Figure 3. C-terminal sequence analysis of VIP, (1-9)NYTRLRKQMAVKKYLNSILN-amide, derivatized by guanidination of Lys residues. The peptide was digested with CPY/CPP for 1 h. #, doubly charged ions; gK, guanidinated lysine.
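As a rough illustration of the mass bookkeeping behind Figures 2 and 3, the following Python sketch counts the Lys residues in the VIP segment printed in the Figure 3 caption and estimates the mass increase expected if every Lys is converted to homoarginine. The two residue masses (128.17 for Lys, 170.18 for homoarginine) are the values quoted in this paper; the function and variable names are hypothetical and are not part of the original work.

```python
# Residue masses (Da) as quoted in this paper.
LYS_RESIDUE_MASS = 128.17
HOMOARG_RESIDUE_MASS = 170.18

# Mass gained by one Lys residue on conversion to homoarginine.
GUANIDINATION_SHIFT = HOMOARG_RESIDUE_MASS - LYS_RESIDUE_MASS


def guanidination_mass_increase(sequence: str) -> float:
    """Expected mass increase (Da) if every Lys (K) in `sequence` is guanidinated."""
    return sequence.upper().count("K") * GUANIDINATION_SHIFT


# C-terminal segment of VIP as printed in the Figure 3 caption.
vip_segment = "NYTRLRKQMAVKKYLNSILN"
n_lys = vip_segment.count("K")

print(f"Lys residues in segment: {n_lys}")
print(f"Shift per Lys: {GUANIDINATION_SHIFT:.2f} Da")
print(f"Expected total increase: {guanidination_mass_increase(vip_segment):.2f} Da")
```

Under these assumptions, a shift of the intact-peptide peak close to the computed total would be consistent with full derivatization, whereas a smaller shift would point to residual underivatized Lys.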
Peptides Derivatized at Lys Residues. To examine whether Lys derivatization minimizes ambiguities in mass identification, we used VIP as a model peptide, since it contains three lysine residues close to the C-terminus. The Lys residues in VIP were derivatized with O-methylisourea (Figure 2). HPLC of the reaction solution showed a single peak, and subsequent mass analysis confirmed that the Lys residues were effectively derivatized to homoarginine (data not shown). The homoarginine obtained is easily distinguished from other residues by its unique mass. Peptides with guanidinated Lys residues (Figure 3) are readily cleaved with the mixture of CPP and CPY for analysis by MALDI-MS.

To investigate characterization of unidentified peptides, we purified a side fraction from the preparation of variant forms of galanin and endosulfine (19,20) by reverse phase HPLC. This basic fraction produced a chromatogram with a few well-resolved major peaks. The material corresponding to peak BP62 was subjected to conventional N-terminal sequence analysis, which failed, suggesting that the N-terminus of this protein is blocked. By C-terminal sequence analysis, it was, however, easily identified after modification with O-methylisourea (Figure 4). The amino acid sequence can be read for 12 residues. The lysines are identified unequivocally as homoarginine residues. Calculation of the mass values from several spectra reveals the sequence given in Table 1. However, the C-terminal residue itself and the residue at -11 remain ambiguous, since Ile and Leu have identical mass values. A search in the Swissprot database nevertheless identified the peptide as fragment 1-29 of ubiquinone-binding protein. This porcine form has not been analyzed before.

Cys-Containing Peptides. Applicability of the C-terminal sequence method to Cys-containing peptides is shown by analysis of PEC-60, its fragments, somatostatin-14, and insulin A and B chains. Intact PEC-60 (60 residues) with C-terminal cystine was not degraded by the CPY/CPP mixture at an enzyme-to-substrate ratio of 1:10 (w/w) for 24 h, and it is known that long peptides are not easily cleaved by carboxypeptidases. However, a shorter peptide, intact somatostatin-14, was also not degraded by the CPP/CPY mixture. Evidently, carboxypeptidases do not readily digest peptides with disulfide bridges. We therefore investigated peptides with stably modified cysteine residues, i.e., oxidized insulin B chain (commercially available, cysteic acid at positions -12 and -24) and performic acid-oxidized somatostatin-14 (prepared in-house, cysteic acid residue at C-terminus). No degradation was detected for oxidized somatostatin-14, while the degradation of the oxidized insulin B chain stopped at the 10th residue (Figure 5), i.e., one residue before the cysteic acid position. A similar situation was observed with oxidized insulin A chain, where no cleavage occurred and a cysteic acid is located at the penultimate position. We also tested carboxymethylated somatostatin-14 with the CPP/CPY mixture during 150 min incubation with several additions of excess enzyme. Degradation of tryptic fragments of PEC-60 with carboxymethylated Cys residues was also found to be terminated one residue before the carboxymethylated Cys residue.

Figure 4. Mass spectrum of brain peptide BP62, (1-12)LDGIRKWYNAAGFNKL, derivatized by guanidination of Lys residues. Digestion with CPY/CPP for 90 min. #, doubly charged ions.

Figure 5. Sequence analysis of insulin B chain, (1-18)C[SO3H]GERGFFYTPKA, containing Cys oxidized to cysteic acid and digested with CPY/CPP for 10 min (A) and 2 days (B).

Table 1. C-Terminal Sequence Analysis of O-Methylisourea-Treated Porcine Brain Peptide BP62 with Guanidinated Lys Residues (a)

position    ∆m(exp) ± SD (b)    ∆m(theor) (c)    assignment
-1          113.1 ± 0.4         113.2            I/L
-2          170.3 ± 0.4         170.2            gK
-3          114.2 ± 0.3         114.1            N
-4          147.3 ± 0.6         147.2            F
-5           57.5 ± 0.9          57.1            G
-6           70.7 ± 1.2          71.1            A
-7           69.9 ± 0.3          71.1            A
-8          113.8 ± 0.7         114.1            N
-9          163.1 ± 1.7         163.2            Y
-10         163.2 ± 1.4         163.2            Y
-11         185.6 ± 1.0         186.2            W
-12         170.6 ± 0.6         170.2            gK
-13         156.4 ± 0.3         156.2            R
-14         113.0 ± 0.4         113.2            I/L
-15          57.2 ± 0.2          57.1            G

(a) Guanidinated lysine (gK) from treatment with O-methylisourea. (b) Mean mass values, ∆m(exp), calculated for each residue from eight separate mass spectra by subtraction of the values for adjacent peaks and given together with the values of standard deviations (SD). (c) Theoretical residue masses.
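The way Table 1 turns mass differences into residue assignments can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residue masses are limited to those printed in Table 1 plus the Lys, Gln, and Glu values quoted in the introduction, the tolerance values are arbitrary choices made for the example, and the helper name `candidates` is hypothetical.

```python
# Residue masses (Da): Table 1 theoretical values plus the Lys/Gln/Glu
# values quoted in the introduction. Not a complete residue table.
RESIDUE_MASSES = {
    "G": 57.1, "A": 71.1, "I/L": 113.2, "N": 114.1, "F": 147.2,
    "R": 156.2, "Y": 163.2, "gK": 170.2, "W": 186.2,
    "K": 128.17, "Q": 128.13, "E": 129.11,
}

# Experimental mass differences from Table 1, positions -1 to -15.
DELTA_M_EXP = [113.1, 170.3, 114.2, 147.3, 57.5, 70.7, 69.9, 113.8,
               163.1, 163.2, 185.6, 170.6, 156.4, 113.0, 57.2]


def candidates(delta_m: float, tolerance: float) -> list[str]:
    """Residues whose mass lies within `tolerance` of `delta_m`, nearest first."""
    hits = [(abs(delta_m - mass), name)
            for name, mass in RESIDUE_MASSES.items()
            if abs(delta_m - mass) <= tolerance]
    return [name for _, name in sorted(hits)]


# Assumed tolerance of 1.5 Da, chosen only so that every Table 1 value
# finds at least one match; the nearest candidate is taken as the call.
assignments = [candidates(dm, tolerance=1.5)[0] for dm in DELTA_M_EXP]
print(" ".join(assignments))

# A hypothetical difference of 128.2 Da cannot separate Lys from Gln,
# whereas the homoarginine value has a single candidate.
print(candidates(128.2, tolerance=0.5))
print(candidates(170.3, tolerance=0.5))
```

With these inputs the nearest-match calls reproduce the assignment column of Table 1, while the hypothetical 128.2 Da difference returns both K and Q. That is the ambiguity the guanidination step is meant to remove.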
Figure 6. Derivatization of Cys residues with BETA to get 4-thialaminine.
Figure 7. C-Terminal sequence analysis of a Cys-derivatized tryptic fragment (23-40) of PEC-60 (IYDPVCGTDGVTYESECK), after treatment with BETA, digestion with CPP/CPY for 120 min, and Cys identification as 4-thialaminine (Thi).
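Figure 7 labels the derivatized cysteine as Thi. To give a feel for how well such a residue can be told apart from its nearest neighbor in mass, the sketch below expresses a mass separation in units of the measurement standard deviation. The residue masses are those quoted in this paper (189.19 for Thi, 186.21 for Trp, 128.17 for Lys, 128.13 for Gln); the SD of 1.0 Da is the value listed for the Trp position in Table 1 and is used here only as an assumed, representative figure. The function name is hypothetical.

```python
# Residue masses (Da) as quoted in this paper.
THI_RESIDUE_MASS = 189.19  # 4-thialaminine
TRP_RESIDUE_MASS = 186.21  # tryptophan
LYS_RESIDUE_MASS = 128.17
GLN_RESIDUE_MASS = 128.13

# Assumed representative measurement SD (Da), taken from the Trp row of Table 1.
ASSUMED_SD = 1.0


def separation_in_sd(mass_a: float, mass_b: float, sd: float) -> float:
    """Absolute mass difference between two residues, in units of `sd`."""
    return abs(mass_a - mass_b) / sd


thi_vs_trp = separation_in_sd(THI_RESIDUE_MASS, TRP_RESIDUE_MASS, ASSUMED_SD)
lys_vs_gln = separation_in_sd(LYS_RESIDUE_MASS, GLN_RESIDUE_MASS, ASSUMED_SD)

print(f"Thi vs Trp: {thi_vs_trp:.2f} SD")
print(f"Lys vs Gln: {lys_vs_gln:.2f} SD")
```

Under this assumed SD, Thi and Trp are separated by roughly three standard deviations, whereas underivatized Lys and Gln differ by only a small fraction of one.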
These results show that modifications to acidic derivatives of cysteine (cysteic acid or (carboxymethyl)cysteine) prevent carboxypeptidase digestion and C-terminal sequence analysis. We therefore examined a modification that leaves a positive charge at the Cys residue, using BETA. This modification converts Cys residues to positively charged, stable residues of 4-thialaminine (Thi) (Figure 6). Such a modification also facilitates mass determination by MALDI in the positive ion mode, as was found to be true with PEC-60 fragment (23-40) treated with BETA. The derivatized peptide was easily degraded, and Thi at position -2 from the C-terminus was identified (Figure 7). Somatostatin-14 was also readily degraded after this modification, and with the expected release of Thi. The derivatization is suitable because it gives a residue mass of 189.19, which is sufficiently different from that of tryptophan (186.21).

CONCLUSION

C-Terminal sequence analysis of underivatized peptides using carboxypeptidase digestions in combination with MALDI-MS is a powerful method. Complications include uncertain positions because of gaps in the mass spectra, similar masses of Lys/Glu/Gln, and blocks at Cys residues. The gap problem can be addressed by using low enzyme-to-substrate ratios and/or by analyzing several time points from one digestion. The Lys/Glx ambiguity can be overcome by specific derivatization with O-methylisourea. The resulting homoarginine is readily cleaved with CPP and CPY and identified using MALDI-MS. The Cys problem can be solved by a modification leading to the basic residue Thi, which is identified by MALDI-MS.

ACKNOWLEDGMENT

This work was supported by the Swedish Medical Research Council, Pharmacia Research Foundation, The Swedish Foundation for International Cooperation in Research and Higher Education, Novo Nordisk Foundation, Åke Wibergs Stiftelse, Clas Groschinskys Minnesfond, the Royal Swedish Academy of Sciences, the National Board for Laboratory Animals, the Swedish Cancer Society, Karolinska Institutet, and a fellowship from the Blanceflor Boncompagni-Ludovisi (née Bildt) Foundation. The supply of peptides by Professor Viktor Mutt, at this department, is gratefully acknowledged.

Received for review September 9, 1996. Accepted January 6, 1997.⊗
AC960896J
⊗ Abstract published in Advance ACS Abstracts, February 15, 1997.