Identification of carbohydrate structures in glycoprotein peptide maps

Anal. Chem. 1993, 65, 2953-2962


ACCELERATED ARTICLES

Identification of Carbohydrate Structures in Glycoprotein Peptide Maps by the Use of LC/MS with Selected Ion Extraction with Special Reference to Tissue Plasminogen Activator and a Glycosylation Variant Produced by Site Directed Mutagenesis

Andrew W. Guzzetta,† Louisette J. Basa,† William S. Hancock,*,† Bruce A. Keyt,‡ and William F. Bennett*,‡

Genentech Inc., 460 Point San Bruno Boulevard, South San Francisco, California 94080

Electrospray ionization mass spectrometry utilizing a single quadrupole on line with reversed-phase HPLC (LC/MS) enables the characterization of glycoproteins in a relatively short period of time. In this approach the protein is digested with a suitable protease, and the peptides are separated by reversed-phase HPLC and detected by electrospray ionization mass spectrometry. The glycopeptides are initially observed as a cluster of negatively sloping ions in a contour plot of data from the LC/MS run (m/z vs retention time) or as a characteristic series of masses at different elution times. The search for a particular glycopeptide is based on previously known carbohydrate structures and on consensus glycosylation sites. Further structural information is obtainable with glycosidase digestion and LC/MS analysis. The mass shifts following glycosidase digestion allow further confirmation of the structure. This approach identifies the site of attachment of two hybrid glycoforms to the T11 tryptic peptide in a reversed-phase tryptic map of recombinant tissue plasminogen activator (rt-PA). Use of selected ion extraction of the LC/MS data files allows one to graphically describe the elution order of closely related glycopeptides. The potential of LC/MS for the characterization of small amounts of unknown glycoproteins is shown by the study of an rt-PA mutant. A new potential site for glycosylation is created by site directed mutagenesis of wild type rt-PA with replacement of a threonine residue with asparagine at residue 103. An examination of a tryptic map shows that the mutant contains two new complex carbohydrate chains. The introduction of the new asparagine proximal to asparagine 117 changes this native high-mannose site in rt-PA to a complex-type glycosylation. This method allows rapid identification of carbohydrate-containing peptides and yields useful structural information on microgram amounts of material.

INTRODUCTION

Recent studies implicate glycoproteins in a diversity of biological processes.1-4 Alterations in carbohydrate structure can affect the efficacy and clearance of recombinant DNA-derived (rDNA) proteins. The carbohydrate moieties of erythropoietin have been directly linked to the secretion and biological activity of the molecule.5,6 An example of the importance of carbohydrates in protein metabolism is the increased clearance of nonsialylated forms of tissue plasminogen activator via the asialoglycoprotein receptor in the liver.7 The characterization of glycoproteins, however, is time consuming and requires significant amounts of material because of the chemical complexity and diversity of glycoforms present in an individual protein. The approach outlined in this report, based on reversed-phase HPLC on line with electrospray mass spectrometry, significantly reduces the material and time requirements for the study of carbohydrate structures. Such studies can give important information on the efficiency with which a cell line sialylates a particular glycoprotein, and the approach is applicable to routine cell line screening and rapid monitoring of cell culture process changes. The ability to recognize glycosylation trends early and rapidly in the development of a glycoprotein pharmaceutical can save enormous amounts of money in downstream development costs.

Reversed-phase HPLC mapping of protein fragments produced by enzymatic or chemical cleavage is a standard method for protein characterization.8-10 Unfortunately, HPLC-based tryptic mapping, while an invaluable technique for protein structural studies,10 has been disappointing when applied to glycoproteins. For example, with recombinant tissue plasminogen activator (rt-PA) it was observed that the glycopeptides were present in the tryptic map as broad peaks due to carbohydrate microheterogeneity.10 Therefore a common approach for the characterization of oligosaccharide structures involves the cleavage of the linkage between the saccharide and polypeptide chain either by chemical11 or enzymatic reaction.12 The oligosaccharides can then be fractionated by approaches such as high-pH anion-exchange (HPAE) chromatography.13,14 The structure of the fractionated saccharides can then be studied by techniques such as mass spectrometry15-17 and NMR.18 A recent advance in high-pH anion-exchange chromatography is the ability to link this technique to electrospray ionization (ESI) mass spectrometry.19 HPAE-MS allows further confirmation of the hydrolyzed and released mono- and oligosaccharides.

Mass measurement of proteins and peptides using electrospray ionization technology with quadrupole mass filters has recently come of age.20-24 For example, the on-line monitoring of chromatographic separations (LC/MS) was reported for methionyl human growth hormone, using tryptic mapping and LC/MS/MS.25 We also reported the application of ESI LC/MS technology to the tryptic map of rt-PA.26 The carbohydrate peptide families were observed in that study as characteristic descending ion clusters in a contour plot of an rt-PA tryptic LC/MS run. The mass scans in these regions were then averaged to reveal ions associated with the rt-PA carbohydrate peptides. This new approach allows the construction of a more detailed elution map of the glycopeptides. Hemling et al.27 studied the carbohydrates of CD4 by ESI LC/MS, compared the results to those obtained by fast atom bombardment (FAB), and demonstrated the potential of ESI LC/MS in the study of glycoproteins. In part our technique relies on glycosylation heterogeneity to produce the characteristic ion clusters in the contour plot. Recently a study was published in which glycosylation fragments (oxonium ions) are detected at low mass through fragmentation of the glycopeptide before the first quadrupole. This method of Carr et al. allows identification of glycopeptide location in an LC/MS map and could be used when glycosylation heterogeneity does not exist.28 The study of the mass data described in this article allows identification of the general structures of glycoforms and also localizes the site of carbohydrate attachment to the polypeptide backbone. The study uses rt-PA and a glycosylation variant to demonstrate the power of ESI LC/MS to locate the glycopeptides in the LC/MS map and to describe the complex elution characteristics of the many glycoforms.

† Department of Medicinal and Analytical Chemistry.
‡ Department of Cardiovascular Research.
(1) Dube, S.; Fisher, J. W.; Powell, J. S. J. Biol. Chem. 1988, 263, 17516-17521.
(2) Dorner, A. J.; Bole, D. J.; Kaufman, R. J. J. Cell Biol. 1987, 105, 2665-2674.
(3) Siriam, M. R. FASEB J. 1989, 3, 1915-1926.
(4) Hotchkiss, A.; Refino, C. J.; Leonard, C. K.; O'Connor, J. V.; Crowley, C.; McCabe, J.; Tate, K.; Nakamura, G.; Powers, D.; Levinson, A.; Molder, M.; Spellman, M. W. Thromb. Haemostasis 1988, 60, 225-261.
(5) Wittwer, A. J.; Howard, S. C. Biochemistry 1990, 29, 4175-4180.
(6) Delorme, E.; Lorenzini, T.; Giffin, J.; Martin, F.; Jacobsen, F.; Boone, T.; Elliot, S. Biochemistry 1992, 31, 9871-9876.
(7) Cole, E. S.; Nichols, E. H.; Poisson, L.; Harnois, M. L.; Livingston, D. J. Fibrinolysis 1993, 7, 15-22.
(8) Hancock, W. S.; Bishop, C. A.; Hearn, M. T. W. Anal. Biochem. 1978, 89, 203-212.
(9) Kohr, W. J.; Keck, R.; Harkins, R. N. Anal. Biochem. 1982, 122, 348-359.
(10) Chloupek, R. C.; Harris, R. J.; Leonard, C. K.; Keck, R. G.; Keyt, B. A.; Spellman, W. M.; Jones, A. J. S.; Hancock, W. S. J. Chromatogr. 1989, 463, 375-396.
(11) Tasaki, S.; Mizuochi, T.; Kobota, A. Methods Enzymol. 1982, 83, 263-268.
(12) Maley, F.; Trimble, R. B.; Tarentino, A. L.; Plummer, T. H. Anal. Biochem. 1989, 180, 195-204.
(13) Towsend, R. R.; Hardy, M. R.; Hindsgaul, O.; Lee, Y. C. Anal. Biochem. 1988, 174, 459-470.
(14) Basa, L. J.; Spellman, M. W. J. Chromatogr. 1990, 499, 205-220.
(15) Webb, J. W.; Jiang, K.; Gillece-Castro, B. L.; Tarentino, A. L.; Plummer, T. H.; Byrd, J. C.; Fisher, S. J.; Burlingame, A. L. Anal. Biochem. 1988, 169, 337-349.

© 1993 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 65, NO. 21, NOVEMBER 1, 1993

Table I. Masses of Monosaccharide Units(a)

                          charged state
sugar(b)      mass      +2      +3      +4      +5
fucose        146.1     73.0    48.7    36.5    29.2
hexose        162.1     81.0    54.0    40.4    32.4
HexNAc        203.2    101.6    67.7    50.8    40.6
NANA          291.3    145.6    97.1    72.8    58.2
HexNAc+Hex    365.3    182.6   121.8    91.3    73.1

(a) The residue mass of a sugar as linked in a polysaccharide. (b) HexNAc = N-acetylhexosamine (in this case most likely N-acetylglucosamine); NANA = N-acetylneuraminic acid; HexNAc+Hex is the branch unit that differentiates diantennary from triantennary and from tetraantennary complex-type carbohydrate structures in rt-PA.
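The charged-state columns of Table I are the residue masses divided by the charge, which is the m/z spacing expected between two glycoforms that differ by one sugar unit. The arithmetic can be sketched in Python as follows. The names `RESIDUE_MASS`, `mz_spacing`, and `identify_sugar` are illustrative and do not come from the original work, and the nearest-match lookup with its 0.5 Da tolerance is an assumed heuristic rather than the authors' procedure.

```python
# Residue masses (Da) copied from Table I.
RESIDUE_MASS = {
    "fucose": 146.1,
    "hexose": 162.1,
    "HexNAc": 203.2,
    "NANA": 291.3,
    "HexNAc+Hex": 365.3,
}


def mz_spacing(sugar, charge):
    """m/z difference between two glycoforms differing by one `sugar` at `charge`."""
    return RESIDUE_MASS[sugar] / charge


def identify_sugar(observed_spacing, charge, tolerance=0.5):
    """Return the Table I unit that best explains an observed m/z spacing.

    Hypothetical helper: picks the nearest residue mass and returns None
    when nothing lies within `tolerance` Da of the implied mass difference.
    """
    mass_difference = observed_spacing * charge
    best = min(RESIDUE_MASS, key=lambda s: abs(RESIDUE_MASS[s] - mass_difference))
    if abs(RESIDUE_MASS[best] - mass_difference) > tolerance:
        return None
    return best


if __name__ == "__main__":
    # Sialic acid heterogeneity at the +3 charged state.
    print(round(mz_spacing("NANA", 3), 1))
    print(identify_sugar(97.1, 3))
```

Working from the spacing back to the sugar is the direction used when a cluster is inspected without prior knowledge of the glycoprotein; the charge must be known or tested over a small range.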

EXPERIMENTAL SECTION

Materials. Tissue plasminogen activator and the TK variant of rt-PA were purified from CHO cell supernatant.29 TPCK trypsin was purchased from Cooper Biomedical. Neuraminidase (Type X from Clostridium perfringens) was obtained from Sigma. HPLC/Spectro Grade trifluoroacetic acid (TFA) supplied in 1-g ampules was from Pierce. Acetonitrile and water (UV grade) were from Burdick & Jackson.

Neuraminidase Digestion. Tissue plasminogen activator was digested at 37 °C for 18 h with 0.2 unit of neuraminidase/1 mg of rt-PA. The digestion buffer was 200 mM NaOAc, 2 mM CaCl2, and 0.04% azide, with the pH adjusted to 5.6. All neuraminidase digests were performed on the intact molecule prior to trypsin digestion.

(16) Chai, W.; Cashmare, G. C.; Stol, M. S.; Gaskell, S. J.; Orkiszewski, R. S.; Lawson, A. M. Biol. Mass Spectrom. 1991, 20, 313-323.
(17) Burlingame, A. L.; Baillie, T. A.; Derrick, P. J. Anal. Chem. 1986, 58, 165R-211R.
(18) Vliegenthart, J. F. G.; Dorland, L.; Van Halbeek, H. Adv. Carbohydr. Chem. 1983, 41, 209-374.
(19) Henion, J.; Conboy, J. J. Biol. Mass Spectrom. 1992, 21, 397-407.
(20) Loo, J. A.; Udseth, H. R.; Smith, R. D. Anal. Biochem. 1989, 179, 404-412.
(21) Carr, S. A.; Hemling, M. E.; Bean, M. F.; Roberts, G. D. Anal. Chem. 1991, 63, 2802-2804.
(22) Ikonomou, M. G.; Blades, A. T.; Kebarle, P. Anal. Chem. 1991, 63, 1989-1998.
(23) Chait, B. T.; Kent, S. B. H. Science 1992, 257, 1885-1894.
(24) Fenn, J. B.; Mann, M.; Meng, C. K.; Wong, S. F.; Whitehouse, C. M. Science 1989, 246, 64-71.
(25) Covey, T. R.; Huang, E. C.; Henion, J. D. Anal. Chem. 1991, 63, 1193-1200.
(26) Ling, V.; Guzzetta, A. W.; Canova-Davis, E.; Stults, J. T.; Covey, T. R.; Shushan, B. I.; Hancock, W. S. Anal. Chem. 1991, 63, 2909-2915.
(27) Hemling, M. E.; Roberts, G. D.; Johnson, W.; Covey, T. R.; Carr, S. A. Biomed. Environ. Mass Spectrom. 1990, 19, 677-691.
(28) Carr, S. A.; Huddleston, M. J.; Bean, M. F. Protein Sci. 1993, 2, 183-196.


Table II. Glycoforms of rt-PA Glycosylated Tryptic Peptides Plus and Minus Neuraminidase Treatment

The "+3 ion" column lists the +3 ion used in extracted ion plots.

                                               untreated mass            treated mass
peptide  glycoform(a)     NANA(b)  +3 ion    expected(c)  observed(d)  expected   observed
T45      diantennary      0         967.3    2898.90      2899.0       2898.90    2899.1
T45      diantennary      1        1064.4    3190.16      3190.2
T45      diantennary      2        1161.5    3481.42      3481.6
T45      triantennary     0        1089.1    3264.24      -(e)         3264.24    3264.3
T45      triantennary     1        1186.2    3555.50      3556.5
T45      triantennary     2        1283.3    3846.76      3846.9
T45      triantennary     3        1380.3    4138.02      4138.0
T45      tetraantennary   0        1210.9    3629.58      -            3629.58    3629.6
T45      tetraantennary   1        1307.9    3920.84      -
T45      tetraantennary   2        1405.0    4212.10      -
T45      tetraantennary   3        1502.1    4503.36      4503.4
T45      tetraantennary   4        1599.2    4794.62      4794.0
T17      diantennary      0        1616.6    4846.79      4846.9       4846.79    4846.9
T17      diantennary      1        1713.7    5138.05      5137.9
T17      diantennary      2        1810.8    5429.31      5429.4
T17      triantennary     0        1738.4    5212.13      -            5212.13    5211.9
T17      triantennary     1        1835.5    5503.39      5504.1
T17      triantennary     2        1932.5    5794.65      5794.1
T17      triantennary     3        2029.6    6085.90      6085.4
T17      tetraantennary   0        1860.2    5577.46      -            5577.46    5577.3
T17      tetraantennary   1        1957.2    5868.72      -
T17      tetraantennary   2        2054.3    6159.98      -
T17      tetraantennary   3        2151.4    6451.24      -
T17      tetraantennary   4        2248.5    6742.50      -
T11      5 mannose                 1412.7    4235.20      4235.1       4235.20    4235.2
T11      6 mannose                 1466.8    4397.42      4397.1       4397.42    4397.3
T11      7 mannose                 1520.9    4559.56      4559.7       4559.56    4559.2
T11      hybrid 1         1        1577.6    4729.74      4730.0       4438.48    4438.5
T11      hybrid 2         1        1631.6    4891.88      4891.6       4600.62    4600.2
T11C     5 mannose                 1312.7    3934.96      3934.5       3934.96    3934.6
T11C     6 mannose                 1366.7    4097.11      4097.0       4097.11    4096.9
T11C     7 mannose                 1420.7    4259.25      4259.7       4259.25    4259.7

(a) Hybrid 1 has the formula GlcNAc3Man4Gal1NANA1. Hybrid 2 has the formula GlcNAc3Man5Gal1NANA1. (b) NANA is the number of terminal sialic acids (N-acetylneuraminic acid) on each oligosaccharide. (c) Expected mass is calculated from the average mass. (d) The observed masses were corrected using a weighted average of the major glycopeptides to allow for instrument drift during the gradient analysis. (e) -, denotes glycopeptides not observed.

Table III. T11 Glycopeptides Observed in a LC/MS Analysis of the Trypsin Digest of the TK Mutant before and after Neuraminidase Treatment

The "+4 ion" column lists the +4 ions used in extractions.

                                  glycosylation                         untreated mass           treated
no.  T11 glycoform(a,b)           sites on T11                 +4 ion   expected(c)  observed   observed mass
1    Gal4Man6GlcNAc8Fuc1          two diantennary              1607.1   6424.3       6424.5     6422.9
2    Gal4Man6GlcNAc8Fuc2          two diantennary              1643.6   6570.5       6569.0     6569.1
3    Gal5Man6GlcNAc9Fuc1          diantennary and triantennary 1698.4   6789.7       -(d)       6789.2
4    Gal5Man6GlcNAc9Fuc2          diantennary and triantennary 1735.0   6935.8       -          6935.1
5    Gal6Man6GlcNAc10Fuc1         two triantennary             1789.8   7155.0       -          7154.3
6    Gal6Man6GlcNAc10Fuc2         two triantennary             1826.3   7301.1       -          7301.4
7    NANA1Gal4Man6GlcNAc8Fuc1     two diantennary              1679.9   6715.6       6715.8     -
8    NANA1Gal4Man6GlcNAc8Fuc2     two diantennary              1716.4   6861.7       6862.0     -
9    NANA2Gal4Man6GlcNAc8Fuc1     two diantennary              1752.7   7006.8       7006.6     -
10   NANA2Gal4Man6GlcNAc8Fuc2     two diantennary              1789.2   7153.0       7152.8     -
11   NANA3Gal4Man6GlcNAc8Fuc1     two diantennary              1825.5   7298.1       7297.6     -
12   NANA3Gal4Man6GlcNAc8Fuc2     two diantennary              1862.1   7444.2       7443.4     -
13   NANA4Gal4Man6GlcNAc8Fuc1     two diantennary              1898.3   7589.4       7587.5     -
14   NANA4Gal4Man6GlcNAc8Fuc2     two diantennary              1934.9   7735.5       -          -

(a) Compositional formulas for two complex carbohydrate attachments that vary by degrees of core fucosylation, sialylation, and branching. (b) NANA = N-acetylneuraminic acid. (c) Expected masses calculated using average weights. (d) -, denotes glycopeptides not observed.
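The expected masses in Table II form a ladder: each additional antenna adds one HexNAc+Hex branch unit and each sialic acid adds one NANA residue, and the ion columns of Tables II and III follow from the expected mass as (M + z × 1.008)/z. The sketch below rebuilds the T45 rows under stated assumptions: the starting mass is the asialo diantennary T45 value from Table II, the increments are the one-decimal values of Table I (so agreement with the table is only to about 0.1 Da), and 1.008 Da is an assumed average proton mass. All names are illustrative.

```python
PROTON = 1.008   # assumed average mass of the charge-carrying proton (Da)
NANA = 291.3     # Table I residue mass of N-acetylneuraminic acid
BRANCH = 365.3   # Table I mass of the HexNAc+Hex branch unit

# Asialo diantennary T45 glycopeptide, expected mass from Table II.
T45_DIANTENNARY = 2898.90


def glycoform_mass(base_mass, extra_antennae, sialic_acids):
    """Mass of a glycoform built on `base_mass` by adding antennae and NANA."""
    return base_mass + extra_antennae * BRANCH + sialic_acids * NANA


def mz(mass, charge):
    """m/z of the (M + zH)z+ ion."""
    return (mass + charge * PROTON) / charge


def mass_from_mz(mz_value, charge):
    """Neutral mass recovered from one multiply charged ion."""
    return mz_value * charge - charge * PROTON


if __name__ == "__main__":
    ladder = ((0, "diantennary"), (1, "triantennary"), (2, "tetraantennary"))
    for antennae, label in ladder:
        # Di-, tri-, and tetraantennary forms carry up to 2, 3, and 4 NANA.
        for n_sialic in range(antennae + 3):
            m = glycoform_mass(T45_DIANTENNARY, antennae, n_sialic)
            print(f"T45 {label:15s} NANA={n_sialic}  M={m:8.1f}  +3 ion={mz(m, 3):7.1f}")
```

Averaging `mass_from_mz` over several charge states of the same species gives an observed mass of the kind reported in the tables.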

Trypsin Digestion. The method and materials for reduction, S-carboxymethylation, and trypsin digestion are as detailed previously.10,26,30

(29) Paoni, N. F.; Keyt, B. A.; Refino, C. J.; Chow, A. M.; Nguyen, H. V.; Berleau, L. T.; Badillo, J.; Pena, L. C.; Brady, K.; Wurm, F. M.; Ogez, J.; Bennett, W. F. A tissue plasminogen activator variant (T103N, KHRR 296-299 AAAA) has a reduced plasma clearance rate, a reduced capacity to cause systemic plasminogen, and is resistant to inactivation by PAI-1. Thromb. Haemostasis, in press.
(30) Crestfield, A. M.; Moore, S.; Stein, W. H. J. Biol. Chem. 1963, 238, 622.

Chromatography. Chromatography was performed using a 2.1 × 250 mm, C18, 5-µm (Vydac) reversed-phase column with a flow rate of 0.2 mL/min. The column effluent was split 1:15 with a Valco tee to give a flow rate of about 13 µL/min into the electrospray nebulizer. The A buffer was 0.05% TFA in water and the B buffer was 0.05% TFA in acetonitrile. The organic gradient was 0-25% B in 50 min and then to 60% B by 85 min. All separations were performed at 40 °C. A 100-µg sample was injected for every LC/MS analysis.
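The gradient and split described above reduce to simple arithmetic, sketched here in Python. Two points are assumptions rather than statements from the text: "split 1:15" is read as one-fifteenth of the column flow reaching the nebulizer (which reproduces the quoted "about 13 µL/min"), and the composition is held at 60% B after 85 min. The function names are illustrative.

```python
def percent_b(t_min):
    """Percent B at time t (min): 0-25% B over 0-50 min, then to 60% B by 85 min."""
    if t_min <= 0:
        return 0.0
    if t_min <= 50:
        return 25.0 * t_min / 50.0
    if t_min <= 85:
        return 25.0 + (60.0 - 25.0) * (t_min - 50.0) / (85.0 - 50.0)
    return 60.0  # assumption: hold at 60% B once the programmed gradient ends


def flow_to_source_ul_min(column_flow_ml_min=0.2, split_ratio=15.0):
    """Flow reaching the nebulizer, reading "split 1:15" as 1/15 of the column flow."""
    return column_flow_ml_min * 1000.0 / split_ratio


if __name__ == "__main__":
    print(f"{flow_to_source_ul_min():.1f} uL/min to the electrospray nebulizer")
    for t in (31, 48):  # approximate elution times of T45 and of T17/T11
        print(f"{t} min: {percent_b(t):.1f}% B")
```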


[Figure 1. Panels: Untreated Wild Type (upper); Treated Wild Type (lower). Elution regions marked: T45; T17, T11. x-axis: Time (min)/Scan.]

Figure 1. Comparison of the total ion current plots (LC/MS) of a trypsin digest of untreated wild type rt-PA (upper trace) and neuraminidase-treated, trypsin-digested rt-PA (bottom trace). The areas of elution of the glycopeptide families in both the treated and untreated runs are noted with horizontal bars and the tryptic number designation of the glycopeptides. The column is a 2.1 mm × 250 mm C18, 5-µm, 300-Å reversed-phase column. The gradient was 0-25% B in 50 min and then to 60% B by 85 min. All separations were performed at 40 °C. The A and B buffers are described in the Experimental Section.

Mass Spectrometry. The split flow from the reversed-phase separation was analyzed with the SCIEX API III triple-quadrupole mass spectrometer. Quadrupole one was scanned from 500 to 2200 Da with a scan duration of 4.28 s, using a step size of 0.5 Da and a 1.2-ms dwell time per step. Quadrupole three was not scanned. In the mutant analysis, quadrupole one was scanned from 300 to 2200 Da with a scan duration of 4.02 s, using a step size of 0.5 Da and a 1.0-ms dwell time per step.

Ion Extractions. From an averaged mass spectrum (taken from the region of a carbohydrate cluster observed in the contour plot of an LC/MS run), the width of the base of the ion to be extracted was measured. From this base width measurement a mass window was set accordingly. This window was typically 4 Da. The ions that are used in the extractions are listed in Tables II and III.
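As an illustration of the ion extraction step, the sketch below sums, scan by scan, the intensity that falls inside a mass window centered on a chosen ion. It is not the instrument software: the in-memory scan layout, the function name, and the example intensities are invented for illustration, and the 4-Da window is treated as a total width (±2 Da), which is an assumption.

```python
def extracted_ion_profile(scans, target_mz, window=4.0):
    """Sum intensity within target_mz +/- window/2 for every scan.

    `scans` is a list of (time_min, [(mz, intensity), ...]) tuples; this
    layout is assumed for illustration and is not the SCIEX file format.
    """
    half = window / 2.0
    profile = []
    for time_min, peaks in scans:
        total = sum(i for m, i in peaks if abs(m - target_mz) <= half)
        profile.append((time_min, total))
    return profile


if __name__ == "__main__":
    # Made-up scans containing two T45 +3 ions from Table II and an unrelated ion.
    scans = [
        (31.0, [(967.4, 2.0e4), (1064.3, 1.0e4), (1500.0, 9.0e5)]),
        (31.1, [(967.2, 5.0e4), (1064.5, 4.0e4), (1500.0, 8.0e5)]),
        (31.2, [(967.3, 1.0e4), (1064.4, 9.0e4), (1500.0, 7.0e5)]),
    ]
    for target in (967.3, 1064.4):
        print(target, extracted_ion_profile(scans, target))
```

Overlaying the profiles obtained for several target ions gives the kind of elution map used for the glycoform families, provided the targets are not isobaric within the window.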

RESULTS AND DISCUSSION

rt-PA Glycopeptides. Recombinant tissue plasminogen activator is a 527-residue glycoprotein that upon trypsin digestion will theoretically yield 51 peptides.10 Tryptic peptides are numbered consecutively from the N-terminus. Contained in these 51 fragments are three N-linked glycopeptides. Peptide T11, GTWSTAESGAECTNWNSSALAQKPYSGR, contains asparagine 117, which was shown to have attachments of high-mannose structures. Peptide T17, YSSEFCSTPACSEGNSDCYFGNGSAYR, contains asparagine 184, and peptide T45, CTSQHLLNR, contains asparagine 448. Both of these latter glycosylation sites have attachments of complex-type carbohydrates.31 Asparagine 184 is glycosylated in approximately 50% of the rt-PA molecules. The presence or absence of glycosylation at this site gives rise to the isozymes of rt-PA, type I and type II, respectively.32-34 Hence, peptide T17 exists in the tryptic map in both the glycosylated and unglycosylated forms. A fourth glycosylation site was recently discovered and was shown to be an O-linked fucose attached to threonine 61 in tryptic peptide T8. This site appears to exist only in the glycosylated form.35

General Approach. A general approach for glycoprotein profiling will be demonstrated using wild type and neuraminidase-treated rt-PA that has been trypsin digested and analyzed by LC/MS. The contour plot is used to locate the carbohydrates. The averaged mass scan of the region is used to establish the glycosylation pattern. Consensus amino acid sequences are used to identify potential peptides that could be glycosylated. N-Linked glycoforms can then be matched with the potential peptides, and the masses of these structures are then compared with observed masses. A glycosidase digestion (in this case neuraminidase digestion) is done to confirm the assignment or, if a triple-quadrupole instrument is available, MS/MS can be used.

LC/MS of the Peptide Map. Figure 1 shows the total ion current (TIC) for the LC/MS analysis of a sample of native sequence (wild type) rt-PA (upper trace) and neuraminidase-treated rt-PA (lower trace). A similar elution profile was observed with standard UV detection at 214 nm (data not presented). A key problem with the identification of glycopeptides in this LC/MS study is the low abundance of individual species, due in large part to sialic acid and branched chain heterogeneity. This microheterogeneity results in dispersion of peptides through a range of retention times and results in considerable mass degeneracy. For example, ion intensities of glycopeptides in a reversed-phase separation of rt-PA trypsin digest range from 20 000 to 200 000 maximum intensity counts, whereas a typical ion intensity for a nonglycosylated peptide is 1 × 10^6 counts. Neuraminidase treatment is found useful in reducing this microheterogeneity and also helps to boost ion intensity. The observed mass, which is an average value calculated from the masses of each multiply charged ion, is then calculated for each peak and allows the assignment of identity for each of the tryptic glycopeptide families shown in Figure 1 and described in Table II.

The characterization studies described here are carried out with a standard 2.1-mm-i.d. HPLC column, with a split to reduce the amount of solvent flow to the mass spectrometer (see Experimental Section). When the amount of protein is limited to below the nanomole level, capillary LC has proven to be quite effective in this laboratory. A typical analysis using a 0.32 mm × 150 mm C18 reversed-phase column requires as little as 25 pmol. The flow rate ranges between 3 and 4 µL/min, making the technique amenable to electrospray ionization mass spectrometry.

The search strategy initially involves location of the glycopeptides in a contour map. In the contour plot of m/z data vs retention time the carbohydrate patterns appear as characteristic streaks or clusters of ions (see Figure 2). Once these patterns are located, the presence of carbohydrates can be confirmed by examining averaged mass scans from these regions. In these scans, glycopeptides can be readily detected by the observation of species that differ by masses that are typical of monosaccharide subunits at a particular charged state (see Table I). For example, sialic acid heterogeneity at the +3 charged state in a glycopeptide will be revealed by ions that differ by 97 Da in the averaged mass spectrum (the 291.3-Da NANA residue divided by the charge). Using this method it is possible to observe sialic acid heterogeneity and branched chain differences without prior knowledge of the protein that is being analyzed. Preliminary assumptions can be made as to the nature of the glycosylation, but only if there is some mass heterogeneity present. The ion heterogeneity is necessary to reveal sialic acid, high-mannose, or branched chain differences, and without it a glycosylation pattern cannot be established.

The next step in the data analysis is to relate the mass of the ions to possible structures. In this study, we used the structures determined during the characterization of the carbohydrates of rt-PA.31 In general, a table of potential ions is constructed by adding the mass of a known carbohydrate structure to each peptide fragment that contains potential N-linked recognition sites. The peptide fragments can be predicted from the specificity of a given protease. The final step is the calculation of the masses of the expected multiply charged states for each peptide. In electrospray ionization mass spectrometry, the number of charged states is in part related to the number of basic sites in a given peptide. From a comparison of these calculated masses with the experimental results, a preliminary determination can be made of the structures of each of the major forms contributing to the glycopeptide streak. The results of this search with wild type and neuraminidase-treated rt-PA are given in Table II.

[Figure 2. Panels: Untreated (upper); Treated (lower). Circled regions: T45; T17, T11. Axes: m/z vs Scan/Time (min).]

Figure 2. Comparison of the contour plots of trypsin-digested wild type rt-PA (upper trace) and neuraminidase-treated, trypsin-digested rt-PA (bottom trace). These plots are generated from the LC/MS runs depicted in Figure 1. The glycopeptide "streaks" are circled. The T45 peptide at 31 min is the glycopeptide associated with a complex-type glycosylation pattern. Note that T17 (a complex-type glycosylated peptide around 48 min) coelutes with T11 (a high-mannose-type glycopeptide). With neuraminidase treatment the sialic acid heterogeneity is reduced, and likewise, the heterogeneity observed in the contour plot is somewhat reduced (bottom trace). The elution positions of the peptides are recorded as both time (minute) and scan number.

(31) Spellman, M. W.; Basa, L. J.; Leonard, C. K.; Chakel, J. A.; O'Connor, J. V.; Wilson, S. W.; van Halbeek, H. J. Biol. Chem. 1989, 264, 14100-14111.
(32) Pohl, G.; Källström, M.; Bergsdorf, N.; Wallén, P.; Jörnvall, H. Biochemistry 1984, 23, 3701-3707.
(33) Vehar, G. A.; Kohr, W. J.; Bennett, W. F.; Pennica, D.; Ward, C. A.; Harkins, R. N.; Collen, D. Bio/Technology 1984, 1051-1057.
(34) Vehar, G. A.; Spellman, M. W.; Keyt, B. A.; Ferguson, C. K.; Keck, R. G.; Chloupek, R. C.; Harris, R.; Bennett, W. F.; Builder, S. E.; Hancock, W. S. Cold Spring Harbor Symp. Quant. Biol. 1986, 51 (1), 551-562.
(35) Harris, R. J.; Leonard, C. K.; Guzzetta, A. W.; Spellman, M. W. Biochemistry 1991, 30, 2311-2314.

Selected Ion Extraction. The tryptic map can be searched by selected ion extraction using the mass of a specific charged state of an individual glycopeptide. The extraction allows plotting of the elution profile of this peptide, which is then overlaid with ion extracts that correspond to the masses of other related glycopeptides.
With this procedure a high level of deconvolution can be achieved for a complex chromatographic profile. An example of the use of selected ion extraction is shown in Figure 3 with the elution profile of the eight major glycoforms present in T45 (upper trace). This figure was constructed by an overlay of the results of eight ion extractions based on the nonisobaric masses of previously characterized oligosaccharides. The carbohydrates observed in this study are of the complex type with more heavily sialylated and least branched structures predominating, which is consistent with previous results for wild type rt-PA.26,31 An asialo diantennary form was also noted. Sialic acid loss during the low-pH reversed-phase separation or storage of the sample is always a possibility, but we have established that the amount of sialic acid loss during a 50-min LC/MS run is insignificant by correlating the abundance of the glycoforms observed with the reported values in the literature.31 In this study we minimized any sample degradation by storage of the sample at -70 °C immediately following trypsin digestion of the glycoprotein.

The tentative identification of the glycopeptides is then confirmed by digestion of the intact protein with a specific glycosidase followed by trypsin digestion and rechromatography. For example, Figure 3 (lower trace) shows an ion extraction of asialo T45 glycoforms in an LC/MS run of neuraminidase-treated, trypsin-digested rt-PA. On removal of the sialic acid residues, the carbohydrate pattern simplifies in the manner predicted in Table II, that is, from eight to three peaks. Another advantage of neuraminidase digestion is that the abundance of the carbohydrate ions of interest increases. In Figure 3, three different ion extracts are overlaid to give an elution profile for the asialo forms of the rt-PA glycopeptide T45. Thus, selected ion extraction greatly enhances the power of an HPLC separation with the ability ...

[Figures 3 and 4. Extracted ion profiles before and after neuraminidase treatment; peak labels include Di and Tri; x-axis: Time (min).]

... has a unique mass.

... of glycopeptide T11 is observed as the ... the seven-mannose form as the least ... result is consistent with previous ...

The power of selected ion extraction is ... even when mixtures of glycopeptides elute with nonglycosylated peptides. In such a case traditional on-line monitoring, for example by UV or integrated ion intensity, may result in glycopeptides being undetected. Figure 4 shows the region of the tryptic map that contains the closely eluting T17 and ... localize the site of ... extraction is then used to locate the elution pattern of complex glycosylation as observed for T45; however, ... are not observed ... site of attachment and eluted together with other T11 ...tides. The extracted ion plot is shown for both in Figure 4. Furthermore ... sialic acid also confirm the proposed structures in terms of the observed mass (see Table II). The presence (but not the site of attachment) of the hybrid structures in rt-PA was identified previously by Spellman et al. using HPAE.31

A fragment of the glycopeptide T11 (denoted T11C), formed by a secondary cleavage in the trypsin digest, is observed in the contour plot as a weak cluster (data not shown). After the scans in this region are averaged, it is observed that the masses are consistent with a fragment of peptide T11 without the three C-terminal residues, serine, glycine, and arginine (see Table II). The proposed structure was confirmed by amino acid analysis of a purified sample (data not shown). The observation of this peptide is not evidence for further heterogeneity of the rt-PA molecule, but rather illustrates a low level of nonspecificity with the protease, trypsin, used to prepare the map. The T11C glycopeptides elute with two nonglycosylated peptides, T3a (a fragment of T3) and T47, in less than a 1-min interval (see Figure 5). The detection of these glycopeptides is made possible by selected ion extraction. This example demonstrates the power of this technique over UV or TIC measurement. Although peptide T3a was previously inferred by Chloupek et al. by amino acid analysis of HPLC fractions,10 this plotting of an elution profile is the first definitive location of the peptide in a map. It is probable that the fragment is formed by a nonspecific cleavage of peptide T3 by trypsin.

[Figure 5. Upper trace: extracted ion profiles of T11C (5-, 6-, and 7-mannose forms), T47, and T3a. Lower trace: TIC. x-axis: Time (min)/Scan.]

Figure 5. Extracted ion profile of the chymotryptic-like clip of glycosylated peptide (T11C) showing a display of high-mannose-type glycoforms (upper trace). Included in the ion extraction are two coeluting nonglycosylated peptides, T47 and T3a (upper trace). The TIC plot (bottom trace) shows the region in which these peptides coelute. The elution positions of the peptides are recorded as both time (minute) and scan number. The mass assignments and carbohydrate nomenclature are presented in Table II.

Characterization of an rt-PA Variant Prepared by Site Directed Mutagenesis. The potential of this approach will now be illustrated with data from LC/MS of the glycopeptides present in a trypsin digest of rt-PA and a glycosylation variant of rt-PA. In this mutant the threonine at position 103 was replaced with asparagine to create a new potential glycosylation site; in addition, residues 296-299 (lysine, histidine, arginine, arginine; KHRR) are each replaced with alanine.
For simplicity this variant will be referred to as the TK variant in the subsequent text. The TK variant was prepared by site directed mutagenesis (ref 36). Adjacent to the T103N substitution, the residues at positions 104 and 105 are tryptophan and serine; the resulting sequence is of the consensus type Asn-X-Ser and therefore allows the possibility of an additional glycosylation site in the variant. This study deconvolutes complex elution profiles of the glycopeptides, identifies new glycopeptides present at low levels, and characterizes sites of altered glycosylation due to amino acid substitutions. In this characterization we first deal with the extra glycosylation site and subsequently discuss the tetraalanine substitution.

Contour Plot of the TK Variant MS Data. A portion of the contour plot of the mass analyzed chromatogram for the trypsin digest of rt-PA and of the TK variant is shown in Figure 6 (upper trace). The large number of signals in the contour plot can be related to the formation of several multiply charged ions for each component and to the mass heterogeneity of the glycopeptides. Since the higher mass glycopeptides are also more polar and elute earlier in a reversed-phase HPLC separation, these peptides can be observed as a cluster or streak as described previously. In a contour plot of data generated from native sequence rt-PA, the elution position of T45 is observed at 30.5-32.5 min. Peptides T11 and T17 coelute and give rise to a complex pattern between 46.5 and 48.5 min (Figure 6, lower trace). With the TK variant, peptide T11 shifts to an earlier elution position at

Figure 6. Comparison of the contour plot of a trypsin digest of the TK mutant (upper trace) and the contour plot of the control rt-PA (bottom trace). The glycopeptide "streaks" are circled. The streak corresponding to the T11 tryptic glycopeptides of the TK variant is noted as a cluster of ions eluting earlier than the T17 peptide (upper trace). In the control, the T11 and T17 glycopeptides coelute (bottom trace).

(36) Bennett, W. F.; Paoni, N. F.; Keyt, B. A.; Botstein, D.; Jones, A. J. S.; Presta, L.; Wurm, F. M.; Zoller, M. J. J. Biol. Chem. 1991, 266, 5191-5201.
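The consensus check described above (an asparagine followed by any residue and then serine) can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `find_sequons` is hypothetical, and accepting threonine in the third position while excluding proline in the middle position follows the general convention for N-linked sites rather than anything stated in this paper. Only residues 103-105 (Thr or Asn, Trp, Ser) are taken from the text.

```python
import re


def find_sequons(sequence, offset=1):
    """Return (position, motif) for every Asn-X-Ser/Thr motif, with X not Pro.

    `offset` is the residue number of the first character of `sequence`.
    """
    hits = []
    # A lookahead is used so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=(N[^P][ST]))", sequence):
        hits.append((match.start() + offset, match.group(1)))
    return hits


# Residues 103-105 of the control (Thr-Trp-Ser) and of the TK variant (Asn-Trp-Ser).
control_window = "TWS"
variant_window = "NWS"

print(find_sequons(control_window, offset=103))
print(find_sequons(variant_window, offset=103))
```

The control window yields no hit, whereas the variant window reports a potential site at position 103, which is the situation the T103N substitution was designed to create.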

... neuraminidase digestion ... 44.5-45.5 ...

[Figure caption fragment; panel label: Treated] ... (upper trace). The dashed black ... trace. The major species repre... the nonglycosylated peptides. The nongly... extracted and are only represented by a ri... The extracted ion profiles are not plotted to scale ... actually only account for a small rise in total ... between 49 and 50 min (see dashed line). The extract... are generated by extracting the quadruply c...

... carbohydrat... trace). Average... peptide. ... the treated vari...

... TK Variant T11 Glyco... analysis of the variant ...

... spectrum removes the potential mass ambiguity, and all ions shift as predicted by the initial conclusions. Table III lis... expected and observed masses. ... the pattern of glycosylation ... observed a change in the glycosylation ...

Similar biantennary structures have b... positions in wild type rt-PA (ref 31). The set of ... structures that differed by 73 and 36 ...

... and 117 are glycosylated in the variant, ... forms containing one or two fucose and ze... residues. A nonfucosylated complex ... in relatively large abundance comp... rt-PA. Because we cannot discrim... structures with this approach, the re... of this heterogeneity between the two sit... mass of a single sialic a... treated with neura...

... cells and oligomannose and com... when isolated from melanoma cells. A ...

...osaccharide structures ...

...84 and 448 were similar ... averaged mass scans s... assumption (data ...

ANALYTICAL CHEMISTRY, VOL. 65, NO. 21, NOVEMBER 1, 1993


Table IV. Tryptic Peptides Associated with the KHRR to AAAA Mutation in the LC/MS Analysis of the Control and Variant rt-PA

amino acid residue   tryptic peptide   expected mass   observed mass   retention time (min)
276-296              T26T27            2241.6
278-296              T27               2000.3          1999.8          67.6
297-298              T28               311.2           -               -
299-299              T29               174.0           -               -
300-304              T30               544.6           544.1           7.2
278-304              T27v              2683.0          2683.0          70.2
276-304              T26T27v           2924.3          2924.3          69.7

The lower case v denotes the variant tryptic peptide. Expected masses were calculated from average weights. Amino acid residues involved in the mutation are in bold italic lettering.
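The expected masses in Table IV are stated to be calculated from average weights. A minimal Python sketch of such a calculation follows, assuming standard average residue masses and the addition of one water per peptide. The residue mass table and the function name `average_mass` are not from the paper; the peptide sequences are those listed for Table IV.

```python
# Average residue masses in Da (standard values; an assumption, not taken from the paper).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added for the free N- and C-termini


def average_mass(sequence):
    """Average mass of an unmodified peptide given its one-letter sequence."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER


peptides = {
    "T27": "GGLFADIASHPWQAAIFAK",
    "T30": "SPGER",
    "T27v": "GGLFADIASHPWQAAIFAAAAASPGER",
    "T26T27v": "IKGGLFADIASHPWQAAIFAAAAASPGER",
}

for name, sequence in peptides.items():
    print(f"{name}: {average_mass(sequence):.1f}")
```

Under these assumptions the computed values for T27, T30, T27v, and T26T27v agree with the tabulated expected masses to within 0.1 Da.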

Figure 9. Complex-type glycoforms attached to Asn 448 (peptide T45) of the TK variant (top trace) and the control sample (bottom trace) extracted from the LC/MS runs. The peaks were integrated to give the graphs shown. The type of complex glycoform is noted on the X axis for each graph with relative abundance on the Y axis. The number of terminal sialic acid residues attached ... [Panels: TK Mutant (top), Wild Type (bottom); X axis categories: diantennary, triantennary, tetraantennary.]

Table IV (continued). Amino acid sequences of the tryptic peptides

tryptic peptide   amino acid sequence
T26T27            IKGGLFADIASHPWQAAIFAK
T27               GGLFADIASHPWQAAIFAK
T28               HR
T29               R
T30               SPGER
T27v              GGLFADIASHPWQAAIFAAAAASPGER
T26T27v           IKGGLFADIASHPWQAAIFAAAAASPGER

A dash (-) represents masses that were not observed.

Ion Extraction of the TK Variant T11 Glycopeptide. An ion extraction for each of the masses shown in Table III was performed, and an overlay of all of these extractions is shown in Figure 8 (top frame), together with the TIC for this region of the map. The TIC profile shows that the intensity observed for the glycopeptides (49-50 min) is much less than that observed for an adjacent group of nonglycosylated peptides (50-50.25 min). The overlay of extracted ions is not drawn to scale with the TIC but rather reflects the relative abundance of the different glycosylated species. There is a significant amount of fine structure in the overlay of the ion extracts, whereas the TIC appears as an uninformative broad hump. The observed elution order is consistent with the expected polarity of the oligosaccharides, as an increased degree of sialylation or fucosylation leads to earlier elution of the glycopeptide from the reversed-phase column. We used a neuraminidase digestion to verify the proposed structures on variant peptide T11, and the structural changes can be seen in the extracted ion profile (see Figure 8, bottom trace). The observed results were similar to those observed for T45, with a simplification of the pattern observed in ion extraction. Neuraminidase digestion was found to be the most valuable glycosidase digestion for this study since it greatly simplified the complex glycosylation patterns.

KHRR to AAAA Substitution. In the same analysis that allowed characterization of the extra glycosylation site contained in tryptic peptide T11, it is possible to identify and confirm the four other substituted residues in the variant form of rt-PA. Evidence for the alanine substitution of K296, H297, R298, and R299 is seen in the elution of a single tryptic peptide (278-304) at 70.2 min and the concomitant loss of the separate tryptic peptides T27 and T30, which typically eluted at 67.6 and 7.2 min, respectively (see Table IV; spectrum not shown). In wild type rt-PA, T28 and T29 are a dipeptide (HR [297-298]) and arginine (R299); hence they are unretained on a reversed-phase column. The mass analysis (see Table IV) confirmed the identity of the variant tryptic peptide T27 to T30 as the sequence 278-304 with tetraalanine substituted for 296-299. A smaller peak that eluted slightly earlier (69.7 min) was identified as T26 to T30 (noted as T26T27v in Table IV), which results from incomplete tryptic cleavage of the K277-G278 bond.

Quantitation of Glycopeptide Species. The use of electrospray mass spectrometry for the quantitation of peptide mixtures can be complicated. For example, variability of the predominant charged species can be observed for a given peptide. Ion suppression among coeluting, ionizing components still presents a real barrier to absolute quantitation. An advantage of selected ion extraction for both detection and quantitation of species is that it eliminates the interference from closely eluting sample components, as well as the variations in background, observed in traditional UV or TIC detection.
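Selected ion extraction, as used throughout this section, can be sketched as follows: the m/z expected for a given mass and charge state is computed, and the intensity falling within a narrow window around that m/z is summed scan by scan. This is a minimal illustration and not the instrument software: the function names, the 0.5 m/z tolerance, the choice of the triply charged ion, and the three synthetic scans are all assumptions. Only the mass of 2683.0 (peptide T27v) is taken from Table IV.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz_for_charge(mass, charge):
    """m/z of the [M + zH]z+ ion for a neutral mass and a positive charge."""
    return (mass + charge * PROTON) / charge


def extract_ion_profile(scans, target_mz, tolerance=0.5):
    """Sum the intensity within +/- tolerance of target_mz in every scan.

    `scans` is a list of (time_min, [(mz, intensity), ...]) tuples.
    Returns a list of (time_min, summed_intensity) pairs.
    """
    profile = []
    for time_min, peaks in scans:
        signal = sum(i for mz, i in peaks if abs(mz - target_mz) <= tolerance)
        profile.append((time_min, signal))
    return profile


# Synthetic scans (hypothetical data): a weak ion of interest near m/z 895.3
# under a much stronger, unrelated signal near m/z 500.
scans = [
    (70.0, [(895.3, 10.0), (500.0, 900.0)]),
    (70.2, [(895.4, 120.0), (500.1, 950.0)]),
    (70.4, [(895.2, 15.0), (500.0, 920.0)]),
]

target = mz_for_charge(2683.0, 3)
for time_min, signal in extract_ion_profile(scans, target):
    print(f"{time_min:.1f} min: {signal:.1f}")
```

In this toy data set the total ion current changes by only a few percent across the three scans, while the extracted profile shows a clear maximum at 70.2 min, which is the effect the text attributes to extraction removing interference from closely eluting components.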
In an examination of 12 different rt-PA digests analyzed on two different Sciex instruments over a period of 18 months, a high degree of consistency for the carbohydrate structures was observed (unpublished data). The low degree of variability may be related to the structural similarities of the glycopeptides and the high degree of resolution that was achieved in the map. Figure 9 shows the distribution of glycoforms at the third glycosylation site, Asn 448, in a control sample and in the variant. The TK variant has more asialo and monosialylated forms than the reference material. At this stage, it is not clear whether the difference is due to the mutation at Asn 103, the tetraalanine substitution (296-299), or differences in cell culture and purification protocols. These are issues that will be addressed in future studies.
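The glycoform distributions of Figure 9 are described as obtained by integrating the extracted ion peaks. One way to turn extracted ion profiles into relative abundances is sketched below, assuming trapezoidal integration followed by normalization to 100%. The function names and the example areas are hypothetical and are not data from the paper. As noted above, charge-state variability and ion suppression mean that such values are relative rather than absolute.

```python
def trapezoid_area(profile):
    """Integrate a list of (time_min, intensity) points by the trapezoidal rule."""
    area = 0.0
    for (t0, y0), (t1, y1) in zip(profile, profile[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


def relative_abundance(areas):
    """Convert a {glycoform: peak area} mapping into percentages of the total."""
    total = sum(areas.values())
    return {name: 100.0 * area / total for name, area in areas.items()}


# Hypothetical extracted ion profile for one glycoform (time in min, intensity).
example_profile = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0)]
print(trapezoid_area(example_profile))

# Hypothetical integrated areas for three glycoform classes.
example_areas = {"diantennary": 1.0, "triantennary": 2.0, "tetraantennary": 1.0}
print(relative_abundance(example_areas))
```

Normalizing within a single run, as done here, is what allows a comparison of the variant and control distributions without requiring absolute response factors.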

CONCLUSIONS With the advent of site directed mutagenesis there is an expanding need for a method, such as the one described here, that allows the rapid screening of glycoprotein structures, particularly when only microgram amounts of material are available. An attractive feature of the approach is that a biochemist can gain a great amount of structural information using only a single quadrupole mass spectrometer, thus making the technique less expensive. In addition, it is expected that future developments in the power of the search algorithms will allow for more rapid analysis of the LC/MS data. With such automation, the use of electrospray mass spectrometry on line with HPLC peptide and carbohydrate mapping should prove an invaluable tool to the protein chemist. These results can then be complemented by structural studies of the carbohydrate, which would allow characterization of isomeric structures that cannot be directly determined by this approach.

ACKNOWLEDGMENT The authors acknowledge Michael W. Spellman for helpful discussions in the area of carbohydrate chemistry.

RECEIVED for review May 10, 1993. Accepted July 22, 1993.

Abstract published in Advance ACS Abstracts, September 1, 1993.