Anal. Chem. 1990, 62, 1714-1722
PERSPECTIVE: ANALYTICAL BIOTECHNOLOGY
Carbohydrate Characterization of Recombinant Glycoproteins of Pharmaceutical Interest
Michael W. Spellman
Department of Medicinal and Analytical Chemistry, Genentech, Inc., South San Francisco, California 94080
Carbohydrate characterization of recombinant glycoproteins entails determination of the primary structures and points of attachment of the oligosaccharide moieties. This article reviews several methods for oligosaccharide- and glycosylation-site characterization. A major recent advance in carbohydrate analysis has been the use of high-pH anion exchange (HPAE) chromatography for separation of glycoprotein-derived oligosaccharides. These separations are sensitive to molecular size, carbohydrate composition, linkage positions, and anomeric configurations. As a result, HPAE chromatography is a powerful technique for glycoprotein characterization and can serve as the basis of an “oligosaccharide map”. Characterization of potential N-glycosylation sites involves determining whether each potential site is glycosylated, the extent of oligosaccharide processing at each site, and, ideally, a detailed description of the distribution of oligosaccharides at each site. Several approaches to characterizing glycosylation sites are described, including peptide mapping and mass spectrometry. Treatment of a glycoprotein with endo-β-N-acetylglucosaminidase H (endo H) followed by peptide:N-glycosidase F (PNGase F) can be used to distinguish sites that are not glycosylated from those carrying high-mannose structures and from those containing attached complex oligosaccharides. After individual glycosylation sites have been labeled by this series of reactions, the resulting peptides are characterized by automated Edman degradation. This technique is particularly valuable for characterizing peptides that contain more than one potential N-glycosylation site. An example is also given in which HPAE chromatography is used in conjunction with reversed-phase high-performance liquid chromatography tryptic mapping to obtain detailed information on the distribution of oligosaccharides at individual glycosylation sites.
INTRODUCTION
Some of the most significant advances in recombinant DNA technology in recent years have been in the development and application of large-scale mammalian cell culture for protein expression (1). Among other things, the application of mammalian cell culture has made it feasible to produce recombinant glycoproteins in quantities sufficient for pharmaceutical use. Mammalian-cell-expressed recombinant glycoproteins that are approved or under development as pharmaceutical agents include tissue plasminogen activator (2, 3), erythropoietin (4, 5), soluble CD4 (6), factor VIII (7), and β-interferon (8, 9). Of these, recombinant tissue plasminogen activator (rt-PA) and recombinant erythropoietin have been approved by the U.S. Food and Drug Administration for therapeutic use in humans, while most of the others are in clinical trials.

The carbohydrate moieties of recombinant glycoproteins are of interest because of their possible functional roles, which could include direct or indirect influence on the activity of a glycoprotein, possible roles in clearance from circulation, targeting to a particular tissue, and influences on the solubility and stability of the protein (10). For example, the oligosaccharides of rt-PA have been reported to influence both the enzymatic activity (11-13) and the clearance (14-16) of the molecule. The carbohydrate structures of erythropoietin have long been reported to be necessary for activity (17-19). More recently it has been demonstrated that glycosylation of specific sites is necessary for secretion and function of erythropoietin (20).
The expression of a glycoprotein in a heterologous cell line could have adverse consequences if it resulted in the production of recombinant glycoproteins carrying carbohydrate heterophile antigenic determinants (21). The possibility of “non-self” carbohydrate structures is, of course, much greater when expression is carried out in non-mammalian cells, but some mammalian cell lines are also known to produce antigenic carbohydrate structures. In fact, the antigenic Galα(1→3)Gal moiety has been reported to be synthesized even by cultured human (adenocarcinoma) cells (22). Thus, it is desirable to demonstrate that a given glycoprotein does not contain carbohydrate heterophile antigenic structures.

Characterization of the carbohydrate side chains of recombinant glycoproteins presents a significant analytical challenge for several reasons. The cDNA sequence reveals little information about glycosylation and predicts only the positions of potential sites for N-glycosylation (Asn-Xaa-Ser/Thr) (23). At this time it is not possible to predict from the cDNA sequence whether a potential N-glycosylation site is actually glycosylated, nor is it possible to predict whether a recombinant glycoprotein is glycosylated at Ser or Thr residues (O-glycosylation). This information can only be obtained experimentally. The carbohydrate structures of any glycoprotein are determined in part by the glycosylation apparatus of the host cell and in part by the tertiary structure of the particular glycoprotein being produced. Thus, heterologous expression of a particular glycoprotein would be predicted to yield different populations of oligosaccharides in each of the different expression host cells (10). In addition, the processing of glycoprotein carbohydrates results in heterogeneous populations of structurally related oligosaccharides (microheterogeneity), and different glycosylation sites within a glycoprotein can have different populations of attached carbohydrate structures (site heterogeneity). All of these factors combine to make a complete carbohydrate structure elucidation of any glycoprotein a major analytical challenge.

The goal of this article is to survey some of the techniques and strategies available for glycoprotein characterization, with an emphasis on those techniques that we at Genentech have found to be particularly useful in the analysis of recombinant glycoproteins. The scope of the article will be confined to recombinant glycoproteins produced by mammalian cell expression and will not include those produced in fungal cells or in cultured insect cells.

ANALYTICAL CHEMISTRY, VOL. 62, NO. 17, SEPTEMBER 1, 1990

Figure 1. Representative high-mannose (A), complex (B), and hybrid (C) N-linked structures. Branches are shown in square brackets. The abbreviations used are as follows: Man, mannose; GlcNAc, N-acetylglucosamine; Gal, galactose; NeuAc, N-acetylneuraminic acid (sialic acid); Fuc, fucose (6-deoxy-L-galactose).
A: Manα(1→6)[Manα(1→3)]Manα(1→6)[Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)GlcNAc
B: NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)[NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]GlcNAc
C: Manα(1→6)[Manα(1→3)]Manα(1→6)[NeuAcα(2→3)Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)GlcNAc
RECOMBINANT PERSPECTIVE ON GLYCOPROTEIN CHARACTERIZATION
Any of the techniques commonly used for glycoprotein characterization can be used to characterize recombinant glycoproteins. These include serial lectin affinity chromatography (24-26), exoglycosidase digestion (27), mass spectrometry (28, 29), nuclear magnetic resonance (NMR) spectroscopy (30, 31), and several other chemical (32) and chromatographic (33, 34) methods. There is, however, a particular perspective to the characterization of recombinant glycoproteins, which arises from their source and intended use.

Recombinant glycoproteins intended for pharmaceutical use are generally available in large quantities. The availability of large amounts of material favors the use of a technique such as NMR, which gives a great deal of structural information but requires micromolar concentrations of oligosaccharides, over a technique such as lectin affinity chromatography, which is more sensitive but gives equivocal structural information. Another nearly universal situation among recombinant glycoproteins is that a high-resolution reversed-phase high-performance liquid chromatographic (RP-HPLC) peptide map is usually characterized early in the development of the manufacturing process of the glycoprotein (35). Techniques based on RP-HPLC peptide separations can often be adapted to characterize individual glycosylation sites. Finally, in the characterization of a glycoprotein intended for pharmaceutical use, it is desirable to have means of assessing the lot-to-lot consistency of production (36). To be suitable for such an application, a technique must be reproducible, have sufficient resolution to distinguish between closely related structures, and not be excessively labor-intensive. Chromatographic oligosaccharide or glycopeptide “mapping” techniques would be most suitable for this purpose, in a manner analogous to the use of RP-HPLC peptide mapping to monitor lot-to-lot consistency of the polypeptide.
DETERMINATION OF OLIGOSACCHARIDE STRUCTURE
Asparagine-linked oligosaccharides can be divided into three major categories: high-mannose, complex (N-acetyllactosamine type), and hybrid structures (Figure 1). All three classes share a common “core” structure consisting of the innermost three mannose residues and two GlcNAc residues (23). High-mannose structures contain only mannose residues in the outer chains, while the outer chains of complex oligosaccharides contain GlcNAc, galactose, and sialic acid (most often N-acetylneuraminic acid). As the name suggests, hybrid oligosaccharides have structural features of both high-mannose and complex oligosaccharides. The structures given in Figure 1 are merely representative of the three categories of N-linked oligosaccharides; each category comprises a diverse array of structures that can differ, e.g., in numbers of glycosyl residues, substitution positions, or branching patterns. The structural diversity of N-linked oligosaccharides has been the subject of several excellent reviews (23, 37).

The following elements define the complete primary structure of an oligosaccharide: the glycosyl residue composition; the positions of glycosidic bonds; the anomeric configurations (α or β), absolute configurations (D or L), and ring forms (furanose or pyranose) of the constituent sugars; the sequence of glycosyl residues; and the attachment positions of any non-carbohydrate substituents (e.g. phosphate or sulfate). When a glycoprotein is expressed in a mammalian cell line, it is reasonable to assume that all residues are in the pyranose (six-membered) ring form and that all residues except fucose have the D absolute configuration. The remaining structural features must be determined by physical, chemical, or enzymatic methods. With the possible exception of NMR, no single analytical technique is capable of the complete elucidation of an oligosaccharide structure.
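The three categories described above can be illustrated with a small sketch. It is only a coarse rule of thumb, assuming the caller already knows which residues lie outside the common Man3GlcNAc2 core; the function name and the residue labels are illustrative choices, not part of any published method, and a real assignment requires the linkage and branching data discussed in this section.

```python
from typing import Iterable

# Residue types that, per the text, occur in the outer chains of
# complex-type oligosaccharides.
COMPLEX_TYPE_RESIDUES = {"GlcNAc", "Gal", "NeuAc"}


def classify_n_glycan(outer_chain_residues: Iterable[str]) -> str:
    """Coarsely classify an N-linked oligosaccharide.

    `outer_chain_residues` lists the residues found outside the common
    core (three Man and two GlcNAc). Core fucose is ignored because it
    is attached to the core, not to an outer chain.
    """
    residues = set(outer_chain_residues) - {"Fuc"}
    has_mannose = "Man" in residues
    has_complex_type = bool(residues & COMPLEX_TYPE_RESIDUES)
    if has_mannose and has_complex_type:
        return "hybrid"
    if has_complex_type:
        return "complex"
    if has_mannose:
        return "high-mannose"
    return "core only"


# Outer chains of the three representative structures in Figure 1.
figure_1a = ["Man", "Man"]
figure_1b = ["NeuAc", "Gal", "GlcNAc", "NeuAc", "Gal", "GlcNAc"]
figure_1c = ["Man", "Man", "NeuAc", "Gal", "GlcNAc"]

for name, outer in [("A", figure_1a), ("B", figure_1b), ("C", figure_1c)]:
    print(name, classify_n_glycan(outer))
```

The rule works for the representative structures of Figure 1 but says nothing about linkage positions or anomeric configurations, which is why the experimental methods that follow are needed.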
Because no single technique is sufficient on its own, carbohydrate structure elucidation is usually performed by using several complementary techniques. Two of the most powerful techniques for carbohydrate structure elucidation are mass spectrometry (MS) and NMR spectroscopy. Both techniques have been the subject of extensive, recent reviews (28-31). In some ways these techniques are complementary, with different sets of advantages and disadvantages. The main advantage of MS, particularly fast-atom bombardment (FAB-MS), is sensitivity. In addition, FAB-MS can provide an unequivocal molecular weight of an oligosaccharide and, under favorable circumstances, sequence information and information about non-carbohydrate substituents (28, 29). The main disadvantage of FAB-MS for carbohydrate structure elucidation is that it usually gives no information about linkage positions and anomeric configurations.

NMR is the single most powerful technique for carbohydrate analysis (30, 31) and can, by itself, allow complete structure elucidation. The most widely used approach to NMR of glycoprotein-derived carbohydrate structures is the "structural-reporter group" concept introduced by Vliegenthart and co-workers (30). In this approach, interpretation of the ¹H NMR spectrum of an oligosaccharide or glycopeptide is based on the chemical shifts and coupling constants of protons that resonate at clearly distinguishable regions of the spectrum. The structural-reporter groups include the anomeric protons, H-2 and H-3 of mannose, H-3 of sialic acid, H-5 and the methyl protons of fucose, H-3 and H-4 of galactose, and the methyl protons of N-acetyl groups (30). The main disadvantage of NMR as a method for carbohydrate structure determination is its inherent lack of sensitivity. As was mentioned above, the quantities of recombinant glycoproteins available are often sufficient to overcome this limitation.

One method for carbohydrate structure elucidation that has been used successfully with several recombinant glycoproteins (9, 13, 22, 38-41) is the use of sequential exoglycosidase digestion in conjunction with gel-permeation chromatography, which was introduced by Kobata and co-workers (27). This is a very powerful method for glycoprotein characterization in which the susceptibility of an oligosaccharide to purified exoglycosidases is inferred from changes in its elution position in gel-permeation chromatography. Because each exoglycosidase is specific for anomeric configuration as well as identity of the terminal sugar residue(s), the technique is useful for determining both sequence and anomeric configurations of an oligosaccharide.
This approach is usually supplemented by methylation analysis (37) to determine the linkage positions of the glycosyl residues. The main disadvantage of this method for use with recombinant pharmaceuticals is that it is quite labor-intensive and, as a result, better suited to a one-time structure elucidation than to routine monitoring purposes (e.g. to assess lot-to-lot consistency of glycosylation).

The technique of high-pH anion exchange (HPAE) chromatography has been demonstrated recently to be capable of separating closely related oligosaccharides (42, 43). These separations take advantage of the fact that the hydroxyl groups of carbohydrates are weakly acidic (44) and, at pH values >12, the resulting oxyanion derivatives can be separated by anion exchange chromatography. Hardy and Townsend (42) first demonstrated that HPAE separations are sensitive to molecular size, sugar composition, and linkage of the monosaccharide units. When used in conjunction with pulsed amperometric detection (PAD), HPAE chromatography permits carbohydrate analyses to be carried out at the picomole level. Figure 2 shows an HPAE separation of several desialylated complex-type oligosaccharides, ranging from biantennary to tetraantennary (45). This chromatogram illustrates several characteristics of HPAE: in general, smaller oligosaccharides elute before larger oligosaccharides; the addition of the 6-deoxy sugar fucose to an oligosaccharide results in earlier elution (compare the first with the second, and the fifth with the sixth, labeled peaks in Figure 2); the technique gives near-base-line resolution between isomeric triantennary oligosaccharides (the third labeled peak is a 2,4-branched triantennary structure, while the fourth is a 2,6-branched triantennary structure).

Figure 2. Separation of desialylated complex-type oligosaccharides by HPAE chromatography. Shorthand structural representations in the figure use separate symbols for GlcNAc, mannose, galactose, and fucose. See Table I for detailed structures. The chromatographic system was a Dionex BioLC equipped with an AS6 column and PAD. Separations were carried out at a flow rate of 1 mL/min at a constant base concentration of 0.1 M NaOH. Oligosaccharides were eluted with a linear gradient from 0 to 0.1 M sodium acetate in 40 min. All other conditions were as described (45). Horizontal axis: time (min).

HPAE chromatography was used in conjunction with ¹H NMR and FAB-MS in the elucidation of the carbohydrate
structures of CHO-cell-expressed rt-PA (3). In this study, the complex-type oligosaccharides were first subjected to neutral-pH anion exchange chromatography (Mono Q) to fractionate on the basis of sialic acid content (Figure 3). The mono-, di-, tri-, and tetrasialyl subfractions were then subjected to ¹H NMR at 500 MHz and, from this, the structures of the component oligosaccharides were determined (see ref 3 for the details of the ¹H NMR spectral interpretation). Samples of the unfractionated pool of complex oligosaccharides and the mono-, di-, tri-, and tetrasialyl subfractions were then desialylated and analyzed by HPAE chromatography (Figure 4). The five major oligosaccharide components (labeled "1-5" in Figure 4) were identified through a combination of ¹H NMR of the parent, sialylated compounds, chromatographic coelution with authentic standards, and FAB-MS after permethylation of collected peaks from the HPAE chromatograms. The identities of the peaks in Figure 4 are shown in Table I. The use of ¹H NMR in conjunction with both neutral- and high-pH anion exchange chromatography made it possible to characterize the complex oligosaccharides of rt-PA completely without losing information on sialylation heterogeneity (3). For example, the ratio of the areas of peak 1 in traces b and c gives the relative amounts of monosialyl- and disialyl-fucosylated diantennary complex oligosaccharides in rt-PA. Similarly, peak 4 in traces b, c, and d indicates the occurrence and relative amounts of monosialyl-, disialyl-, and trisialyl-fucosylated 2,6-branched triantennary structures, respectively. The high-resolution oligosaccharide separations afforded by HPAE chromatography proved to be a useful complement to NMR and FAB-MS in these studies, particularly for identifying minor components that would not have been identified by ¹H NMR of mixtures. As will be described below, the sensitivity of the HPAE separations also made it feasible to use this technique to characterize the distribution of oligosaccharides at individual N-glycosylation sites.

Figure 3. Separation of rt-PA complex-type oligosaccharides by neutral-pH anion exchange chromatography. The system was that described in Figure 2, but equipped with a Mono Q column (HR 5/5; Pharmacia). The column was preequilibrated in water at a flow rate of 1 mL/min, and oligosaccharides were eluted with a linear gradient from 0 to 0.45 M sodium acetate over 20 min. The elution positions of mono- through tetrasialyl oligosaccharides are indicated (C-Q1 to C-Q4). Horizontal axis: time (min). Reprinted with permission from ref 3. Copyright 1989 by the American Society for Biochemistry and Molecular Biology.

Figure 4. Separation of desialylated rt-PA oligosaccharides by HPAE chromatography. Trace a is the desialylated pool of complex-type oligosaccharides. Traces b-e are desialylated Mono Q fractions C-Q1 to C-Q4, respectively. The chromatographic conditions were as described in Figure 2. Peak identifications are given in Table I. Horizontal axis: time (min). Reprinted with permission from ref 3. Copyright 1989 by the American Society for Biochemistry and Molecular Biology.

The speed and selectivity of HPAE chromatography make it a promising technique for profiling the oligosaccharides of glycoproteins. Figure 5 shows the HPAE profiles of the N-linked oligosaccharides released by peptide:N-glycosidase F (PNGase F) treatment of four glycoproteins (45). These separations are dominated by the number of residues of sialic acid on each oligosaccharide, and the chromatograms contain regions where mono-, di-, tri-, and tetrasialyl oligosaccharides elute. The chromatograms obtained with the four glycoproteins differ markedly from each other and, in each case, agree well with what has been published previously on the carbohydrate structures of the glycoprotein. Recombinant t-PA (Figure 5A) is known to contain attached high-mannose, hybrid, and mono-, di-, tri-, and tetrasialyl complex oligosaccharides (3). Ribonuclease B (Figure 5B) carries exclusively neutral (high-mannose) oligosaccharides (46). The predominant oligosaccharide structure of human transferrin (Figure 5C) is a non-fucosylated disialyl diantennary oligosaccharide (47). The carbohydrate structures of bovine fetuin (Figure 5D) are extremely heterogeneous, differing in the extent of sialylation, the number of peripheral branches, and the linkage (β1,4 vs β1,3) of galactose residues (48). The use of an oligosaccharide fingerprinting technique could have many applications in biotechnology, such as distinguishing different glycoproteins or cell lines and assessing consistency of production of a glycoprotein, or as a general characterization tool.

Figure 5. HPAE profiling of N-linked oligosaccharides of four glycoproteins: A, rt-PA; B, ribonuclease B; C, human transferrin; D, bovine fetuin. The reduced and S-carboxymethylated glycoproteins were treated with PNGase F, and the oligosaccharides released were recovered by ethanol fractionation. Chromatographic conditions were as described in Figure 2 except that elution was carried out with a two-stage linear gradient from 0 to 0.075 M sodium acetate in 5 min and from 0.075 to 0.2 M sodium acetate in 25 min. The labels at the top indicate the elution positions of neutral (N), monosialyl (1), disialyl (2), trisialyl (3), and tetrasialyl (4) oligosaccharides. Reprinted with permission from ref 45. Copyright 1990 by Elsevier.
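The peak-area comparison described for Figure 4 (the same desialylated peak measured in traces from different Mono Q fractions) amounts to a simple normalization. The sketch below uses made-up areas purely for illustration; they are not values from ref 3. It also assumes that each fraction was injected in proportion to its recovery and that the detector response is the same for the same desialylated structure in every trace.

```python
from typing import Dict


def relative_amounts(peak_areas: Dict[str, float]) -> Dict[str, float]:
    """Normalize the areas of one peak across several traces to fractions."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return {label: area / total for label, area in peak_areas.items()}


# Hypothetical integrated areas of peak 1 (fucosylated diantennary) in the
# desialylated monosialyl (trace b) and disialyl (trace c) fractions.
peak_1_areas = {"monosialyl (trace b)": 30.0, "disialyl (trace c)": 90.0}

for label, fraction in relative_amounts(peak_1_areas).items():
    print(f"{label}: {fraction:.2f}")
```

Comparing the same peak across traces, rather than different peaks within one trace, avoids having to know how the detector response varies from one oligosaccharide structure to another.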
Table I. Identities of the Major Oligosaccharide Peaks in Figure 4 (branches are shown in square brackets)
peak no.  structure
1  Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)[Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]GlcNAc
2  Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)[Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)GlcNAc
3  Galβ(1→4)GlcNAcβ(1→2)Manα(1→6)[Galβ(1→4)GlcNAcβ(1→4)[Galβ(1→4)GlcNAcβ(1→2)]Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]GlcNAc
4  Galβ(1→4)GlcNAcβ(1→6)[Galβ(1→4)GlcNAcβ(1→2)]Manα(1→6)[Galβ(1→4)GlcNAcβ(1→2)Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]GlcNAc
5  Galβ(1→4)GlcNAcβ(1→6)[Galβ(1→4)GlcNAcβ(1→2)]Manα(1→6)[Galβ(1→4)GlcNAcβ(1→4)[Galβ(1→4)GlcNAcβ(1→2)]Manα(1→3)]Manβ(1→4)GlcNAcβ(1→4)[Fucα(1→6)]GlcNAc

CHARACTERIZATION OF POTENTIAL N-GLYCOSYLATION SITES
Potential sites for glycosylation of Asn residues (N-glycosylation) are recognized by the three-amino-acid sequence Asn-Xaa-Ser/Thr, where “Xaa” is any amino acid (23). It should be noted, however, that this consensus sequence does not guarantee glycosylation at a particular site. In fact, only about one-third of potential sites in secreted glycoproteins are actually glycosylated (49). There are also examples of glycoproteins that exist as variants that differ by the presence or absence of carbohydrate at individual glycosylation sites. These include plasminogen (50), tissue plasminogen activator (51), and recombinant tissue factor (52). Thus, for any glycoprotein, it is necessary to determine experimentally if and to what extent the individual sites are glycosylated.

The characterization of individual glycosylation sites can be carried out to several different levels of detail: (1) Is a particular glycosylation site utilized? (2) Does the site carry predominantly less-fully processed (i.e. high-mannose) or more fully processed (i.e. complex) oligosaccharides? (3) What is the detailed distribution of oligosaccharides at a particular site?
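The consensus sequence Asn-Xaa-Ser/Thr given at the start of this section can be located in a protein sequence with a short scan. The sketch below is a minimal illustration: the function name, the one-letter sequence, and the numbering are hypothetical. Sites with proline at Xaa are reported but flagged, since such a site may match the pattern without being glycosylated. A match identifies only a potential site; whether it is used must still be determined experimentally.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
SEQUON = re.compile(r"(?=(N[A-Z][ST]))")


def find_sequons(sequence: str, first_residue_number: int = 1) -> List[Tuple[int, str, bool]]:
    """Return (Asn residue number, tripeptide, xaa_is_proline) for each match."""
    hits = []
    for match in SEQUON.finditer(sequence.upper()):
        tripeptide = match.group(1)
        asn_number = first_residue_number + match.start()
        hits.append((asn_number, tripeptide, tripeptide[1] == "P"))
    return hits


# Hypothetical sequence, used only to exercise the function.
example = "MKNGSAANPTLLNNTS"
for asn_number, tripeptide, xaa_is_proline in find_sequons(example):
    note = " (Pro at Xaa)" if xaa_is_proline else ""
    print(f"Asn-{asn_number}: {tripeptide}{note}")
```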
Information about the utilization and degree of carbohydrate processing at individual glycosylation sites (items 1 and 2 above) is less likely to vary from one cell type to another than more subtle carbohydrate structural details (item 3). For example, the carbohydrate structures of human t-PA have been characterized for the same gene product expressed in Chinese hamster ovary (3, 41), Bowes melanoma (40, 53), human colon fibroblast (40), and murine C127 (41) cells. As would be expected, the four cell types were found to produce populations of oligosaccharides that differed significantly from one another. It should also be noted that the glycosylation patterns of CHO-expressed rt-PA from two sources (3, 41) were found to differ in the presence of hybrid type (3), tetraantennary complex type (3), and a galactosyl complex type (41) oligosaccharides. These significant differences in glycosylation by the same host-cell type may reflect clonal variation (54) or differences in cell-culture conditions (55) or purification processes. It is striking that the pattern of site utilization and degree of processing at the individual glycosylation sites was quite similar between all cell types examined in these studies: position 218 was not glycosylated due to the proline residue within the consensus glycosylation sequence; position 184 was glycosylated approximately 50% of the time, giving rise to variants glycosylated at three sites (type I) or at only two sites (type II); and position 117 carried exclusively less-fully processed (i.e. high-mannose) oligosaccharides. The latter observation suggests that Asn-117 is relatively inaccessible to the glycosyl processing enzymes. If so, it is in agreement with the earlier studies by Hsieh et al. (56), who demonstrated that the high-mannose oligosaccharides of Sindbis virus glycoproteins are located preferentially at a less surface-accessible glycosylation site. It seems likely, therefore, that the utilization of particular glycosylation sites and, in a coarse sense, the degree of processing at individual sites are governed largely by the amino acid sequence and tertiary structure of the particular protein, while the more subtle details of outer-chain processing reflect the glycosylation apparatus of the host cell.

Figure 6. Reversed-phase HPLC tryptic maps of glycosidase-treated rt-PA: A, control; B, PNGase F-treated; C, endo H-treated. Chromatographic conditions were as described in ref 3. Glycopeptide peaks are shaded and labeled by residue number of the attachment Asn. Horizontal axis: time (min).

Depending on the degree of detail needed, several different techniques can be used to characterize glycosylation sites. A peptide mapping format is often the most straightforward approach to determining whether a particular site is glycosylated. This information can be obtained by direct amino acid sugar determinations of collected peptides on an amino acid analyzer (57) or by the use of endoglycosidase digestions in conjunction with peptide mapping. Figure 6 shows the effects of enzymatic deglycosylation on the tryptic map of rt-PA. In the control tryptic map (trace A) the glycopeptides elute as clusters of peaks. Treatment with the enzyme PNGase F, which cleaves the β-aspartylglucosaminyl bond of all known types of N-linked oligosaccharides (58), causes each of the clusters of peaks to coalesce into a single peak at later retention time (trace B).
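Because the peptide-mapping approach assigns glycosylation to tryptic peptides, it can help to see in advance which peptides carry potential sites and how many. The following sketch performs an in-silico tryptic digestion using the commonly applied rule (cleavage C-terminal to Lys or Arg, but not before Pro) and lists the potential Asn-Xaa-Ser/Thr sites in each peptide, excluding Pro at Xaa. The sequence and function names are hypothetical, and real digests also show missed and nonspecific cleavages that this sketch ignores.

```python
import re
from typing import List, Tuple


def tryptic_peptides(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, peptide) for an idealized tryptic digest."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        # Cleave after K or R unless the next residue is P.
        if at_end or (residue in "KR" and sequence[i + 1] != "P"):
            peptides.append((start + 1, sequence[start:i + 1]))
            start = i + 1
    return peptides


def potential_sites_per_peptide(sequence: str) -> List[Tuple[int, str, List[int]]]:
    """Assign each potential N-glycosylation site to its tryptic peptide."""
    sequence = sequence.upper()
    # Sites are located on the intact sequence so that a sequon spanning a
    # cleavage point is still attributed to the peptide holding its Asn.
    sites = [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", sequence)]
    result = []
    for start, peptide in tryptic_peptides(sequence):
        end = start + len(peptide) - 1
        result.append((start, peptide, [s for s in sites if start <= s <= end]))
    return result


# Hypothetical sequence, used only to exercise the functions.
for start, peptide, sites in potential_sites_per_peptide("GANGTKNLSANVTRPLK"):
    print(start, peptide, sites)
```

A peptide that lists more than one site cannot be interpreted from a retention-time shift alone, since the shift does not say which site carried the carbohydrate.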
Glycopeptides that carry predominantly high-mannose and/or hybrid structures can be distinguished from those carrying complex structures by digestion with a different enzyme, endo-β-N-acetylglucosaminidase H (endo H), which hydrolyzes the linkage between the two N-acetylglucosamine residues within the core of high-mannose or hybrid oligosaccharides (59). The tryptic map of endo H treated rt-PA (Figure 6, trace C) reveals that the peptide containing Asn-117 is shifted to later retention time, while those containing Asn-184 and Asn-448 are unaffected by endo H. Thus, it can be determined readily by this method that the high-mannose oligosaccharides of rt-PA are attached exclusively to Asn-117.

Carr and Roberts (60) have described a method for identifying glycopeptides using FAB-MS rather than chromatography. This technique takes advantage of the fact that deglycosylation with PNGase F converts the attachment Asn residue to Asp (58); thus, FAB-MS analysis of a PNGase F-treated glycopeptide will reveal an ion that is increased by 1 amu over the theoretical (nonglycosylated) peptide mass for each site deglycosylated by PNGase F. This technique has been applied successfully to rt-PA (60) and to recombinant soluble CD4 (61).

Both of the approaches described above will allow unambiguous characterization of glycosylation sites so long as it is possible to fragment the protein such that no single glycopeptide contains more than one glycosylation site. For a heavily glycosylated protein, such as the envelope glycoprotein (gp120) of the human immunodeficiency virus (HIV-1), it may not be possible to isolate glycosylation sites on individual glycopeptides. Recombinant gp120 (rgp120), which has a polypeptide mass of 60 000, is approximately 50% carbohydrate by weight and contains 24 potential N-glycosylation sites (62). When reduced and S-carboxymethylated rgp120 is treated with trypsin, many of the resulting glycopeptides contain two, three, or even four glycosylation sites. Therefore, in our characterization studies of rgp120 (63), we devised a technique to allow unambiguous characterization of several glycosylation sites within the same glycopeptide, shown schematically in Figure 7. The glycoprotein is subjected to digestion with endo H followed by PNGase F. Endo H treatment hydrolyzes the chitobiose core of high-mannose and/or hybrid oligosaccharides, leaving a residue of N-acetylglucosamine attached to the glycosylation site. The subsequent PNGase F treatment removes the endo H-resistant (i.e. complex-type) oligosaccharides and converts their attachment Asn residues to Asp residues (58). The GlcNAc-Asn product of the endo H digestion of high-mannose or hybrid oligosaccharides is not itself a substrate for PNGase F. Therefore, the sequential treatment with these two enzymes results in GlcNAc-Asn at a site that carried high-mannose or hybrid oligosaccharides, Asp at a site that carried complex-type structures, and Asn at a site that was not glycosylated.
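The logic of the sequential digestion can be written out as two small state transitions. This is a schematic model of the scheme described above, not a simulation of enzyme kinetics; the state labels and function names are chosen here for illustration, and complete digestion at every step is assumed.

```python
# Site states before digestion (labels are illustrative).
HIGH_MANNOSE_OR_HYBRID = "high-mannose/hybrid-Asn"
COMPLEX = "complex-Asn"
UNOCCUPIED = "Asn"


def endo_h(site: str) -> str:
    """Endo H cleaves within the core of high-mannose or hybrid chains,
    leaving a single GlcNAc on the Asn; other sites are unchanged."""
    return "GlcNAc-Asn" if site == HIGH_MANNOSE_OR_HYBRID else site


def pngase_f(site: str) -> str:
    """PNGase F releases intact N-linked chains and converts the Asn to Asp.
    GlcNAc-Asn and unoccupied Asn are not substrates."""
    return "Asp" if site in (HIGH_MANNOSE_OR_HYBRID, COMPLEX) else site


def label_site(site: str) -> str:
    """Endo H first, then PNGase F, as in the scheme of Figure 7."""
    return pngase_f(endo_h(site))


# Reading the label back to the original status of the site.
READOUT = {
    "GlcNAc-Asn": "high-mannose or hybrid",
    "Asp": "complex",
    "Asn": "not glycosylated",
}

for site in (HIGH_MANNOSE_OR_HYBRID, COMPLEX, UNOCCUPIED):
    product = label_site(site)
    print(f"{site} -> {product} ({READOUT[product]})")
```

The order of the two enzymes matters in this model: PNGase F alone would convert both kinds of occupied site to Asp, so the distinction between high-mannose or hybrid and complex would be lost.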
Having thus labeled the individual glycosylation sites, it is possible to characterize the resulting peptides either by tandem mass spectrometry or by N-terminal sequence analysis. Automated N-terminal sequence analysis is particularly well suited to this purpose because of the great sensitivity of the technique and because the appearance of the phenylthiohydantoin (PTH) derivatives of GlcNAc-Asn, Asn, or Asp at particular cycles in the repetitive sequence analysis of a peptide can be correlated directly to individual glycosylation sites within a multiply glycosylated peptide. Figure 8 shows the separation between the PTH derivatives of GlcNAc-Asn, Asn, and Asp on the PTH analyzer of an Applied Biosystems Model 477A/120A protein sequencer. These conditions differ slightly from those published by Paxton and co-workers (64). Figure 9 shows partial sequence data obtained with a glycopeptide from rgp120 using this procedure. In this example the appearance of the PTH derivative of GlcNAc-Asn at cycle 5 indicated that rgp120 carries high-mannose oligosaccharides at Asn-362. Using this approach, we were able to determine which of the 24 potential glycosylation sites of rgp120 were utilized and which sites carried high-mannose and/or hybrid oligosaccharides as the predominant structures (63).

The techniques described above are adequate for characterizing glycosylation site heterogeneity to the level of locating the positions of any endo H-susceptible structures. To carry out a detailed characterization of the distribution of oligosaccharides at individual sites is significantly more difficult and material-consuming. The most sensitive approach to characterizing glycosylation site heterogeneity is probably mass spectrometry, either of derivatized glycopeptides (28, 61, 65) or of suitably derivatized oligosaccharides released from isolated glycopeptides (28, 61, 66).
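What a mass measurement reports is, in essence, the monosaccharide composition. The sketch below computes the monoisotopic mass of a free, underivatized oligosaccharide from standard residue masses (each residue is the monosaccharide minus one water, plus one water for the whole chain). The function name is illustrative, and the derivatized species actually analyzed in the cited work would have different masses. Galactose and mannose are both hexoses and so contribute identical masses.

```python
from typing import Dict

# Monoisotopic residue masses in Da (monosaccharide minus H2O).
RESIDUE_MASS = {
    "Man": 162.05282,     # hexose
    "Gal": 162.05282,     # hexose, same elemental composition as Man
    "GlcNAc": 203.07937,  # N-acetylhexosamine
    "Fuc": 146.05791,     # deoxyhexose
    "NeuAc": 291.09542,   # N-acetylneuraminic acid
}
WATER = 18.01056


def oligosaccharide_mass(composition: Dict[str, int]) -> float:
    """Monoisotopic mass of a free oligosaccharide with the given composition."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items()) + WATER


# Peak 1 of Table I: fucosylated, desialylated diantennary structure.
peak_1 = {"Gal": 2, "Man": 3, "GlcNAc": 4, "Fuc": 1}
# A hypothetical isomer in which one Gal is replaced by Man.
isomer = {"Gal": 1, "Man": 4, "GlcNAc": 4, "Fuc": 1}

print(f"{oligosaccharide_mass(peak_1):.2f}")
print(f"{oligosaccharide_mass(isomer):.2f}")
```

Both compositions give the same value, and so would any two structures that differ only in linkage or branching, such as the 2,4- and 2,6-branched triantennary isomers of Table I.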
The main limitation of mass spectrometry for these purposes is that it is unable to distinguish between some of the commonly occurring isomeric
ANALYTICAL CHEMISTRY, VOL. 62, NO. 17, SEPTEMBER 1, 1990
[Figure 7 scheme. High mannose (or hybrid): endo H leaves GNAc-Asn, which is unchanged by PNGase F. Complex: unchanged by endo H; PNGase F yields Asp. Non-glycosylated: Asn is unchanged by both enzymes.]
Figure 7. Scheme showing the use of sequential digestions with endo H and PNGase F to characterize potential N-glycosylation sites. The abbreviations used in this scheme are as follows: M, mannose; GNAc, N-acetylglucosamine; Sia, sialic acid (N-acetylneuraminic acid). Reprinted with permission from ref 61. Copyright 1990 by the American Society for Biochemistry and Molecular Biology.
Figure 8. Separation of PTH-GlcNAc-Asn from other PTH derivatives. Separations were carried out on the PTH analyzer of an Applied Biosystems Model 477A/120A gas-phase sequencer: upper trace, detail from the separation of a PTH amino acid calibration mixture; lower trace, separation of the PTH derivatives of GlcNAc-Asn (Sigma) and Asp.
residues (e.g. galactose and mannose) or oligosaccharide structures (e.g. 2,4-branched and 2,6-branched triantennary oligosaccharides) and gives little or no information about linkage positions and anomeric configurations.
The technique of HPAE chromatography is well suited to examining glycosylation site heterogeneity, because the sensitivity of pulsed amperometric detection makes it feasible to carry out analyses on amounts of tryptic peptides that can be prepared without major difficulty and because the selectivity of the chromatography is adequate to allow identification of most of the commonly occurring N-linked structures. Figures 10 and 11 give an example of the use of reversed-phase tryptic mapping in conjunction with HPAE to characterize the site heterogeneity at Asn-448 of rt-PA. The Asn-448-containing glycopeptide cluster (shaded in Figure 10) was purified by reversed-phase HPLC. The isolated glycopeptide was treated with PNGase F to liberate the carbohydrates, which were then desialylated by mild acid treatment (3) and
analyzed by HPAE chromatography (Figure 11). The identifications of the peaks in Figure 11 are as described for Figure 4. Similar analyses carried out with the Asn-117- and Asn-184-containing glycopeptides allowed the detailed description of the glycosylation site heterogeneity of rt-PA (3). This information is summarized for type I rt-PA in Figure 12. The most obvious aspect of rt-PA site heterogeneity is that Asn-117 carries exclusively high-mannose oligosaccharides (Figure 12, panel A), while Asn-184 and Asn-448 carry complex-type structures (Figure 12, panel B). As discussed above, this level of site heterogeneity is readily detectable by several other methods, including peptide mapping of the endo H-treated protein. The use of high-resolution oligosaccharide separations (Figure 11) allows characterization of more subtle features of glycosylation site heterogeneity. As shown in Figure 12, panel B, Asn-184 and Asn-448 carry populations of oligosaccharides that differ from each other in the ratio of diantennary to triantennary structures and in the relative amounts of 2,4-branched vs. 2,6-branched triantennary oligosaccharides. Analysis of the isolated rt-PA variant that lacks carbohydrate at Asn-184 (type II rt-PA) demonstrated that oligosaccharide processing at Asn-117 and Asn-448 is unaffected by the presence or absence of carbohydrate attached to Asn-184 (3). The differences between the glycosylation patterns at the three glycosylation sites of rt-PA are a further indication of the importance of the tertiary structure of a protein in influencing oligosaccharide processing.
CONCLUSIONS
The availability of large amounts of material and the need for repetitive (e.g. lot-to-lot) analyses influence the approaches used to characterize recombinant glycoproteins that are intended for human pharmaceutical use. High-resolution chromatographic separations of oligosaccharides are particularly valuable because the chromatographic "fingerprint" of a glycoprotein can be used to monitor changes in glycosylation between production lots or those that result from changes in cell culture conditions. The technique of HPAE chromatography, which is sensitive to changes in molecular size, linkage positions, branching patterns, and anomeric configurations, is well suited to these applications and can be a
[Figure 9 traces. Peptide sequence: Thr-Gln-Leu-Phe-Asn(GlcNAc)-Ser... PTH chromatograms for cycles 1 through 6; labeled peaks include Thr, Gln, Leu, Phe, and, at cycle 5, GlcNAc-Asn.]
Figure 9. Sequential Edman degradation of an rgp120 peptide treated as shown in Figure 7. Cycle 5 corresponds to Asn-362 of rgp120.
[Figure 10 chromatogram; horizontal axis: Time (min).]
Figure 10. Characterization of glycosylation site heterogeneity. The chromatogram is a detail from the tryptic map of type I rt-PA. Chromatographic conditions were as described in ref 3. The peaks arising from the Asn-448-containing glycopeptide (shaded) were collected, pooled, and treated with PNGase F. The released oligosaccharides were then desialylated and analyzed by HPAE chromatography (Figure 11).
valuable method for carbohydrate characterization when used in combination with standard physical and chemical techniques (NMR, FAB-MS, methylation analysis). Potential N-glycosylation sites can be characterized by any of several methods, including peptide mapping, mass spectrometry, and HPAE chromatography of oligosaccharides released from isolated glycopeptides. Treatment of a glycoprotein with endo H followed by PNGase F results in Asn at a non-glycosylated site, Asp at a site that carries complex-type structures, and
[Figure 11 chromatogram; horizontal axis: Time (min).]
Figure 11. HPAE chromatographic analysis of the desialylated oligosaccharides released from Asn-448 of type I rt-PA. Chromatographic conditions were described in Figure 2. Peak identities are given in Table 1.
GlcNAc-Asn at a site that carried high-mannose or hybrid-type structures. This treatment can be used in conjunction with automated Edman degradation and is particularly valuable for characterizing peptides with more than one glycosylation site.
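The correlation between Edman cycles and glycosylation sites can be illustrated with a short sketch. This is a hypothetical illustration rather than the sequencer software or the authors' procedure: the function and variable names are invented, the PTH calls are typed in by hand to mirror the rgp120 peptide of Figure 9, and the starting residue number (358) is inferred from the statement that cycle 5 corresponds to Asn-362.

```python
# Illustrative sketch only; names and input format are hypothetical.
# Interpretation of the PTH derivative found at a potential N-glycosylation
# site after endo H followed by PNGase F treatment.
PTH_TO_SITE_CLASS = {
    "GlcNAc-Asn": "high-mannose or hybrid",
    "Asp": "complex-type",
    "Asn": "not glycosylated",
}


def classify_sites(pth_calls, first_residue_number, potential_sites):
    """Map Edman cycles to sequence positions and classify potential sites.

    pth_calls: PTH derivative identified at each cycle, in cycle order.
    first_residue_number: sequence position of the peptide's N-terminal residue.
    potential_sites: positions of the potential N-glycosylation sites.
    Only positions listed in potential_sites are interpreted, so an Asp that
    is encoded in the sequence elsewhere is not mistaken for a converted Asn.
    """
    results = {}
    for cycle, pth in enumerate(pth_calls, start=1):
        position = first_residue_number + cycle - 1
        if position in potential_sites:
            results[position] = PTH_TO_SITE_CLASS.get(pth, "unassigned")
    return results


# Calls mirroring Figure 9: GlcNAc-Asn appears at cycle 5 (Asn-362).
calls = ["Thr", "Gln", "Leu", "Phe", "GlcNAc-Asn", "Ser"]
print(classify_sites(calls, 358, {362}))
# prints {362: 'high-mannose or hybrid'}
```

Because each cycle maps to exactly one sequence position, a peptide carrying several potential sites is handled by listing all of them in `potential_sites`; no further fragmentation of the peptide is needed.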
[Figure 12 histogram; horizontal axis: Structure.]
Figure 12. Histogram summarizing the glycosylation site heterogeneity of type I rt-PA: A, high-mannose oligosaccharides; B, desialylated complex oligosaccharides. The symbols used in the shorthand structures are given in Figure 2. Detailed structures are given in Table 1.
ACKNOWLEDGMENT
I thank the following people: John O'Connor and William Hancock for helpful discussions and advice; Louisetta Basa and Cordelia Leonard, who carried out most of the experiments reported here; and Reed Harris, who worked out sequencer conditions for characterizing glycosylation sites.
LITERATURE CITED
(1) Arathoon, W. R.; Birch, J. R. Science 1986, 232, 1390-1395.
(2) Pennica, D.; Holmes, W. E.; Kohr, W. J.; Harkins, R. N.; Vehar, G. A.; Ward, C. A.; Bennett, W. F.; Yelverton, E.; Seeburg, P. H.; Heyneker, H. L.; Goeddel, D. V.; Collen, D. Nature 1983, 301, 214-221.
(3) Spellman, M. W.; Basa, L. J.; Leonard, C. K.; Chakel, J. A.; O'Connor, J. V.; Wilson, S.; Van Halbeek, H. J. Biol. Chem. 1989, 264, 14100-14111.
(4) Jacobs, K.; Shoemaker, C.; Rudersdorf, R.; Neill, S. D.; Kaufman, R. J.; Mufson, A.; Seehra, J.; Jones, S. S.; Hewick, R.; Fritsch, E. F.; Kawakita, M.; Shimizu, T.; Miyake, T. Nature 1985, 313, 806-810.
(5) Sasaki, H.; Bothner, B.; Dell, A.; Fukuda, M. J. Biol. Chem. 1987, 262, 12059-12076.
(6) Smith, D. H.; Byrn, R. A.; Marsters, S. A.; Gregory, T.; Groopman, J. E.; Capon, D. J. Science 1987, 238, 1704-1707.
(7) Vehar, G. A.; Keyt, B.; Eaton, D.; Rodriguez, H.; O'Brien, D. P.; Rotblat, F.; Oppermann, H.; Keck, R.; Wood, W. I.; Harkins, R. N.; Tuddenham, E. G. D.; Lawn, R. M.; Capon, D. J. Nature 1984, 312, 337-342.
(8) Conradt, H. S.; Egge, H.; Peter-Katalinic, J.; Reiser, W.; Siklosi, T.; Schaper, K. J. Biol. Chem. 1987, 262, 14600-14605.
(9) Kagawa, Y.; Takasaki, S.; Utsumi, J.; Hosoi, K.; Shimizu, H.; Kochibe, N.; Kobata, A. J. Biol. Chem. 1988, 263, 17508-17515.
(10) Rademacher, T. W.; Parekh, R. B.; Dwek, R. A. Annu. Rev. Biochem. 1988, 57, 785-838.
(11) Einarsson, M.; Brandt, J.; Kaplan, L. Biochim. Biophys. Acta 1985, 830, 1-10.
(12) Opdenakker, G.; Van Damme, J.; Bosman, F.; Billiau, A.; De Somer, P. Proc. Soc. Exp. Biol. Med. 1986, 182, 248-257.
(13) Wittwer, A. J.; Howard, S. C.; Carr, L. S.; Harakas, N. K.; Feder, J.; Parekh, R. B.; Rudd, P. M.; Dwek, R. A.; Rademacher, T. W. Biochemistry 1989, 28, 7662-7669.
(14) Smedsrod, B.; Einarsson, M.; Pertoft, H. Thromb. Haemostasis 1988, 59, 480-484.
(15) Hotchkiss, A.; Refino, C. J.; Leonard, C. K.; O'Connor, J. V.; Crowley, C.; McCabe, J.; Tate, K.; Nakamura, G.; Powers, D.; Levinson, A.; Mohler, M.; Spellman, M. W. Thromb. Haemostasis 1988, 60, 255-261.
(16) Collen, D.; Stassen, J.; Larsen, G. Blood 1988, 71, 216-219.
(17) Rambach, W. A.; Shaw, R. A.; Cooper, J. A. D.; Alt, H. L. Proc. Soc. Exp. Biol. Med. 1958, 99, 482-483.
(18) Lukowsky, W. A.; Painter, R. H. Can. J. Biochem. 1972, 50, 909-917.
(19) Goldwasser, E.; Kung, C. K.-H.; Eliason, J. J. Biol. Chem. 1974, 249, 4202-4206.
(20) Dube, S.; Fisher, J. W.; Powell, J. S. J. Biol. Chem. 1988, 263, 17516-17521.
(21) Feizi, T.; Childs, R. A. Biochem. J. 1987, 245, 1-11.
(22) Kagawa, Y.; Takasaki, S.; Utsumi, J.; Hosoi, K.; Shimizu, H.; Kochibe, N.; Kobata, A. J. Biol. Chem. 1988, 263, 17508-17515.
(23) Kornfeld, R.; Kornfeld, S. Annu. Rev. Biochem. 1985, 54, 631-664.
(24) Osawa, T.; Tsuji, T. Annu. Rev. Biochem. 1987, 56, 21-42.
(25) Merkle, R. K.; Cummings, R. D. Methods Enzymol. 1987, 138, 232-259.
(26) Green, E. D.; Baenziger, J. U. Trends Biochem. Sci. 1989, 14, 168-172.
(27) Yamashita, K.; Mizuochi, T.; Kobata, A. Methods Enzymol. 1982, 83, 105-126.
(28) Dell, A. Adv. Carbohydr. Chem. Biochem. 1987, 45, 19-72.
(29) Egge, H.; Peter-Katalinic, J. Mass Spectrom. Rev. 1987, 6, 331-393.
(30) Vliegenthart, J. F. G.; Dorland, L.; Van Halbeek, H. Adv. Carbohydr. Chem. Biochem. 1983, 41, 209-374.
(31) Lindberg, B. Methods Enzymol. 1972, 28, 178-195.
(32) Mellis, S. J.; Baenziger, J. U. Anal. Biochem. 1983, 134, 442-449.
(33) Hase, S.; Natsuka, S.; Oku, H.; Ikenaka, T. Anal. Biochem. 1987, 167, 321-326.
(34) Hancock, W. S.; Canova-Davis, E.; Battersby, J.; Chloupek, R. In Biotechnologically Derived Medical Agents; Gueriguian, J. L., Fattorusso, V., Poggiolini, D., Eds.; Raven Press: New York, 1988; pp 29-49.
(35) Garnick, R. L.; Solli, N. J.; Papa, P. A. Anal. Chem. 1988, 60, 2546-2557.
(36) Kobata, A. Biol. Carbohydr. 1984, 2, 87-161.
(37) Koerner, T. A. W.; Prestegard, J. H.; Yu, R. K. Methods Enzymol. 1987, 138, 38-59.
(38) Mizuochi, T.; Spellman, M. W.; Larkin, M.; Solomon, J.; Basa, L. J.; Feizi, T. Biochem. J. 1988, 254, 599-603.
(39) Mizuochi, T.; Spellman, M. W.; Larkin, M.; Solomon, J.; Basa, L. J.; Feizi, T. Biomed. Chromatogr. 1988, 2, 260-270.
(40) Parekh, R. B.; Dwek, R. A.; Thomas, J. R.; Opdenakker, G.; Rademacher, T. W.; Wittwer, A. J.; Howard, S. C.; Nelson, R.; Siegel, N. R.; Jennings, M. G.; Harakas, N. K.; Feder, J. Biochemistry 1989, 28, 7644-7662.
(41) Parekh, R. B.; Dwek, R. A.; Rudd, P. M.; Thomas, J. R.; Rademacher, T. W.; Warren, T.; Wun, T.-C.; Hebert, B.; Reitz, B.; Palmier, M.; Ramabhadran, T.; Tiemeier, D. C. Biochemistry 1989, 28, 7670-7679.
(42) Hardy, M. R.; Townsend, R. R. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 3289-3293.
(43) Chen, L.-M.; Yet, M.-G.; Shao, M.-C. FASEB J. 1988, 2, 2819-2824.
(44) Rendleman, J. A. In Carbohydrates in Solution; Advances in Chemistry Series No. 117; Gould, R. F., Ed.; American Chemical Society: Washington, DC, 1971; pp 51-69.
(45) Basa, L. J.; Spellman, M. W. J. Chromatogr. 1990, 499, 205-220.
(46) Liang, C.-J.; Yamashita, K.; Kobata, A. J. Biochem. 1980, 88, 51-58.
(47) Spik, G.; Debruyne, J.; Montreuil, J.; Van Halbeek, H.; Vliegenthart, J. F. G. FEBS Lett. 1985, 183, 65-69.
(48) Green, E. D.; Adelt, G.; Baenziger, J. U.; Wilson, S.; Van Halbeek, H. J. Biol. Chem. 1988, 263, 18253-18268.
(49) Struck, D. K.; Lennarz, W. J. In The Biochemistry of Glycoproteins and Proteoglycans; Lennarz, W. J., Ed.; Plenum: New York, 1980; pp 35-83.
(50) Hayes, M. L.; Castellino, F. J. J. Biol. Chem. 1979, 254, 8768-8771.
(51) Pohl, G.; Kallstrom, M.; Bergsdorf, N.; Wallen, P.; Jornvall, H. Biochemistry 1984, 23, 3701-3707.
(52) Paborsky, L. R.; Harris, R. J. Unpublished work.
(53) Pohl, G.; Kenne, L.; Nilsson, B.; Einarsson, M. Eur. J. Biochem. 1987, 170, 69-75.
(54) Rothman, R. J.; Warren, L.; Vliegenthart, J. F. G.; Hard, K. J. Biochemistry 1989, 28, 1377-1384.
(55) Goochee, C. F.; Monica, T. Bio/Technology 1990, 8, 421-426.
(56) Hsieh, P.; Rosner, M. R.; Robbins, P. W. J. Biol. Chem. 1983, 258, 2555-2561.
(57) Harris, R. J.; Chamow, S. M.; Gregory, T. J.; Spellman, M. W. Eur. J. Biochem. 1990, 188, 291-300.
(58) Tarentino, A. L.; Gomez, C. M.; Plummer, T. H., Jr. Biochemistry 1985, 24, 4665-4671.
(59) Tai, T.; Yamashita, K.; Kobata, A. Biochem. Biophys. Res. Commun. 1977, 78, 434-441.
(60) Carr, S. A.; Roberts, G. D. Anal. Biochem. 1986, 157, 396-406.
(61) Carr, S. A.; Hemling, M. E.; Folena-Wasserman, G.; Sweet, R. W.; Anumula, K.; Barr, J. R.; Huddleston, M. J.; Taylor, P. J. Biol. Chem. 1989, 264, 21286-21295.
(62) Lasky, L. A.; Groopman, J. E.; Fennie, C. W.; Benz, P. M.; Capon, D. J.; Dowbenko, D. J.; Nakamura, G. R.; Nunes, W. M.; Renz, M. E.; Berman, P. W. Science 1986, 233, 209-212.
(63) Leonard, C. K.; Spellman, M. W.; Riddle, L.; Harris, R. J.; Thomas, J. N.; Gregory, T. J. J. Biol. Chem. 1990, 265, 10373-10382.
(64) Paxton, R. J.; Mooser, G.; Pande, H.; Lee, T. D.; Shively, J. E. Proc. Natl. Acad. Sci. U.S.A. 1987, 84, 920-924.
(65) Carr, S. A.; Roberts, G. D.; Jurewicz, A.; Frederick, B. Biochimie 1988, 70, 1445-1454.
(66) Webb, J. W.; Jiang, K.; Gillece-Castro, B. L.; Tarentino, A. L.; Plummer, T. H.; Byrd, J. C.; Fisher, S. J.; Burlingame, A. L. Anal. Biochem. 1988, 169, 337-349.