Anal. Chem. 1993, 65, 877-884
Collisional Fragmentation of Glycopeptides by Electrospray Ionization LC/MS and LC/MS/MS: Methods for Selective Detection of Glycopeptides in Protein Digests
Michael J. Huddleston, Mark F. Bean,* and Steven A. Carr*
SmithKline Beecham Pharmaceuticals, P.O. Box 1539, King of Prussia, Pennsylvania 19406
Mass spectrometric methods of glycopeptide-specific detection in liquid chromatography/electrospray mass spectrometry (LC/ESMS) of glycoprotein digests are explored using a variety of glycopeptide models and then applied to soluble complement receptor type I, a 240-kDa glycoprotein containing 25 potential sites of N-glycosylation. The most specific method, requiring a triple quadrupole, involves monitoring of sugar oxonium fragment ions during precursor-ion scan ESMS/MS. Signals derived from nonglycosylated peptides are virtually eliminated, resulting in a total-ion current chromatographic trace of only the glycopeptides present in the digest. The corresponding mass spectra yield molecular weight and glycopeptide microheterogeneity information. An alternative and complementary approach that we term collisional-excitation scanning also involves fragmentation of glycopeptides to sugar oxonium ion fragments but does not involve any mass selection process, permitting the experiment to be performed on a single quadrupole instrument. The resulting total ion chromatogram is similar to the UV chromatogram (215 nm), but a selected ion chromatogram for carbohydrate-specific ions such as the N-acetylhexosamine oxonium ion (m/z 204) produces a glycopeptide-specific trace. Although there can sometimes be peptide interferences in the spectra of the indicated glycopeptide-containing chromatographic peaks, this latter approach permits peptide mapping to be performed on the same data set that also indicates the location of glycopeptides in the chromatogram. Both methods are suitable for detection of glycopeptides with all common classes of oligosaccharides in either N- or O-linkage to the peptide.
INTRODUCTION
Determining the locations, heterogeneity, and structures of oligosaccharides in recombinant glycoproteins is valuable to the biotechnology industry because such information can be used to correlate efficacy with structure and is often needed for regulatory purposes.1,2 To date, most techniques for carbohydrate analysis have involved chemical or enzymatic release of carbohydrates prior to analysis.3,4 However, characterizing glycopeptides, as opposed to analyzing the pool of released carbohydrates from a glycoprotein, has the advantage that the sequence context of each specific family of glycoforms is preserved. Several methods for locating glycopeptide-containing fractions in an HPLC map of a glycoprotein digest have been developed. The phenol-sulfuric acid method, a classic colorimetric approach for oligosaccharide detection, is still one of the more frequently used methods of glycopeptide detection and localization because it is simple and quantitative and can be performed on glycoproteins, glycopeptides, or oligosaccharides.5 Nevertheless, localization of glycopeptides in a chromatographic run by this method does require testing of each of the isolated fractions separately. Glycopeptides can be selectively isolated from protein digests by appropriate use of lectin binding (concanavalin A or wheat germ agglutinin are often used); mass spectrometry can then be used for initial characterization of the molecular weight and microheterogeneity, although care must be taken to thoroughly desalt samples isolated from lectin columns.6 Alternately, lectins can be made part of a glycopeptide-specific staining procedure.7 Recently, there has been progress in adapting chemical methods to quick, high-throughput analyses using microtiter plates and microsample readers.8 Glycopeptide structural information from the above methods is often scant or absent.
Several mass spectrometric (MS) techniques are used for locating glycopeptides. All of these have the advantage over chemical or lectin-based methods in that they yield molecular weight information. Together with prior knowledge of the recombinant protein sequence and of the range of carbohydrate structures previously characterized from the expression cell line used, molecular weight measurements can reveal a relatively detailed picture of the site of glycosylation, the oligosaccharide structural classes present, and their molecular heterogeneity. One of the earlier such methods involved analytical HPLC separation (with UV9 or UV and mass spectrometric detection10) of the components of a protein digest before and after PNGase F enzymatic release of N-linked oligosaccharides, followed by a search for differences between the two chromatograms. This method was limited to N-linked glycopeptides.
* Authors to whom correspondence should be addressed.
(1) Cummings, D. A. Glycobiology 1991, 1, 115-130.
(2) Goochee, C. F.; Gramer, M. J.; Andersen, D. C.; Rahr, J. B.; Rasmussen, J. R. Bio/Technology 1991, 9, 1347-1355.
(3) Hardy, M. R.; Townsend, R. R. Proc. Natl. Acad. Sci. U.S.A. 1988, 85, 3289-3293.
(4) Lee, Y. C. Anal. Biochem. 1990, 189, 151-162.
Other MS methods depend on the frequently observed heterogeneity of the carbohydrate portion of the glycopeptide; a glycopeptide with more carbohydrate attached tends to elute slightly earlier in reversed-phase separations than the glycopeptide containing less carbohydrate. Although the chromatographic separation is insufficient for UV detection to resolve the peaks, the adjacent mass spectra appear quite different.11 As a result, a diagonal line can be observed after plotting m/z versus time, whereas peptides result in lines parallel to the m/z axis.12 Alternately, a peak-pair searching algorithm can be used to query LC/ESMS data by automatically searching the mass spectral data for characteristic m/z differences between spectral peaks such as those corresponding to hexose or N-acetylhexosamine.13 Both of these methods fail in the absence of site microheterogeneity.
(5) Dubois, M.; Gilles, K. A.; Hamilton, J. K.; Rebers, P. A.; Smith, F. Anal. Chem. 1956, 28, 350-356.
(6) Bean, M. F.; Bangs, J. D.; Doering, T. L.; Englund, P. T.; Cotter, R. J. Anal. Chem. 1989, 61, 2686-2688.
(7) Hsi, K.-L.; Chen, L.; Hawke, D. H.; Zieske, L. R.; Yuan, P.-M. Anal. Biochem. 1991, 198, 238-245.
(8) Fox, J. D.; Robyt, J. F. Anal. Biochem. 1991, 195, 93-96.
(9) L'Italien, J. J. J. Chromatogr. 1986, 359, 213-220.
(10) Carr, S. A.; Roberts, G. D. Anal. Biochem. 1986, 157, 396-406.
(11) Hemling, M. E.; Roberts, G. D.; Johnson, W.; Carr, S. A.; Covey, T. R. Biomed. Environ. Mass Spectrom. 1990, 19, 677-691.
(12) Ling, V.; Guzzetta, A. W.; Canova-Davis, E.; Stults, J. T.; Hancock, W. S.; Covey, T. R.; Shushan, B. I. Anal. Chem. 1991, 63, 2909-2916.
0003-2700/93/0365-0877$04.00/0 © 1993 American Chemical Society
ANALYTICAL CHEMISTRY, VOL. 65, NO. 7, APRIL 1, 1993
The need to find faster, more sensitive, more selective, and more universal techniques for locating glycopeptides during chromatography of glycoprotein digests becomes pressing when dealing with larger glycoproteins with many potential glycosylation sites. For example, the soluble complement receptor glycoprotein (sCR1) discussed in this paper is 240 kDa and has 25 potential sites of N-glycosylation. A tryptic digest of this glycoprotein is expected to produce 174 tryptic fragments, many of them glycopeptides. The HPLC separation of this digest is exceedingly complex, and most of the glycopeptides coelute with peptides in the digest. Thus, techniques for glycopeptide detection that rely solely on HPLC UV difference mapping before and after removal of carbohydrate9 can only identify a small percentage of the glycopeptides present. Manually interrogating individual chromatographic fractions by chemical or mass spectrometric means quickly becomes unwieldy and time-consuming as the number of fractions increases. What we needed was an on-line HPLC method which not only detected the location of glycopeptide-containing fractions but also yielded structural information while retaining the peptide sequence context of the attachment site.
N-linked oligosaccharides expressed in mammalian cells have somewhat restricted heterogeneity with well-characterized structural types; the most common are represented by the following: deoxyhexose (fucose), hexose (mannose or galactose), N-acetylhexosamine (N-acetylglucosamine or N-acetylgalactosamine), and N-acetylneuraminic acid with the same Man3GlcNAc2-(Asn-X-Ser/Thr) core. O-Linked oligosaccharides usually contain hexose (galactose) or N-acetylhexosamine (N-acetylgalactosamine) with a common HexNAc(Ser/Thr) core. Our objective was to develop methods for specific detection and molecular weight determination of all types of glycopeptides using liquid chromatography/electrospray mass spectrometry (LC/ESMS).
In this paper we describe the collision-induced fragment ions of oligosaccharides and glycopeptides that may be exploited for selective detection of glycopeptides in complex digests. Neutral-loss scanning and parent- or precursor-ion scanning are compared. The analytical utility of collision-induced dissociation (CID) in Q2 of a triple quadrupole and CID in the ion source region of a single quadrupole, together with a novel method we refer to as collisional-excitation scanning, are discussed. Simultaneous with our initial presentation of this work on glycopeptide-specific detection,13,14 Conboy and Henion presented preliminary data on the use of high-orifice voltage CID LC/ESMS for glycopeptide analysis.15,16 An application of the methodology described in this paper to fetuin glycoprotein and its extension to the distinction of O- and N-linked glycopeptides has been published elsewhere.17
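The peak-pair searching idea mentioned in the Introduction can be illustrated with a minimal sketch. This is not the algorithm of ref 13; the function name, the tolerance, the charge-state range, and the peak list are all hypothetical and chosen only to show the principle of looking for m/z spacings equal to a sugar residue mass divided by the charge.

```python
from itertools import combinations

# Monoisotopic residue masses (Da) of the two sugar units named in the text.
SUGAR_RESIDUES = {"Hex": 162.0528, "HexNAc": 203.0794}


def find_sugar_peak_pairs(peaks_mz, max_charge=3, tol=0.3):
    """Return (low_mz, high_mz, sugar, charge) for every pair of peaks whose
    m/z spacing matches a sugar residue mass divided by an assumed charge."""
    hits = []
    for low, high in combinations(sorted(peaks_mz), 2):
        delta = high - low
        for sugar, mass in SUGAR_RESIDUES.items():
            for z in range(1, max_charge + 1):
                if abs(delta - mass / z) <= tol:
                    hits.append((low, high, sugar, z))
    return hits


# Hypothetical peak list: three doubly charged glycoforms plus one unrelated peak.
peaks = [1000.0, 1081.0, 1182.6, 1300.0]
pairs = find_sugar_peak_pairs(peaks)
for low, high, sugar, z in pairs:
    print(f"{low} -> {high}: +{sugar} at charge {z}+")
```

As the text notes, a search of this kind finds nothing when a site carries a single glycoform, because no sugar-spaced partner peak exists.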
EXPERIMENTAL SECTION
Chemicals. Man4GlcNAc, Gal-GalNAc-Ser, and the Asn-linked oligosaccharides illustrated in Figure 1 were obtained from BioCarb Chemicals (Accurate Chemical & Scientific Corp., Westbury, NY) and used without desalting or purification. A soluble, recombinant form of the CD4 surface glycoprotein (the lymphocytic receptor of the HIV virus) which lacks the transmembrane domain (known as soluble or sCD4) was expressed in Chinese hamster ovary cells and purified at SmithKline Beecham as published.18 The secreted, truncated form of the HIV envelope protein gp120 (85 kDa) missing the first 31 amino acids from the mature form and containing the first four amino acids of tPA at the N-terminus was cloned, expressed in Drosophila cells, and purified at SmithKline Beecham as previously described.19 The clone of soluble human complement receptor type I, a recombinant soluble form of type I complement receptor lacking the transmembrane and cytoplasmic domains20 (sCR1, also known as TP-10 HD, 250 kDa), was received from T Cell Sciences, Inc. (Cambridge, MA), and the protein was expressed in Chinese hamster ovary cells.
(13) Carr, S. A.; Armbruster, T.; Hemling, M. E.; Soneson, K. K.; Huddleston, M. J. Proceedings of the 39th ASMS Conference on Mass Spectrometry and Allied Topics, Nashville, TN, May 19-24, 1991; pp 483-484.
(14) Huddleston, M. J.; Bean, M. F.; Barr, J. R.; Carr, S. A. Proceedings of the 39th ASMS Conference on Mass Spectrometry and Allied Topics, Nashville, TN, May 19-24, 1991; pp 280-281.
(15) Conboy, J. J.; Henion, J. D. Proceedings of the 39th ASMS Conference on Mass Spectrometry and Allied Topics, Nashville, TN, May 19-24, 1991; pp 1418-1419.
(16) Conboy, J. J. Ph.D. Thesis, Cornell University, New York, 1992.
(17) Carr, S. A.; Huddleston, M. J.; Bean, M. F. Protein Sci., in press.
(a) Galβ1-4GlcNAcβ1-2Manα1-6(Galβ1-4GlcNAcβ1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAcβ1-N-Asn
(c) Manα1-2Manα1-6(Manα1-2Manα1-3)Manα1-6(Manα1-2Manα1-2Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAcβ1-N-Asn
(d) Galβ1-3GalNAcα1-O-Ser
Figure 1. Structures of model glycopeptides: (a) biantennary complex Asn-linked oligosaccharide, M(monoisotopic or m.i.) = 1754.6; (b) same structure but with N-acetylneuraminic acids at the branch termini, M(m.i.) = 2336.8; (c) triantennary oligomannose Asn-linked oligosaccharide, M(m.i.) = 1996.7; (d) O-linked disaccharide, M(m.i.) = 470.2.
Enzymatic Digests. The native glycoproteins gp120, sCD4, and sCR1 were each reduced and S-carboxymethylated prior to proteolytic digestion with trypsin by the same procedure. Reduction was performed in 0.5 M Tris-HCl buffer, pH 8.3, containing 2 mM EDTA and 6 M guanidine hydrochloride with dithiothreitol in a 50-fold molar excess over cysteine. Denaturation and reduction were allowed to proceed under nitrogen for 3 h at 37 °C. An aliquot of aqueous iodoacetic acid (100-fold molar excess over DTT) that had been degassed and pH adjusted to 8.3 with NH4OH was added and the reaction allowed to continue for 1 h. The glycoprotein was dialyzed against 3.5 L of 50 mM NH4HCO3 for 15 h at 4 °C and stored dried at -70 °C. Prior to proteolytic digestion, the samples were dissolved in 0.050 M NH4HCO3 buffer at pH 8.5. TPCK-treated trypsin (Cooper Biomedical, Malvern, PA) dissolved in the same buffer was added in two equal portions 3 h apart to give a final enzyme-to-substrate ratio of 1:50.
Digests were incubated at 37 °C for 6 h and halted by lowering the pH to 5-6 with dilute acetic acid.
Flow Injection and Liquid Chromatography. Oligosaccharide model compounds, Man4GlcNAc and Gal-GalNAc-Ser, were infused at 250 pmol/µL and the Asn-linked oligosaccharide models (Figure 1) flow injected at 100 pmol/µL in 50% methanol, 50% water, 0.2% formic acid (v/v/v) by means of a Model 22 syringe pump (Harvard Apparatus, S. Natick, MA). Chromatographic separations for microbore LC/MS/MS experiments were performed using a Brownlee Labs Micro Gradient System syringe pump and a 1.0 × 100-mm Aquapore column packed with 7-µm-diameter particle, 30-nm pore size, C18-bonded silica (Applied Biosystems, Foster City, CA). The eluting solvent was run as a linear gradient from (A) 0.1% trifluoroacetic acid (TFA) in water to (B) 90% acetonitrile, 10% H2O, 0.09% TFA using the following program: 0-60% B during 75 min and 60-100% B in 5 min at a flow rate of 40 µL/min. For microbore LC/MS/MS analyses we injected ca. 800 pmol of protein digest. The microbore LC/MS setup is illustrated in Figure 2. The column effluent is split 10:1 by means of a low dead-volume "tee" union (Valco, Houston, TX) with fused-silica capillary tubing outlets (Polymicro Technologies, Phoenix, AZ) of different diameters and lengths to produce the desired split ratio. The major portion of the stream flowed through a 100-µm-i.d. capillary to a Hewlett-Packard 1050 variable wavelength UV detector (Hewlett-Packard Co., San Fernando, CA) operated at 215 nm and then to a Gilson 203 fraction collector (Gilson Medical Electronics, Middleton, WI); the minor portion of the split stream passed to the electrospray nebulizing needle tip through a 50-µm-i.d. capillary.
(18) Deen, K. C.; McDougal, J. S.; Inacker, R.; Folena-Wasserman, G.; Arthos, J.; Rosenberg, J.; Maddon, P. J.; Axel, R.; Sweet, R. W. Nature 1988, 331, 82-84.
(19) Culp, J. S.; Johansen, H.; Hellmig, B.; Beck, J.; Matthews, T. J.; Delers, A.; Rosenberg, M. Bio/Technology 1991, 9, 173-177.
(20) Weisman, H. F.; Barrow, T.; Leppo, M. K.; Marsh, H. C., Jr.; Carson, G. R.; Concino, M. F.; Boyle, M. P.; Roux, K. H.; Weisfeldt, M. L.; Fearon, D. T. Science 1990, 249, 146-151.
Figure 2. Diagram of the LC/ESMS system. OR = orifice. Q(0-3) = quadrupole rod sets 0-3 (Q1 and Q3 are mass analyzers). FP = faraday plate. CEM = channel electron multiplier. High-OR refers to a relatively high voltage potential between OR and Q0, used to induce fragmentation of molecules prior to the first mass analyzer.
Mass Spectrometry. Electrospray mass spectra were recorded on an API-III triple-quadrupole mass spectrometer (Perkin-Elmer Sciex Instruments, Thornhill, Canada) fitted with an articulated, pneumatically-assisted nebulization probe and an atmospheric pressure ionization source. The tuning and calibration solution consisted of an equimolar mixture of poly(propylene glycol) (PPG) 425, 1000, and 2000 (3 × 10⁻⁵, 1 × 10⁻⁴, and 2 × 10⁻⁴ M, respectively) in 50/50/0.1 H2O/methanol/formic acid (v/v/v), 1 mM NH4OAc. Calibration across the m/z range 10-2400 was effected by multiple-ion monitoring of eight NH4OAc/PPG solution ion signals. Mass-to-charge ratio assignments in the figures are the measured peak tops and will be more closely related either to the monoisotopic or to the average mass depending on the molecular weight. Although computed peak centroids traditionally yield better mass assignments than peak tops, the presence of various charge states and the resulting variation in peak widths in a single spectrum makes such centroiding error-prone.
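The distinction drawn above between monoisotopic and average mass can be made concrete for the model compounds of Figure 1. The following is a minimal sketch, not part of the original method: the atomic masses are standard values rounded as shown, the helper names are hypothetical, and sugars are counted as in-chain residues (free sugar minus H2O) attached to a free amino acid.

```python
# Atomic masses (Da): monoisotopic (most abundant isotope) and average (natural abundance).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
AVG = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
PROTON = 1.00727647  # mass of a proton, Da

# Elemental compositions: sugars as residues, amino acids as free molecules.
UNITS = {
    "Hex": {"C": 6, "H": 10, "O": 5},
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
    "NeuAc": {"C": 11, "H": 17, "N": 1, "O": 8},
    "Asn": {"C": 4, "H": 8, "N": 2, "O": 3},
    "Ser": {"C": 3, "H": 7, "N": 1, "O": 3},
}


def molecular_mass(units, table):
    """Sum atomic masses over a composition given as {unit name: count}."""
    total = 0.0
    for name, count in units.items():
        for element, n in UNITS[name].items():
            total += count * n * table[element]
    return total


def mz(mass, charge):
    """m/z of the (M + zH)z+ ion for a neutral mass M."""
    return (mass + charge * PROTON) / charge


FIG_1A = {"Hex": 5, "HexNAc": 4, "Asn": 1}  # biantennary complex model
mono_1a = molecular_mass(FIG_1A, MONO)
avg_1a = molecular_mass(FIG_1A, AVG)
print(f"1a monoisotopic M = {mono_1a:.1f}, average M = {avg_1a:.1f}")
print(f"1a (M + 2H)2+ monoisotopic m/z = {mz(mono_1a, 2):.1f}")
```

For compound 1a the monoisotopic value agrees with the 1754.6 quoted in the Figure 1 caption, and the average mass is about 1 Da higher, which is why the text cautions that a measured peak top may lie closer to one or the other depending on molecular weight.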
The ion spray voltage was operated at 4.5 kV with 35-40 psi "zero" grade compressed air nebulization gas flowing at 0.8 L/min. A curtain of nitrogen gas (99.999%) was used to keep atmospheric gases out of the analyzer at a flow rate of 1.2 L/min. Normal and precursor-ion mass spectra were acquired by scanning Q1. Product-ion spectra were taken by colliding the Q1-selected precursor ion of interest with argon gas in Q2 and scanning Q3 to analyze the fragment-ion products; precursor-ion scans were performed the same way except that Q1 was the scanning and Q3 the selecting quadrupole. Neutral-loss spectra required scanning Q1 and Q3 synchronously but offset one from the other by the m/z of interest. For all tandem MS experiments the 99.999% argon collision target gas thickness was (5-7) × 10¹⁴ molecules/cm², a figure computer-derived from a capacitance manometer reading of the gas inlet pressure and a formula for the downstream density of a free jet.21 Precise collision energy measurements were not made, although these can be approximated as being somewhat greater than the ion charge state multiplied by the voltage potential (20-40 V) between the Q0 and Q2 quadrupoles.
(21) French, J. B. Can. Aeronaut. Space Inst. Trans. 1977, 3, 77.
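The three tandem MS scan modes described above differ only in which quadrupole is fixed and which is scanned. A minimal sketch of that logic follows, treating the instrument as a filter over a list of (precursor m/z, product m/z) transitions. The function names, window widths, and transition values are hypothetical illustrations, not instrument software or measured data, and the neutral-loss function assumes the precursor and product carry the same charge.

```python
def product_ion_scan(transitions, precursor_mz, window=1.0):
    """Q1 fixed on one precursor, Q3 scanned: list that precursor's fragments."""
    return sorted(prod for prec, prod in transitions
                  if abs(prec - precursor_mz) <= window / 2)


def precursor_ion_scan(transitions, product_mz, window=2.5):
    """Q3 fixed on one fragment, Q1 scanned: list the precursors that yield it."""
    return sorted({prec for prec, prod in transitions
                   if abs(prod - product_mz) <= window / 2})


def neutral_loss_scan(transitions, offset, tol=0.5):
    """Q1 and Q3 scanned together, with Q3 lagging Q1 by a fixed m/z offset."""
    return sorted({prec for prec, prod in transitions
                   if abs((prec - prod) - offset) <= tol})


# Hypothetical transitions: two glycopeptide-like ions and one plain peptide.
transitions = [
    (878.3, 204.1), (878.3, 366.1), (878.3, 797.3),  # yields HexNAc+; loses Hex/2
    (650.4, 204.1), (650.4, 569.4),                  # yields HexNAc+; loses Hex/2
    (512.3, 399.2), (512.3, 175.1),                  # no sugar-related fragments
]

print(product_ion_scan(transitions, 878.3))
print(precursor_ion_scan(transitions, 204.0))
print(neutral_loss_scan(transitions, 81.0))
```

In this toy data set the precursor-ion scan for m/z 204 and the neutral-loss scan both return only the two glycopeptide-like precursors, which is the selectivity the paper sets out to evaluate.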
Normal scan ESMS and LC/ESMS spectra were recorded at instrument conditions sufficient to resolve the isotopes of the PPG/NH4+ doubly-charged ion at m/z 520 (85% valley definition). Although the observation of 0.5 m/z isotope spacing is a reliable indication of a doubly-charged ion under normal conditions, such distinctions could not always be made with confidence for fast-scanning LC/ESMS where fast scan rates produce relatively small ion-current samples and nonideal spectral peak shape. For precursor-ion tandem MS, the analyzing quadrupole (Q1) was operated at a resolution of about unit m/z (using a 35% peak valley definition at low m/z and a 60% peak valley definition at high m/z), while the mass-selecting quadrupole Q3 was set to pass a 2-3-Da window around the ion of interest so as to enhance sensitivity. Orifice potential (OR), quadrupole rod offsets, and collision gas pressure were adjusted using the Man4GlcNAc model. Although the leeway for optimization is broad, we carefully adjust the instrument parameters for acquisition of LC/ESMS/MS spectra with low levels of analyte ( 6 0 pmol). We have found that the optimal orifice potential is 55 V, although other values were used in earlier experiments (see figure legends). All tandem MS spectra have been smoothed once.
Collisional-excitation scanning (see Results and Discussion) can be implemented on the Sciex instrument in two ways: linear ramping of the orifice voltage with m/z is provided for in the software package; it is also possible to use multiple-ion monitoring with two very wide ion windows to maintain the sampling orifice potential high (140 V) during the low m/z scan window (150-500) and maintain it at a low (65 V) voltage in the second ion window, which encompasses the scan from m/z 500 to 2100 (see Figure 7). It is the latter approach that we now prefer, and we sometimes refer to this approach as a stepped orifice-voltage scan.
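The two implementations of collisional-excitation scanning described above can be sketched as simple functions of m/z, together with the selected ion chromatogram that is later extracted from such data. This is an illustrative sketch only: the function names are hypothetical, the assignment of m/z 500 itself to the low-voltage window is an assumption, the end-point voltages of the linear ramp are assumed (the text gives voltages only for the stepped scan), and the scan data are invented.

```python
def stepped_orifice_voltage(mz, split_mz=500.0, high_v=140.0, low_v=65.0):
    """Stepped scan: high orifice potential below split_mz, low at or above it."""
    return high_v if mz < split_mz else low_v


def ramped_orifice_voltage(mz, mz_start=150.0, mz_end=2100.0,
                           v_start=140.0, v_end=65.0):
    """Linear ramp of orifice potential with m/z (end-point voltages assumed)."""
    fraction = (mz - mz_start) / (mz_end - mz_start)
    return v_start + fraction * (v_end - v_start)


def selected_ion_chromatogram(scans, target_mz=204.0, tol=0.5):
    """Sum intensity near target_mz in each scan.

    scans is a list of (time_min, [(mz, intensity), ...]) tuples.
    """
    return [(t, sum(i for mz, i in peaks if abs(mz - target_mz) <= tol))
            for t, peaks in scans]


# Invented scans: only the scan at 22.5 min shows a strong HexNAc+ signal.
scans = [
    (10.0, [(204.1, 50.0), (600.3, 900.0)]),
    (10.1, [(600.3, 850.0)]),
    (22.5, [(204.0, 400.0), (366.2, 120.0), (1100.5, 300.0)]),
]
sic = selected_ion_chromatogram(scans)
print(sic)
```

The stepped form keeps fragmentation conditions constant across the whole low m/z window where the oxonium ions appear, while leaving the higher m/z region at a gentler potential so that intact multiply charged ions survive.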
RESULTS AND DISCUSSION
The carbohydrate nomenclature used in this paper employs the abbreviations Hex (hexose), HexNAc (N-acetylhexosamine), and NeuAc (N-acetylneuraminic acid) to refer to the carbohydrate residues without specifying which of the isobaric and mass spectrometrically equivalent isomers are actually present (e.g., mannose, glucose, etc. are a subset of hexose). In the mass spectra, loss of neutral carbohydrate moieties or pieces thereof from the charged parent ion is designated with a "-". For example, -Hex (-C6H10O5, 162 Da) indicates a loss of neutral dehydrohexose. Carbohydrate fragment ions with structures formally equivalent to oxonium ions are indicated with a "+" sign following the structure; for example, Hex+ (C6H11O5+, m/z 163) implies the oxonium ion of hexose. It is important to note that -Hex and Hex+ (and the other corresponding carbohydrate fragment pairs) differ by one hydrogen (1 Da) from one another; this nomenclature simplifies annotation of the mass spectra. Designations such as (HexNAc-Asn + H)+ and (Hex2HexNAc + H)+ indicate structures corresponding to smaller, intact molecules derived from the charged parent molecule by fragmentation and hydrogen rearrangement.
Because glycopeptides contain hexose or N-acetylhexosamine isomers which fragment in a predictable manner upon collisional excitation, it seemed reasonable that we would be able to recognize glycopeptides in a complex mixture by their characteristic collision-induced dissociation (CID) fragment ions. Moreover, such diagnostic fragment ions would allow selective identification of glycopeptide-containing chromatographic fractions during LC/MS analyses of enzymatic digests of glycoproteins. To test these hypotheses, we have employed pneumatically-assisted electrospray ionization on a triple-quadrupole mass analyzer in the configuration illustrated in Figure 2. Pneumatically-assisted electrospray has been shown previously to have advantages over continuous-flow fast atom bombardment for sensitive detection of glycopeptides.11 Exploration of various normal and tandem MS scan modes (product-ion, precursor-ion, and neutral-loss) was initially performed on a small oligosaccharide model, Man4GlcNAc (Mr = 869.3), and later on Asn-linked and Ser-linked oligosaccharides (Figure 1). The peptide-linked oligosaccharides were better glycopeptide models for our purposes in that formation of HexNAc+ oxonium ion fragments required two-bond cleavages, as in true glycopeptides.
Figure 3. Product-ion MS/MS of m/z 878.0 ((M + 2H)2+) for 500 pmol of Asn-linked oligosaccharide 1a. An asterisk (*) indicates fragments formed by two-bond cleavages. Sum of 7 scans (m/z 10-1800, step 1.0, 16.8 s/scan); orifice voltage = 55 V.
Product-Ion Scanning for Model Glycopeptides. The structurally-informative fragment ions shown in Figure 3 resulted from tandem MS selection of the (M + 2H)2+ of the Asn-linked oligosaccharide (Figure 1a) for CID product-ion scanning. The results illustrate that low-energy (typically