Article pubs.acs.org/ac
N- and O‑Glycosylation Analysis of Etanercept Using Liquid Chromatography and Quadrupole Time-of-Flight Mass Spectrometry Equipped with Electron-Transfer Dissociation Functionality
Stephane Houel,† Mark Hilliard,‡ Ying Qing Yu,† Niaobh McLoughlin,‡ Silvia Millan Martin,‡ Pauline M. Rudd,‡ Jonathan P. Williams,§ and Weibin Chen*,†
† Late Stage Development, Pharmaceutical Life Sciences, Waters Corporation, Milford, Massachusetts 01757, United States
‡ National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, County Dublin, Ireland
§ Waters Corporation, Atlas Park, Wythenshawe, Manchester M22 5PP, U.K.
Supporting Information
ABSTRACT: Etanercept is a highly glycosylated therapeutic Fc-fusion protein that contains multiple N- and O-glycosylation sites. An in-depth characterization of the glycosylation of etanercept was carried out using liquid chromatography/mass spectrometry (LC/MS) methods in a systematic approach in which we analyzed the N- and O-linked glycans and located the occupied O-glycosylation sites. Etanercept was first treated with peptide N-glycosidase F to release the N-glycans. The N-glycan pool was labeled with a 2-aminobenzamide (2-AB) fluorescence tag and separated using ultraperformance liquid chromatography−hydrophilic interaction liquid chromatography (UPLC-HILIC). Preliminary structures were assigned using Glycobase. These assignments, which included monosaccharide sequence and linkage information, were confirmed by exoglycosidase array digestions of aliquots of the N-glycan pool. The removal of the N-glycans from etanercept facilitated the selective characterization of O-glycopeptides and enabled the O-glycans to be identified. These were predominantly of the core 1 subtype (HexHexNAc O-structure) attached to Ser/Thr residues. α2→3,6,8,9 sialidase was used to remove the sialic acid residues on the O-glycans, allowing the use of an automated LC/MSE protocol to identify the O-glycopeptides. Electron-transfer dissociation (ETD) was then used to pinpoint the 12 occupied O-glycosylation sites. The determination of N- and O-glycans and O-glycosylation sites in etanercept provides a basis for future studies addressing the biological importance of specific protein glycosylations in the production of safe and efficacious biotherapeutics.
Recombinant therapeutic glycoproteins are typically manufactured in mammalian cell expression systems (including CHO, murine myeloma, and HEK cell lines) because their protein glycosylation machinery broadly resembles that of humans.1,2 The glycosylation of therapeutic proteins is affected by the culture medium, the efficiency of protein expression, and the physiological status of the host cells. Thus, glycosylation patterns of therapeutic proteins produced in mammalian expression systems are generally heterogeneous and can vary from batch to batch. In addition, the glycoform populations can be very different between recombinant products and native proteins derived from human sources.3,4 From both a regulatory and a manufacturing point of view, it is essential to demonstrate that glycosylation is consistent, showing that there is proper control over the production process, and to establish acceptable variation limits for biotherapeutic production. Glycosylation plays a vital role in stability, in vivo activity, solubility, serum half-life, and immunogenicity of many recombinant therapeutic proteins.5 The roles of glycosylation in pharmacokinetics, bioavailability, clearance, and potency are typically established by preclinical and clinical studies from which the acceptable variability is also defined.6 Analysis of the N- and O-glycans attached to therapeutic proteins, on the other hand, provides crucial information on the types of glycan structures that exist among the protein population,7 helps to define the heterogeneity of the sample,8 and establishes methods to monitor the variability of glycan structures among different production methods.9 Information acquired from the analysis of protein glycosylation also helps to establish structure−function relationships for therapeutic proteins10 and is relevant to patent applications, because glycosylated protein therapeutics with possibly different biological activities need to be assessed, and a full molecular description and the production of different glycoforms may be covered by patent. As our understanding of the importance of the role of N- and O-glycans in therapeutic molecules increases, there is a pressing need to develop new
© 2013 American Chemical Society
Received: August 20, 2013 Accepted: December 5, 2013 Published: December 5, 2013
dx.doi.org/10.1021/ac402726h | Anal. Chem. 2014, 86, 576−584
Analytical Chemistry
Article
analytical strategies for better understanding of their diverse structures. The analysis of protein glycosylation has long been recognized as a challenging task.11,12 In comparison to other types of biomolecules, such as proteins and peptides, glycans possess many additional structural features such as variable sequence, linkage, branching, and anomericity of the constituent monosaccharides. These attributes complicate the structural elucidation of the glycoprotein. As a consequence, orthogonal technologies are required to fully determine the glycan structures.11−13 Liquid chromatography/mass spectrometry (LC/MS) has become an invaluable combination of technologies because of its inherent robustness, sensitivity, and specificity and the quantitative data it provides. The combination of LC for the separation and MS(/MS) for the detection and further detailed structural analysis of glycans and glycopeptides provides in-depth information about protein glycosylation. The recently introduced electron-transfer dissociation (ETD) technique14 has opened up new possibilities for the structural characterization of glycoproteins at the glycopeptide level. ETD is a nonergodic fragmentation process capable of retaining the labile glycosidic linkage, which allows facile deduction and verification of the glycan mass and identification of the amino acid residue at the glycosylation site.15 This information is highly valuable for O-linked glycosylation analysis, which is more challenging because, in contrast to N-linked glycans, O-glycans do not have a single consensus sequence for a potential glycosylation site, do not have a common core structure, are more heterogeneous, and are often clustered together at high density, particularly in Ser/Pro/Thr-rich domains.
The application of liquid chromatography coupled to mass spectrometry (LC/MS) to in-depth analysis of glycosylation of therapeutic proteins and peptides has grown in recent years.16,17 Etanercept is an Fc-fusion protein that binds TNF-α, and it was first approved by the FDA in 1998 for treating rheumatoid arthritis. In 2012, annual sales of etanercept (trade name Enbrel) reached almost $8.0 billion, ranking no. 3 among the best-selling pharmaceuticals.18 The therapeutic significance and high profitability of this biotherapeutic make it a popular target for biosimilar development. Production of etanercept in mammalian cell lines results in a high level of glycosylation, which can be the source of both macro- and microheterogeneity. Three major N-linked sites and 13 O-linked sites have been previously reported,19 although no detailed characterization of the glycosylation has been provided. Although N-glycosylation at the Asn-Xaa-Ser/Thr consensus sequence motifs can be identified at N149, N171, and N317, the site-specific N-glycosylation heterogeneity at each individual site is as yet unknown. In addition, reports of the 13 O-linked glycosylation sites (and site heterogeneity) of the molecule are generally ambiguous. For significant conclusions to be drawn on the processing and biological function of etanercept glycosylation, a complete characterization of glycosylation is required. In the current study, we developed an integrated analytical approach to perform in-depth characterization of both N- and O-linked glycosylation of etanercept. The approach utilizes liquid chromatography coupled to fluorescence detection (LC-FLR) and LC/MS as the main analytical tools, combined with a judicious sample preparation strategy, to elucidate the structures of N-linked glycans, determine the site heterogeneity of N-linked glycosylation in the Fc portion of the protein, and to verify the structures of the O-linked glycans. In addition, with the employment of an ETD fragmentation technique, all of the reported 13 O-glycosylation sites have been determined. The results represent the first detailed report on the characterization of the highly glycosylated etanercept and demonstrate the utility of the methodology for the glycosylation analysis of other challenging biological samples.
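The contrast drawn above between N-glycosylation (an Asn-Xaa-Ser/Thr consensus motif) and O-glycosylation (no consensus, so every Ser/Thr is a candidate) can be illustrated with a minimal sketch. It scans the T13 tryptic peptide sequence listed in Table 1. The function names are hypothetical, and excluding Pro at the Xaa position is an assumption of this sketch rather than something stated in the article. Positions are counted within the peptide, not in the protein numbering (N149, N171, N317) used in the text.

```python
import re

# T13 tryptic peptide sequence as listed in Table 1 of this article.
T13 = "CRPGFGVARPGTETSDVVCKPCAPGTFSNTTSSTDICRPHQICNVVAIPGNASMDAVCTSTSPTR"

# Asn-Xaa-Ser/Thr motif. Excluding Pro at Xaa is an assumption of this sketch
# (a common convention), not something stated in the article.
# The lookahead lets overlapping motifs be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return (1-based position in the peptide, motif) for every N-X-S/T match."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]


def count_o_candidates(seq):
    """Count Ser and Thr residues, i.e. the candidate O-glycosylation sites."""
    return {"S": seq.count("S"), "T": seq.count("T")}


if __name__ == "__main__":
    print(find_sequons(T13))        # two N-X-S/T motifs
    print(count_o_candidates(T13))  # many more candidate O-sites than N-sites
```

For this peptide the scan finds two N-linked motifs but sixteen Ser/Thr residues, which is why sequence alone cannot localize an O-glycan and fragmentation data are needed.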
■ MATERIALS AND METHODS
A brief description of materials and methods is provided in this section. For a more complete version of the experimental procedures, please refer to the Supporting Information.
Sample Preparations. N-Glycan Release, Fluorescent Labeling, and Exoglycosidase Digestion. Methods for releasing N-glycans, labeling them with 2-aminobenzamide (2-AB), and analyzing the N-linked glycan structures using exoglycosidase arrays (see the Supporting Information for the specificity of the exoglycosidases used here) and retention time alignment are detailed in the literature.21
O-Linked Glycan Release. The etanercept sample was resuspended in 28% NH3·H2O saturated with (NH4)2CO3 and incubated at 60 °C for 16 h, and salt was removed with porous graphitic carbon (PGC) (Thermo Scientific Hypersep HyperCarb cartridges). All samples were fluorescently derivatized via reductive amination with 2-AB and sodium cyanoborohydride in 30% (v/v) acetic acid in dimethyl sulfoxide (DMSO) at 65 °C for 2 h. Excess fluorophore was removed with filter paper. Exoglycosidase digestion arrays were performed following the published procedure,21 and all samples were run on an ultraperformance liquid chromatograph (UPLC) with a fluorescence detector.
Desialylation and Tryptic Digestion of Etanercept. Ten microliters of etanercept protein stock solution (50 μg/μL) was mixed with 325 μL of 8 M urea and 125 μL of 1 M Tris−HCl (pH 7.6), and the protein was subsequently reduced (with DTT) and alkylated (with iodoacetamide). The protein was buffer exchanged using a NAP-5 column (GE Healthcare Life Sciences) to 0.75 mL of 0.1 M Tris−HCl (pH 7.6). Two hundred microliters of the protein solution was incubated with PNGase F and sialidase for 5 h at 37 °C. This step was performed to remove all N-linked glycans and the sialic acid residues on O-linked glycans. The deglycosylated protein was incubated with 10 μg of trypsin overnight at 37 °C to generate tryptic peptides.
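The quantities in the digestion procedure above imply an approximate protein load and enzyme-to-substrate ratio. The following is a rough worked calculation only: it assumes complete protein recovery through reduction, alkylation, and NAP-5 buffer exchange, which the text does not state, and the function name is hypothetical.

```python
def digest_amounts(stock_ug_per_ul=50.0, stock_ul=10.0,
                   exchanged_volume_ul=750.0, aliquot_ul=200.0,
                   trypsin_ug=10.0):
    """Estimate protein amounts from the volumes quoted in the text.

    Assumes 100% recovery after buffer exchange (an assumption of this
    sketch; real recovery will be lower).
    """
    total_protein_ug = stock_ug_per_ul * stock_ul            # protein taken from stock
    conc_ug_per_ul = total_protein_ug / exchanged_volume_ul  # after exchange into 0.75 mL
    aliquot_protein_ug = conc_ug_per_ul * aliquot_ul         # protein in the 200 uL aliquot
    substrate_per_enzyme = aliquot_protein_ug / trypsin_ug   # mass ratio, protein : trypsin
    return total_protein_ug, aliquot_protein_ug, substrate_per_enzyme


if __name__ == "__main__":
    total, aliquot, ratio = digest_amounts()
    print(f"total protein: {total:.0f} ug")
    print(f"protein in aliquot: {aliquot:.1f} ug")
    print(f"protein:trypsin mass ratio: about {ratio:.1f}:1")
```

Under the full-recovery assumption, the 200 μL aliquot carries roughly 130 μg of protein, so 10 μg of trypsin corresponds to a mass ratio of about 13:1.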
Preparation of the Fc Subunit of Etanercept. The Fc fragment of the etanercept fusion protein was prepared using the enzyme IdeS (FabRICATOR, Genovis AB, Sweden), following the digestion protocol recommended by the manufacturer.
Liquid Chromatography and Mass Spectrometry. Hydrophilic Interaction Liquid Chromatography (HILIC)-UPLC-FLR. Analysis of the labeled N-glycans was performed on a Waters Acquity UPLC BEH Glycan column (2.1 mm × 150 mm, 1.7 μm particle).22 Instrumentation used included an Acquity UPLC with a fluorescence detector (Waters Corporation, Milford, MA, U.S.A.) under the control of Empower chromatography workstation software.
Experimental Conditions for Peptide Mapping (LC-MSE Methods). A multiplexed data acquisition method (MSE)23 was employed for the mass spectrometric analysis of the tryptic digest of etanercept. The LC/MSE data was acquired on a quadrupole time-of-flight mass spectrometer (Synapt G2-S HDMS, Waters Corporation) equipped with ETD functionality. An Acquity UPLC I-Class system (Waters Corporation) was coupled to the mass spectrometer as an inlet system. Tryptic peptides produced from the digestion of etanercept were separated on an Acquity BEH UPLC C18 column (2.1 mm × 150 mm, 1.7 μm) using a gradient from 3 to 35% acetonitrile over 90 min at a flow rate of 0.2 mL/min and a column temperature of 65 °C.
ETD Setup and Experimental Methods. ETD experiments were performed on the same Synapt G2-S mass spectrometer used for LC/MSE experiments. The ETD functionality of the instrument has been described in detail elsewhere,24,25 and a brief description is provided in the Supporting Information.
Data Analysis. The LC/MSE data was processed using BiopharmaLynx 1.3.3 (Waters Corporation) for sequence confirmation and glycopeptide identification. In order to identify O-glycopeptides in the etanercept digest, a modifier that represents the core 1 O-glycan subtype (galactose[β1−3]GalNAc, C14H23O10N, +365.13 Da) was included as a variable modification in the BiopharmaLynx data search. Up to eight core 1 structures on serine and/or threonine residues on a peptide were permitted during the search. The mass tolerance was set at 10 ppm for precursor ions and 20 ppm for fragment ions. The identified peptides were confirmed by MSE spectra with at least five b/y fragment ions (on average) from triplicate analysis. Data interpretation of ETD mass spectra of glycopeptides was done using tools inside MassLynx 4.1 (Waters Corporation, Milford, MA). MaxEnt 3 software was used to deconvolute the raw data into singly charged monoisotopic spectra to assist the interpretation of the protonated fragment ions. BioLynx was used to verify the manually assigned sites of glycosylation by in silico fragmentation of the proposed O-glycopeptide.
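The +365.13 Da modifier and the ppm tolerances quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the BiopharmaLynx implementation: the function names are hypothetical, and the monoisotopic element masses are standard reference values entered by hand.

```python
# Monoisotopic masses (Da) of the most abundant isotopes; standard reference values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Residue composition of the core 1 tag as given in the text: C14H23O10N.
CORE1 = {"C": 14, "H": 23, "O": 10, "N": 1}


def monoisotopic_mass(composition):
    """Sum element masses weighted by their counts in the composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def within_tolerance(measured, theoretical, tol_ppm=10.0):
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(measured, theoretical)) <= tol_ppm


if __name__ == "__main__":
    print(f"core 1 tag: {monoisotopic_mass(CORE1):.4f} Da")
    # Illustrative values only: a 5 ppm deviation passes a 10 ppm precursor window.
    print(within_tolerance(1000.005, 1000.0, tol_ppm=10.0))
```

The computed value rounds to the 365.13 Da quoted in the text; the default tolerance of 10 ppm mirrors the precursor setting, and 20 ppm can be passed for fragment ions.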
■ RESULTS AND DISCUSSION
Etanercept is a genetically engineered dimeric fusion protein, and each monomer contains the TNF-α receptor linked to the Fc portion of human IgG1 (CH2 and CH3 domains) by an O-glycosylated peptide (Scheme 1). The TNF-α receptor portion contains four domains, and more than a dozen O-glycosylation sites are reported to be located in the linker domain,19 which has a high frequency of serine, threonine, and proline residues. In addition, two N-linked glycosylation sites are also included in the TNF-α receptor portion.20 These structural features pose a significant challenge for glycosylation characterization and call for a systematic approach to analyze the glycosylation status. Supporting Information Scheme S-I displays the overall strategy undertaken in the current study for analyzing the glycosylation of etanercept.
Analysis of N-Linked Glycosylation of Etanercept Using Exoglycosidase Array and HILIC-LC Separation with Fluorescence Detection. Analysis of the Total Released N-Linked Glycans Using Weak Anion-Exchange (WAX) Chromatography. WAX of N-glycans is a charge-dependent separation based on the number of charged residues, such as sialic acid, attached to the glycan. WAX was used to fractionate charged N-glycans to reduce sample complexity, thereby aiding their subsequent characterization. Fetuin (bovine) was used as a positive control as it contains mono- (S1), di- (S2), tri- (S3), and tetra-sialylated (S4) N-glycans. Charge-based separation of N-glycans released from etanercept revealed that neutral structures make up the majority of the N-glycans (56%), followed by monosialylated (36%) and disialylated (8%) structures (Figure 1). Each of these WAX fractions was subsequently analyzed by UPLC-HILIC-FLR (see Supporting Information Figure S-1), and exoglycosidase digests were carried out to fully characterize individual N-glycan structures present in each fraction (i.e., confirm all carbohydrate structures present, including monosaccharide sequence and linkage information). The identification of each glycan was based on retention time matches expressed as glucose unit (GU) values.26 The GU values for N-glycans were assigned using a retention time calibration curve from the 2-AB-labeled dextran ladder chromatogram. Each exoglycosidase digest gives specific cleavage information that is associated with a retention time shift due to cleavage of the terminal monosaccharide. Figure 2 shows an example for the monosialylated fraction, in which the major monosialylated N-glycans were biantennary glycans with or without core fucose. Furthermore, digestion with the sialidases ABS and NAN1 confirmed that all the sialylated structures were α2−3 linked, characteristic of CHO cell glycosylation.
Analysis of Total N-Linked Released Glycans by UPLC-HILIC-FLR and Exoglycosidase Digestion Array. A full panel of exoglycosidase digestions was performed on the total glycan pool to determine and assign the N-linked glycan structures within etanercept (Figure 3). The undigested profile of
Scheme 1. Schematic Illustration of the Etanercept Glycosylation Structure
Figure 1. Weak anion-exchange (WAX) fractionation and analysis of sialylated N-linked glycans from etanercept: (i) fetuin N-linked glycans used as a positive control to identify sialylated species; (ii) total released N-linked glycans from etanercept; (iii) ABS exoglycosidase digest, suggesting that all charged N-glycans present on etanercept are sialylated [mono- (S1) and disialylated (S2)]. Percentage areas of each of the glycan species are presented in the embedded table.
Figure 2. UPLC-HILIC-FLR analysis coupled with an exoglycosidase array to determine the N-glycans present in each anionic fraction from WAX analysis (monosialylated fraction, see Figure 1): (i) monosialylated (S1) N-glycan pool from WAX fractionation; (ii) NAN1 (recombinant sialidase) releases α2−3 sialic acids; (iii) ABS (Arthrobacter ureafaciens sialidase) releases α(2→3,6,8) sialic acids; (iv) BKF (fucosidase from bovine kidney) releases α1−2,1−6 fucose; (v) BTG (bovine testes β-galactosidase) releases β1−3 and β1−4 linked galactose; (vi) GUH (hexosaminidase) releases β GlcNAc but not GlcNAc linked to β1−4 Man. Please refer to Supporting Information Scheme S-II for symbol notation.
Figure 3. UPLC analysis of total released N-linked glycans and exoglycosidase arrays by HILIC-FLR from etanercept: (i) whole N-glycan pool released by PNGase F; (ii) NAN1 releases α2−3 sialic acids; (iii) ABS releases α(2→3,6,8) sialic acids; (iv) BKF releases α1−2 or α1−6 fucose; (v) BTG releases β1−3 and β1−4 linked galactose; (vi) SPG releases β1−4 linked galactose residues; (vii) GUH releases β GlcNAc but not GlcNAc linked to β1−4 Man; (viii) JBM releases α1−2/α1−6 and α1−3 linked mannose residues.
etanercept contains 12 major peaks. The most abundant peaks were F(6)A2, biantennary, core-fucosylated N-glycans with no terminal galactose (21.82%), followed by the monosialylated N-glycans F(6)A2G2S(3)1, biantennary, core-fucosylated with two terminal galactose residues and one sialic acid (20.36%), and A2G2S(3)1, biantennary N-glycans with two terminal galactose residues and one sialic acid (12.51%), the disialylated N-glycans F(6)A2G2S(3,3)2, biantennary, core-fucosylated with two terminal galactose residues and two sialic acids (6.98%), and the unsialylated F(6)A2G1, biantennary, core-fucosylated with one terminal galactose residue (9.38%) and F(6)A2G2, biantennary, core-fucosylated with two terminal galactose residues (7.78%). ABS and NAN1 digestions confirmed that all the sialylated N-glycans contained α2−3 linked sialic acids. Additional digestion with BKF revealed that the majority of N-linked glycans from etanercept contain a core fucose residue. The number and linkage of galactose residues were determined by digestion with two enzymes, BTG and SPG, which confirmed that all galactose residues were β1−4 linked. Digestion with GUH confirmed that the core structure for the biantennary N-glycans was A2. To confirm the presence of high-mannose N-glycans, an additional digest with JBM was performed. Etanercept contained a number of high-mannose N-glycans, of which M5 (3.29%) was the most abundant, followed by M6 (0.53%), M4 (0.26%), and M7 (0.05%). See the Supporting Information for a complete table of all N-linked glycans (Supporting Information Table S-1).
Location of Tri- and Tetra-Antennary N-Glycans on Etanercept and Site-Specific N-Glycosylation of the Fc Subunit. The intact etanercept molecule was digested by the IdeS enzyme to generate two components (subunits), the TNF-α receptor and Fc (see Scheme 1), for site-specific N-glycosylation analysis. The TNF-α receptor contains two N-glycosylation sites, whereas Fc has only one. These components were separated on a C4 column and collected for PNGase F treatment. Released N-glycans from each component were analyzed by HILIC-FLR methods (Figure 4). The TNF-α receptor component contained the larger tri- and tetra-antennary glycan structures. These glycan structures were not present on the Fc fragment, which predominantly contained small biantennary neutral N-glycans (Figure 4). Detailed information on the structures of all the N-glycans on the Fc subunit can be found in the Supporting Information (Figure S-2, mass analysis of Fc subunit).
Figure 4. (A) UPLC-UV separation of etanercept subunits (TNF-α receptor subunit and Fc portion) from FabRICATOR enzyme digestion using a BEH C4 column. (B) UPLC-FLR analysis of released N-linked glycans from the TNF-α receptor region. Tri- and tetra-antennary N-glycans were found only in this region and not in the Fc subunit. (C) UPLC-FLR analysis of released N-linked glycans from the Fc region. (D) Overlay of the chromatograms of the released N-linked glycans from Fc and TNF-α receptor regions. The profile of the combined FLR chromatogram looks identical to that of the N-linked glycan profile from the unfractionated sample.
Analysis of O-Linked Glycosylation of Etanercept Using LC/MSE and ETD. O-Glycans were initially released from the protein by reductive β-elimination. Analysis of released O-glycans provided information on the type and structure of the attached O-glycans but not on the occupied sites. To determine the O-linked glycosylation sites, tryptic peptides from the sialidase-treated sample were analyzed by LC/MSE (no precursor ion selection required) and ETD (targeted fragmentation).
Determination of O-Linked Glycan Structures of Etanercept. O-Linked glycans from etanercept were released using ammonia-based β-elimination and analyzed by HILIC-FLR. Released O-glycans were subsequently treated by exoglycosidase array digestions to confirm the O-glycan structures. Monosialylated (49.0%), disialylated (10.1%), and neutral (7.3%) core 1 O-glycans were identified. The process of the ammonia-based β-elimination release and labeling of O-glycans generates a peeled product, which is not a complete O-glycan but an artifact of the process and comprises 33.6% of the glycan structures (Supporting Information Figure S-3). The presence of the major structures was confirmed by the results from exoglycosidase digestions. These results suggest that the majority of etanercept O-glycans are of the core 1 type and are capped with α2−3 linked sialic acid. In addition to the α2−3 linkage, sialic acid is also connected to GalNAc through α2−6 bonds, as suggested by the presence of the disialyl form of core 1 O-glycans. This finding agrees well with the previous report27 that proteins expressed in a CHO cell line contained only core 1 type O-linked glycans.
Identification of O-Linked Glycopeptides from Tryptic Digests of Etanercept. The analysis of released O-glycans revealed that only core 1 type O-glycans with terminal sialic acid residues are present on etanercept. The presence of sialylated O-glycan structures on etanercept also suggested that the sialic acid residues could be enzymatically trimmed using α(2→3,6,8,9) sialidase (Arthrobacter ureafaciens sialidase), leaving behind a neutral HexHexNAc (core 1) tag attached to the peptide backbone, as this enzyme cleaves all nonreducing unbranched N-acetylneuraminic acids (sialic acids).28,29 Armed with this knowledge, we designed an effective sample preparation strategy to reduce the heterogeneity of the region by removing the sialic acid residues on the O-glycans. This greatly reduced the complexity and size of these O-glycopeptides prior to LC/MSE analysis. The simplification also allowed the use of the most common proteases, such as trypsin or Lys-C, to generate peptides for detailed analysis of glycosylation sites, with each peptide potentially containing a number of O-glycosylation sites. The removal of the sialic acid residues also improves the efficiency of reversed-phase LC separation to resolve the different forms of glycopeptides. The identification of glycopeptides was facilitated by using a bioinformatics tool, BiopharmaLynx, in which a tag with the core 1 structure was added as a variable modification; all in silico digested peptides from etanercept were searched against this variable modification. Table 1 lists all the core-1-containing glycopeptides that were identified during the LC/MSE analysis. The variation in the retention times for all of the identified O-glycopeptides was within ±0.1 min in the triplicate analysis. The average number of b or y fragment ions that matched to the glycopeptides ranged from 5 to 57. The number of matching b or y ions is related to the abundance of individual glycopeptide precursors. The measured mass errors on the parent ions were also determined based on the triplicate analyses. The mass error ranged from 0.2 to 5.0 ppm.
A multiplexed data acquisition method (MSE) that utilizes collision-induced dissociation (CID) for glycopeptide fragmentation was applied to provide peptide sequence confirmation. Not surprisingly, many neutral loss peaks that correspond to
Table 1. Identified Tryptic Peptides Containing Core 1 Tag in This Study (a)

O-linked glycan tryptic peptides and their sequences:
T01   LPAQVAFTPYAPEPGSTCR
T13   CRPGFGVARPGTETSDVVCKPCAPGTFSNTTSSTDICRPHQICNVVAIPGNASMDAVCTSTSPTR
T14   SMAPGAVHLPQPVSTR
T15   SQHTQPTPEPSTAPSTSFLLPMGPSPPAEGSTGDEPK
T17   THTCPPCPAPELLGGPSVFLFPPKPK

peptide   core 1 (b)   MH+       RT (min)   b/y found   error (ppm)
T01       0            2062.01   48.3       34          5.0
T01       1            2427.14   45.1       24          0.4
T13       1*           7376.34   50.6       14          0.5
T13       1**          7377.33   51.1       57          0.8
T13       1**          7377.33   51.8       28          0.7
T14       3            2743.26   34.4       6           0.5
T14       2            2378.14   36.3       35          3.5
T14       1            2013.00   37.9       26          0.4
T14       1            2013.00   39.4       30          0.2
T15       7            6316.69   44.6       20          0.6
T15       6            5951.56   45.4       5           0.6
T15       6            5951.56   46.5       18          0.7
T15       5            5586.43   49.9       8           1.1
T15       2            4491.03   53.7       5           0.4
T17       1            3209.59   66.7       39          0.6
T17       0            2844.46   67.7       41          3.6

(a) The precursor mass (MH+, monoisotopic mass) comes from a deconvolution algorithm in BiopharmaLynx, and the labeled retention time (RT, in minutes) corresponds to the apex of the extracted ion chromatogram peak for each precursor. The number of b/y ions and the measured mass errors are average values from triplicate analysis.
(b) The number of asterisks (*) is the number of converted Asp residues in the peptide.
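The internal consistency of Table 1 can be checked with a short script: within each peptide, the MH+ values should differ by multiples of the 365.13 Da core 1 tag, and the largest tag count per peptide gives the minimum number of occupied sites. This is a sketch using only values transcribed from the table; the function names are hypothetical. T13 is left out of the mass-ladder check because its glycoforms differ by Asn-to-Asp conversion rather than by tag count.

```python
CORE1_DA = 365.13  # core 1 tag mass (Da) as quoted in the Data Analysis section

# (number of core 1 tags, MH+) per peptide, transcribed from Table 1.
TABLE1 = {
    "T01": [(0, 2062.01), (1, 2427.14)],
    "T13": [(1, 7376.34), (1, 7377.33), (1, 7377.33)],
    "T14": [(3, 2743.26), (2, 2378.14), (1, 2013.00), (1, 2013.00)],
    "T15": [(7, 6316.69), (6, 5951.56), (6, 5951.56), (5, 5586.43), (2, 4491.03)],
    "T17": [(1, 3209.59), (0, 2844.46)],
}


def ladder_residuals(rows):
    """Deviation (Da) of each MH+ from the value predicted by tag count.

    The first row of each peptide is used as the reference point.
    """
    ref_tags, ref_mh = rows[0]
    return [round((mh - ref_mh) - (tags - ref_tags) * CORE1_DA, 2)
            for tags, mh in rows]


def total_max_sites(table):
    """Sum, over peptides, of the largest number of core 1 tags observed."""
    return sum(max(tags for tags, _ in rows) for rows in table.values())


if __name__ == "__main__":
    for name in ("T01", "T14", "T15", "T17"):
        print(name, ladder_residuals(TABLE1[name]))
    print("minimum number of occupied sites:", total_max_sites(TABLE1))
```

All residuals stay within about 0.01 Da, consistent with the two-decimal rounding of the tabulated masses.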
the loss of core 1 tags were observed in the high-energy fragmentation spectra (data not shown). Monitoring this neutral loss mass shift is useful for counting the number of O-glycosylation sites on the peptide. However, sufficient peptide backbone fragmentation was also generated in the analysis (see the number of b/y ions found in Table 1). In total, five different tryptic peptides, T1, T13, T14, T15, and T17, contained O-glycosylation sites. These peptides come from the TNF-α receptor C-terminal region (T13, T14, and T15), the TNF-α receptor N-terminus (T1), and the Fc hinge region (T17). Summing the maximum number of O-glycosylation sites found on each peptide gives a total of 13, indicating that at least 13 core 1 sites exist on the entire fusion protein. Thirteen O-glycosylation sites were reportedly occupied in the literature,19,30 although neither the specific glycosylation site(s) nor the site heterogeneity was described. Peptide T15 contains a total of 11 serine and threonine residues. For this peptide, a number of glycoforms with various numbers of occupied sites were identified (see Table 1). The number ranges from two to seven, as indicated by the number of core 1 tags found on the peptides. Interestingly, two T15 glycopeptides that both contain six core 1 tags were detected at different retention times (45.4 and 46.5 min), suggesting that this peptide has two isoforms that have different occupied site(s) (see Supporting Information Figure S-4). Similarly, glycopeptide T14 containing a single core 1 tag also shows two isoforms, which were resolved chromatographically (data not shown). The chromatographic resolution achieved in the current study facilitates the in-depth characterization of the highly heterogeneous glycopeptides, since isoforms of glycopeptides tend to coelute in reversed-phase chromatography and are difficult to differentiate by mass spectrometry.
Following chromatographic separation, glycopeptides with the same number of modification tags can be selectively fragmented (e.g., via targeted ETD) to reveal structural differences. Among all the O-glycopeptides, only peptides T1 and T17 contained both modified and unmodified forms, suggesting that partial occupancy may take place on those glycosylation sites. For peptides T13, T14, and T15, no aglycosylated forms were detected in the current analysis. Three glycoforms of the T13 peptide were identified. These glycopeptides all contained a single core 1 tag. The three glycoforms differ as follows: one contains a single Asp residue produced by Asn-to-Asp conversion during PNGase F treatment (N → D), and the other two each contain two enzymatically converted Asp sites. Inspection of the amino acid sequence of the T13 peptide shows that this peptide contains several serine and threonine residues and two NXT/S consensus sequences for N-linked glycosylation (from the TNF-α receptor domain). Since all of the O-glycopeptides were generated after the N-linked glycans were removed from the protein using PNGase F, the identification of a T13 peptide with just one enzymatic conversion site suggests that the two N-linked glycosylation sites may not be fully occupied. On the other hand, the fact that two glycoforms were found that both contain two enzymatic conversion sites could be explained by the presence of two different O-linked sites on the peptide. The other possibility remains that chemical deamidation may have taken place prior to the PNGase F treatment31 and led to an Asp/isoAsp peptide mixture. It was previously demonstrated that peptides containing Asp/isoAsp can be chromatographically resolved. The combination of MSE data and a targeted informatics tool (BiopharmaLynx) provides an effective means of identifying the number of O-glycosylation sites on the protein. MSE spectra from CID fragmentation generally contain many neutral loss peaks that correspond to the cleavage of the core 1 glycan moiety from the peptide backbone, which results in the loss of galactose and N-acetylgalactosamine residues.
Although the neutral loss data is not useful for O-glycosylation site identification, the neutral loss peaks provided confirmation of the core 1 tags (hence the number of glycosylation sites) associated with the peptide. However, as a result of the removal of sialic acid in the sample preparation step, the information on the site heterogeneity was lost. Since every core 1 modified site could
potentially associate with two O-glycans (with one or two sialic acids), the total number of O-glycosylation sites could shed some light on sample complexity and glycosylation heterogeneity.
Determination of the Occupied O-Glycosylation Sites by ETD. Information acquired during LC/MSE peptide mapping experiments, for example, retention time, masses (m/z), charge states, and ion intensity of identified glycopeptides, facilitates the design of more efficient and targeted ETD workflows for the localization of O-linked glycosylation sites. For example, tryptic peptide T14 with two core 1 tags elutes at 36.3 min in the LC/MSE experiment with three charge states (2+, 3+, and 4+). An ETD method was designed between 35.6 and 36.6 min to target specifically the 4+ charged species of the T14 peptide (m/z 595.4). In addition, the ion intensity ratio between the radical anions and selected peptide precursor ions is a crucial factor for effective ETD fragmentation. When peptides did not undergo extensive fragmentation, a "supercharging" reagent, 3-nitrobenzyl alcohol (m-NBA), was mixed with the LC eluents via postcolumn addition to increase the peptide charge states, thus improving the overall glycopeptide ETD fragmentation efficiency.25 Figure 5 shows a MaxEnt 3 deconvoluted ETD spectrum from the T14 peptide (m/z 595.4, [M + 4H]4+), which contains two
Figure 6. ETD MS/MS analysis of two glycoforms of T15 glycopeptide. (A) Deconvoluted ETD spectrum of T15 glycopeptide with six core 1 tags, retention time 45.4 min. The ETD fragmentation led to identification of all six glycosylation sites as shown by the c or z ion series. (B) Deconvoluted ETD spectrum of T15 glycopeptide with seven core 1 tags, retention time 44.6 min. The ETD spectrum led to the identification of all seven glycosylation sites as is illustrated by the c or z ion series shown in the figure. In comparison with the spectrum in panel A, the spectrum indicates that Thr213 is occupied in this glycopeptides (underline amino acid residue in Table 2).
Figure 5. Identification of the O-linked glycosylation sites in T14 peptide from ETD spectra. The MaxEnt 3 deconvoluted spectrum was displayed, and fragment ion peaks were labeled on the deconvoluted spectrum to show the glycosylation sites.
tags (Figure 6B, eluting at 44.6 min), respectively. Despite the incomplete sequence coverage, both ETD spectra contained enough c and z ion peaks to locate all the glycosylation sites unambiguously. Four threonine residues, at positions 205, 208, 213, 217, and three serine residues, at positions 212, 216 and 226, are associated with seven core 1 tags. For T15 that has six core 1 tags, the glycosylation sites are the same as the one with seven core 1, except that Thr213 (underlined amino acid residue in Table 2) is unoccupied. ETD sequencing was also performed on other Oglycopeptides that represent modified glycoforms of four tryptic peptides (T1, T14, T15, and T17). Twelve Oglycosylation sites (Table 2) were assigned (in red in the sequence). Among all the O-linked glycopeptides identified from LC/MSE analysis, only the location of the glycosylation site on T13 could not be unambiguously determined even though the peptide contains only one O-glycosylation site (two isoforms). The glycosylated T13 peptide is a large peptide (MW 7376 Da, see Table 1) and contains many potential Oglycosylation sites (seven serine residues and nine threonine residues). In addition, there are seven proline residues distributed throughout the peptide backbone, which could produce many sequence coverage map gaps. As a result, no
core 1 O-glycans. A series of highly intense fragment ion peaks are observed, and detailed annotation for the fragment ion peaks is shown above the spectrum. The fragment ion peak at m/z 625.30 (Figure 5) corresponds to the z2 ions of the peptide with a combined mass of the dipeptide TR (265.15 Da) and a core 1 tag (365.13 Da). Similarly, the fragment ion peak at 1077.48 matches the mass of tripeptide STR (347.18 Da) plus two core 1 tags (2 × 365.13 = 730.26 Da). This ETD information allowed the sites of O-glycosylation to be determined as Ser-199 and Thr-200 for the T14 glycopeptides eluting at 36.3 min. Out of 24 combined c or z ions for the peptide, 23 of them were found and assigned (∼95%), from which the sites of the O-glycosylation modification were confidently identified. Peptide T15 contains 11 serine and threonine residues and is highly O-glycosylated as indicated by CID data (Table 1). Figure 6 shows the deconvoluted spectra of T15 with six core 1 tags (Figure 6A, eluting at 45.4 min) and T15 with seven core 1 582
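The arithmetic behind the targeted ETD experiment on T14 can be sketched in a few lines of Python. This is an illustrative calculation under stated assumptions, not the instrument method: the function names are hypothetical, the proton mass is the standard value, and the m/z 595.4 precursor, the 35.6 to 36.6 min window, the 347.18 Da STR fragment, and the 365.13 Da core 1 tag are the values quoted in the text.

```python
PROTON_DA = 1.00728      # mass of a proton
CORE1_TAG_DA = 365.13    # Hex-HexNAc tag mass quoted in the text


def neutral_mass(mz, charge):
    """Neutral mass implied by an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON_DA)


def mz_for_charge(mass, charge):
    """Expected m/z of the [M + zH]z+ ion for a given neutral mass."""
    return mass / charge + PROTON_DA


def tagged_fragment_mass(bare_fragment_mass, n_tags):
    """Fragment mass after adding n core 1 tags to the bare fragment."""
    return bare_fragment_mass + n_tags * CORE1_TAG_DA


def in_target_window(rt_min, start_min=35.6, end_min=36.6):
    """True if a retention time falls inside the targeted ETD window."""
    return start_min <= rt_min <= end_min


# T14 with two core 1 tags: 4+ precursor at m/z 595.4, eluting at 36.3 min.
t14_mass = neutral_mass(595.4, 4)
for z in (2, 3, 4):
    print(f"T14 {z}+ expected near m/z {mz_for_charge(t14_mass, z):.2f}")

# STR fragment carrying two tags, compared with the peak reported at 1077.48.
print(f"STR + 2 tags: {tagged_fragment_mass(347.18, 2):.2f}")
print("36.3 min inside window:", in_target_window(36.3))
```

The same tag arithmetic applied along a c or z ion series shows where the fragment mass jumps by one tag, which is how a modified Ser or Thr residue is localized.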
the framework for successfully defining the sites of attachment for O-glycosylation, such as those in mucins and Fc-fusion proteins. At the same time, these results provide a means to obtain detailed analyses of glycoproteins expressed in CHO cell lines, which comprise the majority of protein therapeutics.1
Table 2. Identified Glycosylation Sites (in Red) from ETD Spectra of Glycopeptides from Etanercept in This Study
■
ASSOCIATED CONTENT
Supporting Information
Additional information as noted in the text, which includes the detailed experimental procedures, schemes, and figures. This material is available free of charge via the Internet at http://pubs.acs.org.
■
AUTHOR INFORMATION
Corresponding Author
*E-mail: [email protected].
ETD spectrum was successfully acquired to pinpoint the single O-glycosylation site confidently. For this challenging glycopeptide, an alternative approach is needed: reducing the peptide size by further cleaving the trypsin-derived T13 peptide with AspN would yield smaller peptides whose ETD spectra are more amenable to interpretation. There are two aspartic acid (D) residues in the T13 tryptic peptide that are available for AspN digestion.
The final piece of information for the complete characterization of O-glycosylation is the heterogeneity of glycan structures at each Ser and Thr glycosylation site. To acquire these data, etanercept was digested with trypsin without PNGase F or sialidase treatment, and the tryptic peptides were analyzed by an LC/MSE method. The data were processed using BiopharmaLynx with up to eight core-1-containing O-glycans as a variable modification for each glycopeptide. Interestingly, only glycopeptides with one or two glycosylation sites were identified. For example, for the T1 peptide, two types of O-glycans, NANA-Hex-HexNAc and (NANA)2-Hex-HexNAc, were detected as coexisting at the Thr8 residue (data not shown). The fact that T15 glycopeptides could be identified when sialic acid residues were removed by sialidase, whereas none were identified from the tryptic digest without sialidase treatment, suggests that multiply glycosylated T15 peptides were not easily recovered/detected. This suggests that the complexity and size of these glycopeptides would need to be reduced in order to generate meaningful results on the site heterogeneity of the O-glycosylation.
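The AspN step proposed above for T13 can be illustrated with a small in silico digestion in Python. This is only a sketch under simplifying assumptions: AspN is modeled as cleaving on the N-terminal side of every Asp (D) residue, the function name `aspn_digest` is hypothetical, and the sequence in the usage line is a made-up placeholder with two Asp residues, not the T13 sequence.

```python
def aspn_digest(sequence):
    """Split a peptide on the N-terminal side of every Asp (D) residue.

    Simplified model of AspN specificity: each D starts a new fragment,
    except a D at the very first position, which has nothing before it.
    """
    fragments = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue == "D" and index > start:
            fragments.append(sequence[start:index])
            start = index
    fragments.append(sequence[start:])
    return fragments


# Placeholder sequence (hypothetical, NOT the T13 peptide): two D residues
# give three shorter fragments, each a smaller target for ETD.
for fragment in aspn_digest("ASTPDGSTPPDTSR"):
    print(fragment, len(fragment))
```

With two cleavable Asp residues, a tryptic peptide is split into three pieces, so the single occupied site on T13 would fall on a considerably shorter peptide.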
Notes
The authors declare no competing financial interest.
■
REFERENCES
(1) Hossler, P.; Khattak, S. F.; Li, Z. J. Glycobiology 2009, 19, 936−949.
(2) Li, H.; d’Anjou, M. Curr. Opin. Biotechnol. 2009, 20, 678−684.
(3) Fenaille, F.; Groseil, C.; Ramon, C.; Riandé, S.; Siret, L.; Chtourou, S.; Bihoreau, N. Glycoconjugate J. 2008, 25, 827−842.
(4) Klausen, N. K.; Bayne, S.; Palm, L. Mol. Biotechnol. 1998, 9, 195−204.
(5) Lowe, J. B.; Marth, J. D. Annu. Rev. Biochem. 2003, 72, 643−691.
(6) Higgins, E. Glycoconjugate J. 2010, 27, 211−225.
(7) Thakur, D.; Rejtar, T.; Karger, B. L.; Washburn, N. J.; Bosques, C. J.; Gunay, N. S.; Shriver, Z.; Venkataraman, G. Anal. Chem. 2009, 81, 8900−8907.
(8) Borisov, O. V.; Field, M.; Ling, V. T.; Harris, R. J. Anal. Chem. 2009, 81, 9744−9754.
(9) Han, M.; Guo, A.; Jochheim, C.; Zhang, Y.; Martinez, T.; Kodama, P.; Pettit, D.; Balland, A. Chromatographia 2007, 66, 969−976.
(10) Siemiatkoski, J.; Ma, S.; Park, J.; Brorson, K.; Swann, P.; McLeod, L. BioProcess Int. 2011, 9, 48−53.
(11) Leymarie, N.; Zaia, J. Anal. Chem. 2012, 84, 3040−3048.
(12) Jensen, P. H.; Kolarich, D.; Packer, N. H. FEBS J. 2010, 277, 81−94.
(13) Wuhrer, M.; Deelder, A. M.; Hokke, C. H. J. Chromatogr., B 2005, 825, 124−133.
(14) Syka, J. E. P.; Coon, J. J.; Schroeder, M. J.; Shabanowitz, J.; Hunt, D. F. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 9528−9533.
(15) Christiansen, M. N.; Kolarich, D.; Nevalainen, H.; Packer, N. H.; Jensen, P. H. Anal. Chem. 2010, 82, 3500−3509.
(16) Mariño, K.; Bones, J.; Kattla, J. J.; Rudd, P. M. Nat. Chem. Biol. 2010, 6, 713−723.
(17) Zauner, G.; Kozak, R. P.; Gardner, R. A.; Fernandes, D. L.; Deelder, A. M.; Wuhrer, M. Biol. Chem. 2012, 393, 687−708.
(18) Genetic Engineering & Biotechnology News. http://www.genengnews.com/insight-and-intelligenceand153/top-20-best-sellingdrugs-of-2012/77899775/?page=2.
(19) Gur, A.; Oktayoglu, P. Anti-Inflammatory Anti-Allergy Agents Med. Chem. 2010, 9, 24−34.
(20) Tan, Q.; Guo, Q.; Fang, C.; Wang, C.; Li, B.; Wang, H.; Li, J.; Guo, Y. mAbs 2012, 4, 761−774.
(21) Royle, L.; Radcliffe, C. M.; Dwek, R. A.; Rudd, P. M. Methods Mol. Biol. 2007, 347, 125−143.
(22) Ahn, J.; Bones, J.; Yu, Y. Q.; Rudd, P. M.; Gilar, M. J. Chromatogr., B: Anal. Technol. Biomed. Life Sci. 2010, 878, 403−408.
(23) Chakraborty, A. B.; Berger, S. J.; Gebler, J. C. Rapid Commun. Mass Spectrom. 2007, 21, 730−744.
(24) Williams, J. P.; Brown, J. M.; Campuzano, I.; Sadler, P. J. Chem. Commun. 2010, 46, 5458−5460.
■
CONCLUSIONS
This paper describes the first direct analysis of the N- and O-glycosylation of etanercept by liquid chromatography and mass spectrometry. A series of UPLC-HILIC-FLR, LC/MSE via CID, and MS/MS via ETD experiments provides a comprehensive characterization of etanercept glycosylation. Our analysis of etanercept demonstrates that HILIC chromatography, in combination with exoglycosidase enzyme arrays, provides a routine and in-depth approach to structural characterization of the released glycans. In addition, at the glycopeptide level, ETD fragmentation can define the specific O-glycosylation sites for the majority of the O-glycosylated variants of etanercept. Moreover, this analysis can be accomplished in an online LC/MS format. This represents a significant step forward in the analysis of highly glycosylated protein therapeutics with highly clustered O-glycosylation. Although these results are specific to etanercept, Ser/Thr- and Pro-rich sequences are common in the linker domains of many Fc-fusion proteins. This study lays
(25) Williams, J. P.; Pringle, S.; Richardson, K.; Gethings, L.; Vissers, J. P. C.; De Cecco, M.; Houel, S.; Chakraborty, A. B.; Yu, Y. Q.; Chen, W.; Brown, J. M. Rapid Commun. Mass Spectrom. 2013, 27, 2383−2390.
(26) Campbell, M. P.; Royle, L.; Radcliffe, C. M.; Dwek, R. A.; Rudd, P. M. Bioinformatics 2008, 24, 1214−1216.
(27) North, S. J.; Huang, H.; Sundaram, S.; Jang-Lee, J.; Etienne, A. T.; Trollope, A.; Chalabi, S.; Dell, A.; Stanley, P.; Haslam, S. M. J. Biol. Chem. 2010, 285, 5759−5775.
(28) Bongers, J.; Devincentis, J.; Fu, J.; Huang, P.; Kirkley, D. H.; Leister, K.; Liu, P.; Ludwig, R.; Rumney, K.; Tao, L.; Wu, W.; Russell, R. J. J. Chromatogr., A 2011, 1218, 8140−8149.
(29) Uchida, Y.; Tsukada, Y.; Sugimori, T. J. Biochem. 1979, 86, 1573−1585.
(30) Tracey, D.; Klareskog, L.; Sasso, E. H.; Salfeld, J. G.; Tak, P. P. Pharmacol. Ther. 2008, 117, 244−279.
(31) Palmisano, G.; Melo-Braga, M. N.; Engholm-Keller, K.; Parker, B. L.; Larsen, M. R. J. Proteome Res. 2012, 11, 1949−1957.