ARTICLE pubs.acs.org/bc
Site-Specific Modification of Recombinant Proteins: A Novel Platform for Modifying Glycoproteins Expressed in E. coli
Grant E. Henderson, Kevin D. Isett, and Tillman U. Gerngross*
ABSTRACT:
The site-specific modification of proteins is expected to be an important capability for the synthesis of bioconjugates in the future. However, the traditional repertoire of reactions available for the direct modification of proteins suffers from a lack of specificity, necessitating costly downstream processing to isolate the specific species of interest.1 Here, we use a well-established, glycan-specific chemistry to PEGylate model glycoproteins, each containing a unique reactive GalNAc attached to a specifically engineered threonine residue. By engineering E. coli to execute the initial steps of human, mucin-type O-glycosylation, we were able to obtain homogeneous site-specifically modified glycoproteins with fully human glycan linkages. Two mucin-based reporters as well as several fusion proteins containing eight-amino-acid GalNAc-T recognition sequences were glycosylated in this engineered glycocompetent strain of E. coli. The use of one sequence in particular, PPPTSGPT, resulted in site-specific glycan occupancy of approximately 69% at the engineered threonine. The GalNAc present on the purified glycoprotein was oxidized by galactose oxidase and then coupled to hydroxylamine-functionalized 20 kDa PEG in the presence of aniline. The glycoprotein could be converted to the PEGylated product at approximately 85% yield and >98% purity as determined by comparison to the products of control reactions.
Received: November 19, 2010. Revised: March 3, 2011. Published: March 12, 2011.
dx.doi.org/10.1021/bc100510g | Bioconjugate Chem. 2011, 22, 903–912
INTRODUCTION
To date, modification of proteins at naturally occurring functional groups has been the predominant strategy for the synthesis of bioconjugates.1 However, complications arise when proteins contain multiple reactive sites and/or have a reactive conjugation site near the active site. Traditionally, these problems have been solved by removing the reactive residues to convert the protein to a more suitable starting material or by controlling the reaction to promote selectivity for the desired product. This process can be time-consuming and unreliable; therefore, new methods for protein modification are required. To improve upon traditional protocols, new methods should provide control over the conjugation reaction with minimal disruption of protein structure. This goal can be achieved through the selective targeting of bioorthogonal groups that have been engineered at desired locations within proteins of interest.2–5
The targeting of glycans for chemical modification is not unprecedented. In particular, metabolic labeling of glycan structures through the use of chemically modified monosaccharides has found use in basic research involving the detection and characterization of glycosylation events in vivo6–8 and likely could be employed for the synthesis of therapeutic bioconjugates in the future. However, this technique is limited to cell types with endogenous glycosylation capabilities, which due to heterogeneity in glycan structure and localization are not necessarily ideal hosts for the production of glycoprotein precursors for bioconjugation. Since aldehyde groups are typically absent in proteins and biocompatible aldehyde-specific reactions have been reported, we developed a new, glycosylation-based method to introduce aldehydes in a site-specific manner, resulting in the capability for site-specific protein modification.
Figure 1. Reaction schematic: 1. The epimerization of UDP-GlcNAc to UDP-GalNAc by WbpP from P. aeruginosa and transfer of GalNAc to Ser/Thr residues in E. coli followed by in vitro oxidation by galactose oxidase, resulting in aldehyde-functionalized glycoproteins. 2. Subsequent nucleophilic addition of hydroxylamine-functionalized poly(ethylene glycol) in the presence of aniline.
Although the functional groups typically found within proteins are not easily oxidized to carbonyls in a site-specific manner, the hydroxyls present on certain glycans are easily converted to aldehydes with high selectivity.9–23 The use of galactose oxidase (GAO) in particular for the oxidation of glycoproteins is well-documented, but seems to have fallen out of use in favor of the alternative method of periodate oxidation of vicinal diols. Unfortunately, periodate can oxidize cysteine, methionine, tryptophan, tyrosine, and histidine residues regardless of their location within a protein, and serine and threonine are oxidized when located at the N-terminus. Furthermore, the subsequent labeling of periodate-oxidized glycoproteins tends to result in heterogeneous product mixtures with higher than expected extents of conjugation.19 In applications such as the modification of therapeutic proteins, where product homogeneity, structure, and function are extremely important, oxidation by GAO is preferred. Galactose oxidase is active toward several primary alcohols including but not limited to the C-6 hydroxyls of Gal and GalNAc.24 While both the oxidation kinetics and equilibrium conversion have been reported to be affected by the microenvironment of the C-6 hydroxyl of the terminal sugar,25 activity toward a diverse set of substituted Gal variants including both α- and β-linked GalNAc has been reported.9,10
The initiating step of mucin-type O-glycosylation in vertebrates involves the enzymatic transfer of GalNAc from UDP-GalNAc to certain serine and threonine residues of nascent glycoproteins, resulting in proteins decorated with a single α-GalNAc at the designated site. This basic process has been studied extensively and the substrate requirements have been determined.26,27 Nonpathogenic E.
coli strains contain intracellular UDP-GlcNAc but not UDP-GalNAc; therefore, for cytoplasmic expression of GalNAc-decorated glycoproteins, the initiating step of the mucin-type glycosylation pathway was reconstituted by coexpressing UDP-GlcNAc C-4 epimerase and UDP-GalNAc/polypeptide UDP-GalNAc transferase (GalNAc-T) along with a suitable target protein. The human GalNAc-T2 isoform was chosen for this study based on its relatively high in vitro activity and the abundance of information pertaining to potential O-glycosylation recognition sequences (OGRS). In order to achieve mucin-type O-glycosylation in E. coli, we produced several reporter proteins in a glycoengineered E. coli strain: (i) a glutathione S-transferase/MUC1 fusion protein and (ii) a panel of OGRS peptides fused to human antibody fragments (Fabs) directed against TNF-α. In contrast to oligosaccharyltransferase-mediated transfer of N-glycans to asparagines within the recognition sequon Asn-X-Ser/Thr, which occurs en bloc, O-glycosyltransferases add sugars sequentially, requiring just the appropriate OGRS and UDP-GalNAc for the first addition. Gupta et al. have developed an OGRS database, O-GlycBase,28 which has in the past been used to train computational neural networks for the prediction of O-glycosylation sites.29,30 Although these programs can be used for generally predicting O-glycosylation sites, they cannot be used to model individual transferases due to the overlapping activities of the various GalNAc-T isoforms. To determine recognition sequence requirements for individual transferases, in vitro reactions involving purified enzymes and substrates are typically used31–39 (a pH-based high-throughput assay for glycosyltransferase activity has been described, but it has not yet been used for the characterization of GalNAc-T isoforms40). Recently, the in vitro use of glycosyltransferases has been extended to engineering applications such as the generation of glycopeptide libraries41 and use in bioconjugate chemistry.42 To find a suitable OGRS for hGalNAc-T2, we evaluated a small subset of the OGRS present in O-GlycBase v 6.00 for use as glycosylation targets on recombinant proteins. Since GalNAc-T2 is an initiating transferase, only OGRS without transacting glycosylation sites were considered.
The literature suggests that the eight amino acids from −3 to +4 relative to the glycosylated serine or threonine are involved in recognition by GalNAc-T. Ultimately, 17 eight-amino-acid sequences were selected. While this small library was by no means exhaustive, screening its members for glycosylation identified an OGRS with relatively high occupancy. In addition, this approach provides a simple
Table 1. Oligonucleotides Used in This Work
label
sequence (5′-3′)
oGH119
GAAGAAGGATCCAGGAAGGAGGACTGGAATG
oGH120
GAAGAAGCGGCCGCCTGCTGCAGGTTGAG
oGH174
TAATAAGAATTCAGGTGGCACTTTTCGGGG
oGH175
TAATAACCATGGTTCCATAGGCTCCGC
oGH176
GAGCGTGGCAGCCGCGGTATCATTGCAGCACTGGGGCC
oGH177
CCGCGGCTGCCACGCTCACCGGCTCCAGATTTA
oGH178
TAATAAGGCGCGCCCTACTACTGGGCTGCTTCC
oGH179
TAATAACCATGGCATTAATTGCGTTGCGC
oGH180
GCCCGTGAGCCTGGTGAAAAGAAAAACCACCCTGGCGCCC
oGH181
CCAGGCTCACGGGCAACAGCTGATTGCCCTTCACCGCC
oGH196
GAAGAAGCGGCCGCCAAAAATATGGGCGCAG
oGH199
TAATAAGAGCTCATGAGGAAGGAGGACTGGAATG
oGH200
TTATTAACCGGTTCAGTGGTGGTGGTGGTG
oGH286
TTAACCGGTTCATTTCAAAAACATGATG
oGH528
/5PHOS/GGAGCCACGCGGAACCAG
oGH529
TTTTGGGCTAGCAGGAGGAATTCACCATGTCCCCTATACTAGG
oGH530
CCTTCTAGACCAAAAAAACGGGTATG
oGH531
CGACCTGCAGGCATGCAAG
oGH589
/5PHOS/CACGGTGTTACTAGCGC
oGH590
TAATAAGAGCTCTTAACAAGATTTTGGTTCAGC
oGH601
/5PHOS/GCAAGTATGGGTTTTATCGC
oGH602
GCTGTTCCAACCACTGTAGTAGATGCTTGAGCATGCAAGCTTGGCTG
oGH603
GAAGCACAAACTGAACTGCCTCAAGCATGAGCATGCAAGCTTGGCTG
oGH604
ATCAACCCAACCACTCAGATGAAAGCTTGAGCATGCAAGCTTGGCTG
oGH605
CTGGCTGGTACTGAATCTCCTGTAGCATGAGCATGCAAGCTTGGCTG
oGH606
CTGGCTCCAACTGCACCACCTGAAGCTTGAGCATGCAAGCTTGGCTG
oGH607
CTGCAACCAACCCAAGGTGCAATGGCCTGAGCATGCAAGCTTGGCTG
oGH608
CTGACTCAAACCCCAGTTGTTGTTGCTTGAGCATGCAAGCTTGGCTG
oGH609
CCAGGTTCTACTGCTCCACCAGCTGCATGAGCATGCAAGCTTGGCTG
oGH610
CCTCCTCCAACTTCTGGTCCAACTGCATGAGCATGCAAGCTTGGCTG
oGH611
CCTGTTCCAACTCCTCCTGATAATGCATGAGCATGCAAGCTTGGCTG
oGH612
CAACCAGTAACTTCTCAACCACAAGCATGAGCATGCAAGCTTGGCTG
oGH613
CGTGCAGCAACTGTAGGTTCTCTGGCTTGAGCATGCAAGCTTGGCTG
oGH614
ACTCCACCAACTGTTCTGCCTGATGCATGAGCATGCAAGCTTGGCTG
oGH615
GTAGGCCTGACTCCGTCTGCAGCTGCTTGAGCATGCAAGCTTGGCTG
oGH616
GTACTGCCTACTCAATCTGCACATGCTTGAGCATGCAAGCTTGGCTG
oGH617
GTTCGTGCAACTCGTACCGTAGTAGCATGAGCATGCAAGCTTGGCTG
oGH618
GTTCGTCCTACTTCTGCTGTTGCTGCTTGAGCATGCAAGCTTGGCTG
L1
ATGAAAAAGACAGCTATCGCAATTGCAGTGGCCTTGGCTGGTT
L2
AAGGGCTTTGGGTCATTTGAATATCAGCTTGCGCTACGGTAGCGAAACCAGCCAAGGCC
L3
AATGACCCAAAGCCCTTCTTCTCTGAGCGCTTCTGTTGGTGATCGTGTGACCATCACCT
L4
TGATACCACGCAACGTTAGTACCGACGTTCTGGGACGCCTTACAGGTGATGGTCACACG
L5
AACGTTGCGTGGTATCAGCAGAAACCGGGCAAGGCACCTAAAGCCCTGATTTATTCTGC
L6
CGGAGAAGCGATACGGCACGCCAGAGTACAGGAAAGATGCAGAATAAATCAGGGCTTTA
L7
CCGTATCGCTTCTCCGGTTCTGGCTCTGGTACTGACTTCACTCTGACCATCTCCTCTCT
L8
TGTACTGCTGACAGTAGTAGGTAGCAAAATCTTCCGGTTGCAGAGAGGAGATGGTCAGA
L9
CTACTACTGTCAGCAGTACAACATCTACCCTCTGACGTTCGGCCAAGGCACCAAAGTTG
L10
GGAAGATGAACACAGACGGTGCCGCTACCGTACGTTTAATTTCAACTTTGGTGCCTTGG
L11
CGTCTGTGTTCATCTTCCCGCCGTCTGATGAACAACTGAAGTCTGGCACTGCGTCTGTA
L12
TGCACTTTAGCTTCACGCGGGTAAAAGTTATTCAGCAGACACACTACAGACGCAGTGCC
L13
CGTGAAGCTAAAGTGCAGTGGAAAGTCGATAACGCTCTGCAATCTGGTAATTCTCAGGA
L14
GAATAGGTGGAATCTTTGCTGTCCTGTTCAGTTACAGACTCCTGAGAATTACCAGATTG
L15
AGCAAAGATTCCACCTATTCTCTGTCTAGCACCCTGACCCTGTCCAAAGCGGATTACGA
L16
AGACCTTGGTGGGTTACTTCACACGCATATACTTTGTGTTTTTCGTAATCCGCTTTGGA
L17
GTAACCCACCAAGGTCTGTCCTCCCCGGTTACTAAATCCTTTAACCGTGGCGAATGTTG
L18
TCAACATTCGCCACGG
H1
ATGAAAAAGACAGCTATCG
H2
TGCGCTACGGTAGCGAAACCAGCCAAGGCCACTGCAATTGCGATAGCTGTCTTTTTCAT
H3
GCTACCGTAGCGCAAGCTGAAGTACAACTGGTAGAATCTGGTGGTGGTCTGGTTCAGCC
H4
AAACATAGCCGCTTGCGGCGCAAGACAGACGCAGAGAACCACCAGGCTGAACCAGACCA
H5
GCAAGCGGCTATGTTTTTACGGACTACGGCATGAACTGGGTTCGCCAGGCTCCGGGCAA
H6
TCGGTTCGCCGATGTAAGTGTTAATCCAGCCCATCCATTCCAGACCTTTGCCCGGAGCC
H7
ATCGGCGAACCGATCTATGCTGATTCTGTTAAGGGCCGTTTCACTTTCAGCCTGGACAC
H8
TGCGCGCAGGCTGTTCATTTGCAGGTAAGCGGTGGATTTGCTGGTGTCCAGGCTGAAAG
H9
GCCTGCGCGCAGAAGATACCGCTGTTTATTACTGCGCACGTGGCTACCGTAGCTACGCA
H10
GCAGAAGATACAGTCACCAGGGTGCCCTGACCCCAGTAATCCATTGCGTAGCTACGGTA
H11
GGTGACTGTATCTTCTGCTTCTACTAAAGGCCCGTCTGTATTCCCTCTGGCACCTTCTA
H12
ACCAGGCAACCCAGTGCCGCCGTACCACCGGAAGTGGATTTGCTAGAAGGTGCCAGAGG
H13
CTGGGTTGCCTGGTTAAAGATTATTTCCCGGAACCTGTAACTGTGAGCTGGAACTCCGG
H14
GATTGCAGAACCGCCGGGAAGGTATGCACGCCGGAAGTCAGCGCACCGGAGTTCCAGCT
H15
GGCGGTTCTGCAATCCAGCGGCCTGTACTCCCTGTCTAGCGTTGTTACCGTTCCGTCTA
H16
TATGGTTAACGTTGCAGATGTACGTCTGGGTGCCCAGAGAGCTAGACGGAACGGTAACA
H17
TCTGCAACGTTAACCATAAACCATCTAACACTAAAGTTGATAAGAAAGTCGAACCGAAA
H18
TCACGCCGCGCAAGTATGGGTTTTATCGCAAGATTTCGGTTCGACTTTCT
oKI158
/5PHOS/GAAAAAAAAATGAAAAAGACAGCTATCGCAATTG
oKI159
/5PHOS/CTCCTCAACACTCTCCTCTATTAAAGCTTTTTG
oKI160
TAGAGCTCATGAAAAAGACAGCTATCGCAATTG
oKI161
ATACCGGTTCACGCCGCGCAAGTATG
oKI162
ATACCGGTTCAACATTCGCCACGGTTAAAG
oKI464
TAGGAATTCACCATGGATATTCAAATGACCCAAAGCCCTTC
oKI465
ATGCATGCTCACGCCGCCGGCGCCGGGG
oKI481
/5PHOS/TCTACTGCGCCTCCTGCGGCGTGAGCATGCAAGC
oKI482
/5PHOS/ACCCGGTGCAGGTTTGTTATCGCAAGATTTCGGTTCGACTTTCTTATC
demonstration of the utility of O-glycocompetent E. coli for basic research on GalNAc-T substrate specificity. Following the expression and purification of a glycosylated OGRS/Fab fusion protein selected from the library, we used a well-established glycan labeling technique to PEGylate the site-specifically placed N-acetylgalactosamine (GalNAc). In summary, this method entails enzymatic oxidation by GAO and reaction of the resultant aldehydes with hydroxylamine-functionalized molecules of interest in the presence of aniline to form stable oxime bonds (Figure 1). While this method has been used extensively for the labeling of glycoproteins, its use as a site-specific bioconjugation technique has been limited by the difficulty of producing homogeneously glycosylated proteins with unique oxidizable sugars. As previously discussed, we circumvented this problem by engineering E. coli with the ability to O-glycosylate proteins rather than using eukaryotic hosts, which typically contain an array of glycosylation modifications leading to numerous potentially reactive glycans. In contrast to previously reported techniques for glycoPEGylation,42 the method reported here does not require an exogenous source of UDP-GalNAc, and all enzymes and reagents are available from commercial sources.
EXPERIMENTAL PROCEDURES
Construction of the Glycosylation Helper Plasmid. An existing BsaI site was removed from the pUC19 origin and β-lactamase resistance marker by overlap PCR using oGH174 and oGH175 as flanking primers and oGH176 and oGH177 as internal overlapping primers (see Tables 1 and 2 for lists of primers and plasmids). The resulting DNA fragment contains EcoRI and NcoI sites at the 5′ and 3′ ends, respectively. The multiple cloning site from pGH1143 was excised using EcoRI and AscI. A BsmBI site was removed from the lacI open reading frame by overlap PCR using oGH178 and oGH179 as flanking primers and oGH180 and oGH181 as internal overlapping primers. The resulting DNA fragment contains AscI and NcoI sites at the 5′ and 3′ ends, respectively. The three fragments were joined by triple ligation to make pMGV. The gene encoding polypeptide GalNAc transferase 2 was amplified from a human brain cDNA library using primers oGH119 and oGH120, which add BamHI and NotI sites to the 5′ and 3′ ends, respectively. The PCR product was cloned between the BamHI and NotI sites in pET21a(+) to make pET21GT2. The gene, along with the 6×His tag, was amplified from pET21GT2 using primers oGH199 and oGH200, which add SacI and AgeI sites to the 5′ and 3′ ends, respectively. The PCR product was cloned between the SacI and AgeI sites in pMGV to make pMGV(GT2). The gene encoding UDP-GlcNAc C-4 epimerase from P. aeruginosa was codon optimized for expression in E. coli and synthesized by DNA 2.0 (Menlo Park, CA) along with its endogenous promoter. The promoter and gene were amplified using primers oGH196 and oGH286, which add NotI and AgeI sites to the 5′
Table 2. Plasmids Used in This Work
identifier
description
reference
pMGV
E. coli multiple gene expression vector
This work
pMGV(GT2)
PT7GalNAc-T2 in pMGV
This work
pMGV(wbpP)
PwbpPwbpP in pMGV
This work
pMGV(GT2,wbpP)
PT7GalNAc-T2 and PwbpPwbpP in pMGV
This work
pMGV(OmpATNFRFab)
PT7OmpATNFRFab in pMGV
This work
pMGV(TNFRFab)
PT7TNFRFab in pMGV
This work
pBAD33(MC3)
p15a origin, CmR
Guzman et al.45
pBAD33(MC3)TNFRFab
ParaTNFRFab in pBAD33(MC3)
This work
pBAD33(MC3)TNFRFab::OGRS##
Fab fusion to OGRS peptide
This work
00
DNKPAPGSTAPPAA
01
AVPTTVVDA
02
EAQTELPQA
03
INPTTQMKA
04
LAGTESPVA
05
LAPTAPPEA
06
LQPTQGAMA
07
LTQTPVVVA
08
PGSTAPPAA
09
PPPTSGPTA
10
PVPTPPDNA
11
QPVTSQPQA
12
RAATVGSLA
13
TPPTVLPDA
14
VGLTPSAAA
15
VLPTQSAHA
16
VRATRTVVA
17
VRPTSAVAA
pBAD33(MC3)GST::4xMuc1
GST fused to four tandem repeats of Muc1
This work
pPB230
Mucin expression vector
gift from P. Bobrowicz, Glycofi, Inc.
pGal1-GST
Vector containing GST tag
gift from K. Griswold, Dartmouth College
and 3′ ends, respectively. The PCR product was cloned between the NotI and AgeI sites in pMGV to make pMGV(wbpP). The 1.5 kb AscI/SphI fragment of pMGV(wbpP) was cloned between the BsmBI and SphI sites in pMGV(GT2) to make pMGV(GT2,wbpP). This plasmid encodes the GalNAc-T and epimerase necessary for glycosylation of target proteins in the E. coli cytoplasm.
Cloning and Expression of the GST-Mucin Fusion Reporter Protein. The tandem repeat region of mucin was amplified from pPB230 using oGH589 and oGH590 to add a SacI site to the 3′ end. GST was amplified from pGal1-GST using oGH528 and oGH529 to add an NheI site to the 5′ end. pBAD33(MC3) was amplified using oGH530 and oGH531 to add SacI and XbaI sites to the 5′ and 3′ ends, respectively. The PCR products were digested with the enzymes listed above and ligated to form pBAD33(MC3)GST::4xMuc1. Shuffle T7 Express (NEB, Ipswich, MA) was transformed with pBAD33(MC3)GST::4xMuc1 either alone or in combination with pMGV(GT2,wbpP). Transformants were grown to approximately 1 g/L DCW in 5 mL LB containing 25 g/L chloramphenicol and 100 g/L ampicillin prior to induction by addition of 62.5 μL 40% arabinose. Cultures were induced overnight, and cell pellets were harvested by centrifugation at 5000 × g for 5 min. The pellets were resuspended in 0.5 mL PBS containing 1× Complete, EDTA-free protease inhibitor (Roche). The cells were lysed by ultrasonication, and insoluble lysate was removed by centrifugation at 16 000 × g. Soluble lysates were analyzed for glycosylation by Western blot using HRP-conjugated Vicia villosa lectin (EY Laboratories, San Mateo, CA).
Cloning of anti-TNF-α Fab and Creation of OGRS Fusions. The genes encoding the heavy and light chains of anti-TNF-α Fab secreted via OmpA leader peptides were synthesized by overlap extension PCR using oligonucleotides L1–L18 for the light chain and H1–H18 for the heavy chain.44 The PCR products were cloned between the SacI and AgeI sites in pMGV. The light chain was amplified using primers oKI160 and oKI162, and the heavy chain was amplified using primers oKI158 and oKI161. The PCR products were digested with SacI (light) or AgeI (heavy) and cloned between the SacI and AgeI sites of pMGV. The leaders were removed by quick-change PCR using primers oKI458–461. The leaderless construct was amplified using oKI464 and oKI465, which add EcoRI and SphI sites to the 5′ and 3′ ends, respectively. The PCR product was cloned between the EcoRI and SphI sites in pBAD33(MC3)45 to make pBAD33(MC3)TNFRFab. The mucin-based OGRS was added by quick-change PCR using phosphorylated primers oKI481 and oKI482 to make pBAD33(MC3)TNFRFab00. All other OGRS were added by quick-change PCR using a phosphorylated reverse primer, oGH601, and a nonphosphorylated forward primer (oGH602–618) encoding an OGRS to make pBAD33(MC3)OgTNF01–17.
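Each forward primer in the oGH602–618 series appears to begin with the codons for one OGRS peptide, followed by an alanine codon and a stop codon. As a minimal sketch of how that reading can be checked, the snippet below translates the 5′ end of a primer using the standard genetic code. The helper name `translate_until_stop` is hypothetical, and the assumption that translation starts in frame at the first base of the primer is ours rather than something stated in the text.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_until_stop(dna: str) -> str:
    """Translate in frame from the first base and stop at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon reached
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Primer sequences copied from Table 1.
oGH610 = "CCTCCTCCAACTTCTGGTCCAACTGCATGAGCATGCAAGCTTGGCTG"
oGH602 = "GCTGTTCCAACCACTGTAGTAGATGCTTGAGCATGCAAGCTTGGCTG"

print(translate_until_stop(oGH610))  # PPPTSGPTA
print(translate_until_stop(oGH602))  # AVPTTVVDA
```

The two translated peptides match OGRS entries 09 and 01 in Table 2, which is consistent with the in-frame assumption.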
Table 3. Strains Used in This Work
identifier
description
strain
sGH683
pMGV(GT2) and pBAD33(MC3)GST::4xMuc1
Shuffle T7 Express
sGH684
pMGV(wbpP) and pBAD33(MC3)GST::4xMuc1
Shuffle T7 Express
sGH685
pMGV(GT2,wbpP) and pBAD33(MC3)GST::4xMuc1
Shuffle T7 Express
sKI396
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab00
Shuffle T7 Express
sGH733
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab01
Shuffle T7 Express
sGH734
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab02
Shuffle T7 Express
sGH735
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab03
Shuffle T7 Express
sGH736
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab04
Shuffle T7 Express
sGH737
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab05
Shuffle T7 Express
sGH738
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab06
Shuffle T7 Express
sGH739
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab07
Shuffle T7 Express
sGH740
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab08
Shuffle T7 Express
sGH741
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab09
Shuffle T7 Express
sGH742
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab10
Shuffle T7 Express
sGH743
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab11
Shuffle T7 Express
sGH744
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab12
Shuffle T7 Express
sGH745
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab13
Shuffle T7 Express
sGH746
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab14
Shuffle T7 Express
sGH747
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab15
Shuffle T7 Express
sGH748
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab16
Shuffle T7 Express
sGH749
pMGV(GT2,wbpP) and pBAD33(MC3)TNFRFab17
Shuffle T7 Express
Screening of Glycosylation Activity against a Panel of Selected OGRS. Shuffle T7 Express (NEB, Ipswich, MA) was transformed with pMGV(GT2,wbpP) and pBAD33(MC3)OgTNF## (where ## corresponds to the OGRS peptide fusion; see Table 3). Anti-TNF-α Fab was expressed identically to the GST-Mucin reporter described above. Soluble lysates were analyzed for glycosylation by Western blot using HRP-conjugated Vicia villosa lectin.
Expression and Purification of Glycosylated Fabs. sGH741 was grown at 30 °C to approximately 1 g/L DCW in 800 mL LB media containing 25 g/L chloramphenicol and 100 g/L ampicillin. To induce Fab expression, arabinose was added at a final concentration of 0.5% and the cultures were incubated at 30 °C for an additional 20 h. The cultures were centrifuged at 5000 × g for 5 min, and the cell pellets were resuspended in 800 mL LB containing 25 g/L chloramphenicol, 100 g/L ampicillin, and 0.2% dextrose. The cultures were incubated at 30 °C for an additional 3 h. After the second incubation, the cultures were centrifuged at 5000 × g for 5 min and the cell pellets were resuspended in 50 mL lysis buffer (100 mM Tris HCl, pH 7.4; 10 mM EDTA). The cells were incubated overnight at 50 °C with shaking. The 50 °C incubation disrupts the cell membrane and precipitates poorly assembled Fab. After the 50 °C incubation, the lysate was centrifuged at 12 000 × g for 30 min. The supernatant was passed through a 0.45 μm filter prior to Fab purification by FPLC. Protein-L resin was equilibrated using 10 column volumes of PBS at 1 mL/min prior to loading the soluble lysate. After loading, the column was washed with 10 column volumes of PBS and Fab was eluted in 50 mM glycine HCl, pH 3.0. The pH was immediately adjusted to 6.5 by addition of 1 M Tris, pH 8.0. Fractions containing Fab were pooled, concentrated to 4 g/L, and aliquoted into 100 μL fractions.
PEGylation of Recombinant Glycoproteins. Glycoproteins (100 μM) and 20 kDa Aminooxy-PEG (AO-PEG) (1 mM) were reacted at room temperature in 100 mM anilinium acetate, pH 4.6, for two hours. The products were separated by SEC-HPLC on an Agilent 1200 series HPLC equipped with a Bio Sil SEC 250 column from BioRad. The fractions containing unreacted protein were pooled and suspended in 50 mM sodium phosphate, pH 7.4, at a concentration of 100 μM. The glycoproteins were incubated for 20 h at room temperature in the presence of 5 μM galactose oxidase. Oxidized glycoproteins were passed over a Zeba spin column pre-equilibrated with anilinium acetate, pH 4.6, and reacted with 1 mM 20 kDa Aminooxy-PEG for one hour.
Analysis of PEGylated Proteins. PEGylation reactions were analyzed by RP-HPLC, MALDI-TOF, and SDS-PAGE. For RP-HPLC, reactions were loaded directly onto the column. Fractions containing reacted and unreacted protein were collected and analyzed by MALDI-TOF. For SDS-PAGE analysis, unreacted AO-PEG was removed by protein-L immunoprecipitation prior to loading. In a 1.5 mL microcentrifuge tube, 15 μL Protein-L agarose (GenScript) was washed three times with PBS. Then, the PEGylation reactions were added and the tubes were incubated at room temperature for three hours with shaking. The supernatant containing unreacted Aminooxy-PEG was discarded, and the protein-L resin was washed once with PBS. Proteins were eluted by boiling in SDS-Loading buffer (Invitrogen) containing 100 mM DTT for 10 min.
RESULTS
In Vivo Glycosylation of Mucin-Based Reporter Proteins. While activity has been observed for recombinant GalNAc-T expressed in eukaryotic cells, a literature search did not identify any studies involving soluble expression of active GalNAc-T in prokaryotes. In our hands, human GalNAc-T2 could be expressed under the control of the T7 promoter in E. coli only when cultures were not induced with IPTG. A Western blot using HRP-conjugated Vicia villosa lectin showed that the GST::4xMuc1
Figure 2. Vicia villosa lectin Western blot of the glycosylated mucin fusion protein: lane 1, UDP-GlcNAc C-4 epimerase only; lane 2, GST-Mucin coexpressed with UDP-GlcNAc C-4 epimerase; lane 3, GalNAc-T2 only; lane 4, GST-Mucin coexpressed with GalNAc-T2; lane 5, UDP-GlcNAc C-4 epimerase and GalNAc-T2; lane 6, GST-Mucin coexpressed with UDP-GlcNAc C-4 epimerase and GalNAc-T2. The arrow indicates the GST-Mucin fusion protein. Reactivity with VVA lectin in lane 6 but not in lanes 1–5 shows that the reporter protein is glycosylated only in the presence of both epimerase and GalNAc-T2.
Table 4. Vertebrate OGRS Peptides Selected from O-GlycBase and the Results of in Vivo Glycosylation by GalNAc-T2 in E. coli (a)
peptide
protein
glycosylated
AVPTTVVD
human α-2-HS-glycoprotein precursor
+
EAQTELPQ
human lithostathine precursor
INPTTQMK
human HMW I kininogen precursor
LAGTESPV
human transferrin receptor protein
++
LAPTAPPE
human plasminogen precursor
+++
LQPTQGAM
human G-CSF precursor
LTQTPVVV
bovine β-casein
PGSTAPPA
human muc1
PPPTSGPT
Sus scrofa plasminogen
PVPTPPDN
α-1-microglobulin
QPVTSQPQ
human α-2-HS-glycoprotein precursor
RAATVGSL
human apolipoprotein E
TPPTVLPD
human IGF-2 precursor
VGLTPSAA
human TNF-β
VLPTQSAH
sex hormone-binding globulin precursor
Figure 3. Coomassie-stained SDS-PAGE (top) and Vicia villosa lectin Western blot (bottom) of Fab fused to OGRS #1–17: OGRS #1–17 were selected from O-GlycBase v 6.00 and fused to the C-terminus of Fab. The OGRS fusions were expressed in the glycocompetent E. coli strain, and glycosylation was assessed by reactivity with VVA lectin. The results of the screen are summarized in Table 4.
++
glycosylation sites by at least 30 amino acids were chosen. The selected OGRS were expressed as C-terminal fusions to the heavy chain of anti-TNF-α Fab separated by a single Gly-4-Ser linker. The fusion proteins were expressed in glycocompetent E. coli and reporters were screened for glycosylation by Western blot using VVA-lectin HRP conjugate, which is specific for terminal GalNAc. The results of the screen are shown in Table 4 and Figure 3. It is hypothesized that OGRS that did not interact with GalNAc-T2 either were structurally altered by presentation as C-terminal fusions or are targets for other GalNAc-T isoforms. The OGRS from Sus scrofa plasminogen, PPPTSGPT, showed the greatest degree of staining on Vicia villosa lectin Western blots, and subsequent LC-MS analysis confirmed a glycosidic linkage to HexNAc (Figure 4). This glycoprotein was purified by protein-L chromatography, and further analysis showed up to 69% occupancy at Thr-4 (Figure 5).
PEGylation of O-Glycosylated anti-TNF-α Fab. The glycosylated PPPTSGPT peptide fused to anti-TNF-α Fab was PEGylated at the engineered GalNAc by reacting with a 10-fold molar excess of 20 kDa AO-PEG in 100 mM anilinium acetate, pH 4.6, following oxidation by GAO. Aniline-catalyzed oxime bond formation is a generally accepted method for labeling glycans, and excess aniline can be easily removed by SEC following the reaction. The product was either precipitated on protein-L sepharose and checked by SDS-PAGE (Figure 6) or isolated by SEC-HPLC and analyzed by MALDI-TOF (Figures 7 and 8). On the basis of an estimated occupancy of 69%, the extent of conversion of the glycoprotein is 0.85 after one hour and the purity is >98% by comparison to control reactions containing nonoxidized glycoprotein. Surprisingly, it was found that freshly purified glycoprotein reacts with AO-PEG to some extent even without prior oxidation (Figure 7).
After direct reaction with AO-PEG, approximately 5–10% of the starting material is converted to a high molecular
+ + +++ ++ +++ +++
VRATRTVV
platelet glycoprotein Ib-α chain
VRPTSAVA
human apolipoprotein CIII precursor
(a) − not glycosylated, + weak glycosylation, ++ moderate glycosylation, +++ strong glycosylation.
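The peptides screened in this work are eight-residue windows spanning positions −3 to +4 around the candidate Ser/Thr. A minimal sketch of extracting such windows from a protein sequence is given below. The helper `ogrs_windows` is hypothetical and is not part of O-GlycBase or any published tool; the input is the mucin-based peptide from Table 2, used only as an illustration.

```python
def ogrs_windows(sequence: str, upstream: int = 3, downstream: int = 4):
    """Yield (index, window) for each Ser/Thr that has a complete -3..+4 window.

    `index` is the 0-based position of the Ser/Thr in `sequence`.
    Residues too close to either terminus are skipped.
    """
    seq = sequence.upper()
    for i, residue in enumerate(seq):
        has_full_window = i >= upstream and i + downstream < len(seq)
        if residue in "ST" and has_full_window:
            yield i, seq[i - upstream:i + downstream + 1]


mucin_peptide = "DNKPAPGSTAPPAA"
for index, window in ogrs_windows(mucin_peptide):
    print(index, window)
# 7 APGSTAPP
# 8 PGSTAPPA
```

The second window, PGSTAPPA, is the human muc1 entry listed in Table 4.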
fusion protein was glycosylated in cells coexpressing GalNAc-T2 and epimerase, but not in the absence of either, indicating that the epimerase and transferase are required and sufficient to impart E. coli with the ability to O-glycosylate proteins in vivo (Figure 2). Detailed analysis of GST::4xMuc1 was not done because the eight potential glycosylation sites present in this reporter likely would have complicated the analysis. However, fusion of a fifteen-amino-acid mucin-based peptide, CDNKPAPGSTAPPAA, to an anti-TNF-α antibody fragment also resulted in O-glycosylation as determined by Western blot. Analysis of this protein by LC/MS revealed an increase in molecular weight of 203 Da, as is expected for addition of a single GalNAc (data not shown).
Screening OGRS Selected from O-GlycBase v 6.00. OGRS were selected from the O-GlycBase database of mammalian O-glycoproteins. Several GalNAc-T isoforms are able to glycosylate nascent proteins, but others require prior glycosylation events. To enrich for OGRS that are recognized by initiating GalNAc-T isoforms, only sites that were separated from other
Figure 7. Analysis of PEGylation reactions by SEC-HPLC: Reactions were performed without removal (left) and with removal (right) of the contaminating reactive species. Without removal of the contaminant, approximately 6% of the nonoxidized starting material is converted to the high molecular weight species. Upon oxidation, the conversion increases to 70%. When the contaminant is removed, less than 2% of the nonoxidized starting material is converted, while 65% is converted after oxidation.
Figure 4. Analysis of glycosylated anti-TNFRFab09 by LC/MS: Glycoprotein was isolated as described and analyzed without further processing. The glycoprotein (MW 48524) comprises approximately 69% of the total. The unglycosylated full-length (MW 48318) and unglycosylated C-terminal degradation product (MW 47610) comprise the remaining 31%. The shift of 203 Da is expected upon glycosidic linkage to a single HexNAc. Lack of glycosylation of the proteolytically degraded product suggests that the glycosylation event was unique to the engineered peptide.
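The 203 Da shift quoted for a single HexNAc can be reproduced from elemental composition: a free HexNAc (C8H15NO6) loses one water molecule when it forms a glycosidic bond. The sketch below uses rounded standard atomic weights; the helper name `average_mass` is hypothetical.

```python
# Rounded standard atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def average_mass(composition: dict) -> float:
    """Average molecular mass for an elemental composition such as {"C": 8, ...}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


hexnac = {"C": 8, "H": 15, "N": 1, "O": 6}  # free GalNAc or GlcNAc
water = {"H": 2, "O": 1}                    # lost on glycosidic bond formation

residue_shift = average_mass(hexnac) - average_mass(water)
print(f"{residue_shift:.2f} Da")  # 203.19 Da
```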
Figure 5. Quantification of glycan occupancy at OGRS peptide Thr-4 by RP-UPLC: The C-terminal glycopeptide elutes at 9.6 min and the aglycosylated C-terminal peptide elutes at 10.3 min. The glycopeptide and aglycopeptide are estimated at 69% and 31%, respectively.
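Occupancy from a chromatogram of this kind is commonly taken as the glycopeptide peak area divided by the summed area of both peptide forms. The sketch below assumes equal detector response for the two peptides, which the text does not state. The peak areas are placeholder values chosen only to reproduce the 69%/31% split; they are not data from this work.

```python
def occupancy(glyco_area: float, aglyco_area: float) -> float:
    """Fraction of peptide carrying the glycan, from two integrated peak areas."""
    total = glyco_area + aglyco_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return glyco_area / total


# Placeholder areas (arbitrary units), not measured values.
print(f"{occupancy(690.0, 310.0):.0%}")  # 69%
```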
Figure 8. Analysis of PEGylation products by MALDI-TOF: The reaction products were separated by SEC-HPLC and fractions from each peak were pooled for MALDI-TOF analysis. Representative data from each peak is presented showing that the slow eluting peak is composed of unreacted Fab and the fast eluting peak is the PEGylated product as expected. The PEGylated product cannot be well-resolved due to the polydispersity of the 20 kDa AO-PEG.
Figure 6. SDS-PAGE showing PEGylation of glycosylated Fab: After the reaction, excess 20 kDa AO-PEG was removed by immunoprecipitation using protein-L agarose. Lane 1, overnight incubation in oxidation reaction buffer followed by 1 h incubation in PEGylation buffer; lane 2, overnight incubation with GAO followed by 1 h incubation in PEGylation buffer; lane 3, overnight incubation in oxidation reaction buffer followed by 1 h incubation with 20 kDa AO-PEG; lane 4, overnight incubation with GAO followed by 1 h incubation with 20 kDa AO-PEG. The results indicate that sequential oxidation and incubation with AO-PEG are required for PEGylation. The light chain (bottom band) serves as a convenient internal control. Migration of the heavy chain to a single high molecular weight species in lane 4 indicates that the reaction is specific to a single location on that subunit.
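The PEGylation conditions reported in the Experimental Procedures (100 μM glycoprotein, 1 mM 20 kDa AO-PEG) correspond to a 10-fold molar excess of PEG. The sketch below makes that arithmetic explicit and converts between molar and mass concentrations. The function names are hypothetical, the 20 000 g/mol figure is a nominal value for a polydisperse PEG, and the Fab mass is the glycoprotein mass quoted in the Figure 4 caption.

```python
def molar_excess(reagent_uM: float, protein_uM: float) -> float:
    """Fold molar excess of reagent over protein."""
    return reagent_uM / protein_uM


def grams_per_litre(conc_uM: float, molar_mass: float) -> float:
    """Mass concentration (g/L) from a micromolar concentration and g/mol."""
    return conc_uM * 1e-6 * molar_mass


def micromolar(conc_g_per_L: float, molar_mass: float) -> float:
    """Micromolar concentration from a mass concentration (g/L) and g/mol."""
    return conc_g_per_L / molar_mass * 1e6


PEG_MOLAR_MASS = 20_000.0  # g/mol, nominal (assumption)
FAB_MOLAR_MASS = 48_524.0  # g/mol, glycosylated Fab mass from the Figure 4 caption

print(molar_excess(reagent_uM=1000.0, protein_uM=100.0))  # fold excess of AO-PEG
print(grams_per_litre(1000.0, PEG_MOLAR_MASS))            # 1 mM AO-PEG in g/L
print(round(micromolar(4.0, FAB_MOLAR_MASS), 1))          # 4 g/L Fab stock in uM
```

Under these assumptions the 4 g/L Fab stock is somewhat below the 100 μM used in the reaction.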
weight product after four hours. The conversion does not increase appreciably when the reaction is allowed to proceed for an additional sixteen hours, and the product is not hydrolyzed upon removal of the excess AO-PEG (data not shown), indicating a fast, irreversible reaction. This suggests that a reactive contaminant is present in the starting protein preparation, currently hypothesized to be due to carbonylation of a susceptible residue in the anti-TNF-α Fab. Indeed, UPLC analysis showed that approximately 5–10% of the purified Fab has a molecular weight 16 Da greater than the expected mass of the intact protein, consistent with the presence of an additional oxygen atom (data not shown). On the basis of the assumption of a reactive contaminant, the starting material was pretreated with AO-PEG and the high molecular weight product was removed by SEC-HPLC. The unmodified Fab was used as starting material for subsequent oxidation and PEGylation reactions. Nonoxidized control reactions run using this clean starting material resulted in