Article pubs.acs.org/jpr
An Extended Spectrum of Target Proteins and Modification Sites in the General O‑Linked Protein Glycosylation System in Neisseria gonorrhoeae
Jan Haug Anonsen,†,‡,§ Åshild Vik,†,‡ Wolfgang Egge-Jacobsen,§,† and Michael Koomey*,†,‡
†Department of Molecular Biosciences, ‡Center for Molecular Biology and Neuroscience, and §Glyconor Mass Spectrometry and Proteomics Unit, University of Oslo, 0316 Oslo, Norway
Supporting Information
ABSTRACT: The bacterial human pathogen Neisseria gonorrhoeae expresses a general O-linked protein glycosylation (Pgl) system known to target at least 12 membrane-associated proteins. To facilitate a better understanding of the mechanisms, significance and function of this glycosylation system, we sought to further delineate the target proteome of the Pgl system. To this end, we employed immunoaffinity enrichment of glycoproteins using a monoclonal antibody against the glycan moiety. Enzymatically generated peptides were subsequently analyzed by MS to identify glycopeptides and glycosylation sites. In this way, we increase the total number of known glycoproteins in N. gonorrhoeae to 19. These new glycoproteins are involved in a wide variety of extracytoplasmic functions. By employing collision fragmentation, we mapped nine new glycosylation sites, all of which were serine. No target sequon was readily apparent, although attachment sites were most often localized within regions of low sequence complexity. Moreover, we found that 5 of the proteins were modified with more than one glycan. This work thus confirms and extends earlier observations on the structural features of Neisseria glycoproteins.
KEYWORDS: post-translational modification, glycosylation, O-linked, diNAcBac, HCD, CID, site assignment
Received: June 29, 2012. Published: October 2, 2012
© 2012 American Chemical Society
dx.doi.org/10.1021/pr300584x | J. Proteome Res. 2012, 11, 5781−5793
■
INTRODUCTION
Bacterial species expressing general or broad-spectrum protein glycosylation systems are being encountered with increasing frequency. To date, the glycoproteins of such systems have been found to be localized in part or completely outside of the cytoplasm. The covalent addition of stable glycan tags presumably enables diverse proteins to be recognized and modified by conserved core processes influencing protein trafficking, interaction, folding, and turnover. Glycosylation substrate selection is determined by the presence of specific structural signals in target proteins as well as by colocalization with the glycosylation machinery and donor substrates.
Four well-defined, general protein glycosylation systems have been characterized to date, with perhaps the best understood being the N-linked glycosylation system of Campylobacter jejuni. Here, over 65 proteins have been shown to be modified with a unique heptasaccharide at asparagine residues contained within the sequon (D/E)YNX(S/T) (where Y and X do not include proline).1,2 Notably, the C. jejuni oligosaccharyltransferase PglB is highly related in structure and membrane topology to the catalytically active component STT3 of the N-linked eukaryotic oligosaccharyltransferase complex.2 In contrast, the three other well-defined systems all involve O-linked glycosylation but otherwise vary substantially in their structural frameworks. A system defined in Mycobacterium tuberculosis (but likely present in related species and other members of the Actinobacteria) involves the attachment of mannose-containing oligosaccharides to serine and threonine residues in multiple lipoproteins.3 Analogous to the case of the C. jejuni system, the oligosaccharyltransferase involved is structurally related to its eukaryotic counterpart, the protein mannosyltransferases.4 Sites of mannose attachment are not associated with a specific sequon but rather are restricted to unstructured, surface-exposed segments rich in proline (along with serine and threonine). Subsequently, an O-linked system has also been defined in the gut symbiont Bacteroides fragilis. This system involves at least 20 diverse proteins modified with fucose-containing oligosaccharides at serine and threonine residues with the specific three-amino acid motif D(S/T)(A/I/L/M/T/V).5 A system highly related to this has been recently identified in the periodontal pathogen Tannerella forsythia.6 However, the identity of the putative oligosaccharyltransferase has yet to be revealed in either system.
A fourth well-described O-linked glycosylation system is that found in the Neisserial species. This broad-spectrum system has been most fully delineated in N. gonorrhoeae, the agent of the sexually transmitted disease gonorrhea.7 Like the situation in other Gram-negative systems, it entails synthesis of a glycan chain on the lipid carrier undecaprenyl pyrophosphate (Und-PP) at the cytoplasmic face of the inner membrane.8,9 This Und-PP-glycan is then flipped into the periplasm and transferred en bloc to various extracytoplasmic proteins by the action of oligosaccharyltransferases. The N. gonorrhoeae system shows remarkable similarity to the C. jejuni system in the use of related biosynthetic pathways for generating undecaprenyl pyrophosphate-linked N,N′-diacetylbacillosamine (diNAcBac) glycoforms. Currently, 12 distinct glycoproteins have been defined in the N. gonorrhoeae system that all share the property of being tethered to the membrane as lipoproteins or via transmembrane domains.7 The most abundant glycoprotein is the pilin protein PilE, which transiently resides in the inner membrane but can be externalized to the cell surface by virtue of its assembly into surface pili. Most of the glycoproteins contain low complexity regions that encompass serine sites of glycan occupancy.7 These domains are biased in composition toward alanine, proline and serine and thus bear some resemblance to the attachment sites seen in the M. tuberculosis O-mannosylation system.3 Thus, the broad scope of these systems is dictated by the relaxed specificity of their respective transferases as well as the bulk properties and context of the protein-targeting signal rather than by a strict amino acid consensus sequence. Systems similar to, and in some cases identical to, that detailed in N. gonorrhoeae strains have been documented in strains of Neisseria meningitidis, the etiologic agent of epidemic meningitis, and Neisseria lactamica, an oropharyngeal commensal of young children.10
Many efforts have been focused on identifying the complete repertoire of glycoproteins, or glycoproteome, as a means to better understand the mechanisms, significance and function of protein glycosylation in these systems. It has also been of interest to identify specific glycopeptides and ultimately specific sites of glycan attachment. Original strategies for comprehensive analyses of C. jejuni and B. fragilis glycoproteomes have relied heavily on standard cell fractionation, lectin-based detection and affinity purification followed by gel-based, bottom-up mass spectrometric methodologies.5,11 Subsequently, broader coverage has been obtained through more refined chromatographic techniques enabling selective enrichment of glycopeptides followed by MS fragmentation regimens. In the specific case of C. jejuni, the efficacy of targeted enrichment for glycopeptides was likely facilitated by the hydrophilic nature of the heptasaccharide glycoform.12 As both the C. jejuni and B. fragilis systems utilize sites of occupancy that can be defined by specific motifs, the extent of their glycoproteomes can be estimated bioinformatically by searching for the presence of the motifs in the subset of proteins trafficked to the periplasm and beyond.13
A number of factors confound similar attempts at characterizing the N. gonorrhoeae glycoproteome. First, lectins with reactivity for the protein-associated glycans (in any of their forms) have yet to be identified. Second, occupancy sites in the neisserial system are associated with quasi-related domains bearing signatures of low complexity that make their prediction in silico problematic. Finally, although their oligosaccharides are structurally related to those in the C. jejuni system, they are maximally three sugars in length.14 As such, the associated glycopeptides are less amenable to the standard enrichment procedures.
In light of these circumstances, we here detail an alternative solution to further delineate the glycoproteome of N. gonorrhoeae strain N400 employing immunoaffinity-based glycoprotein capture combined with gel-based, shot-gun MS methodology.
■
MATERIALS AND METHODS
Bacterial Strains and Culture Condition
Bacterial strains used in this study are described in Table S1 (Supporting Information) and were grown on conventional GC medium as described previously.8 Carboxy-terminal 6xHis-tagged alleles of ngo0360, ngo1364, ngo1365, ngo0983, ngo1440 and ngo1769 were generated as previously described7 by use of primers in Table S2 (Supporting Information). The mutant alleles pglO::kan (ngo0096, resulting in strains lacking the oligosaccharyltransferase responsible for glycan attachment to proteins) and pglC::kan (ngo0084, resulting in strains lacking lipid-linked glycan) were introduced into various strain backgrounds by transformation with genomic DNA from strains as described7 and selecting for kanamycin-resistant transformants.
Whole Cell Lysate
N. gonorrhoeae cells were harvested from 20 plates incubated for 18−20 h at 37 °C, resuspended in 10 mL of buffer A (PBS pH 7.8, 0.1% Triton X-100) with protease inhibitors (1× Complete protease inhibitor mixture EDTA-free (Roche) and 1 mM PMSF), and lysed by sonication. The cell suspensions were centrifuged at 20000g for 20 min at 4 °C to remove debris and unbroken cells, and the supernatants were kept as whole cell lysates.
Immunoprecipitation
For immunoprecipitation, 200 μL of protein G-sepharose (2 mg/mL, Sigma Aldrich) was washed 3 times with 200 μL of buffer A with centrifugation at 500g at 4 °C for 5 min between washes. A 100 μL volume of npg1 (1 mg/mL; a mAb recognizing a diNAcBac monosaccharide-associated epitope10) was diluted with 800 μL of buffer A and then added to the protein G-sepharose pellet, whereupon the suspension was incubated with rotation overnight at 4 °C. After incubation, the protein G-sepharose mAb suspension was centrifuged at 500g at 4 °C for 5 min, the pellet was washed three times in buffer A, and the supernatant was removed. Five milliliters of whole cell lysate was added to the protein G-sepharose bound mAb, and the sample was incubated with rotation at 4 °C overnight. The sample was subsequently centrifuged at 500g for 5 min at 4 °C, and the supernatant was removed. The sample was then washed three times with 1 mL of buffer A containing 1 M NaCl and then three times with buffer A. To elute the immunoprecipitated proteins, 50 μL of 100 mM glycine buffer (pH 2.4) was added to the sample, incubated with rotation for 30 min at 4 °C, and centrifuged at 500g at 4 °C for 5 min, and 40 μL of eluate was removed and kept. The glycine buffer elution step was repeated three times, and the eluates were pooled to form the immunoprecipitation sample. Samples were immediately run on a gel or neutralized with 50 μL of 1 M Tris pH 8.0 and stored at −80 °C.
Protein Purification
Hexa-histidine (6xHis)-tagged proteins were purified using Ni-NTA beads (Qiagen) as described previously7 except without the inclusion of the proteinase inhibitor PMSF.
SDS-PAGE and Immunoblotting
Procedures for SDS-PAGE and immunoblotting were as previously described.15 6xHis-tagged proteins were detected by using a tetra-His mouse monoclonal antibody at 1:1000 dilution (Qiagen). For glycan detection, a 1:10000 dilution of the npg2 rabbit monoclonal antibody recognizing a diNAcBacGal-associated epitope was employed.10 Secondary antibodies were alkaline phosphatase-conjugated goat antirabbit antibodies (Qiagen) for the npg monoclonal antibodies or goat antimouse antibodies (Sigma) for the tetra-His antibodies, both at 1:2000 dilutions.
In-Gel Protein Digestion
Coomassie stained gel slices containing either immunoprecipitated or purified proteins were washed and destained as previously described.16 Alkylation, reduction and digestion steps with either trypsin (Sigma) or AspN (Roche) at 37 °C overnight were performed as described previously.16 Dried samples were frozen at −80 °C and redissolved in 0.1% formic acid prior to liquid chromatographic tandem mass spectrometric (LC−MS2) analyses.
Reverse Phase LC−MS2 Analysis of Proteolytic Peptides
Nanoflow LC−MS and MS2 analyses (nano-LC−MS2) of proteolytic peptides were performed using an Agilent 1200 series capillary HPLC system with a corresponding autosampler, column heater, and integrated switching valve coupled via a nanoelectrospray ion source to a LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany). Five microliters of peptide sample was injected onto the extraction column (Zorbax 300 SB-C18 of 5 μm particle size, 5 by 0.3 mm (Agilent)), and the sample was washed with a mobile phase of 0.1% formic acid and 3% acetonitrile by the capillary pump with a flow rate of 4 μL min−1. The peptides were eluted from the extraction column in the back-flush mode onto the C18 reverse phase column (0.075 × 150 mm, GlycproSIL C18−80 Å, Glycpromass, Stove, Germany). The mobile phase consisted of acetonitrile and MS grade water, both containing 0.1% formic acid. LC separation was achieved with a gradient from 5 to 55% of acetonitrile with 0.1% formic acid over 120 min for immunoprecipitation samples or 60 min for purified proteins employing a flow rate of 0.2 μL min−1. Nanospray ionization was achieved by applying a 1.2 kV voltage between an 8 μm diameter emitter (PicoTip Emitter, New Objective, Woburn, MA) and the capillary entrance of the Orbitrap. Mass spectra were acquired in the positive-ion mode applying a data-dependent automatic switch between survey scan and MS2 acquisition on the Thermo Scientific LTQ OrbitrapXL mass spectrometer operated by Xcalibur 2.0. For immunoprecipitated samples, the three most intense ions from one full survey scan at a resolution of 30000 at m/z 400 were fragmented by alternating higher-energy C-trap dissociation (HCD) with the first m/z fixed at 100 and collision induced dissociation (CID) with the first m/z at 350, both with R = 7500 at m/z 400. Alternatively, purified proteins were fragmented in the HCD-only mode. All Orbitrap analyses were performed with the lock mass option (lock mass set at m/z 445.120024; ref 17) for internal calibration. The ion selection threshold was 500 counts, and selected fragment ions were dynamically excluded for 180 s.
MS Data Analysis
Mass spectrometric data were analyzed with the in-house generated N. gonorrhoeae FA1090 and N. gonorrhoeae NCCP11945 protein databases using Proteome Discoverer (2.0) software with the SEQUEST search engine (Thermo Scientific). The mass tolerances of parent ions for immunoprecipitation samples were set at 10 ppm, fragment ions at 0.05 Da and the false discovery rate (FDR) at 0.03. Methionine oxidation was set as a variable modification, and cysteine carbamidomethylation was selected as a fixed modification. The N. gonorrhoeae monosaccharide, diNAcBac, with a mass of 228.110 Da, was set as a variable modification. Trypsin was set as the enzyme. Samples from a pglA background were also searched for diNAcBac modifications using the BioworksBrowser with parameters similar to those used for Proteome Discoverer. Glycopeptide MS2 spectra were viewed and manually verified using Xcalibur V2.0.7 (Thermo Scientific) for the presence of the diNAcBac reporter ions at m/z 229.118, at m/z 211.107 and at m/z 169.096.18
Bioinformatic Analysis
N. gonorrhoeae gene and protein data were obtained from the DBGET Web site (http://www.genome.jp/dbget/) operated by the Kyoto University Bioinformatics Center. Determination of Pfam protein domains was done through the NCBI conserved domain database search (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) with the low complexity filter turned off.19−21 Predicted protein localizations were obtained with PsortB V3.0 (http://www.psort.org/psortb/).22 Predictions for transmembrane domains were done using the TMHMM V2.0 server (http://www.cbs.dtu.dk/services/TMHMM/) operated by the Center for Biological Sequence analysis, Technical University of Denmark. Signal peptides were predicted using SignalP V4.0 (http://www.cbs.dtu.dk/services/SignalP/) operated by the Center for Biological Sequence analysis, Technical University of Denmark.23 The sequence alignment and relative abundance of amino acids were done using WebLogo V2.8.2 (http://weblogo.berkeley.edu/logo.cgi).24,25
■
RESULTS
Evidence for an Extended Range of Glycoproteins in N. gonorrhoeae
Sample complexity remains a significant obstacle to effective proteomic efforts. To facilitate finding new targets of the general O-linked glycosylation system in N. gonorrhoeae, we developed an immunoprecipitation strategy coupled with a mass spectrometry-based glycopeptide identification analysis. N. gonorrhoeae modifies proteins with various sugar residues and modifies these sugars with varying numbers of acetyl groups.8,10 To minimize potential problems related to issues of microheterogeneity, we opted to use a genetically defined strain producing only the diNAcBac monosaccharide. As a control, we carried out parallel processing of an extract from an isogenic pglC strain defective in glycan synthesis.8 Both samples were subjected to immunoprecipitation using the npg1 monoclonal antibody recognizing a diNAcBac-associated epitope.10 Eluates were loaded on an SDS-PAGE gel, and entire lanes were cut into gel pieces and in-gel digested. The tryptic peptides were subsequently analyzed on an Orbitrap XL mass spectrometer employing alternating CID and HCD fragmentation.
Figure 1. Identification of immunoaffinity enriched, trypsin-derived glycopeptides by MS2. Combined CID/HCD spectra show reporter ions of diNAcBac at m/z 229.118, m/z 211.107 and at m/z 169.096. Superscript associated with protein designations denotes modification with either 1 (1) or 2 (2) glycans. Asterisk (*) denotes unmodified peptide. (A) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1051.85 derived from Ngo1365 modified with one glycan. (B) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1128.22 derived from Ngo1365 modified with two glycans. (C) MS2 spectrum of the [M + 3H]3+ precursor at m/z 848.74 derived from Ngo0094 modified with one glycan. (D) MS2 spectrum of the [M + 3H]3+ precursor at m/z 816.76 derived from Ngo0094 modified with one glycan. (E) MS2 spectrum of the [M + 4H]4+ precursor at m/z 1011.97 derived from Ngo1225 showing the presence of one glycan. (F) MS2 spectrum of the [M + 4H]4+ precursor at m/z 1068.99 derived from Ngo1225 showing the presence of two glycans. (G) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1160.19 derived from Ngo0360 with one glycan. (H) MS2 spectrum of the [M + 3H]3+ precursor at m/z 816.76 derived from Ngo1494 modified with one glycan.
To identify glycosylated peptides, spectra from both pglA and pglC samples were automatically searched against two different in-house generated N. gonorrhoeae protein databases, allowing for up to 4 dynamic diNAcBac modifications of 228.110 Da per peptide. Spectra that were flagged as positive by the software search were manually inspected for the presence of the three reporter oxonium ions of diNAcBac at m/z 229.118, m/z 211.107 and m/z 169.096.18 Peptide spectra containing all three reporter oxonium ions were considered evidence of glycosylated peptides. As expected, several peptide spectra from the pglA sample fulfilled these criteria, whereas none were seen in the pglC sample. We considered glycosylated peptides with peptide sequence overlap due to missed trypsin cleavage sites or salt adducts as a single unique glycopeptide. Using this method we found 16 unique glycopeptides, 10 of which were previously unidentified.
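The acceptance criterion described above (all three diNAcBac reporter ions present in the MS2 spectrum, with the precursor inside the search tolerance) can be sketched in a few lines of Python. This is only an illustration, not the authors' software workflow: the function names and the example peak lists are hypothetical, and applying the 0.05 Da fragment tolerance and the 10 ppm precursor tolerance from the search settings to this check is an assumption.

```python
# Reporter (oxonium) ions of diNAcBac quoted in the text (m/z, singly charged).
DINACBAC_REPORTERS = (229.118, 211.107, 169.096)


def has_all_reporters(peak_mzs, tol_da=0.05, reporters=DINACBAC_REPORTERS):
    """Return True only if every reporter ion has a peak within tol_da.

    tol_da defaults to the 0.05 Da fragment tolerance of the database
    search; using it for the reporter check is an assumption.
    """
    return all(any(abs(mz - r) <= tol_da for mz in peak_mzs) for r in reporters)


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the precursor lies within tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical peak lists, for illustration only.
glyco_spectrum = [169.097, 211.108, 229.118, 521.246, 799.390]
plain_spectrum = [175.119, 229.118, 521.246]

print(has_all_reporters(glyco_spectrum))  # all three reporters present
print(has_all_reporters(plain_spectrum))  # two reporters missing
```

Requiring all three ions, rather than any one of them, mirrors the conservative criterion stated in the text and reduces the chance that a single coincidental low-mass peak flags a spectrum.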
Among the previously unidentified glycopeptides detected in our analysis was the Ngo1365 tryptic peptide 384EWAPSENQAAAPQAGVQTASEAKPASEAK412. This peptide was identified with both one and two glycans (Figure 1A and B, respectively). In the case of Ngo0094, 2 nonoverlapping singly glycosylated peptides, 120VWIFINESDDTVSAPARPAVK140 and 149QQAAAPFTESVVSVSAPFSPAK170 (Figure 1C and D), were detected. The peptide 25EAAPASASEPAAASAAQGDTSSIGGTMQQASYAMGVDIGR64 from Ngo1225 was detected with both one and two glycans (Figure 1E and F). The identification of identical peptides with varying numbers of glycans was most likely not due to loss of the glycan during analysis, since peptides with varying glycans separate on the HPLC column prior to entering the MS, but rather showed that these glycopeptides can carry one and sometimes two glycans. Furthermore, the peptides 356MTAAENQEAEQAASEPANENTASEPAAASDVK387 from Ngo0360 (Figure 1G) and 23NAVQPQAGSAPAANAEAAATDTLNIYNWSNYVDESTVEDFKK64 from Ngo1494 (Figure 1H) were found with one glycan. Moreover, the large peptide 165VVSAKINQKDDSENYLVDHSSGAYLIDK192 from the glycoprotein Ngo1237 was detected with one glycan, and the peptide 151SGGSFPNPDEAAPADNAASGTASAPADSAAPAEAK185 from the glycoprotein Ngo1328 was detected with both one and two glycans. To obtain shorter glycopeptides and thereby narrow down the possible glycosylation sites, AspN digests of affinity purified hexa-histidine (6xHis) tagged proteins expressed in wild-type backgrounds carrying the N. gonorrhoeae disaccharide diNAcBac-AcGal (ref 14) were performed as previously described.7 MS2 analysis of these samples identified the peptide 165DNAASGTASAPA171 from Ngo1328 with one (previously shown; ref 7) and with two glycans and the peptide 182DHSSGAYLI190 from Ngo1237 with one glycan (Figure 2A and B).
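As a rough consistency check on glycoforms of the same peptide, the spacing between two precursors of equal charge should correspond to a whole number of diNAcBac units (228.110 Da each). The sketch below uses the rounded Ngo1225 precursor values from the Figure 1 caption (panels E and F); the function names and the 0.1 Da tolerance, chosen to absorb the rounding of the caption values, are assumptions made for illustration.

```python
DINACBAC_MASS = 228.110  # Da added per diNAcBac, as stated in the text


def glycoform_mass_difference(mz_low, mz_high, charge):
    """Neutral-mass difference between two precursors of the same charge."""
    return (mz_high - mz_low) * charge


def glycans_added(mz_low, mz_high, charge, tol_da=0.1):
    """Number of extra diNAcBac units explaining the spacing, or None.

    tol_da is a hypothetical tolerance for m/z values rounded to two decimals.
    """
    delta = glycoform_mass_difference(mz_low, mz_high, charge)
    n = round(delta / DINACBAC_MASS)
    if n >= 1 and abs(delta - n * DINACBAC_MASS) <= tol_da:
        return n
    return None


# Ngo1225 [M + 4H]4+ precursors quoted in the Figure 1 caption (E and F).
print(glycans_added(1011.97, 1068.99, charge=4))
```

For these values the spacing is about 228.08 Da, which is compatible with one additional diNAcBac on the doubly modified form.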
Together, this method therefore allowed the detection and identification of 18 glycopeptides: 8 glycopeptides previously identified from the glycoproteins Ngo1276, Ngo1328, Ngo1717, Ngo2139, Ngo1043, and Ngo1371 (ref 7) and 10 novel glycopeptides. These 10 new glycopeptides belonged to 4 known (Ngo1225, Ngo1494, Ngo1328, and Ngo1237) and 3 new glycoproteins (Ngo1365, Ngo0094, and Ngo0360).
Determination of Glycosylation Sites by MS2 Analysis
We next sought to map the precise glycosylation site on the 18 identified glycopeptides detected in the LC−MS run of the immunoprecipitated samples. The general protein glycosylation system in N. gonorrhoeae is an O-linked glycosylation system7,8 and hence attaches glycans to hydroxyl-containing amino acids, predominantly threonine or serine. Glycopeptides were identified from the pglA samples by the presence of the three diNAcBac reporter oxonium ions at m/z 229.118, m/z 211.107 and m/z 169.096 in the low mass area of the fragmentation spectra. The corresponding peptide sequence was identified through a software search against the in-house generated Neisseria protein database, and the number of glycans was determined by the mass shift from the theoretical peptide mass of the MS1 precursor. We then searched combined CID/HCD fragmentation spectra of this peptide for fragment ions carrying the glycan. A fragment ion was considered to bear a glycan if we detected an ion with a mass identical, within 0.02 Da, to the theoretical mass of the fragment ion plus one glycan (228.110 Da). To map the glycosylation sites, we subsequently identified these glycan-carrying fragment ions. Because we digested with trypsin, y-fragment ions carrying glycan were commonly more abundant than b-fragment ions carrying glycan. To map the glycan modifications, we therefore primarily used the glycan-carrying y-ions, and used the b-ions carrying glycan as well as abundant unmodified fragment ions to reduce the number of possible modification sites. Manual investigation of fragment ions therefore allowed the identification of glycosylation sites either indirectly, as the glycan could be mapped within an amino acid sequence containing only a single hydroxyl-bearing amino acid, or directly, as the glycan could be mapped to a specific hydroxyl-bearing amino acid.
Figure 2. Identification of immunoaffinity enriched, AspN derived glycopeptides by MS2. HCD spectra show reporter ions of AcHex (at m/z 433.181) and diNAcBac (at m/z 229.118, m/z 211.107 and at m/z 169.096). Superscript associated with protein designations denotes modification with either 1 (1) or 2 (2) glycans. Asterisk (*) denotes unmodified peptide. (A) MS2 spectrum of the [M + 2H]2+ precursor at m/z 959.90 derived from Ngo1328 modified with two glycans. (B) MS2 spectrum of the [M + 2H]2+ precursor at m/z 678.81 derived from Ngo1237 modified with one glycan.
For instance, we identified one glycosylation site on the previously identified glycoprotein Ngo2139. The diNAcBac reporter oxonium ions at m/z 229.118, m/z 211.108 and m/z 169.097 could be detected in the low mass range of the MS2 spectrum from the precursor ion at m/z 624.30. The triply charged precursor ion at m/z 624.30 (giving an observed mass of 1870.88 Da [M + H]+) corresponded to the peptide 76DSAPAASAAAPSADNGAAK109 with a single diNAcBac modification (Figure 3A). By de novo sequencing of the peptide and analysis of b- and y-ions, we detected a doubly charged ion at m/z 799.390 (giving an observed mass of 1597.773 Da [M + H]+) corresponding to the y16 ion (theoretical mass 1369.670 Da [M + H]+) with one diNAcBac modification. This demonstrated that the glycan was attached within the sequence 79PAASAAAPSADNGAAK109. Although abundant ions at y10 and y9 were detected, no ion corresponding to peptides with attached glycan could be detected C-terminally of y10 (i.e., within the peptide 84APSADNGAAK109). Moreover, the abundant doubly charged ion detected at m/z 521.246 corresponded to b10 with a diNAcBac modification, demonstrating that the glycan was attached N-terminally of b10 and therefore within the sequence 76DSAPAASAAA85. The only hydroxyl-containing amino acid contained within both sequences corresponding to the y16 ion and the b10 ion, i.e., within the sequence 79PAASAAA85, is Ser82. Ngo2139 was therefore modified with a diNAcBac at Ser82. Employing a similar reasoning, we investigated fragment ions from MS2 spectra of the other glycopeptides (Figure 3 and Table 1). In this manner 10 glycosylation sites were identified, of which 9 were previously unidentified.
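The Ngo2139 reasoning above can be retraced numerically. The following Python sketch is illustrative only (the helper names are hypothetical and this is not the authors' procedure): it converts the quoted m/z values to [M + H]+ masses, applies the 0.02 Da criterion for a glycan-carrying fragment, and intersects the residues covered by the glycosylated y16 and b10 ions to list the candidate hydroxyl-bearing residues. The proton mass of 1.007276 Da is a standard constant rather than a value given in the text.

```python
PROTON = 1.007276   # Da, standard proton mass (not stated in the text)
DINACBAC = 228.110  # Da, diNAcBac mass from the text
HYDROXYL = set("ST")  # O-linked candidates: serine and threonine


def mz_to_mh(mz, z):
    """Convert an m/z value at charge z to the singly protonated mass [M + H]+."""
    return mz * z - (z - 1) * PROTON


def carries_glycan(observed_mh, unmodified_mh, n_glycans=1, tol_da=0.02):
    """True if the observed fragment equals the unmodified fragment plus n glycans."""
    return abs(observed_mh - (unmodified_mh + n_glycans * DINACBAC)) <= tol_da


def candidate_sites(peptide, start, y_n, b_n):
    """Ser/Thr residues covered by both a glycosylated y_n ion and b_n ion.

    start is the protein position of the first peptide residue.
    """
    first = len(peptide) - y_n  # 0-based index of the first residue in the y ion
    last = b_n - 1              # 0-based index of the last residue in the b ion
    return [(peptide[i], start + i)
            for i in range(first, last + 1) if peptide[i] in HYDROXYL]


peptide = "DSAPAASAAAPSADNGAAK"      # Ngo2139 tryptic peptide, first residue 76
precursor_mh = mz_to_mh(624.30, 3)   # close to the quoted 1870.88 Da
y16_mh = mz_to_mh(799.390, 2)        # close to the quoted 1597.773 Da

print(round(precursor_mh, 3), round(y16_mh, 3))
print(carries_glycan(y16_mh, 1369.670))       # y16 plus one diNAcBac
print(candidate_sites(peptide, 76, 16, 10))   # residues shared by y16 and b10
```

With these inputs the only shared hydroxyl-bearing residue is the serine at position 82, in agreement with the assignment in the text.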
Figure 3. Identification of glycan occupancy sites from glycopeptides using CID/HCD. Reporter ions of diNAcBac at m/z 229.118, m/z 211.107 and at m/z 169.096 are shown. Superscript associated with protein designations denotes modification with either 1 (1) or 2 (2) glycans. Asterisks denote unmodified peptides. Numbers in closed brackets denote relative abundance of fragment ions. (A) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1127.55 derived from Ngo1365 modified with two glycans. Presence of glycosylated y6 identified a glycosite at Ser409 of peptide 384EWAPSENQAAAPQAGVQTASEAKPASEAK412. (B) MS2 spectrum of the [M + 3H]3+ precursor at m/z 848.45 derived from Ngo0094 modified with one glycan. A glycosite was identified at Ser132 on the peptide 120VWIFINESDDTVSAPARPAVK140 based on the presence of glycosylated y9. (C) MS2 spectrum of the [M + 3H]3+ precursor at m/z 816.42 derived from Ngo0094 modified with one glycan. Peptide 149QQAAAPFTESVVSVSAPFSPAK170 was glycosylated at Ser163. (D) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1159.52 derived from Ngo0360 modified with one glycan. The peptide 356MTAAENQEAEQAASEPANENTASEPAAASDVK387 was glycosylated at Ser378. (E) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1078.52 derived from Ngo1276 modified with one glycan. The peptide 358LSDTAYAGSGAASAPAASAPAASAPAASASEK389 was modified at Ser380. (F) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1429.74 derived from Ngo1717 showing the presence of one glycan. The peptide 25VQTSVPADSAPAASAAAAPAGLVEGQNYTVLANPIPQQQAGK66 was modified with a diNAcBac at Ser38. (G) MS2 spectrum of the [M + 3H]3+ precursor at m/z 1088.19 derived from Ngo1043 modified with two glycans. One glycosite was identified at Ser49 on peptide 27EAAQAVESDVKDTAASAAESAASAVEEAK55. (H) MS2 spectrum of the [M + 4H]4+ precursor at m/z 1011.47 derived from Ngo1225 showing the presence of one glycan. The glycosite was identified at Ser30 on peptide 25EAAPASASEPAAASAAQGDTSSIGGTMQQASYAMGVDIGR64. (I) MS2 spectrum of the [M + 3H]3+ precursor at m/z 624.30 derived from Ngo2139 modified with one glycan. The glycosite was identified at Ser82 on peptide 76DSAPAASAAAPSADNGAAK109. (J) MS2 spectrum of the [M + 4H]4+ precursor at m/z 831.68 derived from Ngo1371 modified with one glycan. The peptide 315KAEPAPAAEPAPSAPAEAAQAASEAKPAAAEPK347 was modified with a diNAcBac at Ser337.
Identification of Potential Glycoproteins
The IP-LCMS method is limited by the ability to generate and detect identifiable glycopeptides. To find potential glycoproteins not picked up by the searches described above, we identified all proteins within the immunoprecipitated eluates from pglA and pglC strains by MS. A list of pglA specific proteins was then generated by excluding all proteins also found in the pglC sample. Identified peptides are presented with nonredundant sequence coverage; i.e., sequence-identical peptides are presented as a single unique peptide. Only proteins identified with at least two unique peptides were included in the list. The N. gonorrhoeae oligosaccharide transferase, PglO, is predicted to operate in the periplasm;8 we therefore chose to omit proteins from the list that were not predicted to be localized to the periplasm by virtue of their lacking either a predicted signal peptide or a transmembrane domain. The generated list is shown in Table 2. Glycopeptides identified by MS analysis are shown both with amino acid sequence and peptide position relative to the unprocessed N-terminus as well as the number of glycan modifications identified on the peptides. Moreover, the site of modification identified in this work is shown in bold with underline. A search of MS2 spectra of peptides derived from the pglC background failed to reveal the presence of the three reporter oxonium ions for the diNAcBac glycan. Moreover, none of the previously identified glycosylated proteins were detected in the glycan negative samples. As seen in Table 2, all previously described gonococcal glycoproteins7 appear in the pglA specific list and, with the exception of Ngo0994, were represented by many peptides. Moreover, the protein sequence coverage of glycoproteins from identified peptides was relatively high, including for Ngo0994. The same was also true for the three new glycoproteins identified (Ngo1365, Ngo0094 and Ngo0360). In addition to the proteins for which the glycosylation status was already established, several other proteins appeared in the pglA specific list identified with a large number of peptides and high protein sequence coverage.
Verification of New Glycoproteins
We opted to further assess directly the glycosylation status of new proteins that were highly represented at the peptide level and exhibited features commonly associated with glycosylation. These included Ngo0360, Ngo1363, Ngo1364, Ngo1365, Ngo1440, and Ngo1769. The proteins were engineered to carry a 6xHis extension at their C-termini and subsequently affinity purified from wild-type (carrying diNAcBac-AcGal), and either pglC (glycan negative) or pglO (glycan negative) backgrounds.7 As seen in Figure 4A, Ngo0360, Ngo1364, Ngo1365, Ngo1440, and Ngo1769 were glycosylated as they reacted with the glycan-specific npg2 antibody in a pgl-dependent manner, while Ngo1363 did not (data not shown). This analysis therefore confirmed the glycosylation status of Ngo0360 and Ngo1365
dx.doi.org/10.1021/pr300584x | J. Proteome Res. 2012, 11, 5781−5793
Table 1. Summary of the Most Informative b- and y-Ions Utilized for Identification of O-Linked Glycosylation Sites

| protein (a) | protein name (a) | precursor mass [charge] (b) | peptide sequence (c) | # of glycans (d) | ion with diNAcBac (m/z [charge]) (e) | corresponding amino acid sequence with diNAcBac (f) | ion without diNAcBac (m/z [charge]) (g) | corresponding amino acid sequence without diNAcBac (h) | glycosylation site (i) |
|---|---|---|---|---|---|---|---|---|---|
| Ngo2139 | GNA1946 | 624.30 [3+] | 76DSAPAASAAAPSADNGAAK109 | 1 | b10 (521.246 [2+]); y16 (799.390 [2+]) | 76DSAPAASAAA85; 79PAASAAAPSADNGAAK109 | y9 (415.703 [2+]); y10 (901.437 [1+]) | 85PSADNGAAK109; 84APSADNGAAK109 | Ser82 |
| Ngo1365 | MtrC | 1127.55 [3+] | 384EWAPSENQAAAPQAGVQTASEAKPASEAK412 | 2 | y18 (1113.575 [2+]); y13 (887.455 [2+]); y6 (830.429 [1+]) | 395PQAGVQTASEAKPASEAK412; 400QTASEAKPASEAK412; 407PASEAK412 | | | Ser409 (second glycan at either Ser403 or Thr401) |
| Ngo0094 | PilQ | 848.45 [3+] | 120VWIFINESDDTVSAPARPAVK140 | 1 | y9 (562.827 [2+]) | 132SAPARPAVK140 | | | Ser132 |
| Ngo0094 | PilQ | 816.42 [3+] | 149QQAAAPFTESVVSVSAPFSPAK170 | 1 | y8 (1032.541 [2+]) | 163SAPFSPAK170 | y6 (646.358 [1+]); y4 (402.237 [1+]) | 165PFSPAK170; 167SPAK170 | Ser163 |
| Ngo0360 | HemX | 1159.52 [3+] | 356M(ox)TAAENQEAEQAASEPANENTASEPAAASDVK387 | 1 | y17 (950.451 [2+]); y10 13C (1203.593 [1+]) | 371PANENTASEPAAASDVK387; 378SEPAAASDVK387 | y9 (887.450 [1+]); y8 (758.404 [1+]) | 379EPAAASDVK387; 380PAAASDVK387 | Ser378 |
| Ngo1276 | AniA | 1078.52 [3+] | 358LSDTAYAGSGAASAPAASAPAASAPAASASEK389 | 2 | y13 (1385.688 [1+]); y10 (1146.583 [1+]) | 377PAASAPAASASEK389; 380SAPAASASEK389 | y8 (760.386 [1+]); y5 (521.258 [1+]) | 382PAASASEK389; 385SASEK389 | Ser380 (second glycan undetermined) |
| Ngo1717 | DsbA | 1429.74 [3+] | 25VQTSVPADSAPAASAAAAPAGLVEGQNYTVLANPIPQQQAGK66 | 1 | b18 (1794.901 [1+]); y32 (1666.354 [2+]) | 25VQTSVPADSAPAASAAAA42; 35PAASAAAAPAGLVEGQNYTVLANPIPQQQAGK66 | | | Ser38 |
| Ngo1043 | unknown | 1088.19 [3+] | 27EAAQAVESDVKDTAASAAESAASAVEEAK55 | 2 | y8 (1032.534 [1+]); y7 (961.488 [1+]) | 48ASAVEEAK55; 49SAVEEAK55 | y4 (476.235 [1+]) | 51EEAK55 | Ser49 (second glycan undetermined) |
| Ngo1225 | PPI | 1011.47 [4+] | 25EAAPASASEPAAASAAQGDTSSIGGTM(ox)QQASYAM(ox)GVDIGR64 | 1 | y35-H2O (1793.301 [2+]) | 30SASEPAAASAAQGDTSSIGGTM(ox)QQASYAM(ox)GVDIGR64 | y33-H2O (1600.194 [2+]) | 32SEPAAASAAQGDTSSIGGTM(ox)QQASYAM(ox)GVDIGR64 | Ser30 |
| Ngo1371 | CcoP | 831.68 [4+] | 315KAEPAPAAEPAPSAPAEAAQAASEAKPAAAEPK347 | 1 | y19 (1018.521 [2+]) | 329PAEAAQAASEAKPAAAEPK347 | | | Ser337 |

a“Protein” and “protein name” are derived from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. b“Precursor mass” (in Da) includes diNAcBac glycans with a mass of 228.110 Da. c“Peptide sequence” was derived from software searches and M(ox) denotes an oxidized methionine. d“# of glycans” denotes the number of diNAcBac monosaccharides (228.110 Da) the precursor mass differs from the theoretical peptide mass. e“Ion with diNAcBac” specifies the informative b- and y-ions with a glycan attached (with m/z and charge) detected in the MS2 spectrum of the glycopeptide, and 13C is the carbon-13 stable isotope of carbon generating an M + 1 Da peak. f“Corresponding amino acid sequence with diNAcBac” represents the amino acid sequence shown to be glycosylated on the basis of the informative b- and y-ions. g“Ion without diNAcBac” specifies the most informative y-ions (with m/z and charge) determined to only be present as unmodified in the MS2 spectrum of the glycopeptide. h“Corresponding amino acid sequence without diNAcBac” represents the corresponding amino acid sequence shown not to be glycosylated. iThe “glycosylation site” is presented as modified serines numbered according to unprocessed protein.
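The site assignments in Table 1 rest on a bracketing argument: the shortest y-ion that still carries diNAcBac and the longest y-ion observed only in unmodified form delimit the residues that can hold the glycan. The sketch below illustrates that logic under simplifying assumptions (one glycan considered at a time, only Ser and Thr treated as candidate acceptors, standard monoisotopic masses). The function names are hypothetical and this is not the authors' software.

```python
# Monoisotopic residue masses (Da); standard values, not taken from the paper.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
DINACBAC = 228.110  # diNAcBac mass quoted in the paper


def y_ion_mz(peptide, n, charge=1, glycans=0):
    """Theoretical m/z of the y_n ion (last n residues), optionally glycosylated."""
    fragment = peptide[-n:]
    mass = sum(RESIDUE_MASS[aa] for aa in fragment) + WATER + glycans * DINACBAC
    return (mass + charge * PROTON) / charge


def bracket_site(peptide, start, glyco_y, bare_y=()):
    """Ser/Thr residues inside the shortest glycosylated y-ion but outside
    the longest unmodified y-ion.

    start is the residue number of the first amino acid of the peptide;
    glyco_y and bare_y hold the n values of the observed y-ions.
    """
    length = len(peptide)
    shortest_glyco = min(glyco_y)
    longest_bare = max(bare_y) if bare_y else 0
    # y_n covers 0-based indices length - n .. length - 1
    window = range(length - shortest_glyco, length - longest_bare)
    return [(peptide[i], start + i) for i in window if peptide[i] in "ST"]


# Ngo0094: glycosylated y8, unmodified y6 and y4 (values from Table 1)
print(bracket_site("QQAAAPFTESVVSVSAPFSPAK", 149, glyco_y=[8], bare_y=[6, 4]))
# Ngo1365: glycosylated y6 is the shortest glycan-carrying ion
print(bracket_site("EWAPSENQAAAPQAGVQTASEAKPASEAK", 384, glyco_y=[18, 13, 6]))
print(round(y_ion_mz("EWAPSENQAAAPQAGVQTASEAKPASEAK", 6, glycans=1), 3))
```

For peptides carrying two glycans, this sketch only localizes the glycan nearest the C-terminus; placing the second one would require counting glycan mass shifts on the longer fragments.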
Table 2. Proteins Selectively Enriched Following Immunoaffinity Purificationj
a“Rank order” is based on the number of unique peptides identified (see e). b,c,d“Accession NGO#”, “protein names”, and “protein descriptions” were derived from NCBI/KEGG databases. e“Σ#peptides” denotes the number of unique peptides identified. f“% coverage” denotes the percentage of the protein covered by unique peptides and is corrected for signal peptide processing. g“Glycopeptide sequence” was identified from MS2 analyses and is shown with residues numbered according to unprocessed protein. Serine occupancy sites are underlined and in bold. Glycopeptides identified from AspN digest of individually affinity purified proteins (+6His) are underlined. h“# of glycans” denotes the number of glycans identified on the peptides and is shown with 1 and 2. i“Glycosite” denotes the identified glycosylation site and is numbered according to the unprocessed protein. jProteins with light grey shading are glycoproteins identified earlier,7 while those with dark grey shading are glycoproteins identified in this work. The glycosylation status of unshaded proteins is unknown save for MtrE (#5), which was shown not to be glycosylated as evidenced by its nonreactivity with the npg2 mAb following His-tagging and affinity purification (data not shown).
and further identified Ngo1364, Ngo1440, and Ngo1769 as bona fide glycoproteins.
In Silico Glycoprotein Identification
Previous work has shown a correlation between gonococcal protein glycosylation and low complexity regions (LCR) rich in alanine, serine, and proline (ASP) as well as periplasmic localization and membrane association.7 In fact, the majority of glycoproteins are lipoproteins with glycosylation sites located close to the acylated cysteine residue. On this basis, we realized that some potential glycoproteins could easily be missed because of our experimental design and its dependence on MS as an identification method, as MS is particularly challenging when it comes to analyzing acylated peptides. Our experimental design would therefore be particularly biased against finding small lipoglycoproteins with few tryptic sites. In an attempt to obviate this risk, we performed an in silico search for such proteins and found that Ngo0983 (Lip) fulfilled all these criteria. It is a small periplasmic lipoprotein consisting of a varying number of AAEAP repeats and contains two serines and a threonine close to the acylation site. Moreover, Lip shows aberrant migration patterns on immunoblots of protein extracted from N. gonorrhoeae versus E. coli.26 Using the affinity tagging and immunoblotting approach, we found that Ngo0983 reacted with npg2 in a pgl-specific manner, confirming that it is a glycoprotein (Figure 4B).
Figure 4. Confirmation of glycosylation status by immunoblotting of affinity-purified glycoprotein candidates. (A) C-terminally 6xHis-tagged candidate glycoproteins were purified from wild type (wt) and pgl mutant backgrounds and tested for pgl-specific npg2 mAb reactivity. (B) Ngo0983-6xHis was purified and tested as in A.
Glycan Occupancy Site Features
As summarized in Figure 5 and Table 2, we observed the same strong association between glycan occupancy sites and ASP-rich LCRs observed earlier.7 Specifically, 11 of 12 novel glycopeptides identified here map within such LCRs, with the sole exception being that found for Ngo1237 (Figure 5A and 5B). Although we failed to identify a specific glycopeptide for Ngo0983, this same association applies there, as it is made up exclusively of AAEAP repeats and a single tripeptide of hydroxyl-bearing residues (SST) (Figure 5C).
The nine new glycosylation sites defined in this work, together with the three sites previously identified7 and the Ser63 glycosite on PilE,27 provide a broader data set with which to examine other features governing site occupancy. We therefore created an alignment of all known glycosites, centered on the glycosylation site and spanning five amino acids N-terminal and three amino acids C-terminal of it. The heights of the stacks represent sequence conservation, are shown in bits, and take into account the small sample size. As can be seen in Figure 5D, no clear sequon-like signals are found, although there is a clear tendency for alanine and other small amino acids to flank the targeted serine residue.
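The alignment and the bit-scaled stack heights can be approximated with the information-content measure used for sequence logos, including the small-sample correction e(n) = (s - 1)/(2 ln 2 · n) for an alphabet of s = 20 amino acids. The sketch below is only an illustration: it uses the ten sites listed in Table 1 rather than the full set aligned for Figure 5D, so it is not expected to reproduce that figure, and the function names and the clamping of negative values to zero are assumptions.

```python
import math
from collections import Counter

# (peptide, residue number of its first amino acid, glycosylated serine), from Table 1.
SITES = [
    ("EWAPSENQAAAPQAGVQTASEAKPASEAK", 384, 409),
    ("VWIFINESDDTVSAPARPAVK", 120, 132),
    ("QQAAAPFTESVVSVSAPFSPAK", 149, 163),
    ("MTAAENQEAEQAASEPANENTASEPAAASDVK", 356, 378),
    ("LSDTAYAGSGAASAPAASAPAASAPAASASEK", 358, 380),
    ("VQTSVPADSAPAASAAAAPAGLVEGQNYTVLANPIPQQQAGK", 25, 38),
    ("EAAQAVESDVKDTAASAAESAASAVEEAK", 27, 49),
    ("EAAPASASEPAAASAAQGDTSSIGGTMQQASYAMGVDIGR", 25, 30),
    ("DSAPAASAAAPSADNGAAK", 76, 82),
    ("KAEPAPAAEPAPSAPAEAAQAASEAKPAAAEPK", 315, 337),
]


def site_window(peptide, start, site, n_term=5, c_term=3):
    """Return the -n_term..+c_term window around the glycosylated residue."""
    i = site - start
    if i - n_term < 0 or i + c_term >= len(peptide):
        raise ValueError("window extends beyond the peptide")
    return peptide[i - n_term:i + c_term + 1]


def information_content(windows, alphabet_size=20):
    """Per-column information content in bits with small-sample correction."""
    n = len(windows)
    correction = (alphabet_size - 1) / (2 * math.log(2) * n)
    bits = []
    for column in zip(*windows):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        # Clamp at zero so that poorly sampled columns do not go negative.
        bits.append(max(0.0, math.log2(alphabet_size) - entropy - correction))
    return bits


windows = [site_window(*entry) for entry in SITES]
bits = information_content(windows)
for offset, value in zip(range(-5, 4), bits):
    print(f"{offset:+d}: {value:.2f} bits")
```

Position 0 is serine in every window, so it carries the maximum information available for ten sequences, while the flanking columns score much lower, in keeping with the absence of a clear sequon.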
■
DISCUSSION
Characterization of the glycoproteome within a biological system is critical to understanding the overall significance of protein glycosylation. Major challenges to glycoproteomic analyses include finding sensitive and reliable means for glycoprotein identification as well as reducing overall sample complexity. Current methodologies for enrichment of glycoproteins and glycopeptides include selective affinity purification utilizing glycan-specific antibodies, lectins, and chemoenzymatic labeling. Here, we combined an immunoaffinity-based strategy with downstream MS analyses to further our knowledge of the glycoproteome of the N. gonorrhoeae general glycosylation system. To our knowledge, this is only the second example in which glycan-targeted immunopurification has been utilized in glycoproteome characterization, following that performed on O-GlcNAc modifications within eukaryotes.28 In this manner, we significantly increased the depth of coverage and confirmed and extended our earlier results relating to the scope of glycoproteins and their associations based on structure, function, localization, and attachment sites. Specifically, we identified six new glycoproteins, five associated glycopeptides, and four corresponding attachment sites (serine in all cases). Like previously identified neisserial glycoproteins, all are predicted to be anchored on or within the membrane with significant domain segments oriented to the periplasm. The new glycoproteins, particularly those associated with well-defined functions, merit further discussion.
MtrD and MtrC (Ngo1364 and Ngo1365) are the inner-membrane transporter and the periplasmic membrane fusion protein components of the clinically relevant RND-type multidrug efflux pump.29 Mutations resulting in the upregulation of these two components together with MtrE (Ngo1363) are found in gonococcal strains displaying high-level antimicrobial resistance.30 The glycosylation status of MtrD is also noteworthy, as it represents the first inner membrane glycoprotein bearing multiple transmembrane spanning domains (Figure 5C). In addition, another new glycoprotein, MacA (Ngo1440), is structurally distinct from but functionally similar to MtrC in that it also serves as a membrane fusion protein in conjunction with an ABC-type efflux pump.31 While this could indicate a role for protein glycosylation in the function of this family of proteins, pgl null mutants do not display defects in efflux pump associated phenotypes.48 Ngo1769 is a cytochrome peroxidase with two c-type heme groups and is implicated in defense against reactive oxygen species.32 Together with two other c-type cytochromes (CycB and CcoP) and AniA, it represents the fourth glycoprotein to be directly involved in periplasmic electron transfer reactions. PilQ is an abundant outer membrane protein that in its dodecameric form is critical to the ability of type IV pili to reach the bacterial cell surface.33,34 Given this status, it represents the first β-barrel, outer membrane glycoprotein in this system. While the MS data for its glycosylation are unequivocal, it is quite remarkable that we have never observed PilQ reactivity with the npg1 mAb (or any other of the glycan-specific mAbs) in immunoblotting. We envision two explanations for these observations. One may be that the specific epitopes recognized by the mAbs are not present when the glycan is attached to PilQ. This possibility seems unlikely, as the mAb reacts with all other glycoproteins examined to date and since PilQ was found in the immunoaffinity-enriched sample. An alternative scenario is based on the observation that C-terminally truncated forms of PilQ have been observed in immunoblotting and the suggestion that these were derived proteolytically.35 As such, glycosylation of PilQ may only occur on its truncated forms. This idea is also supported by the atypical distribution of observed PilQ-derived peptides, which are restricted to the N-terminal 50% of the polypeptide. Finally, the glycoprotein Ngo0360 is annotated as HemX, whose gene is part of an operon encoding enzymes for the biosynthesis of uroporphyrinogen III. However, the exact role of HemX has not yet been determined.
Additionally, we identified three novel glycopeptides and five new attachment sites from previously known glycoproteins. Together with the data for the new glycoproteins, the findings confirmed the selective targeting to serine and the strong association of attachment sites with low complexity regions (LCR) rich in alanine, serine, and proline (ASP) residues. The earlier observed tendency for the localization of glycan-bearing LCRs near the N-termini of lipoproteins was also found here. Finally, we observed six new cases of multiple occupancy sites in single proteins in addition to the two examples from earlier.
Figure 5. Domain organization of N. gonorrhoeae glycoproteins and occupancy sites. (A) Glycoproteins for which occupancy sites were identified. (B) Glycoproteins for which glycopeptides were identified. (C) Glycoproteins identified for which neither occupancy sites nor glycopeptides have been determined. Residues are numbered according to the unprocessed protein. Yellow rectangles denote the location of glycopeptides. Asterisks (*) denote glycopeptides carrying two glycans. Serine occupancy sites are colored in red. A vertical red line denotes a lipoprotein processing site. (D) Alignment of all occupancy sites identified in glycoproteins from N. gonorrhoeae. The alignment encompasses the −5 to +3 region relative to the serine occupancy site (at position 0). The heights of amino acids show their relative abundance at each position relative to the glycosite, whereas the heights of amino acid stacks represent sequence conservation and are shown in bits. Positively charged amino acids are colored red, and negatively charged amino acids are colored blue. Small amino acids are colored green, and large amino acids are colored black.
Despite the increased number of occupancy sites identified, no sequon-like motifs were seen, with merely a tendency for alanine and other small amino acids to flank the targeted serine. As such, further attempts at identifying the structural and sequence constraints defining site selection would be well served by in vivo studies using specifically altered proteins or in vitro studies using defined peptide substrates.9
The efficacy of the enrichment scheme here is readily seen in the over-representation of peptides from bona fide glycoproteins. On the basis of ranked order utilizing the number of peptides detected, 16 of the top 20 hits are glycoproteins (Table 2). The data, however, point out one of the confounding features of affinity purification, namely the potential enrichment of nonglycosylated proteins due to their association with glycoproteins. Notably, the fifth ranked protein MtrE (Ngo1363, the outer membrane component of the MtrDCE complex) is likely found because of its association with the glycoproteins MtrD and MtrC, as it fails to react with the npg1 antibody when purified directly (data not shown). Likewise, the 22nd ranked CcoO protein is likely seen here because it is in a complex with the CcoP glycoprotein as opposed to its being glycosylated itself.36 Despite these caveats, many of the other highly ranked hits are strong glycoprotein candidates. A second concern is the potential bias imposed by MS peptide detection and the fact that some glycopeptides might be missed due to intrinsic structural features rendering them resistant to proteolytic digestion and detection. This issue is particularly exemplified in the case of glycopeptides mapping near the acylated N-terminal domains of lipoproteins. We therefore also incorporated an in silico approach to identify candidates on the basis of the presence of the ASP-rich LCR together with periplasmic targeting signals and identified Ngo0983, also known as Lip, and confirmed its glycosylation. Lip was originally highly scrutinized as an abundantly expressed and potentially protective antigen; however, its surface accessibility was never definitively established.37 Interestingly, the five amino acid tandem repeats of [AS]-[AT]-E-A-[PAS] that make up the bulk of Lip are also present in another glycosylated lipoprotein, Ngo1043.
Our identification of glycopeptides was facilitated by the presence of three reporter ions for diNAcBac in HCD spectra at m/z 229.118, m/z 211.107, and m/z 169.096. This is in agreement with previous work demonstrating the usefulness of HCD for identification of glycosylated peptides by reporter ion detection.12,38,39 By de novo sequencing of combined CID/HCD spectra from the 18 identified unique glycopeptides and searching for glycan-retaining fragments, 10 glycosylation sites were identified. Our ability to map O-linked glycosylation sites on monosaccharide-carrying peptides depended on the precursor charge state and peptide sequence. In contrast to the abundant backbone fragmentation that was generated from doubly charged unmodified precursor peptides (z = 2), all glycopeptide precursors with sufficient backbone fragmentation for site mapping had a charge state greater than 2 (z > 2). The necessity for higher charge states for abundant backbone fragmentation of glycopeptides is in agreement with our previous work, which showed proton sequestration by the N. gonorrhoeae glycan and limited peptide backbone fragmentation of doubly charged trypsin-generated glycopeptides,7 and conforms with the “mobile proton” model.40 Our results showed that the abundance of glycan-carrying fragment ions was dependent on peptide sequence and showed a pattern of enhanced detection of glycan-carrying fragments at certain amino acid residues.
For instance, we benefitted from the proline effect,41,42 generating abundant glycosylated yn ions in the region of interest around the glycosylation site, since glycosylation sites are associated with areas rich in Ala, Ser, and Pro.7 As such, the generation of glycan-carrying fragments is governed by the sequence-dependent peptide fragmentation experienced in collisionally driven MS2 experiments.40,43 However, as a consequence of the labile nature of glycosidic bonds, CID/HCD MS2 experiments commonly generated less abundant glycan-carrying peptide fragments than their glycan-free counterparts. When mapping glycosites, we therefore benefitted greatly from the high mass accuracy offered by the Orbitrap XL.17 An interesting result in this regard was seen for Ngo0360 (Figure 3D). Here, where proline followed glutamic acid, the glycan-carrying fragment (y17 diNAcBac) was more abundant than the glycan-free fragment (y17). This increased abundance of glycan-carrying versus glycan-free fragments was likely due to the combined proline and glutamic acid effects that preferentially cause dissociation of peptide bonds immediately N-terminal of proline and immediately C-terminal of glutamic acid.40−42,44 Since generation of glycan-carrying peptide fragments is dependent upon the collision energy in relation to the peptide sequence, the collision energy can be optimized to best map glycosylation at different sequences.
In conclusion, we designed and implemented a workflow utilizing glycan-targeted immunoaffinity purification to provide deeper insight into the O-glycoproteome of N. gonorrhoeae. Our findings clearly document the utility and plausible applicability of the methodology to studies of other protein glycosylation systems. Although not employed here, the recently described ability to reconstitute diNAcBac-based glycosylation in Escherichia coli makes it feasible to directly screen protein candidates for glycosylation susceptibility.13 In agreement with previous work,45−47 the results also show the ability of both CID and HCD to identify the sites of glycan attachment as well as the established ability to reveal glycan structure and glycopeptide identity. Taken together, this study supports the recognition that comprehensive glycoprotein analysis within complex samples is best achieved by a concerted approach encompassing fractionation technology, MS, and bioinformatics.
■
ASSOCIATED CONTENT
Supporting Information
Table S1, Strains used in this study. Table S2, Primers used in this study. This material is available free of charge via the Internet at http://pubs.acs.org.
■
AUTHOR INFORMATION
Corresponding Author
*Phone: 47 22 854091. Fax: 47 22 856041. E-mail: johnk@imbv.uio.no.
Notes
The authors declare no competing financial interest.
■
ACKNOWLEDGMENTS
We thank Professor Einar Uggerud, University of Oslo, for valuable discussions related to results. This research was supported in part by Research Council of Norway Grants 166931, 183613, and 183814, and by funds from GlycoNor, the Department of Molecular Biosciences and Center for Molecular Biology and Neurosciences of the University of Oslo.
■
ABBREVIATIONS
HCD, higher-energy C-trap dissociation; CID, collision-induced dissociation; LC-ESI-MS2, liquid-chromatography electrospray-ionization tandem mass spectrometry; diNAcBac, N,N′-diacetylbacillosamine; OTase, oligosaccharide transferase; Pgl, protein glycosylation system; LCR, low complexity region; ASP, alanine, serine, proline
■
REFERENCES
(1) Nita-Lazar, M.; Wacker, M.; Schegg, B.; Amber, S.; Aebi, M. The N-X-S/T consensus sequence is required but not sufficient for bacterial N-linked protein glycosylation. Glycobiology 2005, 15, 361.
(2) Kowarik, M.; Young, N. M.; Numao, S.; Schulz, B. L.; Hug, I.; Callewaert, N.; Mills, D. C.; Watson, D. C.; Hernandez, M.; Kelly, J. F.; Wacker, M.; Aebi, M. Definition of the bacterial N-glycosylation site consensus sequence. EMBO J. 2006, 25, 1957.
(3) Herrmann, J. L.; Delahay, R.; Gallagher, A.; Robertson, B.; Young, D. Analysis of post-translational modification of mycobacterial proteins using a cassette expression system. FEBS Lett. 2000, 473, 358.
(4) VanderVen, B. C.; Harder, J. D.; Crick, D. C.; Belisle, J. T. Export-mediated assembly of mycobacterial glycoproteins parallels eukaryotic pathways. Science 2005, 309, 941.
(5) Fletcher, C. M.; Coyne, M. J.; Villa, O. F.; Chatzidaki-Livanis, M.; Comstock, L. E. A general O-glycosylation system important to the physiology of a major human intestinal symbiont. Cell 2009, 137, 321.
(6) Posch, G.; Pabst, M.; Brecker, L.; Altmann, F.; Messner, P.; Schaffer, C. Characterization and scope of S-layer protein O-glycosylation in Tannerella forsythia. J. Biol. Chem. 2011, 286, 38714.
(7) Vik, Å.; Aas, F. E.; Anonsen, J. H.; Bilsborough, S.; Schneider, A.; Egge-Jacobsen, W.; Koomey, M. Broad spectrum O-linked protein glycosylation in the human pathogen Neisseria gonorrhoeae. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 4447.
(8) Aas, F. E.; Vik, Å.; Vedde, J.; Koomey, M.; Egge-Jacobsen, W. Neisseria gonorrhoeae O-linked pilin glycosylation: functional analyses define both the biosynthetic pathway and glycan structure. Mol. Microbiol. 2007, 65, 607.
(9) Hartley, M. D.; Morrison, M. J.; Aas, F. E.; Børud, B.; Koomey, M.; Imperiali, B. Biochemical characterization of the O-linked glycosylation pathway in Neisseria gonorrhoeae responsible for biosynthesis of protein glycans containing N,N′-diacetylbacillosamine. Biochemistry 2011, 50, 4936.
(10) Børud, B.; Aas, F.; Vik, Å.; Winther-Larsen, H. C.; Egge-Jacobsen, W.; Koomey, M. Genetic, structural, and antigenic analyses of glycan diversity in the O-linked protein glycosylation systems of human Neisseria species. J. Bacteriol. 2010, 192, 2816.
(11) Young, N. M.; Brisson, J. R.; Kelly, J.; Watson, D. C.; Tessier, L.; Lanthier, P. H.; Jarrell, H. C.; Cadotte, N.; Michael, F. S.; Aberg, E.; Szymanski, C. M. Structure of the N-linked glycan present on multiple glycoproteins in the gram-negative bacterium, Campylobacter jejuni. J. Biol. Chem. 2002, 277, 42530.
(12) Scott, N. E.; Parker, B. L.; Connolly, A. M.; Paulech, J.; Edwards, A. V. G.; Crossett, B.; Falconer, L.; Kolarich, D.; Djordjevic, S. P.; Højrup, P.; Packer, N. H.; Larsen, M. R.; Cordwell, S. J. Simultaneous glycan-peptide characterization using hydrophilic interaction chromatography and parallel fragmentation by CID, Higher Energy Collisional Dissociation, and Electron Transfer Dissociation MS applied to the N-linked glycoproteome of Campylobacter jejuni. Mol. Cell. Proteomics 2011, 10 (2). doi:10.1074/mcp.M000031-MCP201.
(13) Fletcher, C. M.; Coyne, M. J.; Comstock, L. E. Theoretical and experimental characterization of the scope of protein O-glycosylation in Bacteroides fragilis. J. Biol. Chem. 2011, 286, 3219.
(14) Børud, B.; Viburiene, R.; Hartley, M. D.; Paulsen, B. S.; Egge-Jacobsen, W.; Imperiali, B.; Koomey, M. Genetic and molecular analyses reveal an evolutionary trajectory for glycan synthesis in a bacterial protein glycosylation system. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 9643.
(15) Freitag, N. E.; Seifert, H. S.; Koomey, M. Characterization of the pilF-pilD pilus-assembly locus of Neisseria gonorrhoeae. Mol. Microbiol. 1995, 16, 575.
(16) Aas, F. E.; Egge-Jacobsen, W.; Winther-Larsen, H. C.; Løvold, C.; Hitchen, P. G.; Dell, A.; Koomey, M. Neisseria gonorrhoeae type IV pili undergo multisite, hierarchical modifications with phosphoethanolamine and phosphocholine requiring an enzyme structurally related to lipopolysaccharide phosphoethanolamine transferases. J. Biol. Chem. 2006, 281, 27712.
(17) Olsen, J. V.; de Godoy, L. M. F.; Li, G.; Macek, B.; Mortensen, P.; Pesch, R.; Makarov, A.; Lange, O.; Horning, S.; Mann, M. Parts per million mass accuracy on an Orbitrap mass spectrometer via lock mass injection into a C-trap. Mol. Cell. Proteomics 2005, 4, 2010.
(18) Chamot-Rooke, J.; Rousseau, B.; Lanternier, F.; Mikaty, G.; Mairey, E.; Malosse, C.; Bouchoux, G.; Pelicic, V.; Camoin, L.; Nassif, X.; Duménil, G. Alternative Neisseria spp. type IV pilin glycosylation with a glyceramido acetamido trideoxyhexose residue. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 14783.
(19) Marchler-Bauer, A.; Lu, S.; Anderson, J. B.; Chitsaz, F.; Derbyshire, M. K.; DeWeese-Scott, C.; Fong, J. H.; Geer, L. Y.; Geer, R. C.; Gonzales, N. R.; Gwadz, M.; Hurwitz, D. I.; Jackson, J. D.; Ke, Z.; Lanczycki, C. J.; Lu, F.; Marchler, G. H.; Mullokandov, M.; Omelchenko, M. V.; Robertson, C. L.; Song, J. S.; Thanki, N.; Yamashita, R. A.; Zhang, D.; Zhang, N.; Zheng, C.; Bryant, S. H. CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 2011, 39, D225.
(20) Marchler-Bauer, A.; Anderson, J. B.; Chitsaz, F.; Derbyshire, M. K.; DeWeese-Scott, C.; Fong, J. H.; Geer, L. Y.; Geer, R. C.; Gonzales, N. R.; Gwadz, M.; He, S.; Hurwitz, D. I.; Jackson, J. D.; Ke, Z.; Lanczycki, C. J.; Liebert, C. A.; Liu, C.; Lu, F.; Lu, S.; Marchler, G. H.; Mullokandov, M.; Song, J. S.; Tasneem, A.; Thanki, N.; Yamashita, R. A.; Zhang, D.; Zhang, N.; Bryant, S. H. CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 2009, 37, D205.
(21) Marchler-Bauer, A.; Bryant, S. H. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 2004, 32, W327.
(22) Yu, N. Y.; Wagner, J. R.; Laird, M. R.; Melli, G.; Rey, S.; Lo, R.; Dao, P.; Sahinalp, S. C.; Ester, M.; Foster, L. J.; Brinkman, F. S. L. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics 2010, 26, 1608.
(23) Petersen, T. N.; Brunak, S.; von Heijne, G.; Nielsen, H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat. Methods 2011, 8, 785.
(24) Crooks, G. E.; Hon, G.; Chandonia, J. M.; Brenner, S. E. WebLogo: A sequence logo generator. Genome Res. 2004, 14, 1188.
(25) Schneider, T. D.; Stephens, R. M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990, 18, 6097.
(26) Woods, J. P.; Spinola, S. M.; Strobel, S. M.; Cannon, J. G. Conserved lipoprotein H.8 of pathogenic Neisseria consists entirely of pentapeptide repeats. Mol. Microbiol. 1989, 3, 43.
(27) Parge, H. E.; Forest, K. T.; Hickey, M. J.; Christensen, D. A.; Getzoff, E. D.; Tainer, J. A. Structure of the fibre-forming protein pilin at 2.6 Å resolution. Nature 1995, 378, 32.
(28) Teo, C. F.; Ingale, S.; Wolfert, M. A.; Elsayed, G. A.; Not, L. G.; Chatham, J. C.; Wells, L.; Boons, G. J. Glycopeptide-specific monoclonal antibodies suggest new roles for O-GlcNAc. Nat. Chem. Biol. 2010, 6, 338.
(29) Janganan, T. K.; Zhang, L.; Bavro, V. N.; Matak-Vinkovic, D.; Barrera, N. P.; Burton, M. F.; Steel, P. G.; Robinson, C. V.; Borges-Walmsley, M. I.; Walmsley, A. R. Opening of the outer membrane protein channel in tripartite efflux pumps is induced by interaction with the membrane fusion partner. J. Biol. Chem. 2011, 286, 5484.
(30) Shafer, W. M.; Folster, J. P.; Warner, D. E. M.; Johnson, P. J. T.; Balthazar, J. T.; Kamal, N.; Jerse, A. E. Expression of the MtrC-MtrD-MtrE efflux pump in Neisseria gonorrhoeae and bacterial survival in the presence of antimicrobials. In National Institute of Allergy and Infectious Diseases, NIH: Frontiers in Research; Humana Press, Inc.: Totowa, NJ, 2008; Vol 1, Chapter 6.
(31) Rouquette-Loughlin, C. E.; Balthazar, J. T.; Shafer, W. M. Characterization of the MacA-MacB efflux system in Neisseria gonorrhoeae. J. Antimicrob. Chemother. 2005, 56, 856.
(32) Turner, S.; Reid, E.; Smith, H.; Cole, J. A novel cytochrome c peroxidase from Neisseria gonorrhoeae: a lipoprotein from a Gram-negative bacterium. Biochem. J. 2003, 373, 865.
(33) Collins, R. F.; Ford, R. C.; Kitmitto, A.; Olsen, R. O.; Tonjum, T.; Derrick, J. P. Three-dimensional structure of the Neisseria meningitidis secretin PilQ determined from negative-stain transmission electron microscopy. J. Bacteriol. 2003, 185, 2611.
(34) Collins, R. F.; Davidsen, L.; Derrick, J. P.; Ford, R. C.; Tønjum, T. Analysis of the PilQ secretin from Neisseria meningitidis by transmission electron microscopy reveals a dodecameric quaternary structure. J. Bacteriol. 2001, 183, 3825.
(35) Tønjum, T.; Caugant, D. A.; Dunham, S. A.; Koomey, M. Structure and function of repetitive sequence elements associated with a highly polymorphic domain of the Neisseria meningitidis PilQ protein. Mol. Microbiol. 1998, 29, 111.
(36) Pitcher, R. S.; Cheesman, M. R.; Watmough, N. J. Molecular and spectroscopic analysis of the cytochrome cbb(3) oxidase from Pseudomonas stutzeri. J. Biol. Chem. 2002, 277, 31474.
(37) Hitchcock, P. J.; Hayes, S. F.; Mayer, L. W.; Shafer, W. M.; Tessier, S. L. Analyses of gonococcal H8 antigen: surface location, inter-strain and intrastrain electrophoretic heterogeneity, and unusual two-dimensional electrophoretic characteristics. J. Exp. Med. 1985, 162, 2017.
(38) Olsen, J. V.; Macek, B.; Lange, O.; Makarov, A.; Horning, S.; Mann, M. Higher-energy C-trap dissociation for peptide modification analysis. Nat. Methods 2007, 4, 709.
(39) Zhao, P.; Viner, R.; Teo, C. F.; Boons, G. J.; Horn, D.; Wells, L. Combining high-energy C-trap dissociation and electron transfer dissociation for protein O-GlcNAc modification site assignment. J. Proteome Res. 2011, 10, 4088.
(40) Paizs, B.; Suhai, S. Fragmentation pathways of protonated peptides. Mass Spectrom. Rev. 2005, 24, 508.
(41) Bleiholder, C.; Suhai, S.; Harrison, A.; Paizs, B. Towards understanding the tandem mass spectra of protonated oligopeptides. 2: The proline effect in Collision-Induced Dissociation of protonated Ala-Ala-Xxx-Pro-Ala (Xxx = Ala, Ser, Leu, Val, Phe, and Trp). J. Am. Soc. Mass Spectrom. 2011, 22, 1032.
(42) Dong, N.-p.; Zhang, L.-x.; Liang, Y.-z. A comprehensive investigation of proline fragmentation behavior in low-energy collision-induced dissociation peptide mass spectra. Int. J. Mass Spectrom. 2011, 308, 89.
(43) Huang, Y. Y.; Triscari, J. M.; Tseng, G. C.; Pasa-Tolic, L.; Lipton, M. S.; Smith, R. D.; Wysocki, V. H. Statistical characterization of the charge state and residue dependence of low-energy CID peptide dissociation patterns. Anal. Chem. 2005, 77, 5800.
(44) Yu, W.; Vath, J. E.; Huberty, M. C.; Martin, S. A. Identification of the facile gas-phase cleavage of the Asp-Pro and Asp-Xxx peptide bonds in matrix-assisted laser-desorption time-of-flight mass-spectrometry. Anal. Chem. 1993, 65, 3015.
(45) Wuhrer, M.; Catalina, M. I.; Deelder, A. M.; Hokke, C. H. Glycoproteomics based on tandem mass spectrometry of glycopeptides. J. Chromatogr., B 2007, 849, 115.
(46) Dell, A.; Morris, H. R. Glycoprotein structure determination by mass spectrometry. Science 2001, 291, 2351.
(47) Peter-Katalinic, J. In Mass Spectrometry: Modified Proteins and Glycoconjugates; Burlingame, A. L., Ed.; Elsevier Academic Press, Inc.: San Diego, 2005; Vol. 405, p 139.
(48) Shafer, W. Personal communication.