Mass Spectrometry Following Mild Enzymatic Digestion Reveals Phosphorylation of Recombinant Proteins in Escherichia coli Through Mechanisms Involving Direct Nucleotide Binding

Yi-Min She,† Xiaohui Xu,‡ Alexander F. Yakunin,‡ Sirano Dhe-Paganon,‡ Lynda J. Donald,§ Kenneth G. Standing,| Daniel C. Lee,⊥ Zongchao Jia,⊥ and Terry D. Cyr*,†

Centre for Biologics Research, Biologics and Genetic Therapies Directorate, Health Canada, Ottawa, Ontario K1A 0K9, Canada, Banting and Best Department of Medical Research, and Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L6, Canada, Department of Chemistry and Department of Physics and Astronomy, University of Manitoba, Winnipeg, Manitoba R3T 2N2, Canada, and Department of Biochemistry, Queen’s University, Kingston, Ontario K7L 3N6, Canada

Received December 24, 2009

Abstract: A straightforward method using mild enzymatic digestions combined with MALDI mass spectrometry (MS) was used to enhance determination of the multiple phosphorylation sites of a set of recombinant nucleotide-binding proteins in Escherichia coli, including kinases and cystathionine beta-synthase (CBS) domain containing proteins. The protein kinases reveal abundant phosphorylations in the kinase domains and relatively low phosphogluconoylation (258 Da) at the N-terminal His-tag. In contrast, the CBS domain containing proteins possess a highly conserved phosphorylation in vivo at Ser-2 of the His-tag. Multistage MS/MS and selected reaction monitoring established that the CBS domain proteins also contain a combined modification of gluconoylation (178 Da) and phosphorylation (80 Da) at two different sites, instead of an isobaric phosphogluconoylation (258 Da) event at the N-terminus. Functional analysis of 20 recombinant proteins as identified by mass spectrometry has shown the phosphorylation at the N-terminal His-tag is relevant to nucleotide binding and phosphotransfer reaction catalyzed by a serine protein kinase.

Keywords: mass spectrometry • recombinant protein • Escherichia coli • protein phosphorylation • post-translational modification • kinase • CBS domain

* To whom correspondence should be addressed. Terry Cyr, Centre for Biologics Research, Health Canada, 251 Sir Frederick Banting Driveway, Tunney’s Pasture, Ottawa, Ontario K1A 0K9, Canada. E-mail: terry.cyr@hc-sc.gc.ca. Phone: 1-613-957-1068. Fax: 1-613-941-8933.
† Health Canada. ‡ University of Toronto. § Department of Chemistry, University of Manitoba. | Department of Physics and Astronomy, University of Manitoba. ⊥ Queen’s University.
10.1021/pr9011987 © 2010 American Chemical Society

Introduction

Expression of proteins in bacteria, Escherichia coli for example, has been widely used to economically produce recombinant proteins for 3-dimensional protein structure and functional studies. A common method employed for purification of recombinant proteins is the use of an affinity tag, such as a polyhistidine tag attached at the N- or C-terminus of proteins in E. coli. This technique provides sufficient material for downstream protein structure analysis with electron cryomicroscopy (Cryo-EM), X-ray crystallography, nuclear magnetic resonance (NMR) or mass spectrometry (MS). MS in particular facilitates the structural evaluation of target proteins and allows detection of sequence errors, mistranslation, modifications, impurities and degradation products.1 Among the identified recombinant proteins in E. coli, Geoghegan et al. reported the first observation of extra masses of 178 and 258 Da in the His-tagged fusion protein fragments at a pleckstrin-homology (PH) domain of beta-adrenergic receptor kinase (GRK-2) and two SH2 domains of a tyrosine kinase (ZAP-70) and a regulatory subunit of phosphatidylinositol 3-kinase (p85).2 On the basis of various experimental data, they proposed a plausible mechanism for the formation of a phosphogluconoylated His-tag (Gly-Ser-Ser-[His]6-) at the N-terminus by way of a nonenzymatic acylation with the cellular metabolite 6-phosphogluconolactone followed by dephosphorylation through a host-cell phosphatase (Figure S1, Supporting Information).2 A similar phenomenon has been observed in the SH3 domain of src tyrosine kinase by Kim et al.3 Recent studies by Geoghegan and co-workers on the His-tagged catalytic domains of protein serine/threonine kinases Aurora 2, Aurora A, p21-activated kinases (PAK1, PAK7), cAMP-dependent protein kinase A and proline-rich tyrosine kinase 2 (Pyk2) expressed in E. coli or insect cells showed unexpected hyperphosphorylation on multiple serine residues in the vector-derived N-terminal His-tag.4,5 Of special note was the presence of a serine phosphorylation site at the N-terminal His-tag, rather than a tyrosine residue, in the tyrosine kinase Pyk2.
Phosphorylations of the N-terminal His-tag have also been reported in the recombinant myosin I heavy chain kinase,6 Aurora A7 and StkSA1.8 Much of the evidence was obtained from a rather restricted group of proteins, mainly kinases. Interestingly, these phosphorylations were observed to occur not only in the active sites of kinase domains, where they are caused by autophosphorylation, but also in the His-tag region of the protein N-termini. Because recombinant proteins are commonly used for a wide variety of research, diagnostic, therapeutic and biopharmaceutical applications, characterizing the chemical and enzymatic modifications of such proteins is particularly useful for helping researchers understand the associated biological functions and for assessing protein stability and drug safety.9 Detecting and preventing these variable modifications have also become major challenges for the biotechnology industry and are critical for the comparison of innovator and subsequent-entry biologics,9,10 as well as in academia for protein crystal preparation and subsequent 3D-structure studies. We have therefore carried out a detailed analysis of the post-translational modifications of several types of phosphorylated recombinant proteins expressed in E. coli, including serine/threonine/tyrosine kinases and other nucleotide-binding proteins. To accomplish this, we have utilized a strategy that combines limited enzymatic digestion methods with matrix-assisted laser desorption ionization (MALDI), electrospray ionization (ESI) and multistage tandem mass spectrometry (MSn) to identify the extent and specific localization of phosphorylation sites within the protein sequences. The methods allowed direct identification of the phosphorylated peptides on the protein kinases without the need for any preenrichment or HPLC separation and enabled us to identify a new category of phosphorylated recombinant cystathionine beta-synthase (CBS) domain proteins expressed in E. coli. Similar to the recombinant kinases, the CBS domain containing proteins also display extensive modifications at the N-terminal His-tag and a unique phosphorylation at Ser-2. Further analysis of the 20 recombinant proteins, as determined by mass spectrometry, indicates that phosphorylation of the N-terminal His-tag is relevant to direct nucleotide binding and a phosphotransfer reaction catalyzed by a specific serine protein kinase in E. coli.

Technical Notes. Journal of Proteome Research 2010, 9, 3311–3318. Published on Web 04/21/2010.

Materials and Methods

Enzymatic and Chemical Reagents. Endoproteinase Lys-C (Lysobacter enzymogenes) and sequencing grade trypsin (bovine pancreas) were obtained from Roche Diagnostic Corporation. 2,5-Dihydroxybenzoic acid (DHB), ammonium acetate (NH4Ac), ammonium bicarbonate (NH4HCO3), sodium bicarbonate (NaHCO3), 4-sulfophenyl isothiocyanate (SPITC), adenosine triphosphate (ATP), and magnesium chloride (MgCl2) were purchased from Sigma-Aldrich Canada Ltd.

Protein Expression and Purification. The recombinant proteins were expressed as fusions with an N-terminal His-tag in E. coli strain BL21 (DE3) cells (Stratagene) unless otherwise noted. Expression of the protein kinase C eta (PKCη) C2 domain (gi|88193048, residues 1-138) from the plasmid yielded the N-terminal fusion MGSSHHHHHHSSGLVPRLGS containing a hexahistidine tag and thrombin cleavage site.11 E. coli tyrosine kinase (Etk) C-terminal kinase domain (gi|20137632, residues 451-726) R614A, Y574F, Y574N, R614K, and R572A mutants were subcloned into pET expression vectors, yielding fusions with an N-terminal His10 tag (MGHHHHHHHHHHSSGHIEGRHIGS).12 For the CBS domain containing proteins, three expression constructs were made in plasmids carrying the N-terminal His-tags:13 MGSSHHHHHHSSGRENLYFQG, MGSSHHHHHHSSGRENLYFQGH, and MGSSHHHHHHSSGRENLYFQGH......GS. NE2398 from Nitrosomonas europaea (gi|30250323, residues 1-146) was expressed in p11 vector and has two additional amino acids in the C-terminus (GS). Atu1752 from Agrobacterium tumefaciens (gi|220702516, residues 1-144) was expressed in p15TvLic vector. The modified vector pET15b was used to clone RPA3416 from Rhodopseudomonas palustris (gi|39936479, residues 1-142), inosine-5′-monophosphate dehydrogenase (guaB) MJ0653 (gi|15668834, residues 1-194), inosine-5′-monophosphate dehydrogenase-like protein PH0267 from Pyrococcus horikoshii (gi|14590193, residues 1-178), inosine monophosphate dehydrogenase (guaB-1) AF0847 from Archaeoglobus fulgidus (gi|11498453, residues 1-189), TV1335 from Thermoplasma volcanium (gi|13542143, residues 1-76), and Atu1337 from Agrobacterium tumefaciens str. C58 (gi|15888662, residues 1-121). Other His-tagged proteins included Shigella flexneri ion transport SF0624 (gi|15800371, residues 205-292), S. flexneri transcriptional regulator SF3840 (gi|15804356, residues 1-112), and S. flexneri SF3131 altronate hydrolase (gi|26249680, residues 1-84). Purification of His-tagged proteins was carried out using immobilized metal-chelate affinity chromatography (IMAC) on nickel affinity resin (Qiagen), as described previously.11-13 Autophosphorylation of PKCη and Etk was activated in vitro with 5 mM ATP and 10 mM MgCl2, whereas the CBS domain proteins were used directly in MS analysis without further ATP treatment.

Chemical Modifications. N-terminal sulfonation of the tryptic digests was performed using 10 mg/mL SPITC in 20 mM NaHCO3, and the reaction was incubated at 56 °C for 3 h. Following C18 Ziptip cleanup, the derivatized peptides were analyzed by quadrupole time-of-flight (QTOF) MS.

Enzymatic Digestion and MS Analyses. Mass measurements were performed on an Applied Biosystems QStar XL QTOF mass spectrometer equipped with an oMALDI source operating with a nitrogen laser (337 nm) as well as a nanoESI source. Intact protein masses were determined in 2% formic acid (FA)/50% methanol (pH 2.5). For subsequent sequence identification, the proteins were reduced with 10 mM dithiothreitol (DTT) (56 °C, 1 h), and alkylated with 55 mM iodoacetamide (RT, 45 min), then dialyzed against 10 mM NH4HCO3 and dried in a SpeedVac (Savant).
The samples were digested at 37 °C with 20 ng of trypsin or 100 ng of endoproteinase Lys-C in 25 mM NH4HCO3 (pH 7.6). The digest used for MALDI MS was prepared in a ratio of 1:1 (v/v) of the sample to the DHB matrix. Online LC-MS/MS analyses were achieved with a Waters nano-Acquity UPLC system coupled to a linear ion trap Fourier transform ion cyclotron resonance mass spectrometer (LTQ-FT ICR, Thermo Fisher). Peptides were trapped by a RP Symmetry C18 nanoAcquity column (180 µm i.d. × 20 mm, 5 µm) at 5 µL/min, and subsequently separated on a C18 analytical column (100 µm i.d. × 100 mm, 1.7 µm, BEH130) at 500 nL/min. Mobile phases consisted of solvent A (0.1% FA) and solvent B (99.9% acetonitrile/0.1% FA). LC separation was achieved by a 90-min linear gradient from 5 to 45%, then 85% of solvent B. MS/MS experiments were conducted in the data-dependent mode following a full FT-MS survey scan at m/z 300-2000. Survey scans were acquired in the ICR cell with a resolution of 100,000. Multiply charged peptide ions were isolated for parallel MS/MS analysis of the eight most intense precursor ions selected from the survey spectrum. The AGC target values were set to 1 000 000 for FT-MS, and to 10 000 for MS/MS. Ion fragmentation in the ion trap was achieved with helium at a normalized collision energy of 35%. Selected reaction monitoring (SRM) of multiple ion transitions was used to screen and validate the modified peptides of interest. The ion structure of the peptides was examined by multistage MS/MS (MS3) measurements in the linear ion trap.

Data Analysis. The raw LTQ-FT MS data were searched against an in-house customized database with the His-tagged recombinant protein sequences using Mascot (Matrix Science).

Phosphorylation of Recombinant Proteins in E. coli


Figure 1. Phosphorylation of the PKCη C2 domain. (a) Deconvoluted nanoESI spectrum of the intact protein; (b) MS spectrum of the 12 h tryptic digest; (c) MS spectrum of the 2 h tryptic digest; (d) MS spectrum of the 2 h Lys-C digest. The peptide fragments and phosphopeptides (-P) are labeled above each peak. (e) Crystal structure of PKCη (PDB accession code: 2FK9).

Figure 2. Phosphorylation of the Etk R614A. (a) Deconvoluted nanoESI spectrum of the intact protein; (b) MS spectrum of the 2 h tryptic digest; (c) MS spectrum of the 2 h Lys-C digest. The number of phosphate groups is listed on each peak. (d) X-ray crystal structure of Etk (PDB accession code: 3CIO).

The parameter settings allowed for two missed cleavages from trypsin digestion, and one fixed modification of cysteine carbamidomethylation. Deamidation of asparagine and glutamine, methionine oxidation, phosphorylation of serine/threonine/tyrosine, gluconoylation and phosphogluconoylation at the N-terminus of the proteins were included as variable modifications. The peptide mass tolerance was set to 5 ppm for the FT MS peaks and 1 Da for ion trap MS/MS fragment ions.
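The missed-cleavage allowance and the precursor mass tolerance described above can be illustrated with a small in-silico digestion. This is a minimal sketch and not the Mascot implementation: the function names `digest` and `within_ppm` are hypothetical, the cleavage rule is simplified (cleavage C-terminal to the listed residues, with no proline exception), and the test sequence is the PKCη segment spanning residues 8-36 that is quoted in the Results.

```python
def digest(sequence, cleave_after, max_missed=0):
    """Return peptides produced by cleaving C-terminal to any residue in
    `cleave_after`, allowing up to `max_missed` missed cleavages.
    Simplified rule (assumption): no proline exception is applied."""
    # Cut positions: index just after each cleavable residue (last residue excluded).
    cuts = [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in cleave_after]
    bounds = [0] + cuts + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Joining k + 1 consecutive fragments corresponds to k missed cleavages.
        last = min(start + 1 + max_missed, len(bounds) - 1)
        for end in range(start + 1, last + 1):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


def within_ppm(observed, theoretical, tol_ppm=5.0):
    """True if the observed mass lies within `tol_ppm` of the theoretical mass."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


# PKCη C2 domain segment, residues 8-36 (sequence taken from the Results text).
SEGMENT = "FNGYLRVRIGEAVGLQPTRWSLRHSLFKK"

tryptic_complete = digest(SEGMENT, "KR", max_missed=0)
tryptic_partial = digest(SEGMENT, "KR", max_missed=2)
lysc_partial = digest(SEGMENT, "K", max_missed=1)

print("complete trypsin:", tryptic_complete)
print("extra peptides with <=2 missed cleavages:",
      sorted(set(tryptic_partial) - set(tryptic_complete), key=len))
print("Lys-C with <=1 missed cleavage:", lysc_partial)
```

In this sketch the four tryptic phosphopeptide backbones discussed in the Results (WSLRHSLFK, WSLRHSLFKK, IGEAVGLQPTRWSLR and IGEAVGLQPTRWSLRHSLFK) appear only when one or two missed cleavages are allowed, which is why the search settings need that allowance to match peptides from a limited digest.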

Results and Discussion

Mild Enzymatic Digestion and MALDI Mass Spectrometry. Protein kinase activity is regulated by phosphorylation, often intramolecular autophosphorylation, of residues within their activation domains. We have utilized enzymatic digestions and MS to analyze several recombinant kinases which were phosphorylated in vitro. Examination of the phosphorylation states of E. coli expressed proteins, including human PKCη and E. coli Etk, by nanoESI MS revealed abundant peaks separated by multiples of 80 Da (Figures 1a, 2a and Figure S2, Supporting Information). To identify the sites of phosphorylation, we first subjected a standard overnight tryptic digest of the proteins to MALDI QTOF MS analysis. Not surprisingly, we found that the phosphopeptide signals were below detection limits (Figure 1b), since ion suppression by a negative phosphate group is a known phenomenon in the positive MALDI mode. Thus enrichment by IMAC and TiO2 affinity purification,14,15 or HPLC separation, is often a prerequisite for MS analysis of phosphopeptides. However, selective enrichment of phosphopeptides with an affinity column is time-consuming and may be hampered by interference from nonspecific binding of peptides containing carboxylic groups.16,17 To alleviate this suppression effect and to achieve rapid analysis of phosphorylated recombinant proteins, we instead investigated the possibility of generating longer peptides by incomplete tryptic digestion or by a gentle enzymatic cleavage using endoproteinase Lys-C. Peptide mapping of PKCη following a mild two-hour tryptic digestion yielded four peptide ions at m/z 1253.619, m/z 1381.717, m/z 1762.906 and m/z 2455.206 (Figure 1c) that correspond to the singly and doubly phosphorylated peptides 27WSLRHpSLFK35, 27WSLRHpSLFKK36, 16IGEAVGLQPTRWpSLR30, and 16IGEAVGLQPTRWpSLRHpSLFK35. MS/MS measurements of these modified peptides identified two phosphorylation sites at Ser-28 and Ser-32 based on the neutral loss of 98 Da (phosphoric acid) and the characteristic fragment of 69 Da resulting from dephosphorylation of serine (Figure S3, Supporting Information). Both of these phosphorylated serines reside in helix α1, localized near the putative lipid-binding region of the kinase C2 domain (Figure 1e). Additional analysis of three peaks from a second enzymatic digestion by Lys-C (Figure 1d) confirmed the phosphorylation sites in the peptide residues 8-36 (8FNGYLRVRIGEAVGLQPTRWpSLRHpSLFK35K36). MS/MS analyses indicated that the ions at m/z 3380.789 (peptide 8-35) and m/z 3508.858 (peptide 8-36) have a single phosphorylation, whereas the ion at m/z 3460.761 (peptide 8-35) has two phosphorylation sites. These results are consistent with the intact mass measurement in Figure 1a: the high-abundance peak at 15674.0 Da can be assigned as a combination of monophosphorylations at either Ser-28 or Ser-32, and the small peak at 15754.0 Da corresponds to phosphorylation at both sites.

Figure 3. Deconvoluted nanoESI mass spectra of the Gly-Ser-Ser-[His]6-tagged CBS domain proteins. (a-g) CBS domain containing proteins; (h) Non-CBS domain containing protein SF0624 prepared with selenomethionine in the amino acid sequence.

Multiply phosphorylated peptides have traditionally been very difficult to detect by MS.15,18 The phosphopeptide peaks were missing in the mass spectra of Etk digested with trypsin for either 12 or 2 h (Figure 2b).
However, a mild digestion by endoproteinase Lys-C combined with MALDI MS successfully identified seven tyrosine-phosphorylation sites at the C-terminus (Figure 2c). MS mapping of the Lys-C digest showed a series of peptide ions in the mass range of m/z 2500-3100 separated by 80 Da, consistent with one to seven phosphorylated tyrosines in the peptide sequence RASTApYSpYGpYNpYpYGpYSpYSEKE (residues 706-726). The precise sites of phosphorylation were identified by MS/MS. This phosphorylated tyrosine cluster is structurally located at the protein C-terminus of the cytoplasmic domain (Figure 2d), which is the C-terminal kinase domain of the full-length Etk. When both tryptic and Lys-C digests were further analyzed by online LC-MS/MS, only ions from the singly and doubly phosphorylated peptides were detected in the C-terminus of Etk. Thus we believe that a mild enzymatic digestion followed by MALDI MS has a considerable advantage over the existing methods for determining multiple phosphorylation sites in a large peptide. The use of mild enzymatic digestion clearly enhances the detectability of phosphorylated peptides in the positive MALDI mode. This may be attributed to enhanced ionization efficiency of the phosphopeptides resulting from the positively charged residues, particularly arginine, retained in the longer peptides generated by incomplete trypsin or Lys-C digestion. Sites of phosphorylation frequently occur in relatively flexible, solvent-exposed loops of protein structures (e.g., Figures 1e and 2d), which favors cleavage by mild protease digestion at the protein surface and the release of phosphopeptides for MS detection at relatively high intensity. The MS/MS analysis also identified the low-abundance protein peaks (Figure 2a and Figure S2, Supporting Information) as modifications at the N-terminal His-tag by gluconoylation (C6H10O6, 178.0477 Da) and phosphogluconoylation (C6H11O9P, 258.0141 Da). The observations agree with the previous findings for the His-tagged recombinant kinase fragments reported by Geoghegan and colleagues.2

N-Terminal Phosphorylation of the His-Tagged Proteins Containing CBS Domains.
The CBS domain is an evolutionarily conserved protein domain that is present in the proteomes of archaebacteria, prokaryotes, and eukaryotes.19 We continued to analyze the in vivo phosphorylation of a group of recombinant CBS domain containing proteins expressed in E. coli. Measurements of the intact proteins showed highly intense peaks corresponding to the protein plus single phosphorylation (+80 Da), gluconoylation (+178 Da) and a possible phosphogluconoylation (+258 Da) (Figure 3a-g and Table S1, Supporting Information). To determine whether the modifications are structurally similar to those observed in kinases, the proteins were digested with trypsin using a short time-period as described above, and then analyzed by MALDI MS and MS/MS. For the protein TV1335, there were four ions at m/z 1459.63, 1539.59, 1637.68 and 1717.66 that were inferred as the N-terminal His-tag and its modified forms at the first two residues (GS), as shown by the MS/MS fragmentations (Figures 4a, 5a, 5b and Figure S4, Supporting Information). The modified (+258 Da) parent ion of m/z 1717.68 readily undergoes simultaneous neutral losses of 98 Da (H3PO4, phosphoric acid), 178 Da (C6H10O6, gluconoyl group), 258 Da (C6H11O9P) and 276 Da (C6H13O10P, phosphate + gluconoyl groups), yielding high-intensity fragment ions at m/z 1619.69, m/z 1539.61, m/z 1459.63 and m/z 1441.63 in Figure 5b. The direct loss of the gluconoyl group from the precursor ion of m/z 1717.68, with retention of an exterior phosphorylation, would be unlikely if the modification were linearly arranged as a single phosphogluconoylation, as in Figure S1 (Supporting Information). This suggests a new peptide structure with separate modifications of phosphorylation and gluconoylation localized at different residues (G or S) of the His-tag. N-terminal sulfonation of the tryptic digest yielded a SPITC modified phosphopeptide at m/z 1754.61 (Figure 4b), and MS/MS analysis clearly identified the phosphorylation site at residue Ser-2 (Figure 5c) as evidenced by the characteristic C-terminal y12 fragment with an increased mass of 80 Da (phosphate group). In contrast, the SPITC derivative of the gluconoylated peptide at m/z 1852.68 is of low intensity, presumably because the gluconoylated N-terminus resists that modification. Similar results were obtained for the other N-terminal Gly-Ser-Ser-[His]6-tagged recombinant CBS domain containing proteins (NE2398, Atu1752, RPA3416, PH0267, AF0847, MJ0653); all of them have a conserved single phosphorylation site at Ser-2 and gluconoylation at the N-terminus.

Figure 4. MALDI QTOF MS mapping of the tryptic digest of the TV1335. (a) N-terminal tryptic peptides; (b) SPITC modified peptides (+215 Da).

Figure 5. MALDI QTOF MS/MS analyses of the tryptic TV1335 peptides. (a) Phosphorylated His-tag peptide at m/z 1539.62; (b) +258 Da His-tag peptide at m/z 1717.68; (c) SPITC modified phosphopeptide at m/z 1754.60.

Structure of the +258 Da His-Tag Peptide. The small satellite peak of the +258 Da protein component in the recombinant protein fragments of kinases was previously defined as a single phosphogluconoylation on the N-terminal amide group of the His-tag,2 whereas MS/MS fragmentations of the peptides in these CBS domain containing proteins revealed a branched configuration of two modifications in the structure, i.e., the gluconoyl and phosphoryl moieties positioned at the adjacent residues (Gly-Ser) of the His-tag at the N-terminus. To validate the new structure, our next experiments were performed on the LTQ-FT MS instrument by selected-reaction monitoring (SRM) and multistage MS/MS analyses. In such experiments, four His-tagged peptide ions of the CBS domain containing protein at m/z 730.32 (2+, unmodified His-tag), m/z 770.31 (2+, phosphorylated His-tag), m/z 819.35 (2+, gluconoylated His-tag), m/z 859.33 (2+, phosphorylated and gluconoylated His-tag) and their neutral loss products were targeted for MS/MS analysis in the linear ion trap. Multistage MS/MS analyses of the ion structures indicated that both the +80 Da and +258 Da peptides of the CBS domain containing proteins share a conserved phosphorylation site at residue Ser-2 (Figure S5, Supporting Information). Figure 6 shows a side-by-side comparison of the His-tag peptides detected by SRM in both Etk and the CBS domain containing protein TV1335. The unmodified and modified His-tag peptides (GHHHHHHHHHHSSGHIEGR) in the Etk display similar chromatographic profiles that show an extremely narrow peak eluted from the UPLC separation. However, the peptides modified with gluconoylation or phosphogluconoylation have a slightly shorter retention time than the unmodified peptide following the reversed-phase C18 column separation, due to hydrophilic group attachment. In contrast, much broader peaks were observed for the peptide analogues of the CBS domain containing protein, which could be caused either by the inherent peptide structure or by multiple interacting hydrogen-bonded conformations, and which consequently show similar chromatographic retention times. Consistent with the mass measurements of the proteins in Figure 3, the Ser-2-phosphorylated His-tag peptide (GSSHHHHHHSSGR) of TV1335 appeared as the most intense peak in Figure 6. The gluconoylated-Gly-1/phosphorylated-Ser-2 (+258 Da) form of the CBS domain containing proteins appears at low intensity (Figure 3a-g); it can be explained as a minor product derived from gluconoylation of the phosphorylated protein at the N-terminus.

Figure 6. LTQ-FT SRM chromatograms of the His-tag peptides in the Etk R614A and the TV1335 monitoring the loss of the specified modifications. The two structures of +258 Da His-tag peptides in the Etk and the CBS domain containing protein are shown above.

Phosphotransfer Reaction of the CBS Domain Containing Proteins in E. coli. Since Ser-2 was the only phosphorylation site observed in all His-tagged CBS domain containing proteins, we subsequently examined whether this modification occurs in other proteins without a CBS domain expressed in E. coli.
MS analyses indicated no phosphorylation of the ion transport protein SF0624 (Figure 3h), the transcriptional regulator SF3840 (Figure S6, Supporting Information) or the altronate hydrolase SF3131; only N-terminal acetylation (+42 Da) and gluconoylation (+178 Da) were observed. Surprisingly, we found that the hypothetical protein Atu1337 behaved like the CBS domain containing proteins in its phosphorylation at Ser-2 (Figure S7, Supporting Information), even though it contains an antibiotic biosynthesis monooxygenase (ABM) domain.
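The peptide ion masses quoted in the preceding sections can be reproduced from the peptide sequences and the modification masses. The sketch below is illustrative only: `peptide_mz` and the table layout are hypothetical names, the residue masses are standard monoisotopic values, and the reported m/z values are copied from the text (PKCη phosphopeptides as singly charged MALDI ions, TV1335 His-tag peptide forms as the doubly charged SRM precursors).

```python
# Standard monoisotopic residue masses (Da) for the residues used below.
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "L": 113.084064, "I": 113.084064,
    "Q": 128.058578, "K": 128.094963, "E": 129.042593, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276
PHOSPHO = 79.966331      # HPO3 added per phosphorylation
GLUCONOYL = 178.047738   # C6H10O6 added per gluconoylation


def peptide_mz(sequence, n_phospho=0, n_gluconoyl=0, charge=1):
    """Monoisotopic m/z of a protonated peptide carrying the given modifications."""
    neutral = (sum(RESIDUE[aa] for aa in sequence) + WATER
               + n_phospho * PHOSPHO + n_gluconoyl * GLUCONOYL)
    return (neutral + charge * PROTON) / charge


# (label, sequence, phosphates, gluconoyl groups, charge, m/z reported in the text)
REPORTED = [
    ("PKCη 27-35, 1P", "WSLRHSLFK", 1, 0, 1, 1253.619),
    ("PKCη 27-36, 1P", "WSLRHSLFKK", 1, 0, 1, 1381.717),
    ("PKCη 16-30, 1P", "IGEAVGLQPTRWSLR", 1, 0, 1, 1762.906),
    ("PKCη 16-35, 2P", "IGEAVGLQPTRWSLRHSLFK", 2, 0, 1, 2455.206),
    ("His-tag, unmodified", "GSSHHHHHHSSGR", 0, 0, 2, 730.32),
    ("His-tag, +80", "GSSHHHHHHSSGR", 1, 0, 2, 770.31),
    ("His-tag, +178", "GSSHHHHHHSSGR", 0, 1, 2, 819.35),
    ("His-tag, +258", "GSSHHHHHHSSGR", 1, 1, 2, 859.33),
]

results = []
for label, seq, n_p, n_g, z, reported in REPORTED:
    calc = peptide_mz(seq, n_p, n_g, z)
    results.append((label, calc, reported))
    print(f"{label:22s} z={z}  calc {calc:9.3f}  reported {reported:9.3f}  "
          f"diff {calc - reported:+.3f}")
```

Under these assumptions each calculated value falls within 0.01 of the reported m/z. The +258 Da form is computed here as one phosphate plus one gluconoyl group; a single phosphogluconoyl group has the same elemental composition and therefore the same mass, which is why MS/MS fragmentation rather than mass alone is needed to tell the two structures apart.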

Table 1. Summary of Phosphorylation at the N-Terminal Serine Residues of Gly-Ser-Ser-[His]6-Tagged Recombinant Proteins Expressed in E. coli

protein name                    His-tag sequence       phosphorylation  ref.
------------------------------  ---------------------  ---------------  ----
Aurora-2 catalytic domain       GSSHHHHHHSSGLVPRGSHMK  Multiple         4
Aurora A catalytic domain       GSSHHHHHHSSGLVPRGSH    Multiple         5
PAK1 catalytic domain           GSSHHHHHHSSGLVPR       Multiple         5
PAK7 catalytic domain           GSSHHHHHHSSGLVPR       Multiple         5
PKA catalytic subunit           GSSHHHHHHSSGLVPRGSH    Multiple         5
StkSA1 catalytic domain         GSSHHHHHHSSGLVPRGSH    Multiple         8
NE2398 CBS domain               GSSHHHHHHSSGRENLYFQGH  Single Ser-2
RPA3416 CBS domain              GSSHHHHHHSSGRENLYFQG   Single Ser-2
PH0267 CBS domain               GSSHHHHHHSSGRENLYFQG   Single Ser-2
Atu1752 CBS domain              GSSHHHHHHSSGRENLYFQG   Single Ser-2
AF0847 CBS domain               GSSHHHHHHSSGRENLYFQG   Single Ser-2
TV1335 CBS domain               GSSHHHHHHSSGRENLYFQGH  Single Ser-2
MJ0653 CBS domain               GSSHHHHHHSSGRENLYFQGH  Single Ser-2
Atu1337 ABM domain              GSSHHHHHHSSGRENLYFQGH  Single Ser-2
GRK-2 PH domain                 GSSHHHHHHSSGLVPRGSHM   None             2
Kinase ZAP-70 SH2 domain        GSSHHHHHHSSGLVPRGSHM   None             2
Kinase p85 SH2 domain           GSSHHHHHH              None             2
SF0624 non-CBS domain           GSSHHHHHHSSGRENLYFQG   None
Altronate hydrolase SAF domain  GSSHHHHHHSSGRENLYFQG   None
Transcriptional regulator       GSSHHHHHHSSGRENLYFQG   None


Table 1 summarizes phosphorylation at the N-terminal His-tag of twenty recombinant proteins as determined by MS, including a collection of those reported by Geoghegan and colleagues.4,5 All proteins were expressed in E. coli with various Gly-Ser-Ser-[His]6- tags, and they are obvious candidates for examining effects on the extent of protein phosphorylation. Previous findings have shown hyperphosphorylation on multiple serine residues of the His-tag in the catalytic domains of the kinases Aurora-2, Aurora A, PAK1, PAK7, PKA and StkSA1,4,5,8 but no serine phosphorylation in the noncatalytic domains (PH, SH2) of the GRK-2, ZAP-70 and p85 kinases.2 Our analyses indicate that the CBS domain proteins are all singly phosphorylated at Ser-2 of the His-tags, but no phosphorylation was observed in the recombinant non-CBS domain segment of the ion transporter SF0624. These observations suggest a common characteristic of the serine-phosphorylated proteins that might be relevant to their nucleotide-binding property, as kinases and CBS domain proteins are known to be nucleotide-binding enzymes. Measurements of the protein complexes of the CBS domain containing protein Atu1752 demonstrate direct nucleotide binding to protein dimers to form stable protein complexes associated with four molecules of ATP (Figures S8, S9 and Table S2, Supporting Information). The predominant Ser-2 phosphorylation of the CBS domain containing proteins suggests that a serine kinase could be involved in catalyzing the phosphotransfer and recognizing the N-terminal serine residues near the multiple-histidine tag region of protein substrates. Autophosphorylation was invoked to explain the multiple serine phosphorylations at the His-tag of the kinases;5 however, this mechanism is inadequate to explain the Ser-2 phosphorylation of the proteins without a kinase domain.
Moreover, nonenzymatic phosphorylation through nonspecific chemical reactions is unlikely: its intensity would be expected to be low because of the low reactivity of serine residues. In contrast, the phosphotransfer reaction mediated by a serine kinase would be reasonably unbiased for any phosphorylated recombinant proteins with a Gly-Ser-Ser-[His]6- tag. Direct nucleotide binding to the CBS domain containing proteins makes the phosphate readily available and consequently enhances the phosphotransfer reaction for Ser-2 phosphorylation. The enzymatic phosphotransfer can be used as a universal pathway for interpreting the formation of in vivo serine phosphorylation in any protein with an N-terminal Gly-Ser-Ser-[His]6-tag, such as Atu1337 (Figure S7, Supporting Information) and the tyrosine kinase Pyk2 expressed in insect cells.5
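The modification and neutral-loss masses used throughout the Results follow directly from the elemental compositions given in the text. The sketch below is a hedged illustration, not part of the published workflow: `formula_mass` is a hypothetical helper that handles only simple formulas without parentheses, the atomic masses are standard monoisotopic values, and the parent ion is the m/z 1717.68 His-tag peptide from Figure 5b.

```python
import re

# Standard monoisotopic atomic masses (Da) for the elements needed here.
ATOM = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "P": 30.97376200}


def formula_mass(formula):
    """Monoisotopic mass of a simple elemental formula such as 'C6H10O6'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOM[element] * (int(count) if count else 1)
    return total


# Modifications named in the text.
phospho = formula_mass("HPO3")                    # phosphorylation, nominal 80 Da
gluconoyl = formula_mass("C6H10O6")               # gluconoylation, nominal 178 Da
phosphogluconoyl = formula_mass("C6H11O9P")       # phosphogluconoylation, nominal 258 Da

print(f"HPO3      {phospho:.4f}")
print(f"C6H10O6   {gluconoyl:.4f}")
print(f"C6H11O9P  {phosphogluconoyl:.4f}")
print(f"HPO3 + C6H10O6 = {phospho + gluconoyl:.4f} (isobaric with C6H11O9P)")

# Neutral losses observed from the +258 Da His-tag peptide (parent m/z 1717.68).
PARENT = 1717.68
LOSSES = ["H3PO4", "C6H10O6", "C6H11O9P", "C6H13O10P"]
fragments = {f: PARENT - formula_mass(f) for f in LOSSES}
for f, mz in fragments.items():
    print(f"loss of {f:10s} ({formula_mass(f):8.4f} Da) -> fragment m/z {mz:.2f}")
```

With these inputs the calculated fragments lie within about 0.05 of the fragment m/z values quoted for Figure 5b. The sum check also makes the central point of the structural argument explicit: separate phosphorylation and gluconoylation add exactly the same mass as a single phosphogluconoylation, so only the fragmentation pattern distinguishes them.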

Conclusions

We have established a rapid, straightforward MALDI QTOF MS analysis using mild digestion to determine the phosphorylation sites of recombinant proteins such as kinases. The method is advantageous for high-sensitivity detection of phosphopeptides, and is especially suitable for the identification of multiple phosphorylations. Using this strategy, we have identified the exact phosphorylation sites of PKCη, Etk and CBS domain containing proteins expressed in E. coli. MS/MS analysis of the phosphopeptide of the CBS domain containing proteins revealed a single phosphorylation at Ser-2, together with gluconoylation at Gly-1, creating a combined +258 Da phosphorylated-and-gluconoylated His-tag peptide with a structure distinct from the phosphogluconoylation (+258 Da) produced in the His-tagged protein kinases. Measurements on the CBS domain containing proteins provide a broad view of the serine phosphorylation at the His-tag of recombinant proteins, and the findings should be helpful in assisting the interpretation of the phosphorylation sites at the N-terminal His-tag of proteins without a kinase domain. Our observations consequently suggest the existence of an E. coli serine protein kinase which catalyzes the phosphotransfer to the hydroxyl group of serine residues at the His-tag of the recombinant proteins.

Acknowledgment. We are grateful to Dr. Daryl Smith, Dr. Michael Rosu-Myles, Dr. Yves Aubin, and Dr. Michele Regimbald-Krnel at the Centre for Biologics Research, BGTD, Health Canada for their careful reading and critical comments on the manuscript.

Supporting Information Available: Supplemental Tables S1-S2 and Figures S1-S9. This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Cohen, S. L.; Chait, B. T. Mass spectrometry as a tool for protein crystallography. Annu. Rev. Biophys. Biomol. Struct. 2001, 30, 67–85.
(2) Geoghegan, K. F.; Dixon, H. B.; Rosner, P. J.; Hoth, L. R.; Lanzetti, A. J.; Borzilleri, K. A.; Marr, E. S.; Pezzullo, L. H.; Martin, L. B.; LeMotte, P. K.; McColl, A. S.; Kamath, A. V.; Stroh, J. G. Spontaneous alpha-N-6-phosphogluconoylation of a “His tag” in Escherichia coli: the cause of extra mass of 258 or 178 Da in fusion proteins. Anal. Biochem. 1999, 267, 169–184.
(3) Kim, K. M.; Yi, E. C.; Baker, D.; Zhang, K. Y. Post-translational modification of the N-terminal His tag interferes with the crystallization of the wild-type and mutant SH3 domains from chicken src tyrosine kinase. Acta Crystallogr., D: Biol. Crystallogr. 2001, 57, 759–762.
(4) Du, P.; Loulakis, P.; Xie, Z.; Simons, S. P.; Geoghegan, K. F. Tandem mass spectrometry of multiply phosphorylated forms of a ‘histidine-tag’ derived from a recombinant protein kinase expressed in bacteria. Rapid Commun. Mass Spectrom. 2005, 19, 547–551.
(5) Du, P.; Loulakis, P.; Luo, C.; Mistry, A.; Simons, S. P.; LeMotte, P. K.; Rajamohan, F.; Rafidi, K.; Coleman, K. G.; Geoghegan, K. F.; Xie, Z. Phosphorylation of serine residues in histidine-tag sequences attached to recombinant protein kinases: a cause of heterogeneity in mass and complications in function. Protein Expr. Purif. 2005, 44, 121–129.
(6) Szczepanowska, J.; Zhang, X.; Herring, C. J.; Qin, J.; Korn, E. D.; Brzeska, H. Identification by mass spectrometry of the phosphorylated residue responsible for activation of the catalytic domain of myosin I heavy chain kinase, a member of the PAK/STE20 family. Proc. Natl. Acad. Sci. U.S.A. 1997, 94, 8503–8508.
(7) Haydon, C. E.; Eyers, P. A.; Aveline-Wolf, L. D.; Resing, K. A.; Maller, J. L.; Ahn, N. G. Identification of novel phosphorylation sites on Xenopus laevis Aurora A and analysis of phosphopeptide enrichment by immobilized metal-affinity chromatography. Mol. Cell. Proteomics 2003, 2, 1055–1067.
(8) Lomas-Lopez, R.; Cozzone, A. J.; Duclos, B. A modified His-tag vector for the production of recombinant protein kinases. Anal. Biochem. 2008, 377, 272–273.
(9) Jenkins, N.; Murphy, L.; Tyther, R. Post-translational modifications of recombinant proteins: significance for biopharmaceuticals. Mol. Biotechnol. 2008, 39, 113–118.
(10) Aon, J. C.; Caimi, R. J.; Taylor, A. H.; Lu, Q.; Oluboyede, F.; Dally, J.; Kessler, M. D.; Kerrigan, J. J.; Lewis, T. S.; Wysocki, L. A.; Patel, P. S. Suppressing posttranslational gluconoylation of heterologous proteins by metabolic engineering of Escherichia coli. Appl. Environ. Microbiol. 2008, 74, 950–958.
(11) Littler, D. R.; Walker, J. R.; She, Y. M.; Finerty, P. J. Jr.; Newman, E. M.; Dhe-Paganon, S. Structure of human protein kinase C eta (PKC eta) C2 domain and identification of phosphorylation sites. Biochem. Biophys. Res. Commun. 2006, 349, 1182–1189.
(12) Lee, D. C.; Zheng, J.; She, Y. M.; Jia, Z. Structure of Escherichia coli tyrosine kinase Etk reveals a novel activation mechanism. EMBO J. 2008, 27, 1758–1766.
(13) Proudfoot, M.; Sanders, S. A.; Singer, A.; Zhang, R.; Brown, G.; Binkowski, A.; Xu, L.; Lukin, J. A.; Murzin, A. G.; Joachimiak, A.; Arrowsmith, C. H.; Edwards, A. M.; Savchenko, A. V.; Yakunin, A. F. Biochemical and structural characterization of a novel family of cystathionine beta-synthase domain proteins fused to a Zn ribbonlike domain. J. Mol. Biol. 2008, 375, 301–315.
(14) Larsen, M. R.; Thingholm, T. E.; Jensen, O. N.; Roepstorff, P.; Jørgensen, T. J. Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol. Cell. Proteomics 2005, 4, 873–886.
(15) Thingholm, T. E.; Jensen, O. N.; Larsen, M. R. Enrichment and separation of mono- and multiply phosphorylated peptides using sequential elution from IMAC prior to mass spectrometric analysis. Methods Mol. Biol. 2009, 527, 67–78.
(16) Ficarro, S. B.; McCleland, M. L.; Stukenberg, P. T.; Burke, D. J.; Ross, M. M.; Shabanowitz, J.; Hunt, D. F.; White, F. M. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 2002, 20, 301–305.

Journal of Proteome Research • Vol. 9, No. 6, 2010 3317


(17) She, Y. M.; Huang, Y. W.; Zhang, L.; Trimble, W. S. Septin 2 phosphorylation: theoretical and mass spectrometric evidence for the existence of a single phosphorylation site in vivo. Rapid Commun. Mass Spectrom. 2004, 18, 1123–1130.
(18) Liu, S.; Zhang, C.; Campbell, J. L.; Zhang, H.; Yeung, K. K.; Han, V. K.; Lajoie, G. A. Formation of phosphopeptide-metal ion complexes in liquid chromatography/electrospray mass spectrometry and their influence on phosphopeptide detection. Rapid Commun. Mass Spectrom. 2005, 19, 2747–2756.
(19) Ignoul, S.; Eggermont, J. CBS domains: structure, function, and pathology in human proteins. Am. J. Physiol. Cell Physiol. 2005, 289, C1369–C1378.

PR9011987