Strategy for Comprehensive Identification of Post-translational Modifications in Cellular Proteins, Including Low Abundant Modifications: Application to Glyceraldehyde-3-phosphate Dehydrogenase

Jawon Seo,† Jaeho Jeong,† Young Mee Kim,† Narae Hwang,† Eunok Paek,‡ and Kong-Joo Lee*,†

Center for Cell Signaling and Drug Discovery Research, College of Pharmacy and Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Korea, and Department of Mechanical and Information Engineering, University of Seoul, Seoul 130-743, Korea

Received August 8, 2007; Accepted November 8, 2007

Post-translational modifications (PTMs) play key roles in the regulation of biological functions of proteins. Although some progress has been made in identifying several PTMs using existing approaches involving a combination of affinity-based enrichment and mass spectrometric analysis, comprehensive identification of PTMs remains a challenging problem in proteomics because of the dynamic complexities of PTMs in vivo and their low abundance. We describe here a strategy for rapid, efficient, and comprehensive identification of PTMs occurring in biological processes in vivo. It involves a selectively excluded mass screening analysis (SEMSA) of unmodified peptides during liquid chromatography-electrospray ionization-quadrupole-time-of-flight tandem mass spectrometry (LC-ESI-q-TOF MS/MS) through replicated runs of a purified protein on two-dimensional gel. A precursor ion list of unmodified peptides with high mass intensities was obtained during the initial run, followed by exclusion of these unmodified peptides in subsequent runs. The exclusion list can grow as long as replicate runs are iteratively performed. This enables the identification of modified peptides with precursor ions of low intensities by MS/MS sequencing. Application of this approach in combination with the PTM search algorithm MODi to GAPDH protein in vivo modified by oxidative stress provides information on multiple protein modifications (19 types of modification on 42 sites) with >92% peptide coverage and the additional potential for finding novel modifications, such as transformation of Cys to Ser. On the basis of the information of precursor ion m/z, quantitative analysis of PTM was performed for identifying molecular changes in heterogeneous protein populations. Our results show that PTMs in mammalian systems in vivo are more complicated and heterogeneous than previously reported. We believe that this strategy has significant potential because it permits systematic characterization of multiple PTMs in functional proteomics.

Keywords: proteomic analysis • GAPDH • post-translational modifications • minor PTMs • SEMSA • selectively excluded mass screening analysis • oxidative stress

* To whom correspondence should be addressed: College of Pharmacy and Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Korea. Telephone: 82-2-3277-3038. Fax: 82-2-3277-3760. E-mail: [email protected].
† Ewha Womans University.
‡ University of Seoul.
10.1021/pr700657y CCC: $40.75 © 2008 American Chemical Society
Journal of Proteome Research 2008, 7, 587–602. Published on Web 01/10/2008

Introduction

Post-translational modifications (PTMs) play key roles in many important cellular functions and regulatory processes by influencing cellular localization, protein–protein interactions, and biological activities of cellular proteins. However, accurate identification of PTMs is difficult because of their diversity, complexity, low abundance, and heterogeneity.1,2 Often only a small fraction of PTMs can be successfully identified using database search tools, such as SEQUEST, Mascot, and Protein Prospector.3–5 However, comprehensive identification of PTMs, especially in a high-throughput manner,6,7 remains a highly challenging task because of additional difficulties in interpreting tandem mass spectra for peptide sequencing, poor peptide fragmentation, and unexpected modifications, which limit real applications to a few types of PTMs.8–10 However, the recent development of convenient and unrestrictive PTM search algorithms for searching in blind mode, such as MSAlignment11 and MODi,12 paves the way for rapidly interpreting tandem mass spectra of peptides with multiple and unrestrictive PTMs.

How and to what extent cellular proteins are post-translationally modified, and what the functional consequences of these PTMs are, remain important issues in cell biology. Thus far, PTM analyses were mostly carried out by combining affinity-based enrichments, such as immobilized metal ion affinity chromatography (IMAC) or immunoaffinity, and mass spectrometry

research articles

Seo et al.

Figure 1. Identification of GAPDH with MALDI-TOF MS. HEK293T cells transiently transfected with Flag-GAPDH were exposed to control (A) or 5 mM H2O2 (B) for 1 h at 37 °C. Cell lysates were subjected to immunoprecipitation by an anti-Flag antibody. The immunoprecipitated proteins were separated by 2D gel and stained with silver (A and B). The spots in rectangular boxes were subsequently identified as GAPDH by detecting the unmodified peptide using MALDI-TOF MS (C).

(MS).13–16 However, with a specific enrichment, only limited types of PTMs can be identified. Comprehensive characterization of PTMs occurring in biological processes has been difficult because of their low stoichiometry (92% peptide coverage and enabled complete characterization of PTMs, such as oxidations on C, W, and M, phosphorylation, deamidation, dimethylation, mutation, etc. Simultaneously, we performed quantitative analysis of modified peptides using precursor ion mass. Our results demonstrate the heterogeneity and complexity of PTMs naturally occurring in cellular proteins during biological processes.

Experimental Procedures

Purification of GAPDH from Transiently Overexpressing Cells Using Immunoprecipitation. Human embryonic kidney 293T (HEK293T) epithelial cells were grown and maintained in high glucose Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum (FBS) at 37 °C and 5% CO2. All experiments were performed on approximately 50% confluent cell cultures. Cells were seeded in 10 cm plates a day before transfection at a density of 1.5 × 10⁶ cells and transiently transfected with 6 µg of Flag-GAPDH expression plasmid by the calcium phosphate method. The medium was replaced with fresh medium 6 h after transfection, and the cells were cultured for an additional 19 h before H2O2 treatment. For H2O2 treatment, cells transfected with Flag-GAPDH were incubated with 1 mM H2O2 in Hank’s balanced salt solution (HBSS) for 1 h at 37 °C. For immunoprecipitation, cells were disrupted with a lysis buffer containing protease inhibitors [50 mM Tris base, 150 mM NaCl, 2 mM ethylenediaminetetraacetic acid (EDTA), 0.5% NP40, 1 mM phenylmethylsulphonyl fluoride (PMSF), 0.5 mM dithiothreitol (DTT), 5 µg/mL aprotinin, 1 µg/mL leupeptin, and 5 mM Na3VO4 at pH 7.4] for 30 min on ice. Lysates were centrifuged at 20000g for 1 h, and the supernatant was incubated for 3 h at 4 °C with monoclonal anti-Flag M2-affinity agarose beads. Beads were washed 3 times with 1 mL of lysis buffer.

Two-Dimensional Gel Electrophoresis and Immunoblot Analysis. The precipitated immune complexes were mixed with sample buffer containing protease inhibitors (9.5 M urea, 2% Triton X-100, 5% β-mercaptoethanol, 1 mM PMSF, 5 µg/mL aprotinin, 10 µg/mL pepstatin A, 10 µg/mL leupeptin, 1 mM EDTA, 10 mM Na3VO4, and 10 mM NaF), allowed to stand for 30 min at room temperature, and electrofocused in 7 cm Immobiline DryStrips (pH 3–10) with Amersham IPGphor as described previously.10

Sample Preparation for MS Analysis. GAPDH was separated by 2D gel electrophoresis and stained with silver. The gel spots of GAPDH were excised with a scalpel, destained with 15 mM K4Fe(CN)6/50 mM sodium thiosulfate, and washed to remove the destaining reagent. The pH was adjusted to 8.0 with 200 mM NH4HCO3 to facilitate trypsin digestion. The gels were dehydrated by the addition of acetonitrile, rehydrated by adding 10–20 µL of 25 mM NH4HCO3 with 20 ng/µL sequencing-grade trypsin (Promega Co.), and incubated at 37 °C for 15–17 h. Peptides were extracted with 30 µL of solution containing 60% acetonitrile (ACN)/0.1% trifluoroacetic acid (TFA). The extracts were pooled and evaporated to dryness in a SpeedVac for MS analysis. Formic acid was added to the peptide solution so that the final concentration of formic acid in the solvent was 0.1% to facilitate electrospray.

LC and MS. Peptides were analyzed by nano-flow reversed-phase high-performance liquid chromatography (HPLC)/ESI/MS/MS with a mass spectrometer (Q-tof Ultima global, Waters Co., U.K.), comprising a three-pump Waters nano-LC system with an autosampler, a stream selection module configured for a precolumn plus an analytical capillary column, and operated under MassLynx 4.0 control (Waters Co., U.K.). Peptides were separated using a C18 reversed-phase 75 µm i.d. × 150 mm analytical column (3 µm particle size, Atlantis dC18, Waters) with an integrated ESI SilicaTip (10 µm, New Objective, Woburn, MA). A total of 5 µL of peptide mixture, dissolved in buffer C (95:5:0.2 water/ACN/formic acid, v/v/v), was injected onto the column and eluted by a linear gradient of 5–80% buffer B (95:5:0.2 ACN/water/formic acid, v/v/v) over 120 min.

Journal of Proteome Research • Vol. 7, No. 2, 2008 589


Figure 3. Comparison of MS/MS spectra of low-intensity peptides under the SEMSA strategy. (A) Without an implementation of SEMSA strategy, the peptide was not analyzed by MS/MS acquisition. (B and C) With an implementation of SEMSA strategy, the same peptide was analyzed as a chosen candidate for MS/MS acquisition.


Comprehensive PTM Analysis of GAPDH Using SEMSA


Figure 4. Summary of characterized PTMs in replicated survey scan mode with SEMSA strategy. (A) Summary of identified total and modified peptides with each run in spot 1 of H2O2-treated GAPDH. The number in parenthesis indicates the newly identified PTMs not identified in previous runs. (B) Summary of identified intensities of variously modified peptides from each run.

Samples were desalted on line prior to separation using a trap column (5 µm particle size, NanoEase dC18, Waters) cartridge. Initially, the flow rate was set to 200 nL/min by a split/splitless inlet, and the capillary voltage (3.0 kV) was applied to the HPLC mobile phase before spray. Chromatography was performed on line to the Q-tof Ultima global. MS parameters for efficient data-dependent acquisition were an intensity of >10 and 3-4 components to be switched from MS to MS/MS analysis. In the first run analysis, the three most abundant precursors were selected for MS/MS analysis. After positive identification, all identified peptides from the database search (Mascot) were nonredundantly excluded in the next run analysis until full sequence coverage was obtained.

Data Analysis. After data acquisition, the individual MS/MS spectra acquired for each of the precursors within a single LC run were combined, smoothed, deisotoped, and centroided using the Micromass ProteinLynx Global Server (PLGS) 2.1 data processing software and output as a single MASCOT-searchable peak list (.pkl) file. The peak list files were used to query the SwissProt version 50.8 (234 112 sequences; 85 963 701 residues) database using MASCOT version 2.1.03 (global search engine) and MODi (Korea, http://prix.uos.ac.kr/modi/), with the following parameters: peptide mass tolerance, 0.5 Da; MS/MS ion mass tolerance, 0.2 Da; up to 2 missed trypsin cleavage sites allowed; variable modifications considered, such as acetylation, deamidation, methylation, pyro-Glu (N-term E, Q), oxidation, dimethylation, phosphorylation, and cysteine propionamide, but no fixed modifications; enzyme limited to trypsin; and taxonomy limited to Homo sapiens (14 780 sequences). Only significant hits as defined by MASCOT probability analysis were considered. In addition, a minimum total score of 50, comprising at least one peptide match with an ion score of more than 20, was arbitrarily set as the threshold for acceptance. All reported assignments were verified by automatic and manual interpretation of spectra from Mascot and MODi in a blind mode. A large number and variety of potential PTMs were considered with almost full sequence coverage.
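The acceptance threshold described above can be expressed as a short filter. The following is a minimal sketch, not part of the Mascot software: the `PeptideMatch` class and the `accept_protein_hit` function are hypothetical names introduced for illustration, and the sketch assumes that the total score threshold of 50 is inclusive while the ion score threshold of 20 is strict.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PeptideMatch:
    """One peptide-spectrum match (hypothetical container, not a Mascot type)."""
    sequence: str
    ion_score: float


def accept_protein_hit(total_score: float,
                       matches: Iterable[PeptideMatch],
                       min_total: float = 50.0,
                       min_ion: float = 20.0) -> bool:
    """Return True if a hit passes the score thresholds quoted in the text.

    Assumption: total score >= 50 and at least one peptide with ion score > 20.
    """
    has_strong_peptide = any(m.ion_score > min_ion for m in matches)
    return total_score >= min_total and has_strong_peptide
```

Such a filter would be applied only after the significance test performed by the search engine itself, which is not reproduced here.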

Results

Cellular proteins exist in multiple heterogeneous forms, appearing as multiple spots on two-dimensional gel.20,21 Comprehensive identification of PTMs is indispensable for understanding the biological functions of proteins with multiple modifications. However, because of the low intensity and heterogeneity of post-translationally modified proteins in real biological samples, identification of multiple PTMs is a difficult task. We tried to obtain unrestrictive PTM information via replicate nano-LC-ESI MS/MS analysis by raising the peptide coverage over 90% using well-defined biological samples.

Figure 5. ESI-MS/MS spectra of PTM peptides isolated from human GAPDH. (A) Three peptides show the diversity of PTMs: 146IISNASCTTNCLAPLAK162, 235VPTANVSVVDLTCR248, and 310LISWYDNEFGYSNR323; +, detecting the precursor ion with MS; ++, detecting the sequence with MS/MS; superscript, SEMSA run time. (B-D) ESI-MS/MS spectra of PTM peptides containing various modifications.

PTM Identification by SEMSA Employing the Peptide Library. We employed matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) MS and nano-LC-ESI-q-TOF tandem MS to comprehensively identify the various modifications resulting from oxidative stress of GAPDH in vivo. H2O2-treated or untreated HEK293T cells overexpressing Flag-GAPDH were lysed, and the lysates were immunoprecipitated with anti-Flag antibody. The immunoprecipitates were separated on 2D gels. Multiple spots of GAPDH appeared in both cases as shown in Figure 1. The intensities of acidic spots increased following oxidative stress (parts A and B of Figure 1). To identify the PTMs represented by various spots, peptide fingerprinting with MALDI-TOF MS and peptide sequencing with nano-LC-ESI-q-TOF tandem MS were employed. However, the high-intensity peptide peaks shown in Figure 1C were mostly unmodified peptides; the low-abundance modified peptides were not detected in the presence of high-intensity unmodified peptides. Therefore, to facilitate the characterization of as many PTMs as possible, we devised the strategy of selective exclusion acquisition in replicate run analysis. In this approach, survey MS scan mode is used first, and then database search software tools, such as Mascot, MODi, and ProteinLynx, are used to validate MS/MS data sets. However, the most intense precursor ions in current data-dependent acquisition (DDA) systems are redundantly acquired in the nano-LC-ESI MS/MS run. If the exclusion list is not used, identification of low-intensity ions in the presence of high-intensity ions would be far less effective in randomly repeated runs. This is the reason for selectively excluding unwanted high-intensity MS/MS data generation.
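The effect of an exclusion list on data-dependent precursor selection can be illustrated with a small sketch. It is not the instrument control software: `Precursor` and `select_precursors` are hypothetical names, the survey-scan peaks and intensities are invented for illustration, and the 1 min retention time window is an assumption. The top-3 rule and the 0.6 Da matching tolerance are the values quoted in this article.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Precursor(NamedTuple):
    mz: float         # precursor m/z from the survey scan
    rt_min: float     # retention time in minutes
    intensity: float  # survey-scan intensity (arbitrary units)


def select_precursors(survey: Sequence[Precursor],
                      exclusion: Sequence[Tuple[float, float]],
                      top_n: int = 3,
                      mz_tol: float = 0.6,
                      rt_tol_min: float = 1.0) -> List[Precursor]:
    """Pick the top_n most intense precursors that are not on the exclusion list."""

    def is_excluded(p: Precursor) -> bool:
        # A peak is excluded when it matches a listed (m/z, RT) pair within tolerance.
        return any(abs(p.mz - mz) <= mz_tol and abs(p.rt_min - rt) <= rt_tol_min
                   for mz, rt in exclusion)

    candidates = [p for p in survey if not is_excluded(p)]
    return sorted(candidates, key=lambda p: p.intensity, reverse=True)[:top_n]


# Hypothetical survey scan: three abundant peptides and one weak modified peptide.
survey_scan = [
    Precursor(931.46, 62.2, 900.0),
    Precursor(1139.08, 62.0, 700.0),
    Precursor(650.30, 61.8, 500.0),
    Precursor(898.40, 62.1, 40.0),  # low-intensity modified peptide
]

# First run: no exclusion list, so the weak peak loses to the three abundant ones.
first_run = select_precursors(survey_scan, exclusion=[])

# Second run: two identified unmodified peptides are excluded, freeing a slot.
second_run = select_precursors(survey_scan,
                               exclusion=[(931.46, 62.2), (1139.08, 62.0)])
```

In this toy example the weak precursor at m/z 898.40 is skipped in the first run and selected in the second, which is the behavior the exclusion strategy is meant to produce.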

The overall scheme of SEMSA is shown in Figure 2. To efficiently characterize all of the theoretically possible types of PTMs in a blind mode, we constructed an empirical unmodified peptide library, only including the unmodified peptides, identified using an accumulated exclusion list obtained from replicate nano-LC-ESI MS/MS runs. Accumulating data sets from repeated runs can greatly increase the number of peptide identifications in LC-ESI MS/MS analysis. However, abundant peptides are often repeatedly analyzed in replicate runs because of their high intensity, resulting in redundant MS/MS spectra in the cumulative data set. An exclusion-based implementation using this unmodified peptide library resulted in efficient identification of low abundant PTMs. After nano-LC-ESI MS/MS separation in the first run, the MS/MS acquisition of this run followed the standard survey mode, as described in the Experimental Procedures. The LC-MS procedure is repeated 3 times to obtain more MS/MS data. The MS data of the first run are then processed by ProteinLynx version 2.1 for peak deconvolution and peak list generation. The resulting MS/MS spectra are then generated and submitted to Mascot and MODi database searches to obtain peptide identifications. Only unmodified peptides now serve as candidates for a precursor exclusion list, in terms of m/z and LC run times in the subsequent run. After the peak list of the second separation is generated, the peaks matched to peptides previously identified and included in the exclusion list are automatically removed from the peak list prior to MS/MS acquisition. The ranges of tolerance windows are typically determined by mass accuracy in the MS scan and peak widths in the chromatogram. The unmodified peptides positively identified in the second run analysis can be added to the exclusion list for the third MS/MS run analysis as well. The same procedure can then be applied to subsequent runs. This PTM-specific exclusion strategy enables less intense PTM peptides to be analyzed and identified, thereby enhancing the confidence level of PTM identifications. For example, Figure 3 shows the effectiveness of the SEMSA strategy for enhanced characterization of low-intensity PTM peptides. In the first run without the exclusion strategy, the modified peptide m/z 898.40 (1794.78, +2), having a low intensity, was not selected for MS/MS analysis, because data-dependent acquisition chooses only the three highest intensity MS peaks for switching to MS/MS analysis. However, the spectrum in Figure 3A was obtained by selecting the known precursor ion (m/z 898.40). In the second run with exclusion of high-intensity unmodified peptides, peptide m/z 898.40 (1794.78, +2) is now selected as a new precursor ion for MS/MS analysis. In comparison to the MS/MS spectrum in the first run, the new spectrum (parts B and C of Figure 3) showed stronger signals, resulting in a new PTM identified as W-formylkynurenine of this peptide sequence, LISWYDNEFGYSNR, with a high Mascot ion score.

To demonstrate the advantages of this strategy, nano-LC MS/MS analysis of tryptic digests of GAPDH expressed in human HEK293T cells separated on 2D gel was performed with and without the exclusion strategy. A mass tolerance of 0.6 Da was used in matching peaks to the exclusion list. Peaks could be easily matched to those on the exclusion list. When the study was conducted with the exclusion list, 210 peaks were acquired in the first separation and an additional 158 and 207 peaks were found in the serial runs, respectively. Figure 4A summarizes the results of PTM identification. In nano-LC MS/MS analyses without the exclusion list, the number of unique PTM peptides identified was 19 of 33 total identified peptides, and the percentage of PTM peptides in the total peptides identified amounted to 57% (19/33) in a single run. On the other hand, in the accumulated data set of replicate LC MS/MS with the exclusion list, where all of the additional peptides were identified with the same confidence level as that of the first run, the numbers of unique PTM peptides newly identified were 27 and 35 and the PTM identification rate was 79% (27/34) and 89% (35/39) in the second and third runs, respectively. Indeed, the MS run analysis without the exclusion list results in repeated identifications of the same peptides (data not shown). This exclusion strategy makes it possible to increase, with a high confidence level, the number of unique PTM peptides extracted in repeated runs up to 81 of a total of 106 peptides. Figure 4B represents peak intensities of MS/MS spectra of various peptides in oxidized GAPDH from replicate runs. Many unmodified and modified peptides appeared in the first and second runs, and many modified peptides appeared in the second and third runs. These results indicate that SEMSA is a powerful tool for identifying the minor populations of modified peptides.

Figure 6. Quantitative analysis of PTM peptides using precursor ion intensities. (A) For the analysis of peptides having the same retention time, the relative intensity ratio of peptide 146IISNASCTTNCLAPLAK162 containing reduced (m/z, 931.4600) and oxidized (m/z, 919.8900) cysteine on 156C was compared between spots 1 and 2. (B) For the analysis of peptides having different retention times, we employed one unmodified peptide, which is constantly measured, as an internal standard; the intensity ratio of the unmodified peptide to the modified one was compared between spots. (C) Examples of the quantitative analysis of modified peptides.

Comprehensive Identification of PTMs in Cellular GAPDH. To determine whether the GAPDH spots on 2D gel have differential modifications, we intensively examined the PTMs in each spot (control spots 1 and 2 and H2O2-treated spots 1 and 2) using SEMSA. As shown in Figure 5A, diverse PTM populations were identified in various peptides. Peptides identified more than 3 times were presented as modified peptides at specific sites. It is interesting to note that peptide 146IISNASCTTNCLAPLAK162 containing active-site CxxxC showed 12 kinds of modified peptides, peptide 235VPTANVSVVDLTCR248 has 7 kinds of PTMs, and peptide 310LISWYDNEFGYSNR323 has 5 kinds. These PTMs in GAPDH from control and oxidized spots on 2D gel are presented in Supplementary Table 1 in the Supporting Information, identifying 72 modified peptides, including multiple modifications at 42 sites. We expected that each spot on the 2D gel would have distinct PTMs after oxidative stress, but the PTMs in each spot had no discernible differences. This indicates that heterogeneous populations of intracellular proteins differ mostly quantitatively in their PTMs (i.e., only in the amount of the PTMs) but not qualitatively (i.e., in the chemical nature of the PTM). As can be seen in parts B-D of Figure 5, which presents some validated MS/MS spectra of the PTMs identified, the SEMSA strategy also revealed some novel PTMs, including transformation of cysteine to serine and dehydroalanine.

Figure 7. Comprehensive characterization of PTMs in GAPDH. List of identified PTMs in GAPDH spot 1.

Quantitative Analysis of PTMs in GAPDH. We performed quantitative analysis of modified peptides present in each spot by determining the accumulated precursor ion peak intensity. For example, the peptide containing the active site, 146IISNASCTTNCLAPLAK162, exists as an unmodified peptide, which can be easily labeled with propionamide in acrylamide gel at both 152C- and 156C-SH (m/z, 931.4600, +2), and as an oxidized peptide containing 152C-cysteic acid and 156C-propionamide (m/z, 919.8900, +2). The precursor ion intensities of the two peptides, m/z 931.46 and 919.89, were calculated from each peak intensity in Figure 6A because they have the same retention time at 62.2 min. The ratios of control (two reduced cysteines labeled with propionamide, +142.08) and oxidized peptides (152C oxidized to cysteic acid and 156C to propionamide, +119.02) in spots 1 and 2 were measured, respectively, as shown in Figure 6A. It is meaningful to compare two peptides at different spots because the ionization efficiency of peptides varies depending upon the sequence and the degree of modification.
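The mass shifts quoted above can be checked with simple arithmetic. The sketch below is an illustration only: the per-modification values for propionamide and cysteic acid are standard monoisotopic shifts assumed here (the text reports only the combined values +142.08 and +119.02), and `mz_shift` is a hypothetical helper name.

```python
# Assumed standard monoisotopic mass shifts (Da); not stated individually in the text.
PROPIONAMIDE = 71.03711   # acrylamide adduct on Cys (C3H5NO)
CYSTEIC_ACID = 47.98474   # Cys oxidized by three oxygen atoms


def mz_shift(delta_mass: float, charge: int) -> float:
    """m/z shift produced by a neutral mass change at a given charge state."""
    return delta_mass / charge


# Control peptide: both 152C and 156C labeled with propionamide.
reduced_shift = 2 * PROPIONAMIDE

# Oxidized peptide: 152C as cysteic acid, 156C labeled with propionamide.
oxidized_shift = CYSTEIC_ACID + PROPIONAMIDE

# Expected m/z gap between the two doubly charged precursors,
# compared with the gap between the reported m/z values.
expected_gap = mz_shift(reduced_shift - oxidized_shift, charge=2)
observed_gap = 931.4600 - 919.8900

print(f"reduced +{reduced_shift:.2f} Da, oxidized +{oxidized_shift:.2f} Da")
print(f"expected gap {expected_gap:.3f}, observed gap {observed_gap:.3f} (m/z)")
```

Under these assumptions the combined shifts agree with the reported +142.08 and +119.02 to within 0.01 Da, and the expected and observed m/z gaps differ by well under the 0.5 Da peptide mass tolerance used in the database search.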
An absolute comparison of control and oxidized peptides is not useful for quantitation because of the changes in ionization efficiency. The ratio of control reduced peptide to oxidized peptide was 8.5 in spot 1 and 0.1 in spot 2. This indicates that both spots 1 and 2 have varying mixtures of unmodified and oxidized peptides, and the amounts of each peptide were significantly different. Acidic spot 2 is enriched by the peptide containing 152C-cysteic acid, while spot 1 is enriched by the unmodified peptide. To obtain the relative intensity of modified peptides, which have different retention times, we adopted unmodified peptide 86WGDAGAEYVVESTGVFTTMEK106 (m/z, 1139.0835, +2) as a standard and assumed that it has the same intensity in each spot. In Figure 6B, we compared the relative intensities of the modified peptide 146IISNASCTTNCLAPLAK162 containing three modification sites (m/z, 927.9202, +2; phosphorylation at 148S, substitution of 152C to 152S, and propionamide of 156C) and two modification sites (m/z, 887.9725, +2; substitution of 152C to 152S and propionamide of 156C) with that of the control unmodified peptide. Transformation of 152C to Ser was dramatically increased in acidic spot 2 of the control and in the H2O2-treated spot. Using this unmodified standard, relative intensities of many modified peptides were compared between spots 1 and 2 (Figure 6C). This indicates that 152C-SH is abundant in more basic spot 1 and that 152C-oxidized modification is more abundant in acidic spot 2 in 146IISNASCTTNCLAPLAK162. On the other hand, unmodified 247C is enriched in peptide 235VPTANVSVVDLTCR248 and contains minor populations of 247C-sulfinic acid, 247C-dehydroalanine, and 247C-cysteic acid, and the more acidic spot 2 was enriched by 247C-dehydroalanine and 247C-cysteic acid. Both 146–162 and 235–248 peptides contain cysteine residues, but the oxidation species of cysteine are significantly different in terms of the amount and modification type. The oxidations of 313W, including 313W-ox and 313W-formylkynurenine, were slightly higher in spot 2 than in spot 1. Many modifications of control and oxidized GAPDH in spots 1 and 2 appear in Figure 7 and Supplementary Figure 1 in the Supporting Information: these include dimethylation at K (+28.03 Da), deamidation at N (+0.98 Da), intermediate succinimide at N and D (-17.00 and -18.01 Da), phosphorylation at S and T (+79.97 Da), pyroglutamic acid at Q (-17.03 Da), and oxidation at M (+15.99 Da). This indicates that cellular modifications of amino acids are tertiary and quaternary structure-specific. Comprehensive PTMs of GAPDH, validated with repeat detections, are illustrated in Figure 7 and Supplementary Figure 1 in the Supporting Information. This is the first comprehensive report, using the SEMSA strategy, on in vivo PTMs of GAPDH including some novel PTMs. Many previous reports showed that GAPDH is easily oxidized by oxidative stresses, but the results here clearly demonstrated the exact oxidation sites, oxidation species, and the quantification of oxidation states.
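The two quantitation modes described in this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' processing code: the function names are hypothetical, and all intensity values are invented, with the first two pairs chosen only so that they reproduce the reported ratios of 8.5 and 0.1.

```python
def intensity_ratio(numerator: float, denominator: float) -> float:
    """Ratio of two precursor ion intensities measured in the same spot."""
    if denominator <= 0:
        raise ValueError("denominator intensity must be positive")
    return numerator / denominator


def fold_change_between_spots(modified_a: float, standard_a: float,
                              modified_b: float, standard_b: float) -> float:
    """Change of a modified peptide from spot A to spot B.

    Each modified-peptide intensity is first normalized to the unmodified
    internal standard of its own spot, which is assumed (as in the text)
    to be equally abundant in every spot.
    """
    normalized_a = intensity_ratio(modified_a, standard_a)
    normalized_b = intensity_ratio(modified_b, standard_b)
    return intensity_ratio(normalized_b, normalized_a)


# Mode 1: co-eluting reduced and oxidized forms, compared within each spot.
# Hypothetical intensities chosen to reproduce the reported ratios.
spot1_ratio = intensity_ratio(850.0, 100.0)  # reduced / oxidized in spot 1
spot2_ratio = intensity_ratio(20.0, 200.0)   # reduced / oxidized in spot 2

# Mode 2: peptides with different retention times, normalized to the standard.
# Hypothetical intensities for a modified peptide and the internal standard.
change = fold_change_between_spots(modified_a=50.0, standard_a=1000.0,
                                   modified_b=300.0, standard_b=1200.0)
```

Normalizing within each spot before comparing spots avoids comparing raw intensities of peptides whose ionization efficiencies differ, which is the concern raised in the text.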

Discussion

These studies describe and validate the usefulness of the SEMSA strategy for identifying protein PTMs in vivo. Employing SEMSA, we comprehensively identified PTMs in GAPDH, including some novel modifications in the presence and absence of oxidative stress, and the results are summarized in Figure 5A and Supplementary Table 1 in the Supporting Information. Simultaneously, we also performed quantitative analysis of modified peptides using newly identified peptide precursor ions. To define the specific role of each PTM in the biological function of a protein, we first need to separate modified and unmodified proteins. We adopted 2D gel electrophoresis to obtain well-separated samples. Second, we have to address the problem of low abundance of modified peptides relative to unmodified ones. We addressed this problem by enriching cellular GAPDH using immunoprecipitation and by using the SEMSA technique to exclude the abundant unmodified peptides and replicating runs for MS/MS detection of modified peptides. Third, we employed a newly developed software tool, MODi, which makes it possible to identify multiple PTMs in one peptide and any unexpected and unknown modifications,12 in addition to Mascot. We comprehensively identified the PTMs of GAPDH in response to oxidative stress by combining the improved procedures, including sample enrichment and separation, nano-LC-ESI-q-TOF tandem MS, and the MODi software tool. This strategy, which we call SEMSA, uses an exclusion list of redundant peptides in replicate LC-ESI MS/MS analyses, which can substantially increase the number of PTM peptide identifications in complex proteome analysis. On the basis of the m/z value, only the unmodified portion of all of the peptides identified in the first LC-MS/MS run was automatically excluded in the next run, thus preventing redundant MS/MS acquisitions and increasing MS/MS detection of low-intensity PTM peptides.
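The way the exclusion list grows from run to run can be sketched as below. This is a minimal illustration, not the software used in the study: `Identification` and `update_exclusion_list` are hypothetical names, the m/z and retention time values of the example peptides other than those quoted in the article are invented, and duplicates are detected by simple rounding of m/z and retention time, which is an assumption made for brevity.

```python
from typing import Iterable, List, NamedTuple, Sequence, Tuple


class Identification(NamedTuple):
    sequence: str
    mz: float
    rt_min: float
    modifications: Tuple[str, ...]  # empty tuple means unmodified


def update_exclusion_list(exclusion: Sequence[Tuple[float, float]],
                          identifications: Iterable[Identification]
                          ) -> List[Tuple[float, float]]:
    """Return a new exclusion list extended with newly identified unmodified peptides."""
    seen = {(round(mz, 2), round(rt, 1)) for mz, rt in exclusion}
    updated = list(exclusion)
    for ident in identifications:
        if ident.modifications:
            continue  # modified peptides are never excluded
        key = (round(ident.mz, 2), round(ident.rt_min, 1))
        if key not in seen:
            seen.add(key)
            updated.append((ident.mz, ident.rt_min))
    return updated


# Hypothetical search results from two consecutive runs.
run1 = [
    Identification("WGDAGAEYVVESTGVFTTMEK", 1139.08, 70.5, ()),
    Identification("LISWYDNEFGYSNR", 898.40, 62.1, ("W-formylkynurenine",)),
]
run2 = [
    Identification("WGDAGAEYVVESTGVFTTMEK", 1139.08, 70.5, ()),  # seen again
    Identification("IISNASCTTNCLAPLAK", 931.46, 62.2, ()),
]

exclusion_after_run1 = update_exclusion_list([], run1)
exclusion_after_run2 = update_exclusion_list(exclusion_after_run1, run2)
```

Returning a new list rather than mutating the input keeps the exclusion list of each run available for later comparison.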
Many unexpected and novel PTMs were detected by identifying MS/MS spectra using MODi. This study demonstrates that cellular GAPDH is composed of heterogeneous populations of PTMs. Both constitutively and differentially modified peptides, in response to oxidative stress, were recognized in each spot. Novel modification sites, including phosphorylation sites (T75, S122, S148, T229, T237, and S312), dimethylation sites (K5, K66, K194, K215, K219, K227, K260, K263, and K334), deamidation (N9, N64, N70, N149, N155, N225, and N316), and methionine oxidation (M130, M133, M231, and M328), were identified. Various novel modifications were also observed. For example, one tryptophan site (W313) was observed to oxidize to various oxidative states, one oxidation (+15.99), quinone (+29.97), and formylkynurenine (+31.99), as shown in Supplementary Figure 2 in the Supporting Information. The formation of formylkynurenine in plants was identified previously.22 Deamidation at many N sites was also observed, and the succinimide intermediate of deamidation from N to D was clearly detected at various sites (N70, N225, and D326) as shown in Supplementary Table 1 in the Supporting Information. Multiple modifications of cysteine residues clearly appeared in the peptide containing the active site (152CTTNC156): intradisulfide between 152C and 156C, oxidation to cysteic acid (152C), and transformation of C to S (152C). Simultaneously, 247C was shown to transform to sulfinic acid, mainly dehydroalanine and cysteic acid. This indicates that cysteine can be oxidized to various oxidation states depending upon tertiary structural environments as shown in Supplementary Figure 1 in the Supporting Information. Several oxidation pathways were proposed in previous papers.23,24 The formation of serine and dehydroalanine from C was first detected in this study, possibly because of the high sensitivity of the SEMSA strategy for the detection of low abundance PTMs. The results in this study demonstrate the comprehensive PTMs of GAPDH, which are far beyond PTM information previously reported for S-thiolation,25 nitrosylation,26 and disulfide bond.27 This information makes it possible to understand the molecular mechanism of GAPDH in biological processes.
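The mass shifts reported in this article lend themselves to a simple lookup that annotates an observed mass difference. The sketch below is not the MODi or Mascot algorithm: the function names are hypothetical, the table holds only shifts quoted in the text, the 0.5 Da tolerance is the peptide mass tolerance from the search parameters, and the unmodified mass of 1762.79 Da used in the example is an assumption back-calculated for illustration.

```python
from typing import Dict, List

PROTON_MASS = 1.00728  # Da

# Mass shifts (Da) as quoted in the text of this article.
MASS_SHIFTS: Dict[str, float] = {
    "oxidation": 15.99,
    "quinone (W)": 29.97,
    "formylkynurenine (W)": 31.99,
    "dimethylation (K)": 28.03,
    "deamidation (N)": 0.98,
    "phosphorylation (S/T)": 79.97,
    "pyroglutamic acid (Q)": -17.03,
}


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral peptide mass from the m/z of a protonated precursor ion."""
    return (mz - PROTON_MASS) * charge


def annotate_delta(observed_mass: float, unmodified_mass: float,
                   tol: float = 0.5) -> List[str]:
    """Names of tabulated modifications whose shift matches the mass difference."""
    delta = observed_mass - unmodified_mass
    hits = [name for name, shift in MASS_SHIFTS.items()
            if abs(delta - shift) <= tol]
    # Closest match first.
    return sorted(hits, key=lambda name: abs(delta - MASS_SHIFTS[name]))


# Precursor m/z 898.40 (+2) from the text; 1762.79 Da is an assumed
# unmodified mass for the peptide, used here only for illustration.
observed = neutral_mass(898.40, 2)
candidates = annotate_delta(observed, 1762.79)
```

With these inputs the neutral mass comes out close to the 1794.78 Da quoted in the Results, and the only tabulated shift within tolerance is the formylkynurenine value. A real blind-mode search also has to localize the modification on the sequence from the fragment ions, which this lookup does not attempt.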

Understanding the relationship of a PTM of a protein to its biological function is one of the challenges of cell biology. The SEMSA strategy described here will make this possible. This strategy will also make it possible to study the extent and frequency of different types of PTMs, which is still an open problem in functional proteomic research. SEMSA is a specific system that allows for the characterization of unrestrictive PTMs. SEMSA can be used to characterize PTM levels, to improve PTM peptide assignments, and to extract quantitative information from the maximum number of combined LC-MS/MS data sets stored in the database. It leads to results filtered from single or combined data sets using a broad range of criteria, including genetic annotation (gene, isoform, and allele), biochemical pathways, subcellular localization, and disease association. With these platforms implemented, the peptide library model will assist in addressing the LC-MS/MS data-validation/mining bottleneck in a concise and effective manner. We believe that the SEMSA strategy can be applied to study the dynamics of modifications in functional proteomics and to elucidate the relationships between PTMs and protein function.

To determine the degree of modification, we quantified the PTM peptides based on SEMSA analysis. The precursor ions of modified peptides were quantitatively analyzed in LC/MS spectra using selective ion monitoring mode (Figure 6). This makes it possible to find significant changes of PTMs in each spot on the 2D gel. Cysteic acid formation and the formation of serine and dehydroalanine from C were significantly increased in the more acidic spot 2 of control GAPDH and in the H2O2-treated spot (parts A and B of Figure 6), but oxidized forms, including sulfinic acid from C and formylkynurenine from W, did not show any discernible differences between spots 1 and 2 (Figure 6C). To investigate the biological functions of the identified PTMs, further mutant studies and 3D structural studies are in progress. The identified PTMs are well-validated because of their repeated detection in MS/MS analysis. By adopting the SEMSA strategy to avoid data redundancy in integrated data sets, we generated data more selectively. Simultaneously, the accuracy of assigning peptides to precursor ions in the SEMSA strategy was maximized by minimizing the mass tolerance between runs. The widths of the tolerance windows are typically determined by the mass accuracy of the MS and the peak widths in the chromatogram. To match and align nano-LC-ESI MS/MS data from different runs, we used a fractionated sample and calibrated with 100 fmol of GFP (Glu-fibrino peptide) as the standard peptide prior to each sample run. Specifically, the initial LC-MS run was chosen as the standard peptide chromatogram, and the subsequent separations were externally calibrated to this standard. Therefore, PTMs obtained from this strategy are quantitatively and qualitatively reliable and well-validated with MS/MS analysis.
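The role of the tolerance windows in matching precursor ions between replicate runs can be illustrated with a short sketch. It is not the instrument software used in this work: the `Precursor` type, the function names, the 50 ppm and 1 min tolerances, and the example m/z values are all hypothetical and were chosen only for demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float      # observed precursor m/z
    charge: int    # charge state
    rt_min: float  # retention time in minutes


def matches(a, b, ppm_tol, rt_tol_min):
    """True if two precursors agree in charge, m/z (ppm), and retention time."""
    same_mz = abs(a.mz - b.mz) <= b.mz * ppm_tol * 1e-6
    same_rt = abs(a.rt_min - b.rt_min) <= rt_tol_min
    return a.charge == b.charge and same_mz and same_rt


def select_for_msms(observed, exclusion, ppm_tol=50.0, rt_tol_min=1.0):
    """Keep only precursors that match no entry of the exclusion list."""
    return [p for p in observed
            if not any(matches(p, e, ppm_tol, rt_tol_min) for e in exclusion)]


def grow_exclusion(exclusion, unmodified, ppm_tol=50.0, rt_tol_min=1.0):
    """Append newly identified unmodified peptides, skipping duplicates."""
    grown = list(exclusion)
    for p in unmodified:
        if not any(matches(p, e, ppm_tol, rt_tol_min) for e in grown):
            grown.append(p)
    return grown


# Run 1: an unmodified peptide is identified and placed on the exclusion list.
exclusion = grow_exclusion([], [Precursor(500.2500, 2, 20.0)])

# Run 2: the same peptide reappears with a small drift and is skipped;
# a second, previously unsequenced precursor is selected for MS/MS.
run2 = [Precursor(500.2510, 2, 20.3), Precursor(508.2475, 2, 21.0)]
selected = select_for_msms(run2, exclusion)
print(selected)
```

Narrow tolerances reduce the chance that a modified peptide with a nearby m/z is excluded by mistake, which is why the alignment of runs against a common standard matters for this kind of matching.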

Acknowledgment. This work was supported by KOSEF through the Center for Cell Signaling and Drug Discovery Research (CCS and DDR, R15-2006-002) at Ewha Womans University and KOSEF Grants FPR05A2-480 and FPR05A2-340. J. Seo, Y. M. Kim, and N. Hwang were supported by the Brain Korea 21 project. We thank Dr. Sri Ram for corrections in the manuscript and Y. H. Seo for the technical support of nano-LC-ESI-q-TOF tandem MS.

Supporting Information Available: List of total PTMs identified by nano-LC-ESI MS/MS with PTM-specific SEMSA strategy (Supplementary Table 1), distribution of peptides identified by PTM-specific SEMSA strategy (Supplementary Figure 1), and suggested PTM pathways: (A) oxidation of tryptophan, (B) transition between aspartic acid and asparagine through succinimide intermediate, and (C) various oxidation states of the cysteine residue (Supplementary Figure 2). This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Shu, H.; Chen, S.; Bi, Q.; Mumby, M.; Brekken, D. L. Identification of phosphoproteins and their phosphorylation sites in the WEHI231 B lymphoma cell line. Mol. Cell. Proteomics 2004, 3, 279–286.
(2) Cantin, G. T.; Yates, J. R., III. Strategies for shotgun identification of post-translational modifications by mass spectrometry. J. Chromatogr., A 2004, 1053, 7–14.
(3) Yates, J. R., III; Eng, J. K.; McCormack, A. L. Mining genomes: Correlating tandem mass spectra of modified and unmodified peptides to sequences in nucleotide databases. Anal. Chem. 1995, 67, 3202–3210.
(4) Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20, 3551–3567.
(5) Clauser, K. R.; Baker, P.; Burlingame, A. L. Role of accurate mass measurement (±10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal. Chem. 1999, 71, 2871–2882.
(6) Mann, M.; Jensen, O. N. Proteomic analysis of post-translational modifications. Nat. Biotechnol. 2003, 21, 255–261.
(7) Jensen, O. N. Modification-specific proteomics: Characterization of post-translational modifications by mass spectrometry. Curr. Opin. Chem. Biol. 2004, 8, 33–41.
(8) Lipton, M. S.; Pasa-Tolic, L.; Anderson, G. A.; Anderson, D. J.; Auberry, D. L.; Battista, J. R.; Daly, M. J.; Fredrickson, J.; Hixson, K. K.; Kostandarithes, H.; Masselon, C.; Markillie, L. M.; Moore, R. J.; Romine, M. F.; Shen, Y.; Stritmatter, E.; Tolic, N.; Udseth, H. R.; Venkateswaran, A.; Wong, K. K.; Zhao, R.; Smith, R. D. Global analysis of the Deinococcus radiodurans proteome by using accurate mass tags. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 11049–11054.
(9) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Yates, J. R., III. A method for the comprehensive proteomic analysis of membrane proteins. Nat. Biotechnol. 2003, 21, 532–538.
(10) Kim, H. J.; Song, E. J.; Lee, K. J. Proteomic analysis of protein phosphorylations in heat shock response and thermotolerance. J. Biol. Chem. 2002, 277, 23193–23207.
(11) Tsur, D.; Tanner, S.; Zandi, E.; Bafna, V.; Pevzner, P. A. Identification of post-translational modifications by blind search of mass spectra. Nat. Biotechnol. 2005, 23, 1562–1567.
(12) Kim, S.; Na, S.; Sim, J. W.; Park, H.; Jeong, J.; Kim, H.; Seo, Y.; Seo, J.; Lee, K. J.; Paek, E. MODi: A powerful and convenient web server for identifying multiple post-translational peptide modifications from tandem mass spectra. Nucleic Acids Res. 2006, 34, W258–W263.
(13) Steen, H.; Kuster, B.; Fernandez, M.; Pandey, A.; Mann, M. Detection of tyrosine phosphorylated peptides by precursor ion scanning quadrupole TOF mass spectrometry in positive ion mode. Anal. Chem. 2001, 73, 1440–1448.
(14) Oda, Y.; Nagasu, T.; Chait, B. T. Enrichment analysis of phosphorylated proteins as a tool for probing the phosphoproteome. Nat. Biotechnol. 2001, 19, 379–382.
(15) Moser, K.; White, F. M. Phosphoproteomic analysis of rat liver by high capacity IMAC and LC-MS/MS. J. Proteome Res. 2006, 5, 98–104.
(16) Pinkse, M. W.; Uitto, P. M.; Hilhorst, M. J.; Ooms, B.; Heck, A. J. Selective isolation at the femtomole level of phosphopeptides from proteolytic digests using 2D-nano-LC-ESI-MS/MS and titanium oxide precolumns. Anal. Chem. 2004, 76, 3935–3943.
(17) Seo, J.; Lee, K. J. Post-translational modifications and their biological functions: Proteomic analysis and systematic approaches. J. Biochem. Mol. Biol. 2004, 37, 35–44.
(18) Wu, S. L.; Kim, J.; Hancock, W. S.; Karger, B. Extended range proteomic analysis (ERPA): A new and sensitive LC-MS platform for high sequence coverage of complex proteins with extensive post-translational modifications—Comprehensive analysis of β-casein and epidermal growth factor receptor (EGFR). J. Proteome Res. 2005, 4, 1155–1170.
(19) Chen, H. S.; Rejtar, T.; Andreev, V.; Moskovets, E.; Karger, B. L. Enhanced characterization of complex proteomic samples using LC-MALDI MS/MS: Exclusion of redundant peptides from MS/MS analysis in replicate runs. Anal. Chem. 2005, 77, 7816–7825.
(20) Kim, Y. M.; Song, E. J.; Seo, J.; Kim, H. J.; Lee, K. J. Proteomic analysis of tyrosine phosphorylations in vascular endothelial growth factor- and reactive oxygen species-mediated signaling pathway. J. Proteome Res. 2007, 6, 593–601.
(21) Kim, Y. M.; Seo, J.; Kim, Y. H.; Jeong, J.; Joo, H. J.; Lee, D. H.; Lee, K. J. Systemic analysis of tyrosine phosphorylated proteins in angiopoietin-1 induced signaling pathway of endothelial cells. J. Proteome Res. 2007, 6, 3278–3290.
(22) Moller, I. M.; Kristensen, B. K. Protein oxidation in plant mitochondria detected as oxidized tryptophan. Free Radical Biol. Med. 2006, 40, 430–435.
(23) Berlett, B. S.; Stadtman, E. R. Protein oxidation in aging, disease, and oxidative stress. J. Biol. Chem. 1997, 272, 20313–20316.
(24) Nakanishi, T.; Sato, T.; Sakoda, S.; Yoshioka, M.; Shimizu, A. Modification of cysteine residue in transthyretin and a synthetic peptide: Analyses by electrospray ionization mass spectrometry. Biochim. Biophys. Acta 2004, 1698, 45–53.
(25) Schuppe-Koistinen, I.; Moldeus, P.; Bergman, T.; Cotgreave, I. A. S-Thiolation of human endothelial cell glyceraldehyde-3-phosphate dehydrogenase after hydrogen peroxide treatment. Eur. J. Biochem. 1994, 221, 1033–1037.
(26) Lopez, B. E.; Wink, D. A.; Fukuto, J. M. The inhibition of glyceraldehyde-3-phosphate dehydrogenase by nitroxyl (HNO). Arch. Biochem. Biophys. 2007, 465, 430–436.
(27) Nakajima, H.; Amano, W.; Fujita, A.; Fukuhara, A.; Azuma, Y. T.; Hata, F.; Inui, T.; Takeuchi, T. The active site cysteine of the proapoptotic protein glyceraldehyde-3-phosphate dehydrogenase is essential in oxidative stress-induced aggregation and cell death. J. Biol. Chem. 2007, 282, 26562–26574.

Journal of Proteome Research • Vol. 7, No. 2, 2008

PR700657Y