Enhancement of the Efficiency of Phosphoproteomic Identification by Removing Phosphates after Phosphopeptide Enrichment Yasushi Ishihama,†,§,# Fan-Yan Wei,†,‡,§ Ken Aoshima,† Toshitaka Sato,† Junro Kuromitsu,† and Yoshiya Oda*,† Laboratory of Seeds Finding Technology, Eisai Company, Ltd. Tokodai 5-1-3, Tsukuba, Ibaraki 300-2635, Japan, and Department of Physiology, Okayama University, Graduate School of Medicine and Dentistry, Okayama 700-8558, Japan Received September 6, 2006
Immobilized metal affinity chromatography (IMAC) and titanium oxide (TiO2) chromatography are simple, widely used, and cost-effective methods to enrich phosphopeptides, but the sample loading buffer composition, desalting procedure, and control of loading amount are critical to avoid nonspecific interactions and to achieve efficient phosphopeptide enrichment. Although the combination of MS3 analysis and high-resolution mass spectrometry (MS) is helpful to identify phosphopeptides, the quality of many MS/MS spectra having a neutral loss peak of phosphate is still too poor to allow sequence identification, and this results in many false-negative as well as false-positive identifications. Here, we present a novel strategy, which is based on the use of alkaline phosphatase to remove phosphates and analysis of phospho/dephosphopeptide retention times to increase the reliability of identification. The use of phospho/dephosphopeptide retention time ratios allows the identification of phosphopeptides with high confidence with the aid of a focused database of dephosphopeptides. This approach was very effective for identifying multiple phosphorylations in tryptic peptides. A ‘true’ phosphorylation data set should contain about 90% phospho-Ser and a few percent phospho-Tyr, and this ratio can be used as a quality criterion for evaluation of data sets. By applying this efficient approach, we were able to identify more than one thousand phosphopeptides.
Keywords: phosphoproteomics • phosphorylation • mass spectrometry • immobilized metal affinity chromatography (IMAC) • titanium oxide (TiO2)
* To whom correspondence should be addressed. E-mail, [email protected]; tel, +81-29-847-7098; fax, +81-29-847-7614.
† Eisai Company, Ltd.
‡ Okayama University.
§ These authors contributed equally to this work. Their names are listed in alphabetical order.
# Current address: Institute for Advanced Biosciences, Keio University, 403-1 Nipponkoku, Daihoji, Tsuruoka, Yamagata 997-0017, Japan.
10.1021/pr060452w CCC: $37.00 © 2007 American Chemical Society
Journal of Proteome Research 2007, 6, 1139-1144
Published on Web 01/19/2007
Introduction
Protein phosphorylation is thought to play a central role in the control of cell division and signal transduction pathways in both normal and diseased cells.1 Many proteins are phosphorylated, not just at one site, but at multiple sites, and each modification seems to have a different regulatory function. Therefore, detection of protein phosphorylation and identification of phosphorylation sites are extremely important for understanding protein regulation during signal transduction. In eukaryotes, the residues that undergo protein phosphorylation are serine (Ser) and threonine (Thr), and to a lesser extent tyrosine (Tyr), but relatively little is known about the majority of protein kinases in the human proteome. Even in the cases of kinases with known consensus sequences, the in vivo protein substrates often remain unidentified.2 Traditionally, phosphorylation is detected by using phosphorylation-site-specific antibodies3 and/or by incorporating radioactive [32P]orthophosphate into proteins.4 These procedures are labor-intensive, prone to sample losses, and unsuitable for comprehensive analysis, and also often fail to identify the exact site of phosphorylation. However, refinements of several affinity-based strategies, such as immunoaffinity chromatography,5 immobilized metal affinity chromatography (IMAC),6 and strong cation exchange (SCX) chromatography,7 and chemical modifications such as β-elimination,8 coupled with state-of-the-art mass spectrometric techniques, have allowed some groups to identify several hundreds of phosphorylation sites.9 Nevertheless, it is still difficult to achieve high-throughput phosphopeptide identification on a routine basis, because efficient phosphopeptide enrichment procedures are not well-established, and large numbers of tandem mass spectrometry (MS/MS) spectra of phosphopeptides need to be computationally processed to identify phosphopeptides. The latter process represents a major challenge: current database search tools identify large numbers of false-positive/false-negative peptide assignments.10-12 In general, reliance on a single peptide for protein identification tends to yield false-positives, but in phosphoproteomics, identification of one phosphopeptide may be quite critical. One of the main reasons for the large numbers of false-positives/false-negatives is the poor quality of the MS/MS spectra of phospho-Ser/Thr peptides in ion trap-type MS, because the ambiguous identification of phosphorylated amino acid residues results in an increase in the number of random matches. Here, we present an integrated strategy for evaluating large-scale data sets involving hundreds of phosphopeptide candidates. We also describe the application of this method to characterize protein phosphorylation in neurons with a high degree of confidence.
Materials and Methods
Neuro2a Cell Samples. Neuro2a cells were grown in RPMI1640 medium, supplemented with 10% fetal bovine serum plus antibiotics. For each experiment, one dish (5 × 10^7 cells/15-cm diameter dish) of confluent neuro2a cells was used. Cells were homogenized in Tris buffer [including 100 mM Tris, pH 8, protease inhibitors (Complete, Roche, Indianapolis, IN), and phosphatase inhibitor cocktails 1 and 2 (Sigma-Aldrich Co., St. Louis, MO)], with a Teflon Potter-type homogenizer. Homogenates were cleared by centrifugation at 1500g for 10 min. Urea was dissolved in the supernatant to 8 M concentration, and n-octyl-β-D-glucopyranoside was added at 0.4% (w/v). Proteins were reduced with 20 mM dithiothreitol and then alkylated with 50 mM iodoacetamide. This sample solution was desalted on PD-10 columns (GE Biosciences, Uppsala, Sweden), which were preconditioned with 8 M urea in 0.1 M Tris buffer, pH 8, containing phosphatase inhibitor cocktails 1 and 2. Sample solutions were treated with 5% (w/w) Lys-C (Wako Pure Chemicals, Osaka, Japan) to digest proteins, diluted to 2 M urea in 0.1 M Tris buffer, pH 8 (final concentration), then further treated with modified sequence-grade trypsin (Promega, Madison, WI).
Brain Samples. All mice were treated ethically according to the rules of the Eisai Co., Ltd. Animal Care and Use Committee. Mouse forebrains were prepared from adult male mice (C57BL/6Cr genomic background; 8 weeks of age). The procedures for whole protein extraction were the same as those described above for neuro2a cells. Protein solutions were precipitated with perchloric acid and redissolved in SDS-buffer, and 4 mg of proteins was loaded into each lane on a gel. After SDS-PAGE, each lane was cut into 20 equal pieces, and in-gel digestion was carried out.13 5-Cyclohexyl-1-pentyl-β-D-maltoside (CYMAL-5) was obtained from Anatrace. Negative gel stain MS kit was obtained from Wako. Gels with a thickness of 1.5 mm (Tris-HCl, 5-20 %T) were obtained from DRC.
Enrichment of Phosphopeptides. In-solution-digested tryptic peptides (neuro2a cells) were desalted using a 3M Empore HT C18 column (6 × 10 mm), and eluates (6 mg protein/2 mL of 50% acetonitrile (ACN) containing 0.3% trifluoroacetic acid (TFA)) were directly loaded onto 0.5 mL of Profinity IMAC resin (Bio-Rad, Hercules, CA) charged with iron(III), to enrich phosphopeptides. The IMAC resin was preconditioned with 50% ACN in 0.3% TFA. After sample loading, the IMAC resin was rinsed with 0.5 mL of 50% ACN in 0.3% TFA, and phosphopeptides were eluted with 1 mL of 1% ammonium buffer. The phosphopeptide fraction was evaporated by vacuum centrifugation (SpeedVac, Thermo Electron). Samples were resuspended in 0.1 M ammonium acetate solution. Polymer-based reversed-phase (PRP) chromatography was performed to fractionate phosphopeptides offline using StageTips packed with 3M Empore disks SDB-XC (3M). Samples were loaded onto the PRP-StageTip, and the flow-through fraction was collected. Peptides were eluted in a stepwise manner (flow-through, and 7%, 15%, 25%, and 50% acetonitrile in 0.1 M ammonium acetate solution). The five fractions were collected and evaporated, and the residues were each resuspended in 10 µL of 5% ACN in 0.1% TFA solution and subjected to LC-MS/MS long-gradient analysis. For in-gel-digested peptides (forebrains), IMAC/C18 tips purification was performed using 200 µL Eppendorf tips packed in-house with 20 µL of Profinity Fe-IMAC resin and a 3M Empore C18 disk.14 The IMAC/C18 tips were activated with 200 µL of 50% ACN in 0.3% TFA, and 200-400 µL of in-gel-digested sample in 50% ACN in 0.3% TFA was loaded onto each IMAC/C18 tip. The IMAC/C18 tips were each rinsed with 50 µL of 5% ACN in 0.3% TFA, and phosphopeptides were moved to the C18 phase with 200 µL of 1% phosphoric acid. The column was washed with 50 µL of 5% ACN in 0.1% TFA, and phosphopeptides were eluted with 10 µL of 50% ACN in 0.1% TFA. The phosphopeptide fraction was evaporated by vacuum centrifugation. Samples were resuspended with 10 µL of 5% ACN in 0.1% TFA solution and subjected to LC-MS/MS short-gradient analysis.
LC-MS/MS Analysis. Sample volumes of 3 µL were loaded onto the LC/MS system. The peptide fractions were separated on a 100 µm × 15 cm in-house-packed stone-arch C-18 column15 with a 4-32% ACN gradient for 25 min (short gradient) or for 90 min (long gradient) in 0.2% acetic acid at a flow rate of 500 nL/min, using an Ultimate 3000 nanopump from Dionex LC Packings (San Francisco, CA) and an HTCPAL autosampler (CTC Analytics AG, Zwingen, Switzerland) equipped with Valco C2 valves with 150 µm ports. The liquid chromatography (LC) eluate was coupled with an in-house-built nanospray source attached to a quadrupole ion trap mass spectrometer (model LTQ-FT) from Thermo Electron Corporation (San Jose, CA). Peptides were analyzed in positive ion mode, and automatic data-dependent acquisition was performed, consisting of a full scan (m/z 350-1500) and then SIM mode to improve the mass accuracy, and a subsequent MS2.
When a neutral loss of 49 or 32.7 m/z units (H3PO4 for the +2 and +3 charged ions, respectively) was observed among the three most abundant MS2 fragments, an MS3 scan was automatically collected on the corresponding neutral loss fragments of the MS2 scan events. A dynamic exclusion window was applied which prevented the same m/z from being selected for 1 min after its acquisition.
Interpretation of Tandem Mass Spectrometry Spectra. Data were analyzed with Mass Navigator software (MKI, Tokyo, Japan) to generate peak lists based on combined MS/MS spectra and corresponding MS3 spectra, and searches of the NCBI nonredundant (NCBI nr) mouse database as of March, 2006 were conducted using Mascot (version 2.1, Matrix Science). Carbamidomethylation was treated as a fixed modification of Cys residues, whereas oxidation of Met residues was allowed as a variable modification. Peptide tolerances in MS and MS/MS modes for LTQ-FT were 20 ppm and 0.8 Da, respectively. One missed cleavage of trypsin was allowed. In the MS/MS spectral searches, phosphorylations of Ser, Thr, and Tyr were considered as variable modifications. MS/MS spectra of phospho-Ser/Thr peptides each showed a peak corresponding to the neutral loss of 98 Da from the precursor ion. Only peptides with clear mass spectra with a Mascot score of >95% reliability were considered in this study. Although phosphorylation sites determined by Mascot are not correct in some cases, and there are always ambiguous sites, Mascot results were accepted without any other evaluation.
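The neutral-loss trigger values quoted above (49 and 32.7 for the +2 and +3 ions) follow from dividing the mass of H3PO4 by the precursor charge, and the 20 ppm precursor tolerance is a relative mass error. The following Python sketch illustrates that arithmetic only. The function names and the monoisotopic mass constant are assumptions introduced here for illustration; they are not taken from the instrument method or from Mascot.

```python
# Monoisotopic mass of H3PO4 in Da (assumed constant, computed from
# H = 1.007825, P = 30.973762, O = 15.994915).
H3PO4_MASS = 97.9769


def neutral_loss_mz(charge: int, loss_mass: float = H3PO4_MASS) -> float:
    """Return the m/z decrease seen when an ion of the given charge loses a neutral."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return loss_mass / charge


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 20.0) -> bool:
    """True if the observed m/z lies within tol_ppm of the theoretical value."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    for z in (2, 3):
        print(f"charge +{z}: neutral loss of {neutral_loss_mz(z):.1f} m/z units")
```

With the default tolerance, an observed value of 500.005 against a theoretical 500.000 (10 ppm) would pass, whereas 500.020 (40 ppm) would not.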
Results and Discussion
Enrichment of phosphopeptides from peptide mixtures with IMAC is widely used, because the procedures are simple, multiple parallel sample processing can be employed, and the cost is reasonable. However, many nonphosphopeptides bind to IMAC, and this hinders the detection of phosphopeptides. Ficarro et al. converted the carboxylic acid groups of all peptides to the corresponding peptide methyl esters to reduce nonspecific binding between IMAC and acidic residues of peptides, and the reacted samples were dissolved in organic solvent at high concentration.16 Kokubu et al. used a strong acid, such as TFA, to discriminate phosphates from carboxyl groups of peptides and a high concentration of ACN in both loading and washing solvents to remove hydrophobic nonphosphopeptides.14 Recently, titanium dioxide (TiO2) has been used as an alternative to IMAC for the selective enrichment of phosphopeptides.17 Larsen et al. significantly enhanced the binding selectivity of TiO2 for phosphopeptides by using 50-80% ACN and 20 mg/mL 2,5-dihydroxybenzoic acid in 0.1% TFA as a loading solvent.18 Thus, we considered that the loading solvent for IMAC/TiO2 should contain TFA and a high concentration of ACN, so that acidic residues in the peptides are neutral, and hydrophobic binding between peptides and columns is minimized. In addition, crude biological samples contain large amounts of phosphorylated substances, such as nucleotides and phospholipids, which cause IMAC to be saturated. Therefore, we removed small molecules from the protein mixture with a gel filtration column before protein digestion, and then desalted the samples with a C18 column to exclude hydrophilic materials, as well as highly hydrophobic substances. The eluate from the C18 column, which was eluted with 50% ACN containing 0.3% TFA, was directly subjected to IMAC, as shown in Figure 1.
Figure 1. Flowchart of enrichment strategy for phosphopeptides.
In the case of in-gel-digested samples, small molecules had already been removed by SDS-PAGE, so the extracted tryptic peptides (in 50% ACN containing 0.1% TFA) were directly loaded onto IMAC tips. Although estimation of the amount of phosphoproteins in a crude biological sample is not easy, we examined the optimum loading amount onto IMAC for the enrichment of phosphopeptides in neuro2a cells and found that the greater the sample loading onto IMAC, the larger the number of phosphopeptides identified. At least 10 mg of total proteins is required for 1 mL of IMAC beads (in this case, Profinity-Fe beads) to achieve highly efficient phosphopeptide enrichment. We did not observe any negative effects due to maximal loading or overloading of samples onto IMAC. Under such conditions, 581 phosphopeptides in neuro2a cells were identified by our protocol after in-solution digestion, while 1951 nonphosphopeptides were identified from the same enriched samples, as shown in Table 1 and supplementary Table 1 of Supporting Information. For in-gel-digested brain samples, 241 phosphopeptides and 386 nonphosphopeptides were identified using IMAC/C18 tips (see Table 1 and supplementary Table 2 of Supporting Information).
The dominant neutral loss peaks observed in MS2 spectra derived from phospho-Ser/Thr residues gave poor-quality spectra, which would be expected to produce many false-positive/false-negative results, and MS3 spectral analysis was expected to be more informative for identification of the correct peptide and the phosphorylation site in a database search.19 Therefore, we combined MS2 and MS3 spectra to acquire highly accurate precursor MS information for the database search. Although MS3 data were often necessary to obtain unambiguous identification, peak intensities on MS3 are sometimes too weak to afford high-quality spectra. Indeed, we found that there were still many unassigned spectra having dominant neutral loss peaks, considered as a signature of phosphopeptides.
Phosphopeptides are difficult to detect by MS due to low ionization efficiency and suppression effects, as well as insufficient fragmentation in the MS/MS mode, compared with their nonphosphopeptide counterparts. To address this problem, alkaline phosphatase has been used to remove covalently bound phosphate groups from peptides. Liao et al. and Zhang et al. applied the combination of phosphatase treatment and MS-based identification:20,21 their approach was to analyze phosphopeptides by MALDI/TOF MS before and after digestion with a phosphatase to discriminate phosphopeptides from other peptides. Stensballe et al. reported that phosphatase treatment after IMAC enrichment combined with MALDI-MS analysis was effective to identify phosphopeptide candidates.22 Although dephosphorylation increases the sensitivity of detection of phosphopeptides on MS, direct information, such as the location of phosphates, is lost. Torres et al. developed a hypothesis-driven MS method, in which phosphatase reaction is coupled with MALDI MS/MS and IMAC enrichment.23 This approach is powerful for phosphoprotein analysis, but is not suitable for comprehensive analysis.
We first utilized a phosphatase treatment strategy to minimize ambiguity of peptide assignment. If a mass down-shift of at least 80 Da is observed upon phosphatase treatment, the peptide is most probably a phosphopeptide. The remaining samples after LTQ-FT analysis were treated with calf intestine phosphatase (CIP, New England Biolabs, Inc.), then the dephosphopeptides were analyzed by LTQ-FT (Figure 2). This allowed us to identify a total of 239 paired phospho/dephosphopeptides from neuro2a cells, and 77 pairs from brain samples, as shown in Table 1. These paired
Table 1. Summary of Phosphopeptides by Optimized IMAC Protocol^a

| source | fraction | separation | no. of phosphopeptides (no. of proteins) | no. of nonphosphopeptides | pS % | pT % | pY % |
|---|---|---|---|---|---|---|---|
| neuro2a cells | soluble | IMAC_SDB-XC_C18_FT-MS^b | 581^d (311) | 1951^d | 89.3 | 9.0 | 1.7 |
| neuro2a cells | soluble | CIP treatment | 239^e (177) | 422^f | 90.1 | 7.3 | 2.6 |
| mouse forebrain | soluble | SDS-PAGE_IMAC_C18_FT-MS^c | 241^d (164) | 386^d | 83.1 | 14.3 | 2.6 |
| mouse forebrain | soluble | CIP treatment | 77^e (62) | 82^f | 90.9 | 6.4 | 2.7 |

a Abbreviations: pS, phosphoserine; pT, phosphothreonine; pY, phosphotyrosine. b Four times repeated analysis. c One analysis. d Criteria for identification were mass accuracy better than 3 ppm and Mascot score more than 30. e Number of paired peptides, phosphopeptides, and dephosphopeptides. f Number of paired peptides, nonphosphopeptides, and de(non)phosphopeptides.
Table 2. Comparison of Retention Times before and after CIP Treatment^a

| sample | comparison | no. | ratio | Stdev | average Mascot score (De-P) | average Mascot score (counterpart) |
|---|---|---|---|---|---|---|
| neuro2a | De-P/Non-P | 422 | 1.001 | 0.036 | 57.0 | 60.5 (Non-P) |
| neuro2a | De-P/One-P | 137 | 0.923 | 0.091 | 67.0 | 64.9 (One-P) |
| neuro2a | De-P/Two-P | 101 | 0.869 | 0.078 | 61.0 | 49.4 (Two-P) |
| neuro2a | De-P/Three-P | 1 | 0.926 | | 52.0 | 36.0 (Three-P) |
| forebrain | De-P/Non-P | 82 | 1.006 | 0.015 | 61.6 | 69.8 (Non-P) |
| forebrain | De-P/One-P | 47 | 0.944 | 0.094 | 69.6 | 58.5 (One-P) |
| forebrain | De-P/Two-P | 27 | 0.927 | 0.038 | 67.6 | 45.1 (Two-P) |
| forebrain | De-P/Three-P | 3 | 0.908 | 0.003 | 66.3 | 45.0 (Three-P) |

a Abbreviations: De-P, dephosphopeptide; Non-P, non-phosphopeptide; One-P, one phosphate; Two-P, two phosphates; Three-P, three or more phosphates; Stdev, standard deviation. ‘Ratio’ means the ratio of retention time of (dephosphopeptide generated by CIP)/(corresponding phosphopeptide). ‘Retention time’ means the trigger time point for data-dependent MS/MS; this did not exactly match the real peak top of the mass chromatogram, but the time difference was mostly less than 10 s under these conditions.
Figure 2. Overview of data analysis design.
peptides were confirmed by CIP treatment, and identified with high mass accuracy (better than 3 ppm); they showed high Mascot scores (more than 30, p < 0.05). This data set represents a very high-confidence phosphopeptide list. When the ratios of phospho-Ser/Thr/Tyr were compared in this high-confidence data set, phospho-Ser accounted for nearly 90% and phospho-Tyr for roughly a few percent, as shown in Table 1. Hunter et al. found relative abundances of 0.05%, 10%, and 90% for phospho-Tyr, phospho-Thr, and phospho-Ser, respectively, in normally growing cells,24 and then during this review process, Mann et al. reported that the distribution of phospho-Tyr, phospho-Thr, and phospho-Ser sites was 1.8%, 11.8%, and 86.4%, respectively, based on more than 2000 phosphoproteins after stimulating HeLa cells with epidermal growth factor (EGF).25 However, rejected data sets in this study (Mascot score less than 29) showed 4.4% (phospho-Tyr), 19.7% (phospho-Thr), and 76.0% (phospho-Ser). The distribution of phosphorylation among Ser, Thr, and Tyr can therefore serve as a quality criterion for phosphopeptide lists; however, these lists may still contain some false-positives.
Retention time on HPLC reflects the physicochemical properties of analytes,26 so the retention time ratios between dephosphopeptides and the counterpart phosphopeptides were calculated as shown in Table 2 (and supplementary Tables 3 and 4 in Supporting Information). Phosphopeptides are expected to show greater hydrophilicity than dephosphopeptides, but recently, Steen et al. reported that synthetic phosphopeptides were eluted from a C18 column at almost the same position as, or slightly later than, the corresponding nonphosphopeptides.27 Indeed, as seen in Table 2, phosphopeptides seemed to show slightly increased retention, so false-positives may be excluded from the list on the basis of distinctive retention times. Retention on a reversed-phase column depends on not only the hydrophobicity of analytes, but also other factors; for example, silica-gel usually contains trace levels of metals, which retain phosphopeptides to some extent. Also, phosphopeptides may differ from the corresponding dephosphopeptides in molecular shape, and they may present a larger hydrophobic surface to the C18 phase. Table 2 also shows the average Mascot scores before and after CIP treatment. Phosphate groups appear to hinder fragmentation in the MS/MS mode, because the average Mascot scores before CIP treatment were lower than those after CIP treatment. Thus, CIP treatment makes it easier to obtain high-confidence sequence information.
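The pairing of a phosphopeptide with its CIP-generated dephosphopeptide, and the retention time ratio reported in Table 2, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software: the function names, the 0.01 Da matching tolerance, and the cap of five phosphate groups are hypothetical, and the monoisotopic mass of HPO3 (79.96633 Da) is used as the mass added per phosphate group.

```python
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple

# Monoisotopic mass added per phosphate group (HPO3), in Da.
HPO3_MASS = 79.96633


def count_phosphates(phospho_mass: float, dephospho_mass: float,
                     tol_da: float = 0.01, max_groups: int = 5) -> Optional[int]:
    """Return the number of phosphate groups that explains the mass difference
    between a peptide before and after phosphatase treatment, or None if the
    difference is not a whole multiple of HPO3 within tol_da (assumed tolerance)."""
    delta = phospho_mass - dephospho_mass
    n = round(delta / HPO3_MASS)
    if 1 <= n <= max_groups and abs(delta - n * HPO3_MASS) <= tol_da:
        return n
    return None


def retention_ratio(dephospho_rt: float, phospho_rt: float) -> float:
    """Ratio as defined in Table 2: (dephosphopeptide RT) / (phosphopeptide RT)."""
    return dephospho_rt / phospho_rt


def summarize(ratios: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of a set of ratios (needs >= 2 values)."""
    return mean(ratios), stdev(ratios)
```

A ratio below 1 means that the dephosphopeptide triggered MS/MS earlier than its phosphorylated counterpart, which is the behavior summarized in Table 2.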
Further, comparison of retention times before and after treatment is also helpful to remove false-positives. Next, we built a focused database using CIP-treated samples, and used it in conjunction with the MS/MS data obtained from LTQ-FT after IMAC enrichment (see Figure 1). We found that even poor MS/MS spectra could be identified; the use of this small, focused database together with high-accuracy precursor ions resulted in a decrease of the identification threshold from a Mascot score of 30 to a score of 15 at p < 0.05. To maintain high confidence of identification, we utilized the retention time information in Table 2, and candidates were accepted if retention time ratios were satisfactory (within two standard deviations) and the mass accuracy was high (better than 3 ppm). As a result, 394 MS/MS spectra were assigned to phosphopeptides (CIP results, Figure 3 and supplementary Table 5 in Supporting Information). Among them, 88 phosphopeptides were new entries. This seemed to be an effective approach to decrease false-negatives, but the positions of phosphorylated amino acid residues could not be determined in many cases because of poor-quality MS/MS spectra. Ideally, the post-CIP results should provide 100% coverage of before-CIP results, but in fact, there was only 52.7% overlap (306/581) between the two (Figure 3), presumably because of sample losses during CIP processing (including desalting) and incomplete removal of phosphate groups by the CIP reaction (at least 26% of phosphopeptides still remained after the CIP reaction). The average Mascot scores of nonphosphopeptides were decreased after CIP treatment, presumably because recoveries were inadequate, as shown in Table 2. We evaluated the validity of the identification of the 88 phosphopeptides newly identified with the aid of the CIP database by repeated analysis (30 times) of neuro2a samples (Table 3). We identified 413 additional phosphopeptides from the NCBInr database, as shown in supplementary Table 6 of Supporting Information. Among them, 26 phosphopeptides were also identified by CIP database search combined with retention time information.
Figure 3. Venn diagram showing number of identified phosphopeptides from NCBInr database or CIP database.
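The acceptance rule described above (retention time ratio within two standard deviations, mass accuracy better than 3 ppm, Mascot score of at least 15 against the focused database) can be written as a small filter. The sketch below is a hedged illustration: the `Candidate` class and `accept` function are hypothetical names, and reading "within two standard deviations" as two standard deviations around the mean neuro2a ratios of Table 2 is an assumption. The three-phosphate class is left out because Table 2 lists only one neuro2a pair for it and therefore no standard deviation.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (mean ratio, standard deviation) for neuro2a pairs, taken from Table 2,
# keyed by the number of phosphate groups.
NEURO2A_RATIO_STATS: Dict[int, Tuple[float, float]] = {
    1: (0.923, 0.091),
    2: (0.869, 0.078),
}


@dataclass
class Candidate:
    """One phosphopeptide candidate from the focused-database search (hypothetical record)."""
    n_phosphates: int
    dephospho_rt: float   # retention time of the CIP-generated dephosphopeptide
    phospho_rt: float     # retention time of the candidate phosphopeptide
    ppm_error: float      # signed precursor mass error in ppm
    mascot_score: float


def accept(c: Candidate,
           stats: Dict[int, Tuple[float, float]] = NEURO2A_RATIO_STATS,
           max_ppm: float = 3.0, min_score: float = 15.0,
           n_sd: float = 2.0) -> bool:
    """Apply the ratio, mass accuracy, and score criteria to one candidate."""
    if c.n_phosphates not in stats:
        return False  # no reference statistics for this class
    mean_ratio, sd = stats[c.n_phosphates]
    ratio = c.dephospho_rt / c.phospho_rt
    return (abs(ratio - mean_ratio) <= n_sd * sd
            and abs(c.ppm_error) < max_ppm
            and c.mascot_score >= min_score)
```

Keeping the reference statistics in a dictionary makes it straightforward to substitute the forebrain values from Table 2 or values measured on another column.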
Although only 29.5% (26/88) of the above identifications were confirmed, we believe that the CIP database search results probably represent true identification, because these candidates satisfy many criteria: highly accurate precursor ions, appropriate retention time behavior, partial MS/MS fragment ions (albeit of poor quality), and sequence confirmation of dephosphopeptides after CIP treatment. CIP treatment is time-consuming, but the combination of high-resolution mass spectrometric analysis and CIP treatment coupled with highly specific enrichment of phosphopeptides effectively increased the number of phosphopeptides identified with high confidence. With regard to the number of phosphate moieties per peptide, 70% of phosphopeptides were singly phosphorylated based on searches of databases such as NCBInr, as shown in Table 3. However, the ratio of multiple phosphorylations based on the results of CIP database search was higher. The reason may be that multiple phosphorylation hinders MS detection, and it is difficult to assign fragment ions in MS/MS spectra, so it is harder to identify the sequence of a multiply phosphorylated peptide as compared with a singly phosphorylated peptide; consequently, information derived from dephosphopeptides by CIP processing can be useful to identify phosphopeptides.
In this study, extracts from neuro2a cells and forebrain were independently analyzed by LC-MS after IMAC enrichment, and we attempted to discover kinase motifs with a widely used protein motif discovery program, eMOTIF, through an online server.28 As shown in Figure 4, the MAP kinase motif was found in both samples, the 14-3-3 protein motif and CaM kinase II motif were seen only in brain, and casein kinase motifs were found only in neuro2a cells. Although casein kinase activity is relatively high in all regions of the mouse brain, the reason for the discrepancy in these data was not clear. Substrates of casein kinase in our brain samples might be relatively less abundant than other phosphopeptides, or there might be some artifacts introduced during dissection of the brains. Expression levels of one thousand abundant proteins were quite similar in mouse brain and neuro2a cells,29 but the kinase activities seem to be quite different, so phosphorylation studies are clearly important for protein functional research.
Figure 4. Phosphorylation motifs extracted by eMotif algorithm.
The validation techniques used here should prove useful in improving the identification of phosphopeptides, possibly in combination with more restrictive scoring criteria.
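One of the validation criteria discussed above, the distribution of phosphorylation among Ser, Thr, and Tyr, is simple to compute for any list of assigned sites. The Python sketch below shows one way to do so. The function names are hypothetical, and the cut-off values (at least 80% phospho-Ser, at most 4% phospho-Tyr) are illustrative assumptions chosen only because they separate the accepted data sets in Table 1 from the rejected data set described in the text; they are not thresholds proposed in this work.

```python
from collections import Counter
from typing import Dict, Iterable


def site_distribution(sites: Iterable[str]) -> Dict[str, float]:
    """Percentages of phospho-Ser, -Thr, and -Tyr among one-letter site codes."""
    counts = Counter(sites)
    total = sum(counts[residue] for residue in "STY")
    if total == 0:
        raise ValueError("no S, T, or Y sites supplied")
    return {residue: 100.0 * counts[residue] / total for residue in "STY"}


def looks_plausible(dist: Dict[str, float],
                    min_ser: float = 80.0, max_tyr: float = 4.0) -> bool:
    """Crude check against the expected pS-rich, pY-poor distribution.

    The default cut-offs are illustrative assumptions, not published criteria.
    """
    return dist["S"] >= min_ser and dist["Y"] <= max_tyr


if __name__ == "__main__":
    accepted = {"S": 89.3, "T": 9.0, "Y": 1.7}   # neuro2a, Table 1
    rejected = {"S": 76.0, "T": 19.7, "Y": 4.4}  # rejected data set in the text
    print(looks_plausible(accepted), looks_plausible(rejected))
```

A check of this kind flags a suspicious list as a whole; it does not identify which individual assignments are false-positives.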
Table 3. Number of Phosphates in Identified Phosphopeptides

| database | no. of repeated analyses | no. of phosphopeptides (no. of proteins) | one phosphate | two phosphates | three or more phosphates |
|---|---|---|---|---|---|
| NCBI | 34 (4 + 30) | 994^b (587) | 70.4% | 28.4% | 1.2% |
| CIP^a | 4 | 394^c (297) | 57.6% | 38.3% | 4.1% |

a Total number of dephosphopeptides/nonphosphopeptides was 1492. b Criteria for identification were mass accuracy better than 3 ppm and Mascot score more than 30. c Criteria for identification were mass accuracy better than 3 ppm, Mascot score more than 30, and acceptable ratio of phosphopeptide/dephosphopeptide. Before the retention-time criterion was applied, the total number identified was 435.
Acknowledgment. The authors are grateful to Takashi Seiki for animal care and Tsuyoshi Tabata for data management in Eisai. Special thanks are also extended to Jesper Olsen at the Max Planck Institute of Biochemistry, Germany, for helping with setup of the LTQ-FT SIM-MS2 method. This work was supported by funds from the New Energy and Industrial Technology Development Organization, Japan (NEDO), and Japan Science and Technology Agency (JST).
Supporting Information Available: Tables listing the phosphopeptides identified in neuro2a cells and in mouse forebrains by LTQ-FT after IMAC enrichment, retention time comparison before/after CIP treatment for neuro2a-phosphopeptides and for mouse forebrains-phosphopeptides, phosphopeptides in neuro2a cells identified using CIP database and LTQ-FT data after IMAC enrichment, and phosphopeptides in neuro2a cells identified by 30 repeated analyses. This material is available free of charge via the Internet at http://pubs.acs.org.
References
(1) Pawson, T.; Scott, J. D. Protein phosphorylation in signaling - 50 years and counting. Trends Biochem. Sci. 2005, 30, 286-290.
(2) Kobe, B.; Kampmann, T.; Forwood, J. K.; Listwan, P.; Brinkworth, R. I. Substrate specificity of protein kinases and computational prediction of substrates. Biochim. Biophys. Acta 2005, 1754, 200-209.
(3) Yan, J. X.; Packer, N. H.; Gooley, A. A.; Williams, K. L. Protein phosphorylation: technologies for the identification of phosphoamino acids. J. Chromatogr., A 1998, 808, 23-41.
(4) Manning, D. R.; DiSalvo, J.; Stull, J. T. Protein phosphorylation: quantitative analysis in vivo and in intact cell systems. Mol. Cell. Endocrinol. 1980, 19, 1-19.
(5) Pandey, A.; et al. Analysis of receptor signaling pathways by mass spectrometry: identification of vav-2 as a substrate of the epidermal and platelet-derived growth factor receptors. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 179-184.
(6) Figeys, D.; et al. Electrophoresis combined with novel mass spectrometry techniques: powerful tools for the analysis of proteins and proteomes. Electrophoresis 1998, 19, 1811-1818.
(7) Beausoleil, S. A.; et al. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 12130-12135.
(8) Oda, Y.; Nagasu, T.; Chait, B. T. Enrichment analysis of phosphorylated proteins as a tool for probing the phosphoproteome. Nat. Biotechnol. 2001, 19, 379-382.
(9) Reinders, J.; Sickmann, A. State-of-the-art in phosphoproteomics. Proteomics 2005, 5, 4052-4061.
(10) Zhang, N.; et al. ProbIDtree: an automated software program capable of identifying multiple peptides from a single collision-induced dissociation spectrum collected by a tandem mass spectrometer. Proteomics 2005, 5, 4096-4106.
(11) Chen, Y.; Kwon, S. W.; Kim, S. C.; Zhao, Y. Integrated approach for manual evaluation of peptides identified by searching protein sequence databases with tandem mass spectra. J. Proteome Res. 2005, 4, 998-1005.
(12) Weatherly, D. B.; et al. A Heuristic method for assigning a false-discovery rate for protein identifications from Mascot database search results. Mol. Cell. Proteomics 2005, 4, 762-772.
(13) Katayama, H.; et al. Efficient in-gel digestion procedure using 5-cyclohexyl-1-pentyl-beta-D-maltoside as an additive for gel-based membrane proteomics. Rapid Commun. Mass Spectrom. 2004, 18, 2388-2394.
(14) Kokubu, M.; Ishihama, Y.; Sato, T.; Nagasu, T.; Oda, Y. Specificity of immobilized metal affinity-based IMAC/C18 tip enrichment of phosphopeptides for protein phosphorylation analysis. Anal. Chem. 2005, 77, 5144-5154.
(15) Ishihama, Y.; Rappsilber, J.; Andersen, J. S.; Mann, M. Microcolumns with self-assembled particle frits for proteomics. J. Chromatogr., A 2002, 979, 233-239.
(16) Ficarro, S. B.; et al. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 2002, 20, 301-305.
(17) Sano, A.; Nakamura, H. Chemo-affinity of titania for the column-switching HPLC analysis of phosphopeptides. Anal. Sci. 2004, 20, 565-566.
(18) Larsen, M. R.; Thingholm, T. E.; Jensen, O. N.; Roepstorff, P.; Jorgensen, T. J. Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol. Cell. Proteomics 2005, 4, 873-886.
(19) Raska, C. S.; et al. Pseudo-MS3 in a MALDI orthogonal quadrupole-time of flight mass spectrometer. J. Am. Soc. Mass Spectrom. 2002, 13, 1034-1041.
(20) Liao, P. C.; Leykam, J.; Andrews, P. C.; Gage, D. A.; Allison, J. An approach to locate phosphorylation sites in a phosphoprotein: mass mapping by combining specific enzymatic degradation with matrix-assisted laser desorption/ionization mass spectrometry. Anal. Biochem. 1994, 219, 9-20.
(21) Zhang, X.; et al. Identification of phosphorylation sites in proteins separated by polyacrylamide gel electrophoresis. Anal. Chem. 1998, 70, 2050-2059.
(22) Stensballe, A.; Andersen, S.; Jensen, O. N. Characterization of phosphoproteins from electrophoretic gels by nanoscale Fe(III) affinity chromatography with off-line mass spectrometry analysis. Proteomics 2001, 1, 207-222.
(23) Torres, M. P.; Thapar, R.; Marzluff, W. F.; Borchers, C. H. Phosphatase-directed phosphorylation-site determination: a synthesis of methods for the detection and identification of phosphopeptides. J. Proteome Res. 2005, 4, 1628-1635.
(24) Hunter, T.; Sefton, B. M. Transforming gene product of Rous sarcoma virus phosphorylates tyrosine. Proc. Natl. Acad. Sci. U.S.A. 1980, 77, 1311-1315.
(25) Olsen, J. V.; et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006, 127, 635-648.
(26) Shen, Y.; et al. Packed capillary reversed-phase liquid chromatography with high-performance electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry for proteomics. Anal. Chem. 2001, 73, 1766-1775.
(27) Steen, H.; Jebanathirajah, J. A.; Rush, J.; Morrice, N.; Kirschner, M. W. Phosphorylation analysis by mass spectrometry: Myths, facts, and the consequences for qualitative and quantitative measurements. Mol. Cell. Proteomics 2006, 5, 172-181.
(28) Nevill-Manning, C. G.; Wu, T. D.; Brutlag, D. L. Highly specific protein sequence motifs for genome analysis. Proc. Natl. Acad. Sci. U.S.A. 1998, 95, 5865-5871.
(29) Ishihama, Y.; et al. Quantitative mouse brain proteomics using culture-derived isotope tags as internal standards. Nat. Biotechnol. 2005, 23, 617-621.