Anal. Chem. 2010, 82, 824–832

Exploring the Human Leukocyte Phosphoproteome Using a Microfluidic Reversed-Phase-TiO2-Reversed-Phase High-Performance Liquid Chromatography Phosphochip Coupled to a Quadrupole Time-of-Flight Mass Spectrometer

Reinout Raijmakers,†,‡ Karsten Kraiczek,§ Ad P. de Jong,| Shabaz Mohammed,*,†,‡ and Albert J. R. Heck*,†,‡,⊥

Biomolecular Mass Spectrometry and Proteomics Group, Bijvoet Center for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands, Netherlands Proteomics Centre, Padualaan 8, 3584 CH Utrecht, The Netherlands, Agilent Technologies R&D and Marketing GmbH & Company KG, Hewlett-Packard-Strasse 8, 76337 Waldbronn, Germany, Laboratory for Vaccine Research, Unit Research and Development, Netherlands Vaccine Institute, 3720 AL Bilthoven, The Netherlands, and Centre for Biomedical Genetics, Padualaan 8, 3584 CH Utrecht, The Netherlands

The study of protein phosphorylation events is one of the most important challenges in proteome analysis. Despite the importance of phosphorylation for many regulatory processes in cells and many years of phosphoprotein and phosphopeptide research, the identification and characterization of phosphorylation by mass spectrometry is still a challenging task. Recently, we introduced an approach that facilitates the analysis of phosphopeptides by performing automated, online TiO2 enrichment of phosphopeptides prior to mass spectrometry (MS) analysis. The implementation of that method on a “plug-and-play” microfluidic high-performance liquid chromatography (HPLC) chip design will potentially open up efficient phosphopeptide enrichment methods, enabling phosphoproteomics analyses by a broader research community. Following our initial proof of principle, whereby the device was coupled to an ion trap, we now show that this so-called phosphochip is capable of the enrichment of large numbers of phosphopeptides from complex cellular lysates, which can be more readily identified when coupled to a higher resolution quadrupole time-of-flight (Q-TOF) mass spectrometer. We use the phosphochip-Q-TOF setup to explore the phosphoproteome of nonstimulated primary human leukocytes, where we identify 1012 unique phosphopeptides corresponding to 960 different phosphorylation sites, providing for the first time an overview of the phosphoproteome of these important circulating white blood cells.

* To whom correspondence should be addressed. E-mail: [email protected] (A.J.R.H.), [email protected] (S.M.).
† Utrecht University.
‡ Netherlands Proteomics Centre.
§ Agilent Technologies R&D and Marketing GmbH & Company KG.
| Netherlands Vaccine Institute.
⊥ Centre for Biomedical Genetics.

Phosphorylation of proteins plays a key role in the regulation of many signaling pathways, the targeting of proteins to specific


Analytical Chemistry, Vol. 82, No. 3, February 1, 2010

subcellular compartments, and the control of enzymatic functions.1,2 Phosphorylation of proteins is a post-translational modification (PTM) that involves the addition of a phosphate group to the hydroxyl side chain of serine, threonine, or tyrosine residues. It is, perhaps, the most abundantly present PTM in prokaryotic and eukaryotic organisms.3,4 Despite the importance and wide abundance of phosphorylation, the analysis of protein phosphorylation in a biological context still remains a challenging task. Since phosphorylation of proteins can function as an “on/off” switch for their enzymatic function, finite levels of each protein will be modified at a given time. Thus, the amount of protein that is actually phosphorylated at a certain position is often substoichiometric, which complicates detection of the phosphorylated state(s) of the protein.5 The relative instability and negative charge of the phosphorylation moiety mean that special care is required to prevent loss of phospho groups during sample preparation. Additional issues hampering optimal analysis are the less than optimal fragmentation of phosphopeptides in collision-induced dissociation (CID) tandem mass spectrometry (MS/MS) experiments and the concomitant presence of a large number of abundant nonphosphorylated peptides in the mixtures to be analyzed.6,7

10.1021/ac901764g © 2010 American Chemical Society. Published on Web 01/08/2010

Figure 1. Schematic overview of the HPLC phosphochip Q-TOF configuration. In the phosphochip the two reversed-phase C18 trapping columns are indicated in black, the TiO2 phosphopeptide trapping column is in red, and the analytical C18 column is in blue. The main components of the Q-TOF ion optics (octopoles, mirror, pulser) as well as the quadrupole mass filter, the argon (Ar) collision cell, and the detector are indicated.

(1) Jensen, O. N. Nat. Rev. Mol. Cell Biol. 2006, 7, 391–403.
(2) Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. Science 2002, 298, 1912–1934.
(3) Linding, R.; Jensen, L. J.; Ostheimer, G. J.; van Vugt, M. A.; Jorgensen, C.; Miron, I. M.; Diella, F.; Colwill, K.; Taylor, L.; Elder, K.; Metalnikov, P.; Nguyen, V.; Pasculescu, A.; Jin, J.; Park, J. G.; Samson, L. D.; Woodgett, J. R.; Russell, R. B.; Bork, P.; Yaffe, M. B.; Pawson, T. Cell 2007, 129, 1415–1426.
(4) Mann, M.; Jensen, O. N. Nat. Biotechnol. 2003, 21, 255–261.
(5) Han, G.; Ye, M.; Zou, H. Analyst 2008, 133, 1128–1138.
(6) Steen, H.; Jebanathirajah, J. A.; Rush, J.; Morrice, N.; Kirschner, M. W. Mol. Cell. Proteomics 2006, 5, 172–181.
(7) Boersema, P. J.; Mohammed, S.; Heck, A. J. J. Mass Spectrom. 2009, 44, 861–878.
(8) Ficarro, S. B.; McCleland, M. L.; Stukenberg, P. T.; Burke, D. J.; Ross, M. M.; Shabanowitz, J.; Hunt, D. F.; White, F. M. Nat. Biotechnol. 2002, 20, 301–305.
(9) Stensballe, A.; Andersen, S.; Jensen, O. N. Proteomics 2001, 1, 207–222.
(10) Blacken, G. R.; Gelb, M. H.; Turecek, F. Anal. Chem. 2006, 78, 6065–6073.
(11) Pinkse, M. W.; Uitto, P. M.; Hilhorst, M. J.; Ooms, B.; Heck, A. J. Anal. Chem. 2004, 76, 3935–3943.
(12) Larsen, M. R.; Thingholm, T. E.; Jensen, O. N.; Roepstorff, P.; Jorgensen, T. J. Mol. Cell. Proteomics 2005, 4, 873–886.
(13) Thingholm, T. E.; Jorgensen, T. J.; Jensen, O. N.; Larsen, M. R. Nat. Protoc. 2006, 1, 1929–1935.
(14) McNulty, D. E.; Annan, R. S. Mol. Cell. Proteomics 2008, 7, 971–980.
(15) Thingholm, T. E.; Jensen, O. N.; Larsen, M. R. Proteomics 2009, 9, 1451–1468.
(16) Thingholm, T. E.; Larsen, M. R. Methods Mol. Biol. 2009, 527, 57–66.
(17) Lemeer, S.; Jopling, C.; Gouw, J.; Mohammed, S.; Heck, A. J.; Slijper, M.; den Hertog, J. Mol. Cell. Proteomics 2008, 7, 2176–2187.
(18) Pinkse, M. W.; Mohammed, S.; Gouw, J. W.; van Breukelen, B.; Vos, H. R.; Heck, A. J. J. Proteome Res. 2008, 7, 687–697.
(19) Mohammed, S.; Kraiczek, K.; Pinkse, M. W.; Lemeer, S.; Benschop, J. J.; Heck, A. J. J. Proteome Res. 2008, 7, 1565–1571.

To overcome some of these issues, immense effort has been devoted in recent years to improving the detection and characterization of phosphopeptides. Arguably, the greatest advances have been achieved in the development of multiple enrichment techniques for phosphopeptides. The most commonly used methods include immobilized metal affinity chromatography (IMAC),8-10 titanium dioxide (TiO2)-based affinity enrichment,11-13 and hydrophilic interaction chromatography (HILIC).14 Each of these methods has its strengths and weaknesses, and they most likely enrich different subsets of the phosphoproteome.15 The TiO2-based approach has proven to be one of the simplest to implement due to the robustness of the material, which possesses a high affinity for phosphopeptides. Furthermore, one can also make use of the relatively mild conditions utilized in reversed-phase chromatography.16 Such a combination makes this approach particularly suitable for automated, online enrichment of phosphopeptides.11,17,18 Recently we succeeded in transferring the online TiO2 methodology to a microfluidic high-performance liquid chromatography (HPLC) chip environment, demonstrating LC-chip-based enrichment and separation of phosphopeptides.19 Following that initial proof of concept, we here combine the TiO2-based chip with a high mass resolution quadrupole time-of-flight (Q-TOF) analyzer (Figure 1) to profile in vivo

phosphorylation sites in human primary cells. Although a few earlier reported studies focused on the analysis of the phosphoproteome of cell lines derived from specific types of leukocytes, very little data is available on the authentic phosphorylation taking place in blood cells in vivo. Using the TiO2-based phosphochip we provide, for the first time, a global screening of the phosphoproteome of primary human leukocytes. As these cells are relatively undemanding to isolate/collect from humans, they may form an important alternative platform for biomarker research.

MATERIALS AND METHODS

Leukocyte Isolation. Blood (10 mL) was, with informed consent, collected from three healthy donors (two male, one female) in EDTA tubes. The blood was centrifuged at 1500g and 4 °C for 10 min. The cell pellet was resuspended in 10 mL of 0.9% NaCl, and 40 mL of cold erythrocyte lysis buffer (155 mM NH4Cl, 10 mM NaHCO3, 0.1 mM EDTA) was added. After incubation on ice for 10 min to allow lysis of the erythrocytes, all remaining cells were collected by centrifugation at 1500g and 4 °C for 5 min. After resuspension in 5 mL of erythrocyte lysis buffer and incubation on ice for 10 min to lyse remaining erythrocytes, the leukocytes were pelleted by centrifugation (1500g and 4 °C for 10 min). Subsequent to washing the leukocytes once with 0.9% NaCl, the cells were snap-frozen and stored at -80 °C.

Proteolytic Digestion. To generate the reference protein mixture, three proteins, bovine serum albumin and bovine α- and β-casein, were each dissolved (100 µM) in 1 M urea and 50 mM ammonium bicarbonate. The proteins were reduced in 1 mM DTT and alkylated in 2 mM iodoacetamide, followed by digestion with trypsin overnight at a protein/protease ratio of 50:1. The final mixture was prepared for analysis by diluting the protein digests in 10% formic acid to a final concentration of 20 fmol/µL.
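As a quick check on the reference-mixture preparation described above, the dilution from the 100 µM digestion stocks to the 20 fmol/µL working solution can be worked out as follows. This is only an illustrative sketch: it assumes each protein digest is diluted independently and that the volume does not change during reduction, alkylation, and digestion, neither of which is stated in the text. The function name `dilution_factor` is hypothetical.

```python
def dilution_factor(stock_uM: float, target_fmol_per_uL: float) -> float:
    """Fold dilution needed to go from a stock in µM to a target in fmol/µL."""
    # Unit conversion: 1 µM = 1 pmol/µL = 1000 fmol/µL
    stock_fmol_per_uL = stock_uM * 1000.0
    return stock_fmol_per_uL / target_fmol_per_uL


# 100 µM stock diluted to 20 fmol/µL (values taken from the text)
factor = dilution_factor(100.0, 20.0)
print(f"{factor:.0f}-fold dilution")
```

Under these assumptions the digest would need to be diluted several thousand-fold, which in practice is usually done in two or more serial steps rather than in one.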
For digestion of the human U2OS osteosarcoma cells and leukocyte proteins, the cells were resuspended in lysis buffer with
phosphatase inhibitors (8 M urea, 50 mM NH4HCO3, 1 mM KF, 5 mM NaH2PO4, 1 mM Na2VO4), sonicated 2 times for 20 s, and centrifuged at 13 000g and 4 °C for 10 min to remove insoluble material. Lysate corresponding to 1 mg of protein (as determined by a Bradford assay) was reduced with 1,4-dithiothreitol (10 mM) and alkylated with iodoacetamide reagent (20 mM). After diluting the sample to 2 M urea (using lysis buffer lacking the urea), 10 µg of trypsin was added and the sample was incubated for 4 h at 37 °C. Next, the sample was further diluted to 1 M urea, another 10 µg of trypsin was added, and the sample was incubated overnight at 37 °C. The digested lysate was stored at -20 °C prior to analysis.

Strong Cation-Exchange Separation. Strong cation-exchange chromatography (SCX) was performed using an Agilent 1100 HPLC system (Agilent Technologies) with two C18 Opti-Lynx (Optimized Technologies, Oregon OR) guard columns and a polysulfoethyl A SCX column (PolyLC, Columbia, MD; 200 mm × 2.1 mm i.d., 5 µm, 200 Å), essentially as described previously.20 The digested cell lysate was dissolved in 0.05% formic acid, and 750 µg was loaded onto the guard column at 100 µL/min and subsequently eluted onto the SCX column with 80% acetonitrile (ACN) and 0.05% formic acid (FA). SCX buffer A was made of 5 mM KH2PO4, 30% ACN, and 0.05% FA, pH 2.7; SCX buffer B consisted of 350 mM KCl, 5 mM KH2PO4, 30% ACN, and 0.05% FA, pH 2.7. The gradient was performed as follows: 0% B for 10 min, 0-85% B in 35 min, 85-100% B in 6 min, and 100% B for 4 min. A total of 32 fractions were collected and dried in a vacuum centrifuge. The individual fractions were dissolved in 10% FA and analyzed by LC-MS/MS. The eight richest fractions, corresponding to doubly charged peptides, were analyzed twice by LC-MS/MS to improve proteome coverage.

HPLC Chip. All LC-MS/MS experiments were performed using an Agilent 1200 series HPLC-Chip LC system connected to an Agilent 6520 Q-TOF mass spectrometer.
For regular LC-MS/MS analyses a custom HPLC-Chip was used containing a 160 nL Aqua C18 (5 µm; Phenomenex, Torrance, CA) trapping column and a 15 cm, 75 µm Reprosil C18 analytical column. For phosphopeptide enrichment, an in-house designed chip was used,19 recently commercially introduced as the Agilent HPLC phosphochip (G4240-62020), which has a three-sectioned trapping column consisting of first a 100 nL C18 trapping column (Zorbax Extend 5 µm), a 40 nL TiO2 column (10 µm, GL Sciences), and a second 100 nL Zorbax Extend C18 trapping column (see Figure 1). Trapping of peptides was performed at 3 µL/min using 0.6% acetic acid (HAc) and 2% FA in water, and analysis was performed by switching the trap to be in-line with the analytical column and nanoflow pump, followed by a linear gradient from 5% to 40% solvent B at 200 nL/min. Solvents used for analytical HPLC were 0.6% HAc/0.5% FA (solvent A) and 0.6% HAc/0.5% FA/80% ACN (solvent B). The length of the gradient was chosen based on the expected complexity of each sample, ranging from 45 min to 3 h. The exact gradients used on the system are supplied in Supporting Information Table 1. For phosphochip operation, the first analysis was followed by a washing step which consisted of 20 µL of 50% ACN containing selected additives (citric acid, DHB, DMSO). The phosphopeptides were subsequently eluted by injecting two times 20 µL of elution buffer (250 mM NH4HCO3 pH 9; 5 mM KF, 10 mM NaH2PO4, 1 mM Na2VO4), and an analysis of the eluted (phosphorylated) peptides was performed by switching the precolumn to be in-line with the analytical column for a second H2O/ACN gradient as described above. The upper pressure limit for the phosphochip was set at 200 bar for both loading and analysis. The phosphochip was operated in forward flush mode to allow the phosphopeptides to be eluted to the second C18 column, leading to slightly broader peaks compared to the regular HPLC chip, but this effect was minimal, particularly when performing gradients of over 90 min. Many phosphopeptide enrichment analyses (over 100 analytical runs) could be performed using a single phosphochip, illustrating its stability over multiple runs.

(20) Gauci, S.; Helbig, A. O.; Slijper, M.; Krijgsveld, J.; Heck, A. J. R.; Mohammed, S. Anal. Chem. 2009, 81, 4493–4501.

Mass Spectrometry. The phosphochip was directly coupled to an Agilent 6520 Q-TOF mass spectrometer (Agilent, Santa Clara, CA), which was operated with the ADC at 4 GHz, “high-resolution mode” in a data-dependent manner, automatically switching between MS and MS/MS, with a quad AMU setting of 400 m/z. Acquisition times were 500 ms for MS spectra (m/z 350-2000) and 333 ms for MS/MS spectra (m/z 50-2000). The three most intense ions (minimum intensity 2500) were selected for collision-induced fragmentation with argon as a collision gas using a collision energy corresponding to 3 V per 100 m/z units. Per selected precursor, two MS/MS spectra could be generated using the applied 30 s dynamic exclusion window.

Data Processing and Analysis. All LC-MS/MS data was analyzed using both Mascot (2.2) and Spectrum Mill (A.03.03.080 SR1) against all human proteins from the Swissprot database (56.2), 20 341 entries, augmented with a concatenated decoy database to estimate false discovery rates (FDR).
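The idea behind a concatenated target-decoy search can be sketched as follows: target and decoy sequences are searched together, and the number of decoy hits passing a score threshold is used to estimate how many of the accepted target hits are likely to be false. The text does not state which estimator was used, so the simple decoy/target ratio below is an assumption, and the function name and example scores are made up for illustration.

```python
from typing import Iterable, Tuple


def estimate_fdr(psms: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR as (#decoy hits) / (#target hits) at or above `threshold`.

    Each peptide-spectrum match is a (score, is_decoy) pair. This ratio is one
    common estimator for concatenated target-decoy searches; it is an
    assumption here, not necessarily the formula used by the authors.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in psms:
        if score >= threshold:
            if is_decoy:
                decoys += 1
            else:
                targets += 1
    # With no accepted target hits there is nothing to estimate.
    return decoys / targets if targets else 0.0


# Made-up scores, for illustration only.
example = [(45.0, False), (38.2, False), (33.1, True), (31.5, False),
           (29.0, True), (52.3, False), (31.0, False)]
print(f"Estimated FDR at score >= 31: {estimate_fdr(example, 31.0):.3f}")
```

Raising the score threshold typically lowers the estimated FDR at the cost of fewer accepted identifications, which is the trade-off behind choosing search-engine score cutoffs.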
For Mascot analysis, deisotoped peak lists were generated using MassHunter (B.01.03) and all resulting spectra were filtered for a minimum absolute peak intensity of 100 counts and a minimum m/z of 170. For Spectrum Mill analysis, MS/MS spectra from the same precursor mass were merged within a range of 1.4 m/z units and a retention time of 15 s and peak lists were generated using the Spectrum Mill data extractor. For both Mascot and Spectrum Mill, variable modifications allowed were oxidation of methionine and phosphorylation of serine, threonine, and tyrosine. For Mascot analysis, protein N-terminal acetylation was additionally included as a variable modification. Precursor mass tolerance for database searching was set at 50 ppm; product mass tolerance was set at 50 ppm for Spectrum Mill and at 0.02 Da for Mascot. For Mascot analysis, semitrypsin was chosen as enzyme, with a maximum of three missed cleavages. For Spectrum Mill analysis a combination of trypsin and the manually defined enzyme elastase (cleavage C-terminal of A, I, or V) was used, with a total maximum of eight missed cleavages. For our final data set, we only accepted peptides with a mass between 800 and 5000 Da, a Mascot score of at least 31 or a Spectrum Mill score of at least 7.1 (with a Spectrum Mill Delta Reverse score of at least 5), and a maximum of 15 ppm deviation in precursor mass, excluding any identified peptides outside these thresholds. This resulted in an estimated FDR of 1.5% for both the Mascot and the Spectrum Mill identifications. All identifications were converted to PRIDE XML format using the PRIDE Converter program (http://code.google.com/p/pride-converter/) and have been uploaded to the PRIDE public data repository (http://www.ebi.ac.uk/pride/) under accession numbers 9759, 9763, 9768, and 10527-10530 to allow third-party analysis and validation of the data set. Motif-X analysis was performed at p < 1 × 10⁻⁶, with a window of 21 residues and the IPI human database as a background data set.21 To obtain sufficient amino acids surrounding the identified phosphorylation site for motif analysis, additional sequence information was gathered when necessary from the database entry of the protein to which the phosphopeptide was assigned. Sequence logos were generated using Weblogo.22

RESULTS

White blood cells, or leukocytes, are cells of the immune system that defend the body against both infectious disease, such as that caused by viruses and bacteria, and foreign materials. Leukocytes are a mixture of several different cell types, each involved in different parts of the human immune system. Leukocytes consist approximately of 60% granulocytes (of which 95% are neutrophils, 4% eosinophils, and 1% basophils) and 40% mononuclear cells (of which 10% are B-cells, 60% T-cells, and 30% natural killer cells).23 Leukocytes can be isolated from blood in a relatively mild and efficient manner, by erythrocyte lysis and differential centrifugation. With the use of this method, the cells can be kept cold throughout the procedure, which has the potential benefit that minimal loss of phosphate groups due to phosphatase activity occurs during cell disruption. Additionally, the sample preparation method involves the use of only salts, making the resulting sample highly compatible with in-solution digestion and mass spectrometric analysis. Therefore, total leukocyte preparations may present an attractive alternative platform for biomarker analysis.
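The nested percentages quoted above for leukocyte composition can be converted into approximate overall fractions of the total leukocyte population by multiplying each subtype percentage by that of its parent class. The short sketch below does only this arithmetic, using the figures from the text; the data structure and function name are hypothetical.

```python
# Nested percentages as quoted in the text (ref 23), expressed as fractions:
# class -> (fraction of all leukocytes, {subtype: fraction within class})
composition = {
    "granulocytes": (0.60, {"neutrophils": 0.95, "eosinophils": 0.04, "basophils": 0.01}),
    "mononuclear cells": (0.40, {"B-cells": 0.10, "T-cells": 0.60, "natural killer cells": 0.30}),
}


def overall_fractions(comp):
    """Multiply each subtype fraction by the fraction of its parent class."""
    return {
        subtype: parent * frac
        for parent, subtypes in comp.values()
        for subtype, frac in subtypes.items()
    }


fractions = overall_fractions(composition)
for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(f"{name:22s} {100 * frac:5.1f}%")
```

On these approximate figures, neutrophils make up more than half of all leukocytes, which fits the later observation that the neutrophil-derived protease elastase contributes noticeably to sample complexity.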
Recently, we described an inventory of the proteome of human leukocytes derived from different individuals in a quantitative manner, examining in particular biological variation in protein expression between leukocytes.24 Here, we extended this work and set out to map the phosphoproteome of these primary cells, as currently very little information is available on phosphorylation events in primary white blood cells. We hypothesize that in vivo phosphorylation events detected in primary cells can potentially be more relevant read-outs of biological processes than those observed in in vitro cultured cells or cancer-derived cell lines. We expected that the phosphoproteome analysis of these primary human cells would be a challenging task, as the amount of observable protein phosphorylation in these cells is likely to be lower than in cells cultured in the laboratory, especially compared to when receptor kinases in the latter have been activated or when phosphatase activity has been blocked. In addition, our previous quantitative proteomics study on these leukocytes revealed significant activity of endogenous leukocyte proteases, of which the neutrophil-derived protease elastase showed the most activity, significantly increasing sample complexity and thus complicating protein analysis.24

(21) Schwartz, D.; Gygi, S. P. Nat. Biotechnol. 2005, 23, 1391–1398.
(22) Crooks, G. E.; Hon, G.; Chandonia, J. M.; Brenner, S. E. Genome Res. 2004, 14, 1188–1190.
(23) Alberts, B.; Johnson, A.; Lewis, J.; Raff, M.; Roberts, K.; Walter, P. Molecular Biology of the Cell, 4th ed.; Garland Science: New York, 2002.
(24) Raijmakers, R.; Heck, A. J.; Mohammed, S. Mol. Biosyst. 2009, 5, 992–1003.

The phosphoproteome analysis of these primary human cells would also present a good real-life test case to challenge our newly designed TiO2-based phosphochip in combination with a Q-TOF mass spectrometer (see Figure 1).19 To ensure a selective and efficient enrichment of phosphopeptides on the phosphochip we first tested, as described in detail in the Supporting Information, the ability of various washing steps with organic compounds (DHB, citric acid, DMSO) dissolved in 50% ACN to reduce the binding of nonphosphorylated, especially acidic, peptides to the TiO2 enrichment column.8,25 DHB and citric acid have been shown before to assist in improving the specificity of phosphopeptide enrichment, but here we found that the use of a wash step with 1% DMSO significantly reduced the binding of acidic peptides, while the phosphopeptides were retained on the TiO2 column, providing an alternative which is directly LC-MS/MS compatible (see the Supporting Information for details).

To evaluate the performance of the phosphochip when analyzing complex biological samples we first analyzed laboratory cultured human U2OS osteosarcoma cells. To this end, 1 mg of protein derived from U2OS osteosarcoma cells was digested by trypsin and the resulting peptides were fractionated by SCX chromatography. The sample was applied to a polysulfoethyl A SCX column, and peptides were separated using a gradient of KCl under acidic conditions. As shown before, this SCX-based method results in the separation of peptides broadly based on their net charge.26 As most tryptic peptides normally possess a net charge of 2+ and the phosphate group carries a negative charge at the pH used during separation, most of the phosphopeptides elute earlier alongside peptides with a 1+ net charge, whereas most nonphosphorylated peptides carry a net charge of 2+ or 3+ and elute later.20 We recently showed that by using a shallow gradient the two main pools of 1+ peptides (N-acetylated and phosphorylated peptides) can be nearly baseline-separated.20,27 Under shallow gradient conditions we observed that N-acetylated peptides elute earlier than phosphorylated peptides. Unfortunately, the 1+ fractions also contain significant amounts of other compounds originating from the lysate, which cannot be easily identified by mass spectrometric analysis, for example, tryptic peptides corresponding to the C-termini of proteins. Due to the low abundance of most phosphopeptides, it is still advantageous to perform TiO2 enrichment to improve detection of phosphopeptides and increase the frequency of relevant ions being fragmented by the mass spectrometer (Supporting Information Figure 2A). When all MS/MS spectra from both the flow-through and the elution were subjected to a database search strategy for identification (using Mascot), over 600 unique phosphopeptides could be found in the analysis of the elution, whereas only 32 phosphopeptides were identified in the flow-through, eight of which were also present in the elution fraction (Supporting Information Figure 2A). In the elution fraction, only 11 nonphosphorylated peptides were identified, showing the specificity and high level of enrichment of this approach. The MS/MS spectra obtained during the analysis of the flow-through fraction mainly represented

(25) Mazanek, M.; Mitulović, G.; Herzog, F.; Stingl, C.; Hutchins, J. R.; Peters, J. M.; Mechtler, K. Nat. Protoc. 2007, 2, 1059–1069.
(26) Peng, J.; Elias, J. E.; Thoreen, C. C.; Licklider, L. J.; Gygi, S. P. J. Proteome Res. 2003, 2, 43–50.
(27) Taouatas, N.; Altelaar, A. F.; Drugan, M. M.; Helbig, A. O.; Mohammed, S.; Heck, A. J. Mol. Cell. Proteomics 2009, 8, 190–200.


Figure 2. Phosphochip enrichment of highly complex phosphopeptide mixtures. (A) Total ion chromatograms of the phosphochip analysis of the main net singly charged peptide SCX fraction of 1 mg of human leukocyte lysate (flow-through in green, elution in red). The numbers of phosphopeptides identified in each experiment are indicated. (B) Overlap between the identified phosphopeptides in the LC-MS/MS runs shown in panel A. (C) Overlap between the identified phosphopeptides in the LC-MS/MS analysis of all SCX fractions of the human primary leukocytes of all three donors. (D) Overlap in identified phosphopeptides between analytical triplicates on the phosphochip of selected phosphopeptide fractions of donor 3.

peptides corresponding to protein C-termini (thus lacking a C-terminal arginine or lysine residue, causing them to have a net 1+ charge) and other peptides, which were not identified when searching the human proteins in the Swissprot database. These results clearly show that the phosphochip is capable of separating phosphopeptides from interfering nonphosphorylated peptides and other compounds in the net 1+ SCX fractions. Importantly, the TiO2 HPLC chip in conjunction with SCX enables numbers of phosphopeptide identifications similar to those achievable with present-day alternative off-line or online approaches.

Having established improved procedures for the analysis of complex samples using the phosphochip, we focused our attention on the phosphoproteome of leukocytes. Primary human leukocytes were isolated from the blood of three healthy volunteers (two male, one female) and lysed, and material corresponding to 1 mg of protein per individual was digested with trypsin. The resulting digests were separated by SCX, and for every sample the 30 resulting fractions were analyzed using a phosphochip. When analyzing the main net 1+ fraction obtained from the leukocytes on the phosphochip connected to a Q-TOF mass spectrometer, we observed that the majority of material for the net 1+ fractions ended up in the flow-through (representing over 90% of the total MS signal), as can be seen in Figure 2A. The obtained tandem mass spectra were searched against all human proteins in the Swissprot database using both the Mascot and Spectrum Mill search engines. To accommodate the expected endogenous proteolytic activity described by us before,24 we additionally chose to use semitrypsin as the selected enzyme in our Mascot database searches, which allows peptides to have one nontrypsin-derived terminus. Unfortunately, the current version of Spectrum Mill (in our hands) did not allow for analysis of semitryptic phosphopeptides, and therefore in this case we did a combined trypsin/elastase search (allowing cleavages C-terminal of K, R, V, A, or I), providing us the additional advantage of being able to identify peptides of which both termini were generated by elastase (defined here as cleavage after V, A, or I, which is based on the MEROPS database of proteases28). A typical phosphochip analysis of a single net 1+ SCX fraction of the leukocytes is represented in Figure 2B, where 108 phosphopeptides could be identified combining the results of the Mascot and Spectrum Mill searches, of which 103 were present in the elution of the phosphochip.

(28) Rawlings, N. D.; Morton, F. R.; Kok, C. Y.; Kong, J.; Barrett, A. J. Nucleic Acids Res. 2008, 36, D320–325.
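The two enzyme definitions described above (semitrypsin in Mascot; combined trypsin/elastase in Spectrum Mill) can be made concrete by classifying each terminus of a peptide according to the residue that precedes the cleavage. The sketch below is a simplified illustration, not the logic of either search engine: it ignores the usual exception for proline following the cleavage site, uses the first occurrence of the peptide in the protein, and the function names and toy sequence are hypothetical.

```python
TRYPSIN_SITES = set("KR")     # trypsin: cleavage C-terminal of K or R
ELASTASE_SITES = set("VAI")   # elastase as defined in the text: C-terminal of V, A, or I


def terminus_origin(preceding_residue):
    """Classify a cleavage by the residue immediately before the cut."""
    if preceding_residue is None:          # protein N- or C-terminus: no cleavage
        return "protein terminus"
    if preceding_residue in TRYPSIN_SITES:
        return "trypsin"
    if preceding_residue in ELASTASE_SITES:
        return "elastase"
    return "other"


def classify_peptide(protein, peptide):
    """Return the origin of the (N-terminal, C-terminal) cleavage of `peptide`."""
    start = protein.find(peptide)          # first occurrence only
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    n_prev = protein[start - 1] if start > 0 else None
    c_prev = peptide[-1] if end < len(protein) else None
    return terminus_origin(n_prev), terminus_origin(c_prev)


# Hypothetical toy sequence, not a real protein.
protein = "MSDKAELVGSPTRLLAQEWKFG"
for pep in ("AELVGSPTR", "GSPTR", "LLAQ", "GSPTRLLA"):
    print(pep, classify_peptide(protein, pep))
```

In this toy example the four peptides fall into four different combinations: both termini tryptic, one tryptic and one elastase terminus, one tryptic and one unassigned terminus (reachable only with a semitryptic search), and both termini from elastase (reachable only with the combined trypsin/elastase search).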

Figure 3. Peptide identifications from human leukocytes of donor 1. (A) Identification of different subpopulations of all peptides by Mascot and Spectrum Mill. Bars indicate the percentage of peptides identified by Spectrum Mill alone (orange), Mascot alone (purple), or by both search engines (red). The actual numbers of peptides in each group are indicated in the bars. (B) Scatterplot of the Mascot score vs the Spectrum Mill score for all peptides identified by both search engines. Applied cutoff scores for each engine are indicated by the two red lines. The bottom two panels show the MS/MS spectra of peptides SNFDEEFTGEAPTLpSPPR (panel C) of the serine/threonine-protein kinase N1 (phosphorylated S916 of PKN1_HUMAN) and ESVPEFPLpSPPK (panel D) of the protein stathmin 1 (phosphorylated S38 of STMN1_HUMAN). The most prominent identified product ions are indicated in color (a-ions in green, b-ions in blue, y-ions in red, internal fragments in orange, and immonium ions in purple). The mass accuracy observed for each identified fragment ion is shown below the spectra.
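Mass accuracy, as reported for the fragment ions in Figure 3 and used for the 15 ppm precursor filter, is the relative deviation between an observed and a theoretical m/z. The sketch below computes the theoretical monoisotopic mass of a phosphopeptide from standard residue masses and expresses a deviation in ppm. The helper names are hypothetical, the observed value in the usage example is invented for illustration, and the code is not taken from the authors' pipeline.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565     # H2O added for the free N- and C-termini
PROTON = 1.007276     # mass of a proton
PHOSPHO = 79.966331   # HPO3 added per phosphorylated S/T/Y


def peptide_mass(sequence, n_phospho=0):
    """Neutral monoisotopic mass of a peptide with `n_phospho` phosphate groups."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_phospho * PHOSPHO


def mz(mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Relative deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Stathmin 1 peptide from Figure 3D, ESVPEFPLpSPPK (one phosphoserine), as 2+ ion.
theoretical = mz(peptide_mass("ESVPEFPLSPPK", n_phospho=1), charge=2)
observed = theoretical * (1 + 8e-6)    # invented reading, 8 ppm too high
print(f"theoretical m/z = {theoretical:.4f}")
print(f"deviation = {ppm_error(observed, theoretical):.1f} ppm")
print("passes 15 ppm filter:", abs(ppm_error(observed, theoretical)) <= 15.0)
```

Expressing tolerances in ppm rather than in Da keeps the filter equally strict across the whole mass range, which matters when accepted peptides span 800 to 5000 Da.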

Since this data was obtained with the same procedures as used for the single SCX fraction of the U2OS osteosarcoma cells, the analysis indicates a near 6-fold reduction of detectable phosphorylation events in the same amount of primary leukocyte sample compared to the U2OS osteosarcoma cells. Combining analyses of all SCX fractions from all leukocyte samples, we were able to increase the number of phosphosites and identified 1012 unique phosphopeptides containing 960 unique phosphorylation sites in 573 different proteins. Combining the flow-through and elution analysis of all SCX fractions, 11 539 peptides in 2767 proteins were identified (either phosphorylated or nonphosphorylated). From the total of 1012 phosphopeptides, 810 were detected exclusively in the elution fractions of the phosphochip, whereas 202 were detected either only in the flow-through or in both fractions (Figure 2C). To determine the reproducibility of the results obtained with the phosphochip when analyzing such samples, we selected three phosphopeptide-rich fractions of donor 3 and repeated their analyses on the phosphochip. The overlap between the three analyses was comparable to that in previously published phosphoproteome analyses, with the overlap between any two analyses being approximately 46% and an overlap of 32% between all three analyses (Figure 2D). We decided to generate a rough comparison between the results obtained with the Mascot and Spectrum Mill database

searches (Figure 3A) using the data obtained from the leukocytes of donor 1. To compare the identifications obtained with Mascot and Spectrum Mill, four different classes of peptides had to be considered because of the different enzyme specificity selections used in the searches. Two of these classes, fully tryptic peptides (both termini generated by trypsin) and certain semitryptic peptides (one terminus generated by trypsin and the other terminus generated by elastase), could be identified by both search engines. A third class, semitryptic peptides of which the nontryptic terminus was not generated by cleavage after V, A, or I, could only be assigned using Mascot, and finally the fourth class of peptides stemming from two elastase cleavages could only be identified by Spectrum Mill. The results of the comparison for the number of peptides identified in these classes are summarized in Figure 3A. In total, Mascot identified 3317 unique peptides and Spectrum Mill 2883. Of these, 1591 peptides were in common (35%), whereas 1292 peptides were solely identified by Spectrum Mill and 1726 by Mascot. In more detail, of all 1515 fully tryptic peptides identified, 56% (847) were identified by both search engines (at an FDR of 1.3%), whereas Mascot and Spectrum Mill had very similar numbers of nonoverlapping identifications (316 and 352, respectively). Peptides generated by a combination of trypsin and elastase cleavage (1572 in total) were slightly more often identified by Spectrum Mill than Mascot, although again approximately half of all peptides (47%) were identified by both search engines. Strikingly, in contrast to the overall result for all peptides, the majority of all phosphopeptides were identified by Spectrum Mill (52% by Spectrum Mill alone, 34% by both engines, and only 14% uniquely by Mascot). Next, we assessed the confidence scores given to the peptides returned by both Mascot and Spectrum Mill (1591 peptides). When comparing the scores assigned by Mascot and Spectrum Mill, we observed only a very limited correlation (Figure 3B). This lack of correlation was independent of whether fully tryptic or semitryptic peptides were considered. In particular, Spectrum Mill relatively often assigned high scores to spectra that were given low scores by Mascot, whereas the opposite was very rare (Figure 3B). Two illustrative examples of identified peptides, containing previously reported phosphorylation sites, with their corresponding tandem mass spectra and the mass accuracy of their product ions are shown in Figure 3. The left spectrum (Figure 3C) was returned by both Mascot and Spectrum Mill with high scores, whereas the right one (Figure 3D) was returned with an acceptable score only by Spectrum Mill.

Figure 4. Global analysis of the phosphoproteome of primary human leukocytes. (A) Bar graphs showing the occurrence of all of the identified proteins, phosphoproteins, and phosphopeptides in the leukocytes of the three donors. Identifications in a single individual are shown in cyan, identifications in two of the donors in light blue, and the dark blue bars show the number of peptides and proteins identified in all three individuals. (B) Bar graph showing a selection of GO classifications and their relative presence (% of total number of proteins) in the complete human proteome (green), in all proteins identified in the leukocytes (in brown), and in the identified phosphoproteins (in green). A list of all biological process GO classifications is supplied in Supporting Information Table 3. (C) Shown are three of the phosphorylation site sequence motifs found to be most increased in the phosphoproteome of human leukocytes compared to the total human proteome, as determined using the Motif-x algorithm. Next to the motifs are the number of sites identified with this motif as well as the fold increase. The overview of all identified sequence motifs is supplied in Supporting Information Figure 3.

All identifications obtained from our analysis have been uploaded to the public data repository PRIDE (http://www.ebi.ac.uk/pride/), under accession numbers 9759, 9763, 9768, and 10527-10530,29 and all phosphopeptides with their corresponding identification scores are listed in Supporting Information Table 2, parts A (leukocytes) and B (U2OS cells). We analyzed which identified phosphosites in our data set had been previously annotated in the UniProt Knowledgebase (using the DAVID bioinformatics resource). This analysis showed that

(29) Jones, P.; Cote, R. Methods Mol. Biol. 2008, 484, 287–303.


of the 573 identified phosphoproteins, 351 had previously been described as phosphoproteins and 222 were identified as phosphoproteins for the first time in this study. Even on the already known phosphoproteins, our data set includes many new, previously unannotated phosphorylation sites. This is most likely due to the fact that we used primary cells for the analysis, in which the phosphorylation pattern of proteins might well be very different from that in cultured cells. A striking example of such a difference is the novel phosphorylation site at serine 406 of the poly(U)-binding-splicing factor PUF60, which was, based on spectral count, present in high amounts in the samples of all three individuals. PUF60 is a protein involved in early mRNA splicing and as such can have significant influence on the expression of many other genes.30 We also looked at the overlap of the identified (phospho)proteins and phosphopeptides between the three individuals (Figure 4A). This revealed significant differences in the detected phosphorylation, even though there was a significant overlap in the proteins identified. Of all identified proteins, approximately 50% were identified in at least two individuals, but this number dropped to around 40% for the proteins found to be phosphorylated, and at the level of phosphopeptides only 19% were identified in more than one individual. As phosphorylation of proteins plays an important role in very specific pathways, we investigated further whether, using the phosphochip, we were enriching for peptides belonging to proteins involved in certain biological processes. We retrieved gene ontology (GO) annotations for all proteins found to be phosphorylated and compared the relative presence of the identified

(30) Hastings, M. L.; Allemand, E.; Duelli, D. M.; Myers, M. P.; Krainer, A. R. PLoS One 2007, 2, e538.

biological processes to their presence in the set of all proteins we identified in these leukocytes and in the complete human proteome. Although many GO categories were represented at similar levels in all three data sets, there were also a few processes that were significantly enriched (see Figure 4B). For example, we could clearly identify enrichment of proteins that play a role in protein amino acid phosphorylation, immune system development, and leukocyte differentiation in the leukocyte phosphoproteins, when compared to the complete human proteome. In addition, we observed enrichment of certain biological processes in the total pool of leukocyte proteins as well. Enriched in both the total set of leukocyte proteins and in the phosphoproteins were, for example, proteins involved in apoptosis and vesicle-mediated transport, whereas proteins related to macromolecule localization and stress response were enriched in the leukocytes, but not significantly in the set of leukocyte phosphoproteins (Figure 4B). A list of all analyzed GO terms is provided in Supporting Information Table 3. Finally, using the motif-x algorithm,21 we performed an analysis on the detected phosphopeptides in order to identify enrichment of certain motifs in our phosphoprotein data set and to potentially relate the phosphorylation events observed in the primary human leukocytes to their responsible kinases. This analysis revealed enrichment of 10 sequence motifs (Supporting Information Figure 3), of which the ones that were found to be most enriched compared to the entire human proteome are shown in Figure 4C. Most were based on phosphorylated serine residues, while one motif was found for threonine phosphorylation. The SP, TP, and related motifs are phosphorylation motifs recognized by many classes of kinases, including cyclin-dependent kinase 2 (CDK2).
The motifs with acidic residues can be linked to casein kinase II (CK2), whereas the arginine-containing motifs are related to the protein kinases A and C (PKA/PKC), all of which have important roles in cell signaling and development. These enriched motifs are as expected; however, they are not sufficiently specific to attribute the enrichment to the activity of specific kinases.

DISCUSSION

Here, we show that a codeveloped microfluidic phosphochip, recently commercially introduced by Agilent, coupled to a Q-TOF high-resolution mass spectrometer can provide a reliable and robust method for phosphopeptide enrichment and analysis. We evaluated the performance of the phosphochip by analyzing the complex phosphoproteome of human U2OS osteosarcoma cells and used it to characterize, for the first time, the phosphoproteome of primary human leukocytes. The phosphochip enabled the enrichment and identification of phosphopeptides to levels approximately similar to those observed in alternative off-line and online phosphoproteomics approaches, but now implemented in an HPLC chip setup coupled to a Q-TOF mass analyzer.18,19

Benefits of Coupling to a High Mass Resolution Q-TOF Instrument. Our initial report on the phosphochip was largely based on the analysis of protein standards analyzed by an ion trap mass analyzer.19 Here, we coupled the HPLC chip technology to a Q-TOF mass analyzer. Q-TOF mass spectrometers can have several advantages over quadrupole ion traps when analyzing low-abundant phosphopeptides in complex samples. First, the high mass accuracy obtainable in the MS/MS spectra (Figure 3, parts C and D) can aid in the accurate identification of peptides and

lead to the better assignment of the exact site of phosphorylation within a peptide, particularly when relatively few, or only low-intensity, backbone fragment ions are observed, as is often the case for phosphorylated peptides.31 The higher mass resolution when acquiring MS/MS spectra also allows better assignment of multiply charged product ions, and noise peaks can be more easily distinguished (Figure 3D). Also, the ability of a Q-TOF to more readily detect low-mass product ions, such as immonium ions, can be beneficial, particularly in tandem MS (Figure 3D). In addition, tandem MS in space (such as in Q-TOF mass spectrometers) yields better fragmentation of phosphopeptides compared to tandem MS in time as performed in ion-trap-based instruments.7 Another important aspect of Q-TOF mass spectrometers is the improved dynamic range, when compared to the quadrupole ion trap utilized in earlier reported experiments,19 which should allow better detection of low-abundant peptides in the presence of coeluting and much more abundant nonphosphorylated peptides.32 However, for the net 1+ SCX fraction of U2OS cells, we observed a significant saturation of the detector by other net 1+ peptides, such as those corresponding to protein C-termini (Supporting Information Figure 2B). Only by using the phosphochip were we able to separate and identify larger numbers of phosphopeptides in this sample. When analyzing the human leukocyte sample, the number of phosphopeptides found in the main net 1+ SCX fraction was significantly lower than observed in the U2OS cells (Figure 2, parts B and D), something that can likely be attributed to a more complex sample preparation procedure and the potentially high activity of endogenous phosphatases in these primary cells. To maximize the total number of identifications in the leukocyte sample, we decided to analyze the results with two different database search engines, Spectrum Mill and Mascot.
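The overlap figures reported for the two search engines (Figure 3A) follow from simple set arithmetic. The Python sketch below re-derives them from the counts quoted in the Results. The function name `overlap_summary` is ours, and reading the reported 35% as the shared fraction of the union of both result sets is an assumption inferred from the numbers, not something stated explicitly in the text.

```python
def overlap_summary(total_a, total_b, shared):
    """Summarize the overlap between two identification lists.

    total_a, total_b: number of unique peptides reported by each engine.
    shared: number of peptides reported by both engines.
    The shared fraction is taken relative to the union (an assumption).
    """
    only_a = total_a - shared
    only_b = total_b - shared
    union = only_a + only_b + shared
    return {
        "only_a": only_a,
        "only_b": only_b,
        "union": union,
        "shared_fraction": shared / union,
    }


# Counts quoted in the Results for donor 1: Mascot 3317, Spectrum Mill 2883,
# 1591 peptides in common.
all_peptides = overlap_summary(3317, 2883, 1591)

# Fully tryptic peptides: 847 shared, 316 Mascot only, 352 Spectrum Mill only.
fully_tryptic = overlap_summary(847 + 316, 847 + 352, 847)

print(all_peptides)
print(fully_tryptic)
```

Under this reading, the two engines together give 4609 unique peptides, roughly 39% more than Mascot alone, which is the complementarity discussed in this section.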
We observed significant complementarity between the identifications we obtained with the two search engines (Figure 3A). It has been described before, in line with our observations here, that using multiple search engines can significantly increase the total number of identifications from a single data set.33 The difference in identifications most likely lies in the different scoring algorithms, which place emphasis on different types of fragment ions. In addition, the Mascot score does not take into account the mass accuracy of the fragment ions, whereas the scoring algorithm of Spectrum Mill might reward the high accuracy obtained with the Q-TOF mass spectrometer.

Phosphoproteome of White Blood Cells. The total number of proteins we identified in the leukocytes (2767) is very comparable to what we have obtained in the past in similar analyses performed on Orbitrap mass spectrometers.24 As this is, as far as we know, the first global phosphoproteome analysis of primary human leukocytes, we lack a reference with which to directly compare our phosphoproteome data. However, our parallel analysis of the U2OS cells indicates that the number of detectable phosphorylation events in leukocytes is approximately 6 times lower than in the U2OS cells. As these two cellular samples were

(31) Mann, M.; Kelleher, N. L. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 18132–18138.
(32) Han, X.; Aslanian, A.; Yates, J. R., III. Curr. Opin. Chem. Biol. 2008, 12, 483–490.
(33) Kapp, E. A.; Schutz, F.; Connolly, L. M.; Chakel, J. A.; Meza, J. E.; Miller, C. A.; Fenyo, D.; Eng, J. K.; Adkins, J. N.; Omenn, G. S.; Simpson, R. J. Proteomics 2005, 5, 3475–3490.


run under identical conditions and starting with the same amount of material, the limited number of phosphoproteins detected in the leukocytes is not due to the methodology used but is related to the sample. The endogenous elastase activity (cleavage after I, A, or V) observed in the sample might affect the number of identified peptides and proteins as well, as a very significant number of the identified (phospho)peptides were not generated by trypsin alone but by a combination of trypsin and neutrophil elastase as well as other endogenous enzymes (Figure 3A). We observed limited overlap between the phosphoproteomes of the leukocytes from the three healthy donors. We have previously observed a very comparable proteome between the leukocytes of different persons, as determined by quantitative mass spectrometric methods.24 Many factors can contribute to this variation at the phosphopeptide level, including sample storage and the low abundance of many phosphopeptides, which means the mass spectrometer will not always select the same precursors for fragmentation, causing significant undersampling, something that is well-known to occur in these types of analyses. The level of overlap might indicate significant variation at the biological level, something that will be much more prominent when comparing true biological replicates (different individuals) rather than replicates of cultured cells, as has been done in most large-scale phosphoproteomic analyses. Technical replicates we performed on some of the samples indicated that the overlap between runs using the phosphochip was similar to what has been observed in the past, showing that the variation is at least partly caused by differences at the sample level.
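To make the notion of overlap between donors or technical replicates concrete, the following Python sketch computes pairwise and all-run overlap from sets of identified phosphopeptides. It is only an illustration: the peptide identifiers and run names are hypothetical toy data, the function name `overlap_fractions` is ours, and overlap is defined here as the shared fraction of the union of the runs being compared, which is an assumption because the text does not spell out the definition used.

```python
from itertools import combinations


def overlap_fractions(runs):
    """Return pairwise and all-run overlap for a dict of name -> set of IDs.

    Overlap is the size of the intersection divided by the size of the
    union of the runs being compared (an assumed definition).
    """
    pairwise = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(runs.items(), 2):
        pairwise[(name_a, name_b)] = len(ids_a & ids_b) / len(ids_a | ids_b)

    all_sets = list(runs.values())
    shared_by_all = set.intersection(*all_sets)
    union_of_all = set.union(*all_sets)
    return pairwise, len(shared_by_all) / len(union_of_all)


# Hypothetical identifications from three replicate runs (toy data only).
runs = {
    "run1": {"pepA", "pepB", "pepC", "pepD"},
    "run2": {"pepA", "pepB", "pepE"},
    "run3": {"pepA", "pepC", "pepE", "pepF"},
}

pairwise, all_three = overlap_fractions(runs)
for pair, fraction in pairwise.items():
    print(pair, f"{fraction:.0%}")
print("all three:", f"{all_three:.0%}")
```

Because undersampling lowers these fractions even for technical replicates of the same sample, the values obtained between donors are best judged against the replicate values rather than against 100%.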
We deduced by GO annotation analysis that the leukocyte phosphoproteome was enriched in, not surprisingly, protein amino acid phosphorylation but also in proteins involved in leukocyte-related biological processes such as immune system development and leukocyte differentiation (Figure 4B). The sequence motifs identified for the human leukocyte phosphosites, and the kinases responsible for phosphorylation at those sites, were found not to be very specific and were similar to what has been described before for very different cells.20,21 The presence of proline residues or acidic residues close to the phosphorylation site is a well-known pattern that is recognized by a variety of kinases, such as CDK2

(34) Brill, L. M.; Salomon, A. R.; Ficarro, S. B.; Mukherji, M.; Stettler-Gill, M.; Peters, E. C. Anal. Chem. 2004, 76, 2763–2772.
(35) Shu, H.; Chen, S.; Bi, Q.; Mumby, M.; Brekken, D. L. Mol. Cell. Proteomics 2004, 3, 279–286.
(36) Ryu, S. I.; Kim, W. K.; Cho, H. J.; Lee, P. Y.; Jung, H.; Yoon, T. S.; Moon, J. H.; Kang, S.; Poo, H.; Bae, K. H.; Lee, S. C. J. Biochem. Mol. Biol. 2007, 40, 765–772.
(37) Carrascal, M.; Ovelleiro, D.; Casas, V.; Gay, M.; Abian, J. J. Proteome Res. 2008, 7, 5167–5176.


and CK2. Although previous studies have examined the phosphorylation of proteins in cell lines derived from specific leukocyte subtypes, including the Jurkat T-cell line,34 the WEHI-231 B-cell line,35 and the AML14.3D10 eosinophil granulocyte cell line,36 hardly any information is currently available on leukocytes derived directly from human blood. One previous study addressed the phosphoproteome of primary human T-cells,34,37 but the majority of all human leukocytes are neutrophil granulocytes. This is also why only limited overlap exists between the previously identified T-cell phosphosites and our data set. Only 110 phosphoproteins and only 50 phosphopeptides were identified in both studies, emphasizing the large difference in sample composition (T-cells vs total leukocytes). All in all, this study provides the first broad, proteome-wide analysis of phosphorylation in primary human leukocytes.

CONCLUSIONS

We have shown here that the TiO2-based HPLC chip for phosphopeptide enrichment, coupled to a high mass accuracy Q-TOF mass spectrometer, allowed us to perform automated enrichment and reliable identification of phosphopeptides from the highly complex digest of primary human leukocytes, to levels similar to those of other enrichment methods. The recently commercialized phosphochip provides an easy-to-use method for the enrichment of phosphopeptides, without requiring expert knowledge of nanoLC or enrichment methods. With this method, we were able to identify significant numbers of phosphopeptides from primary human leukocytes, providing a first insight into the human leukocyte phosphoproteome.

ACKNOWLEDGMENT

This work was supported by The Netherlands Proteomics Centre and by a grant from the Agilent Technologies Foundation. We thank Dr. Vincent Halim for supplying the U2OS cells and the Mini Donor Service of the Department of Clinical Chemistry and Hematology of the University Medical Centre Utrecht in The Netherlands for help with obtaining the donor blood samples.

SUPPORTING INFORMATION AVAILABLE

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review August 5, 2009. Accepted December 16, 2009. AC901764G