Article pubs.acs.org/jpr
Comprehensive Phosphoproteome Analysis of INS-1 Pancreatic Beta-Cells using Various Digestion Strategies Coupled with Liquid Chromatography−Tandem Mass Spectrometry
Dohyun Han,†,‡ Sungyoon Moon,† Yikwon Kim,† Won-Kyung Ho,§ Kyunggon Kim,†,‡ Yup Kang,∥ Heesook Jun,⊥ and Youngsoo Kim*,†,‡
†Departments of Biomedical Engineering and §Physiology and ‡Institute of Medical & Biological Engineering, Medical Research Center, College of Medicine, Yongon-Dong, Seoul 110-799 Korea
∥Institute for Medical Sciences, Ajou University School of Medicine, Suwon, Kyunggi-do, 442-749 Korea
⊥Lee Gil Ya Cancer and Diabetes Institute, Gachon University of Medicine and Science, Incheon 406-840, Korea
Supporting Information
ABSTRACT: Type 2 diabetes results from aberrant regulation of the phosphorylation cascade in beta-cells. Phosphorylation in pancreatic beta-cells has not been examined extensively, except with regard to subcellular phosphoproteomes using mitochondria. Thus, robust, comprehensive analytical strategies are needed to characterize the many phosphorylated proteins that exist, because of their low abundance, the low stoichiometry of phosphorylation, and the dynamic regulation of phosphoproteins. In this study, we attempted to generate data on a large-scale phosphoproteome from the INS-1 rat pancreatic beta-cell line using linear ion trap MS/MS. To profile the phosphoproteome in-depth, we used comprehensive phosphoproteomic strategies, including detergent-based protein extraction (SDS and SDC), differential sample preparation (in-gel, in-solution digestion, and FASP), TiO2 enrichment, and MS replicate analyses (MS2-only and multiple-stage activation). All spectra were processed and validated by stringent multiple filtering using target and decoy databases. We identified 2467 distinct phosphorylation sites on 1419 phosphoproteins using 4 mg of INS-1 cell lysate in 24 LC−MS/MS runs, of which 683 (27.7%) were considered novel phosphorylation sites that have not been characterized in human, mouse, or rat homologues. Our informatics data constitute a rich bioinformatics resource for investigating the function of reversible phosphorylation in pancreatic beta-cells. In particular, novel phosphorylation sites on proteins that mediate the pathology of type 2 diabetes, such as Pdx-1, Nkx.2, and Srebf1, will be valuable targets in ongoing phosphoproteomics studies.
KEYWORDS: phosphoproteomics, phosphoproteome, INS-1 cell, beta-cell, diabetes
Received: October 4, 2011
Published: January 26, 2012
© 2012 American Chemical Society
dx.doi.org/10.1021/pr200990b | J. Proteome Res. 2012, 11, 2206−2223
■
INTRODUCTION
Phosphorylation is a crucial post-translational modification that regulates the function of proteins, which mediate virtually all cellular functions, including the cell cycle, apoptosis, metabolism, signal transduction, proliferation, and development.1 Aberrant regulation of phosphorylation is associated with various disease states. Thus, hundreds of protein kinases and phosphatases, as well as thousands of potential substrates, are considered drug targets for many diseases, such as cancer and diabetes.2
Type 2 diabetes (T2D) is the most common type of diabetes mellitus. Millions of people were estimated to have T2D at the beginning of the 21st century.3 T2D is characterized by insulin resistance, insufficient compensation of insulin secretion in pancreatic beta-cells, and pancreatic beta-cell dysfunction and death. Thus, pancreatic beta-cell function must be studied to understand the mechanisms of insulin activity and the pathogenesis of T2D.
Recently, several proteomics-based studies have been performed to elucidate the molecular mechanisms of glucose homeostasis and the pathogenesis of types 1 and 2 diabetes mellitus in pancreatic beta-cells,4−8 most of which have focused on proteome profiling and changes in protein expression in cell lines and animal models.4−8 In fact, the pathogenesis of T2D results from aberrant regulation of the phosphorylation cascade in beta-cells, which governs the activity of proteins in signal transduction processes that control beta-cell function, such as insulin signaling, MAPK signaling, and insulin secretion.9,10 Thus, phosphorylation events in pancreatic beta-cells must be examined to understand pancreatic beta-cell function. Yet, global phosphorylation events in pancreatic beta-cells have not been studied in detail, except with regard to subcellular phosphoproteomes using mitochondria in the INS-1 cell line.8
Phosphoproteomics is the large-scale analysis of protein phosphorylation using mass spectrometry (MS)-based strategies to examine phosphorylation-based signaling networks.11 In recent years, its success has been driven by technical advances in MS instruments that have high resolution and high mass accuracy (e.g., FT-ICR and Orbitrap) and in fragmentation methods (e.g., ETD). In parallel, for sensitive and highly selective phosphorylation analysis, several proteomics techniques have been developed for specific enrichment and fractionation of phosphorylated proteins and peptides, such as antibody-based methods,12 strong anion/cation exchange chromatography (SAX/SCX),13,14 immobilized metal affinity chromatography (IMAC),15 titanium dioxide,16 and hydrophilic interaction chromatography (HILIC).17 However, these methods generate disparate results, and each has advantages and drawbacks. For example, extensive protein and peptide prefractionation procedures, such as SCX, SAX, HILIC, and ERLIC, were implemented13,14,17 to overcome the low relative abundance of many phosphoproteins, low phosphorylation stoichiometry, and dynamic regulation of phosphoproteins. However, prefractionation steps are accompanied by corresponding increases in the required starting material and measurement time. Further, problems with sample loss and reproducibility of the protein fractionation should be taken into account. Consequently, comprehensive, robust analytical strategies are needed to characterize a meaningful number of phosphorylated proteins, including those at very low abundance. Thus, profiling of phosphoproteomes on a large scale remains a challenge.
In this study, we attempted to generate data on a large-scale phosphoproteome from the INS-1 pancreatic beta-cell line using popular mass spectrometry techniques, such as linear ion trap MS/MS (LTQ velos) without peptide prefractionation, and examined its properties with regard to phosphorylation. To analyze the INS-1 phosphoproteome in detail, we used comprehensive phosphoproteomic strategies, including detergent-based protein extraction methods, differential sample preparation methods, TiO2 enrichment, and MS replicate analyses. Briefly, protein samples were prepared by sodium dodecyl sulfate (SDS)18 and sodium deoxycholate (SDC)19,20 extraction to improve coverage of the proteome. Prior to TiO2 enrichment, peptide samples were prepared by in-solution digestion, filter-aided sample preparation (FASP), or in-gel digestion (short in-gel digestion). Both digestion methods (in-gel and in-solution) were used to achieve efficient trypsin digestion.18,21 Further, biological and technical replicates were established to maximize coverage of the phosphoproteome; technical replicates were made by combining MS2-only and multiple-stage activation (MSA) analysis to improve confidence in phosphopeptide identification.22 All spectra were processed and validated by stringent multiple filtering using a target/decoy database search.
We identified 2467 distinct phosphorylation sites on 1419 phosphoproteins with 4 mg of INS-1 cell lysate in 24 LC−MS/MS runs. Although the data were acquired by linear ion trap MS/MS (LTQ velos), our study provides the most comprehensive data on the pancreatic beta-cell phosphoproteome. In addition, we characterized the INS-1 phosphoproteome using several bioinformatics tools to classify functional groups and activities.
■
MATERIALS AND METHODS
Reagents and Materials
HPLC-grade acetonitrile (ACN), HPLC-grade water, HPLC-grade methanol, hydrochloric acid (HCl), and sodium chloride (NaCl) were obtained from DUKSAN (Gyungkido, Korea). Acetic acid was purchased from TEDIA (Fairfield, OH). The protein assay kit (Bradford) was purchased from Bio-Rad (Hercules, CA), and Complete protease inhibitor cocktail tablets mini were from Roche (Mannheim, Germany). Dithiothreitol (DTT) and urea were purchased from AMRESCO (Solon, OH). PMSF, sodium dodecyl sulfate (SDS), and Tris were purchased from USB (Cleveland, OH). TiO2 was collected from a disassembled column (Titansphere TiO 5 μm, GL Science, Inc., Japan), and sequencing-grade modified trypsin was purchased from Promega Corporation (Madison, WI). All other reagents, including 2-mercaptoethanol, 2,5-dihydroxybenzoic acid (DHB), ammonium bicarbonate (NH4HCO3), ammonium hydroxide, Brilliant Blue R, EDTA, formic acid, glycolic acid, iodoacetamide, phosphatase inhibitor cocktail 2 + 3, phosphoric acid, sodium deoxycholate (SDC), sodium fluoride (NaF), sodium orthovanadate (Na3VO4), sodium pyrophosphate (Na2H2P2O7), and trifluoroacetic acid (TFA), were purchased from Sigma-Aldrich (St. Louis, MO).
Cell Culture
Rat insulinoma cells (the INS-1 cell line) were maintained in RPMI 1640, containing 11 mM glucose, supplemented with 10% heat-inactivated FBS, 10 mM HEPES, 50 μM 2-mercaptoethanol, 2 mM glutamine, 1 mM sodium pyruvate, 100 U/mL penicillin, and 100 mg/mL streptomycin, at 37 °C in a humidified atmosphere and 5% CO2.
Treatment with Phosphatase Inhibitors
Confluent INS-1 cells were washed with PBS, and 10 mL of media that contained 10 mM sodium pervanadate and phosphatase inhibitor cocktail 2 + 3 (diluted 100-fold) (Sigma-Aldrich, St. Louis, MO) was added. Stock 50 mM sodium pervanadate solution was made by diluting 100 mM activated sodium orthovanadate in 0.18% H2O2. Phosphatase inhibitor cocktail 2 (Sigma) contains several tyrosine phosphatase inhibitors (sodium vanadate, sodium molybdate, sodium tartrate, and imidazole), whereas phosphatase inhibitor cocktail 3 contains serine and threonine phosphatase inhibitors (cantharidin, p-bromolevamisole oxalate, and calyculin A). The cells were incubated at 37 °C for 30 min and detached with a cell scraper. After being washed twice with cold PBS, the cells were pelleted by centrifugation at 3000 rpm for 10 min.
Preparation of Protein Samples
Proteins were extracted with various extraction buffers. For in-gel digestion, cell lysates were prepared using 3 extraction buffers. The cells were lysed with 200 μL of SDS-based extraction buffer (1% SDS, 50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM Na3VO4, 10 mM Na2H2P2O7, 1 mM NaF, 1 mM EDTA, 0.1 mM PMSF, and 1× protease inhibitor cocktail), SDC-based extraction buffer (1% SDC, 50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM Na3VO4, 10 mM Na2H2P2O7, 1 mM NaF, 1 mM EDTA, 0.1 mM PMSF, and 1× protease inhibitor cocktail), or a combination extraction buffer (1% SDS, 1% SDC, 50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM Na3VO4, 10 mM Na2H2P2O7, 1 mM NaF, 1 mM EDTA, 0.1 mM PMSF, and 1× protease inhibitor cocktail). Samples were lysed in each extraction buffer for 15 min on ice, followed by sonication. Each cell lysate was centrifuged at 15,000 rpm for 30 min at 4 °C to remove cellular debris. The supernatant was collected, and total protein concentration was measured by Bradford assay (Bio-Rad, Hercules, CA).
For the in-solution digestion procedure (FASP), cells were lysed in 200 μL of strong SDS-based extraction buffer (4% SDS, 50 mM Tris-HCl pH 7.4, 150 mM NaCl, 100 mM DTT, 1 mM Na3VO4, 10 mM Na2H2P2O7, 1 mM NaF, 1 mM EDTA, 0.1 mM PMSF, and 1× protease inhibitor cocktail) for 5 min at 95 °C, followed by sonication. Each cell lysate was centrifuged at 15,000 rpm for 15 min at 4 °C to remove cellular debris. The supernatant was collected, and total protein concentration was measured by Bradford assay (Bio-Rad, Hercules, CA).
1D SDS-PAGE and In-gel Digestion
The protein samples were digested by short in-gel digestion21 with some modifications. Briefly, 500 μg of proteins that were extracted from each buffer was mixed with 5× SDS loading buffer and heated at 95 °C for 15 min prior to loading on the SDS-PAGE gel. Two hundred fifty micrograms of proteins was loaded into a single lane and separated by 12% SDS-PAGE (∼2 cm). The gels were stained with Coomassie blue R250 and destained in 50% methanol/10% acetic acid solution. Two gel slices (250 μg of proteins per slice) were excised, transferred to microcentrifuge tubes, washed with HPLC-grade water twice, and destained further in 200 μL 200 mM NH4HCO3, 50% ACN overnight at 4 °C. After the gel pieces were completely destained, dehydration and rehydration were performed sequentially using 100% ACN and 0.1 M NH4HCO3, respectively. After being dried in a vacuum centrifuge, the gel pieces were subjected to in-gel digestion, as described23 with minor modifications. Proteins were reduced with 200 μL 10 mM DTT at 65 °C for 1 h and alkylated with 55 mM iodoacetamide (IAA) at room temperature in the dark for 25 min. The gel pieces were rehydrated in 200 μL 50 mM NH4HCO3 containing 0.1% SDC and sequencing-grade modified trypsin (Promega, Madison, WI), at an enzyme-to-protein ratio of 1:100 w/w. We added SDC to the digestion buffer because SDC increases the solubility of proteins, enhances the activity of trypsin, and improves the accessibility of trypsin to the proteins that are denatured during the extraction process.19 After an overnight incubation at 37 °C, the peptides were extracted from the gel pieces sequentially using 150 μL 40% acetonitrile/50 mM NH4HCO3 and 150 μL 80% acetonitrile/0.1% TFA with sonication for 30 min at each stage. All supernatants were combined. To remove SDC, all supernatants were acidified with 0.1% TFA (final concentration) and centrifuged at 15,000 × g for 15 min. The resulting supernatants were dried in a vacuum centrifuge and stored at −80 °C until analysis.
In-solution Digestion (FASP)
INS-1 cell lysates in strong SDS extraction buffer were processed by FASP18 using 30k Microcon filtration devices (Millipore), with some modifications. Proteins (500 μg) were mixed with 0.2 mL 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UA solution), loaded into two filtration units (250 μg per filtration unit), and centrifuged at 14,000 × g for 20 min. The concentrates were diluted in the devices with 0.2 mL UA solution and centrifuged again. After centrifugation, the concentrates were mixed with 0.2 mL IAA solution (50 mM iodoacetamide in UA solution) and incubated in darkness at room temperature (RT) for 30 min without being mixed, followed by centrifugation for 15 min. Then, the concentrate was diluted with 0.2 mL 8 M urea in 0.1 M Tris/HCl, pH 8.5 (UB solution) and concentrated again. Washing the concentrate with UB solution was repeated 3 times. After the flow-through was discarded, 0.2 mL 50 mM ammonium bicarbonate (ABC) was added to the filter, which was centrifuged at 14,000 × g for 15 min. This step was repeated 3 times. One hundred microliters of 50 mM ABC with trypsin (enzyme:protein ratio 1:100) was added to the resultant concentrate. After an overnight incubation at 37 °C, the filtration units were transferred to new collection tubes, and peptides were collected by centrifugation for 20 min. Finally, the peptides that were retained by the MWCO membrane in the filtration units were eluted with 50 μL 0.5 M NaCl to enhance the yield of the digested protein. The resultant supernatants were acidified with 1% TFA, dried in a vacuum centrifuge, and stored at −80 °C until further analysis.
Figure 1. Flowchart of identification of phosphorylation sites in the INS-1 pancreatic beta-cell line. Experiments were performed using 3 comparison sets. INS-1 cell lysates, obtained using differential protein extraction methods (SDS, SDC, and a combination of SDS and SDC), were digested by in-solution (FASP) or in-gel trypsin digestion (short in-gel digestion and traditional in-gel digestion) and enriched for phosphopeptides. Phosphopeptides were analyzed by reverse-phase LC−MS/MS using MS2-only or multiple stage activation (MSA) methodologies on an LTQ velos ion trap mass spectrometer.
Table 1. Overview of INS-1 Phosphoproteome Coverage Obtained by Different Experimental Schemes
Columns: experimental scheme (sample preparation method) | subject (amount of protein) | no. of unique phosphopeptides (FDR < 1%) | no. of unique phosphoproteins (FDR < 1%) | no. of unique phosphosites (Ascore ≥ 19) | no. of unique phosphoproteins (Ascore ≥ 19)
Scheme 1, in-solution digestion (FASP) | FASP_TiO2 biological replicate 1 (500 μg) | 3133 | 1477 | 1444 | 890
Scheme 1, in-solution digestion (FASP) | FASP_TiO2 biological replicate 2 (500 μg) | 2068 | 1035 | 937 | 597
Scheme 1, in-solution digestion (FASP) | Combined | 3683 | 1626 | 1616 | 966
Scheme 2, in-gel digestion (short in-gel digestion) | SDC_TiO2 biological replicate 1 (500 μg) | 1487 | 864 | 667 | 494
Scheme 2, in-gel digestion (short in-gel digestion) | SDC_TiO2 biological replicate 2 (500 μg) | 1856 | 1028 | 728 | 539
Scheme 2, in-gel digestion (short in-gel digestion) | Combined | 2495 | 1255 | 1012 | 704
Scheme 2, in-gel digestion (short in-gel digestion) | SDS_TiO2 biological replicate 1 (500 μg) | 1351 | 817 | 540 | 423
Scheme 2, in-gel digestion (short in-gel digestion) | SDS_TiO2 biological replicate 2 (500 μg) | 1454 | 857 | 577 | 452
Scheme 2, in-gel digestion (short in-gel digestion) | Combined | 2140 | 1133 | 829 | 617
Scheme 3, comparison set (short and traditional in-gel digestion) | Combined(S)_TiO2 (500 μg) | 1708 | 987 | 687 | 529
Scheme 3, comparison set (short and traditional in-gel digestion) | Combined(T)_TiO2 (500 μg) | 1838 | 1031 | 675 | 508
Scheme 3, comparison set (short and traditional in-gel digestion) | Combined | 2714 | 1358 | 1030 | 742
Total | | 6277 | 2334 | 2467 | 1419
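The sample and run totals quoted in the text can be checked against the sample list in Table 1. The short sketch below is illustrative only: the variable names are ours, and it assumes that every sample was analyzed in three LC−MS/MS runs (two MS2-only and one MSA), as stated in the caption of Figure 2.

```python
# Samples listed in Table 1; each was prepared from 500 ug of INS-1 lysate.
samples = [
    "FASP_TiO2 biological replicate 1",
    "FASP_TiO2 biological replicate 2",
    "SDC_TiO2 biological replicate 1",
    "SDC_TiO2 biological replicate 2",
    "SDS_TiO2 biological replicate 1",
    "SDS_TiO2 biological replicate 2",
    "Combined(S)_TiO2",
    "Combined(T)_TiO2",
]

PROTEIN_PER_SAMPLE_UG = 500   # amount of protein per sample (Table 1)
RUNS_PER_SAMPLE = 2 + 1       # assumption: 2 MS2-only runs + 1 MSA run

# Total starting material in mg and total number of LC-MS/MS runs.
total_protein_mg = len(samples) * PROTEIN_PER_SAMPLE_UG / 1000
total_runs = len(samples) * RUNS_PER_SAMPLE

print(f"{total_protein_mg} mg of lysate, {total_runs} LC-MS/MS runs")
```

Under these assumptions the totals agree with the 4 mg of lysate and 24 LC−MS/MS runs reported for the full data set.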
Desalting
Prior to TiO2 enrichment, all dried peptide mixtures were dissolved in 0.1% TFA and desalted using homemade StageTips, as described.24 Self-packed C18 microcolumns were prepared by packing POROS 20 R2 reversed-phase material (Applied Biosystems, Foster City, CA) into 200-μL yellow pipet tips on top of C18 Empore disk membranes. The microcolumns were washed 3 times with 100 μL 100% acetonitrile (ACN) and equilibrated 3 times with 100 μL 0.1% TFA by applying air pressure from a syringe. After the samples were loaded, the microcolumns were washed 3 times with 100 μL 0.1% TFA, and peptides were eluted with 100 μL of a series of elution buffers, containing 0.1% TFA and 40, 60, and 80% ACN. All eluates were combined and dried in a vacuum centrifuge.
TiO2 Phosphopeptide Enrichment by Microcolumn
Phosphopeptides were enriched using microcolumns as described,16 with some modification. Microcolumns, packed with Titansphere TiO2 beads (1 mg beads/200-μL yellow pipet tip) on top of C8 Empore disk membranes, were washed with 100% acetonitrile and equilibrated with glycolic acid loading buffer (1 M glycolic acid, 80% acetonitrile, and 2% TFA) by applying air pressure from a syringe. The desalted peptides were reconstituted in 20 μL 0.1% TFA solution. The peptide sample was mixed with glycolic acid loading buffer at a sample-to-loading buffer ratio of 1:4 v/v, loaded onto the C8−TiO2 microcolumns, and washed successively with glycolic acid loading buffer and wash buffer (80% acetonitrile and 2% TFA). Finally, the phosphopeptides were eluted sequentially from the resin with 200 μL 0.5% ammonium hydroxide solution (pH 10.5) and 100 μL 30% acetonitrile. The eluates were pooled, acidified immediately with 1 μL 100% formic acid per 10 μL eluate, and desalted with C18-R3-StageTips. The resulting phosphopeptide samples were dried in a vacuum centrifuge and stored at −80 °C for LC−MS/MS analysis.
Comparison of Two Gel-based Phosphopeptide Enrichment Approaches
To compare gel-based phosphopeptide enrichment between short separation (short in-gel digestion; Combined(S)_TiO2) and conventional separation (traditional in-gel digestion; Combined(T)_TiO2) (Table 1 and Figure 1), 1 mg of INS-1 cell lysate that was extracted using a combination extraction buffer (SDC + SDS) was divided into 2 portions, one each for the short and traditional in-gel digestion. The experimental schemes are described in Supplementary Figure 1 (Supporting Information). For the short in-gel digestion approach, each step, including the gel separation, in-gel digestion, and TiO2 enrichment of 500 μg of INS-1 cell lysate, was performed as described above. For the traditional in-gel digestion approach, 500 μg of INS-1 cell lysate was separated conventionally by 12% SDS-PAGE, and proteins were stained with Coomassie brilliant blue R250. Forty-eight gel bands were excised from the entire lane, each of which was cut into small pieces and placed in a microcentrifuge tube. Destaining and in-gel digestion were performed as described above. A pooled peptide mixture of 4 gel slices was desalted using a homemade StageTip and enriched using a TiO2 microcolumn as described above. Twelve phosphopeptide enrichments were performed. Finally, the 12 eluates were combined and desalted with a C18-R3-StageTip. The resulting phosphopeptide sample was dried in a vacuum centrifuge and stored at −80 °C for LC−MS/MS analysis.
nanoLC−ESI−MS/MS Analysis
Phosphopeptide mixtures were analyzed by LC−MS/MS on an EasyLC (Proxeon, Odense, Denmark) that was interfaced to a high-throughput tandem mass spectrometer (LTQ velos, Thermo, Waltham, MA), equipped with a nanoelectrospray device and fitted with a 10-μm fused silica emitter tip (New Objective, Woburn, MA). The nanoliter flow LC was operated in the 2-column setup with a trap column (100 μm × 3 cm) and an analytical column (75 μm × 15 cm) that was packed in-house with C18 resin (Magic C18-AQ 200 Å, 5-μm particles). Solvent A was 0.1% formic acid and 2% acetonitrile in ddH2O, and solvent B was 98% acetonitrile with 0.1% formic acid. For each analysis, 10 μL of the sample, dissolved in 50 μL solvent A, was loaded onto the trap column at 5 μL/min. Peptides were separated with a gradient of 0 to 40% solvent B over 120 min, followed by a gradient of 40 to 60% for 15 min and 60 to 100% over 5 min at 300 nL/min. The spray voltage was 1.8 kV in the positive ion mode, and the temperature of the heated capillary was 200 °C.
Two mass spectrometry methodologies were considered in this study: standard MS2 (hereafter called “MS2-only”) and multistage activation (MSA). For MS2-only, TOP10, TOP8, and TOP5 were implemented. In each cycle, 1 full-scan MS survey spectrum (m/z 300−2000) was acquired in the profile mode. Fragmentation of the precursor and detection of the product ions occurred in the linear trap in a data-dependent manner for the top 10 (TOP10) and top 5 ions (TOP5) and, particularly for the FASP sample, the top 8 ions (TOP8). Only MS precursors that exceeded a threshold of 500 ion counts were allowed to trigger MS/MS fragmentation.
For MSA, data-dependent MS/MS scans were performed following each full-scan MS survey spectrum (m/z 300−2000) for the 8 most intense precursor ions, which were fragmented by collision-induced dissociation with multistage activation of ions that showed the neutral loss of phosphoric acid from the parent ion (neutral loss masses = 98, 49, 32.67, and 24.5 m/z for z = 1, 2, 3, and 4, respectively); the spectra were scanned using LTQ velos. All MS/MS spectra were acquired using the following parameters: normalized collision energy, 35%; ion selection threshold, 500 counts; activation Q, 0.25; and activation time, 10 ms. Dynamic exclusion was used with a repeat count of 1, 30-s repeat duration, an exclusion list size of 100, an exclusion duration of 45 s, and ±1.5 m/z exclusion mass width. Instruments were controlled through Tune 2.6.0 and Xcalibur 2.1.
LC−MS Data Analysis
All SEQUEST searches were performed on the SEQUEST Sorcerer 2 platform (Sage-N Research, Milpitas, CA). The raw data from LTQ velos were converted into mzXML files using ReAdW.exe (version 4.0, http://sourceforge.net/projects/sashimi/files/). MS/MS data were searched using a target-decoy database search strategy against a composite database that contained the International Protein Index (IPI) Rat database (v3.72, 42153 entries) and its reverse sequences, which were generated using Scaffold 3 (Proteome Software Inc., Portland, OR). A postsearch analysis was performed using the Trans-Proteomic Pipeline (TPP) with the PeptideProphet and ProteinProphet algorithms.25 The database search parameters were: full enzyme digest using trypsin (After KR/−) with up to 2 missed cleavages; a precursor ion mass tolerance of 2.0 Da (average mass); a fragment ion mass tolerance of 0.5 Da (monoisotopic mass); a static modification of 57.02 Da on Cys residues for carboxyamidomethylation; and a dynamic modification of 79.96 Da on Ser, Thr, and Tyr residues for phosphorylation and 15.99 Da on Met residues for oxidation. Four PTM sites were allowed per peptide. The advanced search options that were enabled were: an XCorr score cutoff of 1, isotope check using a mass shift of 1.003355 amu, keep the top 2000 preliminary results for final scoring, display up to 10 peptide results in the result file, display up to 10 full protein descriptions in the result file, and display up to 10 duplicate protein references in the result file.
Data Processing
The search results were processed as follows to identify phosphopeptides with high confidence.
1. The database search results were validated using PeptideProphet.25 To remove low-quality MS/MS spectra, the resulting MS/MS spectra were filtered with PeptideProphet using a probability of ≥0.05 and minimum peptide length >6. Also, only spectra that identified at least one phosphorylated amino acid were selected to remove nonphosphorylated peptides.
2. Data sets were constrained by a minimum Delta CN of 0.1 and a maximum RSp of 2.
3. The RT_Score value, calculated in PeptideProphet, was used to obtain a better correlation model between predicted and observed peptide retention times (Pearson correlation ≥ 0.9).
4. We adjusted the XCorr cutoff to achieve an FDR of the final data set that was less than 1.0%. XCorr was recalculated using the following formula: XCorr′ = ln(XCorr)/ln(L), where L is the peptide length.25 The XCorr threshold was applied according to charge state (+2 and +3).25,26 FDR was calculated as R/F, where R is the number of reversed hits (decoy database; false positive) and F is the number of forward hits (target database; true positive).27,28
5. The probability of identifying a correct phosphorylation site was determined for every site in each peptide using Ascore (http://ascore.med.harvard.edu/ascore.php),29 a probabilistic algorithm that predicts the likelihood of matching site-determining ions to specific phosphorylation sites. Sites with an Ascore value ≥19 were considered to be assigned with high confidence. To determine the number of unique phosphorylation sites, only sites with an Ascore value ≥19 were counted, but phosphorylation sites that were identified from different charge states (+2 and +3) and methionine oxidation variants were not distinguished.
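The score normalization and the target-decoy FDR estimate described under Data Processing can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function names and the (score, is_decoy) tuple layout are assumptions made for the example, and in practice the cutoff search would be run separately for each charge state.

```python
import math

def normalized_xcorr(xcorr: float, peptide_length: int) -> float:
    """Length-normalized score ln(XCorr) / ln(L), with L the peptide length."""
    return math.log(xcorr) / math.log(peptide_length)

def decoy_fdr(reversed_hits: int, forward_hits: int) -> float:
    """FDR estimate R/F: reversed (decoy) hits divided by forward (target) hits."""
    if forward_hits == 0:
        return 0.0
    return reversed_hits / forward_hits

def pick_cutoff(psms, max_fdr=0.01):
    """Return the lowest score cutoff at which the retained hits satisfy max_fdr.

    psms is an iterable of (normalized_score, is_decoy) tuples (hypothetical
    layout). Returns None if no cutoff meets the requested FDR.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    cutoff = None
    decoys = targets = 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Accept this score as the cutoff if everything at or above it passes.
        if targets and decoy_fdr(decoys, targets) <= max_fdr:
            cutoff = score
    return cutoff

# Toy example with made-up scores: one decoy hit ranks third.
toy_psms = [(0.9, False), (0.8, False), (0.7, True), (0.6, False)]
print(pick_cutoff(toy_psms, max_fdr=0.01))
```

Walking down the ranked list and keeping the last score at which R/F stays within the limit is one simple way to adjust the cutoff; other strategies (for example, q-value estimation) would serve the same purpose.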
Bioinformatics Analysis
The biological and functional analysis of our phosphoproteome was performed using only phosphorylation sites that were identified with high confidence (FDR < 1% and Ascore value ≥19). Extraction of kinase-specific motifs and prediction of putative kinase families were performed using Motif-X (http://motif-x.med.harvard.edu)30 and NetworKIN, version 2 beta (http://networkin.info/version_2_0/newPrediction.php).31 Gene ontology analysis was performed using Cytoscape32 and Plugin BiNGO 2.4,33 and pathway analysis was performed using the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways database (http://www.genome.jp/kegg) and the DAVID bioinformatics tool.34 The details of each bioinformatics tool are described in Supplementary Methods (Supporting Information).
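The Ascore cutoff that gates this analysis can be read as an approximate localization probability. The sketch below assumes the usual log-scaled reading, Ascore ≈ −10·log10(P), which is consistent with the pairing of P ≤ 0.05 and Ascore ≥ 13 in the caption of Figure 3; the function names and the example site list are hypothetical.

```python
import math

def ascore_to_p(ascore: float) -> float:
    """Approximate probability of an incorrect localization, assuming
    Ascore = -10 * log10(P)."""
    return 10 ** (-ascore / 10)

def p_to_ascore(p: float) -> float:
    """Inverse conversion under the same assumption."""
    return -10 * math.log10(p)

def high_confidence_sites(sites, min_ascore=19.0):
    """Keep (site_label, ascore) pairs at or above the cutoff."""
    return [(label, score) for label, score in sites if score >= min_ascore]

# Hypothetical sites, for illustration only.
example_sites = [("S12", 25.3), ("T40", 14.1), ("Y88", 19.0)]
print(high_confidence_sites(example_sites))
print(round(ascore_to_p(13), 3), round(ascore_to_p(19), 4))
```

Under this assumption, a cutoff of 13 corresponds to P of about 0.05 and a cutoff of 19 to P of about 0.013, which is why the latter is treated as the high-confidence threshold.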
■
RESULTS AND DISCUSSION
Phosphoproteome Profiling using Differential Protein Extraction and Peptide Preparation
To improve coverage of the pancreatic beta-cell phosphoproteome, we implemented a workflow that combined various protein extraction and digestion strategies. In all experiments, INS-1 cell lysates from the differential protein extraction with SDS or SDC were digested by in-solution or in-gel trypsin digestion, and phosphopeptides were enriched. The enriched phosphopeptide mixtures were desalted using a C18-R3-StageTip24 and analyzed by LTQ velos linear ion trap LC−MS/MS using MS2-only or MSA. Figure 1 and Table 1 show an overview of the 3 experimental schemes that we used to analyze the INS-1 phosphoproteome, the details of which are shown in Supplementary Figure 1 (Supporting Information). Briefly, in experimental scheme 1, 500 μg of INS-1 cell lysate that was solubilized using strong SDS extraction buffer was processed by FASP, yielding tryptic peptides. In experimental scheme 2, we combined short SDS-PAGE and in-gel digestion to maximize peptide recovery, obtain good separation, and avoid diffusion across the gel. Each 500-μg sample of protein lysates, extracted using SDC or SDS lysis buffer, was separated by short SDS-PAGE (run approximately 2 cm), followed by in-gel digestion (termed “short in-gel digestion”). In experimental scheme 3, we compared short in-gel digestion to traditional in-gel digestion to evaluate the advantage of the former. The detailed comparison of the 2 gel-based approaches is described in the Supplementary Text and Supplementary Figure 1 (Supporting Information). Further, to maximize coverage of the INS-1 phosphoproteome, we analyzed biological replicates in experimental schemes 1 and 2. Using all stringently filtered peptides from the 3 schemes, 6277 unique phosphopeptides from 2334 phosphoproteins were identified from 24 LC−MS/MS runs, with a false discovery rate of less than 1% at the peptide level (Table 1). All identified phosphopeptides are listed in Supplementary Table 1 (Supporting Information).
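Pooling identifications across schemes, and asking which identifications are shared or exclusive to one scheme, amounts to set operations on peptide identifiers. The following sketch uses made-up identifiers and a hypothetical function name; it illustrates the bookkeeping only and does not reproduce the reported counts.

```python
def overlap_summary(schemes):
    """Summarize overlap between schemes.

    schemes maps a scheme name to a set of identifiers (e.g., phosphopeptide
    sequences). Returns the pooled set, the set shared by all schemes, and a
    dict of identifiers found exclusively in each scheme.
    """
    pooled = set().union(*schemes.values())
    shared = set.intersection(*schemes.values())
    exclusive = {}
    for name, ids in schemes.items():
        # Everything seen in any other scheme.
        others = set().union(*(v for k, v in schemes.items() if k != name))
        exclusive[name] = ids - others
    return pooled, shared, exclusive

# Hypothetical identifiers, for illustration only.
toy = {
    "scheme 1": {"pepA", "pepB", "pepC"},
    "scheme 2": {"pepB", "pepC", "pepD"},
    "scheme 3": {"pepC", "pepE"},
}
pooled, shared, exclusive = overlap_summary(toy)
for name, ids in exclusive.items():
    share = 100 * len(ids) / len(pooled)
    print(f"{name}: {len(ids)} exclusive ({share:.1f}% of pooled)")
```

Expressing the exclusive counts as a fraction of the pooled set is the same kind of calculation that underlies a Venn-diagram comparison of experimental schemes.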
Experimental schemes 1, 2, and 3 resulted in the identification of 3683, 3316, and 2714 unique phosphopeptides from 1626, 1539, and 1358 phosphoproteins, respectively (Figure 2A and Table 1). Altogether, 834 phosphoproteins overlapped in the 3 sets, and unique proteins constituted 20.4% (477), 13.7% (318), and 7.9% (184), respectively, of the sets. The 3 methods were complementary: 30.4% (1911), 19.8% (1243), and 11.9% (744) of the unique phosphopeptides were detected exclusively in schemes 1, 2, and 3, respectively (Figure 2A, Supplementary Figure 1, Supporting Information). Notably, in the FASP_TiO2 approach of scheme 1, more phosphopeptides and phosphoproteins were identified with less starting material (1 mg) and a shorter mass spectrometry analysis time (6 LC−MS/MS runs) compared with the other gel-based TiO2 approaches (Figure 2B), suggesting that phosphopeptide enrichment methods that are based on in-solution digestion (FASP) are preferable for samples that are not prefractionated.
Figure 2. Area-proportional Venn diagram for identified nonredundant phosphopeptides and phosphoproteins with FDR < 1%. (A) Overlap of unique phosphopeptides and unique phosphoproteins identified from experimental schemes 1, 2, and 3 at FDR < 1% (Supporting Information). (B) Identification of phosphopeptides and phosphoproteins in 8 replicates of the 3 experimental schemes in Figure 1. Blue and red bars indicate the numbers of phosphopeptides and phosphoproteins, respectively, identified in 1 replicate run (combining 2 MS2-only and 1 MSA LC−MS/MS runs) in the 3 experimental schemes (Supporting Information).
Estimation of False Discovery Rate
Figure 3. Overall properties of INS-1 phosphoproteome. (A) Distribution of singly, doubly, triply, and quadruply phosphorylated peptides identified at FDR < 1%. (B) Ascore distribution of phosphorylation sites. About 60% of phosphorylation sites were localized with near or high certainty (P ≤ 0.05, Ascore value ≥13). (C) Frequency of phosphoserine (pSer), phosphothreonine (pThr), and phosphotyrosine (pTyr) residues with Ascore value ≥19. (D) Identification of distinct phosphorylation sites and phosphoproteins with Ascore value ≥19 in each replicate run. The bars indicate the number of phosphosites and phosphoproteins identified in each replicate run.
Identifying phosphopeptides correctly and localizing phosphorylation sites accurately are often challenging tasks, due to the significant neutral loss of phosphates and poor CID fragmentation of phosphopeptides. In particular, the low-accuracy, low-resolution phosphopeptide MS data from the LTQ Velos carry some ambiguity in the identification of each peptide for several reasons, such as the low mass accuracy of precursor ions, poor fragmentation in the MS/MS spectra, interference of neutral loss peaks, and ambiguous charge states of parent ions.35,36 Consequently, estimating the relative amount of false positives is critical in determining the quality and reliability of entire data sets. Recently, many phosphoproteome studies have implemented a false discovery rate (FDR) estimation strategy, combined with a target-decoy database search and multiple filters, for large-scale phosphopeptide identification pipelines.13,27,36,37 In this study, after MS/MS spectra were searched using the SEQUEST-Sorcerer algorithm (version 27, rev 12, Sage-N Research) against a concatenated target-decoy database that contained the rat IPI protein database (version 3.72, 42153 entries) and its reversed sequences, we filtered the search results sequentially using the following orthogonal filtering criteria to remove nonphosphorylated peptides and establish a phosphopeptide data set at FDR < 1.0% (Supplementary Figure 2A, Supporting Information). (1) To begin estimating the FDR of our MS data, all MS/MS spectra that were assigned peptide sequences were processed using PeptideProphet.25 Based on the distribution of scores over the entire data set, PeptideProphet calculates the probability that each peptide assignment is correct using database search scores, delta mass, the number of termini that are consistent with the type of enzymatic cleavage, the number of missed cleavage sites, differences between the observed and predicted retention times (RT_Score), and other factors.25 Accordingly, a minimum PeptideProphet probability of 0.05 and a peptide length >6 were used initially to remove low-probability peptides and many short false-positive peptide matches. Also, spectra that contained at least one phosphorylated amino acid were selected to remove nonphosphorylated peptides.38,39 (2) A minimum Delta CN (ΔCn) of 0.1 and a maximum RSp of 2 were set as secondary filters. Delta CN, the difference between the normalized cross-correlation parameters of the first- and second-ranked peptides, and RSp, the ranking of the preliminary raw score, are the most
predictive values that separate positive from negative peptides.40 Whereas peptides that are identified with high ΔCn scores are more likely to be true assignments,40 filtering results with a maximum RSp of 2 excludes possible “true-positive” identifications that are not in the top 2 ranks of preliminary scores.27 This is a clear trade-off between the coverage and quality of a data set. After filtering with these parameters, the FDR of the data set fell from 30.69% to 2.4% (Supplementary Figure 2A, Supporting Information). (3) Because LC retention times of peptides are valuable in their identification and characterization,41,42 we used RT_Score, calculated by PeptideProphet, as the third filter. RT_Score represents the difference between the actual scan number and that of the calculated retention time for the peptide. To evaluate the performance of filtering using RT_Score, an example data set (the TOP10 data set of FASP_TiO2_Biological replicate 1) was used (Supplementary Figure 2B, Supporting Information). The example data set gave a Pearson correlation coefficient (r) of 0.83 after the second filter was applied. Subsequently, peptides with a defined range of RT_Score (−12 < RT_Score < 12) were selected, resulting in a data set with a Pearson correlation coefficient (r) that exceeded 0.9. After the RT_Score was adjusted, the resulting peptides showed a robust correlation between the observed and predicted retention
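As an illustration only, the three sequential filters described in steps (1) to (3) can be sketched in Python as follows. The record fields, the function names, and the FDR formula (twice the number of decoy hits divided by all hits, a common convention for concatenated target-decoy searches) are assumptions made for this sketch and are not taken from the pipeline used in this study; only the threshold values come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class PSM:
    """One peptide-spectrum match. All field names are hypothetical."""
    sequence: str       # unmodified peptide sequence
    n_phospho: int      # number of phosphorylated residues assigned
    is_decoy: bool      # True if the match is to a reversed (decoy) entry
    probability: float  # PeptideProphet probability
    delta_cn: float     # SEQUEST Delta CN
    rsp: int            # SEQUEST rank of the preliminary score
    rt_score: float     # PeptideProphet RT_Score


def estimate_fdr(psms: Iterable[PSM]) -> float:
    """Assumed estimator for a concatenated search: 2 * decoys / all hits."""
    hits = list(psms)
    if not hits:
        return 0.0
    decoys = sum(1 for p in hits if p.is_decoy)
    return 2.0 * decoys / len(hits)


def passes_filter_1(p: PSM) -> bool:
    # Probability >= 0.05, length > 6, at least one phosphorylated residue.
    return p.probability >= 0.05 and len(p.sequence) > 6 and p.n_phospho >= 1


def passes_filter_2(p: PSM) -> bool:
    # Minimum Delta CN of 0.1 and maximum RSp of 2.
    return p.delta_cn >= 0.1 and p.rsp <= 2


def passes_filter_3(p: PSM) -> bool:
    # RT_Score strictly between -12 and 12.
    return -12 < p.rt_score < 12


FILTERS: List[Callable[[PSM], bool]] = [
    passes_filter_1,
    passes_filter_2,
    passes_filter_3,
]


def apply_filters(psms: Iterable[PSM]) -> Tuple[List[PSM], List[float]]:
    """Apply the filters in order.

    Returns the surviving matches and the estimated FDR before filtering
    and after each filter, so the effect of every step can be inspected.
    """
    kept = list(psms)
    fdr_trace = [estimate_fdr(kept)]
    for passes in FILTERS:
        kept = [p for p in kept if passes(p)]
        fdr_trace.append(estimate_fdr(kept))
    return kept, fdr_trace
```

Recording the estimated FDR after each step mirrors the way the text reports the fall in FDR after the second filter; the numerical values obtained depend entirely on the input data and on the estimator chosen.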
near certainty (P ≤ 0.01).13,29 An Ascore value was calculated for each phosphorylation site in the 6277 phosphopeptides in Supplementary Table 1 (Supporting Information). The distribution of scores for all phosphopeptides in the individual experiments is shown in Supplementary Table 3 (Supporting Information). Without removing redundancy, based on the Ascore distribution for 41619 phosphorylation sites, 46.51% of the data set (19358 of 41619 sites) achieved near certainty (>99%) with regard to localization, and 4312 sites (10.36%) had Ascore values between 13 and 19 (P < 0.05), also indicating high certainty. An additional 6.31% (2628 of 41619 sites) could be localized with ∼90% confidence (0.05 < P ≤ 0.1, 13 ≥ Ascore value >10). We detected ambiguous (Ascore value