ARTICLE pubs.acs.org/ac
Improvement of the Quantification Accuracy and Throughput for Phosphoproteome Analysis by a Pseudo Triplex Stable Isotope Dimethyl Labeling Approach
Chunxia Song,† Fangjun Wang,†,§ Mingliang Ye,† Kai Cheng,† Rui Chen,† Jun Zhu,† Yexiong Tan,‡ Hongyang Wang,‡ Daniel Figeys,*,§ and Hanfa Zou*,†
† CAS Key Laboratory of Separation Sciences for Analytical Chemistry, National Chromatographic R&A Center, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
‡ The International Cooperation Laboratory on Signal Transduction of Eastern Hepatobiliary Surgery Institute, Second Military Medical University, Shanghai, 200438, China
§ Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, Ontario, Canada K1H 8M5
Supporting Information
ABSTRACT: Accurately quantifying the changes of phosphorylation level on specific sites is crucial to understanding the role of protein phosphorylation in physiological and pathological processes. Here, a pseudo triplex stable isotope dimethyl labeling approach was developed to improve the accuracy and the throughput of comprehensive quantitative phosphoproteome analyses. In this strategy, two identical samples are labeled with light and heavy isotopes, respectively, while another comparative sample is labeled with an intermediate isotope. Two replicated quantification results were achieved in just one experiment, and the relative standard deviation (RSD) criterion was used to control the quantification accuracy. Compared with the conventional duplex labeling approach, the number of quantified phosphopeptides increased by nearly 50% and the experimental time was reduced by 50% at the same quantification accuracy. Combined with the automated online reversed phase-strong cation exchange-reversed phase (RP-SCX-RP) multidimensional separation system, a comparative phosphoproteome analysis of hepatocellular carcinoma (HCC) and normal human liver tissues was performed. Over 1800 phosphopeptides corresponding to ∼2000 phosphorylation sites were quantified reliably in a 42 h multidimensional analysis. The pro-directed motifs, which were mainly associated with the extracellular signal-regulated kinases (ERKs), were observed to be overrepresented in the regulated phosphorylation sites, and some quantification results of phosphorylation sites were validated by other studies. Therefore, this pseudo triplex labeling approach was demonstrated to be a promising alternative for comprehensive quantitative phosphoproteome analysis.
Received: May 23, 2011. Accepted: September 8, 2011. Published: September 8, 2011.
© 2011 American Chemical Society. dx.doi.org/10.1021/ac201299j | Anal. Chem. 2011, 83, 7755–7762

Quantification of specific phosphopeptides or phosphorylation sites is critical to better understand the role of reversible phosphorylation in cellular processes. Different approaches for phosphopeptide quantification have been proposed, including stable isotope labeling of amino acids in cell culture (SILAC),1,2 isobaric tag for relative and absolute quantitation (iTRAQ),3 and stable isotope dimethyl labeling.4 Although in vivo labeling in animal models has been reported for SILAC,5 its main use has been in cell culture experiments. In contrast, chemical labeling approaches such as iTRAQ can be readily used for tissues with commercial isotopic labeling reagent kits. Alternatively, stable isotope dimethyl labeling, which leads to fast and complete reductive amination of peptides,6 is a relatively simple and economical strategy for large scale quantitative analyses. To date, this approach has been used for comprehensive quantitative proteome analyses.7−9

Accurate quantitative results are indispensable for correctly understanding the changes of phosphorylation level under different physiological conditions. However, the objective evaluation of the accuracy of mass spectrometry (MS)-based quantification results is still in its infancy, although some studies have paid attention to this question.10−12 In quantitative proteomics, the changes of protein expression level are usually determined by averaging the ratios of the corresponding isotope labeled peptide pairs derived from the same protein, after excluding spurious peptide ratios to eliminate their influence on protein quantification. As each protein can generate many peptides after digestion, this filtering strategy is effective in controlling the protein quantification accuracy. However, it rests on the assumption that peptides from the same protein share the same ratio, and this assumption is not suitable for phosphoproteomics.11 The ratio for phosphopeptides obtained from the same protein will depend on the levels of the different protein isoforms and the activities of the different kinases/phosphatases associated with the different phosphorylation sites. Therefore, we cannot assume that phosphopeptides derived from the same protein would have similar ratios. A better strategy is required to obtain highly accurate quantification results for each individual phosphopeptide in quantitative phosphoproteomics.

Figure 1. Experimental scheme of the pseudo triplex dimethyl labeling approach for comprehensive quantitative phosphoproteome analysis of human liver tissues.

Here, we present a pseudo triplex stable isotope dimethyl labeling approach to improve the accuracy of large scale quantitative phosphoproteome analyses (Figure 1). In this strategy, two identical samples are labeled with light and heavy isotopes, respectively, while another comparative sample is labeled with an intermediate isotope. Therefore, two replicated quantifications for each phosphopeptide are obtained in just one MS analysis. Moreover, the identical samples with different isotopic labels can also act as a reference system to measure whether the quantification results of the comparative sample are reliable. After the two replicated quantifications from one experiment were filtered by the relative standard deviation (RSD) criterion, the accuracy and throughput of phosphoproteome quantification were improved significantly in half the time required by the conventional duplex labeling approach. Finally, this pseudo triplex labeling system was applied to quantify the difference in the phosphoproteomes of hepatocellular carcinoma (HCC) and normal human liver tissues. Over 1800 phosphopeptides corresponding to ∼1000 phosphoproteins were reliably quantified in only 42 h using an automated online multidimensional phosphorylation analysis system.
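The replicate-and-filter idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the ratio direction (intermediate over light, intermediate over heavy), the use of the sample standard deviation, and the `max_rsd` cutoff of 50% are all hypothetical choices made here, not values taken from the text.

```python
from statistics import mean, stdev


def replicate_ratios(light, intermediate, heavy):
    """Two replicate ratios of the comparative (intermediate) channel
    against the two identical reference channels (light and heavy).
    The ratio direction is an assumption made for this sketch."""
    return intermediate / light, intermediate / heavy


def rsd_percent(values):
    """Relative standard deviation (sample SD divided by mean), in percent."""
    return stdev(values) / mean(values) * 100.0


def passes_rsd(light, intermediate, heavy, max_rsd=50.0):
    """True if the two replicate ratios agree within max_rsd percent.
    max_rsd is a hypothetical cutoff chosen only for illustration."""
    return rsd_percent(replicate_ratios(light, intermediate, heavy)) <= max_rsd


if __name__ == "__main__":
    # Consistent triplet: both replicate ratios are 2.0.
    print(passes_rsd(100.0, 200.0, 100.0))
    # Inconsistent triplet: replicate ratios are 2.0 and 8.0.
    print(passes_rsd(100.0, 200.0, 25.0))
```

Because the light and heavy channels carry the same sample, a large disagreement between the two ratios points to a measurement problem rather than to biology, which is why a spread-based filter is usable on a single phosphopeptide.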
EXPERIMENTAL SECTION

Chemicals and Materials. Formic acid (FA) and sodium cyanoborohydride (NaBH3CN) were provided by Fluka (Buchs, Switzerland). Acetonitrile (ACN, HPLC grade) was purchased from Merck (Darmstadt, Germany). All the other chemicals and reagents were purchased from Sigma (St. Louis, MO). Sep-Pak C18 cartridges were provided by Waters (Milford, MA). Fused silica capillaries with 75 μm i.d. were obtained from Polymicro Technologies (Phoenix, AZ), and those with 200 μm i.d. were obtained from Yongnian Optical Fiber Factory (Hebei, China). All the water used in this experiment was prepared using a Milli-Q system (Millipore, Bedford, MA).

Sample Preparation and Digestion. Adult female C57 mice were purchased from Dalian Medical University (Dalian, China). The hepatocellular carcinoma (HCC) and normal human liver tissues were provided by Eastern Hepatobiliary Surgery Hospital (Shanghai, China), and the study was approved by the Institutional Review Board of this hospital. Informed consent was obtained from patients enrolled in this study. The normal human liver tissues were the nontumorous liver tissues ≥2 cm outside the hepatic hemangiomas removed by surgical operation. The liver tissues were checked by histopathological examination to exclude the presence of invading or microscopic metastatic tumor cells. The HCC tissues were surgically removed from patients with advanced-stage HCC. As described in our previous work,9,13 the liver tissues were lysed in a homogenization buffer consisting of 8 M urea, 1% Triton X-100 v/v, 65 mM DTT, 1 mM EDTA, 0.5 mM EGTA, 1 mM PMSF, 100 μL of protease inhibitor cocktail per 10 mL of homogenization buffer, phosphatase inhibitors (1 mM NaF, 1 mM Na3VO4, 1 mM C3H7Na2O6P, 10 mM Na4O7P2), and 40 mM Tris-HCl at pH 7.4. After the proteins were resuspended in the denaturing buffer containing 8 M urea and 100 mM TEAB (triethylammonium bicarbonate, pH 8.0), the protein concentration was determined by Bradford assay. The proteins were reduced by DTT at 37 °C for 2 h and alkylated by iodoacetamide in the dark at room temperature for 40 min. Then, the solutions were diluted to 1 M urea with 100 mM TEAB, and trypsin was added at a trypsin-to-protein weight ratio of 1/25, followed by incubation at 37 °C overnight. All of the resulting peptide solutions were stored at −80 °C.

Dimethyl Labeling Reaction. Tryptic digests (1 mg in 1 mL of 100 mM TEAB solution) were diluted with 2 mL of 50 mM sodium phosphate buffer at pH 6.8. For the light, intermediate, and heavy labeling, 500 μL of CH2O (4%, v/v), CD2O (4%, v/v), and 13CD2O (4%, v/v) were added into the sample solution, respectively, and then, 500 μL of freshly prepared NaBH3CN (0.6 M) or NaBD3CN (0.6 M) was added. The resultant mixture was incubated for 1 h at room temperature. Then, 20 μL of ammonia (25%) and 50 μL of FA were added to consume the excess labeling reagents and to acidify the solution for the subsequent solid phase extraction (SPE). After being mixed in a ratio of 1:1:1 on the basis of the total peptide amount, the labeled peptide mixture was desalted by the SPE column and redissolved in 0.1% TFA/80% ACN.

Enrichment of Phosphopeptides and Online Reversed Phase-Strong Cation Exchange-Reversed Phase (RP-SCX-RP) Multidimensional Separation. The phosphopeptides were enriched by Ti4+-IMAC microspheres following the protocol described by Yu et al.14 and then were resuspended in 50 μL of 0.1% FA. The automated sample injection and multidimensional separation using the RP-SCX-RP system were constructed as previously described,9 and the RP segment of the RP-SCX biphasic column was used as the sample loading column to reduce the sample loss.
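As a side note on the three labeling channels of the Dimethyl Labeling Reaction, the sketch below estimates the mass added to a peptide in each channel and the resulting spacing between neighboring peaks of a triplet. The per-site mass shifts are the commonly cited monoisotopic values for light, intermediate, and heavy dimethyl labels and are an assumption here, since the text does not state them. The model is also simplified: it labels the peptide N-terminus and every lysine and ignores special cases such as N-terminal proline.

```python
# Commonly cited monoisotopic mass shifts per labeled amine (Da).
# These values are assumptions for this sketch, not taken from the text.
MASS_SHIFT = {
    "light": 28.0313,         # CH2O + NaBH3CN
    "intermediate": 32.0564,  # CD2O + NaBH3CN
    "heavy": 36.0757,         # 13CD2O + NaBD3CN
}


def labeled_sites(peptide):
    """Number of primary amines labeled: the N-terminus plus each lysine."""
    return 1 + peptide.count("K")


def label_shift(peptide, channel):
    """Total mass (Da) added to the peptide in the given channel."""
    return labeled_sites(peptide) * MASS_SHIFT[channel]


def triplet_spacing_mz(peptide, charge):
    """Approximate m/z gap between adjacent channels of a triplet."""
    delta = MASS_SHIFT["intermediate"] - MASS_SHIFT["light"]
    return labeled_sites(peptide) * delta / charge


if __name__ == "__main__":
    # Hypothetical peptide sequence used only for illustration.
    for channel in MASS_SHIFT:
        print(channel, round(label_shift("AAK", channel), 4))
    print(round(triplet_spacing_mz("AAK", 2), 4))
```

The spacing of roughly 4 Da per labeled amine is what allows the light, intermediate, and heavy forms of one peptide to be resolved as separate isotope clusters in the same full scan.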
The resuspended phosphopeptides were loaded onto the biphasic column, and then, a 145 min RP gradient nanoflow LC-MS/MS run (0 mM) was applied first to transfer the peptides retained on the RP segment to the SCX monolith segment of the biphasic column. Then, a series of stepwise elutions with salt concentrations of 24, 40, 56, 72, 100, 160, and 500 mM NH4AC (pH 2.7) was used to elute peptides from the SCX monolithic column to the second dimensional C18 separation column. Each salt step lasted 10 min followed by a 15 min equilibration with 0.1% FA/H2O. Finally, a 145 min binary RP gradient elution was performed as follows. For the global comparative phosphorylation analysis of human liver, a longer salt gradient was performed to increase the phosphoproteome coverage. A series of salt concentrations of 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 100, 120, 160, and 500 mM NH4AC (pH 2.7) was applied to the multidimensional separation of phosphopeptides. The other procedures were the same as described above. For technical replicates, the procedures of dimethyl labeling and online 2D LC-MS/MS analysis were repeated for the same triplex samples (two tryptic digests of normal human liver samples and one tryptic digest of HCC sample). For biological replicates, different triplex normal and HCC human liver tissues were lysed and digested, respectively. Then, the other procedures were the same as described above.

Nanoflow RPLC Separation and Mass Spectrometry Analysis. The HPLC system consisted of a degasser and a quaternary Surveyor MS pump (Thermo, San Jose, CA). The separation capillary column with a 75 μm i.d. was in-house packed with C18 particles (3 μm, 120 Å) to 14 cm length. For the RPLC separation, 0.1% FA in water and in acetonitrile were used as mobile phases A and B, respectively, and the flow rate was adjusted to ∼200 nL/min after splitting. The gradient elution was performed with a gradient of 0–3% B in 2 min, 3–25% B in 90 min, 25–80% B in 8 min, 80% B in 10 min, and 80–100% B in 5 min, and 100% B lasted 30 min. The LTQ-Orbitrap mass spectrometer (Thermo, San Jose, CA) was operated in data-dependent MS/MS acquisition mode. The full mass scan performed in the Orbitrap analyzer was acquired from m/z 400 to 2000 (R = 60000 at m/z 400). The 10 most intense ions from the full scan were selected for fragmentation via collision induced dissociation (CID) in the LTQ. The dynamic exclusion function was set as follows: repeat count 2, repeat duration 30 s, and an exclusion duration of 60 s.

Protein Identification and Quantification. All MS/MS spectra were searched using the Mascot server (version 2.1) against a composite international protein index (IPI) database appended with its reversed complement for evaluation of the false discovery rate (FDR) (IPI mouse 3.26 and IPI human 3.52 were used for mouse liver and human liver, respectively). Carbamidomethylation on cysteine was set as a fixed modification, whereas oxidation on methionine and phosphorylation on serine, threonine, and tyrosine as well as light, intermediate, and heavy dimethylation on lysine and peptide amino termini were set as the variable modifications. Trypsin was set as the specific proteolytic enzyme with up to two missed cleavages allowed. The mass tolerances were set to 10 ppm for the precursor ion and 0.8 Da for the fragment ion. Phosphopeptides with Mascot score ≥25 (rank 1, bold red, p < 0.05) were accepted for protein quantification. Quantification of peptide pairs as well as triplets was performed by MSQuant with a dimethyl-adapted version.15 The peptide ratio was calculated by the median intensity of the isotope clusters over multiple full scans, and the average of all the quantified peptides was used for the protein quantification.
The assignment of phosphorylation sites of quantified phosphopeptides was performed by the PTM scoring algorithm implemented in MSQuant, and only the top scoring site localization was retained. Then, all of the MSQuant quantification results from a single multidimensional separation were combined and normalized against the log2 of the median ratio of all the quantified peptides by StatQuant (version 1.2.2).16
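The median normalization step mentioned above can be illustrated with a short sketch. It is not StatQuant's implementation, which is not shown in the text. It only demonstrates the general idea, assumed here, of converting ratios to log2 and subtracting the median so that the bulk of unchanged peptides centers on zero. The function name `normalize_log2` is hypothetical.

```python
import math
from statistics import median


def normalize_log2(ratios):
    """Convert ratios to log2 and subtract the median log2 ratio,
    so that the median peptide ends up at 0 (a ratio of 1)."""
    logs = [math.log2(r) for r in ratios]
    center = median(logs)
    return [value - center for value in logs]


if __name__ == "__main__":
    # Hypothetical ratios from one multidimensional run with a 2-fold
    # systematic bias, for example from unequal sample mixing.
    print(normalize_log2([1.0, 2.0, 4.0]))
```

Centering on the median rather than the mean keeps the correction robust to a minority of strongly regulated or badly quantified peptides.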
RESULTS AND DISCUSSION

Improvement of the Performance of Quantitative Phosphoproteome Analysis by Multiple Technical Replicates. In quantitative proteomic experiments, multiple technical and biological replicate analyses are often performed to improve the quantification coverage and accuracy.10,17 The individual protein quantification accuracy was shown to improve with an increase in the number of corresponding quantified peptides.10 In contrast, shotgun phosphoproteomic analysis, in general, relies on a single phosphopeptide being quantified per site of phosphorylation. Although technical replicates can also be performed in phosphoproteomic analyses and lead to an increase in the number of phosphopeptides quantified, they do not necessarily lead to a systematic remeasurement of individual phosphopeptides. This is further compounded when looking at technical replicates and biological replicates. Previous studies have defined “precision” as the measurement of the reproducibility of the quantitative results in technical replicates, whereas “accuracy” was defined as the closeness between the experimental ratio and the actual value.11,18 The false discovery rate (FDR) is a statistical measurement that is extensively used in proteomics to control the quality of protein identifications.19 Here, we introduce the notion of false quantification rate (FQR) to objectively evaluate the accuracy of the quantification results. A false quantification result is defined as an experimental quantification ratio that deviates by more than 2-fold from its actual value. The calculation of the FQR is shown in eq 1.

$$\mathrm{FQR} = \frac{2N_{fq}}{N} \times 100\% \qquad (1)$$
where Nfq is the number of falsely quantified peptides and N is the total number of quantified peptides. Similar to the FDR, a quantitative data set with FQR ≤ 1% is considered to be an acceptably accurate quantification in this study. To investigate the relationship between the precision and accuracy in quantitative phosphoproteomics, three replicated analyses by online multidimensional separation of a test mouse liver sample were performed. Three aliquots (1 mg) of the same sample (mouse liver tryptic digests) were labeled with light and intermediate isotope dimethyl groups, respectively, and were pooled together prior to Ti4+-IMAC enrichment. Then, the enriched phosphopeptides were loaded onto an online RP-SCX-RP system to conduct a 23 h multidimensional chromatography separation with 8 salt gradients. An average of 1600 phosphopeptides corresponding to over 800 phosphoproteins was quantified per multidimensional separation. The reliability of the identification results was demonstrated with an FDR of 0.2%. Although most of the quantitative results for the test samples were close to their theoretical values (ratio 1), some undesirable scattered ratios were still observed in the phosphopeptide distribution (the red dots in Figure S1A in the Supporting Information). Because two identical samples were labeled and quantified, Nfq in eq 1 was the number of peptides with ratio >2 or 1 or