Stable-Isotope Dimethyl Labeling for Quantitative Proteomics

Anal. Chem. 2003, 75, 6843-6852

Jue-Liang Hsu,† Sheng-Yu Huang,† Nan-Haw Chow,‡ and Shu-Hui Chen*,†

Department of Chemistry, National Cheng Kung University, and Department of Pathology, College of Medicine, National Cheng Kung University, No.1 Ta-Hsueh Road, Tainan, 701, Taiwan

In this paper, we report a novel stable-isotope labeling strategy for quantitative proteomics that uses a simple reagent, formaldehyde, to globally label the N-terminus and the ε-amino group of Lys through reductive amination. This labeling strategy produces peaks differing by 28 mass units for each derivatized site relative to its nonderivatized counterpart and by 4 mass units for each derivatized isotopic pair. The labeling reaction is fast (less than 5 min) and complete, without any detectable byproducts, based on the analysis of MALDI and LC/ESI-MS/MS spectra of both derivatized and nonderivatized peptide standards and tryptic peptides of hemoglobin molecules. The intensity of the a1 and yn-1 ions produced, which were not detectable from most of the nonderivatized fragments, was substantially enhanced upon labeling. We further tested the method by analyzing an isotopic pair of peptide standards and a pair of defined protein mixtures with known H/D ratios. Using LC/MS for quantification and LC/MS/MS for peptide sequencing, the results show a negligible isotopic effect, good mass resolution between the isotopic pair, and good correlation between the experimental and theoretical data (errors 0-4%). The relative standard deviation of H/D values calculated from peptides deduced from the same protein is less than 13%. The applicability of the method for quantitative protein profiling was also explored by analyzing changes in nuclear protein abundance in an immortalized E7 cell with and without arsenic treatment.

Quantitative analysis of the relative abundance of expressed proteins is an essential issue in comprehensive proteomics. Recently, mass spectrometry has become a very powerful tool in proteome analysis, especially in quantitative analysis of differential expression.
Some stable-isotope-based methods have been developed and applied extensively in this comparative analysis because they are well suited to mass spectrometric analysis.1-3 Using stable-isotope labeling, coupled with mass spectrometry-based methods, to quantify changes in protein abundance is a simple and practical way for profiling biological differential regulation.

* Corresponding author. E-mail: [email protected].
† Department of Chemistry.
‡ Department of Pathology.
(1) Goshe, M. B.; Smith, R. D. Curr. Opin. Biotechnol. 2003, 14, 101-109.
(2) Aebersold, R.; Tao, W. A. Curr. Opin. Biotechnol. 2003, 14, 110-118.
(3) Sechi, S.; Oda, Y. Curr. Opin. Chem. Biol. 2003, 7, 70-77.

10.1021/ac0348625 CCC: $25.00 Published on Web 11/08/2003

© 2003 American Chemical Society

Among these methods, isotope-coded affinity tagging (ICAT) is one of the most versatile quantification technologies for differential display of cysteine-containing proteins.4 The chemical tag comprises a cysteine-reacting group, a labeled linker, and an affinity group for separation. This method has several drawbacks, however. First, the bulky affinity group, biotin, increases the complexity in the interpretation of MS/MS spectra. Second, the eight deuterium atoms associated with this mass tag can lead to partial resolution of isotopic peptide pairs by HPLC, which complicates the MS analysis.1,5,6 Third, ICAT is limited to Cys-containing proteins that do not, as a set, cover the whole proteome. The first problem can be addressed by a solid-phase capture-and-release system bearing a photocleavable linker that reduces the tag size before MS analysis7 or by using an acid-labile linker.8 The second problem may be resolved by using a 13C labeling strategy instead of deuterium labeling.1,6

Another class of labeling reagents has been designed for the quantification of global protein expression, such as acylation of the N-terminus and ε-amino units of Lys residues by acetic anhydride (-H6 and -D6) or N-acetoxysuccinimide (-H3 and -D3).9-11 This approach has been applied successfully to the quantification of relatively simple protein mixtures and has been improved further by an enrichment protocol for cysteine- and histidine-containing peptides.12 Acylation of basic amino groups, however, changes the ionic states of peptides and may reduce the ionization efficiency of tryptic digests containing C-terminal lysine residues. Another labeling method is the so-called mass-coded abundance tagging, in which the ε-amino group of Lys residues is labeled by reagents such as O-methylisourea13,14 and the relative quantity of peptides is determined by comparison between the labeled and unlabeled

(4) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.
(5) Han, D. K.; Eng, J.; Zhou, H.; Aebersold, R. Nat. Biotechnol. 2001, 19, 946-951.
(6) Zhang, R.; Sioma, C. S.; Wang, S.; Regnier, F. E. Anal. Chem. 2001, 73, 5142-5149.
(7) Zhou, H.; Ranish, J. A.; Watters, J. D.; Aebersold, R. Nat. Biotechnol. 2002, 20, 512-515.
(8) Qui, Y.; Sousa, E. A.; Hewick, R. M.; Wang, J. H. Anal. Chem. 2002, 74, 4969-4979.
(9) Che, F. Y.; Fricker, L. D. Anal. Chem. 2002, 74, 3190-3198.
(10) Ji, J.; Chakraborty, A.; Geng, M.; Zhang, X.; Amini, A.; Bina, M.; Regnier, F. E. J. Chromatogr., B: Biomed. Sci. Appl. 2000, 745, 197-210.
(11) Geng, M.; Ji, J.; Regnier, F. E. J. Chromatogr., A 2000, 870, 295-313.
(12) Wang, S.; Zhang, X.; Regnier, F. E. J. Chromatogr., A 2002, 949, 153-162.
(13) Brancia, F. L.; Oliver, S. G.; Gaskell, S. Rapid Commun. Mass Spectrom. 2000, 14, 2070-2073.
(14) Beardsley, R. L.; Karty, J. A.; Reilly, J. P. Rapid Commun. Mass Spectrom. 2000, 14, 2147-2153.

Analytical Chemistry, Vol. 75, No. 24, December 15, 2003 6843

species. Although this procedure is simple and leads to higher ionization efficiency,15 several issues related to the difference in the physicochemical characteristics between the labeled and unlabeled peptides markedly reduce the accuracy of the quantification. 2-Methoxy-4,5-dihydro-1H-imidazole has also been used to label lysine residues at the C-terminus of peptides,16 and, as with O-methylisourea, the labeled digests exhibit a greater number of more intense features than their unlabeled counterparts, which increases the sequence coverage. Moreover, this labeling reagent enables differential quantification studies since a stable isotopic form, containing four deuterium atoms, can be produced readily. This method, however, is of no use for analyzing peptides that do not contain lysine residues; these peptides are commonly found in many arginine-terminated tryptic digests.

Herein, we explore an alternative labeling reagent, formaldehyde, to label the N-terminus and ε-amino group of Lys residues via reductive amination. This novel strategy is applicable for global quantification, and since the ionic state of the modified peptides is not changed, the physicochemical properties of ions can be conserved. In this study, a standard protein mixture was used to demonstrate the capability of the proposed method for quantitative proteomics, and then the method was applied to the analysis of E7 cell lysate with and without arsenic treatment. We believe that reductive amination labeling is more universal and practical than many other methods for protein quantification.

EXPERIMENTAL SECTION

Materials. Acetonitrile and formaldehyde (37% solution in H2O) were purchased from J. T. Baker (Phillipsburg, NJ). Trifluoroacetic acid (TFA) and sodium acetate were obtained from Riedel-de Haën (Seelze, Germany). D,L-Dithiothreitol (DTT), sodium cyanoborohydride, hydroxylamine, ovalbumin, myoglobin, hemoglobin, and bovine serum albumin (BSA) were provided by Sigma (St. Louis, MO). Iodoacetamide and cysteine were purchased from Fluka (Buchs, Switzerland). α-Cyano-4-hydroxycinnamic acid and formaldehyde-D2 (20% solution in D2O) were purchased from Aldrich (Milwaukee, WI). Immobilized TPCK-trypsin was obtained from Pierce (Rockford, IL). The peptide standard (AEEEIpYGVLFAKKKK) was purchased from AnaSpec (San Jose, CA). The water used in these experiments was obtained from an E-pure water purification system (Barnstead Thermolyne Co., Dubuque, IA). Formaldehyde is known to the state of California to cause cancer; special caution was taken, including the use of surgical gloves and a fume hood, when formaldehyde was handled.

Tryptic Digestion. The protein standards were prepared by dissolving hemoglobin, ovalbumin, BSA, and myoglobin separately in sodium bicarbonate buffer (100 mM, pH 8.1) containing 8 M urea at a final concentration of 10⁻⁴ M. Two standard mixtures consisting of the same three proteins, ovalbumin, BSA, and myoglobin, at different concentrations were prepared from the protein standards. The final concentrations of ovalbumin, BSA, and myoglobin were 2.5 × 10⁻⁵, 2.5 × 10⁻⁵, and 5.0 × 10⁻⁵ M, respectively, in one mixture and 6.25 × 10⁻⁵, 2.5 × 10⁻⁵, and 1.25 × 10⁻⁵ M, respectively, in the other. The tryptic digest of hemoglobin was prepared by loading 20 µL of the standard

(15) Beardsley, R. L.; Reilly, J. P. Anal. Chem. 2002, 74, 1884-1890.
(16) Peters, E. C.; Horn, D. M.; Tully, D. C.; Brock, A. Rapid Commun. Mass Spectrom. 2001, 15, 2387-2392.


solution into a cartridge (25 µL, 37 °C) packed with TPCK-treated trypsin beads (Pierce) at a flow rate of 1 µL/min. The resulting tryptic digest was eluted with sodium bicarbonate (100 mM, pH 8.1, 180 µL). The tryptic digest that eluted from the cartridge had a total volume of 200 µL, which corresponds to a concentration of 10 pmol/µL. For protein mixtures, the disulfide bonds were reduced with DTT (10 mM, 2 µL) at 37 °C for 1 h, and the resulting cysteine residues were alkylated with iodoacetamide (10 mM, 4 µL) at 4 °C in the dark for 2 h. The excess iodoacetamide was quenched with cysteine (10 mM, 4 µL). The treated protein mixtures (16 µL) were then loaded into the trypsin cartridge, and the digestion and elution were processed as previously described.

Reductive Amination. The peptide standard dissolved in sodium acetate buffer (100 mM, pH 5-6) was mixed with formaldehyde (4% in water, 1 µL), vortexed, and then mixed immediately with freshly prepared sodium cyanoborohydride (260 mM, 1 µL). The mixture was vortexed again and then allowed to react for 5 min. If necessary, ammonium hydroxide (4% in water, 1 µL) or hydroxylamine (1 M in water, 1 µL) was added to consume the excess aldehyde. Deuterium labeling was performed in a similar manner, but by using formaldehyde-D2 (4% in water, 1 µL). The hemoglobin digest (10 pmol/µL in 100 mM NaHCO3, 10 µL) was diluted with sodium acetate buffer (0.1 M, pH 5.0, 90 µL) and then labeled as described above. The tryptic digests of the two protein mixtures were separately labeled with formaldehyde-H2 and formaldehyde-D2, which resulted in protein abundance ratios (H4/D4) of 2.5, 1, and 0.25 for ovalbumin, BSA, and myoglobin, respectively. All the labeled tryptic digests were analyzed by MALDI- or µLC/ESI-MS.

Immortalized E7 Cells. The E7 cell line was established from human uroepithelial cells after being immortalized by human papilloma virus E7 protein (a kind gift from Dr. C. A. Reznikoff at University of Wisconsin). The immortalized E7 cell line was grown separately in the presence and absence of arsenic (As2O3). The E7 cells were first treated with As2O3 (0.05 ppm) at 70% confluence for two weeks. The concentration was then increased 2-fold every week up to a final concentration of 0.4 ppm As2O3, the maximal tolerable dose for E7 cells. The cells were lysed with 120 mM NaCl, 10 mM Tris-HCl, 1% NP40, 0.1% SDS, and 1% deoxycholate after treatment with As2O3 (0.4 ppm) for an additional two weeks. The cell lysate, containing 70 µg of total protein, was treated by dialysis through a membrane, digested, labeled with formaldehyde-H2 (arsenic untreated) or formaldehyde-D2 (arsenic treated), and then fractionated by a strong cation-exchange cartridge (SCX, HiTrapSP HP, Amersham Biosciences, Uppsala, Sweden). A total of six fractions, eluted with 100, 200, 300, 400, and 500 mM and 1 M NaCl, respectively, were collected. For this study, a single SCX fraction (100 mM) was selected, desalted by OligoR3 (Applied Biosystems, Foster City, CA), dried using a SpeedVac (VR Mini/Maxi, HETO lab equipment, Denmark), redissolved in 5% ACN containing 0.1% formic acid, and then analyzed by MALDI and LC/ESI-MS.

Mass Spectrometry. The MS data were obtained using a MALDI-TOF spectrometer equipped with a 337-nm N2 laser (MALDI, Micromass, Manchester, U.K.). The MALDI matrix was prepared by dissolving α-cyano-4-hydroxycinnamic acid (10 mg) in EtOH/MeCN (1:1, 1 mL) containing 0.1% TFA. A 0.5 M solution of HCl was mixed with both the matrix and the sample at a ratio

Figure 1. MALDI spectra of the (A) native, (B) H-labeled, and (C) D-labeled peptide standard AEEEIpYGVLFAKKKK.

of matrix/HCl/sample of 2:1:1 (v/v/v). The mixture was deposited onto the target and dried before detection. The dimethylated tryptic peptides were also analyzed by a Q-TOF microspectrometer (Micromass) equipped with a nanoflow HPLC system (LC Packings, Amsterdam, Netherlands). Briefly, a tryptic digest solution (1 µL) was injected onto a column (NAN75-15-03-C18-PM; 75 µm × 15 cm) packed with C18 beads (3 µm, 100-Å pore size, PepMap). Mobile-phase buffer A consisted of 0.1% formic acid in water; mobile-phase buffer B was 95% acetonitrile in 0.1% formic acid. The peptides were separated using a linear gradient of 0-70% solvent B over 40 min at a flow rate of 200 nL/min. For quantification, only MS data were acquired throughout the chromatographic procedure to ensure the complete mass peak representation of all the ionized peptides at any given time point. All the spectra containing both mass peaks of D4- and H4-labeled peptides were combined to produce a composite MS spectrum. Typically, 20-60 spectra were combined since an average peak duration for a peptide was ∼20-60 s and each individual spectrum was acquired within 1 s with an interscan time of 0.1 s. The ratios of the D4- and H4-labeled peptides in the composite MS spectra were calculated from their relative peak heights. The relative quantification of a protein from two different samples was determined by averaging the ratios of the D4- and H4-labeled peptides that were derived from the same protein. For sequencing, the MS/MS spectra were obtained by a survey scan and automated data-dependent MS analysis was carried out using the dynamic exclusion feature built into the MS acquisition software. Each MS scan was followed by four MS/MS scans of the first four most intense peptide mass peaks to obtain as many CID spectra as possible; the peptide sequences were identified using Mascot Search (www.matrixscience.com). 
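The quantification procedure described above (combine the scans that span a peptide's elution, read the peak heights of the H4- and D4-labeled ions, then average the peptide ratios that belong to one protein) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' software: the function names, the 0.05 m/z matching tolerance, the use of summation to combine scans, and the intensities in the example are all hypothetical; only the two m/z values are taken from the Figure 5 caption.

```python
from collections import defaultdict
from statistics import mean, stdev


def composite_spectrum(scans):
    """Combine scans into one spectrum by summing intensities per m/z.

    Each scan is a list of (mz, intensity) tuples. Summation is an
    assumption; the text only says that the spectra were "combined".
    """
    combined = defaultdict(float)
    for scan in scans:
        for mz, intensity in scan:
            combined[round(mz, 2)] += intensity
    return dict(combined)


def peak_height(spectrum, target_mz, tol=0.05):
    """Tallest intensity within +/- tol of target_mz (0.0 if none)."""
    return max(
        (i for mz, i in spectrum.items() if abs(mz - target_mz) <= tol),
        default=0.0,
    )


def pair_ratio(scans, light_mz, heavy_mz, tol=0.05):
    """D4/H4 peak-height ratio of one isotopic pair in the composite spectrum."""
    spectrum = composite_spectrum(scans)
    light = peak_height(spectrum, light_mz, tol)
    heavy = peak_height(spectrum, heavy_mz, tol)
    if light == 0.0:
        raise ValueError("light (H4) peak not found")
    return heavy / light


def protein_ratio(peptide_ratios):
    """Mean, sample SD, and relative SD (%) of the peptide ratios of one protein."""
    avg = mean(peptide_ratios)
    sd = stdev(peptide_ratios) if len(peptide_ratios) > 1 else 0.0
    return avg, sd, 100.0 * sd / avg


# Hypothetical scans across one chromatographic peak (intensities invented).
scans = [
    [(535.77, 100.0), (539.80, 90.0)],
    [(535.77, 300.0), (539.80, 310.0)],
    [(535.77, 100.0), (539.80, 100.0)],
]
ratio = pair_ratio(scans, light_mz=535.77, heavy_mz=539.80)
```

Working on the composite spectrum rather than on single scans means the ratio reflects the whole elution profile, so a slight retention shift between the two isotopic forms has less influence on the result.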
The search results that were within the list of significant hits were regarded as identified proteins, and all results were further verified by manual interpretation.

RESULTS AND DISCUSSION

General Reductive Amination of Tryptic Digest. Reductive amination is a well-known organic reaction that is used extensively

in the modification of proteins and peptides.17,18 As is shown in eq 1, formaldehyde reacts with the N-terminus, or an ε-amino group of a Lys residue, of a peptide to form a Schiff base that is reduced by sodium cyanoborohydride to form a secondary amine, which is more reactive than the primary amine. Subsequently, the more-reactive species reacts with another formaldehyde unit and is then reduced to form a dimethylamino group. This labeling strategy was first explored using a peptide standard as the model, and the labeled molecules were analyzed by MALDI. The peptide standard (AEEEIpYGVLFAKKKK) investigated here was specifically designed to have four lysine residues, including the N-terminal α- and ε-amino and C-terminal Lys ε-amino groups, to examine the reactivity of formaldehyde on molecules with multiple reaction sites. This model peptide is also a phosphopeptide, with one phosphate group on the tyrosine residue, which allowed us to examine whether any side reactions occur during the labeling. The spectra obtained from the H4- and D4-labeled, as well as unlabeled, peptide standard are shown in Figure 1. Panels A and B of Figure 1 indicate that the values of m/z of the H4-labeled ions are 140 mass units higher relative to those of the unlabeled ion, which is consistent with the prediction made by counting a 28-mass-unit difference for each labeled site and indicates that this labeling reaction is effective for multiple Lys-containing peptides. Moreover, Figure 1C shows that the D4-labeled ions are 160 and 20 mass units higher relative to the unlabeled and H4-labeled ions, respectively, which is also consistent with the prediction. It is also worth noting that the phosphate group on the tyrosine residue was not modified by dimethyl labeling, indicating that this labeling strategy would not change the state of posttranslational phosphorylation of the protein. Therefore, this labeling strategy can be coupled with many analytical techniques for the analysis of protein phosphorylation. Some labeling methods may cause changes in a protein’s state,19 which limits their utility for the analysis of protein phosphorylation.

Figure 2. MALDI spectra of the (A) native, (B) H-labeled, and (C) D-labeled hemoglobin digest.

Table 1. Theoretical and Experimental MALDI-MS Data for Hemoglobin Tryptic Digestᵃ

                               mass of unlabeled     experimental mass of
                               peptides              labeled peptidesᵇ
no.  peptide sequence          theor     exptl       light          heavy
1    FFESFGDLSTPDAVMGNPK       2058.9    2058.3      2114.1 (2)     2122.1 (2)
2    TYFPHFDLSHGSAQVK          1833.8    1833.6      1889.5 (2)     1897.6 (2)
3    LLGNVLVCVLAHHFGK          1719.9    -ᶜ          -              -
4    VLGAFSDGLAHLDNLK          1669.8    1669.8      1725.7 (2)     1734.8 (2)
5    VGAHAGEYGAEALER           1529.7    1529.3      1557.4 (1)     1561.4 (1)
6    GTFATLSELHCDK             1421.6    -           -              -
7    EFTPPVQAAYQK              1378.7    1378.8      1434.8 (2)     1442.8 (2)
8    VNVDEVGGEALGR             1314.6    1314.8      1342.8 (1)     1346.8 (1)
9    LLVVYPWTQR                1274.7    1274.8      1302.8 (1)     1306.9 (1)
10   FLASVSTVLTSK              1252.7    -           1308.8 (2)ᵈ    1316.9 (2)ᵈ
11   VVAGVANALAHK              1149.6    -           1205.8 (2)ᵈ    1213.8 (2)ᵈ
12   LHVDPENFR                 1126.5    1126.6      1154.6 (1)     1158.7 (1)
13   MFLSFPTTK                 1071.5    1071.6      1127.6 (2)     1135.7 (2)
14   VHLTPEEK                   952.5    -           1008.5 (2)     1016.6 (2)
15   SAVTALWGK                  932.5    932.5        988.5 (2)      996.6 (2)
16   VDPVNFK                    818.2    -            874.2 (2)      882.3 (2)

ᵃ The data are a summary of Figure 2, and all the ion masses listed are based on their protonated forms. ᵇ Value in parentheses is the number of labeled amino groups. ᶜ -, not found. ᵈ Observed from an expanded intensity scale of Figure 2.

(17) Lundblad, R. L.; Noyes, C. M. Chemical Reagents for Protein Modification; CRC Press: Boca Raton, FL, 1984; Vol. 1, Chapter 10.
(18) Hermanson, G. T. Bioconjugate Techniques; Academic Press: San Diego, CA, 1996.
(19) Herschlag, D.; Jencks, W. P. J. Am. Chem. Soc. 1986, 108, 7938-7946.

The spectra obtained from the H4- and D4-labeled, as well as unlabeled, hemoglobin digests are shown in Figure 2, and the assigned amino acid sequences, based on a database search, are tabulated in Table 1. Sixteen peptides were found in the mass range m/z 800-2500 in the MALDI spectrum, including 5 peptides belonging to the


α-chain and 11 peptides belonging to the β-chain, with no missed cleavages. Compared to the theoretical sequences of hemoglobin, the peptide sequence coverage found from these identified dimethyl-labeled peptides was up to 88%; only two tryptic peptides were not observed. Compared to the coverage found from the unlabeled tryptic peptides, the dimethyl labeling increased the peptide sequence coverage by 13%; two additional tryptic peptides were found from the labeled digest. We believe that these results support the concept of global labeling that we propose for this dimethyl labeling strategy. The values of m/z of the labeled ions were 28 mass units higher for each H4-labeled site and 32 mass units higher for each D4-labeled site on the ion fragment relative to those obtained for the unlabeled ions. We also found that the three active sites of the peptide ion at m/z 1797 (KVLGAFSDGLAHLDNLK) were all labeled by dimethyl groups, which is also consistent with the results obtained from the model peptide.

Figure 3. Selected CID spectrum of the (A) H-labeled and (B) native tryptic peptide EFTPPVQAAYQK derived from hemoglobin. Data were generated using a Q-TOF microspectrometer during nanoLC elution.

In both Figures 1 and 2, it is particularly noticeable that no unlabeled starting peptides or incompletely modified products, such as monomethylated peptides, are observed in the MALDI spectrum, which indicates that the yields of both the H4- and D4-labeling reactions are quantitative. The signal intensities of most of the modified small peptides are comparable, or slightly

enhanced, relative to those of the unlabeled peptides, but the intensities of most of the modified large peptides are comparable, or slightly reduced, relative to those of the unlabeled peptides. Generally speaking, the reaction is simple, fast (less than 5 min), and quantitative, and the signal quality of dimethyl-labeled ions is comparable to that of the unlabeled ions.

CID of the Dimethyl-Labeled Peptides. The dimethyl-labeled hemoglobin digest was further investigated by LC/ESI-MS/MS for its applicability for protein identification by sequence analysis. Figure 3 depicts the fragment ion spectra, obtained from the hemoglobin digest, of the labeled and unlabeled peptide EFTPPVQAAYQK, which has two labeled sites, one at the N-terminus and one at the C-terminus of the peptide. The CID fragments and their identified sequences were compared with those of the unlabeled counterparts, and the results are listed in Table 2. We note two interesting features in Figure 3 and Table 2. First, a mass shift of 28 units was detected between the labeled and unlabeled b ions as well as the y ions. This evidence suggests strongly that two methyl groups are attached, as expected, to both the N- and C-termini of the peptide. For some fragments with only one labeled site at the N-terminus of the peptide, the 28 mass unit shift was only observed for b ions but not for y ions (spectra shown in the Supporting Information), which is also consistent with expectations. Second, the signal intensity of both the a1 and y11 ions was increased substantially upon labeling. Without labeling, these ions were hardly detectable from the unlabeled digests of many proteins. This phenomenon is not well understood, but may be explained by considering that the resulting b1 ion tends to form a stable alkylated immonium ion a1 by losing a molecule of carbon monoxide. It is also noticeable that, for some fragments, the number of b ions detected was significantly increased upon labeling. Based on the analysis of all of the sequenced fragments and the automatic database search, two more tryptic peptides of hemoglobin were identified using ESI-MS/MS from the labeled tryptic digest than from the unlabeled tryptic digest. The data obtained demonstrate the applicability of dimethyl labeling for the analysis of peptide sequences using LC/tandem-MS techniques and suggest that this method is likely to have a greater degree of success because of the larger range of fragment coverage relative to methods using unlabeled counterparts.

Table 2. CID Spectra and the Deduced Sequence of a Dimethyl-Labeled and Unlabeled Peptide (Peptide Sequence: EFTPPVQAAYQK) from Hemoglobin Tryptic Digest

             unlabeled y ion        labeled y ion
seq   no.    calcd      found       calcd      found
E     12
F     11     1249.66                1277.69    1277.55
T     10     1102.59    1102.47     1130.62    1130.55
P      9     1001.54    1001.43     1029.57    1029.45
P      8      904.49     904.36      932.52     932.39
V      7      807.44     807.39      835.47     835.39
Q      6      708.37     708.29      736.40     736.32
A      5      580.31     580.26      608.34     608.28
A      4      509.27     509.21      537.30     537.26
Y      3      438.24     438.24      466.27     466.22
Q      2      275.17                 303.20
K      1      147.11                 175.14

Quantification of the Peptide Standard Using MALDI-MS. We investigated the usefulness of the dimethyl labeling method for quantitative analysis using MALDI-MS, based on the presence of stable isotopes, with the peptide standard as the model. Two identical solutions of the peptide standard that was investigated previously were labeled using formaldehyde-H2 and formaldehyde-D2, respectively, and then they were mixed in combinations within a linear range from 1:0.5 to 1:10 (H pool/D pool). For this standard peptide with four lysine residues, the values of m/z for the isotopic pair differ by 20 mass units. Figure 4 shows that the intensity ratio of this isotopic pair is consistent with the mixing ratio of the two labeling pools, yielding a linear dynamic range with a slope close to 1 and a value of R2 of up to 0.99, which indicates that this quantitation method is accurate. Moreover, the standard deviation obtained from three repeated measurements was much smaller for smaller concentration ratios (D/H ratio

[Figure residue: DLTDYLM*K (β-actin); GLGTD I.; H/D = 1/1; H/D = 1/1.4]

Table 3. Identification and Quantification of the Protein Mixture

ovalbumin
  peptide sequences identified: ADHPFLFCIK, AFKDEDTQAMPFR, HIATNAVLFFGR, DILNQITKPNDVYSFSLASR, YPILPEYLQCVK, LTEWTSSNVMEER, GGLEPINFQTAADQAR, ELINSWVESQTNGIIR
  charge state:      3     3     2     3     2     2     2     2     2
  no. of labeling:   2     2     1     2     2     2     1     1     1
  ratio, obsd:       2.20  2.24  2.63  2.84  2.54  2.01  2.70  2.47  2.84
  ratio, exptd:      2.50
  ratio, mean ± SD:  2.50 ± 0.29
  % error:           0

bovine serum albumin
  peptide sequences identified: SLHTLFGDELCK, QTALVELLK, KVPQVSTPTLVEVSR, LKPDPNTLCDEFK, LVNELTEFAK, HLVDEPQNLIK, TVMENFVAFVDK, EYEATLEECCAK, DAFLGSFLYEYSR, LFTFHADICTLPDTEK
  charge state:      3     2     2     3     3     2     2     2     2     2     3
  no. of labeling:   2     2     2     2     3     2     2     2     2     1     2
  ratio, obsd:       0.93  0.91  0.91  0.94  0.95  0.90  0.90  1.12  1.00  1.08  1.03
  ratio, exptd:      1.00
  ratio, mean ± SD:  0.97 ± 0.07
  % error:           3

myoglobin
  peptide sequences identified: HGTVVLTALGGILK, VEADIAGHGQEVLIR, GLSDGEWQQVLNVWGK, YLEFISDAIIHVLHSK
  charge state:      2     2     3     2     3
  no. of labeling:   2     1     1     2     2
  ratio, obsd:       0.21  0.25  0.27  0.19  0.28
  ratio, exptd:      0.25
  ratio, mean ± SD:  0.24 ± 0.03
  % error:           4
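The summary columns of Table 3 can be checked from the observed ratios with a few lines of Python. This is only an illustrative sketch: the helper name `summarize` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which SD convention was applied; the SD computed here may therefore differ slightly from the tabulated value.

```python
from statistics import mean, stdev

# Observed ratios per protein (copied from Table 3) and the expected ratio.
OBSERVED = {
    "ovalbumin": ([2.20, 2.24, 2.63, 2.84, 2.54, 2.01, 2.70, 2.47, 2.84], 2.50),
    "bovine serum albumin": (
        [0.93, 0.91, 0.91, 0.94, 0.95, 0.90, 0.90, 1.12, 1.00, 1.08, 1.03],
        1.00,
    ),
    "myoglobin": ([0.21, 0.25, 0.27, 0.19, 0.28], 0.25),
}


def summarize(ratios, expected):
    """Return (mean, sample SD, % error of the mean relative to expected)."""
    avg = mean(ratios)
    sd = stdev(ratios)  # assumption: sample (n - 1) standard deviation
    pct_error = abs(avg - expected) / expected * 100.0
    return avg, sd, pct_error


summary = {name: summarize(r, e) for name, (r, e) in OBSERVED.items()}
for name, (avg, sd, err) in summary.items():
    print(f"{name}: mean {avg:.2f}, SD {sd:.2f}, error {err:.0f}%")
```

Rounded to the precision of the table, the means (2.50, 0.97, and 0.24) and the errors (0, 3, and 4%) computed this way agree with the tabulated values.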

Figure 4. (A) MALDI spectrum for a linearity assay of H- and D-labeled phosphopeptide standard AEEEIpYGVLFAKKKK under various ratios. (B) The linearity plot of the H- and D-labeled phosphopeptide standard AEEEIpYGVLFAKKKK. The error bars denote the minimum and maximum range of the data acquired from three individual experiments.

approaching 1), implying that this method is particularly sensitive and accurate for small variations in protein abundance. For large variations (D/H ratio >5), the method is less precise than for small variations, and more fragments derived from the same protein, or more repeated measurements, are required to ensure its accuracy. We believe that the major cause of the decreased precision associated with substantial variations in peptide/protein abundance may be the selective signal capturing for the stronger isotopic ion, which is particularly severe when ESI mode is used for MS analysis. Nevertheless, all the measured relative standard deviations are within 15%, which is normally treated as a reproducibility or precision threshold for bioassays.

Quantification and Sequencing of the Protein Mixture Using ESI-µLC Coupled with MS. The feasibility of the dimethyl labeling for quantitative analysis and sequencing was further investigated using ESI-based MS, with two prepared protein mixtures, in which the protein abundance ratios (D/H) between the two mixtures were 2.5, 1, and 0.25 for ovalbumin, BSA, and myoglobin, respectively. As described previously, the quantification was based on LC/MS and the sequencing was based on LC/MS/MS. Table 3 summarizes the result of the quantification and

Figure 5. (A) Total ion chromatogram of the protein mixture. The enlargement shows the averaged LC/MS spectrum of the isotopic peptide pair *QTALVELL*K derived from BSA. The data were taken over a 60-s period of peak width. (B) Selective ion chromatogram for the same isotopic peptide pair (*QTALVELL*K): m/z 535.77 for the H-labeled peptide and m/z 539.80 for the D-labeled peptide.

sequencing of the standard protein mixtures. Basically, this analysis shows good correlation of the experimental data with the theoretical ratios, with errors of no more than 4% and relative standard deviations of no more than 12.5%. Without prefractionation, the number of peptides that were sequenced and derived from the same protein ranges from 5 to 11, including doubly and triply charged ions. We believe that the peptide coverage will increase if the mixture is further fractionated. It is also noticeable that the relative standard deviation is larger for larger abundance differences (D/H ratios of 2.5 and 0.25) than for small differences (D/H ratio of 1), which is consistent with the observations from the peptide standard. On the basis of the analyzed results published previously,4,8 however, we believe that this range of dynamic linearity is sufficient for many proteomic applications in which only semiquantification is required.

Isotopic Effect. It has been reported that partial resolution of light and heavy tags during LC separation is likely to occur for

hydrogen-based isotopes, which may reduce the accuracy of quantification.1 We have investigated this effect and found that the inaccuracy in quantification caused by such an “isotope effect”1,20 is negligible for the samples that we have analyzed. As an example, Figure 5 indicates that the H4- and D4-labeled peptide ions almost coelute at a retention time of ∼35.3 min and that the averaged MS spectrum taken over a 60-s period of peak width shows that the relative signal intensities of the two members of the isotopic pair are about equal, which is consistent with expectations. The isotopic effect is also negligible for doubly labeled peptides. This may be due to the fact that the deuterium labeling is associated with charged amino residues, which might reduce the differential interactions with the RP-HPLC stationary phase that lead to chromatographic separation.6 This peptide (QTALVELLK) was derived from BSA and is doubly charged with two labeled sites.

(20) Zhang, R.; Regnier, F. E. J. Proteome Res. 2002, 1, 139-147.
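One simple way to look for an isotope effect of this kind is to compare the apex retention times of the extracted ion chromatograms of the H4- and D4-labeled ions. The sketch below assumes that each chromatogram is a list of (time in min, intensity) points; the function names and the data points are hypothetical and serve only to illustrate the idea.

```python
def apex_time(xic):
    """Retention time (min) of the most intense point of a chromatogram."""
    time, _ = max(xic, key=lambda point: point[1])
    return time


def retention_shift(light_xic, heavy_xic):
    """Apex time of the heavy (D4) trace minus that of the light (H4) trace."""
    return apex_time(heavy_xic) - apex_time(light_xic)


# Invented traces that coelute near 35.3 min.
light = [(35.0, 10.0), (35.2, 60.0), (35.3, 100.0), (35.4, 55.0), (35.6, 8.0)]
heavy = [(35.0, 12.0), (35.2, 62.0), (35.3, 98.0), (35.4, 50.0), (35.6, 7.0)]
shift = retention_shift(light, heavy)
```

A shift that is small compared with the chromatographic peak width suggests that ratios read from a spectrum averaged over the whole peak are not biased by partial separation of the pair.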

Figure 6. Total ion chromatogram of the protein mixture. The enlargement shows the averaged LC/MS spectrum of the triply charged isotopic pair with one labeled site (*VEADIAGHGQEVLIR). This peptide was derived from myoglobin.
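The mass arithmetic used throughout (28 mass units per H4-labeled site, 32 per D4-labeled site, and hence 4 per site between the members of an isotopic pair) can be written out explicitly, including the spacing on the m/z axis for a triply charged pair such as the one in Figure 6. This is a minimal sketch with nominal masses; the function names are hypothetical, and counting one site for the N-terminus plus one per Lys assumes a free N-terminal amine and complete labeling.

```python
H4_SHIFT = 28  # nominal mass added per site by formaldehyde-H2 labeling
D4_SHIFT = 32  # nominal mass added per site by formaldehyde-D2 labeling


def labeling_sites(sequence):
    """N-terminal amine plus one epsilon-amino group per Lys (K)."""
    return 1 + sequence.count("K")


def mass_shift(n_sites, heavy=False):
    """Nominal mass increase of a peptide with n_sites labeled amino groups."""
    return n_sites * (D4_SHIFT if heavy else H4_SHIFT)


def pair_spacing(n_sites, charge):
    """m/z difference between the D4- and H4-labeled ions of one peptide."""
    return n_sites * (D4_SHIFT - H4_SHIFT) / charge


standard_sites = labeling_sites("AEEEIpYGVLFAKKKK")   # N-terminus + 4 Lys
figure6_sites = labeling_sites("VEADIAGHGQEVLIR")     # N-terminus only
figure6_spacing = pair_spacing(figure6_sites, charge=3)
```

For the peptide standard this gives shifts of 140 and 160 mass units and a pair spacing of 20, as reported for Figure 1; for a singly labeled, triply charged ion the spacing is 4/3, or about 1.33 m/z units.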

One challenge associated with this labeling reagent is that the mass difference between the light- and heavy-labeled isotopic pair might be too small to be resolved by MS since each labeled pair differs by only 4 mass units. We have investigated this possibility based on an extreme case: the triply charged isotopic pair with one labeled site. As is shown in Figure 6, this isotopic pair has a difference in m/z of less than 1.4 units and an intensity ratio of ∼0.25, which makes the mass resolution even more difficult when compared with an isotopic pair of equal intensity. This isotopic pair was still well resolved by LC/MS, however, and its sequence was successfully determined to be VEADIAGHGQEVLIR, derived from myoglobin.

Application to Protein Expression Profiling. To explore the applicability of the method to quantitative protein profiling, we analyzed changes in nuclear protein abundance caused by arsenic treatment in immortalized E7 cells. The rationale for this investigation is the well-known association between ingested arsenic and the occurrence of bladder cancer in an arseniasis endemic area in southwestern Taiwan.21 Several potential targets of arsenic have been reported using immortalized cell models, such as Erk in the arsenite-induced cell transformation.22 Since there is no comprehensive information concerning arsenic-related carcinogenesis, the aim of this study is to determine the molecular signature for arsenic-related carcinogenesis. We first analyzed the alterations of nuclear proteins of immortalized E7 cells with and without arsenic treatment. A total of 23 proteins were identified from the selected SCX fraction (100 mM NaCl), and all of them have at least one identified isotopic peptide pair with an average

H/D ratio ranging from 0.8 to 2.7. The summarized results obtained from quantification analysis by LC/MS and sequencing analysis by LC/MS/MS are tabulated in the Supporting Information. In general, these detected H/D ratios, which imply the expression variation between two cell states, are relatively small but their deviation determined based on the identified peptides derived from the same protein is also small. Among the quantified proteins, β-actin, a standard normal control for human cells, has an average H/D ratio very close to 1 (1.03). Figure 7 depicts the chromatogram along with two MS spectra of β-actin and annexin I, respectively. It is particularly interesting to note that the protein with the greatest H/D ratio of 2.7 was identified to be tyrosine 3-monooxygenase/tryptophan 5-monooxygenase (14-3-3 ). In human cells, seven different 14-3-3 proteins regulate diverse cellular processes by binding to proteins with numerous functions.23 For example, 14-3-3 molecules act as scaffolding proteins to enhance the activity of proteins such as p53 or inactivating proteins such as BAD and Cdc25 by sequestration in the cytosol, partitioning these proteins into the cytoplasm to modulate nuclear import/export.24 Arsenic is a well-documented human carcinogen associated with cancers of the skin, lung, liver, and bladder. The finding that expression of tyrosine 3-monooxygenase/tryptophan 5-monooxygenase (14-3-3 ) was induced by chronic arsenic treatment appears to corroborate an earlier report showing that arsenic induces p53 expression.25 Therefore, these observed results deserve further investigations to find out their biological significance in arsenic-related carcinogenesis.
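The protein-level summary described above (an average H/D ratio per protein, with a deviation calculated from the peptides assigned to the same protein) can be sketched as follows. This is an illustrative sketch only: the function name `summarize_protein` is hypothetical, the use of the sample standard deviation is an assumption, and the peptide ratios are made-up placeholder values, not data from this study.

```python
from statistics import mean, stdev


def summarize_protein(peptide_ratios):
    """Return (mean H/D ratio, relative standard deviation in %).

    peptide_ratios: H/D intensity ratios of the isotopic peptide pairs
    assigned to one protein. The sample standard deviation is used here,
    which is an assumption; at least two peptide pairs are required.
    """
    if len(peptide_ratios) < 2:
        raise ValueError("need at least two peptide pairs to compute an RSD")
    average = mean(peptide_ratios)
    rsd_percent = stdev(peptide_ratios) / average * 100.0
    return average, rsd_percent


# Hypothetical peptide-level H/D ratios (placeholders, not measured values).
example = {
    "protein_A": [1.00, 1.05, 0.95],
    "protein_B": [2.6, 2.8, 2.7],
}

for name, ratios in example.items():
    average, rsd_percent = summarize_protein(ratios)
    print(f"{name}: mean H/D = {average:.2f}, RSD = {rsd_percent:.1f}%")
```

A protein with only one identified peptide pair has no defined deviation, so the sketch raises an error in that case instead of reporting zero.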

(21) Chiou, H. Y.; Hsueh, Y. M.; Liaw, K. F.; Horng, S. F.; Chiang, M. H.; Pu, Y. S.; Lin, J. S. N.; Huang, C. H.; Chen, C. J. Cancer Res. 1995, 55, 1296-1300.
(22) Huang, C.; Ma, W.-Y.; Li, J.; Goranson, A.; Dong, Z. J. Biol. Chem. 1999, 274, 14595-14601.
(23) Yaffe, M. B. FEBS Lett. 2002, 513, 53-57.
(24) Waterman, M. J.; Stavridi, E. S.; Waterman, J. L.; Halazonetis, T. D. Nat. Genet. 1998, 19, 175-178.
(25) Salazar, A. M.; Ostrosky-Wegman, P.; Menendez, D.; Miranda, E.; Garcia-Carranca, A.; Rojas, E. Mutat. Res. 1997, 381, 259-265.

Analytical Chemistry, Vol. 75, No. 24, December 15, 2003


Figure 7. Total ion chromatogram of the combined mixture of immortalized E7 cell lysates with and without arsenic treatment. The enlargements show the averaged LC/MS spectra of the peptide pairs derived from β-actin and annexin I, respectively.

CONCLUSION

The dimethyl labeling method has several advantages and one notable disadvantage when compared to other methods. First, the isotopic formaldehyde used as the dimethyl labeling reagent is inexpensive and commercially available, whereas some of the current labeling reagents are more expensive. Second, the derivatization procedure for dimethyl labeling is faster and simpler than that of some of the other methods. Third, the ionic state is not changed significantly by dimethyl modification, and so the ionization efficiency of the fragment is more likely to be conserved. Finally, dimethyl modification is a global labeling that labels not only lysine residues but also the N-terminus of the peptide, without significant isotopic effects. In comparison, guanidination labeling does not label peptides lacking lysine residues, and ICAT does not label peptides lacking cysteine residues. The global labeling characteristic, however, can also be a disadvantage, since it produces a large number of peaks, which requires a relatively pure sample or greater separation power to resolve. We are currently developing selective separation methods to be coupled with dimethyl labeling to increase the applicability of this method for quantitative proteomics.

ACKNOWLEDGMENT

This work was supported by the National Science Council in Taiwan under the National Program for Genomics Medicine (Grant NSC 91-3112-P-006-008-Y). We also thank Prof. F. E. Regnier at Purdue University for helpful discussions and Micromass in Taiwan for excellent technical support.

SUPPORTING INFORMATION AVAILABLE

Additional information as noted in the text. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review July 28, 2003. Accepted October 7, 2003. AC0348625