Quantitative Profiling of Proteins in Complex Mixtures Using Liquid Chromatography and Mass Spectrometry

Dirk Chelius* and Pavel V. Bondarenko

Thermo Finnigan, 355 River Oaks Parkway, San Jose, California 95134

Received March 18, 2002

The objective of this study was to determine whether liquid chromatography/mass spectrometry (LC/MS) data from tryptic digests of proteins can be used for quantitation. In theory, the peak area of a peptide should correlate with its concentration; hence, the peak areas of the peptides from one protein should correlate with the concentration of that protein. To evaluate this hypothesis, different amounts of a tryptic digest of myoglobin, ranging from 10 fmol to 100 pmol, were analyzed by LC/MS. The results show that the LC/MS peak areas correlate linearly with the concentration of the protein (r² = 0.991). The method was further evaluated by adding two different concentrations of horse myoglobin to human serum. The results confirm that the quantitation method can also be used for quantitative profiling of proteins in complex mixtures such as human sera. Expected and calculated protein ratios differ by no more than 16%. We describe a new method combining protein identification with accurate profiling of individual proteins. This approach should provide a widely applicable means to compare global protein expression in biological samples.

Keywords: mass spectrometry • protein quantification • protein identification

Introduction

For a number of years, two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) in combination with mass spectrometry (MS) or tandem mass spectrometry (MS/MS) identification of stained spots has been the standard method for quantitative analysis of protein mixtures.1-5 Quantitation of peptide and protein mixtures by mass spectrometry has been a challenging analytical problem, largely because of ionization suppression among coeluting species.6 It was realized early that stable isotope incorporation can be used to generate internal standards for mass spectrometry, because the chemical and physical properties of the labeled standards are similar to those of their natural isotopic counterparts, but their mass is different.7,8 These internal standards were produced, for example, by incorporation of oxygen-18 into the peptide molecules during chemical or enzymatic hydrolysis in 18O-water.8,9 A recent report described methylation of sample and control peptide mixtures with methanol containing either three deuterium atoms or no deuterium atoms.10 Whole-cell stable isotope labeling was used by Chait and co-workers, who grew one-half of the yeast cells on medium enriched in 15N (15N > 96%) and the other half on medium containing the natural abundance of the nitrogen isotopes (15N = 0.4%).11 Smith and co-workers used normal and rare isotope media to grow and study the stress response in Escherichia coli and also used 15N isotope labeling.12,13

Although widely used, the 2D-PAGE/MS technique has limitations when dealing with very large or very small proteins, proteins at the extremes of the pI scale, membrane proteins, and low-abundance proteins. These limitations of the 2D-PAGE/MS approaches have recently been challenged by a new chromatography-based method. The method includes digestion of complex protein mixtures without preliminary separation and data-dependent LC/MS/MS analysis of the resulting digest products, followed by protein identification through database searching.1,14-16 This method was complemented by quantitative profiling of the complex protein digests based on isotope-coded affinity tags (ICAT).17-19 The ICAT reagent has two forms, heavy (containing eight deuterium atoms) and light (containing no deuterium atoms), to tag the control and experimental samples.17 Application of the ICAT method to a standard protein mixture indicated that the accuracy and relative standard deviation of peak area ratios were better than 12%.17 Another report suggested that the error could be much larger because the isotopic forms of ICAT-treated peptides are chromatographically resolved, causing the peak elution time to vary by as much as 30 s.20 If one of the isotopic forms (but not the other) coelutes with another species, its ionization can be greatly suppressed, causing substantial quantitation error. Different isotope-coded affinity tags containing three or four deuterium atoms were proposed, which may have a lower degree of separation.20

Although the ICAT approach is sophisticated and promising, it is associated with several practical problems, such as time-consuming chemical derivatization steps and the high cost of the reagents. This report describes a simpler method for identification and quantitation of complex protein mixtures using LC/MS/MS analysis of protein digests.

* To whom correspondence should be addressed. Phone: (408) 965-6326. Fax: (408) 965-6138. E-mail: [email protected].

10.1021/pr025517j CCC: $22.00 © 2002 American Chemical Society

Journal of Proteome Research, Vol. 1, No. 4, 2002, 317-323


Published on Web 06/06/2002

research articles

Chelius and Bondarenko

Figure 1. Base peak ion chromatogram of a tryptic digest of myoglobin. Horse myoglobin (1 mg; 60 nmol) was digested with trypsin, and a 250 fmol sample was analyzed as described in the Experimental Section. Peptides are numbered on the basis of the sequence in Table 1.

Table 1. Predicted Peptides from Trypsin Digestion of Horse Myoglobin on the Basis of Numbering of the Wild-Type Protein^a

^a 250 fmol of myoglobin tryptic digest was analyzed as described in the Experimental Section. Identified peptides are shown in bold. Peptide numbering assumes that Lys-Pro (97/98) bonds are not cleaved by trypsin. Sequential peptide numbers 1-18 identify the peptides in the text. Residue numbers are given only for those peptides that contain more than one amino acid residue.
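The cleavage rule behind Table 1 (trypsin cuts after lysine or arginine, but not when the next residue is proline) can be sketched in a few lines of Python. This is an illustration only: the function name `tryptic_digest` is hypothetical, and the test sequence is a toy concatenation of four peptides reported in Table 2, not the actual order of residues in horse myoglobin.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        # A bond is cleavable only if a residue follows and it is not proline.
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Whatever remains after the last cleavage site is the C-terminal peptide.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Toy sequence (assumption): four Table 2 peptides joined in arbitrary order.
toy_sequence = "ALELFR" + "HGTVVLTALGGILK" + "GHHEAELKPLAQSHATK" + "ELGFQG"

for peptide in tryptic_digest(toy_sequence):
    print(peptide)
```

On the toy sequence, the Lys-Pro bond inside GHHEAELKPLAQSHATK is left intact, so the four input peptides are recovered unchanged.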

Experimental Section

Reduction, Alkylation, and Digestion. Lyophilized protein samples (1 mg of human serum and 1 mg of horse myoglobin, Sigma-Aldrich, St. Louis, MO) were reconstituted in 1 mL of ammonium bicarbonate buffer (100 mM, pH 8.5) and 3 µL of DTT (1 M, Sigma-Aldrich, St. Louis, MO). The mixture was incubated for 30 min at 37 °C. To alkylate the protein, 7 µL of iodoacetic acid (1 M in 1 M KOH, Sigma-Aldrich, St. Louis, MO) was added, and the mixture was incubated for an additional 30 min at room temperature in the dark. Thirteen microliters of DTT (1 M) was added to quench the iodoacetic acid. The reduced and alkylated proteins were digested by adding 20 µL of trypsin (0.5 mg/mL, Promega, Madison, WI). The mixture was incubated for 6 h at 37 °C, an additional 20 µL of trypsin (0.5 mg/mL) was added, and incubation was continued for 16 h at 37 °C.

LC/MS/MS. Aliquots (as indicated in the text) of the sample digests were placed in wells of a 96-well plate. The plate was sealed with plastic film to minimize evaporation and positioned in the Surveyor auto-sampler, where it was maintained at 4 °C while waiting for analysis. The Surveyor auto-sampler was equipped with no-waste injection capability, which enables injection volumes as low as 1 µL. The injected peptides were first loaded on a small reversed-phase poly(styrene-divinylbenzene) peptide trap (Michrom Bioresources) at a relatively high flow rate of 10 µL/min for 3 min. The peptides were then eluted from the trap and separated on a reversed-phase capillary column (PicoFrit; 5 µm BioBasic C18, 300 Å pore size; 75 µm × 10 cm; tip 15 µm, New Objective) with a 30-min linear gradient of 0-60% acetonitrile in 0.1% aqueous formic acid at a flow rate of 0.1 µL/min after split. The Surveyor HPLC system was directly coupled to a Thermo Finnigan LCQ Deca XP ion-trap mass spectrometer equipped with a nano-LC electrospray ionization source. The spray voltage was 2.0 kV, the capillary temperature was 150 °C, and ion-trap collision fragmentation spectra were obtained using a collision energy of 35 units. Each full mass spectrum was followed by three MS/MS spectra of the three most intense peaks. Dynamic Exclusion was enabled. After each sample, an injection of 10 µL of 0.1% aqueous formic acid was analyzed to ensure proper equilibration of the system.

Table 2. ESI-MS Analysis of Myoglobin Proteolytic Fragments from Tryptic Digestion of 250 fmol of Horse Myoglobin^a

no.  peptide            m/z (charge)             obsd mass     calcd mass
14   HPGDFGADAQGAMTK    752.6 (2+), 502.2 (3+)   1503.4 ± 0.2  1502.6
3    LFTGHPETLEK        637.3 (2+), 425.0 (3+)   1272.3 ± 0.3  1271.4
2    VEADIAGHGQEVLIR    804.6 (2+), 537.5 (3+)   1608.3 ± 1.3  1606.8
15   ALELFR             748.8 (1+)               747.8         747.9
9    HGTVVLTALGGILK     690.5 (2+), 460.9 (3+)   1379.3 ± 0.3  1378.6
18   ELGFQG             650.7 (1+)               649.7         649.7
10   GHHEAELKPLAQSHATK  464.9 (4+), 619.3 (3+)   1854.9 ± 0.6  1854.0

^a Peptides are numbered on the basis of the sequence in Table 1.

Figure 2. Changes of chromatographic peak area of peptide 15 (ALELFR) with different amounts of myoglobin injected. Horse myoglobin (1 mg; 60 nmol) was digested with trypsin, and 10, 100, 250, 500, 1000, 5000, 25 000, 50 000, and 100 000 fmol samples were analyzed as described in the Experimental Section. The peak area decrease at 100 000 fmol could have been caused by overloading the column, so that this peptide was displaced by other peptides with stronger retention.

Data Analysis. Peptides and proteins were identified automatically by the computer program Sequest, which correlates the experimental tandem mass spectra against theoretical tandem mass spectra from amino acid sequences obtained from the National Center for Biotechnology Information (NCBI) sequence database.21 Peptide identification was further evaluated using a unified score combining all three correlation coefficients generated by Sequest. The score was calculated according to the following formula: score = (10000 × DelCn² + Sp) × Xcorr.21 The equation was developed empirically in house and tested on several thousand analyses. For proteins, the scores of all peptides were summed, and the normalized score was calculated as the total score divided by the number of peptides. Only peptides with a score of more than 2000 were accepted. The Genesis algorithm in the Xcalibur software was used for peak detection and calculation of the peak area.
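As a rough illustration of the scoring rule above, the following Python sketch computes the unified score for individual peptide hits and the total and normalized score for a protein. The Xcorr, DelCn, and Sp values are invented for illustration and are not taken from this study; the names `PeptideHit`, `unified_score`, and `protein_scores` are hypothetical; and applying the 2000 cutoff before summing is an assumption, since the text does not state the order of the two steps.

```python
from dataclasses import dataclass

SCORE_CUTOFF = 2000.0  # peptides must score above this value to be accepted


@dataclass
class PeptideHit:
    label: str
    xcorr: float
    delcn: float
    sp: float


def unified_score(hit: PeptideHit) -> float:
    """score = (10000 * DelCn^2 + Sp) * Xcorr, as written in the text."""
    return (10000.0 * hit.delcn ** 2 + hit.sp) * hit.xcorr


def protein_scores(hits: list[PeptideHit]) -> tuple[float, float]:
    """Return (total score, normalized score) over the accepted peptides."""
    accepted = [unified_score(h) for h in hits if unified_score(h) > SCORE_CUTOFF]
    if not accepted:
        return 0.0, 0.0
    total = sum(accepted)
    # Normalized score: total score divided by the number of accepted peptides.
    return total, total / len(accepted)


# Invented example values (assumption), chosen only to exercise the formula.
hits = [
    PeptideHit("peptide A", xcorr=2.0, delcn=0.5, sp=500.0),
    PeptideHit("peptide B", xcorr=1.0, delcn=0.25, sp=375.0),  # below the cutoff
    PeptideHit("peptide C", xcorr=4.0, delcn=0.5, sp=500.0),
]

total, normalized = protein_scores(hits)
print(total, normalized)
```

Because DelCn enters as a square multiplied by 10000, a modest change in DelCn shifts the score far more than a comparable change in Sp, which is presumably why the cutoff is applied to the combined value rather than to each coefficient separately.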

Results

The predicted tryptic peptides of horse myoglobin are listed in Table 1, the numbering of which is based on the assumption that arginine-proline and lysine-proline bonds are not cleaved. The peptides were separated on a reversed-phase column and directly electrosprayed into an LCQ Deca XP ion-trap mass spectrometer equipped with a nano-electrospray ion source. A base peak ion chromatogram of the myoglobin digest is shown in Figure 1. In general, peptides with four or fewer amino acids cannot be detected with this method. Small, hydrophilic peptides do not bind to the reversed-phase column, and the fragmentation information obtained from these small peptides is not sufficient for positive identification. However, more than 51% of the myoglobin amino acid sequence could be identified from a 250 fmol sample using fragmentation information from MS/MS spectra and the automated search program Sequest. A summary of all identified peptides with their calculated and experimentally obtained masses is shown in Table 2.

Table 3. ESI-MS Analysis of Myoglobin Proteolytic Fragments from Tryptic Digestion of Horse Myoglobin^a

concn (fmol)  peak area I  peak area II  peak area III      avg      SD  % error
100 000           273 819       105 719        199 122  192 886  84 223       44
50 000            170 712       144 559        194 372  169 881  24 917       15
25 000             67 095        70 790         81 044   72 976    7227      9.9
5000               12 820        13 879         19 128   15 275    3378       22
1000                 3492          3224           2768     3161     366       12
500                  1289          1651           1764     1568     248       16
250                   714           643            588      648      63      9.7
100                   212           219            231      221     9.6      4.4
50                    130            97             61       90      36       40
25                     38            74             55       56      18       32
10                     19             0              6      8.3     9.7      117
0                       0             0              0        0       0        0

^a Peak areas of three analyses (I-III) were calculated using Xcalibur software as described in the Experimental Section. For simplification, the peak area is recorded as 1/100000 of the original unit (total ion counts × time).

Figure 3. Changes of combined chromatographic peak area of peptides 2, 3, 9, 14, and 15 for different amounts of myoglobin injected. Peptides are numbered on the basis of the sequence in Table 1. Horse myoglobin (1 mg; 60 nmol) was digested with trypsin, and 10, 25, 50, 100, 250, 500, 1000, 5000, 25 000, 50 000, and 100 000 fmol samples were analyzed as described in the Experimental Section. Data for each concentration are the average of the three runs ± SD as shown in Table 3. The peak areas with a value of 0 (see Table 3) could not be shown on the logarithmic scale but are included in the linear regression. The linear regression was performed using Sigma Plot (SPSS Inc, Chicago, IL), resulting in the following parameters for y = ax + b: a = 0.159; b = 1.072; and r² = 0.991.

Eleven aliquots containing different amounts of myoglobin digest in the range from 10 fmol to 100 pmol were analyzed by LC/MS/MS, and the peak areas of five selected peptides were calculated. The experiment was repeated three times to ensure repeatability. Figure 2 shows the change of the peak area of one selected peptide (peptide 15, ALELFR) over the whole range. As expected, the peak area increases with increasing amounts of injected peptide. In our experiments, the lower limit for peak detection was 10 fmol. The upper limit was 100 pmol. At this amount, the peaks start to broaden and the loading capacity of the peptide trap is reached. The peak areas of all five myoglobin peptides were combined and plotted against the amount of myoglobin. The peak area correlates
linearly with the concentration of myoglobin (r² = 0.991) from 10 fmol to 100 pmol, and the results are repeatable. A summary of the results is shown in Table 3 and Figure 3.

To further evaluate our quantitation method for protein profiling of complex mixtures, human serum (approximately 1 µg of total protein) was mixed with different amounts of horse myoglobin (250 fmol and 500 fmol), and the two mixtures were analyzed as above. Tryptic peptides were separated on a C-18 column with a gradient of 0-60% acetonitrile in 30 min. The chromatograms are shown in Figure 4. As expected, no obvious difference can be detected between the two chromatograms, since the samples differ only in the concentration of myoglobin. Fragmentation information from MS/MS spectra and the automated search program Sequest was used for peptide and protein identification. A summary of all identified proteins is shown in Table 4. A total of 56 peptides corresponding to 20 different proteins could be identified in both samples. The same proteins were identified in both samples with only minor differences in peptide coverage (data not shown). The very low number of peptides, and therefore proteins, identified in this study is not surprising considering the amount of protein injected and the gradient used for peptide separation. The focus of this study was not to identify the maximum number of peptides in the sample but rather to ensure elution of all peptides in a short period of time. In similar experiments using longer gradients of up to 8 h and more material, over 300 proteins could be identified.

For quantitative analysis, a total of 16 peptides was chosen from six different proteins, including five proteins from human serum (serum albumin, serotransferrin, α-1-antitrypsin, Ig γ-4 chain C region, and apolipoprotein A-1) and horse myoglobin. All proteins with more than one peptide identified were included in the quantitative analysis. The peak areas of these peptides were calculated as described in the Experimental Section, and the two samples were compared. The only difference between the two samples was the concentration of the horse myoglobin. In theory, the peak areas of the human proteins should be constant, and only the peak area of the horse myoglobin should change. The result of this experiment is summarized in Table 5. Comparison of sample 1 (250 fmol of myoglobin) and sample 2 (500 fmol of myoglobin) shows that the peak areas of the human peptides of sample 2 are all approximately the same or smaller (ratio from 1.04 to 0.69), whereas the myoglobin peptides are all higher (ratio from 1.27 to 2.29). The ratios of the peak areas were normalized against an experiment-dependent correction factor. This correction factor was calculated by excluding all ratios not within the median (0.92) ± the standard deviation (0.42). The average of the remaining ratios was calculated to be 0.87, and all peak area ratios were normalized against this factor. The rationale for this normalization is described later in the Discussion.

The concentration of the human proteins was constant, and therefore, the peak areas should have a ratio of 1. The ratio was calculated to be 0.91 for serum albumin, 1.05 for serotransferrin, 0.84 for antitrypsin, 0.95 for Ig γ-4 chain C region, and 1.10 for apolipoprotein A-1. The concentration of myoglobin in the second sample was double the concentration of myoglobin in the first sample, and therefore, the ratio of the peak areas should be 2. Indeed, the peak area ratio for horse myoglobin was calculated to be 1.91. The calculated and expected ratios of the peak areas agree within 16% for all quantified proteins. The results confirm our hypothesis that the peak areas of peptides can be used for quantitative profiling of proteins in complex mixtures. This method can be used to detect small changes in protein concentrations from one sample to the other and gives information about the ratio at which the changes occur.

Figure 4. Quantitative analysis of human serum spiked with two different concentrations of horse myoglobin. The protein mixture was digested and analyzed as described in the Experimental Section. (A) Base peak ion chromatogram of human serum containing 250 fmol of horse myoglobin. (B) Base peak ion chromatogram of human serum containing 500 fmol of horse myoglobin. (C) Reconstructed ion chromatogram of horse myoglobin peptides with m/z = 637.2, 804.5, and 690.5 from human serum containing 250 fmol of horse myoglobin. (D) Reconstructed ion chromatogram of horse myoglobin peptides with m/z = 637.2, 804.5, and 690.5 from human serum containing 500 fmol of horse myoglobin.

Table 4. Proteins Identified in Human Serum Sample Spiked with Horse Myoglobin^a

protein                                                                 peptides  scans    score  norm score
serum albumin                                                                 22     34  270 459        7955
serotransferrin                                                                8     12   98 574        8214
myoglobin (horse)                                                              4      6   69 433      11 572
α-1-antitrypsin                                                                3      4   26 549        6637
Ig γ-4 chain C region                                                          3      4   22 751        5688
Ig λ chain C region                                                            1      2   21 148      10 574
Ig γ-1 chain C region                                                          1      2   15 492        7746
apolipoprotein A-1                                                             2      4   13 075        3269
fibrinogen β chain                                                             1      1   12 118      12 118
transthyretin                                                                  1      2   10 070        3035
haptoglobin-2                                                                  1      1     9725        9725
Ig α-1 chain C region                                                          1      2     8588        4294
fibrinogen gamma chain                                                         1      2     6595        3297
α-1 acid glycoprotein 2                                                        1      1     5821        5821
Ran binding protein 2                                                          1      1     3751        3751
eukaryotic translation initiation factor 3 subunit 2                           1      1     3071        3071
haptoglobin-related protein                                                    1      1     2848        2848
transcription factor RELB                                                      1      1     2782        2782
serine/threonine protein phosphatase 2B catalytic subunit, β isoform           1      1     2500        2500
S100 calcium-binding protein A14                                               1      1     2376        2376

^a Peptides were identified using Sequest software and MS/MS data as described in the Experimental Section. All proteins were identified in both samples with only minor differences in peptide coverage.

Discussion

Typically, large-scale quantitative protein analysis is achieved by combining protein separation techniques such as 2D-PAGE with mass spectrometry based or tandem mass spectrometry based sequence identification. This approach is labor intensive, difficult to automate, and biased against certain proteins. Very large and very small proteins, membrane proteins, and proteins with extremely high or low pI values are frequently missed. In this study, a different approach including tryptic digestion, peptide separation, and mass spectrometry analysis is described. The results presented demonstrate that the peak areas from liquid chromatography/mass spectrometry of tryptic peptides can be used for quantitative protein analysis. The method described is accurate and repeatable and can be used to measure small changes in relative protein concentration.

For our system, the dynamic range for quantification of a purified protein was between 10 fmol and 100 pmol. The limiting factor at the lower end is the detection limit of the LC/MS analysis. Different proteins have different detection limits, and the data obtained in this study certainly do not allow a general rule to be formulated. However, the value can be used as a general guideline, since most proteins tested have detection limits of around 1-10 fmol. The capacity of the liquid chromatography system used in this study, including the peptide trap and the 75 µm i.d. column, is approximately 2 µg or 100 pmol of peptides. Overloading the peptide trap or the column obviously does not allow accurate quantitative measurements. However, the dynamic range of our system, 10⁴, is comparable to the dynamic range that can be achieved using other techniques. The accuracy of this approach for protein quantification is best from 100 fmol to 50 pmol, where the percentage error of the myoglobin digest is between 4.4% and 22% (Table 3). This means that for real biological samples such as serum only the most abundant proteins (≥0.2% abundance) can be analyzed using this method. Analysis of low-abundance proteins will require sample preparation steps prior to the analysis to enrich these proteins.

Quantitation of peptide and protein mixtures by mass spectrometry has been a challenging analytical problem, largely because of ionization suppression among coeluting species.6 The ICAT approach solves this problem by isotope labeling of peptides containing cysteine residues and is currently used for quantitative profiling of proteins in complex mixtures.17 Our
proposed method was evaluated for quantitative profiling of proteins using human serum spiked with two different amounts of horse myoglobin. The gradient used for peptide separation, 0-60% acetonitrile in 30 min, ensures that many peptides coelute, so that the effect of coelution on our method can be analyzed. Contrary to our expectation, the peak areas of coeluting peptides can still be used for protein quantitation. Expected and calculated protein ratios differ by no more than 16%. The results could not be improved significantly by using longer gradients of 60 and 120 min (data not shown), indicating that coelution of peptides might not affect our quantitation method. Gygi et al. reported the quantitative profiling of 26 proteins from yeast growing on galactose or ethanol as a carbon source with the ICAT method.17 However, the results were not confirmed by other analytical techniques such as ELISA or radioimmunoassays, and no information about the actual protein ratio and the measured protein ratio was reported. The ICAT method was evaluated using a mixture of five proteins at different concentrations.17 Expected and calculated ratios differed by up to 12%, which is comparable with the results achieved by our method.

The results from our quantitative profiling analysis could be improved by normalizing the calculated peak area with a correction factor as shown in Table 5. The rationale for such normalization is based on the observation that even the same sample gives different peptide peak areas from one run to the next. These differences can be caused by several experiment-dependent parameters such as differences in sample preparation (pipetting errors, incomplete digestion) or inaccurate sample injection. Our data suggest that these experiment-dependent parameters, although unknown, affect all proteins from a single run in the same way.

Table 5. Sequence Identification and Quantification of the Components of Human Serum Spiked with Two Different Concentrations of Horse Myoglobin (250 fmol and 500 fmol)^a

protein      peptides identified  obsd ratio  mean ± SD    NL^b ratio  expected ratio  % error
albumin      LCTVATLR             0.87        0.79 ± 0.18  0.91        1               9
             YICENQDSISSK         0.69
             CCAAADPHECYAK        0.93
             KVPQVSTPTLVEVST      0.72
transferrin  DGAGDVAFVK           0.85        0.91 ± 0.11  1.05        1               5
             SVIPSDGPSVACVK       0.98
antitrypsin  SVLGQLGITK           0.76        0.73 ± 0.03  0.84        1               16
             LSITGTYDLK           0.70
myoglobin    HGTVVLTALGGILK       1.27        1.66 ± 0.55  1.91        2               5
             VEADIAGHGQEVLIR      2.29
             LFTGHPETLEK          1.42
IgG-4        GPSVFPLAPCSR         0.62        0.83 ± 0.11  0.95        1               5
             NQVSLTCLVK           1.04
Apo-A1       THLAPYSDELR          0.92        0.96 ± 0.04  1.10        1               10
             ATEHLSTLSEK          1.00

^a The ratio of the peak areas was calculated by dividing the peak area of sample 2 (500 fmol of myoglobin) by the peak area of sample 1 (250 fmol of myoglobin) for each peptide.
^b The ratios of the peak areas were normalized (NL) against an experiment-dependent correction factor. This correction factor was calculated by excluding all ratios not within the median ratio (0.92) ± the standard deviation (0.42). The average of the remaining ratios was calculated to be 0.87, and all peak area ratios were divided by this factor, giving the normalized (NL) ratios.
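The correction-factor procedure in footnote b of Table 5 can be checked directly against the observed ratios in that table. The Python sketch below is one reading of the footnote, not the authors' software: it assumes that the standard deviation is the sample standard deviation of all 15 observed ratios and that the exclusion window is the median plus or minus one such standard deviation. The names `correction_factor` and `normalize` are hypothetical.

```python
import statistics

# Observed peak area ratios (sample 2 / sample 1) for the 15 peptides in Table 5.
OBSERVED_RATIOS = [0.87, 0.69, 0.93, 0.72, 0.85, 0.98, 0.76, 0.70,
                   1.27, 2.29, 1.42, 0.62, 1.04, 0.92, 1.00]

# Mean observed ratio per protein, copied from the "mean" column of Table 5.
PROTEIN_MEANS = {"albumin": 0.79, "transferrin": 0.91, "antitrypsin": 0.73,
                 "myoglobin": 1.66, "IgG-4": 0.83, "Apo-A1": 0.96}


def correction_factor(ratios: list[float]) -> float:
    """Average of the ratios lying within median +/- one standard deviation."""
    median = statistics.median(ratios)
    sd = statistics.stdev(ratios)  # sample SD; this choice is an assumption
    kept = [r for r in ratios if abs(r - median) <= sd]
    return statistics.mean(kept)


def normalize(protein_means: dict[str, float], factor: float) -> dict[str, float]:
    """Divide each protein's mean ratio by the correction factor."""
    return {name: mean / factor for name, mean in protein_means.items()}


factor = correction_factor(OBSERVED_RATIOS)
normalized = normalize(PROTEIN_MEANS, factor)

print(f"correction factor: {factor:.2f}")
for name, value in normalized.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the sketch gives a median of 0.92, a standard deviation of about 0.42, and a correction factor of about 0.87, in line with footnote b. Only the two largest myoglobin ratios (1.42 and 2.29) fall outside the window. The normalized ratios agree with the NL column to within about 0.01; the published values appear to have been divided by the rounded factor 0.87.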
The peak areas of the human proteins, which should theoretically be the same for the two analyses, are all lower in the sample containing 500 fmol of myoglobin. Several methods to correct this error have been considered, and the pros and cons of each method are explained here. One obvious approach would be to repeat each experiment several times to obtain better statistics for each sample. However, this approach is very time-consuming and is certainly less desirable. Another approach would be to normalize all peak areas to the peak area of a known protein. In our example, we could have normalized all ratios to the ratio of albumin. The correction factor would have been 0.79, and the results would have been similar to the results presented in Table 5. In general, this approach would require the addition of standard proteins or peptides to the samples, since a general method should enable researchers to analyze data without guessing which proteins are constant.

To compare similar samples that differ only in the concentrations of a few proteins, such as cell cultures treated with different drugs, a different method to calculate the correction factor can be used. The peak areas or the ratios can be normalized against an obvious trend. In our experiment, the differences between the expected and the calculated peak areas for the human proteins are most likely due to differences in sample preparation and affect all proteins from a single run in the same way. The correction factor is the average peak area ratio of all proteins that are constant in the two runs (all human proteins). For unknown samples in which some proteins change, the calculation is somewhat tricky. Outliers, that is, the proteins that are really changing (in our case myoglobin), need to be detected and excluded from the calculation. This is a mathematical problem that cannot be easily solved. We applied a general method to calculate the correction factor by excluding all ratios not within the median ± SD and recalculating the median (or average) of the remaining ratios. The median was chosen rather than the average for our calculations since it is less susceptible to exceptions to the trend and should theoretically be the best approach for a wide range of applications. Clearly, additional experiments need to be performed to demonstrate that our method for calculating the correction factor can be used for general applications, and the limitations of this method for real biological samples need to be determined.

One other important issue in automating our proposed quantitation method is peptide identification. The peptide identification described in the Experimental Section is an essential part of our protein quantitation method. Only proteins with more than one peptide identified were included in our quantitation studies. Having more than one peptide identified in a particular protein will not only improve the identification but also optimize the quantitative analysis. However, quantitative analysis of proteins with only one peptide identified was possible in our study with a similar percentage error (data not shown). The results from such quantitation experiments based on a single peptide are certainly less valuable and can probably not be applied as a general method.

To conclude, in this study we describe an approach for quantitative profiling of proteins in complex mixtures using peak areas from liquid chromatography/mass spectrometry analysis. The method described has several advantages compared to the commonly used ICAT method for protein profiling in complex mixtures. No labor-intensive and time-consuming labeling of the samples is needed prior to the analysis. Our method is not limited to proteins that contain cysteine residues, and more than two samples can be compared. Quantitative profiling of proteins using peak areas from LC/MS/MS experiments has the potential to be fully automated and should have an impact on large-scale protein identification and quantification.

Acknowledgment. We thank Tom Shaler, Andrew Guzzetta, and Antonio Piccolboni for fruitful discussions. We also thank Tom McCall for developing and installing the “no-waste” injection option on the auto-sampler and other improvements in the HPLC automation.

References

(1) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R., III. Nat. Biotechnol. 1999, 17, 676-82.
(2) Shevchenko, A.; Jensen, O. N.; Podtelejnikov, A. V.; Sagliocco, F.; Wilm, M.; Vorm, O.; Mortensen, P.; Shevchenko, A.; Boucherie, H.; Mann, M. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 14440-45.
(3) Gygi, S. P.; Rochon, Y.; Franza, B. R.; Aebersold, R. Mol. Cell Biol. 1999, 19, 1720-30.
(4) Garrels, J. I.; McLaughlin, C. S.; Warner, J. R.; Futcher, B.; Latter, G. I.; Kobayashi, R.; Schwender, B.; Volpe, T.; Anderson, D. S.; Mesquita-Fuentes, R.; Payne, W. E. Electrophoresis 1997, 18, 1347-60.
(5) Boucherie, H.; Sagliocco, F.; Joubert, R.; Maillet, I.; Labarre, J.; Perrot, M. Electrophoresis 1996, 17, 1683-99.
(6) King, R.; Bonfiglio, R.; Fernandez-Metzler, C.; Miller-Stein, C.; Olah, T. J. Am. Soc. Mass Spectrom. 2000, 11, 942-50.
(7) Rafter, J. J.; Ingelman-Sundberg, M.; Gustafsson, J. A. Acta Biol. Med. Ger. 1979, 38, 321-31.
(8) Desiderio, D. M.; Kai, M. Biomed. Mass Spectrom. 1983, 10, 471-79.
(9) Mirgorodskaya, O. A.; Kozmin, Y. P.; Titov, M. I.; Korner, R.; Sonksen, C. P.; Roepstorff, P. Rapid Commun. Mass Spectrom. 2000, 14, 1226-32.
(10) Goodlett, D. R.; Keller, A.; Watts, J. D.; Newitt, R.; Yi, E. C.; Purvine, S.; Eng, J. K.; von Haller, P.; Aebersold, R.; Kolker, E. Rapid Commun. Mass Spectrom. 2001, 15, 1214-21.
(11) Oda, Y.; Huang, K.; Cross, F. R.; Cowburn, D.; Chait, B. T. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 6591-96.
(12) Jensen, P. K.; Pasa-Tolic, L.; Peden, K. K.; Martinovic, S.; Lipton, M. S.; Anderson, G. A.; Tolic, N.; Wong, K. K.; Smith, R. D. Electrophoresis 2000, 21, 1372-80.
(13) Conrads, T. P.; Alving, K.; Veenstra, T. D.; Belov, M. E.; Anderson, G. A.; Anderson, D. J.; Lipton, M. S.; Pasa-Tolic, L.; Udseth, H. R.; Chrisler, W. B.; Thrall, B. D.; Smith, R. D. Anal. Chem. 2001, 73, 2132-39.
(14) Washburn, M. P.; Wolters, D.; Yates, J. R., III. Nat. Biotechnol. 2001, 19, 242-47.
(15) Spahr, C. S.; Davis, M. T.; McGinley, M. D.; Robinson, J. H.; Bures, E. J.; Beierle, J.; Mort, J.; Courchesne, P. L.; Chen, K.; Wahl, R. C.; Yu, W.; Luethy, R.; Patterson, S. D. Proteomics 2001, 1, 93-107.
(16) Davis, M. T.; Spahr, C. S.; McGinley, M. D.; Robinson, J. H.; Bures, E. J.; Beierle, J.; Mort, J.; Yu, W.; Luethy, R.; Patterson, S. D. Proteomics 2001, 1, 108-17.
(17) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-99.
(18) Smolka, M. B.; Zhou, H.; Purkayastha, S.; Aebersold, R. Anal. Biochem. 2001, 297, 25-31.
(19) Griffin, T. J.; Gygi, S. P.; Rist, B.; Aebersold, R.; Loboda, A.; Jilkine, A.; Ens, W.; Standing, K. G. Anal. Chem. 2001, 73, 978-86.
(20) Zhang, R.; Sioma, C. S.; Wang, S.; Regnier, F. E. Anal. Chem. 2001, 73, 5142-49.
(21) Eng, J. K.; McCormack, A. L.; Yates, J. R., III. J. Am. Soc. Mass Spectrom. 1994, 5, 976-89.
