Proteome Analysis of Low-Abundance Proteins Using Multidimensional Chromatography and Isotope-Coded Affinity Tags

Steven P. Gygi,*,†,‡ Beate Rist,‡,§ Timothy J. Griffin,| Jimmy Eng,| and Ruedi Aebersold|

Department of Cell Biology, Harvard Medical School, 240 Longwood Avenue, Boston, Massachusetts 02115, BioVisioN GmbH & Co. KG, Feodor-Lynen-Str. 5, D-30625 Hannover, Germany, and Institute for Systems Biology, 4225 Roosevelt Way NE, Suite 200, Seattle, Washington 98195

Received November 12, 2001

The effectiveness of proteome-wide protein identification and quantitative expression profiling is dependent on the ability of the analytical methodologies employed to routinely obtain information on low-abundance proteins, as these are frequently of great biological importance. Two-dimensional gel electrophoresis, the traditional method for proteome analysis, has proven to be biased toward highly expressed proteins. Recently, two-dimensional chromatography of the complex peptide mixtures generated by the digestion of unseparated protein samples has been introduced for the identification of their components, and isotope-coded affinity tags (ICAT) have been introduced to allow for accurate quantification of the components of protein mixtures by mass spectrometry. Here, we demonstrate that the combination of isotope-coded affinity protein tags and multidimensional chromatography/mass spectrometry of tryptic peptide mixtures is capable of detecting and quantifying proteins of low abundance in complex samples.

Keywords: gene expression • functional genomics • proteomics • protein profiling • mass spectrometry • isotope-coded affinity tags

* To whom correspondence should be addressed. Phone: 617-432-3155. Fax: 617-432-1144. E-mail: steven•[email protected].
† Harvard Medical School.
‡ These two authors contributed equally to this work.
§ BioVisioN GmbH & Co. KG.
| Institute for Systems Biology.
10.1021/pr015509n CCC: $22.00 © 2002 American Chemical Society
Journal of Proteome Research 2002, 1, 47-54. Published on Web 01/18/2002

Introduction

Essential to proteome-wide protein identification and quantitative expression profiling (referred to in this paper as proteome analysis) is the ability to analyze the entire spectrum of proteins that are expressed by an organism, tissue, or cell type. Therefore, the sensitive and routine characterization of proteins present at low abundance is required of any analytical methodology that will be employed for such analyses. Unlike for nucleic acids, no technique exists for the amplification of proteins. Thus, the analysis of low copy number proteins has proven to be a difficult analytical problem.

The standard approach to proteome analysis involves protein separation and visualization by two-dimensional gel electrophoresis (2DE) and protein staining followed by the identification of separated protein spots by mass spectrometry (MS) or tandem mass spectrometry (MS/MS). In typical studies of this type, whole lysates or specific fractions of cells representing two different states are examined individually by 2DE, and the resulting protein patterns are compared by spot pattern matching. Proteins for which the level of expression has been found to change between the two states are then identified by one of a variety of MS or MS/MS methods (reviewed in refs 1 and 2).

Using the codon-bias (CB) distribution in the yeast Saccharomyces cerevisiae as an approximate indicator of the abundance level of the yeast proteins detectable by silver staining of 2D gels in which total cell lysates had been separated,3-5 we exclusively found proteins of medium to high expression levels and no proteins of low abundance.6 The results from this study were consistent with results obtained from other studies in which large-scale identification of 2D gel separated yeast proteins had been attempted,4,7-9 indicating that 2DE of total cell lysates (or other very complex samples) lacks the sensitivity for the detection of low-abundance proteins.

Two-dimensional chromatography-electrospray ionization (ESI) MS/MS of complex peptide mixtures generated by the digestion of unseparated protein samples has recently been described as a method to identify the components of protein samples, obviating the need for protein gel electrophoresis,10,11 and we have introduced isotope-coded affinity tags (ICAT) as reagents for accurate, quantitative profiling of proteins in complex mixtures by MS/MS.12 While the 2D chromatography MS/MS method has been shown to be capable of identifying large numbers of proteins, including proteins of low abundance,6,10 it does not, by itself, indicate the accurate quantity of the proteins identified. Conversely, while the ICAT method as initially described has been shown capable of accurate quantification of proteins in complex mixtures, the one-dimensional capillary chromatography separation used did not have sufficient peak or sample capacity to allow for the analysis of low-abundance proteins.

In this study, we describe the combined use of ICAT reagent labeling of proteins in complex mixtures and three-dimensional (cation exchange, biotin affinity, reverse-phase) chromatography of the peptides generated by enzymatic digestion of the tagged proteins. We demonstrate that ICAT-labeled peptides can be reproducibly isolated at high yields and that the method is capable of identifying, and by implication quantifying, proteins of low abundance. We therefore suggest that this approach will be useful for the proteome-wide identification and quantitative profiling of proteins and expect it to find widespread application in the field of quantitative proteomics.

Results

Optimization of Three-Dimensional Chromatography for the Separation of Complex ICAT-Labeled Peptide Mixtures. The ICAT procedure has been outlined12 and reviewed13,14 in detail previously. The ICAT method and MS/MS permit the identification and quantitative comparison of the components of two complex protein mixtures by the selective alkylation, isolation, quantification, and sequence identification of the cysteine-containing peptides from each protein. The relative quantification is achieved by stable isotopes (deuterium atoms) included in one form (termed d8 or heavy form) but not the other form (termed d0 or light form) of the reagent, and the recognition that two chemically identical compounds of different mass serve as perfect mutual internal standards for quantitative mass spectrometry.15 Protein identification is realized by MS/MS sequence analysis and sequence database searching using uninterpreted tandem mass spectra.16 The result is protein profiling directly from complex mixtures without the use of gel electrophoresis.

The digestion of complex, labeled protein mixtures generates peptide samples potentially containing hundreds of thousands of peptides differing in their abundance by several orders of magnitude. The analysis of such samples therefore represents a considerable analytical challenge. We have implemented an optimized three-dimensional chromatographic procedure for the separation and isolation of ICAT-labeled peptides. The procedure schematically represented in Figure 1 was designed for sufficient sample capacity to make even peptides derived from low abundance proteins detectable by MS and for sufficient peak capacity for the analysis of very complex peptide mixtures. Whole cell protein extracts or fractions thereof can be labeled according to the ICAT strategy at cysteinyl residues and proteolyzed with trypsin. The resulting highly complex peptide mixture is then separated by strong cation-exchange (SCX) chromatography followed by biotin-affinity chromatography to selectively isolate the ICAT-labeled peptides, and these are further separated by capillary reversed-phase (RP) liquid chromatography prior to analysis by ESI-MS/MS. Critical parameters related to the sample recovery and reproducibility of the procedure were systematically tested.

Reproducibility of SCX Chromatography. To test the reproducibility of peptide separation by SCX chromatography in an automated mode, we repeatedly injected aliquots of the same sample into the chromatography system (n = 10). The sample consisted of 100 pmol of a three-peptide standard (neurotensin, bradykinin, and angiotensin). Peak retention times for the three peptides varied by less than 15 s from the first to last injection. Area-under-the-curve measurements for the three peptides varied by 5.5% (n = 10) (data not shown). A high degree of chromatographic reproducibility is important for the procedure in cases in which several samples of the same type need to be processed sequentially (e.g., time course of stimulation studies, clinical samples, etc.).
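The two reproducibility figures quoted for the SCX step can be computed from replicate injections along the following lines. This is only a sketch: the text does not say how the 5.5% variation was calculated, so the coefficient of variation (sample standard deviation divided by the mean) is assumed here, and the function names and example numbers are hypothetical rather than the unpublished data.

```python
from statistics import mean, stdev

def percent_cv(values):
    """Coefficient of variation (sample SD / mean), in percent.

    Assumption: the variation in peak area is expressed as a CV;
    the source does not state which statistic was used.
    """
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    return 100.0 * stdev(values) / mean(values)

def retention_drift_s(retention_times_s):
    """Absolute retention-time shift between the first and last injection."""
    return abs(retention_times_s[-1] - retention_times_s[0])

# Hypothetical peak areas and retention times (s) for one peptide, 10 injections.
areas = [100.0, 104.0, 96.0, 108.0, 92.0, 101.0, 99.0, 105.0, 95.0, 100.0]
times = [1200.0, 1201.5, 1203.0, 1202.0, 1204.5,
         1206.0, 1207.5, 1208.0, 1210.0, 1211.0]

print(f"area CV: {percent_cv(areas):.1f}%")
print(f"retention drift: {retention_drift_s(times):.1f} s")
```

Running the same two functions per peptide of the three-peptide standard would give one CV and one drift value each, which can then be compared against whatever acceptance limits a laboratory chooses.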

Journal of Proteome Research • Vol. 1, No. 1, 2002

Figure 1. Proteome analysis using a combination of the isotope-coded affinity tag (ICAT) strategy and multidimensional chromatography. Whole cell protein is first denatured, reduced, and then labeled at cysteines with the ICAT strategy followed by proteolysis with trypsin. The resulting highly complex peptide mixture is subjected to three different chromatographic separations. Strong cation-exchange (SCX) chromatography separates the peptide mixture based on ionic charge, and fractions are collected. Each fraction can then be passed over an avidin column to selectively isolate only ICAT-labeled (cysteine-containing) peptides. The resulting mixture of ICAT-labeled peptides is then separated by reversed-phase (RP) nanoscale capillary liquid chromatography (LC) with online analysis by tandem mass spectrometry (MS/MS). The reduction in the complexity of the original mixture by the 3D separation permits the analysis of lower abundance proteins. The reduction in the complexity of a theoretical example of tryptic proteolysis of the entire yeast proteome is shown on the right.
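The d0/d8 quantification principle described in the Results implies a fixed mass spacing between the light and heavy forms of each labeled peptide, which is what allows the two forms to be paired in an MS survey scan. The sketch below illustrates that arithmetic. It assumes that "d8" denotes eight hydrogen-to-deuterium substitutions per tag and that a peptide with several cysteines carries one tag per cysteine; the function names, tolerance, and peak list are hypothetical.

```python
# Monoisotopic masses of 1H and 2H in Da; the difference is the shift per substitution.
DEUTERIUM_SHIFT = 2.014102 - 1.007825

def pair_spacing_mz(charge, n_labels=1, n_deuterium=8):
    """Expected m/z spacing between the d0 and d8 forms of one peptide."""
    return n_labels * n_deuterium * DEUTERIUM_SHIFT / charge

def find_pairs(peaks_mz, charge, tol=0.02, n_labels=1):
    """Return (light, heavy) m/z pairs separated by the expected spacing.

    tol is an illustrative tolerance in m/z units, not a value from the paper.
    """
    spacing = pair_spacing_mz(charge, n_labels)
    peaks = sorted(peaks_mz)
    pairs = []
    for light in peaks:
        for heavy in peaks:
            if abs((heavy - light) - spacing) <= tol:
                pairs.append((light, heavy))
    return pairs

# Hypothetical doubly charged survey-scan peaks.
print(pair_spacing_mz(charge=2))
print(find_pairs([500.00, 504.03, 650.00], charge=2))
```

The ratio of the signal intensities of the two members of each pair would then give the relative abundance of the peptide in the two samples.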

Peptide Yield of Avidin Affinity Chromatography. A key to the effectiveness of any multidimensional chromatographic strategy is a high recovery of the applied peptides. It is well established that peptide recoveries from SCX and RP columns are generally high.17 The affinity purification of the biotinylated cysteine-containing peptides using immobilized avidin was therefore considered the recovery-limiting step in the 3D peptide separation procedure. We therefore systematically investigated and optimized the conditions for the purification of the ICAT-labeled peptides by avidin affinity chromatography. Peptide recovery was quantified by differential isotopic labeling and quantitative mass spectrometry of peptide mixtures passed over the column and of control samples that were not subjected to avidin chromatography. Bovine α-lactalbumin protein was labeled with the d0 form of the ICAT reagent, trypsinized, and subjected to avidin affinity chromatography under different conditions. An amount of a tryptic digest of a control sample labeled with the d8 form of the reagent, corresponding to the amount of d0-labeled sample applied to the column, was added to the column eluate, and the ratios of the signals for the heavy to light forms of each peptide were determined by MALDI-TOF-MS and used to quantify peptide


Table 1. Examples of Reproducibility of ICAT Expression Ratios from Four Separate Experiments

Figure 2. Effect of different elution conditions on peptide recovery using biotin affinity chromatography during the ICAT strategy. Quantitative recoveries for seven different peptides from bovine α-lactalbumin were determined by a modified ICAT approach as described in the text with varying elution conditions. Peptides a-g are shown in their elution order by RP-HPLC (hydrophobicity), and sequences are detailed in the text. Only the third elution condition gave effective recovery of both hydrophobic and hydrophilic peptides. Elution conditions from the affinity columns were the following: condition 1, 30% methanol, 0.3% formic acid; condition 2, 30% methanol, 0.4% TFA; condition 3, 30% acetonitrile, 0.4% TFA.

recovery. Figure 2 shows the recovery of seven different ICAT-labeled peptides from bovine α-lactalbumin at three different elution conditions. Using methanol:formic acid:water (30:0.3:70 v/v/v) or methanol:TFA:water (30:0.4:70 v/v/v) as eluting solvents, we observed a large discrepancy between the recoveries of hydrophilic and hydrophobic peptides, with the more hydrophobic peptides being recovered poorly. In contrast, the peptides uniformly eluted at high yield in acetonitrile:TFA:water (30:0.4:70 v/v/v). From these experiments we concluded that complex mixtures containing ICAT-labeled peptides could be separated reproducibly and that the peptides were generally recovered at high yields.

Reproducibility of ICAT Procedure. To examine the overall reproducibility of the ICAT method, we analyzed protein samples from yeast cells grown on ethanol or galactose as a carbon source. This experiment has been described and discussed in detail previously.12 For the purpose of investigating the reproducibility of the ICAT method, the entire process consisting of yeast cell growth, ICAT labeling, chromatographic separation of peptides, and mass spectrometric analysis was repeated in triplicate. These results, together with the results obtained months earlier from the same experiment for the initial description of the ICAT method, are shown in Table 1. As we frequently detected multiple peptides from the same protein, many proteins were quantified by more than four independent measurements.

Ability of the ICAT Procedure To Identify Low Abundance Proteins. The ability to identify proteins of low abundance is a critical element of proteome analysis. Peptide samples generated by the tryptic digestion of an unseparated proteome are expected to span approximately 6 orders of magnitude in abundance for yeast4 and an even wider range for mammalian cells. Furthermore, such peptide samples are extraordinarily complex, containing at least several hundred thousand peptides.
A suitable proteome analysis technology must therefore have sufficient dynamic range to match or exceed the range of protein expression of the sample analyzed, sufficient peak capacity to fractionate the peptide mixture to a degree that allows the mass spectrometer to detect and resolve most or all

gene name^a    obsd ratio^b (Eth/Gal)^c    n^d
ADH1           0.55 ± 0.10                 7
BMH1           0.92 ± 0.15                 4
CDC19          0.59 ± 0.07                 10
GPD1           0.58 ± 0.13                 4
LPD1           1.41 ± 0.22                 4
PEP4           2.30 ± 0.39                 4
PSA1           0.66 ± 0.09                 8
PGM2           0.58 ± 0.09                 7
PCK1           1.53 ± 0.11                 18
QCR6           1.35 ± 0.17                 3
SAH1           0.71 ± 0.08                 8
SOD1           0.52 ± 0.10                 4
TEF1           0.77 ± 0.11                 12

a Gene names are according to the yeast proteome database (YPD). b Protein expression ratios were calculated as described in text. c Carbon source for yeast growth was 2% ethanol (Eth) or 2% galactose (Gal). d Total number of unique peptide identifications and quantifications from four different labeling experiments.

of the peptides in the sample, and sufficient sample capacity to elevate the signal of the lowest abundance peptides above the limit of detection of the mass spectrometer used. To test the ability of the procedure described in this paper to identify low abundance proteins, tryptic digests of soluble yeast proteins harvested from cells grown on either ethanol or galactose and labeled with ICAT reagents were subjected to increasing chromatographic separation and identified by different MS/MS sequencing protocols. The number of proteins identified and their codon bias distribution were used to assess the ability of these methods to identify low abundance proteins. Figure 3 shows the effect on the number of proteins identified if the peptide mixture (proteolyzed yeast proteome) was separated by one-dimensional (RP-HPLC), two-dimensional (SCX and RP-HPLC), or three-dimensional (SCX, biotin-affinity, RP-HPLC) chromatography and the peptides were analyzed by performing alternating MS (surveying eluting peptide masses) and MS/MS (sequencing peptide ions) scans in the ion trap mass spectrometer. The data indicate that the number of identified proteins increased by approximately 10-fold by moving from 1D separation to 3D peptide separation. In fact, from four SCX fractions analyzed (out of 30 collected) 986 unique proteins were identified which is a substantial fraction of the entire yeast proteome. To increase the number of proteins identified we employed two different MS/MS strategies. In the first, a MS scan was followed by a single MS/MS sequencing scan of the most abundant ion detected in the MS scan. 
In the second strategy, a single MS scan was followed by MS/MS scans of the five most intense peptide ions detected in the MS scan, and the chromatographic separation time was extended from 1 to 2 h.18 The results in Figure 3B show that with the second strategy the number of proteins uniquely identified from the yeast lysates doubled, independent of the chromatographic separation scheme utilized. A comparison of Figures 4A and 4B also showed that by going from the 2D to the 3D chromatographic separation the number of identified proteins increased by 2-fold while the number of uniquely identified peptides decreased by 2-fold. This was due to the sometimes reduced ability of the database-searching algorithm to unambiguously identify peptides with the large ICAT modification on the side chain of cysteine residues. It is anticipated that the gradient length could be further extended to 3 or 4 h. This would undoubtedly result in even more peptide identifications.
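The gain from the second MS/MS strategy can be estimated from the scan cycle alone. The sketch below is a back-of-the-envelope calculation, not the instrument method: it uses the average scan time of 1.3 s reported elsewhere in this paper and assumes that every scan, survey or sequencing, takes that long and that no time is lost between scans. The function name is hypothetical.

```python
def sequencing_attempts(run_seconds, msms_per_cycle, seconds_per_scan=1.3):
    """Upper-bound estimate of MS/MS scans acquired during one LC run.

    Each cycle is one survey MS scan plus msms_per_cycle MS/MS scans.
    Assumes a uniform scan time and no dead time between scans.
    """
    scans_per_cycle = 1 + msms_per_cycle
    total_scans = run_seconds / seconds_per_scan
    cycles = total_scans / scans_per_cycle
    return int(cycles * msms_per_cycle)

# Strategy 1: one MS scan, one MS/MS scan, 1-h run.
print(sequencing_attempts(3600, 1))
# Strategy 2: one MS scan, five MS/MS scans, 2-h run.
print(sequencing_attempts(7200, 5))
```

These idealized counts (about 1380 and 4600) sit slightly above the approximately 1300 and 4400 sequencing attempts reported in the Experimental Section, as expected for an upper bound. Most of the roughly 3-fold gain comes from doubling the run time, and the rest from spending five of every six scans on sequencing instead of one of every two.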


Figure 3. Effect of decreasing the mixture complexity and increasing the separation power on protein identification in yeast. A single sample representing all tryptic peptides from whole soluble yeast protein labeled at cysteines with the ICAT reagents was analyzed. Aliquots of this sample were subjected to either one-dimensional (1D), 2D, or 3D chromatographic separation prior to peptide sequence analysis by tandem mass spectrometry (MS/MS) and database matching. In addition, two different MS/MS strategies were evaluated, with the second strategy being maximized for peptide sequencing. Results show the number of (A) unique peptides and (B) unique proteins identified in each category and separation strategy. Separation strategies were the following: 1D, RP only; 2D, SCX and RP; 3D, SCX followed by biotin-affinity chromatography followed by RP. 2D and 3D separations represent combined data from only four SCX chromatography fractions (60 were collected). MS strategies were the following: MS strategy 1 was one MS scan followed by one MS/MS (sequencing) scan repeated during a 1-h analysis; MS strategy 2 was one MS scan followed by five MS/MS scans on the top five peptide ions repeated during a 2-h analysis. The lowercase letters in panel B are to be used as reference for Figure 4.

Figure 4 presents the codon bias distributions (CBD) for the proteins identified from each of the conditions tested in Figure 3B. The intervals with the lighter shading represent identified genes that were classified as low abundance proteins based on codon bias values. Clearly, only the distribution in Figure 4F closely approximates the distribution of the entire yeast proteome. The conditions used to generate the data in Figure 4F were 3D chromatographic peptide separation and the extensive MS/MS sequencing strategy. Other conditions, especially those utilizing only 1D separation by capillary chromatography, failed to identify proteins of lower abundance because of the limited sample and peak capacity.
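The classification behind the shaded intervals of Figure 4 can be sketched as a simple threshold test. The cutoff of 0.1 is the value given in the Discussion for genes generally considered to be expressed at low abundance. The function name, gene labels, and codon bias values below are hypothetical placeholders, not data from this study.

```python
def low_abundance_fraction(codon_bias_by_gene, threshold=0.1):
    """Return (fraction, gene list) of identified genes with codon bias below threshold.

    threshold=0.1 follows the cutoff quoted in the Discussion.
    """
    if not codon_bias_by_gene:
        raise ValueError("no identified genes supplied")
    low = sorted(gene for gene, cb in codon_bias_by_gene.items() if cb < threshold)
    return len(low) / len(codon_bias_by_gene), low

# Hypothetical identifications from one separation condition.
identified = {"GENE_A": 0.05, "GENE_B": 0.52, "GENE_C": 0.09, "GENE_D": 0.81}

fraction, genes = low_abundance_fraction(identified)
print(f"{fraction:.0%} of identified genes fall below the cutoff: {genes}")
```

Computing this fraction for each of the six conditions, and for the full set of yeast open reading frames, would give a single number per panel with which to compare how closely each condition approaches the proteome-wide distribution.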

Discussion

Proteome analysis employing automated tandem mass spectrometry has proven effective for identifying proteins extracted from gels,1 and also directly from complex mixtures.19 Hundreds of proteins can be identified independently when complex peptide mixtures are separated by RP-LC prior to online analysis in the MS.2 Coelution of multiple peptide ions usually presents no problem because each peptide is isolated and fragmented, in turn, by the tandem mass spectrometer. Furthermore, the entire process of ion selection, CID, and acquisition and analysis of the fragment ion spectrum is completely automated, with the mass spectrometer cycling back and forth between measuring peptide masses (MS mode) and collecting peptide sequence information (MS/MS mode). Because the time required to sequence each peptide (collect one MS/MS scan) is approximately 1.3 s, impressive numbers of peptides can be sequenced during a typical 1-h gradient (3600 s).18 As demonstrated by Link and co-workers11 and others,6,10,20 the number of peptides analyzed can be further increased if the peak capacity and also the analysis time of the peptide separation are increased by the implementation of 2D (SCX, RP-LC) procedures. However, these methods fail to provide accurate quantitative information on the peptides analyzed. The ICAT strategy, by using post-isolation labeling of proteins with stable isotope tags, provides in a single analysis the ability to identify and quantify protein expression levels directly from complex mixtures.13

In this study we have systematically examined and optimized the conditions for 3D (SCX, avidin-affinity, RP) chromatographic separation of ICAT-labeled peptides and the utility of this method for the identification of proteins of low abundance. Specifically, we have assessed the reproducibility of the SCX chromatography method used, the recovery of ICAT-labeled peptides from biotin affinity chromatography, and the quantitative reproducibility of the method. We show that ICAT-labeled peptides generally are recovered in high yield (Figure 2), that the method is reproducible with respect to chromatographic fidelity and quantitative accuracy, and that the 3D peptide separation has sufficient sample and peak capacity to detect proteins of low abundance.

The reproducibility of quantification of the method was investigated by measuring the effects on protein expression from a metabolic shift from galactose to ethanol in yeast cells. The experiment was performed in quadruplicate with one measurement performed months earlier than the other three. The results were highly reproducible.
Generally, changes in abundance exceeding 25% could be confidently detected. Quantitative protein profiling by the ICAT method is therefore considerably more accurate and more precise than mRNA profiling by expression array methods.21 Proteolytic cleavage of all soluble yeast proteins with trypsin presents a highly complex peptide mixture of at least 300,000 peptides. Reducing the complexity of this mixture while maintaining the integrity of the original sample is paramount to detecting classes of proteins with lower expression levels. Selective labeling of peptides containing the relatively rare amino acid cysteine is a suitable strategy to reduce the complexity of peptide mixtures by approximately a factor of 10.22 However, this reduction in complexity is accompanied with some protein loss because approximately 8% of proteins in yeast contain no cysteines. Notwithstanding, the average number of cysteines per protein in yeast is approximately six.23 There are 61 possible codons that code for 20 amino acids. Codon bias is a measure of the propensity of an organism to selectively utilize certain codons that result in the incorporation of the same amino acid residue in a growing polypeptide chain. The larger the codon bias value, the fewer the number of codons that are used to encode the protein.24 It is thought that codon bias is a measure of protein abundance because highly expressed proteins generally have large codon bias values.3 We have previously shown that codon bias appears to be an excellent indicator of the boundaries of current 2D gel proteome analysis technology.6 In yeast genes with codon bias values less than 0.1 are generally considered to be expressed at low abundance.4,8 There are thousands of yeast genes with


Figure 4. Evaluation of low abundance proteins identified in Figure 3. Panels a-f correspond to the genes of the identified proteins from Figure 3. The codon bias value distributions for identified proteins from each mixture analysis are shown. Genes with codon bias values [...]

[...] 20% acetonitrile (ACN) in the buffers to linearize peptides,28 yet only 5-10% ACN can be tolerated with the online approach. (iii) Peptide separation is superior using true chromatography instead of salt steps. (iv) UV absorbance provides both quality control and a quantitative indication of peptide amounts in each fraction. (v) Detergents and solubilizing agents (i.e., SDS, urea, etc.) are completely removed. (vi) User discretion as to which fractions are to be analyzed is available in the offline approach, and interesting fractions can be revisited (reanalyzed). We therefore conclude that the automation of sequential off-line chromatographic steps has the highest potential to reproducibly detect and identify the greatest number of (low abundance) proteins.

The experiments aimed at quantifying peptide recovery from the biotin-affinity column also demonstrated the potential of the ICAT strategy as a tool for method development. The quantitative peptide recovery of a particular protein isolation or extraction method can be easily determined by subjecting only the d0-labeled protein to the method and by spiking into the sample, immediately prior to mass spectrometric analysis, an amount of heavy (d8-labeled) peptides that is equal to the amount of starting d0-labeled protein. Such method optimizations can be performed with most types of mass spectrometers, including the MALDI-TOF-MS instrument used here.

Protein identification was performed by searching MS/MS spectra against the yeast protein database with the algorithm Sequest.16 Differential modifications to methionines (oxidation) and cysteines (both d0- and d8-labeled ICAT modifications)


were considered. No enzyme specificity was utilized in the search so that any peptide with a correct mass might be returned. Filtering of the data after the search was accomplished by applying the following constraints:10 (i) a minimum Sequest Xcorr of 1.9, 2.2, and 3.75 for peptides with a charge state of 1+, 2+, and 3+, respectively; (ii) a requirement that returned peptides have fully tryptic ends; and (iii) enforcement of a minimum ΔCn score of 0.1. For proteins identified by three or more qualifying peptides, no manual interpretation of the data was performed. For protein identification from only one or two peptides, the MS/MS spectra were manually verified using existing criteria.11 In addition, ICAT-labeled peptides were required to contain a cysteine with the correct ICAT label (heavy or light). Therefore, to the best of our knowledge, no protein or peptide was included in the datasets that was not represented by a high-quality tandem mass spectrum.

In conclusion, we have determined an optimal 3D purification strategy for ICAT-labeled peptides, established the high reproducibility of protein quantification by using the ICAT technology and mass spectrometric analysis, and shown that the combination of 3D chromatography and the ICAT approach enables the analysis of low abundance yeast proteins. The application of this approach, in conjunction with quantitative mRNA expression profiles, to the analysis of externally or internally perturbed biological systems is expected to yield new insights into the control of gene expression. Future studies are being performed that will compare the protein profiles for the proteins identified here with mRNA expression profiles from matched-sample cDNA microarray experiments.
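The post-search filtering constraints listed in the Discussion can be expressed as a small predicate. This is a sketch under stated assumptions, not the authors' software or the Sequest output format: the score name that appears garbled in the text is read as Sequest's ΔCn, charge states other than 1+ to 3+ are rejected because no threshold is given for them, the tryptic-end test ignores any special handling of proline, and the `PeptideHit` record and its field names are hypothetical.

```python
from dataclasses import dataclass

# Minimum Xcorr per precursor charge state, as listed in the text.
MIN_XCORR = {1: 1.9, 2: 2.2, 3: 3.75}
MIN_DELTA_CN = 0.1

@dataclass
class PeptideHit:
    """Hypothetical record for one database search result."""
    sequence: str       # peptide sequence without flanking residues
    prev_residue: str   # residue before the peptide in the protein, "-" at the N-terminus
    next_residue: str   # residue after the peptide in the protein, "-" at the C-terminus
    charge: int
    xcorr: float
    delta_cn: float

def is_fully_tryptic(hit):
    """Both peptide ends must be consistent with cleavage after K or R."""
    n_term_ok = hit.prev_residue in ("K", "R", "-")
    c_term_ok = hit.sequence[-1] in ("K", "R") or hit.next_residue == "-"
    return n_term_ok and c_term_ok

def passes_filter(hit):
    """Apply the Xcorr, tryptic-end, and delta-Cn constraints together."""
    threshold = MIN_XCORR.get(hit.charge)
    if threshold is None:
        return False  # assumption: unlisted charge states are discarded
    return (hit.xcorr >= threshold
            and hit.delta_cn >= MIN_DELTA_CN
            and is_fully_tryptic(hit))

hit = PeptideHit("LVNELTEFAK", "K", "T", charge=2, xcorr=2.5, delta_cn=0.15)
print(passes_filter(hit))
```

The further requirements described in the text, namely that ICAT-labeled peptides contain a correctly labeled cysteine and that proteins supported by only one or two peptides be verified manually, would be applied after this step and are not modeled here.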

Experimental Section

Synthesis of ICAT Reagents. The ICAT reagents were either synthesized in house12 or obtained from Applied BioSciences (Framingham, MA).

Yeast Strain and Growth Conditions. The source of protein for all experiments was the yeast strain YPH499 (MATa ura3-52 lys2-801 ade2-101 leu2-Δ1 his3-Δ200 trp1-Δ63). Cells were grown to log phase (2 × 10^7 cells/mL) in YP-rich medium with either 2% galactose or 2% ethanol at 30 °C. Protein was harvested as described.12 A 2.2 mg portion of harvested protein from each state was reduced (5 mM tributyl phosphine), denatured (6 M guanidine HCl, 50 mM Tris buffer, pH 8.5), and alkylated with the heavy and light ICAT reagents, respectively.12 The denatured, ICAT-labeled protein mixtures were combined and then diluted 8-fold in 10 mM Tris buffer and digested with 40 µg of sequencing-grade modified trypsin (Promega, Madison, WI) overnight at 37 °C.

Peptide Recovery during ICAT Analysis. Bovine α-lactalbumin (LCA) protein (50 µg) was denatured, reduced, and alkylated at cysteinyl residues using a 5-fold excess of the d0-ICAT reagent followed by gel filtration and trypsin digestion as described.12 An aliquot corresponding to 10 pmol of peptide mixture was then passed over the avidin column. Elution conditions were the following: (1) 30% methanol and 0.4% formic acid, (2) 30% methanol and 0.4% TFA, (3) 30% acetonitrile and 0.4% TFA. Recovery was assessed by spiking into the eluent 10 pmol of trypsinized d8-ICAT-labeled LCA protein followed by rapid MS analysis of peptide ion ratios by MALDI-TOF-MS (DE-STR, PE Biosystems, Framingham, MA). A control with a 1:1 ratio of d0/d8-labeled peptide was obtained when the d8-labeled peptide solution was spiked prior to biotin affinity purification. Recoveries were determined independently for seven cysteine-containing peptides with the


following sequences (shown as peptide letter, amino acid position including signal peptide): a, 128-133; b, 25-29; c, 78-81; d, 82-98; e, 134-141; f, 99-112; g, 36-77. Only the third elution condition produced excellent recoveries of all seven peptides.

Reproducibility of Protein Profiling of Carbon Source Change with the ICAT Strategy. This experiment involved the exact replication in triplicate (including growth and harvesting of yeast cells) of a previously published experiment.12 Protein expression profiles for yeast growing on galactose were compared with yeast growing on ethanol. The four complete datasets of identified and quantified protein levels were used to assess the reproducibility of the ICAT strategy.

One-Dimensional Reversed-Phase Chromatography with On-Line Mass Spectrometry. Fused silica microcapillary columns (100 µm i.d. × 12 cm) were in-house packed with Magic C18 (5 µm, 200 Å) spherical silica (Michrom BioResources, Auburn, CA). A flame-pulled tip (5 µm diameter) at one end of the fused silica served the dual purpose of retaining beads and acting as a needle for electrospray ionization. The voltage (1.8 kV) was applied behind the column through a gold wire into one arm of a microcross (Upchurch Scientific, Oak Harbor, WA). The other three arms were used as a receiver for the HPLC flow (75 µL/min), a fused-silica flow restrictor (50 µm i.d. × 50 cm) which passed 74.5 µL/min to waste, and the packed capillary column (500 nL/min flow), which was connected to the inlet for the MS. The capillary column was loaded with the yeast complex peptide mixture offline (20 µg peptides) via a pressure cell and then reconnected to the system.
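The flow rates quoted for the microcross follow from a simple mass balance between the pump, the restrictor, and the column. The sketch below reproduces that arithmetic with a hypothetical helper. It assumes all pump flow leaves through either the restrictor or the column.

```python
def split_flow(pump_ul_min, column_nl_min):
    """Return (waste flow in µL/min, split ratio) for a flow-splitting tee or cross.

    Assumes the pump flow divides only between the restrictor (waste)
    and the packed capillary column.
    """
    column_ul_min = column_nl_min / 1000.0
    waste_ul_min = pump_ul_min - column_ul_min
    split_ratio = pump_ul_min / column_ul_min
    return waste_ul_min, split_ratio

waste, ratio = split_flow(pump_ul_min=75.0, column_nl_min=500.0)
print(f"waste: {waste} µL/min, split ratio: {ratio:.0f}:1")
```

With the values from the text, 74.5 µL/min goes to waste and the split ratio is 150:1, so only a small fraction of the solvent delivered by the pump reaches the column.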
After washing for 5 min with solvent A [5% acetonitrile, 0.4% acetic acid, and 0.005% heptafluorobutyric acid (HFBA)], a binary gradient of 5-80% solvent B (95% acetonitrile, 0.4% acetic acid, and 0.005% HFBA) was developed with an HP1100 solvent delivery system (Hewlett-Packard, Palo Alto, CA). Functional chromatography was achieved, and eluting peptides were analyzed by an LCQ Classic ion-trap mass spectrometer (ThermoFinnigan, San Jose, CA). Two different MS strategies were utilized, with the second strategy designed to maximize the number of peptides sequenced during the run. The first strategy kept the ion trap switching back and forth between the MS (detecting peptide ion mass-to-charge ratios) and the MS/MS (sequencing) modes. Each scan consumed between 1 and 2 s, with an average of 1.3 s/scan. This strategy was applied throughout a 1-h analysis (collecting approximately 1300 sequencing attempts) using the LC gradient profile described above. The second strategy employed one MS scan followed directly by five MS/MS scans on the five most intense ions from the MS spectrum. This strategy was applied throughout a 2-h analysis (collecting approximately 4400 sequencing attempts), also using the gradient profile described above. In both strategies, peptide ions for which sequencing information had been collected were dynamically excluded from reanalysis for 1 min.

The acquired MS/MS spectra were automatically searched against a database of known and predicted yeast proteins with the Sequest algorithm.16 No enzyme parameter was chosen, and oxidized methionines and ICAT-labeled cysteines (both heavy and light versions) were examined as static modifications. No difference was found between the relative numbers of oxidized methionine residues detected during the 1D, 2D, and 3D experiments. A peptide was considered to be a match by utilizing criteria described in the discussion and elsewhere.10 Often multiple peptides from the same protein were detected during the analysis, which permitted completely unambiguous protein identification with no manual examination of spectra.

SCX and Two-Dimensional Chromatographic Separation of Peptides. The same peptide mixture as that described for the 1D separation was utilized. ICAT-labeled peptides (2.2 mg from each state, 4.4 mg total) were subjected to strong cation exchange (SCX) chromatography as described.27 The separation took place on an Integral chromatography system (Applied BioSciences, Framingham, MA) during a 1-h gradient on a polysulfoethyl aspartamide column (2.1 × 150 mm, PolyLC, Columbia, MD) at a flow rate of 200 µL/min with fraction collection every minute: solvent A, 25% acetonitrile (ACN), 5 mM phosphate buffer pH 2.7; solvent B, 25% ACN, 5 mM phosphate buffer pH 2.7 with 350 mM KCl. Four individual fractions (numbers 10, 13, 16, and 27) were then concentrated to 50 µL. A 25-µL portion was reserved for the 3D chromatographic separation. Of the remaining 25 µL, the maximum peptide loading onto the RP capillary column was determined based on the UV trace for that fraction (2, 2, 5, and 20 µL for fractions 10, 13, 16, and 27, respectively). The second-dimension separation after peptide loading was exactly the same as that described for the RP-LC-MS/MS technique above. Peptide and protein identifications from the analyses of the four SCX fractions were combined.

Three-Dimensional Peptide Separation. The first-dimension separation was SCX chromatography as described above. Affinity chromatography using monomeric avidin agarose columns was performed on 50% of the solution from the SCX fractions. Elution of biotinylated (ICAT-labeled) peptides was accomplished with 30% acetonitrile, 0.4% TFA in 0.5 mL. The volume was reduced to 25 µL, and 25 µL was analyzed by RP-LC-MS/MS as described above.
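The numbers of sequencing attempts quoted for the two MS strategies can be checked against the scan timing with a rough budget. The sketch below is an estimate under stated assumptions, not a description of the instrument software: the function name is hypothetical, every scan is assumed to take the average 1.3 s (the text reports this average for the first strategy only), and the whole run is assumed to be spent scanning.

```python
def estimated_sequencing_attempts(run_minutes: float,
                                  seconds_per_scan: float,
                                  msms_per_ms_scan: int) -> float:
    """Estimate the number of MS/MS scans acquired during a run.

    Assumes a repeating cycle of one MS scan followed by
    `msms_per_ms_scan` MS/MS scans, each lasting `seconds_per_scan`.
    """
    if seconds_per_scan <= 0 or msms_per_ms_scan < 1:
        raise ValueError("scan time must be positive and the cycle must "
                         "contain at least one MS/MS scan")
    total_scans = run_minutes * 60.0 / seconds_per_scan
    scans_per_cycle = 1 + msms_per_ms_scan
    return total_scans * msms_per_ms_scan / scans_per_cycle


if __name__ == "__main__":
    # Strategy 1: alternating MS and MS/MS scans over a 1-h analysis.
    print(round(estimated_sequencing_attempts(60, 1.3, 1)))
    # Strategy 2: one MS scan then five MS/MS scans over a 2-h analysis.
    print(round(estimated_sequencing_attempts(120, 1.3, 5)))
```

Under these assumptions, the estimates come out slightly above the reported values of approximately 1300 and 4400 attempts. That is the expected direction if some run time is not spent on productive scans, although the text does not state the cause of the difference.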

Acknowledgment. This work was supported by NIH Grant No. HG00041 to S.G., a grant from the Giovanni Armenise-Harvard Foundation to S.G., a grant to R.A. from Merck Genome Research Institute, a fellowship to B.R. from the Swiss National Science Foundation, and an NIH Postdoctoral Genome Training Grant fellowship to T.G., with additional support from the National (USA) Cancer Institute Grant No. 1R33CA84698.

References
(1) Aebersold, R.; Goodlett, D. R. Chem. Rev. 2001, 101, 269-295.
(2) Peng, J.; Gygi, S. P. J. Mass Spectrom. 2001, 36, 1083-1091.
(3) Garrels, J. I.; McLaughlin, C. S.; Warner, J. R.; Futcher, B.; Latter, G. I.; Kobayashi, R.; Schwender, B.; Volpe, T.; Anderson, D. S.; Mesquita, F. R.; Payne, W. E. Electrophoresis 1997, 18, 1347-60.
(4) Gygi, S. P.; Rochon, Y.; Franza, B. R.; Aebersold, R. Mol. Cell. Biol. 1999, 19, 1720-1730.
(5) Bennetzen, J. L.; Hall, B. D. J. Biol. Chem. 1982, 257, 3026-31.
(6) Gygi, S. P.; Corthals, G. L.; Zhang, Y.; Rochon, Y.; Aebersold, R. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 9390-5.
(7) Shevchenko, A.; Jensen, O. N.; Podtelejnikov, A. V.; Sagliocco, F.; Wilm, M.; Vorm, O.; Mortensen, P.; Shevchenko, A.; Boucherie, H.; Mann, M. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 14440-5.
(8) Futcher, B.; Latter, G. I.; Monardo, P.; McLaughlin, C. S.; Garrels, J. I. Mol. Cell. Biol. 1999, 19, 7357-68.
(9) Perrot, M.; Sagliocco, F.; Mini, T.; Monribot, C.; Schneider, U.; Shevchenko, A.; Mann, M.; Jeno, P.; Boucherie, H. Electrophoresis 1999, 20, 2280-98.
(10) Washburn, M. P.; Wolters, D.; Yates, J. R. Nat. Biotechnol. 2001, 19, 242-247.
(11) Link, A. J.; Eng, J.; Schieltz, D. M.; Carmack, E.; Mize, G. J.; Morris, D. R.; Garvik, B. M.; Yates, J. R. Nat. Biotechnol. 1999, 17, 676-682.
(12) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.

Journal of Proteome Research • Vol. 1, No. 1, 2002 53

(13) Gygi, S. P.; Aebersold, R. Proteomics: A Trends Guide 2000, 31-35.
(14) Gygi, S. P.; Rist, B.; Aebersold, R. Curr. Opin. Biotechnol. 2000, 11, 396-401.
(15) De Leenheer, A. P.; Thienpont, L. M. Mass Spectrom. Rev. 1992, 11, 249-307.
(16) Eng, J.; McCormack, A. L.; Yates, J. R. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.
(17) Hennion, M. C. J. Chromatogr. A 1999, 856, 3-54.
(18) Shabanowitz, J.; Settlage, R. E.; Marto, J. A.; Christian, R. E.; White, F. M.; Russo, P. S.; Martin, S. E.; Hunt, D. F. In Mass Spectrometry in Biology and Medicine; Burlingame, A. L., Ed.; Humana Press: Totowa, 2000; pp 163-177.
(19) Washburn, M. P.; Yates, J. R. Proteomics: A Trends Guide 2000, 28-32.
(20) Davis, M. T.; Beierle, J.; Bures, E. T.; McGinley, M. D.; Mort, J.; Robinson, J. H.; Spahr, C. S.; Yu, W.; Luethy, R.; Patterson, S. D. J. Chromatogr. B Biomed. Sci. Appl. 2001, 752, 281-91.
(21) Lipshutz, R. J.; Fodor, S. P.; Gingeras, T. R.; Lockhart, D. J. Nat. Genet. 1999, 21, 20-4.


(22) Spahr, C. S.; Susin, S. A.; Bures, E. J.; Robinson, J. H.; Davis, M. T.; McGinley, M. D.; Kroemer, G.; Patterson, S. D. Electrophoresis 2000, 21, 1635-50.
(23) Goodlett, D. R.; Bruce, J. E.; Anderson, G. A.; Rist, B.; Pasa-Tolic, L.; Fiehn, O.; Smith, R. D.; Aebersold, R. Anal. Chem. 2000, 72, 1112-8.
(24) Kurland, C. G. FEBS Lett. 1991, 285, 165-9.
(25) Velculescu, V. E.; Zhang, L.; Zhou, W.; Vogelstein, J.; Basrai, M. A.; Bassett, D. E., Jr.; Hieter, P.; Vogelstein, B.; Kinzler, K. W. Cell 1997, 88, 243-51.
(26) Griffin, T. J.; Gygi, S. P.; Rist, B.; Aebersold, R.; Loboda, A.; Jilkine, A.; Ens, W.; Standing, K. G. Anal. Chem. 2001, 73, 978-86.
(27) Gygi, M. P.; Licklider, L. J.; Peng, J.; Gygi, S. P. In Protein Analysis: A Laboratory Manual; Simpson, R., Ed.; Cold Spring Harbor: New York City, 2001; in press.
(28) Burke, T. W.; Mant, C. T.; Black, J. A.; Hodges, R. S. J. Chromatogr. 1989, 476, 377-89.
