
Selective Enrichment of Tryptophan-Containing Peptides from Protein Digests Employing a Reversible Derivatization with Malondialdehyde and Solid-Phase Capture on Hydrazide Beads

Alexandra Foettinger,†,‡ Alexander Leitner,*,† and Wolfgang Lindner†

Department of Analytical Chemistry and Food Chemistry, University of Vienna, Waehringer Strasse 38, 1090 Vienna, Austria, and Baxter AG, Biomedical Research Center, Uferstrasse 15, 2304 Orth/Donau, Austria

Received May 11, 2007

A method for the selective enrichment of tryptophan-containing peptides from complex peptide mixtures such as protein digests is presented. It is based on the reversible reaction of tryptophan with malondialdehyde and trapping of the derivatized Trp-peptides on hydrazide beads via the free aldehyde group of the modified peptides. The peptides are subsequently recovered in their native form by specific cleavage reactions for further (mass spectrometric) analysis. The method was optimized and evaluated using a tryptic digest of a mixture of 10 model proteins, demonstrating a significant reduction in sample complexity while still allowing the identification of all proteins. The applicability of the tryptophan-specific enrichment procedure to complex biological samples is demonstrated for a total yeast cell lysate. Analysis of the processed fraction by 1D-LC-MS/MS confirms the specificity of the enrichment procedure, as more than 85% of the peptides recovered from the enrichment step contained tryptophan. The reduction in sample complexity also resulted in the identification of additional proteins in comparison to the untreated lysate.

Keywords: mass spectrometry • chemical tagging • malondialdehyde • tryptophan-containing peptides • enrichment • hydrazide beads

* To whom correspondence should be addressed. Dr. Alexander Leitner, Department of Analytical Chemistry and Food Chemistry, University of Vienna, Waehringer Strasse 38, 1090 Vienna, Austria. Phone, +43 1 4277 52322; fax, +43 1 4277 9523; E-mail, [email protected].
† University of Vienna.
‡ Biomedical Research Center.
10.1021/pr0702767 CCC: $37.00 © 2007 American Chemical Society
Journal of Proteome Research 2007, 6, 3827-3834. Published on Web 07/27/2007

Introduction

Because of the enormous complexity of the samples usually examined in proteomic studies, separation and simplification of these mixtures has become a major task. In most cases, one-dimensional separations prior to MS analysis are not sufficient for the analysis of such complex mixtures, even with high-end mass spectrometers. Traditionally, two-dimensional gel electrophoresis is used for protein separation, with the benefit of its high resolving power but also with several drawbacks, such as limitations in the pI and molecular weight of the proteins that can be separated under typical conditions. Various methods to overcome these problems have been developed over the last years based on chromatographic and electrophoretic separation techniques on the protein as well as on the peptide level.1-3 Two-dimensional (2D) HPLC methods have been applied in different variations. These include the combination of orthogonal chromatographic techniques4 (e.g., cation exchange and reversed phase chromatography (RPC) or RPC at different pH values) used at the peptide level, which can be performed in an on-line or off-line fashion. Alternatively, a separation step at the protein level followed by reversed phase separation of the peptides after enzymatic digestion of the fractions may be performed.5 Recently, even three chromatographic dimensions have been combined to deal with highly complex mixtures.6 Often, the resolving power of these techniques is still considered insufficient; besides, multidimensional fractionation results in a high number of sample fractions to be analyzed individually, increasing the workload and the demand for instrument time.

Therefore, various chemoselective affinity enrichment methods directed toward certain amino acid residues have been developed.7-9 These methods are based on the recognition either of chemically introduced targets or of functional groups already present in the analytes of interest. The isotope coded affinity tag (ICAT) was the first widely used method that combines affinity enrichment and stable isotope labeling for relative quantification.10-11 Other methods based on chemical reactions focus on cysteine,12 arginine,13 tryptophan,14 methionine,15 and on peptides carrying posttranslational modifications such as phosphorylation16-19 and glycosylation.20-21 Many of these methods are directed toward low-abundance amino acids to achieve a high degree of simplification while maintaining enough information for reliable protein identification.

Tryptophan is also one of these low-abundance amino acids with an occurrence of ∼1%,22 making it the rarest of the 20 proteinogenic amino acids. Nevertheless, approximately 90% (depending on the species) of proteins contain at least one tryptophan residue in their sequence, which is comparable to the occurrence of cysteine residues.23 Conceptually, it is therefore of interest to target Trp-containing peptides for the development of selective tagging and trapping strategies.

Kuyama et al. used 2-nitrobenzenesulfenyl chloride to label tryptophan residues in peptides for relative quantification.14 They prepared an isotopically labeled version (containing six 13C atoms) of this well-known reagent that has often been applied for the determination of the tryptophan content in proteins or for investigating active sites of enzymes.24 To isolate the tagged peptides, the increased hydrophobicity upon modification of tryptophan residues was utilized. Recently, an improved enrichment strategy using a phenyl column and the application to biological samples have been described.25-26 2-(Trifluoromethyl)benzenesulfenyl chloride has been introduced as an alternative tagging reagent by Amster and coworkers.27 Several other (more or less) selective tryptophan modification agents are known, like N-bromosuccinimide,28 2-hydroxy-5-nitrobenzyl bromide,29 or rhodium carbenoids,30 but they have not been applied to proteomic samples. A kit for the enrichment of tryptophan-containing peptides has been made commercially available,31 but to our knowledge, no application has been described in the literature. The so-called “Pi3” enrichment kit is also based on the sulfenyl chloride chemistry, with the reactive group being immobilized on a solid support. For cleavage of the trapped Trp-peptides under reducing conditions, a linker containing a disulfide bond is incorporated, and the Trp-peptides are recovered carrying an SH-substituent in the indole ring.

Figure 1. Reaction scheme for the enrichment of tryptophan-containing peptides. Derivatization with MDA, capture on hydrazide beads, cleavage of covalently bound peptides with hydrazine or pyrrolidine.

We have recently investigated the reaction of the indole group of tryptophan with malondialdehyde (MDA) and applied it to the derivatization of tryptophan residues in peptides.32 This reaction was first described in the 1960s by Teuber et al.33,34 One of the aldehyde groups of MDA reacts with the indole nitrogen under elimination of water, and a reactive acrolein-like, α,β-unsaturated aldehyde is formed from the second aldehyde group. We found that this group can further react with hydrazide-containing compounds but is cleaved by incubation with hydrazine or amines such as pyrrolidine32 (see Figure 1).

Solid supports with immobilized hydrazide groups have been used for the immobilization of proteins (e.g., antibodies) via their oxidized carbohydrate moieties for the preparation of affinity chromatography sorbents.35 More recently, a selective enrichment method for glycoproteins for proteomics has been developed using this concept by Zhang et al.20 Hydrazide chemistry has also been used by Regnier’s group for the enrichment of proteins carrying carbonyl groups as a result of oxidative stress events in vivo or in vitro.36

We have utilized the reaction of tryptophan with MDA under acidic conditions, outlined in Figure 1, for the development of a solid-phase capture and release method based on such hydrazide affinity materials. The method was first optimized with tryptophan-containing standard peptides and tryptic digests of model proteins. The Trp-selective enrichment method was then successfully applied to a digest of soluble yeast proteins as an example for a highly complex mixture of biological origin.

Journal of Proteome Research • Vol. 6, No. 9, 2007

Experimental Section

Materials. All peptides were purchased from Bachem (Weil am Rhein, Germany); proteins were either from Sigma (Steinheim, Germany) or from Fluka (Buchs, Switzerland). The derivatizing reagent 1,1,3,3-tetramethoxypropane (TMP) was bought from Aldrich (Steinheim, Germany). The solid support hydrazide materials, CarboLink Coupling Gel (agarose) and UltraLink Hydrazide Gel (acryl), were supplied by Pierce (Rockford). CelLytic-Y yeast cell lysis/extraction reagent was purchased from Sigma. All common chemicals and solvents were obtained from either Fluka, Riedel-de Haen (Seelze, Germany), Sigma, or VWR (Vienna, Austria).

Trypsin Digestion of Model Proteins. For each protein, ∼300 µg was dissolved in 100 µL of 8 M urea. Reduction and alkylation of disulfide bonds were performed with DTT and iodoacetamide according to standard protocols. The samples were then diluted to 1 mL with 50 mM aqueous ammonium bicarbonate solution, trypsin (proteomics grade, Sigma) was added in a ratio of 1:50 (w/w), and digestion was performed at 37 °C overnight. The digest solutions were purified by solid-phase extraction using 100 mg C-18 cartridges (Phenomenex, Torrance, CA). Briefly, the diluted reaction mixture was applied to the SPE cartridges followed by a washing step with 2 mL of water. Peptides were eluted with 500 µL of water/acetonitrile 50/50 (v/v) and 500 µL of water/acetonitrile 25/75 (v/v), both containing 0.1% TFA. The combined eluates were then dried in a stream of nitrogen, and the peptides were derivatized with MDA and applied to the hydrazide beads as described below.

Preparation of Soluble Yeast Proteins. Three grams (wet weight) of baker’s yeast (bought at a local supermarket) were lysed in 7.5 mL of CelLytic-Y for 30 min at room temperature. Insoluble material was pelleted by centrifugation (8000g, 10 min, 4 °C), and proteins were precipitated overnight at -20 °C after addition of 24 mL of ice-cold acetone. The precipitate was centrifuged as above, and the pellet was air-dried. Reduction, alkylation, and trypsin digestion of yeast proteins were performed as described above.

Labeling of Peptides with MDA. For model peptides, 50 µg of peptides were dissolved in 50 µL of 80% trifluoroacetic acid (TFA) in water, and 0.9 µL of TMP (MDA dimethylacetal) was added; for standard proteins and yeast proteins, 100 µL of 80% TFA and 1.8 µL of TMP were used per mg of crude protein. In all cases, the mixture was diluted approximately 1:20 with water after a reaction time of 1 h at room temperature, and the reagent excess was removed by solid-phase extraction using 100 mg C-18 cartridges as described above.

Optimized Enrichment Procedure for Tryptophan-Containing Peptides. Typically, 200 µL of hydrazide gel (400 µL of suspension) was washed twice with 100 mM aqueous ammonium acetate solution adjusted to pH 5 with acetic acid. The MDA-treated peptide mixtures (concentration up to 2.5 mg mL⁻¹) were dissolved in 200 µL of 100 mM ammonium acetate pH 5 and added to the washed beads, and the mixture was vortexed for at least 4 h or overnight. After centrifugation to settle the beads, the supernatant was collected and the beads were washed twice with 400 µL of methanol, twice with 400 µL of 8 M urea, and once with 400 µL of 100 mM aqueous ammonium acetate pH 5, for 10 to 15 min each.

For hydrazine-induced cleavage, the beads were shaken at 50 °C on a thermomixer (Eppendorf, Hamburg, Germany), twice with 400 µL of 20 mM hydrazine dihydrochloride in 100 mM aqueous ammonium acetate pH 3 (adjusted with acetic acid) for 1 h, and twice with 200 µL of this solution containing 50% (v/v) of methanol for 30 min. Alternatively, cleavage was induced by pyrrolidine as follows: The beads were shaken at room temperature, first twice with 400 µL of 500 mM pyrrolidine in water (for 1 h), then twice with 200 µL of 500 mM pyrrolidine in water/methanol, 50:50 (v/v), for 30 min.

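
The labeling recipe for protein digests given in the Experimental Section (100 µL of 80% TFA and 1.8 µL of TMP per mg of crude protein) is linear in the amount of protein, so the volumes for a particular sample follow by simple scaling. The Python sketch below illustrates this. The names `LabelingMix` and `labeling_mix` are hypothetical, and reading "diluted approximately 1:20" as a 20-fold increase of the reaction volume is an assumption, not something the protocol states explicitly.

```python
from dataclasses import dataclass

# Per-mg quantities quoted in the protocol for standard and yeast proteins.
TFA_UL_PER_MG = 100.0   # µL of 80% TFA per mg crude protein
TMP_UL_PER_MG = 1.8     # µL of 1,1,3,3-tetramethoxypropane per mg crude protein


@dataclass(frozen=True)
class LabelingMix:
    tfa_80_ul: float         # volume of 80% TFA in water
    tmp_ul: float            # volume of TMP reagent
    diluted_total_ul: float  # approximate volume after dilution with water


def labeling_mix(protein_mg: float, dilution_factor: float = 20.0) -> LabelingMix:
    """Scale the MDA labeling recipe to a given amount of crude protein.

    Assumption: "1:20" is interpreted as final volume = 20 x reaction volume.
    """
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    tfa = TFA_UL_PER_MG * protein_mg
    tmp = TMP_UL_PER_MG * protein_mg
    return LabelingMix(tfa, tmp, (tfa + tmp) * dilution_factor)


if __name__ == "__main__":
    print(labeling_mix(2.0))
```

The diluted volume matters in practice because it is the volume that has to be loaded onto the 100 mg C-18 cartridge in the subsequent cleanup step.
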
In all cases, the corresponding fractions were pooled and the solvent was evaporated under a stream of nitrogen for concentration. NOTE: Due to the toxicity of hydrazine, the elution and concentration steps should be performed in a ventilated fume hood.

Liquid Chromatography-Tandem Mass Spectrometry. All digest samples were analyzed by nanoLC-MS/MS on an Agilent 1100 series nanoLC system (Agilent Technologies, Waldbronn, Germany) and an Agilent MSD Trap SL quadrupole ion trap mass spectrometer equipped with an orthogonal nanoelectrospray source. For the untreated samples, an equivalent of approximately 1 µg of total protein was loaded onto the column. For samples obtained from the Trp-enrichment procedure, 10% of the recovered amount was injected, corresponding to a maximum of 50 µg of total protein as starting material. Samples were injected onto an Agilent Zorbax 300 SB-C18 trapping column (5 mm × 300 µm ID), which was washed for 5 min with water/acetonitrile 98/2 (v/v) containing 0.1% formic acid at 30 µL min⁻¹. Then the sample was eluted from the trapping column by back-flushing onto an Agilent Zorbax 300 SB-C18 nanoLC column (15 cm × 75 µm, 3 µm particle size) with a flow rate of 250 nL min⁻¹. Gradient elution was performed using the solvents A = water with 0.1% (v/v) formic acid and B = acetonitrile with 0.1% (v/v) formic acid and the following gradient table: 0-5 min 5% B, 5-90 min from 5 to 40% B, 90-100 min from 40 to 80% B, 100-105 min 80% B, 105-110 min from 80 to 5% B, and 110-115 min 5% B. MS/MS spectra were acquired in the data-dependent mode, selecting the two most abundant ions from each full scan in the m/z range of 400-1500 for fragmentation, with an exclusion time of 2 min after two spectra. MS/MS spectra were acquired in the range from m/z 200 to 2000.

Database Search and Data Validation. MS/MS spectra from all analyses were extracted by the Data Analysis software (version 2.2, Bruker Daltonics, Bremen, Germany) provided with the instrument, and peak lists were exported in the MGF (Mascot generic file) format. Database search was performed using an in-house license of Mascot (version 2.0.5, http://www.matrixscience.com). MS/MS spectra from model protein digests were searched against the UniProt/Swiss-Prot database (version 49.1, dated 21/02/2006, 208005 entries). Search settings were as follows: Taxonomy = chordata; enzyme = trypsin, allowing up to one missed cleavage; carbamidomethylation of Cys as permanent modification and methionine oxidation as variable modification; mass tolerances of ±1.5 Da (precursor ions) and ±0.8 Da (product ions), respectively; ESI-trap as instrument and charge states 1+/2+/3+. The protein hits were validated manually: peptides with an individual ion score ≥20, a minimum of 3 consecutive b- or y-ions, and an expectation value of ≤1 were considered as positive identifications. A list of identified Trp-containing peptides from the model proteins is provided in Supporting Table 1 (see Supporting Information).

For yeast data, a custom FASTA database was created from the above UniProt/Swiss-Prot database by the program DBToolkit (version 3.1.1, obtained from http://genesis.ugent.be/dbtoolkit/).37 All entries were filtered with the header filter “YEAST”, resulting in 5278 entries, and the sequence of porcine trypsin was appended to the database to identify potential autolysis peptides (only one was found during the whole study). A reversed database was created using the “decoy.pl” Perl script obtained from http://www.matrixscience.com, and a concatenated forward/reverse database was produced. All Mascot search parameters were as listed above, with the exception of charge state, which was set to 2+/3+ only. The Mascot output was then filtered as follows: Significance threshold = p < 0.05, scoring scheme = standard, ions score cutoff (for individual peptides) = 20. In addition, only top-ranked peptides were retained. All peptide and protein assignments from the yeast sample are listed in Supporting Tables 2 (untreated lysate) and 3 (Trp-enriched lysate) (see Supporting Information). A comparison of all protein identifications, including hits from the reversed database, is provided as an Excel file (Supporting Table 4, see Supporting Information).
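
The validation rules described above lend themselves to a compact illustration. The following Python sketch encodes the three manual acceptance criteria (ion score ≥20, at least 3 consecutive b- or y-ions, expectation value ≤1), shows how a reversed decoy entry can be derived from a forward sequence, and computes the share of decoy hits among accepted identifications. It is a sketch under stated assumptions only: the `PeptideHit` record, the `REV_` accession prefix, and the simple decoy-share estimate are hypothetical and do not reproduce the output format of Mascot or the exact behavior of the decoy.pl script.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PeptideHit:
    sequence: str
    ion_score: float
    expectation: float
    max_consecutive_ions: int  # longest run of consecutive b- or y-ions
    is_decoy: bool = False     # True if the hit maps to a reversed entry


def passes_manual_criteria(hit: PeptideHit) -> bool:
    """Apply the three acceptance criteria quoted in the text."""
    return (
        hit.ion_score >= 20
        and hit.max_consecutive_ions >= 3
        and hit.expectation <= 1
    )


def reverse_entry(accession: str, sequence: str, prefix: str = "REV_") -> tuple[str, str]:
    """Build a decoy entry by reversing the sequence (prefix is hypothetical)."""
    return prefix + accession, sequence[::-1]


def decoy_share(accepted: Iterable[PeptideHit]) -> float:
    """Fraction of accepted hits that come from the reversed database."""
    hits = list(accepted)
    if not hits:
        return 0.0
    return sum(h.is_decoy for h in hits) / len(hits)


if __name__ == "__main__":
    hits = [
        PeptideHit("GYSLGNWVCAAK", 45.0, 0.01, 6),
        PeptideHit("AAK", 12.0, 3.0, 1),
        PeptideHit("KAACVW", 22.0, 0.8, 3, is_decoy=True),
    ]
    accepted = [h for h in hits if passes_manual_criteria(h)]
    print(len(accepted), decoy_share(accepted))
```

Searching a concatenated forward/reverse database means that every accepted decoy hit is, by construction, a false identification, which is why the decoy share gives a rough handle on the false positive rate.
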

Results and Discussion

Principle of the Enrichment Concept. Based on a reaction concept that we have previously outlined,32 a Trp-specific enrichment concept was developed that includes the following steps (see Figures 1 and 2):

(1) Labeling of a peptide mixture with MDA under conditions that prevent both the formation of Schiff bases with amino groups and the derivatization of the guanidino group of arginine residues. This is possible by using 80% TFA as a reaction medium for labeling, where imines are unstable and formation of a pyrimidine ring at Arg residues38 does not occur.

(2) Removal of excess reagent and immobilization of MDA-derivatized Trp-peptides onto hydrazide beads. Following a solid-phase cleanup step, the derivatized sample is applied to a solid support carrying immobilized hydrazide groups, as previously used for the capture of other peptides carrying carbonyl groups (see Introduction).

(3) Stringent washing to remove any nonspecifically bound peptides. Because a covalent hydrazone bond is formed between derivatized Trp-peptides and the hydrazide beads, washing with high concentrations of organic solvents and/or chaotropic salts is possible without risking the premature elution of the captured peptides.

(4) Specific cleavage of Trp-peptides by either hydrazine or pyrrolidine. These two cleavage reagents were previously found to result in the cleavage of the N-C bond at the nitrogen of the indole system via two different mechanisms.32 Any other peptides carrying carbonyl groups (such as those resulting from oxidation processes in vivo) that might be bound as well are not affected by the chemoselective cleavage step and do not interfere with further analysis of the Trp-enriched fraction.

(5) Analysis of the released peptide fraction by LC-MS and/or -MS/MS.

Figure 2. Overview of the experimental workflow used in the enrichment procedure.

Optimization of the Experimental Protocol. As a first step, the method was adjusted and optimized on the peptide level using eight standard peptides: four Trp-containing peptides (WAGGDASGE, HPKRPWIL, KPQLWP, and LWMR) and four Trp-free peptides (PHPFHFFVYK, PyrLYENK, RPPGFSPFR, and DRVYIHPFHL) as negative controls. Two different hydrazide materials were used, based on agarose and acrylamide supports, respectively. We optimized the protocol concerning reaction times, binding buffer and pH, washing solution, and elution (cleavage) conditions (pH, buffer, cleaving reagent). The enrichment procedure is performed in a batch format by shaking the beads because of the relatively long reaction times required. The volume of the binding solution was kept as low as possible because we found this to be crucial for achieving better binding. An ammonium acetate buffer pH 5 and an approximate 100-fold molar excess of hydrazide groups were found to give the best results. A minimal reaction time of 4 h was necessary to achieve sufficient binding, but carrying out the capture step overnight is also feasible. Performing the immobilization reaction at higher temperatures did not show any improvement for the binding step. Washing was performed with the binding buffer containing methanol as organic modifier (∼20%) because nonspecific binding was not so critical at this stage. Peptides were subsequently cleaved by a hydrazine hydrochloride solution at pH 3. Using elevated temperatures was found to accelerate cleavage; thus, the sample was incubated at 50 °C for this step.

With this protocol, more than 90% of the recovered Trp-containing peptides were found in the elution fraction, and a maximum of 6% of Trp-free peptides were nonspecifically bound and detected in the elution fraction. Absolute recovery values for tryptophan peptides varied from peptide to peptide but were found to be in the range of 50-80%, which is comparable to the (few) data reported for similar enrichment schemes.39-41 Reproducibility of the procedure was also quite good, with a relative standard deviation of around 10%. Among the two hydrazide supports (acryl and agarose) that were compared, the acryl-based material was found more suitable, as reduced nonspecific binding was observed and recovery rates were at least as good as for the agarose material. It was therefore used exclusively for further experiments.

Performance Evaluation Using Model Protein Digests. As a next step, the developed method was applied to a tryptic digest of 10 model proteins (see Table 1). As could be expected, we observed a higher degree of nonspecific binding when using the same protocol as for the simple peptide mixture, due to Trp-free peptides being present in large excess, roughly a factor of 10.

Table 1. Comparison of Peptide Identifications Obtained for the 10 Protein Digest Mixture Before and After Performing the Trp-Peptide Enrichment Method(a)

                                     before enrichment                after enrichment
protein                              Trp peptides  Trp-free peptides  Trp peptides  Trp-free peptides
α1-Acid glycoprotein 1, human        not detected                     2             0
Carbonic anhydrase 2, bovine         1             3                  2             0
κ-Casein, bovine                     not detected                     1             0
α-Lactalbumin, bovine                1             1                  6             0
β-Lactoglobulin, bovine              0             5                  1             1
Lactotransferrin, bovine             1             9                  3             0
Lysozyme C, chicken                  2             2                  4             0
Myoglobin, equine                    0             6                  1             0
Ovalbumin, chicken                   1             6                  2             0
Transferrin, human                   0             2                  1             0
total number of peptides identified  6             34                 23            1
total number of proteins identified  8                                10

(a) All corresponding identifications of Trp-peptides are listed in Supporting Table 1.

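
The effect of the enrichment on the composition of the identified peptide population can be read directly from the counts in Table 1. The short Python sketch below tallies those counts for the untreated and the enriched digest and reports the share of Trp-containing peptides in each case. The dictionary layout and the function name `summarize` are illustrative choices, and protein names are written in ASCII for convenience.

```python
# Counts transcribed from Table 1 as (Trp peptides, Trp-free peptides).
# First element: before enrichment; second: after. None = not detected.
TABLE_1 = {
    "alpha1-Acid glycoprotein 1, human": (None, (2, 0)),
    "Carbonic anhydrase 2, bovine": ((1, 3), (2, 0)),
    "kappa-Casein, bovine": (None, (1, 0)),
    "alpha-Lactalbumin, bovine": ((1, 1), (6, 0)),
    "beta-Lactoglobulin, bovine": ((0, 5), (1, 1)),
    "Lactotransferrin, bovine": ((1, 9), (3, 0)),
    "Lysozyme C, chicken": ((2, 2), (4, 0)),
    "Myoglobin, equine": ((0, 6), (1, 0)),
    "Ovalbumin, chicken": ((1, 6), (2, 0)),
    "Transferrin, human": ((0, 2), (1, 0)),
}


def summarize(column: int) -> tuple[int, int, int]:
    """Return (Trp peptides, Trp-free peptides, proteins identified).

    column 0 = before enrichment, column 1 = after enrichment.
    """
    trp = trp_free = proteins = 0
    for counts in TABLE_1.values():
        entry = counts[column]
        if entry is None:
            continue
        trp += entry[0]
        trp_free += entry[1]
        if sum(entry) > 0:
            proteins += 1
    return trp, trp_free, proteins


if __name__ == "__main__":
    for label, column in (("before", 0), ("after", 1)):
        trp, trp_free, proteins = summarize(column)
        share = trp / (trp + trp_free)
        print(f"{label}: {trp} Trp, {trp_free} Trp-free, "
              f"{proteins} proteins, {share:.1%} Trp-containing")
```

With the Table 1 values, the Trp-containing share rises from 6 of 40 identified peptides (15%) to 23 of 24 (about 96%), while the number of identified proteins increases from 8 to 10.
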
Various washing conditions were tested, and finally the combination of 100% methanol followed by an 8 M urea solution gave the best results (for details, see the Experimental Section). Enrichment experiments were then performed in duplicate using the improved washing step, and each sample was run at least twice. In addition, the untreated digest mixture was analyzed for comparison purposes. All data sets were analyzed by Mascot, and peptide identifications were manually validated.

The results for the 10 protein digest mixture obtained with the acrylic hydrazide material are summarized in Table 1 and Supporting Table 1. All of the 10 proteins could be identified after performing the tryptophan enrichment protocol, which was not the case when measuring the whole peptide mixture due to the overwhelming sample complexity. Six of the proteins were identified with more than one peptide hit, and even myoglobin and κ-casein, each with only one suitable tryptophan peptide in the mass range of 800-3500 Da, were identified. Only one Trp-free peptide (from β-lactoglobulin) was retained and detected in the elution fraction. Thus, nonspecific interactions were successfully minimized by the extensive washing procedure. For comparison, in the untreated 10-protein digest mixture, approximately 40 peptides were identified, but these only corresponded to 8 proteins. Two proteins, κ-casein and α1-acid glycoprotein, were never identified in the complete mixture.

Figure 3. Nanoflow LC-MS/MS analysis of a tryptic digest of 10 model proteins. (a-b) MS2 total ion chromatograms of the digest mixture before (a) and after (b) the tryptophan selection step; (c-d) mass spectra at elution time ∼44 min for the doubly charged peptide GYSLGNWVCAAK from lysozyme (m/z = 663.3) before (c) and after (d) Trp-selection; (e) MS/MS spectrum of GYSLGNWVCAAK from the analysis of the digest after affinity enrichment.
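
As a consistency check on the m/z value of 663.3 quoted in the Figure 3 caption for the doubly charged lysozyme peptide GYSLGNWVCAAK, the monoisotopic m/z can be recomputed from the sequence. The Python sketch below does this with standard monoisotopic residue masses and treats cysteine as carbamidomethylated, in line with the fixed modification used in the database search. The function names are illustrative, and the assumption that the peptide is recovered without any residual MDA-derived modification follows from the statement that peptides are released in their native form.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565           # H2O added for the free termini
PROTON = 1.007276           # mass of a proton
CARBAMIDOMETHYL = 57.021464 # iodoacetamide alkylation of Cys


def peptide_mass(sequence: str, alkylated_cys: bool = True) -> float:
    """Monoisotopic neutral mass of an unmodified (apart from Cys) peptide."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if alkylated_cys:
        mass += CARBAMIDOMETHYL * sequence.count("C")
    return mass


def mz(sequence: str, charge: int, alkylated_cys: bool = True) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (peptide_mass(sequence, alkylated_cys) + charge * PROTON) / charge


if __name__ == "__main__":
    print(f"{mz('GYSLGNWVCAAK', 2):.2f}")
```

The computed value of about 663.32 agrees with the m/z reported in the caption, which supports the assignment of that signal to the carbamidomethylated peptide in its native, MDA-free form.
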

Usually, the problem is that in the time scale of an MS scan followed by data-dependent MS/MS experiments, only a small part of the analytes that are present can actually be detected, and therefore, a (sometimes significant) part of the information is lost. It can thus happen that several peptides are identified from one protein while another protein, perhaps present in lower amounts, is not detected at all. After performing the enrichment procedure, all of the 10 proteins were identified with only 23 Trp-peptide hits. In Figure 3, an example from the 10 protein mixture is given, demonstrating the enormous simplification achieved by the proposed Trp-enrichment procedure. By removing a large number of peptides from the mixture and selectively enriching tryptophan peptides, more protein identifications are achieved.

Analysis of Soluble Yeast Proteins Using the Trp-Peptide Enrichment Strategy. To thoroughly evaluate our novel Trp-enrichment procedure, we applied it to the analysis of a crude fraction of soluble yeast proteins. This sample represents a mixture of more than 100 000 peptides (assuming complete digestion, expression of 80% of the proteins covered in the UniProt/SwissProt KB,42 and a suitable mass range of 800-3500 Da for identification by MS/MS). Such a complexity is already quite challenging for the instrumental setup used in this study, one-dimensional LC separation followed by data-dependent MS/MS on an ion trap instrument. In contrast to previous experiments that only used hydrazine as a cleavage reagent, samples were also treated with pyrrolidine as an alternative. Otherwise, the same protocol as for the model proteins was used, although the data analysis step was adapted to reflect a more typical large-scale “shotgun” proteomics workflow. Thus, MS/MS spectra obtained from 1D-LC-MS/MS runs were searched against concatenated yeast protein databases that contained reversed sequences from all entries in addition to the “real” sequences. This approach was found most useful to determine the false positive rate for large data sets,43-44 although Hutlin et al. recently noted that the calculation of false positive rates is less straightforward for smaller data sets.45 As we restricted ourselves to a single-dimensional LC separation, the setup reflects a typical case where the capacity of the mass spectrometer to select precursor ions for data-dependent MS/MS is not sufficient to deal with the highly complex mixture of the total lysate digest.

In the untreated sample, an average of 25 proteins were identified per run, of which 1-3 were false positive identifications, using a cutoff score of 25. Table 2 lists the 19 proteins that were identified in at least two of the LC-MS/MS runs; detailed information about all identified proteins/peptides is provided as Supporting Information (Supporting Tables 2 and 4). Lower scores resulted in a significant increase in the number of false positives and were therefore not considered. As is frequently observed in shotgun experiments, the overlap between replicate analyses was not 100%, as just around 40% of the proteins were observed in all analyses. The highest scoring proteins were enzymes involved in glycolysis, such as enolase, phosphoglycerate kinase, and glyceraldehyde-3-phosphate dehydrogenase.

Table 2. Proteins Identified from the Untreated Yeast Lysate (in Alphabetical Order)(a)

accession no.  entry name     protein name
P22943         HSP12_YEAST    12 kDa heat shock protein
P31787         ACBP_YEAST     Acyl-CoA-binding protein
P00330         ADH1_YEAST     Alcohol dehydrogenase 1
P07265         MAL62_YEAST    Alpha-glucosidase MAL62 *
P00924         ENO1_YEAST     Enolase 1
P00925         ENO2_YEAST     Enolase 2
P23301         IF5A2_YEAST    Eukaryotic translation initiation factor
P38248         ECM33_YEAST    Extracellular matrix protein 33 precursor
P14540         ALF_YEAST      Fructose-bisphosphate aldolase
P00360         G3P1_YEAST     Glyceraldehyde-3-phosphate dehydrogenase 1
P00358         G3P2_YEAST     Glyceraldehyde-3-phosphate dehydrogenase 2
P00359         G3P3_YEAST     Glyceraldehyde-3-phosphate dehydrogenase 3
P10591         HSP71_YEAST    Heat shock protein SSA1 *
P00560         PGK_YEAST      Phosphoglycerate kinase
P00950         PMG1_YEAST     Phosphoglycerate mutase 1
P06169         PDC1_YEAST     Pyruvate decarboxylase isozyme 1
P00549         KPYK1_YEAST    Pyruvate kinase 1
P22217         TRX1_YEAST     Thioredoxin I
P00942         TPIS_YEAST     Triosephosphate isomerase

(a) Only proteins that were identified in at least two of the three replicate analyses are shown. All protein and corresponding peptide identifications are listed in Supporting Table 2. Proteins that could only be identified as member of a protein family are marked with an asterisk, and the protein selected by Mascot is shown.

Figure 4. Nanoflow LC-MS/MS analysis of total soluble yeast cell lysate: (a) MS2 total ion chromatogram for the untreated sample; (b) MS2 total ion chromatogram for a Trp-enriched sample using hydrazine as cleavage reagent; (c) MS2 total ion chromatogram for a Trp-enriched sample using pyrrolidine as cleavage reagent.

As shown in Figure 4, some reduction in sample complexity by the Trp-enrichment protocol is evident from the comparison of chromatograms of the untreated and treated samples. All Trp-enriched samples resulted in a similar number of proteins being identified regardless of the cleavage agent; the average was 19 (see Table 3 and Supporting Tables 3 and 4). Therefore, neither hydrazine nor pyrrolidine seems to be superior in performance to the other. As could be expected, the number of peptides per protein decreased strongly for the most abundant proteins, depending on their tryptophan content. Only 1-3 Trp-containing enolase peptides remained after the enrichment step, compared to more than 10 peptides for the unprocessed sample. As a result, the reduced sample complexity enabled the identification of additional proteins. For example, the mitochondrial ketol-acid reductoisomerase was only identified in one run (with a single, low scoring peptide) in the whole lysate. Following the enrichment for Trp-peptides, this protein was consistently identified with at least three tryptophan-containing peptides in all samples. Similar results were obtained for inorganic pyrophosphatase and peroxiredoxin type II. In total, six proteins that were identified in only one of three runs or not at all in the untreated sample were identified in at least three of the six runs for treated samples. A number of additional proteins were identified in only a few analyses even for the processed samples, suggesting that the sample complexity was still exceeding the capacity of the MS instrument after the enrichment step.

Table 3. Proteins Identified in Yeast Lysate after Enrichment for Trp-Peptides (in Alphabetical Order)(a)

accession no.  entry name     protein name
P07265         MAL62_YEAST    Alpha-glucosidase MAL62 *
P00890         CISY1_YEAST    Citrate synthase, mitochondrial precursor §
P00924         ENO1_YEAST     Enolase 1
P00925         ENO2_YEAST     Enolase 2
P00360         G3P1_YEAST     Glyceraldehyde-3-phosphate dehydrogenase 1
P00359         G3P3_YEAST     Glyceraldehyde-3-phosphate dehydrogenase 3
P53912         YNN4_YEAST     Hypothetical 41.2 kDa protein §
P00817         IPYR_YEAST     Inorganic pyrophosphatase §
P06168         ILV5_YEAST     Ketol-acid reductoisomerase, mitochondrial §
P07262         DHE4_YEAST     NADP-specific glutamate dehydrogenase §
P38013         AHP1_YEAST     Peroxiredoxin type II §
P00560         PGK_YEAST      Phosphoglycerate kinase
P00950         PMG1_YEAST     Phosphoglycerate mutase 1
P06169         PDC1_YEAST     Pyruvate decarboxylase isozyme 1 *
P00549         KPYK1_YEAST    Pyruvate kinase 1
P00942         TPIS_YEAST     Triosephosphate isomerase

(a) Only proteins that were identified in at least three of the six replicate analyses are shown. All protein and corresponding peptide identifications are listed in Supporting Table 3. Proteins that could only be identified as member of a protein family are marked with an asterisk, and the protein selected by Mascot is shown. Proteins that were not identified in the untreated sample (see Table 2) are marked with §.

As is the case for all techniques that target rare amino acids such as tryptophan or cysteine for reducing sample complexity, some proteins are inevitably not recovered. This occurs either because they do not carry any Trp residues in the sequence or because the residue is part of a sequence that cannot be effectively identified by MS. Examples are fructose-bisphosphate aldolase (3 Trp residues, 1 fully tryptic peptide from 800 to 3500 Da, which is the optimum range for the instrument used) and alcohol dehydrogenase 1 (5 Trp residues, 2 suitable peptides). Differentiation of members of protein families that share a significant sequence homology is also more challenging, because the peptides discriminating between variants need to carry a Trp residue. These effects are well-known and therefore do not reflect a limitation of our particular method, but rather of this type of enrichment concept per se.

Overall, only a small number of Trp-free peptides were identified by the Mascot searches: In the six individual analyses, around 86% (169 of 197) of the peptides identified according to our criteria contained tryptophan, demonstrating the specificity of our protocol for tryptophan-containing peptides. Furthermore, analysis of the few hits from the reversed database that were obtained from the processed samples revealed that four of the five sequences did not contain Trp. As a consequence, it might be possible to exclude Trp-free sequences from the results altogether without compromising the number of identified proteins significantly. (However, a concomitant reduction in the false positive rate cannot be accurately determined unless the decoy database reflects the enrichment in Trp-peptides.)
In the case of the yeast data set, a total of six protein IDs in six runs corresponded to assignments that were exclusively based on Trp-free peptides. None of the individual peptide scores exceeded 28, and in just one case (Filament protein FIN1 in the pyrrolidine-cleaved sample, run #1), the identification was based on two peptides, although individual peptide scores were 21 and 24, respectively. Alternatively, the cutoff score for the identification of individual peptides could be lowered, possibly increasing the number of true positive protein IDs in the process. For example, if the cutoff level for protein ID is set to 22, together with the exclusion of Trp-free peptides, a total of 14 additional protein identifications would be added for the six analyses of the present data set, with four additional hits from the reversed database. Larger data sets will be necessary to examine this possibility further.

Clearly, the full potential of the enrichment method for highly complex samples such as cell lysates can only be explored in combination with multidimensional chromatography to reduce the number of coeluting peptides, or by using mass spectrometers with increased scan speeds. But even at this stage, and with a limited amount of data available, our method seems to compare favorably with other enrichment techniques that were recently compared in a study on serum fractionation strategies.46 In this work, the enrichment of cysteine-containing peptides from a total digest of human serum proteins based on disulfide-exchange chromatography47 resulted in 86.6% of the enriched peptides carrying Cys residues, which is very similar to our number obtained for the yeast lysate (85.8%). A further advantage of our enrichment strategy compared to others such as ICAT is that the peptides are recovered in their native form and do not carry a tag that potentially interferes with peptide fragmentation. In the present case, a substituent on the indole nitrogen such as the MDA tag was found to give a prominent signal in peptide MS/MS spectra due to a preferential cleavage as noted previously.32
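The post-search filtering discussed above (excluding Trp-free peptide hits, optionally combined with a lowered peptide score cutoff) can be illustrated with a minimal sketch. It is not the data processing used in this work: the `PeptideHit` record, the function names, and all peptide sequences, protein names, and scores in `HITS` are hypothetical placeholders. Only the counts 169 and 197 and the cutoff value of 22 are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideHit:
    protein: str         # protein identifier (placeholder names below)
    sequence: str        # peptide sequence, one-letter code
    score: float         # peptide score reported by the search engine
    decoy: bool = False  # True if the hit comes from the reversed database


def contains_trp(sequence: str) -> bool:
    """Return True if the peptide contains at least one tryptophan (W)."""
    return "W" in sequence.upper()


def trp_fraction(hits) -> float:
    """Fraction of peptide hits that contain tryptophan."""
    hits = list(hits)
    if not hits:
        return 0.0
    return sum(contains_trp(h.sequence) for h in hits) / len(hits)


def filter_hits(hits, cutoff: float, require_trp: bool = True):
    """Keep hits scoring at or above the cutoff; optionally drop Trp-free ones."""
    return [
        h for h in hits
        if h.score >= cutoff and (contains_trp(h.sequence) or not require_trp)
    ]


def summarize(hits):
    """Return the sets of target and decoy proteins supported by the hits."""
    targets = {h.protein for h in hits if not h.decoy}
    decoys = {h.protein for h in hits if h.decoy}
    return targets, decoys


# Specificity reported in the text: 169 of 197 identified peptides contain Trp.
reported_specificity = round(100 * 169 / 197, 1)
print(f"Reported specificity: {reported_specificity}%")

# Hypothetical hits, for illustration only (not data from this study).
HITS = [
    PeptideHit("PROT_A", "LVSWYDNEFGYSNR", 45.0),
    PeptideHit("PROT_A", "AVGVLPELNGK", 30.0),
    PeptideHit("PROT_B", "TWEEIPALDK", 23.0),
    PeptideHit("PROT_C", "SLDEAIK", 24.0),
    PeptideHit("REV_PROT_D", "GLAEDIR", 22.5, decoy=True),
]

# Lowered cutoff of 22 combined with exclusion of Trp-free peptides.
targets, decoys = summarize(filter_hits(HITS, cutoff=22.0))
print("Trp-filtered targets:", sorted(targets), "decoys:", sorted(decoys))
```

In this toy data set, the Trp filter removes the decoy hit and the protein supported only by a Trp-free peptide. As noted above, any real reduction in the false positive rate could only be judged against a decoy database that reflects the enrichment in Trp-peptides.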

Conclusions

We have developed a novel method for the chemoselective enrichment of tryptophan-containing peptides from tryptic digests of protein mixtures. The method is based on the reversible derivatization of the indole side chain of tryptophan with malondialdehyde and capture of the modified peptides on hydrazide beads, and it was optimized using standard peptides and tryptic digests of standard proteins. Trp-peptides are recovered in good yields, and nonspecific binding can be effectively reduced by stringent washing steps. The application of the protocol to a total yeast cell lysate demonstrated its applicability to complex samples of biological origin. The high specificity of the enrichment procedure, along with the simple implementation of the concept using readily available materials, makes the concept an attractive alternative to cysteine-specific approaches. Future work will focus on the application of the newly developed method to other samples of biological origin. It could be a particularly attractive strategy for analyzing plasma or serum samples, as the dominant serum albumin contains only one tryptophan residue in its sequence. Therefore, our strategy could help overcome the dynamic range issue in plasma proteomics, as has, for example, been demonstrated by Liu et al. for Cys-peptide enrichment via disulfide exchange chromatography.48

Acknowledgment. We thank the Austrian Science Fund (FWF, project number P15482) for financial support. Journal of Proteome Research • Vol. 6, No. 9, 2007 3833

Supporting Information Available: Overview of all validated peptide identifications from the 10-protein digest mix after enrichment for Trp-peptides (Supporting Table 1); overview of all validated peptide identifications from the yeast lysate without enrichment for Trp-peptides (Supporting Table 2); overview of all validated peptide identifications from the yeast lysate after enrichment for Trp-peptides (Supporting Table 3); Microsoft Excel file listing all cumulative protein scores from the yeast sample for comparison, including hits from the reversed database (Supporting Table 4). This material is available free of charge via the Internet at http://pubs.acs.org.

References

(1) Paoletti, A. C.; Zybailov, B.; Washburn, M. P. Exp. Rev. Proteomics 2004, 1, 275-282.
(2) Issaq, H. J.; Chan, K. C.; Janini, G. M.; Conrads, T. P.; Veenstra, T. D. J. Chromatogr. B 2005, 817, 35-47.
(3) Swanson, S. K.; Washburn, M. P. Drug Discovery Today-TARGETS 2005, 10, 719-725.
(4) Gilar, M.; Olivova, P.; Daly, A. E.; Gebler, J. C. Anal. Chem. 2005, 77, 6426-6434.
(5) Li, X.; Gong, Y.; Wang, Y.; Wu, S.; Cai, Y.; He, P.; Lu, Z.; Ying, W.; Zhang, Y.; Jiao, L.; He, H.; Zhang, Z.; He, F.; Zhao, X.; Qian, X. Proteomics 2005, 5, 3423-3441.
(6) Wei, J.; Sun, J.; Yu, W.; Jones, A.; Oeller, P.; Keller, M.; Woodnutt, G.; Short, J. M. J. Proteome Res. 2005, 4, 801-808.
(7) Leitner, A.; Lindner, W. J. Chromatogr. B 2004, 813, 1-26.
(8) Mirzaei, H.; Regnier, F. J. Chromatogr. B 2005, 817, 23-34.
(9) Leitner, A.; Lindner, W. Proteomics 2006, 6, 5418-5434.
(10) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.
(11) Hansen, K. C.; Schmitt-Ulms, G.; Chalkley, R. J.; Hirsch, J.; Baldwin, M. A.; Burlingame, A. L. Mol. Cell. Proteomics 2003, 2, 299-314.
(12) Wang, S.; Regnier, F. E. J. Chromatogr. A 2001, 924, 345-357.
(13) Foettinger, A.; Leitner, A.; Lindner, W. J. Chromatogr. A 2005, 1079, 187-196.
(14) Kuyama, H.; Watanabe, M.; Toda, C.; Ando, E.; Tanaka, K.; Nishimura, O. Rapid Commun. Mass Spectrom. 2003, 17, 1642-1650.
(15) Gevaert, K.; van Damme, J.; Goethals, M.; Thomas, G. R.; Hoorelbeke, B.; Demol, H.; Martens, L.; Puype, M.; Staes, A.; Vandekerckhove, J. Mol. Cell. Proteomics 2002, 1, 896-903.
(16) Oda, Y.; Nagasu, T.; Chait, B. T. Nat. Biotechnol. 2001, 19, 379-382.
(17) Zhou, H.; Watts, J. D.; Aebersold, R. Nat. Biotechnol. 2001, 19, 375-378.
(18) Qian, W.-J.; Goshe, M. B.; Camp, D. G., II; Yu, L.-R.; Tang, K.; Smith, R. D. Anal. Chem. 2003, 75, 5441-5450.
(19) Tao, W. A.; Wollscheid, B.; O’Brien, R.; Eng, J. K.; Li, X.-j.; Bodenmiller, B.; Watts, J. D.; Hood, L.; Aebersold, R. Nat. Methods 2005, 2, 591-598.
(20) Zhang, H.; Li, X.-j.; Martin, D. B.; Aebersold, R. Nat. Biotechnol. 2003, 21, 660-666.
(21) Wells, L.; Vosseller, K.; Cole, R. N.; Cronshaw, J. M.; Matunis, M. J.; Hart, G. W. Mol. Cell. Proteomics 2002, 1, 791-804.
(22) Gilis, D.; Massar, S.; Cerf, N. J.; Rooman, M. Genome Biol. 2001, 2, research0049.1-0049.12.
(23) Gevaert, K.; van Damme, P.; Martens, L.; Vandekerckhove, J. Anal. Biochem. 2005, 345, 18-29.
(24) Scoffone, E.; Fontana, A.; Rocchi, R. Biochemistry 1968, 7, 971-979.
(25) Matsuo, E.-i.; Toda, C.; Watanabe, M.; Iida, T.; Masuda, T.; Minohata, T.; Ando, E.; Tsunasawa, S.; Nishimura, O. Rapid Commun. Mass Spectrom. 2006, 20, 31-38.
(26) Ou, K.; Kesuma, D.; Ganesan, K.; Yu, K.; Soon, S. Y.; Lee, S. Y.; Goh, X. P.; Hooi, M.; Chen, W.; Jikuya, H.; Ichikawa, T.; Kuyama, H.; Matsuo, E.-i.; Nishimura, O.; Tan, P. J. Proteome Res. 2006, 5, 2194-2206.
(27) Li, C.; Gawandi, V.; Protos, A.; Phillips, R. S.; Amster, I. J. Eur. J. Mass Spectrom. 2006, 12, 213-221.
(28) Spande, T. F.; Green, N. M.; Witkop, B. Biochemistry 1966, 5, 1926-1933.
(29) Horton, H. R.; Koshland, D. E., Jr. J. Am. Chem. Soc. 1965, 87, 1126-1132.
(30) Antos, J. M.; Francis, M. B. J. Am. Chem. Soc. 2004, 126, 10256-10257.
(31) http://www.nestgrp.com/pdf/bmt/Pi3MET_TRP.pdf (accessed May 2007).
(32) Foettinger, A.; Leitner, A.; Lindner, W. Bioconjugate Chem. 2007, accepted.
(33) Teuber, H. J.; Cornelius, D.; Pfaff, H. Chem. Ber. 1963, 96, 2617-2631.
(34) Teuber, H. J.; Glosauer, O.; Hochmuth, U. Chem. Ber. 1964, 97, 557-562.
(35) O’Shannessy, D. J. J. Chromatogr. 1990, 510, 13-21.
(36) Mirzaei, H.; Regnier, F. Anal. Chem. 2005, 77, 2386-2392.
(37) Martens, L.; Vandekerckhove, J.; Gevaert, K. Bioinformatics 2005, 21, 3584-3585.
(38) Foettinger, A.; Leitner, A.; Lindner, W. J. Mass Spectrom. 2006, 41, 623-632.
(39) Betancourt, L.; Gil, J.; Besada, V.; González, L. J.; Fernández-de-Cossio, J.; García, L.; Pajón, R.; Sanchez, A.; Alvarez, F.; Padrón, G. J. Proteome Res. 2005, 4, 491-496.
(40) Sánchez, A.; González, L. J.; Ramos, Y.; Betancourt, L.; Gil, J.; Besada, V.; Fernández-de-Cossio, J.; Alvarez, F.; Padrón, G. J. Proteome Res. 2006, 5, 1204-1213.
(41) Sánchez, A.; González, L. J.; Betancourt, L.; Gil, J.; Besada, V.; Fernández-de-Cossio, J.; Rodríguez-Ulloa, A.; Marrero, K.; Alvarez, F.; Fando, R.; Padrón, G. Proteomics 2006, 6, 4444-4455.
(42) Ghaemmaghami, S.; Huh, W.; Bower, K.; Howson, R. W.; Belle, A.; Dephoure, N.; O’Shea, E. K.; Weissman, J. S. Nature 2003, 425, 737-741.
(43) Higdon, R.; Hogan, J. M.; van Belle, G.; Kolker, E. OMICS 2005, 9, 364-379.
(44) Elias, J. E.; Gygi, S. P. Nat. Methods 2007, 4, 207-214.
(45) Huttlin, E. L.; Hegeman, A. D.; Harms, A. C.; Sussman, M. R. J. Proteome Res. 2007, 6, 392-398.
(46) Whiteaker, J. R.; Zhang, H.; Eng, J. K.; Fang, R.; Piening, B. D.; Feng, L.-C.; Lorentzen, T. D.; Schoenherr, R. M.; Keane, J. F.; Holzman, T.; Fitzgibbon, M.; Lin, C.; Zhang, H.; Cooke, K.; Liu, T.; Camp, D. G., II; Anderson, L.; Watts, J.; Smith, R. D.; McIntosh, M. W.; Paulovich, A. G. J. Proteome Res. 2007, 6, 828-836.
(47) Liu, T.; Qian, W. J.; Strittmatter, E. F.; Camp, D. G., II; Anderson, G. A.; Thrall, B. D.; Smith, R. D. Anal. Chem. 2004, 76, 5345-5353.
(48) Liu, T.; Qian, W.-J.; Gritsenko, M. A.; Xiao, W.; Moldawer, L. L.; Kaushal, A.; Monroe, M. E.; Varnum, S. M.; Moore, R. J.; Purvine, S. O.; Maier, R. V.; Davis, R. W.; Tompkins, R. G.; Camp, D. G., II; Smith, R. D. Mol. Cell. Proteomics 2006, 5, 1899-1913.

PR0702767