Anal. Chem. 2008, 80, 3334-3341
Enrichment by Organomercurial Agarose and Identification of Cys-Containing Peptides from Yeast Cell Lysates
Mark J. Raftery*
Bioanalytical Mass Spectrometry Facility, University of New South Wales, Sydney, New South Wales 2052, Australia
Dynamic range and the presence of highly abundant proteins limit the number of proteins that may be identified within a complex mixture. Cysteine (Cys) has unique chemical reactivity that may be exploited for chemical tagging/capture with biotin/avidin reagents or affinity chromatography allowing specific isolation and subsequent identification of peptide sequences by mass spectrometry. Organomercurial agarose (Hg-beads) specifically captures Cys-containing peptides and proteins from cell lysates. Tryptic peptides from yeast lysates containing Cys were captured and eluted from Hg-beads after incubation with TCEP and trypsin. From two 1 h nano 1-D LC DDA/MS of the eluate >700 proteins were identified with an estimated false positive rate of ∼1%. Few peptides were identified with high confidence without Cys within their sequence after capture, and extensive washing, indicating little nonspecific binding. The number of fragmentation spectra was increased using automated 2-D nano-LC/MS and allowed identification of 1496 proteins with an estimated false positive rate of 1.1%. Approximately 4% of the proteins identified were from peptides that did not contain Cys, and these were biased toward higher abundance proteins. Comparison of the 1496 proteins to those reported previously showed that >25% were from yeast proteins not previously observed. Most proteins were identified from a single peptide, and sequence coverage was sacrificed by focusing only on identifying Cys-containing peptides, but large numbers of proteins were rapidly identified by eliminating many of the peptides from the higher abundance proteins. 
A number of Saccharomyces cerevisiae (yeast) proteomics studies have been published using methods based on 2-D gels with mass spectrometry or multidimensional liquid chromatography and tandem mass spectrometry.1 The largest numbers of proteins were reported by studies that integrated multidimensional chromatography (strong cation exchange (SCX) with nano C18 RP LC) with data dependent tandem mass spectrometry and database searches.1-4 The numbers of proteins reported ranged from ∼1500 to >2000 with
* To whom correspondence should be addressed. Phone: 61-2-9385-1892. Fax: 61-2-9385-3950. E-mail: [email protected].
(1) de Godoy, L. M.; Olsen, J. V.; de Souza, G. A.; Li, G.; Mortensen, P.; Mann, M. Genome Biol. 2006, 7, R50.
(2) Washburn, M. P.; Wolters, D.; Yates, J. R., III Nat. Biotechnol. 2001, 19, 242-247.
3334 Analytical Chemistry, Vol. 80, No. 9, May 1, 2008
(3) Washburn, M. P.; Ulaszek, R. R.; Yates, J. R., III Anal. Chem. 2003, 75, 5054-5061.
(4) Peng, J.; Elias, J. E.; Thoreen, C. C.; Licklider, L. J.; Gygi, S. P. J. Proteome Res. 2003, 2, 43-50.
(5) Wei, J.; Sun, J.; Yu, W.; Jones, A.; Oeller, P.; Keller, M.; Woodnutt, G.; Short, J. M. J. Proteome Res. 2005, 4, 801-808.
(6) Wang, W.; Guo, T.; Song, T.; Lee, C. S.; Balgley, B. M. Proteomics 2007, 7, 1178-1187.
(7) Flory, M. R.; Lee, H.; Bonneau, R.; Mallick, P.; Serikawa, K.; Morris, D. R.; Aebersold, R. Proteomics 2006, 6, 6146-6157.
(8) Chang, J. Y.; Knecht, R.; Braun, D. G. Biochem. J. 1983, 211, 163-171.
(9) Sliwkowski, M. X.; Levine, R. L. Anal. Biochem. 1985, 147, 369-373.
(10) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.
(11) Spahr, C. S.; Susin, S. A.; Bures, E. J.; Robinson, J. H.; Davis, M. T.; McGinley, M. D.; Kroemer, G.; Patterson, S. D. Electrophoresis 2000, 21, 1635-1650.
(12) Gygi, S. P.; Rist, B.; Griffin, T. J.; Eng, J.; Aebersold, R. J. Proteome Res. 2002, 1, 47-54.
(13) Liu, T.; Qian, W. J.; Strittmatter, E. F.; Camp, D. G., II; Anderson, G. A.; Thrall, B. D.; Smith, R. D. Anal. Chem. 2004, 76, 5345-5353.
(14) Bernhard, O. K.; Kapp, E. A.; Simpson, R. J. J. Proteome Res. 2007, 6, 987-995.
(15) McLachlin, D. T.; Chait, B. T. Anal. Chem. 2003, 75, 6826-6836.
(16) Liu, T.; Qian, W. J.; Chen, W. N.; Jacobs, J. M.; Moore, R. J.; Anderson, D. J.; Gritsenko, M. A.; Monroe, M. E.; Thrall, B. D.; Camp, D. G., II; Smith, R. D. Proteomics 2005, 5, 1263-1273.
(17) Wang, H.; Qian, W. J.; Chin, M. H.; Petyuk, V. A.; Barry, R. C.; Liu, T.; Gritsenko, M. A.; Mottaz, H. M.; Moore, R. J.; Camp, D. G., II; Khan, A. H.; Smith, D. J.; Smith, R. D. J. Proteome Res. 2006, 5, 361-369.
10.1021/ac702539q CCC: $40.75 © 2008 American Chemical Society Published on Web 03/20/2008
estimated false positive rates of 1%.1,4 Greater numbers of proteins were identified when 3-dimensional-liquid or CIEF separations and multiple-lysis conditions were employed.5,6 A recent quantitative study identified a total of 2754 proteins at various time points from budding yeast after labeling with ICAT reagents, with an estimated false positive rate of 5%, and representing ∼48% of the predicted yeast ORF.7 Cysteine (Cys) contains a free thiol (-SH) with unique chemical reactivity and has been chemically modified allowing selective detection, isolation, and characterization of peptides.8,9 Labeling of Cys with affinity tags containing biotin or direct capture of thiols using thiopropyl sepharose (TPS) are two common approaches applied in proteomics.10-14 Reversible binding of thiols to solid phase supports like TPS allows extensive washing to remove nonspecifically bound peptides, and elution is normally with excess small MW thiols like 1,4-dithiothreitol (DTT), which does not interfere with subsequent nano-LC and mass spectrometry.15-17 Chemical modification by β-elimination and addition of DTT, followed by TPS enrichment, has also been reported for capturing O-linked β-N-acetylglucosamine-modified and phospho-peptides.15,18 Other chemical modifications allowing enrichment of Cys-containing peptides include addition of quaternary amine tags and performic acid oxidation.19,20 Selective enrichment for thiol-containing peptides reduces the complexity of the mixture and allows enhanced proteome coverage.19,20 Increased numbers (∼30%) of proteins were identified within mammalian cell lysates after enrichment and analysis.16,17 Another affinity matrix suitable for enriching for Cys-containing proteins and peptides exploits the reversible interaction of organomercury compounds cross-linked to agarose with thiols.21,22 There are few recent reports using this affinity matrix and no reports using the matrix in proteomics studies. All proteomics studies have focused on TPS as the affinity matrix, probably because of its commercial availability. Potential difficulties with TPS include the obligatory reduction and removal of the reductant before capture of the peptides, which may lead to loss of some peptides. In this report, yeast cell lysates simply treated with tris(2-carboxyethyl)phosphine hydrochloride (TCEP) and digested with trypsin were incubated with organomercurial agarose (Hg-beads), and these readily captured and provided a highly enriched pool of Cys-containing peptides that were suitable for identifying large numbers of yeast proteins with high confidence by 1- and 2-D nano-LC data dependent tandem MS.
MATERIALS AND METHODS
General. Reagents and chemicals were analytical grade, and solvents were analytical or HPLC grade (Sigma-Aldrich, St. Louis, MO). Phosphate buffered saline (PBS) contained sodium phosphate (25 mM) and NaCl (250 mM), and the pH was adjusted to 7.5 with HCl. Heptafluorobutyric acid (HFBA) and tris(2-carboxyethyl)phosphine hydrochloride (TCEP) were from Pierce (Rockford, IL). Ellman’s reagent (DTNB) was from Sigma.23 All liquids containing mercury were disposed of appropriately.
Preparation of Hg-Agarose.
Preparation of the beads was based on a procedure supplied by BioRad (Hercules, CA). Affigel 10 (50 mL, BioRad) was transferred to a sintered Büchner funnel and washed with isopropyl alcohol (2 × 150 mL) and collected by vacuum filtration. The partially dried beads were suspended in DMF (50 mL), and 4-aminophenylmercuric acetate (0.75 g, 2.1 mmol) in DMF (15 mL) was added. The mixture was agitated for 4 h, then ethanolamine (0.5 mL, 8.3 mmol) was added, and the mixture was agitated for a further 60 min. The beads were collected, washed with DMF (125 mL) and isopropyl alcohol (3 × 125 mL), then suspended in isopropyl alcohol (150 mL), and stored at 4 °C until needed.
Binding Capacity of Hg-Agarose. The binding capacity of the beads was determined by measuring the quantity of Cys captured by a small amount of beads. Briefly, a standard curve was constructed by measuring A405 (BioTrack II plate reader, Amersham, Rydalmere NSW, Aust) of DTNB (10 µL, 1 mg/mL)
(18) Wells, L.; Vosseller, K.; Cole, R. N.; Cronshaw, J. M.; Matunis, M. J.; Hart, G. W. Mol. Cell. Proteomics 2002, 1, 791-804.
(19) Ren, D.; Julka, S.; Inerowicz, H. D.; Regnier, F. E. Anal. Chem. 2004, 76, 4522-4530.
(20) Dai, J.; Wang, J.; Zhang, Y.; Lu, Z.; Yang, B.; Li, X.; Cai, Y.; Qian, X. Anal. Chem. 2005, 77, 7594-7604.
(21) Sluyterman, L. A.; Wijdenes, J. Methods Enzymol. 1974, 34, 544-547.
(22) Bornstein, D. L.; Walsh, E. C. J. Immunol. Methods 1979, 29, 343-352.
(23) Ellman, G. L. Arch. Biochem. Biophys. 1959, 82, 70-77.
containing Cys (0, 0.5, 1, 2, 4, and 8 µg) in phosphate (pH 8, 200 µL). A control solution containing Cys (20 µg) in phosphate (450 µL) and a beads solution containing Cys (20 µg) and beads (50 µL) in binding buffer (400 µL) were prepared. After mixing by rotation for 30 and 60 min and then centrifuging (10 000 rpm, 30 s), an aliquot (10 µL) of each supernatant was mixed with DTNB (10 µL, 1 mg/mL) and phosphate (180 µL), and the A405 was measured to determine the quantity of Cys remaining in the control and beads samples. Comparison of the two values allowed the amount of Cys captured by the beads, and hence the binding capacity, to be calculated. Beads (50 µL) were regenerated by incubating with HgCl2 (2 mM) in NaOAc (950 µL, 50 mM, pH 5.0) for 60 min as described21,24 and washed (water 3 × 500 µL and loading buffer 2 × 500 µL) for 5 min. The binding capacity of the used beads was determined as described above.
Yeast Lysis and Affinity Chromatography. Dried yeast (25 mg, Sigma) was suspended in PBS (4 mL, 8 M urea) and vortexed (30 s). Zirconia/silica beads (∼2 mg, BioSpec Products, Bartlesville, OK) were added, and the cells were lysed by bead beating (Mini-BeadBeater, 5 × 30 s, BioSpec Products). The suspension was centrifuged twice (13 000 rpm) for 5 min, and the supernatant was removed and then stored at -80 °C until needed. The lysate (150 µL) was diluted with NH4HCO3 (30 mM, 1000 µL) and TCEP (100 µL, 100 mM) and then digested with trypsin (20 µg, Promega, Madison, WI) for 14 h. Affinity chromatography was performed in “batch mode” using 1.5 mL microcentrifuge tubes (Treff, Degersheim, Switzerland). Hg-beads (100 µL) were washed with water (2 × 500 µL) and buffer A (PBS:CH3CN 1:1, 2 × 500 µL), and the solution containing the digested peptides (200 µL) was diluted with buffer A (800 µL). The beads and peptides were mixed by rotation for 30 min and then centrifuged (1000 rpm, 20 s), and the supernatant was removed.
The beads were washed with buffer A for 10 min and centrifuged, and the supernatant was removed. The washing was repeated twice. The beads were incubated with DTT (200 µL, 25 mM) for 10 min, and the released peptides were collected in the supernatant after centrifugation (1000 rpm, 20 s). For analysis, samples were concentrated by drying in a SpeedVac (Savant, Farmingdale, NY) or desalted and concentrated using peptide Micro- or Macro-Traps (Michrom Bioresources, Auburn, CA) and resuspended in H2O (0.05% HFBA, 1% formic acid).
Mass Spectrometry. QStar Pulsar i. Digest peptides were separated by nano-LC using an Ultimate HPLC and Famos autosampler system (LC-Packings, Amsterdam, Netherlands). Samples (5 µL) were concentrated and desalted onto a CapTrap peptide precolumn (500 µm × 2 mm, Michrom Bioresources) with H2O:CH3CN (98:2, 0.05% HFBA) at 20 µL/min. After a 4 min wash the precolumn was switched (Switchos, LC Packings) into line with a fritless nanocolumn (75 µm (id) × ∼10 cm) containing C18 RP packing material (5µ Magic (Michrom Bioresources)) manufactured according to Gatlin.25,26 Peptides were eluted using
(24) Krieger, D. E.; Erickson, B. W.; Merrifield, R. B. Proc. Natl. Acad. Sci. U.S.A. 1976, 73, 3160-3164.
(25) Gatlin, C. L.; Kleemann, G. R.; Hays, L. G.; Link, A. J.; Yates, J. R., III Anal. Biochem. 1998, 263, 93-101.
(26) Goodchild, A.; Raftery, M.; Saunders, N. F.; Guilhaus, M.; Cavicchioli, R. J. Proteome Res. 2004, 3, 1164-1176.
a linear gradient of H2O:CH3CN (98:2, 0.1% formic acid) to H2O:CH3CN (55:45, 0.1% formic acid) at ∼300 nL/min over 30 or 90 min. High voltage (2300 V) was applied to a low volume tee (Upchurch Scientific), and the column tip was positioned ∼1 cm from the orifice of an API QStar Pulsar i hybrid tandem mass spectrometer (Applied Biosystems, Foster City, CA). Positive ions were generated by electrospray, and the QStar was operated in an information dependent acquisition mode. A Tof MS survey scan was acquired (m/z 350-1700, 1.2 s). The three largest multiply charged ions (counts >15) were sequentially selected by Q1 for MS/MS analysis. Nitrogen was used as collision gas, and an optimum collision energy was chosen for each eluting peptide by Analyst (Ver 1.0 SP8) based on charge state and mass using the default values in the IDA collision energy parameters script supplied by the manufacturer (ABI). Tandem mass spectra were accumulated for up to 3.5 s (m/z 65-2000) with 1 repeat analysis. Peak lists were generated using Mascot Distiller (Matrix Science, London, England) using the default parameters.
LTQ-FT Ultra. Peptides were separated by nano-LC using an Ultimate 3000 HPLC and autosampler system (Dionex, Amsterdam, Netherlands). For online 2-D separations, desalted peptides were dissolved in H2O (0.05% HFBA, 25 µL) and loaded (10 µL) onto a SCX microcolumn (0.75 × ∼25 mm, Poros S10, Applied Biosystems) manufactured “in house”.
Peptides were eluted sequentially using 5, 10, 15, 20, 25, 30, 40, 50, 75, 150, 300, and 1000 mM ammonium acetate (20 µL).2,4 The unbound load fraction and each salt step were concentrated and desalted onto a CapTrap peptide precolumn (500 µm × 2 mm, Michrom Bioresources) with H2O:CH3CN (98:2, 0.1% formic acid, buffer A) at 20 µL/min.26 After a 10 min wash the precolumn was switched (Valco 10 port valve, Dionex) into line with a fritless nanocolumn (75 µm (id) × ∼10 cm) containing C18 RP packing material (5µ Magic (Michrom Bioresources)) manufactured according to Gatlin.25 Peptides were eluted using a linear gradient of H2O:CH3CN (98:2, 0.1% formic acid) to H2O:CH3CN (55:45, 0.1% formic acid) at 350 nL/min over 60 min. High voltage (1800 V) was applied to a low volume tee (Upchurch Scientific), and the column tip was positioned ∼0.5 cm from the heated capillary (T = 200 °C) of a LTQ FT Ultra (Thermo Electron, Bremen, Germany) mass spectrometer. Positive ions were generated by electrospray, and the LTQ FT Ultra was operated in data dependent acquisition mode (DDA/MS). Identical parameters were used for the 1-D LC/MS analysis of the elute sample. A survey scan m/z 350-1750 was acquired in the FT ICR cell (resolution = 100 000 at m/z 400, with an initial accumulation target value of 1 000 000 ions in the linear ion trap). Up to the 9 most abundant ions (>2000 counts) with charge states of +2 or +3 were sequentially isolated and fragmented within the linear ion trap using collisionally induced dissociation with an activation q = 0.25 and activation time of 30 ms at a target value of 30 000 ions. m/z ratios selected for MS/MS were dynamically excluded for 60 s after two repeat product ion scans. Peak lists were generated using Mascot Daemon/extract_msn (Matrix Science, London, England, Thermo) using the default parameters.
Database Searches. Peak lists were submitted to the database search program Mascot (version 2.1 or 2.2, Matrix Science).
Search parameters were as follows: precursor and product ion tolerances were ±0.25 and 0.2 Da (QStar) or 10 ppm and 0.6 Da (LTQ-FT Ultra), respectively; Met(O) was specified as a variable modification, enzyme specificity was trypsin, 1 and 3 (QStar) or 1 (LTQ-FT Ultra) missed cleavages were possible, and the Yeast SwissProt (ftp://au.expasy.org/databases/complete_proteomes/fasta/eukaryota/ downloaded Nov 2006) or Yeast ORF (ftp://genome-ftp.stanford.edu/pub/yeast/data_download/sequence/genomic_sequence/orf_protein/ downloaded Jan 2007) databases were searched. Peak lists were re-searched using shuffled databases to estimate the false positive rate.4,27 Protein identifications were compared using Protein Results Parser (version 2.1)28 or after importing results into Microsoft Access and applying suitable filters. Protein Digestion Simulator Basic was used to produce the in silico digestions of yeast proteins (downloaded from http://ncrr.pnl.gov/software/ProteinDigestionSimulatorBasic.stm). All search results are shown as data together with annotated MS/MS spectra of protonated peptides of single spectra identifications (Supporting Information Tables 1-10).
RESULTS AND DISCUSSION
Several previous reports have shown some advantages after enriching Cys-containing peptides for proteomics studies and determining post-translational modifications.19,20 These studies show excellent enrichment of Cys-containing peptides using TPS, and increased numbers of proteins were identified after enrichment and DDA/MS from complex protein mixtures digested with trypsin.13,17,29 All previous affinity chromatography used TPS and required prior reduction of disulfides and removal of any residual reducing agent before formation of disulfide bonds between the beads and thiols.17 The capture of Cys-containing peptides with Hg-beads may be done in the presence of a reducing agent like TCEP because it does not interfere with the binding of Cys to the Hg-beads or with subsequent nano-LC or DDA/MS. Hg-beads may be regenerated after elution and may be reused if necessary.
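The in silico digestion referred to under Database Searches can be illustrated with a minimal sketch. It is not the Protein Digestion Simulator Basic program used in this work; it assumes the common trypsin rule (cleavage C-terminal to K or R, but not before P), ignores the m/z filter, and uses a made-up toy sequence. The function names are hypothetical.

```python
def tryptic_peptides(sequence, missed_cleavages=0):
    """Return tryptic peptides, assuming cleavage after K/R except before P."""
    # Collect cleavage positions (index of the residue that starts a new peptide).
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        # k = 1 gives fully cleaved peptides; larger k allows missed cleavages.
        for k in range(1, missed_cleavages + 2):
            end = start + k
            if end < len(sites):
                peptide = sequence[sites[start]:sites[end]]
                if peptide:
                    peptides.append(peptide)
    return peptides


def cys_fraction(peptides):
    """Fraction of peptides that contain at least one Cys (C)."""
    if not peptides:
        return 0.0
    return sum("C" in p for p in peptides) / len(peptides)


# Toy sequence (hypothetical, not a yeast protein).
toy = "MKACDRPLKCWR"
fully_cleaved = tryptic_peptides(toy)
with_one_missed = tryptic_peptides(toy, missed_cleavages=1)
```

Applied to a whole proteome, the same ratio is what the counts reported in this work express: 25 838 Cys-containing peptides out of 230 972 tryptic peptides is roughly 11%.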
Preparation of Organomercurial Agarose and Enrichment of Cys-Containing Peptides. Organomercury affinity chromatography and preparation of beads have been described,21,30 and these were generally used for isolating proteins containing free thiols.22,31 Similar beads were commercially available from BioRad (Affi-gel 501) but are no longer available. No other commercial source of Hg-beads was found, but BioRad provided a simple protocol for coupling 4-aminophenylmercuric acetate to Affi-gel 10 (Figure 1a). The capacity and coupling efficiency were determined using Ellman’s reagent. Supporting Information Figure 1 shows a standard curve determined for a range of Cys concentrations (0-200 ng/µL). Working within this range, the binding capacity of the beads for Cys after 30 or 60 min was determined to be 2.2 µmol/mL gel (see Supporting Information Figure 1) and is similar to the binding capacity of Affi-gel 501 once available from BioRad (∼30 mg/mL myoglobin). Deactivation of beads has previously been reported to enhance release of captured proteins by eliminating high affinity
(27) Ambatipudi, K.; Old, J.; Guilhaus, M.; Raftery, M.; Hinds, L.; Deane, E. Comp. Biochem. Physiol., Part D: Genomics Proteomics 2006, 1, 283-291.
(28) Annaiah, K.; Arnold, R. J.; Novotny, M. V. Nashville TN, May 23-27, 2002; TPA014.
(29) Conrads, T. P.; Alving, K.; Veenstra, T. D.; Belov, M. E.; Anderson, G. A.; Anderson, D. J.; Lipton, M. S.; Pasa-Tolic, L.; Udseth, H. R.; Chrisler, W. B.; Thrall, B. D.; Smith, R. D. Anal. Chem. 2001, 73, 2132-2139.
(30) Cuatrecasas, P.; Wilchek, M.; Anfinsen, C. B. Proc. Natl. Acad. Sci. U.S.A. 1968, 61, 636-643.
(31) Kaplan, R. S.; Mayor, J. A.; Johnston, N.; Oliveira, D. L. J. Biol. Chem. 1990, 265, 13379-13385.
Figure 1. Preparation of affinity beads and capture of peptides were simple procedures. A. Schematic for preparing beads from Affi-gel 10 and 4-aminophenylmercuric acetate. B. Schematic for capturing Cys-containing peptides from cell lysates.
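The binding-capacity determination with Ellman's reagent can be sketched as follows. This is an illustration only, not the calculation actually used: it assumes a linear A405 standard curve, a molar mass of 121.16 g/mol for free Cys, and the aliquot and tube volumes given in the methods. All absorbance values and function names are hypothetical.

```python
CYS_MW = 121.16  # g/mol for free L-cysteine (assumed value for the sketch)


def fit_standard_curve(cys_ug, a405):
    """Least-squares slope and intercept of A405 versus Cys mass (ug)."""
    n = len(cys_ug)
    mean_x = sum(cys_ug) / n
    mean_y = sum(a405) / n
    sxx = sum((x - mean_x) ** 2 for x in cys_ug)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cys_ug, a405))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cys_from_absorbance(a405, slope, intercept):
    """Cys mass (ug) in the assayed aliquot, read off the standard curve."""
    return (a405 - intercept) / slope


def scale_to_tube(aliquot_ug, tube_ul=450.0, aliquot_ul=10.0):
    """Scale the Cys found in a 10 uL aliquot to the whole 450 uL tube."""
    return aliquot_ug * tube_ul / aliquot_ul


def binding_capacity(control_ug, beads_ug, bead_volume_ul):
    """Binding capacity in umol Cys per mL of beads (ug / (g/mol) = umol)."""
    captured_umol = (control_ug - beads_ug) / CYS_MW
    return captured_umol / (bead_volume_ul / 1000.0)


# Hypothetical standard curve for 0, 0.5, 1, 2, 4, and 8 ug Cys.
standards_ug = [0, 0.5, 1, 2, 4, 8]
standards_a405 = [0.05, 0.10, 0.15, 0.25, 0.45, 0.85]
slope, intercept = fit_standard_curve(standards_ug, standards_a405)
```

Under these assumptions, 50 µL of beads that remove about 13.3 µg of the 20 µg Cys offered would correspond to a capacity close to the 2.2 µmol/mL gel reported here.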
binding sites.32 This procedure was followed and did not appear to affect the binding or elution of the Cys-containing peptides (not shown). The binding capacity of the beads was determined after binding Cys and reactivation by incubation with HgCl2.21,24 The capacity of the beads was reduced by ∼70% after 1 usage. A simple procedure was followed to isolate Cys-containing peptides (Figure 1b). After digestion and reduction of disulfide bonds with TCEP, the digested solution was diluted and added directly to the agarose. TCEP did not inhibit binding of peptides to the beads at the concentrations needed for complete reduction of disulfide bonds (not shown). Peptide solutions were incubated with the beads for 30 min with end-over-end mixing, and the supernatant was removed. The beads were washed with PBS:CH3CN (1:1) for 10 min 3 times, and this effectively removed nonspecifically bound peptides. Cys-containing peptides were readily liberated by incubation with DTT, and the lysate, nonbinding, wash, and eluted peptide samples were analyzed first by 1-D nano-LC DDA/MS.
One-Dimensional nano-LC/MS (QStar). Initial analysis was performed using 1-D nano-LC data dependent tandem mass spectrometry to evaluate the effectiveness of the Hg-beads for enrichment of Cys-containing peptides. An initial run of each sample was used to estimate empirically the quantity of sample to load, and amounts were chosen in order to maximize the number of cycles of data dependent mass spectra/run (not shown). In this way an approximately equivalent quantity of peptides was loaded and analyzed for each sample. As expected, a similar number of queries was obtained for all samples except wash 3, where the total quantity of peptides in the sample was low (Figure 2).
Once the load amount was established, each sample was run 3 times using a 90 min LC gradient (Supporting Information Figure 1) in order to maximize the number of proteins identified.33 The derived peak lists from each sample were combined before Mascot searches using a yeast database and a shuffled yeast database with 1 or 3 missed cleavages. A peak list file combining all samples except the elute sample was also
(32) Sluyterman, L. A.; Wijdenes, J. Biochim. Biophys. Acta 1970, 200, 593-595.
(33) Liu, H.; Sadygov, R. G.; Yates, J. R., III Anal. Chem. 2004, 76, 4193-4201.
Figure 2. Analysis of 1-D LC QStar data to determine the optimum Mowse score and false positive rate. A. Yeast proteins identified before, in washes, and after elution of Hg-beads with Mowse scores of a single peptide >25 (black), >33 (dark gray), >40 (light gray), or >50 (white) after searching the yeast database. The number of queries for each sample was 2511 (load), 2755 (flow through), 2825 (wash 1), 2158 (wash 2), 940 (wash 3), 11 229 (all loaded), and 2775 (elute), respectively. B. Estimated false positive rates for each sample and Mowse scores (>25 (black), >33 (dark gray), >40 (light gray), or >50 (white)). C. Venn diagram showing proteins identified from the all load sample (Mowse >40, false positives ∼1%) and compared to the proteins identified from the Elute sample (Mowse >33, false positives ∼1%). Unique and common proteins were observed in each sample.
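The false positive rates in panel B are estimated from a shuffled-database search as 2 × n_shuff/(n_shuff + n_real) × 100, where n_shuff and n_real are the numbers of identifications above a given score in the shuffled and real databases. A minimal sketch of that estimate, and of choosing the lowest score cutoff that meets a target rate, is given here. The scores are invented for illustration and the function names are hypothetical; this is not the Mascot or Protein Results Parser workflow itself.

```python
def estimated_false_positive_rate(n_shuffled, n_real):
    """Percent false positives: 2 * n_shuff / (n_shuff + n_real) * 100."""
    total = n_shuffled + n_real
    if total == 0:
        raise ValueError("no identifications above the cutoff")
    return 2.0 * n_shuffled / total * 100.0


def lowest_cutoff_for_rate(real_scores, shuffled_scores, cutoffs, target_pct=1.0):
    """Return the lowest cutoff whose estimated rate is <= target_pct, else None."""
    for cutoff in sorted(cutoffs):
        n_real = sum(score > cutoff for score in real_scores)
        n_shuffled = sum(score > cutoff for score in shuffled_scores)
        if n_real + n_shuffled == 0:
            continue  # nothing passes this cutoff, so no rate can be estimated
        if estimated_false_positive_rate(n_shuffled, n_real) <= target_pct:
            return cutoff
    return None


# Invented scores: most shuffled-database hits fall just above the lowest cutoff.
real = [26] * 100 + [34] * 100 + [45] * 100
shuffled = [26] * 20 + [34] * 4
best = lowest_cutoff_for_rate(real, shuffled, cutoffs=[25, 33, 40, 50])
```

Scanning cutoffs from low to high mirrors the trade-off shown in the figure: a higher cutoff lowers the estimated rate but also removes true identifications.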
searched. Figure 2A shows the number of proteins identified from each fraction using different Mowse cutoff scores. The estimated false positive rate4 was calculated at different Mowse scores after searching a shuffled database by means of the formula % false positive = 2[n_shuff/(n_shuff + n_real)] × 100 (Figure 2B).4 Few peptides with more than 1 missed cleavage were detected (see Supporting Information Tables 1.1, 1.2, 3.1, and 3.2 for a comparison). Higher Mowse scores reduced the number of identified proteins but allowed the estimated false positive rate to be reduced to approximately 1 in 100. Interestingly, the Mowse score determined for an estimated false positive rate of ∼1% was >40 for the yeast lysate samples (non-Cys-containing peptides) and >33 for the elute sample (Cys-containing peptides). Comparison of both sets of identifications shows common protein identifications and unique identifications for both samples (Figure 2C). With an estimated false positive rate of ∼1%, 119 unique proteins (Mowse >40) were identified from the search of the load and wash samples, 228 unique proteins (Mowse >33) were identified from the search of the elute sample, and 94 proteins were common to both data sets (Figure 2C). Figure 3 shows a comparison of the unique peptides identified from Mascot searches from all samples without Cys in their sequence or with Cys in their sequence. A comparison of searches
Figure 3. Comparison of yeast peptides identified with Mowse scores >20 or >36, before, in washes, and after Hg-beads. Most peptides that did not contain Cys (light gray) in their sequence were identified before enrichment compared to almost exclusive identification of Cys-containing peptides (black) after enrichment (acquired using 1-D LC and QStar).
Figure 4. Comparison of the proportion of unique yeast peptides (A) and proteins (B) identified from Mascot searches before, in flow through, wash 1, all load sample, and after elution of Hg-beads, with Mowse scores of a single peptide >25 (black), >33 (dark gray), >40 (light gray), or >50 (white) after searching the yeast database. The number of queries for each sample was 2511 (load), 2755 (flow through), 2825 (wash 1), 11 229 (all loaded), and 2775 (elute), respectively (* shows data with a false positive rate of ∼1%).
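The success rates compared in Figure 4 are the number of unique identifications divided by the number of queries (×100). A small sketch of that arithmetic follows. It assumes that the total number of proteins identified in the elute sample equals the 228 proteins unique to it plus the 94 shared with the load samples (Figure 2C); that reading is an assumption, and the function name is hypothetical.

```python
def identification_success_rate(n_unique_ids, n_queries):
    """Percent of queries that led to a unique peptide or protein identification."""
    if n_queries <= 0:
        raise ValueError("number of queries must be positive")
    return 100.0 * n_unique_ids / n_queries


# Elute sample: 228 unique + 94 shared proteins (assumed total), 2775 queries.
elute_proteins = 228 + 94
elute_rate = identification_success_rate(elute_proteins, 2775)
```

With these inputs the rate comes out at about 11.6%, in line with the ∼12% protein identification success rate quoted for the elute sample.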
with a low Mowse score >20, which allowed more peptides to be identified, and an intermediate Mowse score >36 were used for the searches. Few peptides containing Cys were identified (∼1%) in the load or flow through samples with either Mowse cutoff. Increasing numbers of Cys-containing peptides were found in the wash samples, probably because of some nonspecific elution of these from the Hg-beads. Mascot searches of the peak lists from the elute sample showed almost exclusively Cys-containing peptides (Mowse >20) and only Cys-containing peptides (Mowse >36). This demonstrates the high specificity of the Hg-beads for capturing and retaining Cys-containing peptides from digested yeast lysates. Figure 4 shows a comparison of the percent success rate for identifying yeast peptides or proteins at different Mowse scores, for the load, flow through, wash 1, all load sample, and the elute samples. Values were calculated by dividing the number of unique peptides or proteins identified from Mascot searches by the number of queries (×100). With an estimated false positive rate of ∼1% (Mowse >40), ∼20-25% of the queries led to successful identifications using the load, flow through, or wash 1 sample. This decreased to a 33, estimated false positive rate of ∼1%). Interestingly, the estimated FP rate (∼1%) occurred with a lower Mowse score from the spectra obtained from the Cys-containing peptide sample, suggesting that the interpretability using Mascot of low-energy CID spectra of Cys-containing peptides may be better. The precise reason for this is not known, but it may be simply that fragmentation spectra from a more diverse range of peptides were obtained from the elute sample vs more spectra/protein for the load samples. If successful protein identifications were similarly analyzed, then only a 5-6% identification success rate was observed for the load, flow through, or wash 1 sample (Figure 4B).
The protein identification success rate was twice that (∼12%) from the elute sample (Cys-containing peptide fraction) with a Mowse score >33 or 40. There were two possible reasons why fewer proteins/query were identified from the total lysate samples compared to the elute sample. The calculated number of Cys-containing tryptic peptides (m/z 350-3500) was 25 838, whereas the total number of tryptic peptides calculated was 230 972 from in silico digestion of all yeast proteins and 76.5% of proteins have 25. Only a small number of proteins were identified when searching the shuffled database if the Mowse score was set to >25, which corresponded to an estimated false positive rate of ∼1% (Figure 5B,C). The separation efficiency of the automated online SCX/nano RP LC separation was evaluated by comparing the number of unique proteins identified in each fraction and if proteins were identified in any other 2-D nano-LC separation (Figure 5D). The load and all salt steps contained unique protein identifications, but few unique proteins were identified in the load and the last 5 salt steps. However, all salt steps contained a large number of identified proteins, but many were redundant identifications and found in more than one salt step, showing the relatively poor separation efficiency of the small SCX cartridge used here. Time and equipment did not allow comparison with an offline SCX separation, but more proteins would likely be identified if an offline SCX protocol followed by MS was possible.7 Nevertheless, the number of yeast proteins identified with high confidence from only analyzing the Hg-bead elute fraction was 1496, and this was similar to the numbers of proteins identified in previous studies.1,2,4 Approximately 96% of the proteins were identified from peptides that contained Cys in their sequence, and ∼14% of the proteins identified had 1 or more peptides that did not contain Cys (not shown). 
This suggests that the effectiveness of the enrichment procedure was good and comparable with other Cys enrichment procedures.34 The proportion of identified high scoring peptides that contained Cys in their sequence was 88% for either a single or the combined 2-D LC/MS data (Figure 6A). If peptides from the top 100 identified proteins were analyzed, then only 73% of the peptides contained Cys in their sequence, and the proportion increased to 92% if all the other proteins were
(34) Whiteaker, J. R.; Zhang, H.; Eng, J. K.; Fang, R.; Piening, B. D.; Feng, L. C.; Lorentzen, T. D.; Schoenherr, R. M.; Keane, J. F.; Holzman, T.; Fitzgibbon, M.; Lin, C.; Cooke, K.; Liu, T.; Camp, D. G., II; Anderson, L.; Watts, J.; Smith, R. D.; McIntosh, M. W.; Paulovich, A. G. J. Proteome Res. 2007, 6, 828-836.
Figure 5. Unique yeast proteins identified after elution of peptides from Hg-beads combining queries from two separate experiments using automated online 2-D LC and LTQ-FT Ultra (10 ppm and 0.6 Da). A. Numbers of proteins identified after searching the yeast database at different Mowse scores. B. Numbers of proteins identified after searching a shuffled yeast database at different Mowse scores. C. Estimated false positive rates with increasing Mowse scores. More than 1600 proteins were identified with a Mowse score >20 with an estimated false positive rate ∼3.5%. A Mowse score of >25 allowed 1496 proteins to be identified and reduced the estimated false positive rate to 1.1%. D. Numbers of unique yeast proteins identified after elution of peptides from Hg-beads from one 2-D LC separation and LTQ-FT Ultra (10 ppm and 0.6 Da) from ∼49 000 queries. The number of unique proteins identified exclusively in each SCX sample (-◆-), the number of yeast proteins identified in each SCX fraction and any other SCX fraction (-▲-), and the number of yeast proteins identified in any other second dimension salt step (-■-) are shown. High numbers of yeast proteins were identified in all fractions. Unique proteins were also identified in all fractions, but most proteins were identified within the first 7 salt steps. Almost all proteins were identified in more than 1 fraction. This indicates that some carryover from one fraction to another occurred, and this may have diminished the total number of possible identifications.
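The per-fraction comparison in panel D amounts to asking, for each SCX fraction, how many of its proteins were seen only there and how many were also seen in another fraction. A minimal sketch of that bookkeeping is given here, assuming each fraction is available as a set of protein accessions. The fraction labels and accessions are placeholders and the function name is hypothetical.

```python
from collections import Counter


def fraction_overlap(fractions):
    """Summarize exclusive and shared protein counts per SCX fraction.

    fractions maps a fraction label to an iterable of protein accessions.
    """
    # Count in how many fractions each protein was identified.
    seen_in = Counter()
    for proteins in fractions.values():
        seen_in.update(set(proteins))

    summary = {}
    for label, proteins in fractions.items():
        unique_set = set(proteins)
        exclusive = sum(1 for p in unique_set if seen_in[p] == 1)
        summary[label] = {
            "total": len(unique_set),
            "exclusive": exclusive,
            "shared": len(unique_set) - exclusive,
        }
    return summary


# Placeholder data: three fractions with partly overlapping identifications.
example = {
    "load": {"P1", "P2"},
    "5 mM": {"P2", "P3"},
    "10 mM": {"P3", "P4", "P5"},
}
overlap = fraction_overlap(example)
```

A large "shared" count relative to "exclusive" in most fractions is the pattern described in the caption as carryover between salt steps.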
analyzed. It would be expected that the higher scoring proteins from Mascot searches would be from the more abundant proteins. The increased proportion of peptides without Cys in their sequence found in the higher abundance proteins suggests that the identified peptides (without Cys) were present in lower abundance compared to many Cys-containing peptides. Further evidence to support the identification of low abundance peptides without Cys from high abundance proteins was obtained by analysis of the number of repeat analyses of the same peptide.33 The peptides from the Mascot searches for the top 100 identified proteins (from a single 2-D separation) show that there were very
Figure 6. Analysis of 2-D LC DDA/MS data showing that some higher abundance proteins were identified from peptides that did not contain Cys. A. Comparison of identified peptides to determine if Cys was (black) or was not (light gray) in unique peptide sequences derived from Mascot searches (Mowse >25). 88% of peptides had Cys in the sequence, both in a single 2-D nano-LC separation and in the search data combining two separations (LTQ-FT). This decreased to 73% if peptides from the top 100 identified proteins were analyzed and increased to 92% if the remaining proteins were analyzed. This shows that more peptides without Cys in their sequence were identified from the abundant proteins (see text for details). B. Redundant peptide identifications after Mascot searches (Mowse >25) of spectra from a single 2-D nano-LC and LTQ-FT experiment (data from the top 100 identified proteins). All peptides (-◆-), peptides with Cys in their sequence (-■-), and peptides without Cys in their sequence (-▲-) were analyzed. High abundance peptides were more frequently analyzed and identified, and these normally contained Cys in their sequence, whereas the lower abundance peptides (often without Cys) were analyzed less often.
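The two panels correspond to two simple tallies over a list of peptide identifications. The following is a minimal sketch, assuming one list entry per identified fragmentation spectrum and one-letter amino acid codes (so Cys is "C"); the function names and the peptide sequences are invented for illustration and do not come from the Mascot output of this study.

```python
from collections import Counter


def has_cys(peptide):
    """True if the one-letter peptide sequence contains cysteine (C)."""
    return "C" in peptide.upper()


def cys_fraction(identifications):
    """Fraction of unique peptide sequences that contain Cys (panel A style)."""
    unique = set(identifications)
    if not unique:
        return None
    return sum(has_cys(p) for p in unique) / len(unique)


def repeat_counts(identifications):
    """Split per-peptide identification counts by Cys content (panel B style)."""
    counts = Counter(identifications)
    with_cys = {p: n for p, n in counts.items() if has_cys(p)}
    without_cys = {p: n for p, n in counts.items() if not has_cys(p)}
    return with_cys, without_cys


# Toy list: one entry per identified spectrum (sequences are invented).
identifications = [
    "ACDEFK", "ACDEFK", "ACDEFK",
    "MNCPQR", "MNCPQR",
    "GHILK",
    "STVWYK",
]

print(cys_fraction(identifications))
print(repeat_counts(identifications))
```

In this toy list the Cys-containing peptides are identified repeatedly while the peptides without Cys appear once each, which is the pattern the caption describes for the real data.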
different numbers of repeat analyses for peptides with and without Cys in their sequence (Figure 6B). Most identified peptides without Cys in their sequence were analyzed and identified by Mascot searches once or twice or, in a few instances, up to 10 times. Most peptides with Cys in their sequence were analyzed and identified twice or more, and some up to 100 times. The number of repeat fragmentation spectra has been shown to be predictive for peptide abundance in complex protein digests.33 Thus, the reason for the higher number of proteins identified from peptides without Cys in their sequence in the 2-D LC and LTQ-FT experiments was probably the greatly improved sensitivity of this combination, which allowed detection of many more of the lower abundance peptides present in the sample.
Comparison with Previous Studies. The studies so far reported for the analysis of yeast have shown that large numbers of proteins may be identified with high confidence, and quantitative changes in protein expression may be measured.1,3-7 The proteins
Figure 7. Venn diagrams showing comparisons of yeast proteins identified in this study with yeast proteins identified previously (A, ref 2; B, ref 4; and C, ref 1). In two cases (A and B) similar total numbers of proteins were identified in the studies, and in the recent Mann study ∼3/4 of the total numbers of proteins were identified (see Supporting Information Table 10 for more information). However, a large number of proteins that were not detected previously were identified after elution of Hg-beads and mass spectrometry. This indicates that more unique proteins may be identified with high confidence if this type of fractionation strategy is used.
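Each Venn diagram reduces to three set operations on lists of protein identifiers. The sketch below is a minimal illustration, assuming both studies report proteins under the same identifier scheme; the function name and the placeholder identifiers are hypothetical.

```python
def venn_counts(this_study, previous_study):
    """Return the three region sizes of a two-set Venn diagram."""
    a, b = set(this_study), set(previous_study)
    return {
        "shared": len(a & b),           # found in both studies
        "only_this_study": len(a - b),  # found only after Cys enrichment
        "only_previous": len(b - a),    # found only in the earlier report
    }


# Placeholder identifiers, invented for illustration only.
this_study = {"prot_A", "prot_B", "prot_C", "prot_D"}
previous_study = {"prot_C", "prot_D", "prot_E"}

counts = venn_counts(this_study, previous_study)
print(counts)

# Consistency check: the two regions covering this study sum to its total.
assert counts["shared"] + counts["only_this_study"] == len(this_study)
```

The same consistency check can be applied to the reported figures: for example, 990 shared proteins plus 506 proteins found only here gives the 1496 proteins identified at Mowse >25, assuming those two values belong to the same comparison.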
identified after elution of Cys-containing peptides and analyzed by 2-D LC DDA/MS and LTQ-FT followed by Mascot searches were compared to proteins identified in three previous studies where the proteins identified were readily available as Supporting Information (Figure 7 and Supporting Information Table 10). In these comparisons it was clear that many common proteins (between 722 and 990) were identified, and many proteins (506-774) were only identified after enrichment of Cys-containing peptides. A similar number of proteins (688-1011) were only identified in previous reports. Even in comparison to the most comprehensive study to date where the proteins identified are reported,1 an additional 506 unique yeast proteins were identified in this study, with estimated false positive rates of ∼1%. These improvements in the number of protein identifications were similar to those shown for mammalian cells, where an ∼30% improvement in identified proteins may be expected by enriching and analyzing the Cys-containing peptides.16,17 Supporting Information Table 10 shows that the unique proteins identified in this study were from diverse categories, and there were no obvious properties that would account for their identification here. The absolute amounts of many proteins present in yeast have been reported,35 and these data allow a direct comparison of the abundance of yeast proteins identified in proteomics experiments.1,33 An additional 167 proteins identified here were not detected using Western analysis of TAP-tagged proteins or GFP expression tags (Supporting Information Table 10d). A graph (Supporting Information Figure 4a) showing proteins identified from the 2-D separations using all peptides and proteins identified from Ghaemmaghami et al.35 vs concentration shows that both
(35) Ghaemmaghami, S.; Huh, W. K.; Bower, K.; Howson, R. W.; Belle, A.; Dephoure, N.; O’Shea, E. K.; Weissman, J. S. Nature 2003, 425, 737-741.
high and low abundance proteins were detected and is similar to previous comparisons.1 Supporting Information Figure 4b contains analysis of the percentage of proteins found at all concentration bins for yeast using proteins from Ghaemmaghami et al.,35 protein identifications from all peptides, protein identifications from peptides with Cys in the sequence, and protein identifications from peptides without Cys in the sequence. Analysis of protein identifications using all peptides and peptides with Cys in their sequence showed a similar distribution, with identification of both high and low abundance proteins. If only proteins identified from peptides without Cys in their sequence were analyzed, a distinct bias toward high abundance proteins was clearly evident, which further supports the notion that these peptides were present in low abundance after enrichment using Hg-beads.
CONCLUSIONS
Hg-beads were readily prepared and effectively captured Cys-containing peptides from a tryptic digest of yeast proteins. The peptides were released from the beads by DTT and were suitable for analysis by 1- and 2-D nano-LC DDA/MS. An advantage of these beads over TSP methods was the ability to digest and capture peptides in a single step. The beads may also be reused several times, although the binding capacity was substantially reduced after use. By reducing the number of peptides from each protein to ∼1, large numbers of yeast proteins were rapidly identified with high confidence after enrichment of Cys-containing peptides using Hg-beads and nano-LC, tandem MS, and database searches. 1-D nano-LC and QStar identified only Cys-containing
proteins, but some peptides without Cys in their sequence were identified using better separation and more sensitive detection technologies (LTQ-FT), and these peptides were generally from more abundant proteins. In general, the proteins identified after enrichment were from both low and high abundance proteins, and the numbers of proteins identified were similar to those from analysis of unenriched samples. Up to 50% of the identified proteins were not detected in previous studies of yeast proteins, where protein identifications are reported, and the enrichment procedure provides a relatively simple and rapid way of obtaining large numbers of unique protein identifications or increasing the number of proteins identified from complex samples.
ACKNOWLEDGMENT
This work was supported in part by the Faculty of Medicine (UNSW) and grants from The Australian Research Council (Systemic Infrastructure Initiative, Major National Research Funds and LIEF funds). A copy of Protein Results Parser was kindly supplied by Dr. R. Arnold (Indiana University).
SUPPORTING INFORMATION AVAILABLE
Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.
Received for review December 14, 2007. Accepted February 14, 2008. AC702539Q
Analytical Chemistry, Vol. 80, No. 9, May 1, 2008