Personal Commentary on Proteomics
Donald F. Hunt
Department of Chemistry, University of Virginia, Charlottesville, Virginia 22904-4319
Received January 18, 2002

Bill Hancock, the editor of this journal, asked me to provide some personal comments on the current state of proteomics, how we’ve gotten to this point, and where the problems lie for which research in this field will provide answers. It is a challenge to which I am happy to accede. Bill and I agreed that the most straightforward way to approach a perspective such as this would be a question-and-answer format. We arrived at a series of six questions that we believe will provide writers and readers alike with the means to look briefly at the history of proteomics and the current state of affairs and attempt to perceive the future of both research and the resulting applications. These questions were answered verbally, and this paper was adapted from the written transcription of that conversation.

How Would You Define Proteomics?

Proteome has been defined as the PROTEin complement expressed by a genOME or tissue.1 Proteomics, then, is the field that involves the identification, characterization, and quantitation of proteins in tissues or whole cells. The challenge is to analyze the 10000-20000 proteins expressed in a mammalian cell at a given time point and characterize them all simultaneously. In an ideal situation, protein characterization would likely include sequence analysis and cellular localization, plus identification of posttranslational modifications, splice variants, and binding partners.

10.1021/pr020300a CCC: $22.00 © 2002 American Chemical Society

What Were Your Early Efforts in this Field?

Several events in the late 1970s triggered my entry into the field of proteomics, or protein chemistry as it was known at that time. In 1976, George Stafford, Jr., a graduate student in my group, developed pulsed positive-ion/negative-ion mass spectrometry, a technique that allowed simultaneous detection of positive and negative ions on a quadrupole mass spectrometer (Figure 1).2 A little bit later, Frank Crow, another graduate student, showed that electron-capture, negative-ion, chemical-ionization mass spectrometry could be up to 500 times more sensitive than positive-ion mass spectrometry (Figure 2).3 About the same time, my long-time colleague, Jeffrey Shabanowitz, who was then a graduate student, constructed one of the first triple-quadrupole instruments and began to demonstrate the power of this instrument to characterize the structure of molecules present in complex mixtures.4

In 1978, a colleague and friend, Howard Morris, in the Biochemistry Department of Imperial College, London, asked if he could spend a semester sabbatical with me to learn how to use electron-capture, negative-ion, mass spectrometry to quantitate enkephalins, neuropeptide pain killers found at trace levels in the brain. Since he was an expert in the technique of permethylation to make peptides more volatile, I requested, in turn, that he teach my group how to work with this methodology. He agreed, and shortly thereafter, we published the first paper detailing the use of tandem mass spectrometry to characterize the sequence of permethylated peptides directly from mixtures without prior purification.5

Next, we searched for a sample of a relatively simple protein of unknown sequence that we could characterize by a combination of enzymatic cleavage, off-line chromatography, and tandem mass spectrometry. Unfortunately, the Biochemistry Department at UVA gave me a generous supply of impure apolipoprotein B and indicated that its molecular weight was probably about 12 kDa. Since the actual molecular weight turned out to be in excess of 500 kDa, we actually had, in moles, 40 times less than we thought each time we weighed it out. Sample size for successful analysis of a permethylated peptide by tandem mass spectrometry at that time was in the micromole range. Apolipoprotein B, because of its size and complexity, was certainly not the ideal model for testing new methodology. Needless to say, we made little or no progress for several years.

Late in 1980, M. Barber announced the development of fast atom bombardment (FAB) as an ionization technique for nonvolatile peptides.6 Early presentations described the method as involving bombardment of a sample matrix with kilovolt energy argon atoms. The nature of the matrix was kept secret, but several groups including our own recognized the presence of signals due to glycerol in the mass spectra recorded on peptides.
Accordingly, Shabanowitz quickly implemented the technology on our triple-quadrupole instrument and demonstrated the utility of combining FAB with tandem mass spectrometry to sequence proteins.7 The paper concluded with the following statement: “We envision that this (strategy) will entail degradation of the protein to peptides, the optimal length of which will be determined by the mass range of the mass spectrometer (1,000 to 3,000 Da) rather than by the volatility of the mixture components. The mixture of peptides will then be fractionated by high-performance liquid chromatography and each fraction will be analyzed directly, without further purification, by the combination of secondary ion/collision activated dissociation mass spectrometry on a multi-analyzer instrument. Generation and analysis of the peptides could be completed in a matter of hours.” Sample size required for analysis by FAB was in the picomole range. A full paper describing the above approach, derivatization schemes, and instrumental methods plus sequence information derived from apolipoprotein B appeared in PNAS in 1986.8 In the next two years, we published nine papers describing the sequence analysis of proteins and characterization of posttranslational modifications by mass spectrometry. Still, biochemists remained skeptical that mass spectrometry would ever displace Edman degradation as the method of choice for protein sequence analysis.

Journal of Proteome Research 2002, 1, 15-19
Published on Web 02/15/2002

Figure 1. Fragmentation observed in the positive and negative ion spectrum recorded on the N-acetyl permethylated peptide, Met-Gly-Met-Met. Reprinted with permission from ref 2. Copyright 1976 American Chemical Society.

Figure 2. Bronsted acid-electron capture mass spectrum of the pentafluorobenzoyl derivative of amphetamine that shows a 500-fold increase in the ion current observed in the negative ion spectrum. Reprinted with permission from ref 2. Copyright 1976 American Chemical Society.

Use of low energy collisions on a triple-quadrupole mass spectrometer for sequence analysis of peptides and proteins finally gained acceptance by biochemists as a result of our participation in the third annual peptide sequence analysis competition held at the 1988 meeting of the Protein Society. Jean Rivier, of the Salk Institute, synthesized and distributed 3 nmol of a standard test peptide, STP-3, prior to the meeting in an effort to challenge the capabilities of the best microsequencing laboratories in the country. Two of my graduate students, John Yates, III, and Patrick Griffin plus Shabanowitz and I worked on the sample. We subjected it to a variety of derivatization procedures, enzymatic digestion, cyanogen bromide cleavage, subtractive Edman degradation, MS/MS on a triple-quadrupole mass spectrometer, and laser photodissociation on a home-assembled tandem quadrupole-Fourier transform instrument.9 Shown in Figure 3 is the laser dissociation spectrum recorded for intact STP-3. Observed fragmentation was sufficient to define the linkage of the two peptide chains to the C-terminal Lys-CONH2 residue. We finally came up with what we thought was a definitive structure, 1, on the plane ride to attend the meeting in San Diego.10 Although we did not know it at the time, Rivier organized presentations by the contestants at the meeting so that those who got part of the structure correct went first and those who deduced the complete sequence went last. Three groups were scheduled to make presentations after ours.

Journal of Proteome Research • Vol. 1, No. 1, 2002

Following my address to the audience of more than 500 biochemists, Rivier instructed me to leave our final structure up on the screen and then asked if it was possible that I had made a mistake. After some thought, I assured him that the mass spectrometer did not tell fibs and that our structure was correct. Needless to say, I was quite shocked when the next speaker, Ken Williams, from Yale University, described data obtained by a combination of molecular mass measurements by mass spectrometry and microsequence analysis by Edman degradation and came up with structure 2 containing a different disulfide linkage.11 Rivier then asked me to explain the discrepancy. All I could do under the circumstances was to say that I was confident that we had the correct structure. Fortunately, the next two groups, John Shively et al. from the Beckman Research Institute12 and a team from Applied Biosystems, Inc.,13 both arrived at the same answer as we did. All three teams deduced the correct structure, but only our group relied on tandem mass spectrometry to generate the necessary data. The following year, my group also employed tandem mass spectrometry to deduce the structure of the next synthetic test peptide, STP-4 (3).14 After these two accomplishments, many research groups approached us to collaborate on protein structural problems. To further promote acceptance of the triple-quadrupole technology for protein sequence analysis, we also offered a number of short courses at the University of Virginia. Over several years, we taught more than 400 scientists from around the world to handle and digest proteins, derivatize peptides, operate the triple-quadrupole instrument, and interpret the resulting MS/MS spectra.

Figure 3. Laser photodissociation Fourier transform mass spectrum recorded at the 50 pmol level on the (M + 1)+ ion, m/z 2778.3, of STP-3. Reprinted with permission from ref 10. Copyright 1989 Academic Press.

With Which Particular Aspects of Proteomics Are You Involved?

For the past several years, the major focus of our research effort has been to develop methodology (a) for identification of proteins in complex mixtures by the combination of nanoflow-HPLC interfaced to electrospray ionization on an ion trap or Fourier transform mass spectrometer, (b) for selective analysis of phosphoproteins, (c) for the display and sequence analysis of proteins differentially expressed by healthy and diseased cells or by cells treated with and without a therapeutic agent or agonist, and (d) for characterization of protein-protein interactions in the cell.

Automated high-throughput analysis of protein samples is accomplished by digesting the sample with trypsin and then analyzing the resulting mixture of tryptic peptides (40 peptides/protein) by nanoflow-HPLC (5-200 nL/min) interfaced directly to electrospray ionization on a Thermo Finnigan LCQ instrument.15 Proprietary peak parking technology16 under control of the mass spectrometer data system is employed to sequence up to 100 different peptides that happen to elute in the same 10 s window from the chromatography column. At the end of the chromatographic run, proteins in the original mixture are identified by processing the complete set of MS/MS spectra against the protein and nucleic acid databases using the SEQUEST software program.17 Up to 6000 sequences can be obtained in a single 4 h chromatographic run with the above technology. Spectra with no match are interpreted manually.

To characterize most, if not all, phosphoproteins from a whole cell lysate in a single experiment, cellular proteins are digested with trypsin, and the resulting peptides are enriched for phosphopeptides by immobilized metal affinity chromatography and analyzed by nanoflow HPLC/electrospray ionization mass spectrometry. As shown in Figure 4, phosphopeptides, present in a complex mixture of tryptic peptides, can only be detected following selective enrichment by immobilized metal affinity chromatography. More than 1000 phosphopeptides were detected when the methodology was applied to the analysis of a whole cell lysate from S. cerevisiae. Sequences for 216 peptides have been confirmed to date.18

Figure 4. Mass spectra recorded at the elution time of the phosphopeptide, MEpSTEVFTK, during LC/MS analysis of a complex mixture of tryptic peptides before and after immobilized metal affinity chromatography.

Differential display of proteins expressed in two different cell populations is performed on a home-assembled Fourier transform mass spectrometer.19 This instrument operates with a detection limit in the low attomole range, mass resolving power of approximately 10000, mass accuracy in the millimass range, and dynamic range greater than 1000. Mass spectra of tryptic peptides from two different samples are analyzed sequentially by highly reproducible nanoflow-HPLC interfaced to electrospray ionization. Spectra acquired on the two samples are then subtracted from each other with an in-house software package to determine which peptides are more abundant in one sample than the other. With a sample size of 10^6 cells, peptides are detected at the level of 5-10 copies/cell. Sequence analysis of these peptides is performed in a second HPLC run on the Thermo Finnigan LCQ-Deca instrument operating in the targeted MS/MS mode.
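The subtraction step and the copies-per-cell figure above can be illustrated with a minimal Python sketch. It assumes that each sample has already been reduced to a mapping from a peptide key (here a hypothetical m/z and retention-time pair) to an ion abundance. The function names, the difference threshold, and the example values are illustrative assumptions and are not taken from the in-house software package mentioned in the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_to_attomoles(copies_per_cell, n_cells):
    """Convert a copies/cell level for n_cells cells into attomoles."""
    molecules = copies_per_cell * n_cells
    return molecules / AVOGADRO * 1e18  # mol -> amol


def subtract_samples(sample_a, sample_b, min_diff=100.0):
    """Subtract sample B abundances from sample A abundances.

    sample_a, sample_b: dict mapping a peptide key to an abundance.
    A peptide missing from one sample is treated as abundance 0.
    Only differences of at least min_diff (a hypothetical threshold)
    are reported; positive values mean "more abundant in A".
    """
    differences = {}
    for key in set(sample_a) | set(sample_b):
        diff = sample_a.get(key, 0.0) - sample_b.get(key, 0.0)
        if abs(diff) >= min_diff:
            differences[key] = diff
    return differences


# Hypothetical (m/z, retention time in min) -> abundance values.
healthy = {(500.25, 12.0): 1000.0, (612.80, 15.5): 400.0, (733.40, 20.1): 50.0}
treated = {(500.25, 12.0): 980.0, (612.80, 15.5): 40.0, (801.10, 22.3): 300.0}

changed = subtract_samples(healthy, treated)
low_amol = copies_to_attomoles(5, 1e6)
high_amol = copies_to_attomoles(10, 1e6)
```

For 10^6 cells, 5-10 copies/cell works out to roughly 8-17 amol of each peptide, which is in line with the low attomole detection limit quoted for the instrument.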
To date, we have employed the above technology to detect proteins up-regulated during sporulation of the bacterium Bacillus subtilis and proteins expressed in the periplasmic space of E. coli growing in media containing a limited source of nitrogen. Work to monitor phosphoproteins up-regulated in signal transduction pathways following treatment of cells with several agonists is presently in progress. Jarrod Marto and John Syka have constructed a new linear trap-Fourier transform instrument in the laboratory that operates with a 10-fold increase in sensitivity and dynamic range and also facilitates sequence analysis of peptides at the low attomole level.20

To probe protein-protein interactions, a bait protein is labeled with an affinity tag,21,22 expressed in cell culture, and then isolated from lysed cells along with its associated partners by affinity chromatography. Proteins are released with acid or salt, digested with trypsin, and the resulting peptides are then analyzed as described above. Recent studies in our laboratory have identified a number of proteins that interact with the androgen receptor and 28 components of the U3 small nucleolar ribonucleoprotein complex that is involved in pre-rRNA processing.
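Several of the workflows in this section begin by digesting proteins with trypsin. As a rough illustration of what that step produces, the sketch below performs an in-silico digest using the commonly cited rule that trypsin cleaves on the C-terminal side of lysine (K) and arginine (R), except when the next residue is proline (P). The function name and the example sequence are hypothetical (the sequence is simply built around the unmodified form of the peptide in Figure 4), and this is not the software used in the laboratory.

```python
def tryptic_digest(sequence, min_length=1):
    """Split a protein sequence (one-letter codes) into tryptic peptides.

    Cleaves after K or R unless the following residue is P.
    Peptides shorter than min_length are dropped.
    """
    peptides = []
    start = 0
    last_index = len(sequence) - 1
    for i, residue in enumerate(sequence):
        if i == last_index:
            break  # nothing to cleave after the final residue
        if residue in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return [p for p in peptides if len(p) >= min_length]


# Hypothetical sequence: the R at position 12 is followed by P, so it is not cleaved.
example = tryptic_digest("MESTEVFTKAGRPLKWR")
```

Here `example` contains three peptides, `MESTEVFTK`, `AGRPLK`, and `WR`; real digests also show missed cleavages, which this sketch ignores.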

What Do You See as the Future of Proteomics?

Proteomics and mass spectrometry, in particular, will almost certainly revolutionize the way we diagnose, treat, and prevent disease in the 21st century. It will facilitate direct analysis of proteins in biological fluids, including serum, cerebrospinal fluid, urine, and exhaled breath. This, in turn, will lead to development of disease diagnostics and also identify surrogate markers that can be employed to monitor drug effectiveness in clinical trials. Proteomics, through protein-protein association studies, will eventually provide a detailed map of all protein interactions in healthy and diseased cells and thus facilitate development of drugs that selectively target disease-associated pathways while minimizing unwanted side effects. Understanding how proteins function by interacting with relevant cellular partners will also make it possible to evaluate the consequences of gene mutations on the operation of the cell. This, in turn, should accelerate the advance of gene therapy and individualized medicine in general. Differential display of proteins expressed on cellular membranes in healthy and diseased cells should provide a large number of disease-associated targets for the development of antibody-based therapeutics. Proteomics is key to understanding signal transduction pathways and cellular communication in general. Differential display of phosphorylated proteins induced by a particular drug or agonist is now feasible and should accelerate our understanding of cellular communication pathways and how to use them in the treatment of disease. Proteomics research on bacterial and viral genomes should lead to more powerful and safer vaccines in a shorter time frame than has been possible to date. Information gathered from research in proteomics of plants will undoubtedly lead to better and safer genetically engineered crops that are both pest resistant and nonhazardous to the environment. Proteomics is also a key to the development of edible plants as a delivery system for pharmaceuticals, vitamins, and vaccines.

Acknowledgment. This work was supported by U.S. Public Health Service Grants GM 37537 and AI 33993.

References

(1) Wilkins, M. R.; Sanchez, J. C.; Gooley, A. A.; Appel, R. D.; Humphery-Smith, I.; Hochstrasser, D. F.; Williams, K. L. Biotechnol. Genet. Eng. Rev. 1996, 13, 19-50.
(2) Hunt, D. F.; Stafford, G. C., Jr.; Crow, F. W.; Russell, J. R. Anal. Chem. 1976, 48, 1160-1163.
(3) Hunt, D. F.; Crow, F. W. Anal. Chem. 1978, 50, 1781-1784.
(4) Hunt, D. F.; Shabanowitz, J.; Giordani, A. B. Anal. Chem. 1980, 52, 386-390.
(5) Hunt, D. F.; Buko, A. M.; Ballard, J. M.; Shabanowitz, J.; Giordani, A. B. Biomed. Mass Spectrom. 1981, 8, 387-408.
(6) Barber, M.; Bordoli, R. S.; Sedgwick, R. D.; Tyler, A. N. J. Chem. Soc., Chem. Commun. 1981, 325.
(7) Hunt, D. F.; Bone, W. M.; Shabanowitz, J.; Rhodes, G.; Ballard, J. M. Anal. Chem. 1981, 53, 1704-1706.
(8) Hunt, D. F.; Yates, J. R., III; Shabanowitz, J.; Winston, S.; Hauer, C. R. Proc. Natl. Acad. Sci. U.S.A. 1986, 83, 6233-6237.
(9) Hunt, D. F.; Shabanowitz, J.; Yates, J. R., III. J. Chem. Soc., Chem. Commun. 1987, 548-550.
(10) Hunt, D. F.; Griffin, P. R.; Yates, J. R., III; Shabanowitz, J.; Fox, J. W.; Beverly, L. K. In Techniques in Protein Chemistry; Hugli, T., Ed.; Academic Press: New York, 1989; pp 580-588.
(11) Elliott, J.; Stone, K.; Roberts, W.; LoPresti, M.; De Angelis, R.; Crawford, M.; Kapouch, J.; Jacobsen, E.; Williams, K.; McMurray, W. L.; Meng, C.-K.; Mann, M.; Fenn, J. In Techniques in Protein Chemistry; Hugli, T., Ed.; Academic Press: New York, 1989; pp 569-579.
(12) Heftz, S. A.; Besman, J.; Lee, T. D.; Shively, J. E.; Paxton, R. J. In Techniques in Protein Chemistry; Hugli, T., Ed.; Academic Press: New York, 1989; pp 560-568.
(13) Yuen, S. W.; Otteson, K. M.; Colburn, J. C.; Moore, W. T.; Schlabach, T. D.; Dupont, D. F.; Mattaliano, R. J. In Techniques in Protein Chemistry; Hugli, T., Ed.; Academic Press: New York, 1989; pp 589-597.
(14) Hunt, D. F.; Alexander, J. E.; McCormack, A. L.; Martino, P. A.; Michel, H.; Shabanowitz, J. In Techniques in Protein Chemistry II; Villafranca, J. J., Ed.; Academic Press: New York, 1991; pp 455-465.
(15) Shabanowitz, J.; Settlage, R. E.; Marto, J. A.; Christian, R. E.; White, F. M.; Russo, P. S.; Martin, S. E.; Hunt, D. F. In Mass Spectrometry in Biology and Medicine; Burlingame, A. L., Carr, S. A., Eds.; Humana Press: Totowa, NJ, 2000; pp 163-177.
(16) Settlage, R. E.; Hunt, D. F.; Christian, R. U.S. Patent 6,139,734.
(17) Eng, J.; McCormack, A. L.; Yates, J. R., III. J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.
(18) Ficarro, S. B.; McCleland, M. L.; Stukenberg, P. T.; Burke, D. J.; Ross, M. M.; Shabanowitz, J.; Hunt, D. F.; White, F. M. Nature Biotech. 2002, in press.
(19) Martin, S. E.; Shabanowitz, J.; Hunt, D. F.; Marto, J. A. Anal. Chem. 2000, 72, 4266-4274.
(20) Syka, J. E. P.; Bai, D. L.; Stafford, G. C., Jr.; Hornung, S.; Shabanowitz, J.; Hunt, D. F.; Marto, J. A. Proceedings of the 49th ASMS Conference on Mass Spectrometry and Allied Topics; Chicago, IL, May 2001.
(21) Ho, Y.; et al. Nature 2002, 415, 180-183.
(22) Gavin, A. C.; et al. Nature 2002, 415, 141-147.

PR020300A
