Membrane Protein Identification: N-Terminal Labeling of Nontryptic Membrane Protein Peptides Facilitates Database Searching
Maria Jansson, Kristofer Wårell, Fredrik Levander, and Peter James*
Department of Protein Technology, BMC D13, Lund University, Lund SE-221 84, Sweden
Received August 20, 2007; Accepted November 14, 2007
Membrane proteins are fairly refractory to digestion, especially by trypsin; less specific proteases, such as elastase and pepsin, are much more effective. However, database searching using nontryptic peptides is much less effective because of the lack of charge localization at the N and C termini and the absence of sequence specificity. We describe a method for N-terminal-specific labeling of peptides from nontryptic digestions of membrane proteins, which facilitates Mascot database searching and can be used for relative quantitation. The conditions for digestion have been optimized to obtain peptides of a suitable length for mass spectrometry (MS) fragmentation. We show the effectiveness of the method using a plasma membrane preparation from a leukemia cell line and demonstrate a large increase in the number of identified membrane proteins with small extra-membranar domains in comparison to previously published methods.
Keywords: Membrane proteins • pepsin • proteinase K • N-terminal labeling • database search
* To whom correspondence should be addressed: Protein Technology, BMC D13, Lund University, Lund SE-221 84, Sweden. Fax: +46-46-222-4200. E-mail: [email protected].
10.1021/pr070545t CCC: $40.75 © 2008 American Chemical Society
Journal of Proteome Research 2008, 7, 659–665. Published on Web 12/28/2007

Introduction
Approximately 25% of genes described in all three kingdoms of life (archaeal, prokaryotic, and eukaryotic) code for membrane proteins (defined as possessing two or more transmembrane domains). The predicted proportion could increase dramatically, up to 40%, if one includes membrane-anchored proteins, i.e., those that either have a single transmembrane (TM) helix or are anchored through covalent attachment to a lipid. The preponderance of this superfamily of proteins throughout evolution is not surprising when one considers that the major evolutionary jumps were the establishment of the cell as a single entity defined by an enclosing membrane and then the establishment of organelles within the cell. Compartmentalization allows for a greater degree of control over metabolism by segregating anabolic and catabolic pathways and by regulating substrate accessibility via transport across the membrane boundaries. All of these estimates are based on the prediction of α-helical membrane domains and do not include β-barrel structures, which have been found in prokaryotes and eukaryotic organelles. In eukaryotes, these proteins are predicted to account for ca. 2% of the genes in the genome.
Despite the advances in protein technology, membrane proteins still represent one of the most difficult classes of proteins to analyze. This is mainly due to the lack of tryptic digestion sites close to the membrane domains as well as to the limited solubility and the generally refractory nature of these proteins toward digestion. The first membrane protein structure to be solved was that of bacteriorhodopsin (1, 2), which has remained the most-studied model membrane protein ever since. The strategy employed was to digest the protein with cyanogen bromide and separate the large fragments by chromatography in strong organic solvents. The peptides were sequenced by Edman degradation, and overlaps were obtained by gas chromatography-mass spectrometry (GC-MS) analysis of partial acid hydrolysis of the protein. Most of the problems encountered when analyzing membrane proteins were established in these studies: the need for separation systems for water-insoluble polypeptides, the use of chemical modification to increase solubility (and to stop washout during Edman degradation), and the use of nonspecific digestion to increase sequence coverage.
Historically, Edman degradation became the method of choice with the development of automated sequencers. Even with the advent of new ionization techniques that made peptide sequencing by MS/MS possible, mass spectrometric analysis of membrane proteins has been inhibited by the reliance on trypsin as the preferred proteolytic enzyme and by the difficulty in digesting these recalcitrant proteins. Low-specificity proteases, such as elastase, digest membrane proteins very well (3), as does pepsin, which has the added advantage of being active in 30% formic acid, an excellent denaturing solvent for most membrane proteins. However, a large number of the peptides derived from these digests fragment in ways that are hard to predict because of the random location of the positive charges, which direct fragmentation. Charge-directed fragmentation can be manipulated by reagents that are specifically attached to the N-terminus or to certain amino acid side chains (4, 5) to increase the number of b- or y-ion series fragments over internal and neutral loss fragments. We present an approach that uses membrane isolation, protein separation by 1D polyacrylamide gel electrophoresis (PAGE), digestion by low-specificity proteases, and modification by charge-directing fragmentation reagents to allow for high-confidence database searching.
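The cost of giving up enzyme specificity in a database search can be made concrete with a small sketch. The Python below counts candidate peptides for a sequence under a simplified trypsin rule (cleavage after every K or R, no missed cleavages, proline exception ignored) and under a no-enzyme rule in which any peptide bond may be cleaved. The toy sequence, the 6–30 residue length window, and the function names are illustrative assumptions and are not taken from this study or from Mascot.

```python
import re


def tryptic_peptides(seq, min_len=6, max_len=30):
    """Peptides from a simplified trypsin rule: cleave after every K or R.

    No missed cleavages; the proline exception is ignored.
    """
    pieces = re.findall(r"[^KR]*[KR]|[^KR]+$", seq)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def nonspecific_peptides(seq, min_len=6, max_len=30):
    """Every contiguous subsequence in the length window (no-enzyme search)."""
    n = len(seq)
    return [
        seq[start:end]
        for start in range(n)
        for end in range(start + min_len, min(start + max_len, n) + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sequence with a long stretch lacking K and R.
    toy = "MAGKLLVVAIIGLLFAVLAGSWR"
    print("tryptic candidates:    ", len(tryptic_peptides(toy)))
    print("nonspecific candidates:", len(nonspecific_peptides(toy)))
```

The tryptic list can never be longer than the number of K and R residues plus one, whereas the nonspecific list grows with both the sequence length and the width of the length window, which is one way to see why additional constraints on the search are valuable for nontryptic digests.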
Materials and Methods
Materials and Reagents. Nicotinic acid, standard protein alcohol dehydrogenase, pepsin, 3-(N-morpholino)propanesulfonic acid (MOPS), acetonitrile, acrylamide, urea, Tris, NaOH, iodoacetamide, α-cyano-4-hydroxycinnamic acid, N-methylpiperidine, protein assay kit, and solvents for synthesis and high-performance liquid chromatography (HPLC) were purchased from Sigma-Aldrich (Stockholm, Sweden). Dithiothreitol (DTT) was purchased from Pierce (SDS, Falkenberg, Sweden). Sequencing-grade-modified trypsin and proteinase K were purchased from Promega (SDS, Falkenberg, Sweden).
Synthesis of 1-(Nicotinoyloxy)succinimide Ester. Nicotinic acid (1.23 g, 10 mmol, 1 equiv) and N-hydroxysuccinimide (1.15 g, 10 mmol, 1 equiv) were dissolved in N,N-dimethylformamide (DMF) (20 mL). 1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC) (1.92 g, 10 mmol, 1 equiv) was added to the reaction mixture, and stirring was continued at room temperature overnight. The solvent was then evaporated on a rotary evaporator, and the crude product obtained was dissolved in dichloromethane. The organic phase was extracted 3 times with a saturated solution of NaCl and dried over MgSO4, and the solvent was evaporated. The crude product was recrystallized from a mixture of diethyl ether/methanol. The white precipitate was filtered off and dried under vacuum. Yield: 1.65 g, 75%. The purity and structure were confirmed by 1H nuclear magnetic resonance (NMR) (CDCl3, 300 MHz) δ: 2.863 (s, 2 CH2), 7.40–7.50 (m, H5 (ArH)), 8.30–8.40 (dt, H4 (ArH), J = 1.86, 2.18, 8.1 Hz), 8.80–8.90 (dd, H6 (ArH), J = 1.87, 4.93 Hz), 9.27 (d, H2 (ArH), J = 2.18 Hz) ppm.
Enzymatic Digestion of Standard Protein. The protein standard used was yeast alcohol dehydrogenase (ADH).
It was dissolved in 25 mM Tris-HCl buffer at pH 8.5. Proteinase K in 50 mM Tris-HCl (pH 8) and 10 mM CaCl2 was added to the sample (1:100 protein/enzyme), and the digestion was carried out at 37 °C for 3 h before another aliquot of proteinase K was added. The digestion was continued for another 1.5 h before it was quenched by adding 35% HCl to a final pH of 2.0. For pepsin digestion, the ADH solution was adjusted to pH 2.0 by adding 35% HCl, and pepsin (1:10 protein/enzyme) in 0.04 M HCl was added. The digestion was carried out at 37 °C for 3 h before a second aliquot was added and incubated for another 1.5 h. The digestion was then quenched by adding 6 M NaOH to pH 7. The digests were then split into two aliquots: one half was stored at -20 °C, and the other half was partially dried in a Speedvac before adding 60 µL of 100 mM MOPS buffer at pH 7.
N-Terminal Modification of Standard Protein Digests. A total of 10 µL of prechilled 200 mM nicotinic acid N-hydroxysuccinimide ester (NicNHS) dissolved in acetonitrile was added to each digest sample, and the reaction was carried out for 1 h on ice. The pH was adjusted to 7 before incubation. The side-reaction products (esters formed with Ser, Thr, and Tyr) were eliminated by treating the samples with 1 µL of 600 µM hydroxylamine in 25 mM Tris buffer at pH 8.5 for 20 min at room temperature. Then, the pH was increased to 11–12 by adding 6 M NaOH, and the incubation was carried out for 20 min at room temperature.
Membrane Isolation. The leukemia cells U937 MOC (the cell line and its culture are described in Håkansson et al. (6)) were left to thaw on ice before they were lysed with 250 µL of ice-cold H2O, added to each tube containing 50 million cells, and vortexed 4 times for 60 s. Between each vortexing, the cells were chilled on ice for 1 min. The sample was centrifuged at 5000g for 30 min at 4 °C to remove unbroken cells. Then, the supernatant was centrifuged at 48000g for 40 min to pellet the membranes. The membrane pellet was resuspended and washed 2 times with ice-cold 10 mM sodium carbonate buffer at pH 11 and then, finally, 2 times with 10 mM Tris buffer at pH 7. Between each wash, the membrane was pelleted by centrifugation at 48000g for 40 min at 4 °C. After the final centrifugation, the pellet was suspended in 10 mM Tris buffer at pH 7, and the protein concentration was determined using the protein assay kit (Sigma Diagnostics).
One-Dimensional Gel Electrophoresis and In-Gel Modification of the Membrane Proteins. Five aliquots, each containing 50 µg of protein, were prepared, mixed 1:1 with sample buffer, and heated at 98 °C for 3 min. The samples were separated on a 12.5% sodium dodecyl sulfate (SDS)-PAGE gel with a 5% stacking gel at 25 °C with 25 A/gel until the bromophenol blue dye front had run off the base of the gel. The gel was stained using GelCode Blue Stain Reagent (Pierce). Each of the lanes was cut into 15 slices. The slices were destained in 50% acetonitrile and 25 mM NH4HCO3 before reduction with 10 mM DTT in 100 mM NH4HCO3 at 55 °C for 1 h. Alkylation was performed by adding 55 mM iodoacetamide in 100 mM NH4HCO3 to each sample and incubating for 45 min at room temperature in the dark. After washing the slices with 100 mM NH4HCO3 and acetonitrile, the lysine residues were succinylated by adding 100 mM succinic anhydride in 2 M urea and 200 mM sodium phosphate buffer at pH 8.5. The sample pH was adjusted to 8.5 with 6 M NaOH, and the sample was incubated for 2 h at room temperature. A second aliquot of succinic anhydride was added and incubated for an additional 2 h.
After the gel slices were washed with H2O and acetonitrile, the proteins were digested with pepsin, proteinase K, or trypsin.
Enzymatic Digestion of Gel Slices. A total of 25 µL of enzyme (1.17 µg/mL pepsin in 0.04 M HCl; 0.64 µg/mL proteinase K in 10 mM sodium carbonate buffer at pH 11) was added to the dehydrated gel slice after the acetonitrile wash, and the digestion was carried out at 37 °C for 3 h before a second aliquot of enzyme was added and the digestion was continued for another 1.5 h. The proteinase K digest must be kept in a sealed tube to prevent carbon dioxide from altering the pH of 11 and increasing the activity of the enzyme. The trypsin digestion was carried out overnight at 37 °C after adding 20 µL of 12.5 ng/µL Promega trypsin in 50 mM ammonium bicarbonate buffer to the gel slice. The peptides were extracted twice from the gel by adding 0.1 M HCl in 75% acetonitrile to the slices and incubating at room temperature for 30 min. The N-terminal modification with NicNHS was carried out as described for ADH, except that only 1 µL of NicNHS was added to each sample.
MS/MS Analysis. The standard protein samples were diluted 1:100 in 0.1% formic acid, and the pH of the gel slice digestion was decreased to […] 200), but the number of 7 TM proteins found has increased considerably. Tryptic digestion, on the other hand, gave a positive identification for fewer than 150 proteins (data not shown). The problem of digesting membrane proteins is illustrated in Figure 4, which depicts a typical 7 TM protein that has very little extra-membranar sequence accessible to proteases, especially sequence-specific ones.
The analysis of the proteins expressed in the U937 cells showed that representatives of virtually all classes of membrane proteins were present, including most classes of receptors, ion channels, and pumps. Many of the proteins identified have been shown to be present at the mRNA level (data not shown) in several of the public databases. Quite a few of these have been suggested as cancer biomarkers (e.g., breast cancer type-2 susceptibility protein, ovarian cancer marker CA125, hypoxia-induced protein HIG-1, etc.), although it appears that many occur in different cancer types, and their presence may be due to the creation of the cell line and the subsequent selective stress that is placed on the cells when growing in culture. Also surprising was the presence of taste and especially olfactory receptors, although these have also been shown to be present in monocytic cell lines.
Conclusions
The greatest gap in proteome coverage that currently exists is that of the membrane proteins. These are currently estimated to account for between 25 and 35% of the human genome. Because virtually all of the current pharmaceutical drugs on the market (estimated at around 90% if one includes disease and vaccine targets) act on membrane proteins, this is clearly a huge gap in our analytical capabilities. We have systematically investigated the various steps in membrane protein analysis and established a reproducible and highly effective protocol that allows for a high coverage of membrane proteins, especially those with small extra-membranar domains. To obtain enough peptides for an unequivocal identification, it is necessary to use nonspecific enzymes because there is a paucity of charged residues close to the membrane surface, which are needed by specific enzymes, such as trypsin or AspN/GluC. The lack of terminal charge localization makes the use of charge-directing reagents highly desirable, and we show that this increases the confidence of the score reported. The approach is reproducible and gives a wide coverage of the predicted membrane proteins expressed in a model cell line.
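As a minimal illustration of how an N-terminal label enters the fragment masses used in database searching, the sketch below computes singly charged b-ion m/z values for an unlabeled and a nicotinoylated peptide. The residue masses are standard monoisotopic values; the shift of 105.021464 Da assumes that nicotinoylation adds C6H3NO to the N-terminal amine, and the peptide sequence and function name are hypothetical. The sketch only shows the mass bookkeeping and does not model fragmentation efficiency.

```python
PROTON = 1.007276        # mass of a proton, Da
NICOTINOYL = 105.021464  # C6H3NO on the N-terminal amine (assumed shift), Da

# Standard monoisotopic residue masses, Da.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def b_ions(peptide, nterm_shift=0.0):
    """Singly charged b-ion m/z values (b1 to b(n-1)) for a peptide."""
    running = PROTON + nterm_shift
    series = []
    for residue in peptide[:-1]:
        running += RESIDUE[residue]
        series.append(running)
    return series


if __name__ == "__main__":
    peptide = "LVAGIF"  # hypothetical nontryptic peptide
    for plain, labeled in zip(b_ions(peptide), b_ions(peptide, NICOTINOYL)):
        print(f"{plain:10.4f} {labeled:10.4f}")
```

Every b ion carries the N-terminus and is therefore shifted by the same label mass, whereas y ions, which contain the C-terminus, are not affected by an N-terminal modification.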
Abbreviations: Nic, nicotinic acid; NHS, N-hydroxysuccinimide ester.
Acknowledgment. This work was supported by grants from the Gothenburg Research School in Functional Genomics (M.J.), a Swegene postdoctoral program grant (F.L.), the Knut and Alice Wallenberg Foundation (P.J.), and the Swedish Strategic Research Council to CREATE Health (P.J.).
Supporting Information Available: Summary of the data analysis and protein identification comments of the proteins obtained from the U937 leukaemia plasma membrane using proteinase K (Supplementary Table 1) and tryptic (Supplementary Table 2) digestions. This material is available free of charge via the Internet at http://pubs.acs.org.
References
(1) Khorana, H. G.; Gerber, G. E.; Herlihy, W. C.; Gray, C. P.; Anderegg, R. J.; Nihei, K.; Biemann, K. Amino acid sequence of bacteriorhodopsin. Proc. Natl. Acad. Sci. U.S.A. 1979, 76 (10), 5046–5050.
(2) Ovchinnikov, Y. A.; Abdulaev, N. G.; Feigina, M. Y.; Kiselev, A. V.; Lobanov, N. A. The structural basis of the functioning of bacteriorhodopsin: An overview. FEBS Lett. 1979, 100 (2), 219–224.
(3) Wu, C. C.; MacCoss, M. J.; Howell, K. E.; Yates, J. R., III. A method for the comprehensive proteomic analysis of membrane proteins. Nat. Biotechnol. 2003, 21 (5), 532–538.
(4) Spengler, B. Peptide sequencing of charged derivatives by postsource decay MALDI mass spectrometry. Int. J. Mass Spectrom. 1997, 169/170, 127–140.
(5) Munchbach, M. Quantitation and facilitated de novo sequencing of proteins by isotopic N-terminal labeling of peptides with a fragmentation-directing moiety. Anal. Chem. 2000, 72 (17), 4047–4057.
(6) Håkansson, P.; Lassen, C.; Olofsson, T.; Baldetorp, B.; Karlsson, A.; Gullberg, U.; Fioretos, T. Establishment and phenotypic characterization of human U937 cells with inducible P210 BCR/ABL expression reveals upregulation of CEACAM1 (CD66a). Leukemia 2004, 18, 538–547.
(7) Kyte, J.; Doolittle, R. F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 1982, 157 (1), 105–132.
(8) Pisareva, T. Proteomics of Synechocystis sp. PCC 6803. Identification of novel integral membrane proteins. FEBS J. 2007, 274, 791–804.
(9) Fischer, F.; Poetsch, A. Protein cleavage strategies for an improved analysis of the membrane proteome. Proteome Sci. 2006, 4, 2.
(10) Lu, X.; Zhu, H. Tube-gel digestion: A novel proteomic approach for high throughput analysis of membrane proteins. Mol. Cell. Proteomics 2005, 4 (12), 1948–1958.
(11) Zhong, H.; Marcus, S. L.; Li, L. Microwave-assisted acid hydrolysis of proteins combined with liquid chromatography MALDI MS/MS for protein identification. J. Am. Soc. Mass Spectrom. 2005, 16 (4), 471–481.
(12) Quach, T. T.; Li, N.; Richards, D. P.; Zheng, J.; Keller, B. O.; Li, L. Development and applications of in-gel CNBr/tryptic digestion combined with mass spectrometry for the analysis of membrane proteins. J. Proteome Res. 2003, 2 (5), 543–552.
(13) Hixson, K. K.; Rodriguez, N.; Camp, D. G., II; Strittmatter, E. F.; Lipton, M. S.; Smith, R. D. Evaluation of enzymatic digestion and liquid chromatography-mass spectrometry peptide mapping of the integral membrane protein bacteriorhodopsin. Electrophoresis 2002, 23 (18), 322–432.
(14) Yu, Y. Q.; Gilar, M.; Gebler, J. C. A complete peptide mapping of membrane proteins: A novel surfactant aiding the enzymatic digestion of bacteriorhodopsin. Rapid Commun. Mass Spectrom. 2004, 18 (6), 711–715.
(15) Zischka, H.; Gloeckner, C. J.; Klein, C.; Willmann, S.; Swiatek-de Lange, M.; Ueffing, M. Improved mass spectrometric identification of gel-separated hydrophobic membrane proteins after sodium dodecyl sulfate removal by ion-pair extraction. Proteomics 2004, 4 (12), 3776–3782.
(16) Kamo, M.; Tsugita, A. Specific cleavage of amino side chains of serine and threonine in peptides and proteins with S-ethyltrifluorothioacetate vapor. Eur. J. Biochem. 1998, 255 (1), 162–171.
(17) Tsugita, A.; Kamo, M.; Miyazaki, K.; Takayama, M.; Kawakami, T.; Shen, R.; Nozawa, T. Additional possible tools for identification of proteins on one- or two-dimensional electrophoresis. Electrophoresis 1998, 19 (6), 928–938.
(18) Rodriguez-Ortega, M. J. Characterization and identification of vaccine candidate proteins through analysis of the group A Streptococcus surface proteome. Nat. Biotechnol. 2006, 24 (2), 191–197.
(19) Schmidt, A.; Kellermann, J.; Lottspeich, F. A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 2005, 5 (1), 4–15.
(20) Ross, P. L.; Huang, Y. N. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 2004, 3 (12), 1154–1169.
(21) Martin, D. B.; Eng, J. K.; Nesvizhskii, A. I.; Gemmill, A.; Aebersold, R. Investigation of neutral loss during collision-induced dissociation of peptide ions. Anal. Chem. 2005, 77 (15), 4870–4882.
(22) Ji, C.; Lo, A.; Marcus, S.; Li, L. Effect of 2MEGA labelling on membrane proteome analysis using LC-ESI QTOF MS. J. Proteome Res. 2006, 5, 2567–2576.
(23) Wahlander, A.; Arrigoni, G.; Warell, K.; Levander, F.; Palmgren, R.; Maloisel, J. L.; Busson, P.; James, P. Development of reagents for differential protein quantitation by subtractive parent (precursor) ion scanning. J. Proteome Res. 2007, 6 (3), 1101–1113.
(24) Marsden, R. L.; Lee, D.; Maibaum, M.; Yeats, C.; Orengo, C. A. Comprehensive genome analysis of 203 genomes provides structural genomics with new insights into protein family space. Nucleic Acids Res. 2006, 34 (3), 1066–1080.
PR070545T
Journal of Proteome Research • Vol. 7, No. 2, 2008