Experimental Evaluation of Protein Identification by an LC/MALDI/On-Target Digestion Approach
Melkamu Getie-Kebtie, Peter Franke, Robert Aksamit, and Michail A. Alterman*
Tumor Vaccines and Biotechnology Branch, Division of Cellular and Gene Therapies, Center for Biologics Evaluation and Research, Food and Drug Administration, Building 29A, Room 2D12, 8800 Rockville Pike, Bethesda, Maryland 20892
Received December 11, 2007
Tryptic digestion of proteins continues to be a workhorse of proteomics. Traditional tryptic digestion requires several hours to generate an adequate protein digest. A number of accelerated digestion protocols have been developed in recent years. Nonetheless, a need still exists for new digestion strategies that meet the demands of proteomics for high-throughput and rapid detection and identification of proteins. We evaluated direct tryptic digestion of proteins on a MALDI target plate and the potential for integrating RP HPLC separation of proteins with on-target tryptic digestion in order to achieve rapid and effective identification of proteins in complex biological samples. To this end, we used a Tempo HPLC/MALDI target plate deposition hybrid instrument (ABI). The technique was evaluated using a number of soluble and membrane proteins and an MRC5 cell lysate. We demonstrated that direct deposition of proteins on a MALDI target plate after reverse-phase HPLC separation, followed by tryptic digestion of the proteins on the target and MALDI TOF/TOF analysis, provided substantial data (intact protein mass, peptide mass, and peptide fragment mass) that allowed rapid and unambiguous identification of proteins. The rapid protein separation and direct deposition of fractions on a MALDI target plate provided by RP HPLC, combined with off-line interfacing with MALDI MS, is a unique platform for rapid protein identification with improved sequence coverage. This simple and robust approach significantly reduces sample handling and potential sample loss in large-scale proteomics experiments. It allows peptide mass fingerprinting (PMF), MS/MS peptide fragment fingerprinting (PFF), and whole-protein MS to be combined for both protein identification and structural analysis of proteins.
Keywords: protein identification • proteomics • tryptic digest • MALDI • mass spectrometry • on-target • membrane proteins
* To whom correspondence should be addressed. E-mail: [email protected].
10.1021/pr800258k CCC: $40.75 2008 American Chemical Society
Introduction
The rapid and comprehensive separation, identification, and characterization of proteins from complex biological samples is a formidable challenge for the growing field of proteomics.1 The strategies that protein chemists employ to approach this challenge fall into two broad categories: “top-down” and “bottom-up”.2 The most widely used bottom-up approach, multidimensional protein identification technology (MudPIT), relies heavily on liquid chromatography (LC) to reduce the complexity of the peptide mixture prior to MS analysis.3,4 Typically, this technique involves digestion of the whole biological sample followed by separation of the peptide mixture by a combination of ion-exchange and reverse-phase chromatography prior to MS analysis. Despite its usefulness for wide-ranging analysis on a global scale, this strategy (also termed the ‘shotgun proteomics’ approach) suffers from an inherent limitation: an increase in peak resolution leads to a decrease
in time available for collection of MS/MS data. This in turn causes incomplete analysis of all peak components or compromised analysis of closely eluting peaks. Such results can be misleading when sample complexity leads to false positive identifications based on only a small number of matched peptides. Another limitation is an apparent “disconnect” between the intact protein and its tryptic peptides, which results in a loss of information about the intact protein, for example, molecular mass, pI, and so forth. Top-down methods rely on intact protein analysis, which starts with separation of proteins and is followed by mass spectrometric analysis performed either on the level of whole proteins or tryptic peptides.5,6 The former technique holds great promise but is still under development and is limited to Fourier transform ion cyclotron resonance mass spectrometry (FTICR) instruments. Initial proteomics efforts relied on protein separation by two-dimensional gel electrophoresis (2DE), where each protein spot is digested in-gel and then analyzed by matrix-assisted laser desorption/ionization (MALDI) or electrospray ionization (ESI)-based mass spectrometry (MS).7–9 Although 2DE offers high-resolution separations,
Journal of Proteome Research 2008, 7, 3697–3707. Published on Web 07/03/2008
research articles
Getie-Kebtie et al.
Figure 1. Scheme for protein RP HPLC separation/on-target tryptic digestion.
the technique remains labor-intensive and requires significant operator skill. Moreover, the depth of coverage is limited mainly to moderately and highly abundant soluble proteins, and the technique suffers from low throughput and is difficult to automate.10 In recent years, protein separation by two-dimensional liquid chromatography followed by in-solution digestion of the collected protein fractions has gained considerable popularity.11–15 LC-based separation of intact proteins offers many advantages over the MudPIT approach and gel-based methods. Proteins are kept intact throughout separation and collection, so that several MS-based methods of analysis can be applied. These include digestion of fractions for protein identification by peptide mass fingerprinting (PMF) and/or by MS/MS peptide fragment fingerprinting (PFF), as well as the direct analysis of intact proteins by MS. The combined result provides more accurate identification of proteins than either strategy alone. However, this approach requires subsequent steps of sample manipulation, such as fraction collection, reinjection, concentration, reduction/alkylation, and
Journal of Proteome Research • Vol. 7, No. 9, 2008
sometimes, buffer exchange before obtaining a sample ready for protease digestion. Even after digestion, the resulting peptide mixture may require several steps of sample preparation, such as desalting and enrichment, prior to mass spectrometric analysis. These steps, in addition to being time-consuming, increase the risk of sample contamination and loss. Off-line coupling of LC with MALDI instruments (LC/MALDI/MS/MS) via direct on-target tryptic peptide fraction collection (a MALDI adaptation of MudPIT) was used successfully to improve the proteome coverage obtained by LC/ESI/MS/MS.16 A logical next step in this progression of proteomic workflow improvement is to perform direct on-target protein collection with subsequent tryptic digestion. The first attempts in this direction described a limited on-probe tryptic digest of bacterial proteins.17,18 Another attempt involved direct tryptic digestion of proteins on a ProteinChip surface followed by surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry.19 This method required 2-4 h of incubation in a humid chamber for the digestion to take place. More recent reports described an in-house designed LC/MS system that integrates monolithic capillary HPLC protein separation and on-plate digestion for subsequent MALDI-MS analysis.20,21 However, the method described in the latter publications is difficult to repeat and adapt in other laboratories since it is based on in-house developed software, a combination of various instruments, and an in-house prepared monolithic column. Considering that such an integrated LC/MALDI/on-target digest workflow has the potential to become a significant step forward in proteomic technology, we performed a detailed characterization of this approach using commercially available instrumentation and columns. We analyzed a mixture of standard proteins, compared in-solution and on-target digestion of a hydrophobic membrane protein, and finally applied this approach to the identification of proteins in the MRC-5 cell line, where we were able to identify 49 proteins in one HPLC run. This type of intact protein separation is an improvement over the peptide separation-based approach in that each peptide retains its association with the protein from which it originated. This association makes the identification of proteins more reliable and the assignment of multiple locations of posttranslational modification or sequence variation to a single protein species less problematic.

Protein Identification by LC/MALDI/On-Target Digestion Approach

Table 1. Number of Peptides Matched to BSA from On-Target Tryptic Digestion of Different BSA Samples

sample type             method(a)  Sigma lot 1(c)  Sigma lot 2(d)  Pierce lot 1  Pierce lot 2  Equitech  Equitech (PF)(b)
Control (- trypsin)     PMF        10              0               0             0             0         0
Control (- trypsin)     PFF        8               0               0             0             0         0
Experiment (+ trypsin)  PMF        23              12              18            12            14        16
Experiment (+ trypsin)  PFF        10              7               9             9             6         7

(a) Peptide matching was made by peptide mass fingerprinting (PMF) and peptide fragment fingerprinting (PFF). (b) PF: protease free. (c) Lot 1: Lot No. 109H1062. (d) Lot 2: Lot No. 126K7405.

Figure 2. MALDI TOF spectra of BSA digested with trypsin on a MALDI target plate. (A) Spectrum of the nonreduced/alkylated sample and (B) spectrum of the reduced/alkylated sample (black squares, peptides with no cysteine; red circles, modified cysteine-containing peptides).

Table 2. Number of Peptides Obtained from On-Target Digestion of Membrane Proteins

protein  no. of peptides matched (PMF)  no. of peptides matched (PFF)  % SC(a)
CYP2B1   24                             9                              46
CYP1A2   15                             9                              38
CYP2E1   24                             19                             53

(a) SC (sequence coverage) calculated based on PMF data.
Experimental Section
Materials. The following types of bovine serum albumin (BSA) were purchased from the respective companies:
(a) BSA Fraction V powder, Lot No. 109H1062 (Sigma-Aldrich, St. Louis, MO);
(b) BSA Lyophilized Powder, Lot No. 126K7405 (Sigma-Aldrich, St. Louis, MO);
(c) BSA ampules, 2 mg/mL, Lot No. IG115177 (Pierce Biotechnology, Rockford, IL);
(d) BSA Standard grade Powder, Lot No. BAH63-717 (Equitech-Bio, Kerrville, TX); and
(e) BSA Protease Free Powder, Lot No. BAH65-667 (Equitech-Bio, Kerrville, TX).
Human cytochrome P450 1A2 and human cytochrome P450 2E1 were purchased from Invitrogen Corporation (Carlsbad, CA). Purification of rat cytochrome P450 2B1 was described elsewhere.22 Ferritin from horse spleen was purchased from Pierce Biotechnology (Rockford, IL). All of the following items were purchased from Sigma-Aldrich: cytochrome c from bovine heart, insulin from bovine pancreas, β-lactoglobulin (β-LG) from bovine milk, catalase from bovine liver, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) from rabbit muscle, myoglobin from equine skeletal muscle, glucose oxidase from Aspergillus niger, α-cyano-4-hydroxycinnamic acid (CHCA), acetonitrile (ACN), methanol, ammonium bicarbonate (ABC), Protein Extraction Reagent 4, Protease Inhibitor Cocktail, and tributylphosphine. Trifluoroacetic acid (TFA) was from Fluka (Milwaukee, WI). Sequence grade modified porcine trypsin was from Promega (Madison, WI).
Methods
1. Preparation of MRC5 Extract. MRC5 cells (ATCC, Cat. No. CCL-171) in a 75 cm2 flask were grown to confluence in Eagle’s Minimum Essential Medium with Earle’s BSS, 2 mM L-glutamine, 1.0 mM sodium pyruvate, 0.1 mM nonessential amino acids, 50 IU/mL penicillin, 50 IU/mL streptomycin, and 10% heat-inactivated (56 °C for 30 min) fetal bovine serum. Medium was removed from the flask and the monolayer was washed three times, each time with 10 mL of Dulbecco’s PBS without calcium and magnesium. Two milliliters of Protein Extraction Reagent 4 containing 1% Protease Inhibitor Cocktail and 5 mM tributylphosphine was applied to the monolayer and rocked for approximately 1 h. The lysate was then subjected to 20 pressure cycles at room temperature, with each cycle consisting of 20 s at 35 000 psi followed by 10 s at atmospheric pressure (Barocycler Pressure Cycling Technology system, Model NEP3229, Pressure BioSciences, Inc., West Bridgewater, MA). The lysate was then centrifuged at 10 000g for 10 min. The supernatant was separated and alkylated with 2.8% dimethyl acrylamide (DMA) for 30 min at room temperature on a rotary shaker. Residual DMA was quenched with 32.6 µL of 2 M DTT.
2. On-Target Tryptic Digestion of Selected Proteins. Individual solutions of the following proteins were prepared at a concentration of 1 mg/mL in 25 mM ammonium bicarbonate buffer (ABC): BSA from different sources, horse cytochrome c, human cytochrome P450s (recombinant CYP1A2 and CYP2E1), and rat cytochrome P450 (CYP2B1). A 0.5 µL aliquot of each protein solution was manually spotted on a 384-well MALDI plate. To investigate whether spotting trypsin before or after the protein affects digestion efficiency, half of the wells contained prespotted, immobilized trypsin, whereas in the remaining half, trypsin was spotted after the proteins were immobilized.
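For orientation, the cleavage behavior that the on-target digestion relies on can be sketched in a few lines of Python. This is a minimal illustration, not the software used in this work: it assumes the common convention that trypsin cleaves C-terminal to K or R except when the next residue is P, and the function names and the `missed_cleavages` parameter are hypothetical. The test sequence is the stretch of equine myoglobin covered by the first three peptides listed in Table 4.

```python
def tryptic_fragments(sequence):
    """Split a sequence after every K or R that is not followed by P (assumed rule)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal piece that does not end in K/R
    return fragments


def tryptic_peptides(sequence, missed_cleavages=1):
    """Return every peptide that spans at most `missed_cleavages` uncleaved sites."""
    fragments = tryptic_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


# Residues 1-31 of equine myoglobin, as listed in Table 4.
segment = "GLSDGEWQQVLNVWGKVEADIAGHGQEVLIR"
for peptide in tryptic_peptides(segment):
    print(peptide)
```

With one missed cleavage allowed, the sketch returns the two fully cleaved peptides and the single combined peptide, which are the three myoglobin entries that open Table 4.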
Furthermore, the effect of adjusting pH was evaluated by depositing a 0.5 µL aliquot of 25 mM ABC before and after the two conditions of trypsin spotting, that is, post- and prespotting of trypsin, respectively. The effect of reduction/alkylation on the effectiveness of on-target digestion was evaluated by using BSA (Equitech-Bio protease free) as a model protein. A 2 mg/mL solution of BSA in 25 mM ABC was denatured with an equal volume of ACN. The BSA was reduced by incubation with 100 mM DTT for 1 h at room temperature and then alkylated with 500 mM iodoacetamide at room temperature for 1 h. The alkylated BSA was diluted with 25 mM ABC to a final concentration of 0.25 µg/mL. Aliquots of 0.5 µL were spotted on a 384-well MALDI plate and digested with trypsin as described above. A nonreduced/alkylated BSA solution of the same concentration was used as a control. In all cases, a 0.5 µL aliquot of CHCA matrix solution (5 mg/mL) in 0.1% TFA/70% ACN was deposited prior to MS analysis.
3. Intact Protein MW Determination Following RP HPLC Separation. Individual solutions (1 mg/mL) in ACN (30%), 2-propanol (5%), and TFA (0.1%) were prepared for the following 8 proteins: bovine cytochrome c, bovine insulin, bovine β-LG, bovine catalase, rabbit GAPDH, equine myoglobin, A. niger glucose oxidase, and equine ferritin. Equal volumes of each protein solution were mixed and the volume was adjusted with the same solvent so that the final concentration of each protein was 100 µg/mL. Chromatographic separation of the proteins was achieved using a Tempo LC MALDI instrument from Applied Biosystems (Foster City, CA). The protein mixture (2 µL) was loaded onto a C4 polymeric reversed-phase (RP) column (5 µm,
Figure 3. Comparison of MALDI TOF spectra of the on-target tryptic digestion with traditional overnight digestion of CYP2E1. The CYP2E1 in-solution digest was performed as described elsewhere;36 ■, peaks assigned to CYP2E1 peptides.
300 Å, 150 µm × 50 mm) purchased from VYDAC (Alltech Associates, Inc., Deerfield, IL) and the proteins were eluted using a nonlinear gradient of two mobile phases: 0.1% TFA in 2% ACN (A) and 0.1% TFA in 98% ACN (B) at a flow rate of 3 µL/min. The gradient profile used for solvent B was as follows: 5% for 2 min; 5-95% in 12 min; 95% for 5 min; 95-50% in 2 min; 50% for 2 min; 50-5% in 2 min, and 5% for 9 min, with a total run time of 35 min. The fractions (0.6 µL each) were deposited on a 384-well MALDI target plate premixed with equal volumes of a 5 mg/mL solution of CHCA in 0.1% TFA/70% ACN. The droplets were then air-dried. The masses of the proteins contained in each spot were determined using a 4800 MALDI TOF/TOF MS from Applied Biosystems (Foster City, CA) in linear detector mode.
4. On-Target Tryptic Digestion of Proteins Following RP HPLC Separation. This experiment was also performed using the Tempo LC MALDI instrument. Figure 1 illustrates the
experimental design. Briefly, 2 µL of the mixture of 8 proteins or 2 µL of MRC5 cell extract prepared as described above was injected into a C4 polymeric RP column and the proteins were separated using a nonlinear gradient of two mobile phases as described above. The fractions (0.6 µL each) were deposited on a MALDI target plate without matrix. When the spots were dry, 0.6 µL of a trypsin solution in 25 mM ABC (50 ng/µL, pH ∼ 8.0) was deposited manually on each spot. Digestion took place at ambient temperature while the droplets dried. On average, the spots dried in about 10 min. An aliquot of 0.6 µL of CHCA in 70% ACN containing 0.1% TFA (5 mg/mL) was then deposited over each spot using the Tempo LC MALDI instrument and air-dried. The spots were then analyzed using the 4800 MALDI TOF/TOF instrument.
5. MALDI MS Analysis. MALDI MS analysis was used to determine intact protein masses and identify the proteins
employing PMF and peptide fragment fingerprinting (PFF) approaches. The analysis was performed using a 4800 MALDI TOF/TOF Analyzer (Applied Biosystems, Foster City, CA), where reflector and linear laser shots were set at 1000 and 2000, respectively. Intact protein mass spectra were acquired from m/z 5000 to 80 000. Peptide masses were acquired over a range of m/z 900 to 4000. A signal-to-noise threshold of 100 was used for MS peak selection and a threshold of 50 was used for MS/MS peak selection. The MS/MS peak filtering was set between 50 Da (lowest m/z) and precursor mass minus 50 Da (highest m/z). A fragmentation voltage of 2 kV was used throughout the MS/MS experiments. For on-target digested samples, tryptic autolysis peptides were used as internal standards.
6. Data Analysis. The resulting data were processed and interpreted using GPS Explorer version 3.6 (Applied Biosystems, Foster City, CA), which uses the Mascot database search engine for PMF and PFF identification for the MS and MS/MS data, respectively. Searches were performed against the Swiss-Prot database considering up to one missed tryptic cleavage, a monoisotopic peptide mass tolerance of 50 ppm, and a fragment ion mass tolerance of 0.3 Da. Propionamide or carbamidomethyl modifications of cysteine were considered as appropriate.

Figure 4. MALDI TOF mass spectrum of CYP 2E1 in linear detector mode. The peak at 54.497 kDa corresponds to the m/z of CYP 2E1. The theoretical mass of the protein according to the manufacturer’s specifications is 54.586 kDa.

Table 3. Summary of Proteins and Their Respective Number of Peptides Identified from RP HPLC Separation/On-Target Tryptic Digestion of the Mixture of 8 Proteins

                                                 protein MW (Da)               no. of peptides matched   % seq. coverage
label  protein name     protein ID (Swiss-Prot)  observed      theoretical(a)  PFF    PMF                PFF    PMF
A      Insulin          P01317                   5644          5733            -      -                  -      -
B      Cyt C            P62894                   12142         12327           8      10                 54     63
C      β-LG             P02754                   18230         18367           3      9                  33     56
D      GAPDH            P46406                   35207         35666           5      9                  21     36
E      Catalase         P00432                   57289         57550           4      19                 11     49
F      Myoglobin        P68082                   16859         16941           7      12                 63     82
G      Glucose Oxidase  P13006                   Not detected  65597           4      11                 13     36
H      Ferritin LC      P02791                   19890         19818           4      6                  21     34

(a) Theoretical MW in accordance with the manufacturer’s specifications.
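The solvent B program and fraction deposition described in Methods 3 and 4 can be tabulated with a short Python sketch. It is illustrative only and rests on two assumptions that the text does not state: that each ramp is linear within its segment, and that the column effluent is spotted continuously without splitting. The names `GRADIENT_SEGMENTS`, `percent_b`, `listed_program_length`, and `seconds_per_fraction` are hypothetical. Note that the segments as listed add up to 34 min, whereas the text gives a total run time of 35 min; the sketch makes no attempt to account for the difference.

```python
# (start %B, end %B, duration in min), transcribed from Method 3.
GRADIENT_SEGMENTS = [
    (5, 5, 2),
    (5, 95, 12),
    (95, 95, 5),
    (95, 50, 2),
    (50, 50, 2),
    (50, 5, 2),
    (5, 5, 9),
]
FLOW_RATE_UL_PER_MIN = 3.0   # flow rate given in Method 3
FRACTION_VOLUME_UL = 0.6     # volume deposited per spot


def percent_b(time_min):
    """Solvent B percentage at a given time, assuming linear ramps (assumption)."""
    elapsed = 0.0
    for start, end, duration in GRADIENT_SEGMENTS:
        if time_min <= elapsed + duration:
            return start + (end - start) * (time_min - elapsed) / duration
        elapsed += duration
    raise ValueError("time is beyond the end of the listed program")


def listed_program_length():
    """Sum of the segment durations exactly as listed in the text."""
    return sum(duration for _, _, duration in GRADIENT_SEGMENTS)


def seconds_per_fraction():
    """Collection time per spot if the effluent is deposited undiluted (assumption)."""
    return FRACTION_VOLUME_UL / FLOW_RATE_UL_PER_MIN * 60.0


print(listed_program_length(), "min listed")
print(percent_b(8.0), "% B at 8 min")
print(round(seconds_per_fraction(), 1), "s per fraction")
```

Under these assumptions each 0.6 µL fraction corresponds to about 12 s (0.2 min) of elution, which agrees with the 0.2 min spacing of the retention times reported in Table 5.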
Results and Discussion Tryptic digestion of proteins continues to be a workhorse of proteomics. Traditional tryptic digestion requires several hours to generate adequate protein digest. Decreased digestion time without loss of downstream protein identification information for the generation of same-day proteomic data is highly desirable. Several factors influence the results of proteolysis, including time, temperature, denaturant, protease concentration, and buffer.23–26 The duration of the digestion can be reduced by increasing the substrate or trypsin concentration. However, the amount of substrate is usually limited and increasing the trypsin concentration leads to enhanced autolysis products. A number of accelerated digestion protocols have been developed in recent years.27,28 Nevertheless, there is a need for new digestion strategies that meet the demands of proteomics for high-throughput and rapid detection and identification of proteins. In this research, we performed a detailed evaluation of direct tryptic digestion of proteins on a MALDI target plate and the potential for integrating RP HPLC separation of protein with on-target tryptic digestion in order to achieve rapid and
Figure 5. RP HPLC chromatogram of the protein mixture at 214 nm (bottom) and whole protein mass spectrum (top): A (insulin), B (Cyt C), C (β-LG), D (GAPDH), E (catalase), F (myoglobin), G (glucose oxidase), and H (ferritin LC); insulin [m/z 5644], β-LG [m/z 9066, 18230], myoglobin [m/z 8385, 16859, 33832 (dimer)], GAPDH [m/z 35207].
effective identification of proteins in complex biological samples. Our experimental design is presented in Figure 1. We first explored the potential of an on-target tryptic digest using BSA Fraction V powder from Sigma (Lot No. 109H1062), which had been stored in a refrigerator for more than a year. The results obtained are shown in Table 1. Surprisingly, we observed 10 and 8 tryptic peptides of BSA by PFF with and without trypsin incubation, respectively. Following this observation, a sample of BSA from Pierce Biotechnology that had been stored in a refrigerator for about the same period of time was analyzed in the same way. This time, no tryptic peptides were detected in the control sample, whereas the test sample yielded 9 tryptic peptides of BSA. Puzzled by this observation, we purchased new BSA samples from Sigma, Pierce, and Equitech-Bio. In addition to the standard BSA, we also included a protease-free BSA from Equitech-Bio in this experiment. As shown in Table 1, no tryptic peptides were observed in the control samples of the newly purchased BSA from any source, while variable numbers of peptides were identified from all of the test samples. Because BSA is used widely as a control
protein in proteomics experiments, our experience with BSA lot 109H1062 demonstrates that, to avoid reporting false positive results, it is imperative to include a negative control (a sample with no trypsin) in experiments intended to show tryptic activity. The impact of spotting trypsin before or after protein deposition on the MALDI target plate was investigated using BSA (Pierce) and horse cytochrome c. Each protein solution (0.5 µL) was manually spotted on a 384-well MALDI plate. Half of the spots contained prespotted dried trypsin. To the remaining wells, trypsin was added after the proteins had dried onto the plate. In either case, matrix was deposited on the spots prior to MS analysis. The results of this experiment clearly indicated that the order of deposition did not significantly affect the extent of trypsin digestion (data not shown). During fractionation and in-solution digestion of proteins, it is common practice to add buffer to the protein samples in order to obtain optimal digestion conditions.23,29 In our experimental settings, adjustment of pH by depositing 25 mM ammonium bicarbonate (ABC) buffer before and after the
Table 4. Proteins Identified from the Spot at RT of 7.6 min and List of Their Respective Peptides

peptide sequence(a)               seq. start  seq. end  calc. mass  obs. mass  error (ppm)

Myoglobin (% seq. cov.: 82)
*GLSDGEWQQVLNVWGK                 1           16        1815.90     1815.89    -9
GLSDGEWQQVLNVWGKVEADIAGHGQEVLIR   1           31        3403.74     3403.87    38
*VEADIAGHGQEVLIR                  17          31        1606.85     1606.87    7
*LFTGHPETLEK                      32          42        1271.66     1271.66    0
*HGTVVLTALGGILK                   64          77        1378.84     1378.84    -3
HGTVVLTALGGILKK                   64          78        1506.94     1506.93    -6
KGHHEAELKPLAQSHATK                79          96        1982.06     1982.06    0
*GHHEAELKPLAQSHATK                80          96        1853.96     1853.95    -6
*YLEFISDAIIHVLHSK                 103         118       1885.02     1885.01    -4
HPGDFGADAQGAMTK                   119         133       1502.67     1502.74    47
ALELFRNDIAAK                      134         145       1360.76     1360.79    23
*YKELGFQG                         146         153       941.47      941.48     8

Catalase (% seq. cov.: 37)
AAQKPDVLTTGGGNPVGDKLNSLTVGPR      19          46        2762.48     2762.54    21
LNSLTVGPR                         38          46        956.55      956.56     13
*GAGAFGYFEVTHDITR                 77          92        1740.83     1740.84    2
FSTVAGESGSADTVRDPR                112         129       1851.88     1851.88    1
FYTEDGNWDLVGNNTPIFFIR             135         155       2518.20     2518.24    14
DALLFPSFIHSQK                     156         168       1502.80     1502.79    -9
LAHEDPDYGLR                       252         262       1285.62     1285.62    2
*LFAYPDTHR                        354         362       1119.56     1119.57    6
ARVANYQR                          380         387       977.53      977.55     24
THFSGDVQR                         422         430       1046.50     1046.51    12
FNSANDDNVTQVR                     431         443       1479.68     1479.69    2
*NFSDVHPEYGSR                     480         491       1407.63     1407.63    1
*NFSDVHPEYGSRIQALLDK              480         498       2189.10     2189.09    -6

β-LG (% seq. cov.: 56)
*VAGTWYSLAMAASDISLLDAQSAPLR       31          56        2707.38     2707.43    19
*VYVEELKPTPEGDLEILLQK             57          76        2313.26     2313.27    4
TKIPAVFK                          92          99        903.57      903.57     3
IDALNENK                          100         107       916.47      916.48     4
VLVLDTDYK                         108         116       1065.58     1065.58    -6
VLVLDTDYKK                        108         117       1193.68     1193.68    -1
TPEVDDEALEK                       141         151       1245.58     1245.59    5
*TPEVDDEALEKFDK                   141         154       1635.77     1635.76    -7
LSFNPTQLEEQCHI                    165         178       1658.78     1658.77    -11

GAPDH (% seq. cov.: 29)
*VKVGVNGFGR                       1           10        1032.59     1032.60    8
VGVNGFGRIGR                       3           13        1131.64     1131.61    -25
*AITIFQERDPANIK                   70          83        1615.88     1615.88    -1
*VIHDHFGIVEGLMTTVHAITATQK         160         183       2618.38     2618.42    16
GAAQNIIPASTGAAK                   198         212       1369.74     1369.74    -6
*VPTPNVSVVDLTCR                   232         245       1499.79     1499.78    -8
*LISWYDNEFGYSNR                   307         320       1763.80     1763.81    3

(a) Asterisk (*) represents peptides detected both by PMF and PFF.
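The error column of Table 4 expresses the difference between observed and calculated peptide masses in parts per million, and Method 6 applies a 50 ppm peptide mass tolerance in the database search. A minimal Python sketch of that arithmetic follows; the function names are hypothetical, and the check against the tolerance is an illustration rather than a description of how Mascot or GPS Explorer applies it. Because the tabulated masses are rounded to two decimals, values recomputed from them can differ from the tabulated errors by a few ppm.

```python
def ppm_error(observed_mass, calculated_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - calculated_mass) / calculated_mass * 1e6


def within_tolerance(observed_mass, calculated_mass, tolerance_ppm=50.0):
    """True if the absolute ppm error does not exceed the tolerance (50 ppm in Method 6)."""
    return abs(ppm_error(observed_mass, calculated_mass)) <= tolerance_ppm


# (peptide, calc. mass, obs. mass) for three myoglobin peptides from Table 4.
rows = [
    ("GLSDGEWQQVLNVWGKVEADIAGHGQEVLIR", 3403.74, 3403.87),
    ("HPGDFGADAQGAMTK", 1502.67, 1502.74),
    ("LFTGHPETLEK", 1271.66, 1271.66),
]
for peptide, calc, obs in rows:
    print(peptide, round(ppm_error(obs, calc)), within_tolerance(obs, calc))
```

For these three rows the recomputed errors are 38, 47, and 0 ppm, matching the tabulated values, and all three fall inside the 50 ppm window.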
deposition of trypsin did not improve the effectiveness of the digest, indicating that the buffer can be omitted without loss of trypsin activity. This is an important observation because buffer salts are known to interfere with matrix crystallization and with MS analysis through ion-suppression effects.30,31 Next, we explored the effect of reduction and alkylation on the effectiveness of the on-target digest. We compared on-target digests of the BSA (Protease Free from Equitech-Bio) solution with and without prior reduction/alkylation. The data obtained are shown in Figure 2. The reduction/alkylation significantly improved the protein digest quality, as demonstrated by the increase in the number of peptides matched from 16 for the nonreduced/alkylated sample to 47 for the reduced/alkylated sample. The sequence coverage also increased from 23% to 74%. Membrane proteins represent at least one-third of the proteins encoded by the human genome and more than two-thirds of the known protein targets for drugs.32,33 Therefore, high-throughput approaches to characterize membrane proteins are of significant interest for drug discovery. Yet, representation of membrane proteins in proteomics data sets is
generally low due to a number of difficulties associated with their analysis. In particular, 2D electrophoresis is a notoriously poor proteomic approach for membrane proteins.34 Shotgun proteomics provides better results, though it also has a number of problems associated with proteomic analysis of membrane proteins.35 The hepatic cytochrome P450 isozymes are largely responsible for phase I drug metabolism, which leads to the formation of metabolites that are more hydrophilic than the parent compounds. The analysis of P450 isozymes presents considerable difficulties since CYPs are large, highly hydrophobic membrane proteins. We used a number of cytochrome P450 isozymes (CYP) to evaluate the applicability of on-target tryptic digestion for the proteomic analysis of membrane proteins. The results of these experiments clearly demonstrated that the membrane proteins can be digested effectively with trypsin by the on-target digestion approach even in the absence of any denaturing, detergent-based environment (Table 2 and Figure 3; Supporting Information). The percentage sequence coverage ranged between 38 and 53 for the three membrane proteins,
Table 5. Protein Groups Identified from Confluent MRC5 Cell Culture(a)

RT (min)   protein group                                           acc. no.  no. pep.  % SC
7.2        Glutathione S-transferase Mu 1                          P09488    6         31
7.2-7.4    Heterogeneous nuclear ribonucleoproteins                P22626    4         19
7.2-7.6    Methyltransferase-like protein 2                        Q96IZ6    7         23
7.2-7.8    Macrophage inflammatory protein-2-alpha precursor       P19875    3         42
7.2-7.8    Heat-shock protein                                      P04792    9         45
7.4-7.8    Profilin-1                                              P07737    4         39
7.4        Beta-defensin 108 precursor                             Q8NET1    4         64
7.6        Cofilin                                                 P23528    2         15
7.6        Calreticulin precursor                                  P27797    5         19
7.6        Putative protein 15E1.2                                 O43716    4         38
7.6        Small proline-rich protein 4                            Q96PI1    5         45
7.8        Macrophage migration inhibitory factor                  P14174    2         17
7.8        Calumenin precursor                                     O43852    3         9
7.8        Cytochrome P450 26B1                                    Q9NR63    2         6
7.8        Serine/threonine-protein kinase 24                      Q9Y6E0    4         7
7.8-8.0    Galectin-1                                              P09382    4         34
8.0        Nucleophosmin                                           P06748    2         11
8.0        40S ribosomal protein                                   P62269    5         26
8.0        SH3 domain-binding glutamic acid-rich-like protein 3    Q9H299    2         20
8.0        Ras GTPase-activating protein 1                         P20936    10        12
8.0        Retinol dehydrogenase 12                                Q96NR8    7         18
8.0        Spindlin-like protein 3                                 Q99865    5         24
8.0        Hexokinase D                                            P35557    7         18
8.0        Suppressor of cytokine signaling 1                      O15524    5         29
8.0-8.2    Tropomyosin                                             P09493    3         10
8.2        Fructose-bisphosphate aldolase A                        P04075    6         25
8.2        Collagen-binding protein 2 precursor                    P50454    11        49
8.2        Cytochrome P450 4F2                                     P78329    7         16
8.2        Kininogen precursor                                     P01042    8         16
8.2        Carbonyl reductase [NADPH] 1                            P16152    5         29
8.0-8.6    Histone                                                 O60814    3         26
8.2-8.8    GAPDH                                                   P04406    4         21
8.2-8.8    Vimentin                                                P08670    13        35
8.6        Alpha-enolase                                           P06733    7         20
8.6        Myosin light polypeptide 6                              P60660    5         34
8.6        Peripherin                                              P41219    4         8
8.6        StAR-related lipid transfer protein 5                   Q9NSY2    6         30
8.6-8.8    Pyruvate kinase                                         P14618    9         23
8.6-9.0    Transgelin                                              Q01995    8         36
8.8        Glutathione S-transferase P                             P09211    3         22
8.8        Heat shock cognate 71 kDa protein                       P11142    13        26
8.8        Protein disulfide-isomerase precursor                   P07237    3         8
8.8        Receptor activity-modifying protein 3 precursor         O60896    3         33
8.8-9.4    Tubulin beta-2 chain                                    P07437    11        36
9.0        Annexin A2                                              P07355    9         33
9.2-10.8   Actin                                                   P63261    12        51
9.6        DNA-binding death effector domain-containing protein 2  Q8WXF8    4         19
10.0-10.4  14-3-3 protein                                          P62258    5         27
10.2-10.8  Annexin A5                                              P08758    16        64

(a) RT, retention time in C4 RP HPLC column; SC, sequence coverage.
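The % SC values reported in Tables 2, 4, and 5 are sequence coverages, that is, the percentage of residues in a protein covered by at least one matched peptide. The sketch below shows one way to compute such a value from peptide start and end positions, using the myoglobin peptides listed in Table 4. It is not the GPS Explorer calculation; the function name is hypothetical, and the protein length of 153 residues is an assumption (the last listed peptide ends at residue 153).

```python
def sequence_coverage(peptide_ranges, protein_length):
    """Return (covered residue count, % coverage) for 1-based inclusive ranges."""
    covered = set()
    for start, end in peptide_ranges:
        covered.update(range(start, end + 1))  # overlapping peptides count once
    return len(covered), 100.0 * len(covered) / protein_length


# (seq. start, seq. end) of the myoglobin peptides in Table 4.
MYOGLOBIN_PEPTIDES = [
    (1, 16), (1, 31), (17, 31), (32, 42), (64, 77), (64, 78),
    (79, 96), (80, 96), (103, 118), (119, 133), (134, 145), (146, 153),
]
MYOGLOBIN_LENGTH = 153  # assumed chain length, not stated in the text

residues, percent = sequence_coverage(MYOGLOBIN_PEPTIDES, MYOGLOBIN_LENGTH)
print(residues, round(percent))
```

With the assumed length, 126 residues are covered, which rounds to the 82% coverage given for myoglobin in Table 4.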
which equals or greatly exceeds the coverage previously obtained using MALDI TOF PMF and LC/ESI/MS/MS.36–38 Of special note is the fact that all unique CYP isozyme-specific peptides previously described for use in qualitative and quantitative targeted proteomic analysis of cytochrome(s) P450 were present in the on-target digests.36–38 Intact protein molecular masses were also detected for the three cytochromes using MALDI TOF MS in the linear detector mode. The spectrum for CYP 2E1, a recombinant 472-amino acid protein with a molecular weight of 54.576 kDa,39 is shown in Figure 4. As a next step, we examined a combination of RP HPLC followed by on-target fraction collection and tryptic digestion with a mixture of proteins (cytochrome c, insulin, β-LG,
catalase, GAPDH, glucose oxidase, myoglobin, and ferritin). Seven of the 8 proteins (cytochrome c, insulin, β-LG, catalase, GAPDH, myoglobin, and ferritin) were detected with a mass corresponding to the theoretical mass of the protein (Table 3). The RP HPLC chromatogram of the protein mixture at 214 nm is depicted in Figure 5. Insulin was the first protein eluted. The chromatographic peak labeled “A” represents the main peak of insulin. However, the intact protein MS analysis of the fractions revealed a mass corresponding to insulin in fractions from RT 5.6 min (peak A) to RT 7.8 min, indicating that trace amounts of insulin coeluted with other proteins up to RT 7.8 min. Another distinct peak was observed at a RT of 6.3 min corresponding to cytochrome c. A single chromatographic peak
research articles
Getie-Kebtie et al.
Figure 6. MALDI TOF mass spectrum of the spot at RT of 9.4 resulting from RP HPLC of MRC 5 cell proteins. Spectral peaks corresponding to peptides from actin and tubulin are marked with 9 and O, respectively.
corresponding to five proteins (insulin, β-LG, GAPDH, myoglobin, and catalase), as identified by PFF, was observed at a RT of 7.6 min (Figure 5). When intact protein MS analysis was used, it was possible to identify all proteins, except catalase, in the spot at a RT of 7.6 min. A m/z value corresponding to catalase was rather detected in the adjacent spot at a RT of 7.4 min. The identity of these proteins was further confirmed by on-target digestion of the proteins and subsequent analysis of the resulting peptides as shown in Table 4. This demonstrates the benefit of performing intact protein MS in order to characterize the proteins more efficiently before performing further analyses. The intact protein MW information, which is not available from the “shotgun” approach, provides supplementary information to the PMF and PFF identification of proteins. Coelution of more than one protein in a single spot does not pose a serious problem for reliable protein identification, as subsequent proteomic analysis can identify the proteins. In a parallel run, the same 8 proteins have been subjected to tryptic digestion after chromatographic separation and deposition on a MALDI target plate followed by subsequent protein identification by PMF and PFF. Protein RTs observed by RP-HPLC with UV absorbance at 214 nm correlated well with the PMF and PFF identification of the proteins obtained from the on-target collected and digested fractions. Although some protein overlap between different fractions was observed, the results obtained in each instance have sufficient sequence coverage for protein identification. For example, as mentioned above, five proteins were coeluted by RP HPLC at a RT of 7.6 min. On-target digestion of the proteins at this spot followed by MALDI TOF/TOF analysis of the resulting peptides enabled us to identify all of these proteins, except insulin, for which no peptide was detected. 
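As an illustration of the peptide mass matching that underlies PMF identification of coeluting proteins, the following Python sketch performs an in silico tryptic digest and compares observed peptide masses with theoretical [M+H]+ values. It is not the search software used in this work: the cleavage rule (after K or R, except before P, with no missed cleavages), the 50 ppm tolerance, the toy sequence, and all function names are assumptions made for the example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the summed residues
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def mh_mass(peptide):
    """Theoretical monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_masses(observed, peptides, tol_ppm=50.0):
    """Pair each observed mass with every peptide within tol_ppm."""
    hits = []
    for mz in observed:
        for pep in peptides:
            theo = mh_mass(pep)
            if abs(mz - theo) / theo * 1e6 <= tol_ppm:
                hits.append((mz, pep))
    return hits


if __name__ == "__main__":
    toy_protein = "AGKPLRTYKMM"  # hypothetical sequence, not from this study
    peptides = tryptic_peptides(toy_protein)
    print(peptides, [round(mh_mass(p), 4) for p in peptides])
```

A protein with few cleavage sites yields few peptides in such a digest, and therefore offers few masses to match.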
The presence of only two trypsin cleavage sites in insulin may explain why no peptide was detected from this protein. Table 4 summarizes the list of peptides identified from each protein. The sequence coverage ranged from 29% (for GAPDH) to 82% (for myoglobin). In summary, all of the proteins, except insulin, were identified by both PMF and PFF in at least two spots. At least 3 peptides were observed from all of the proteins identified by the on-target digestion approach (PFF).

In our final experiment, we assessed the applicability of the HPLC/MALDI/on-target digestion approach for identification of proteins in complex biological systems. MRC5 cell line proteins were extracted and fractionated using the Tempo LC MALDI instrument for subsequent on-target tryptic digestion. Table 5 summarizes the proteins identified in this experiment. The list includes those proteins for which a significant hit was observed by PMF and at least one peptide with positive PFF was identified. A total of 49 proteins were identified successfully using RP HPLC separation followed by on-target tryptic digestion and MALDI-TOF/TOF analysis. Almost all proteins eluted between 7 and 10 min in a total of 15 fractions. In the majority of cases, the same protein was observed in at least two consecutive spots. The data in Table 5 represent the results from the spots where the majority of peptides were observed. While an effort could be made to minimize the number of proteins in a single spot by extending the dwell time of the depositor in the spots (increasing the spot size), the identification of the same protein in more than one spot increases confidence that the identification is a true positive.

Figure 6 shows the MS spectrum of the spot at a RT of 9.4 min. The ion masses were matched to two proteins: actin and tubulin. The peptides that originated from the two proteins are indicated in the spectrum. Most of the high-intensity peaks matched actin, indicating that actin was more abundant than tubulin in this spot. As Table 5 reveals, tubulin eluted between RTs of 8.8 and 9.4 min, whereas actin eluted between 9.2 and 10.8 min. Therefore, the spot at a RT of 9.4 min was a transition from a tubulin-abundant eluate to an actin-abundant eluate.

Conclusion.
The Tempo LC MALDI instrument is a hybrid of a MALDI target fraction collector and an HPLC instrument. The spotting device automatically deposits samples onto MALDI target plates ready for analysis by a mass spectrometer. This automated device ensures highly precise positioning of liquid deposition into a very small area of the spot surface, providing a means to enhance protein concentration for highly efficient digestion. This improves sensitivity for analysis of proteins present in low abundance. While most solution-phase digestions are performed over a period of hours and are subject to a high degree of autolysis of the enzymes used, the on-target digestion is performed in a few minutes (10 min on average) with minimal autolysis. In addition to the high protein concentration on the target described above, unfolding of proteins when immobilized and the susceptibility of such denatured proteins to proteolytic attack may explain the dramatic decrease in the time required for digestion. Furthermore, this approach has the advantage that the collection of the proteins, enzymatic digestion, and MS analysis are all integrated on one MALDI plate without any sample transfer, which avoids unnecessary contamination and sample loss. This contributes to the relatively high sequence coverage obtained for most of the proteins.

In summary, a method that integrates separation of proteins using RP HPLC with on-target proteolytic digestion of the proteins for subsequent MALDI MS analysis has been demonstrated with a number of soluble and membrane proteins and an MRC5 cell lysate. The rapid protein separation and direct deposition of fractions on a MALDI target plate provided by the RP HPLC, combined with off-line interfacing with the MALDI MS, is a unique platform for rapid protein identification with improved sequence coverage. This simple and robust approach also significantly reduces sample handling and potential sample loss in large-scale proteomics experiments.
This approach allows the combined information from peptide mass fingerprinting (PMF), MS/MS peptide fragment fingerprinting (PFF), and whole protein MS to increase confidence in protein identification and structural analysis of proteins.
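The sequence coverage values quoted for the on-target digests can be read, as the term is commonly defined, as the percentage of a protein's residues accounted for by at least one identified peptide. The Python sketch below illustrates that calculation under this assumption; the function name and the toy sequence are hypothetical and are not taken from the data or software used in this work.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in `protein` covered by at least one peptide.

    Overlapping peptides are counted once per residue, and a peptide that
    occurs at several positions marks every occurrence.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


if __name__ == "__main__":
    toy_protein = "AGKPLRTYKMM"  # hypothetical sequence, not from this study
    print(round(sequence_coverage(toy_protein, ["AGKPLR", "TYK"]), 1))
```

Counting residues rather than peptides keeps the value from being inflated when several identified peptides overlap the same stretch of sequence.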
Supporting Information Available: TOF-TOF data for tryptic peptides of the human CYP2E1. This material is available free of charge via the Internet at http://pubs.acs.org.

References
(1) Tyers, M.; Mann, M. Nature 2003, 422 (6928), 193–7.
(2) Ahn, N. G.; Shabb, J. B.; Old, W. M.; Resing, K. A. ACS Chem. Biol. 2007, 2 (1), 39–52.
(3) Fujii, K.; Nakano, T.; Kawamura, T.; Usui, F.; Bando, Y.; Wang, R.; Nishimura, T. J. Proteome Res. 2004, 3 (4), 712–8.
(4) Coldham, N. G.; Woodward, M. J. J. Proteome Res. 2004, 3 (3), 595–603.
(5) Bogdanov, B.; Smith, R. D. Mass Spectrom. Rev. 2005, 24 (2), 168–200.
(6) Cooper, H. J.; Håkansson, K.; Marshall, A. G. Mass Spectrom. Rev. 2005, 24 (2), 201–22.
(7) Klose, J. Humangenetik 1975, 26 (3), 231–43.
(8) O’Farrell, P. H. J. Biol. Chem. 1975, 250 (10), 4007–21.
(9) Scheele, G. A. J. Biol. Chem. 1975, 250 (14), 5375–85.
(10) Rappsilber, J.; Ryder, U.; Lamond, A. I.; Mann, M. Genome Res. 2002, 12 (8), 1231–45.
(11) Shi, Y.; Xiang, R.; Horváth, C.; Wilkins, J. A. J. Chromatogr., A 2004, 1053 (1–2, Spec. Iss.), 27–36.
(12) Kreunin, P.; Urquidi, V.; Lubman, D. M.; Goodison, S. Proteomics 2004, 4 (9), 2754–65.
(13) Zhu, K.; Miller, F. R.; Barder, T. J.; Lubman, D. M. J. Mass Spectrom. 2004, 39 (7), 770–80.
(14) Zheng, S.; Schneider, K. A.; Barder, T. J.; Lubman, D. M. BioTechniques 2003, 35 (6), 1202–12.
(15) Zhou, F.; Johnston, M. V. Electrophoresis 2005, 26 (7–8), 1383–8.
(16) Bodnar, W. M.; Blackburn, R. K.; Krise, J. M.; Moseley, M. A. J. Am. Soc. Mass Spectrom. 2003, 14 (9), 971–79.
(17) Harris, W. A.; Reilly, J. P. Anal. Chem. 2002, 74, 4410–16.
(18) Warscheid, B.; Fenselau, C. Proteomics 2004, 4 (10), 2877–92.
(19) Caputo, E.; Moharram, R.; Martin, B. M. Anal. Biochem. 2003, 321 (1), 116–24.
(20) Zheng, S.; Yoo, C.; Delmotte, N.; Miller, F. R.; Huber, C. G.; Lubman, D. M. Anal. Chem. 2006, 78 (14), 5198–5204.
(21) Yoo, C.; Zhao, J.; Pal, M.; Hersberger, K.; Huber, C. G.; Simeone, D. M.; Beer, D. G.; Lubman, D. M. Electrophoresis 2006, 27 (18), 3643–51.
(22) Alterman, M.; Chaurasia, C.; Lu, P.; Hardwick, J.; Hanzlik, R. Arch. Biochem. Biophys. 1995, 320 (2), 289–96.
(23) Lundell, N.; Schreitmuller, T. Anal. Biochem. 1999, 266 (1), 31–47.
(24) Bark, S. J.; Muster, N.; Yates, J. R., III; Siuzdak, G. J. Am. Chem. Soc. 2001, 123 (8), 1774–5.
(25) Havlis, J.; Thomas, H.; Sebela, M.; Shevchenko, A. Anal. Chem. 2003, 75 (6), 1300–6.
(26) Slysz, G. W.; Schriemer, D. C. Rapid Commun. Mass Spectrom. 2003, 17 (10), 1044–50.
(27) Havlis, J.; Thomas, H.; Šebela, M.; Shevchenko, A. Anal. Chem. 2003, 75 (6), 1300–6.
(28) Sebela, M.; Stosova, T.; Havlis, J.; Wielsch, N.; Thomas, H.; Zdrahal, Z.; Shevchenko, A. Proteomics 2006, 6 (10), 2959–63.
(29) McComb, M. E.; Perlman, D. H.; Huang, H.; Costello, C. E. Rapid Commun. Mass Spectrom. 2007, 21 (1), 44–58.
(30) Perlman, D. H.; Huang, H.; Dauly, C.; Costello, C. E.; McComb, M. E. Anal. Chem. 2007, 79 (5), 2058–66.
(31) Rajnarayanan, R. V.; Wang, K. J. Mass Spectrom. 2004, 39 (1), 79–85.
(32) Wallin, E.; von Heijne, G. Protein Sci. 1998, 7 (4), 1029–38.
(33) Hopkins, A. L.; Groom, C. R. Nat. Rev. Drug Discovery 2002, 1 (9), 727–30.
(34) Santoni, V.; Molloy, M.; Rabilloud, T. Electrophoresis 2000, 21 (6), 1054–70.
(35) Rabilloud, T. Nat. Biotechnol. 2003, 21 (5), 508–10.
(36) Alterman, M. A.; Kornilayev, B.; Duzhak, T.; Yakovlev, D. Drug Metab. Dispos. 2005, 33 (9), 1399–1407.
(37) Galeva, N.; Yakovlev, D.; Koen, Y.; Duzhak, T.; Alterman, M. Drug Metab. Dispos. 2003, 31 (4), 351–5.
(38) Nisar, S.; Lane, C. S.; Wilderspin, A. F.; Welham, K. J.; Griffiths, W. J.; Patterson, L. H. Drug Metab. Dispos. 2004, 32 (4), 382–86.
(39) Gillam, E. M.; Guo, Z.; Guengerich, F. P. Arch. Biochem. Biophys. 1994, 312 (1), 59–66.
PR800258K
Journal of Proteome Research • Vol. 7, No. 9, 2008 3707