Chem. Res. Toxicol. 2003, 16, 757-767
Comparative Identification of Prostanoid Inducible Proteins by LC-ESI-MS/MS and MALDI-TOF Mass Spectrometry
Maria D. Person, Herng-Hsiang Lo, Kelly M. Towndrow,† Zhe Jia, Terrence J. Monks, and Serrine S. Lau*
Center for Molecular and Cellular Toxicology, Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Austin, Texas 78712
Received June 21, 2002
Protein identification by MS is well-established. Mixtures of proteins from cell extracts are separated by either one- or two-dimensional gel electrophoresis, and specific bands or spots are subjected to in-gel digestion and subsequent analysis by MS. The two most common types of ionization used in MS are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). When ESI is used, the sample is typically analyzed by in-line HPLC-ESI-MS/MS with fragmentation of individual digest peptides, followed by database comparison between theoretical and experimental fragmentation patterns. MALDI-MS analysis is based on peptide mass mapping, with mass measurements of the digest peptides searched against a database of theoretical digests. We give here the results of a comparison between ESI-ion trap and MALDI-TOF (time-of-flight) analysis of 11-deoxy,16,16-dimethyl prostaglandin E2 (DDM-PGE2) inducible proteins. The individual peptides identified by the two techniques generally differed, but the resulting protein identification was the same. Slightly higher coverage of each protein was obtained by MALDI-TOF, but the MS/MS data were more definitive, requiring fewer peptides to assign a positive identification. Both methods effectively identified two proteins in the same gel band. The samples here are derived from a renal epithelial cell line (LLC-PK1) established from the New Hampshire minipig, a species poorly represented in the current database, and strategies and limitations for analyzing such species are discussed.
* To whom correspondence should be addressed. Tel: 512-471-5190. Fax: 512-471-5002. E-mail: [email protected].
† Present address: Department of Investigative Toxicology, Lilly Research Laboratories, Eli Lilly and Company, Greenfield, IN 46140.
Introduction
Changes in protein expression and/or function have long been appreciated in environmental chemical-induced toxicity and cancer. For many years, however, profiling protein targets of toxic chemicals has been a nearly impossible task. Recent developments in MS ionization methods and instrumentation now make possible the rapid, high throughput analysis and identification of proteins. MS has become the method of choice for sequencing peptides and proteins in samples with limited amounts of protein available. These analytical capabilities have driven the growth of proteomics, the study of the protein complement of the genome. The explosion in proteomics research has exposed an increasing number of biological and medical researchers to the technique of protein identification by MS (1-6).
Figure 1. Alternative strategies for protein identification using MS.
Differential protein expression between wild-type and transgenic animals, and proteins differentially expressed as a consequence of toxicant exposure, can be visualized by one- or two-dimensional gel electrophoresis of the appropriate samples. Similarly, antibody staining of gels can reveal specific proteins modified by toxicants. To identify a protein in the gel, N-terminal Edman sequencing or mass spectrometric techniques are used. Increased instrumental sensitivity and the development of suitable preparative techniques for use with mass spectrometers have increased the popularity of this approach. Currently, the most widely utilized protocol for proteins separated by SDS-PAGE uses the protease trypsin to digest the proteins in the gel; the digested peptides are then extracted and analyzed by MS. The proteins are identified by either peptide mass mapping or peptide sequencing followed by a database search (Figure 1). Peptide mass mapping utilizes a list of tryptic peptide masses derived from a spectrum to query a database of theoretical digests and identifies the unknown protein from the best match (7-9). This technique is used predominantly with MALDI-TOF¹ mass spectrometers.
10.1021/tx020049d CCC: $25.00 © 2003 American Chemical Society Published on Web 05/09/2003
The main advantage is high throughput, as spectral acquisition times are only seconds per sample. Current instrumentation, employing delayed extraction and reflectron detectors, has the mass accuracy necessary for definitive protein identification with ever-increasing database size (10). An alternative approach is that of MS/MS fragmentation, or peptide sequencing, which isolates and fragments individual tryptic peptides and then queries a database of expected fragmentation ions of tryptic peptides (11, 12). When ESI is used for ionization, fragmentation is accomplished by collision-induced dissociation (CID), where collisions with an inert gas, such as helium, cause the peptides to fragment into characteristic product ions. Alternatively, PSD fragments are detected when MALDI-TOF is used. For separation of the peptides, reversed phase HPLC can be coupled in-line to the ESI mass spectrometer. The analysis requires increased time for the HPLC run, and the MS/MS database search requires more computer power to search the many spectra generated during the run and to correlate the results. The results of an MS/MS-based search are more likely to identify a component present at relatively low levels, or with few tryptic peptides, when a mixture of proteins is present. Additionally, because sequence information is generated from MS/MS spectra, similarity-based searches can identify proteins not present in the database for the species under study. A number of database search engines are publicly available on the web, or for license, such as Protein Prospector (prospector.ucsf.edu), Mascot (www.matrixscience.com), and Prowl (prowl.rockefeller.edu) as well as proprietary software from mass spectrometer manufacturers. The search engines can be used to query different protein and translated DNA databases, which are continually updated.
The NCBI and Swiss-Prot databases are most commonly used, with Swiss-Prot providing extensive annotation and NCBI containing more sequence variants. The database sequences generally represent precursor proteins and do not include posttranslational modifications or processing. Some proteins in the database are present only as sequence fragments, as the full sequence has yet to be determined.
While both MALDI-TOF peptide mapping and HPLC-ESI-MS/MS techniques have been used successfully to identify proteins, there has been little direct comparison of the relative effectiveness of the two techniques. Femtomolar sensitivity is possible with either technique depending on instrument choice. In a number of studies, MALDI-MS has been used for initial analysis, and ESI-MS/MS has been used on samples that are not identified in the initial MALDI screen (13, 14). This strategy takes advantage of the high throughput of MALDI and the more definitive results of ESI-MS/MS analysis. When the results of an analysis of a single HPLC fraction containing tryptic digest peptides from an in-gel digest were compared by nanospray ESI and MALDI, the results were complementary, with two proteins identified by both methods (15). However, the superior sequencing ability of the nanospray ESI-QTOF instrument allowed for the identification of two additional protein components. When a set of yeast proteins separated by two-dimensional gel electrophoresis was subjected to in-gel digest and analysis by MALDI-TOF and LC-ESI-MS/MS (16), 90% of the proteins were successfully identified by MALDI and 100% by ESI. A study of yeast membrane proteins separated by one-dimensional gel electrophoresis identified more proteins by nano-LC-ESI-MS and -MS/MS than by MALDI-MS and -PSD, using slightly different sample preparation methods for the different digests (17). Because the yeast genome sequence is complete, the database search should be able to identify every protein in the sample. This is not the case for most other species, where the genome sequence is incomplete. In the case of mammals, only the rat, mouse, and human are well-represented, and the genome and protein product variations are considerably more complex.
¹ Abbreviations: ACN, acetonitrile; DDM-PGE2, 11-deoxy,16,16-dimethyl prostaglandin E2; EF, elongation factor; ESI, electrospray ionization; GRP, glucose-regulated protein; HSP, heat shock protein; MALDI, matrix-assisted laser desorption/ionization; MS/MS, tandem mass spectrometry; NCBI, National Center for Biotechnology Information; PSD, post-source decay; QTOF, quadrupole time-of-flight; TER, transitional endoplasmic reticulum; moesin, membrane-organizing extension spike protein; TOF, time-of-flight; Xcorr, raw cross-correlation score.
We here provide an example of the applicability of combined MALDI-TOF-MS and HPLC-ESI-ion trap-MS/MS analysis for solving specific questions that address the mechanism of chemical or toxicant action. We make a broad comparison of the results of protein identification of in-gel digest samples using two protocols and instruments in widespread usage: MALDI-TOF-MS and HPLC-ESI-ion trap-MS/MS. While not representing the most sophisticated instruments available, these systems are currently available with extensive automation, with spectral acquisition, and with processing methods able to generate protein identification lists with little user intervention required after sample loading. The proteins analyzed were derived from DDM-PGE2-treated LLC-PK1 cells (New Hampshire minipig renal proximal tubular cell line). DDM-PGE2 is a synthetic analogue of prostaglandin E2 and provides protection to LLC-PK1 cells against the reactive oxygen species generating toxicant 2,3,5-tris(glutathion-S-yl)hydroquinone (18).
Differential protein expression was utilized to identify proteins induced by DDM-PGE2 and thus elucidate the potential mechanism by which it provides cytoprotection (19). DDM-PGE2 selectively stimulated the synthesis of several proteins in LLC-PK1 cells, as determined by [35S]methionine labeling. These proteins, which likely contribute to the cytoprotective effects of DDM-PGE2, were identified by MS and confirmed by western blot analysis (19). The comparative mass spectral approaches described herein should have applicability to unraveling mechanisms of chemical-induced toxicities when those toxicities are associated with functional changes in the proteome. Such changes in protein expression profiles may result from (i) chemical exposure, (ii) comparisons between transgenic/knockout and wild-type littermates, (iii) effects of using dominant negative and other transfection technologies, and (iv) comparisons between normal and neoplastic or diseased tissue.
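The peptide mass mapping strategy outlined in the Introduction can be illustrated with a minimal Python sketch. It is not the scoring procedure of MS-Fit or any other search engine; the function names (tryptic_digest, mh_plus, match_masses) are hypothetical, and the sketch assumes monoisotopic masses, cleavage after K or R except before P, carbamidomethylated cysteine, and a fixed ppm tolerance applied to singly protonated peptides.

```python
# Monoisotopic residue masses in Da (standard values). Cysteine is assumed to
# carry the carbamidomethyl modification (+57.02146 Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919 + 57.02146,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the charge-carrying proton


def tryptic_digest(sequence, max_missed=2):
    """Return theoretical tryptic peptides with up to max_missed missed cleavages."""
    # Trypsin cuts C-terminal to K or R, but not when the next residue is P.
    cuts = [0]
    for i, aa in enumerate(sequence[:-1]):
        if aa in "KR" and sequence[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(sequence))
    peptides = []
    for i in range(len(cuts) - 1):
        # Joining j - i - 1 adjacent fragments models missed cleavages.
        for j in range(i + 1, min(i + 2 + max_missed, len(cuts))):
            peptides.append(sequence[cuts[i]:cuts[j]])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide (MH+)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_masses(observed, sequence, tol_ppm=75.0, max_missed=2):
    """List (observed mass, peptide, ppm error) for masses within tolerance."""
    hits = []
    for peptide in tryptic_digest(sequence, max_missed):
        theoretical = mh_plus(peptide)
        for mass in observed:
            ppm = (mass - theoretical) / theoretical * 1e6
            if abs(ppm) <= tol_ppm:
                hits.append((mass, peptide, round(ppm, 1)))
    return hits
```

Calling match_masses(mass_list, candidate_sequence) for every sequence in a database and ranking candidates by the number of hits gives the simplest form of a peptide mass mapping search. Real search engines add a statistical score on top of this count.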
Experimental Procedures
Chemicals. HPLC grade solvents were purchased from EM Science (Cincinnati, OH), acetic acid was purchased from Aldrich (Milwaukee, WI), and other reagents were purchased from Sigma (St. Louis, MO).
Protein Extraction from Cells. Details of the sample preparation are given in Towndrow et al. (19). Post confluent LLC-PK1 cells were exposed to 1 µM DDM-PGE2 or 0.04% (v:v) ethanol (vehicle control) in 35S-methionine-containing medium for 24 h. At the end of the experiment, cells were pelleted and lysed, and the supernatant was collected. 35S-Methionine-labeled proteins were separated by SDS-PAGE, stained with Coomassie Blue R, dried, and analyzed by autoradiography. Eight bands of interest were identified by increased radioactivity as compared to the control. Separate gels were prepared for analysis by MS.
Figure 2. Effect of DDM-PGE2 on the induction of specific proteins in LLC-PK1 cells. LLC-PK1 cells were exposed to 1 µM DDM-PGE2 (D) or to ethanol vehicle control (E) for 24 h in the presence of 35S-methionine. Protein lysates (1 × 10^6 CPM/lane) from 35S-labeled cells were collected and then separated via SDS-PAGE and detected by autoradiography. Eight bands were selected for protein identification by MS.
In-Gel Protein Tryptic Digestion. About 600 µg of protein lysate from DDM-PGE2-treated cells was loaded onto four lanes of a single gel. Proteins of interest were selected, and the identical bands were cut from all four lanes. Half of the sample was digested and subjected to LC-MS/MS analysis. The remaining half was stored at -80 °C in a 5% acetic acid solution and later subjected to in-gel digestion followed by MALDI-MS analysis. In-gel tryptic digestion of selected bands was based on a modification of standard protocols (20, 21). Prior to in-gel digest, individual bands were cut into 1 mm pieces and destained in 5% acetic acid, 50% methanol to remove the Coomassie Blue. Gel pieces were dehydrated with ACN, and residual ACN was evaporated in a SpeedVac (Thermo Savant, Holbrook, NY). Proteins were then reduced with 10 mM DTT in 100 mM NH4HCO3 at room temperature for 1 h. Residual DTT was removed, and cysteines were alkylated with 50 mM iodoacetamide (in 100 mM NH4HCO3) for 1 h. After the residual iodoacetamide was removed, gel pieces were subjected twice to washing (100 mM NH4HCO3 for 10 min) and dehydration (5 min in ACN).
Gels were dried for 2-3 min in a SpeedVac and rehydrated on ice with 20 ng/µL Sequencing Grade Modified Trypsin (Promega, Madison, WI; in 50 mM NH4HCO3) for 10-15 min. Excess trypsin was removed, 20 µL of 50 mM NH4HCO3 was added, and gel pieces were digested overnight at 37 °C. After they were digested, peptides were extracted twice in 75 µL of 5% formic acid/50% ACN. The samples were reduced in volume to 10-20 µL with a SpeedVac.
MALDI-TOF Analysis. Prior to MALDI-MS analysis, the in-gel digests from bands 1 and 2 (Figure 2) were desalted using a C18 0.6 µL ZipTip (Millipore, Bedford, MA), with 0.1% formic acid in water as the equilibration buffer and 0.1% formic acid in 50% ACN as the elution buffer, following the manufacturer’s protocol. MALDI-TOF spectra were taken on the delayed extraction Voyager-DE STR (Applied Biosystems, Framingham, MA) instrument equipped with a 2 m linear flight path using the reflector detector in the positive ion mode. The instrument was equipped with a nitrogen laser operating at 337 nm. Standard method parameters were used, with a 200 ns delay time. The low mass gate was set at 600, and spectra were acquired over the mass range of 700-3600 Da. The matrix used was α-cyano-4-hydroxycinnamic acid (Agilent, Palo Alto, CA), mixed 1:1 with the sample and drop dried on a gold target in a total volume of 1 µL. A mixture of Bio-Rad (Hercules, CA) CZE standards at 1 ng/µL and ACTH (Sigma) at 2.5 ng/µL was used for external calibration of the MALDI spectra, and 64 spectra were averaged per sample. The peptide mass list was derived from the monoisotopic peaks with a signal-to-noise ratio of at least two in the range of 790-3600 Da. Peptide mass lists were filtered for removal of trypsin autolysis peaks, low mass matrix-related ions, and peptide adducts of sodium (+22) and potassium (+38). The mass list for each sample was entered in the search program, MS-Fit, in the Protein Prospector suite (prospector.ucsf.edu). The SwissProt and Owl databases were searched using a 75 ppm peptide mass tolerance based on external calibration of a nearby spot, a maximum of two missed cleavages, and carbamidomethylation of the cysteines. For the database search, the protein molecular mass range was restricted to 1000-100 000 Da and the search was performed both for all species and for all mammals. For the two high molecular mass bands, a mass tolerance of 50 ppm was used with external calibration at a position adjacent to the sample spot on the MALDI target and the protein molecular mass range was not restricted. The protein represented by the highest scoring match was reported, and peptide sequences were assigned according to the best match.
For samples containing two proteins, the second protein was not the highest in score but provided a minimum of three matching peptides and similar protein molecular mass.
LC-ESI-MS/MS Analysis. A microbore HPLC system (Magic 2002, Michrom BioResources, Auburn, CA) coupled in-line with an electrospray ion trap mass spectrometer (LCQ, ThermoFinnigan, San Jose, CA) was used for the LC-ESI-MS/MS analyses. Peptides were separated with a 0.5 mm × 50 mm MAGIC MS C18 column (5 µm particle diameter, 200 Å pore size) using mobile phases A (ACN:water:acetic acid:trifluoroacetic acid, 2:98:0.1:0.02) and B (ACN:water:acetic acid:trifluoroacetic acid, 90:10:0.09:0.02). A linear elution gradient of 5-65% B over 30 min was followed by 95% B for 5 min at a flow rate of 20 µL/min. Mass spectra were accumulated over a 45 min run time triggered by the start of the HPLC run. Automated acquisition of MS and MS/MS spectra was executed by data-dependent scanning using ThermoFinnigan Xcalibur software. The settings for the LCQ were as follows: spray voltage, 3.5 kV; capillary temperature, 200 °C; capillary voltage, 46 V; tube lens offset, 55 V. The scan time settings used were 200 ms maximum injection time for all scans, with averaging of three microscans for full scan and five microscans for Zoom and MS/MS scans. The target number of ions was 1e8 for full scan, 1e7 for Zoom scan, and 2e7 for MS/MS. The full scan range for MS was 200-2000 Da. Data-dependent MS/MS acquisition was performed with a default charge state of 2, isolation width of 2 amu, normalized collision energy level at 35%, and required a minimal signal of 30 000 counts. The scan event sequence included one full scan followed by a dependent Zoom scan and MS/MS scan of the most intense ion seen in the full scan.
Sequences of individual peptides were identified using the SEQUEST algorithm, incorporated into the ThermoFinnigan BIOWORKS software, to correlate the MS/MS spectra with amino acid sequences in either the NCBI nonredundant FASTA or OWL protein database. The search was conducted for human or all species, using trypsin digest and one missed cleavage maximum. MS/MS data were searched for charge states +1, +2, and +3. The protein identified as the best hit and having the most peptide matches was reported, along with secondary matches. Peptides reported were those ranked first or second having greater than 50% of the expected fragment ions, generally with preliminary Sp scores greater than 200 and Xcorr values greater than 1.
Figure 3. MALDI-MS identification of GRP 78. The MALDI-MS spectrum of the in-gel digest of gel band 5 is shown, with the GRP 78 tryptic peptides labeled. GRP 78 peptide fragment numbering is based on tryptic cleavage of the processed human GRP 78 sequence.
Table 1. MALDI-MS Mass Lists for Each In-Gel Digest Band^a
gel band | mass MH+ | % matched^b | primary ID | secondary ID
band 1 | 1007.52 1045.56 1180.57 1225.59 1285.73 1298.72 1358.64 1400.66 1413.67 1415.69 1429.70 1489.80 1505.76 1518.80 1529.83 1533.80 1561.81 1595.83 1616.77 1923.95 2343.06 2415.10 2749.37 2893.51 | 50 | filamin 1 |
band 2 | 1449.79 1571.86 1722.80 1773.78 1777.81 1950.00 2045.04 2166.10 | 38 | myosin |
band 3 | 1172.69 1274.69 1329.71 1361.69 1402.79 1554.74 1599.82 1742.84 1770.86 1799.89 1852.93 1887.97 1951.91 2090.12 2143.02 2220.11 2256.05 2353.17 2498.13 2639.15 2817.26 3117.43 | 68 | EF-2 | TER-ATPase
band 4 | 1194.64 1236.63 1264.65 1348.67 1513.77 1589.87 1778.93 1782.96 1786.96 1808.96 1911.04 2015.01 2255.97 | 100 | HSP 90 β | HSP 90 α
band 5 | 986.51 1191.64 1435.74 1460.75 1512.69 1528.77 1566.77 1642.81 1677.81 1815.94 1887.92 1933.96 2015.97 2018.94 2083.00 2164.90 | 81 | GRP 78 |
band 6 | 1045.54 1104.59 1182.59 1233.59 1310.68 1578.91 1928.90 2081.96 | 75 | moesin |
band 7 | 1025.58 1286.81 1306.60 1314.70 1341.68 1352.66 1380.67 1404.70 1588.84 1779.89 1996.91 2432.07 2531.27 2686.99 3147.12 | 40 | EF-1-α |
band 8 | 795.48 945.56 976.45 1132.51 1171.53 1198.71 1499.68 1515.72 1673.76 1790.87 1954.01 1960.90 2214.94 2230.92 3183.44 3199.53 | 88 | β,γ-actin | α-actin
^a Those matched to primary proteins are shown in bold, and those matched to secondary proteins are shown in italics. ^b Percentage of masses identified out of those submitted.
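The peptide acceptance criteria stated for the SEQUEST results (rank first or second, more than 50% of expected fragment ions, Sp above 200, Xcorr above 1) can be written as a small filter. The following Python sketch is only an illustration: the PeptideHit record and its field names are hypothetical and do not reflect the SEQUEST or BIOWORKS output format, and the Sp and Xcorr thresholds, which the text applies "generally", are treated here as hard cutoffs for simplicity.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    """Hypothetical record for one peptide match from an MS/MS search."""
    sequence: str
    rank: int            # rank of this candidate for its spectrum (1 = best)
    ions_matched: int    # fragment ions observed in the spectrum
    ions_expected: int   # fragment ions predicted for the sequence
    sp: float            # preliminary score
    xcorr: float         # raw cross-correlation score


def passes_filter(hit, min_ion_fraction=0.5, min_sp=200.0, min_xcorr=1.0):
    """Apply the reporting criteria as strict thresholds (a simplification)."""
    if hit.ions_expected <= 0:
        return False
    ion_fraction = hit.ions_matched / hit.ions_expected
    return (
        hit.rank <= 2
        and ion_fraction > min_ion_fraction
        and hit.sp > min_sp
        and hit.xcorr > min_xcorr
    )


def reportable(hits):
    """Keep only the hits that satisfy every criterion."""
    return [hit for hit in hits if passes_filter(hit)]
```

Because the source describes the Sp and Xcorr limits as typical rather than absolute, a practical workflow would probably flag borderline hits for manual inspection of the spectrum instead of discarding them outright.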
Results and Discussion
MALDI and ESI Give Identical Protein Identification Results. Samples from a single gel were analyzed independently via MALDI-MS or HPLC-ESI-MS/MS. The eight bands were selected for in-gel digestion and identification based upon differential protein induction between DDM-PGE2-treated and untreated LLC-PK1 cells (Figure 2). Protein identification experiments were repeated several times by LC-MS/MS as described in Towndrow et al. (19). MALDI-TOF was used for peptide mass mapping, with a typical spectrum shown in Figure 3, where GRP 78 was identified from 13 tryptic peptide masses. The mass lists derived from the MALDI-TOF spectra for each sample are shown in Table 1 with the identified masses shown in bold or italics. The same samples were also analyzed by HPLC-ESI-MS/MS. Figure 4 illustrates a typical MS/MS spectrum showing the fragmentation of a single tryptic peptide from the protein filamin 1. The daughter ions most commonly seen are those produced by cleavage of the peptide amide bond, with charge retention at the N-terminal designated b ions and charge retention at the C-terminal designated y ions (22). The mass difference between two adjacent ions in a given series is the amino acid residue mass. The set of all MS/MS spectra collected over one HPLC run was submitted to the SEQUEST database search program, and the results from the best match are shown in Table 2. A summary of peptides identified for each protein is shown in Table 2. The ESI assignment is based on MS/MS fragmentation spectra whereas the MALDI assignments are based on peptide mass. In all cases where a match was made, the MALDI and ESI methods identified the same proteins, even when two proteins were present in the sample.
Figure 4. MS/MS identification of filamin 1. An MS/MS spectrum identified as the peptide VTVLFAGQHIAK in filamin 1 from gel band 1. The doubly charged parent ion has m/z of 646.6, and the singly charged b and y fragment ions are labeled. From the difference in mass for adjacent members of the y ion series y4 to y10, the amino acid sequence can be established.
MALDI and ESI Give Different Amino Acid Coverage. Proteins were identified from a minimum of three peptides for each gel band. The results were confirmed by multiple LC-ESI-MS/MS runs (19). The average number of peptides identified from their MS/MS fragmentation pattern was 7.5 per protein, giving an average amino acid coverage of 12%. A second set of samples was subjected to in-gel digest and independently analyzed by MALDI-MS peptide mapping. Here, the average number of peptides per protein is 8.7, with 17% amino acid coverage (not including gel band 7 for which there is no corresponding ESI data). There were 11 proteins identified in the eight bands. The average number of identical peptides seen in both the MALDI and the ESI analyses is 2.7. While the samples came from the same gel, the in-gel digests were performed separately, accounting for some differences in peptides produced. However, because each method gave amino acid coverage of