Histidine-Rich Peptide Selection and Quantification in Targeted Proteomics

Diya Ren, Natalia A. Penner, Benjamin E. Slentz, and Fred E. Regnier*

Department of Chemistry, Purdue University, West Lafayette, Indiana 47907

Received July 8, 2003

Agarose-based immobilized copper(II) affinity chromatography (Cu(II)-IMAC) in tandem with reversed-phase chromatography was applied to a yeast protein extract. Histidine-rich peptides were selected and, in the process, samples were substantially simplified prior to mass spectral analysis. Samples of proteins from the yeast extract at fermentation times of 2.5 and 10 h were compared quantitatively using the GIST protocol. Acylation of the N-terminus of tryptic peptides with N-acetoxysuccinamide was used to globally label peptides and quantify relative changes in protein concentration. Together with N-terminal acylation, an imidazole elution procedure allowed histidine-rich peptides to be preferentially selected by Cu(II)-IMAC. An inverse labeling strategy was applied to increase reliability in determinations of up- and down-regulation. It was found that the concentration of some histidine-rich proteins changed in excess of 4-fold during fermentation. These proteins covered a wide range of molecular weight and pI values.

Keywords: Cu(II)-IMAC • histidine-rich peptides • inverse labeling • N-terminus derivatization • yeast • comparative proteomics

Introduction

It is often the case that specific domains in a protein are enriched in a particular amino acid.1 Enrichment in proline,2 hydroxyproline,3 tryptophan,4 aspartic acid,5 glycine,6 cysteine,7-11 and histidine12-15 are commonly encountered examples. Sometimes several amino acids are enriched within the same domain, as in the case of zinc-finger proteins.16,17 The biological role of clustered amino acids varies widely. With histidine and cysteine, clustering these amino acids, either individually or together, often plays a role in metal binding, as in many transcription factors,18 transport proteins,19,20 and metalloenzymes.21 Regions rich in a particular amino acid can also serve as sites for post-translational modification.3 But in many cases, the biological significance of clustering particular amino acids is unknown.

Proteomics is based on the fact that peptide mass and sequence data obtained from mass spectra can be related to DNA and protein sequence databases.22 A major problem in this peptide-based approach to proteomics is that the number of peptides in samples often exceeds the analytical capacity of LC-MS systems. Although proteins from 2-D electrophoresis gels are often identified by the "mass fingerprint" of their tryptic peptides,23 they can also be identified using smaller numbers of peptides in an LC-MS approach.24 Protein identification based on a small number of daughter peptides is possible because peptide sequences become increasingly unique when they are composed of six or more amino acids.25 This is particularly true in the case of peptides containing multiple histidine and cysteine residues. Moreover, the fraction of tryptic peptides from a proteome digest that contain multiple histidine or cysteine residues is in the range of 3-5%.26 If it were possible to select only those peptides that contain multiple histidine or cysteine residues, then it would be possible to simultaneously reduce the complexity of proteome digests and target important families of proteins with domains enriched in these amino acids.

The prospect of selecting histidine-rich peptides that had been GIST coded for quantification27 was investigated in Saccharomyces cerevisiae used in ethanol production. Ethanol has proven to be an ideal liquid fuel for transportation. In industry, S. cerevisiae is used to convert glucose (or hexose)-based agricultural products to ethanol. As part of the commercial optimization process, it is necessary to understand regulatory changes in yeast during ethanol production, at both the cytosolic and membrane level. Although mathematical modeling of the gene network involved in fermentation gives some idea of changes to be expected at the mRNA and protein expression levels,28-31 direct measurements are always more desirable. Several groups have used two-dimensional gels32-34 to examine protein expression, but it is difficult to detect post-translationally modified (PTM) and membrane proteins, along with proteins in excess of 180 kDa and proteins with pI values lower than 4 or higher than 9.35,36

The work described in this paper was directed at methods for selecting and quantifying changes in the concentration of histidine-rich yeast proteins at different stages of fermentation using Cu(II)-IMAC followed by RPC-MS. Quantification of the proteins in yeast was achieved with the GIST strategy27 via acetylation of peptide N-termini with stable isotope-coded reagents. An inverse labeling strategy37 was applied to increase the reliability of determining regulatory changes. Proteins of yeast extracts from different fermentation times were analyzed and compared. Both the peptide mass fingerprinting (PMF) method and tandem MS analysis followed by database searches were used for protein identification.

* To whom correspondence should be addressed. Fred E. Regnier, Department of Chemistry, Purdue University, 1393 Brown Bldg, West Lafayette, IN, 47907-1393. Phone: (765) 494-9390. Fax: (765) 494-0359. E-mail: [email protected].

10.1021/pr034049q. © 2004 American Chemical Society. Journal of Proteome Research, Vol. 3, No. 1, 2004, 37-45. Published on Web 10/24/2003.
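The selection idea described in the Introduction can be illustrated in silico. The following is a minimal sketch, not the authors' procedure: it assumes the common simplified trypsin rule (cleavage after K or R unless the next residue is P), ignores missed cleavages, and uses a made-up protein sequence. The function names and the `min_len` default (taken from the "six or more amino acids" remark) are illustrative assumptions.

```python
import re
from typing import List


def tryptic_peptides(protein: str) -> List[str]:
    """Split after K or R, except when the next residue is P (simplified rule)."""
    # Zero-width split points: preceded by K/R and not followed by P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def multi_his_peptides(protein: str, min_his: int = 2, min_len: int = 6) -> List[str]:
    """Keep tryptic peptides with at least min_his histidines and min_len residues."""
    return [
        p for p in tryptic_peptides(protein)
        if p.count("H") >= min_his and len(p) >= min_len
    ]


# Hypothetical sequence, for illustration only.
toy_protein = "MKAHHTRPLKGHSAHEKHLR"
selected = multi_his_peptides(toy_protein)
```

For this toy sequence the digest yields four peptides, of which two pass the histidine and length filters. Applied to a whole proteome, the same filter gives a rough estimate of how much a multi-histidine selection would reduce sample complexity.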

Experimental Section

Materials and Reagents. Yeast extracts38 were gifts from Dr. Miroslav Sedlak (Laboratory of Renewable Resources Engineering, Potter Engineering Center, Purdue University, West Lafayette, IN). TPCK-treated trypsin (bovine), copper(II) sulfate pentahydrate, urea, dithiothreitol (DTT), iodoacetic acid (IAA), N-(2-hydroxyethyl)piperazine-N′-(2-ethanesulfonic acid) (HEPES), N-tosyl-L-lysylchloromethyl ketone (TLCK), acetic acid, imidazole, calcium chloride, and ethylenediaminetetraacetic acid (EDTA) were purchased from Sigma-Aldrich (St. Louis, MO). Trifluoroacetic acid (Sequenal Grade) was acquired from Pierce (Rockford, IL). Sequencing grade modified trypsin was purchased from Promega (Madison, WI). HPLC grade acetonitrile (ACN), sodium acetate, sodium chloride, sodium phosphate dibasic heptahydrate, sodium phosphate monobasic, and phosphoric acid were obtained from Mallinckrodt Baker, Inc. (Paris, KY). C18 columns (4.6 × 250 mm and 1 × 250 mm) were purchased from Vydac (Hesperia, CA). HiTrap Chelating HP affinity columns (0.7 × 2.5 cm) were obtained from Amersham Pharmacia Biotech (Piscataway, NJ). The low-pressure microsplitter valve was from Upchurch Scientific (Oak Harbor, WA). Deionized water was produced by a Milli-Q A10 System from Millipore (Bedford, MA).

Proteolysis of Proteins. Protein samples (3-5 mg/mL) were dissolved in 50 mM HEPES buffer (pH 7.8) containing 20 mM CaCl2 and 6 M urea. A 20× molar excess of DTT was added and the mixture was incubated for 2 h at 37 °C. The solution was then cooled to room temperature and a 40× molar excess of iodoacetic acid was added. The mixture was incubated in darkness for 2 h, after which a 20× molar excess of L-cysteine was added. This was followed by a second incubation period of 15 min. The protein solutions were then diluted 3- to 4-fold and trypsin was added (1/50 of the total amount of protein). The solution was incubated overnight at 37 °C.
Trypsin digestion was stopped by adding TLCK in a slight molar excess over that of trypsin.

Derivatization of Peptides. The N-hydroxysuccinimide (NHS) derivatives of 1H3-acetic acid and 2H3-acetic acid coding reagents were prepared according to a procedure described in the literature.39 Briefly, a 50-fold molar excess of coding reagent was added individually to experimental and control samples and the reaction was allowed to proceed for 2 h at ambient temperature and pH 7-8. After the reaction was completed, N-hydroxylamine was added in excess. The pH was adjusted to 11-12 with sodium hydroxide to hydrolyze esters and avoid mislabeling. The hydrolysis reaction was allowed to proceed for 10 min, and the pH was then adjusted back to 7-8 with glacial acetic acid.

Cu(II)-IMAC Selection. HiTrap Chelating HP affinity columns (0.7 × 2.5 cm) with iminodiacetic acid (IDA) chelating groups were used for histidine-containing peptide selection. All chromatographic steps were performed with a BioCAD 20 Micro-Analytical Workstation (Applied Biosystems, Framingham, MA). The flow rate was 1 mL/min and detection was at 280 nm. To prepare the column for selection, it was first eluted with 7 bed volumes of 50 mM EDTA in 0.5 M NaCl (pH 8.0), then washed with deionized H2O and equilibrated with 0.1 M sodium acetate (NaAc) (pH 4.0) in 0.5 M NaCl. Thereafter, 50 mM copper(II) sulfate was loaded onto the column. Excess copper was washed away with 0.1 M NaAc (pH 4.0) in 0.5 M NaCl. Histidine-containing peptides were selected using imidazole as a displacer. In this approach, the column was equilibrated with 1 mM imidazole in 20 mM HEPES buffer (pH 7.0, 0.2 M NaCl) and eluted with 50 mM imidazole in the same buffer.26

Figure 1. Inverse labeling strategy: yeast sample 1 (at fermentation time 2.5 h) and sample 2 (at fermentation time 10 h) were digested and each was divided equally into two parts; one part was derivatized with the isotopically light form of N-acetoxysuccinamide (NAS) and the other with the heavy form. Sample 1 derivatized with the light form was mixed with sample 2 derivatized with the heavy form, and the mixture was subjected to Cu(II)-IMAC selection followed by LC-MS. The inverse of this process was applied in a second experiment.

LC-MS Analysis. Peptide mixtures were separated on a Vydac C18 column (1 × 250 mm) using an Integral Micro-Analytical Workstation (Applied Biosystems, Framingham, MA) at 50 µL/min. The column was equilibrated with solvent A (0.01% TFA in deionized H2O) and peptides were eluted by increasing the concentration of solvent B (95% ACN/0.01% TFA in deionized H2O). The column was directly connected to a QSTAR workstation (Applied Biosystems, Framingham, MA) equipped with an Ion-spray source for collection of mass spectra. Spectra were obtained in the positive TOF mode at a sampling rate of one spectrum per second. A Vydac 218TP column (4.6 × 250 mm) was used to collect fractions for tandem MS analysis. The flow rate was 1 mL/min for this LC separation, with the same gradient program as used for LC-MS.

MS-MS Analysis and Database Searching. Fractions from RPC were collected and injected into a QSTAR for MS-MS analysis. The MS-MS analysis was done by flow injection at 3-5 µL/min. Samples were prepared by dissolution in a mixture of CH3OH/H2O/acetic acid (50%/49%/1%).
Peptides and their parent proteins were identified by subjecting tandem MS spectra to a MASCOT database search.40 Search parameters were set with N-terminal acetylation and cysteine carboxymethylation as fixed modifications, and with acetylation of lysine and sodiation at the C-terminus and at the carboxyl side chains of aspartic and glutamic acids as variable modifications. Enzyme specificity was set to trypsin or trypsin/chymotrypsin with up to 2 missed cleavages. MS tolerances of parent ions and product ions were set to 1.2 and 0.6, respectively. Only peptides with a score higher than the threshold value (P ≥ 0.05) were considered.

Figure 2. ESI-MS spectrum of (a) a reversed-phase chromatography fraction from the yeast extracts. (b) An amplified view of this spectrum showing both sample complexity and the number of isobaric peptides. (c) A yeast sample after acylation of primary amine groups and simplification with Cu(II)-IMAC.
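The modifications named in the search parameters determine the mass expected for each labeled peptide. The sketch below is an illustration rather than the authors' software. It assumes monoisotopic residue masses, an acetyl label on the N-terminus and on every lysine (lysine acetylation was a variable modification in the search, so treating it as complete is an assumption), a heavy label carrying three deuteriums, and carboxymethylated cysteine. The function names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
ACETYL_LIGHT = 42.010565                              # net addition of C2H2O
ACETYL_HEAVY = ACETYL_LIGHT + 3 * (2.014102 - 1.007825)  # three H replaced by D
CARBOXYMETHYL = 58.005479                             # net addition on cysteine


def labeled_mass(peptide: str, heavy: bool = False) -> float:
    """Neutral monoisotopic mass of an acetyl-labeled, carboxymethylated peptide."""
    label = ACETYL_HEAVY if heavy else ACETYL_LIGHT
    mass = sum(MONO[aa] for aa in peptide) + WATER
    mass += label * (1 + peptide.count("K"))          # N-terminus plus each lysine
    mass += CARBOXYMETHYL * peptide.count("C")
    return mass


def mz(mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a neutral mass."""
    return (mass + charge * PROTON) / charge
```

Under these assumptions, `labeled_mass("YVLEHHPR")` evaluates to about 1091.55, close to the 1091.56 listed for this peptide in Table 1, and the triply charged ion of YAGEVSHDDKHIIVDGHK falls near the m/z 716.03 reported in the text.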


Figure 3. Total ion chromatogram of multiple histidine-containing peptides from an N-acetoxysuccinamide derivatized yeast sample.

Results and Discussion

Quantification. Quantification of yeast samples at two fermentation times was achieved by GIST27 using an inverse labeling scheme37 (Figure 1). Two protein samples from yeast, one taken after 2.5 h and a second after 10 h of fermentation, were each split into two equal fractions after proteolysis. One fraction of each sample was labeled with the light form of N-acetoxysuccinamide (designated H10 and H2.5 according to the isotope used in labeling and the fermentation time), and the second was labeled with the deuterated form (designated D10 and D2.5). As a result, a total of four labeled samples from two fermentation times were obtained. Samples H2.5 (the yeast sample after 2.5 h of fermentation, labeled with the light form of N-acetoxysuccinamide) and D10 (the yeast sample after 10 h of fermentation, labeled with the heavy form) were mixed and injected onto a Cu(II)-IMAC column, and the retained, histidine-containing peptide fraction was collected and analyzed by LC-MS. The mixture of H10 and D2.5 was analyzed in the same fashion.

The method was designed so that if an increase in a peptide (protein) isotope ratio was observed in the first run, a decrease should be seen in the second run at the same retention time, as illustrated in Figure 1. Artifacts, such as a variation in proteolysis, are easily seen in this way and the corresponding ratios disregarded. The advantage of this method is the ability to differentiate between proteolysis artifacts and very large changes in concentration. If a singlet observed in the first LC-MS run is the result of massive up-regulation, the second run will have a peak at the mass of the observed singlet minus the number of acetyl labels in the peptide multiplied by the mass difference between the heavy and light forms of the labeling reagent (3 amu in the case of N-acetoxysuccinamide). Alternatively, for massive down-regulation, the mass of the peak observed in the second run will increase to the same extent.
Therefore, massive up- and down-regulation can be reliably and easily assigned to the sample from which it originated.

Selection and LC-MS Separation of Digested Yeast Proteins. In addition to selecting histidine-rich peptides, a goal of this research was to investigate the chromatographic simplification obtained during the selection process. Simplification was necessary because the sample capacity of the reversed-phase chromatography-mass spectrometry system used in these studies was inadequate to accommodate, without peak overlap in the mass spectrometer, the 10^5 to 10^6 peptides that might be expected in a typical yeast proteome. The need for sample simplification is demonstrated in Figure 2. The relatively simple mass spectrum in Figure 2a is from a single reversed-phase fraction of a yeast proteome tryptic digest. When a portion of this spectrum is amplified (Figure 2b), it is apparent that at 1% of the base peak there are ions at nearly every m/z in the spectrum. This degree of sample complexity was seen in all fractions eluted from the reversed-phase column (data not presented). Clearly, such a large number of isobaric peptides in spectra presents two very serious problems. One is that it is difficult to know how many peptides are represented in any given spectral peak. The second is that MS-MS fragmentation patterns will almost always contain fragments from multiple peptides. Although it is possible to sequence multiple peptides simultaneously, this is only possible when the number is small and they are of similar concentration. The data in Figure 2 suggest that the most abundant peptides will dominate spectra and that dynamic range will depend on the peptides of highest concentration.

Cu(II)-IMAC has been used successfully for selection of histidine-containing proteins.41 Compared with Ni(II)-IMAC selection, Cu(II)-IMAC binds histidine residues more strongly42,43 and is less affected by the presence of a free α-amino group.44 Recently, it has been found to be a valuable sample simplification technique in comparative proteomics.26 As a result, Cu(II)-IMAC was chosen in this study to specifically select histidine-containing peptides. The data below will show that acylation of the amino terminus of histidine-containing peptides makes Cu(II)-IMAC even more selective. LC-MS analysis of fractions collected from a Cu(II)-IMAC column showed that the majority of yeast peptides were eluted between 15 and 40% of solvent B (95% ACN/0.01% TFA in deionized H2O). This part of the gradient was expanded to provide better resolution.
The total ion chromatogram shows reasonable separation of selected histidine-containing peptides from the sample (Figure 3), and this separation can be applied to the study of regulatory changes. The mass spectra obtained after Cu(II)-IMAC selection (Figure 2c) are greatly simplified compared to those seen in Figure 2b.

MS-MS Analysis and Protein Identification by MASCOT. After selection and LC-MS analysis, all selected fractions from each sample were mixed and separated on a 4.6 × 250 mm Vydac C18 column. Fractions were collected, concentrated, and then redissolved in CH3OH/H2O/acetic acid (50%/49%/1%). In most cases, MS-MS spectra were acquired for the peptide labeled with the light form of N-acetoxysuccinamide. MS-MS of the heavy labeled form was used to increase the reliability of identification. An example of tandem mass spectra of the same peptide labeled with the light and heavy forms of N-acetoxysuccinamide (NAS) is shown in Figure 4a. The labeling strategy not only provides a tool for comparing peptide concentrations but also makes it possible to distinguish between b- and y-series ions in MS-MS spectra, thereby aiding determination of peptide sequence. The b-ions in the MS-MS spectra of the two peptides of a pair should show the mass difference introduced by the coding agent, whereas y-ions should appear with the same mass. It is clear that the ions at m/z 251.11/254.14 and 609.28/612.25 are b-ions, whereas those at m/z 1216.69, 873.52, 772.46, 635.40, and 498.34 are y-ions. Another important feature of a peptide that aids identification is the presence of certain amino acids in its structure.

Figure 4. (a) MS-MS of the differentially coded isoforms of peptide AHQTELTHHLLPR at m/z 797.92/799.46 (2+), respectively. (b) Multiple peptide sequences obtained from a MASCOT search. (c) The peptide AHQTELTHHLLPR (from ubiquinol-cytochrome C reductase complex 14 kDa protein, QCR7) showing the highest score with P = 0.95 in the MASCOT search.

Table 1. Proteins from Yeast that Undergo Greater than 30% Change During Fermentation(a)

| peptide sequence | mass | protein name | gene name | MW (Da) | pI | ratio(b) |
|---|---|---|---|---|---|---|
| YVLEHHPR | 1091.56 | phosphoglycerate kinase | PGK1 | 44 738 | 7.77 | 0.5 |
| HELSSLADVYINDAFGTAHR | 2257.18 | phosphoglycerate kinase | PGK1 | 44 738 | 7.77 | 0.5 |
| FRHELSSLADVYINDAFGTAHR | 2560.24 | phosphoglycerate kinase | PGK1 | 44 738 | 7.77 | 0.5 |
| PSHFDTVQLHAGQENPGDNAHR | 2468.13 | MET17 protein | MET17 | 48 671 | 6.42 | 2.3 |
| SHINVVVIGHVDSGK | 1643.86 | elongation factor 1-alpha | TEF1 | 50 032 | 9.72 | 1.5 |
| GNVFTDYSTGSILFGEPATNAQHSFFQLVHQGTK | 3781.81 | glucose-6-phosphate isomerase | PGI1 | 61 299 | 6.43 | 0.7 |
| VLFLEDDDLAHIYDGELHIHR | 2561.17 | glucosamine--fructose-6-phosphate aminotransferase [isomerizing] | GFA1 | 80 046 | 6.4 | 0.6 |
| YAGEVSHDDKHIIVDGHK | 2145.04 | glyceraldehyde 3-phosphate dehydrogenase 2 | TDH2 | 35 847 | 6.96 | 3.9 |
| HIIVDGHK | 1001.52 | glyceraldehyde 3-phosphate dehydrogenase 2 | TDH2 | 35 847 | 6.96 | 3.9 |
| ALVHHYEECAER | 1555.69 | ubiquinol-cytochrome C reductase complex 17 kDa protein | QCR6 | 17 257 | 3.81 | up |
| MNFSHGSYEYHK | 1582.68 | pyruvate kinase 1 | PYK1 | 54 544 | 7.66 | 0.7 |
| HGHLGFLPR | 1074.56 | 60S ribosomal protein L3 | RPL3 | 43 757 | 11.1 | 0.4 |
| AHENVWFSHPR | 1420.66 | 40S ribosomal protein S29-A / 40S ribosomal protein S29-B | RPS29A / RPS29B | 6661 / 6728 | 11.06 / 10.8 | 0.5 |
| YAGEVSHDDKHIIVDGK | 2007.98 | glyceraldehyde 3-phosphate dehydrogenase 3 | TDH3 | 35 746 | 6.96 | 1.7 |
| YAGEVSHDDKHIIVDGKK | 2178.08 | glyceraldehyde 3-phosphate dehydrogenase 3 | TDH3 | 35 746 | 6.96 | 1.7 |
| GTVSHDDKHIIIDGVK | 1858.92 | glyceraldehyde 3-phosphate dehydrogenase 1 | TDH1 | 35 750 | 8.59 | 2.0 |
| LAAPHHWLLDK | 1383.70 | 40S ribosomal protein S4 | RPS4A / RPS4B | 29 410 / 29 410 | 10.9 / 10.9 | 0.7 |
| DKPYFDAEHVIQVSHGWR | 2267.08 | (DL)-glycerol-3-phosphatase 2 | GPP2 | 27 814 | 6.07 | 2.0 |
| YSGVCHTDLHAWHGDWPLPTK | 2561.14 | alcohol dehydrogenase II | ADH2 | 36 732 | 6.72 | 0.6 |
| GFHIHEFGDATDGCVSAGPHFNPFKK | 2998.33 | superoxide dismutase [Cu-Zn] | SOD1 | 15 855 | 5.93 | 0.6 |
| QLLLHHTLGNGDFTVFHR | 2146.03 | pyruvate decarboxylase isozyme 1 / pyruvate decarboxylase isozyme 2 / pyruvate decarboxylase isozyme 3 / thiamine metabolism regulatory protein THI3 | PDC1 / PDC5 / PDC6 / THI3 | 61 495 / 61 912 / 61 580 / 68 366 | 6.12 / 6.41 / 6.13 / 6.33 | 0.6 |
| ATHILDFGPGGASGLGVLTHR | 2117.05 | fatty acid synthase subunit beta | FAS1 | 228 689 | 5.79 | 0.2 |
| NYPNHIGLGLFDIHSPR | 1990.99 | 5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase | MET6 | 85 859 | 6.42 | 0.6 |
| VLEHLHSTAFQNTPLSLPTR | 2302.18 | ubiquinol-cytochrome C reductase complex core protein I, mitochondrial [precursor] | QCR1 | 50 227 | 7.34 | 0.3 |
| STSGNTHLGGQDFDTNLLEHFK | 2501.15 | heat shock protein SSB1 / heat shock protein SSB2 | SSB1 / SSB2 | 66 601 / 66 594 | 5.18 / 5.24 | 0.7 |
| IEEELGDNAVFAGENFHHGDKL | 2524.15 | enolase 1 | ENO1 | 46 816 | 6.6 | 1.4 |
| GLDIPNVTHVINYDLPSDVDDYVHR | 2908.40 | probable ATP-dependent RNA helicase DED1 | DED1 | 65 552 | 7.88 | 0.8 |
| YHGDYYLVSDDFESYLATHELVDQEFHNQR | 3731.73 | glycogen phosphorylase | GPH1 | 103 274 | 5.39 | 0.6 |
| IAHELPNAYHDYLNDNDISFDGSHFTK | 3259.85 | hypothetical 98.1 kDa protein in ROM1-UPF3 intergenic region | YGR071C | 98 108 | 5.83 | 2.0 |
| IVHSETVEFEKDLPHYHTK | 2434.15 | xylulose kinase | XKS1 | 68 320 | 6.8 | 0.6 |
| DLPHYHTK | 1092.54 | xylulose kinase | XKS1 | 68 320 | 6.8 | 0.6 |
| LIDLCVGPHIPHTGR | 1726.87 | threonyl-tRNA synthetase, cytoplasmic | THS1 | 84 520 | 7.01 | 0.7 |
| LSLTGGFSHHHATDDVEDAAPETK | 2618.11 | 30 kDa heat shock protein | HSP30 | 37 044 | 4.96 | up |

(a) Data were obtained from http://genome-www.stanford.edu/. (b) Ratio of the signal at a fermentation time of 10 h to the signal at a fermentation time of 2.5 h.

One is to determine whether a tryptic peptide has a lysine or arginine at the C-terminus. Because the GIST labeling agent differentially labels lysine and arginine, it is inherent in this method that one can in most cases identify the C-terminal amino acid in a tryptic peptide. This is done by determining the mass difference between the heavy and light forms of the peptide pair, as seen in the scheme below (for trypsin digestion only):

| observed mass shift between derivatized peptide pairs (amu) | presence of Arg or Lys in peptide |
|---|---|
| 3 | Arg, no Lys |
| 6 | Arg + Lys (missed cleavage) or Lys |
| 9 | 2 Lys (missed cleavage) or Arg + 2 Lys (missed cleavage) |
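The scheme above can be expressed as a small calculation. This is a sketch under stated assumptions, not part of the published method: each acetyl label is taken to contribute a nominal 3 amu shift, one label is assumed to sit on the N-terminus and one on each lysine, and the function name `count_labels` is hypothetical.

```python
from typing import Tuple


def count_labels(mz_light: float, mz_heavy: float, charge: int,
                 label_shift: float = 3.0) -> Tuple[int, int]:
    """Return (number of labels, implied number of lysines) for a light/heavy pair.

    The m/z spacing is converted to a mass shift by multiplying by the charge,
    then divided by the nominal shift per label (assumed to be 3 amu).
    """
    mass_shift = (mz_heavy - mz_light) * charge
    n_labels = round(mass_shift / label_shift)
    n_lysines = max(n_labels - 1, 0)   # one label is on the N-terminus
    return n_labels, n_lysines


# Pair from Figure 4a: m/z 797.92 / 799.46 at 2+.
figure4_pair = count_labels(797.92, 799.46, 2)

# Hypothetical pair spaced by 2 m/z units at 3+ (a 6 amu shift).
hypothetical_pair = count_labels(873.73, 875.73, 3)
```

For the Figure 4a pair, the spacing corresponds to roughly 3 amu, that is, a single label and no lysine, which agrees with the arginine-terminated sequence AHQTELTHHLLPR. The second pair is illustrative only and shows how a 6 amu shift implies one lysine.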

For example, if a mass shift of 3 amu is observed, the peptide contains an arginine at the C-terminus. A 6 amu difference, in contrast, is probably due to the presence of lysine at the C-terminus, although it could arise from a missed cleavage in a peptide containing both lysine and arginine residues. In either case, the peptide contains lysine. Mass shifts of 9 amu or higher do occur, but they are rare and are the result of missed cleavages.

Upon further analysis, all of the sequenced peptides were found to contain multiple histidine residues (Table 1). This is most probably due to the combination of an imidazole elution buffer and N-acetoxysuccinamide derivatization. Acylation of the N-terminal amino group clearly reduces the affinity of histidine-containing peptides for the immobilized copper stationary phase, to the extent that peptides with a single histidine residue are no longer bound. This biases the selection process toward peptides with multiple histidine residues, which are more strongly bound to the Cu(II)-IMAC column. It is concluded that the use of both imidazole as a displacer and N-terminal acetylation with Cu(II)-IMAC allows exclusive selection of peptides with multiple histidine residues.

The fact that Cu(II)-IMAC selects histidine-containing peptides is even more useful. The peptide sequence obtained from a MASCOT search of the MS-MS data presented in Figure 4b has arginine at the C-terminus as well as 3 histidines in the peptide. Both the presence of arginine at the C-terminus and the presence of histidine match the expectation suggested by the isotopic and chromatographic data. Because of these data and the high probability score from the MASCOT search engine (NCBI database), it can be concluded that the proposed sequence is valid. As another example, a peptide found at m/z 797.90 was fragmented by MS-MS (Figure 4a) and the MASCOT search engine (NCBI database) was used for identification.
Multiple peptide sequences were obtained from the search, along with parameters that help in correct peptide identification (Figure 4b). The delta value, i.e., the difference between the experimentally determined molecular weight of a peptide and the theoretical value obtained from a database, was within 0.06 amu for 4 of the 5 peptide candidates found by the search engine. However, only one peptide contained multiple histidine residues. The fact that MASCOT gave this peptide the highest score (Figure 4c) means it was possible to identify this peptide without knowledge that it contained multiple histidine residues, but only through MS-MS sequence data. Using the knowledge that the peptide had multiple histidine residues allowed it to be identified by molecular weight alone. Clearly, tandem mass analysis is able to identify peptides with no additional information, but at the cost of increased mass spectral analysis and computation. By contrast, it is possible to identify peptides from molecular weight alone when it is known that they contain multiple histidine residues.

Determination of Regulatory Changes in the Yeast Proteome. Calculation of up- and down-regulation was achieved by extracting the corresponding ions from the two inverse labeling LC-MS runs and averaging the isotope ratio from both runs. Three examples of inversely labeled peptides that show time-dependent changes in relative concentration are shown in Figure 5. A few proteins were also found whose concentration did not change during the fermentation time period studied. One of these was represented by a peptide appearing at m/z 821.38 (2+) that was subsequently identified as AVFAGENFHHGDKL. Its concentration was the same in the H2.5-D10 and D2.5-H10 LC-MS runs. A peptide representative of up-regulation was seen at m/z = 716.03 (z = 3+), Figure 5b. Its concentration in the inverse experiments increased 4-fold between 2.5 and 10 h after the initiation of fermentation. This peptide was identified as YAGEVSHDDKHIIVDGHK from glyceraldehyde 3-phosphate dehydrogenase 2 (TDH2). A more extreme case of up-regulation was observed with the ion at m/z = 873.73, 3+ (Figure 5c). This peptide, from heat shock protein HSP30, was identified as having the sequence LSLTGGFSHHHATDDVEDAAPETK. In the H2.5-D10 run a singlet peak was found that was later matched to another singlet from the D2.5-H10 run. Both peaks appeared to be from a triply charged peptide at the same retention time and differed by two amu. The number of lysine residues in this peptide is in agreement with that obtained by MS-MS sequencing. This pair would have been more difficult to identify without inverse labeling. The inverse labeling method is particularly useful in identifying extreme cases of up- and down-regulation.

Only peptides that had undergone large, obvious concentration changes were sequenced and presented in Table 1. It is seen that this multidimensional chromatography approach can accommodate proteins covering a broad range of pI values and molecular weights. For example, the pI and molecular weight of Rps29B were 10.8 and 6728 Da, respectively.
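The ratio calculation described above can be sketched as follows. This is an illustration, not the authors' code. It assumes that run 1 pairs light-labeled 2.5 h sample with heavy-labeled 10 h sample and that run 2 is the inverse. The arithmetic mean, the `tolerance` consistency check, and the function name are assumptions introduced here, and the intensities in the usage example are made up.

```python
from typing import Tuple


def inverse_label_ratio(run1_light: float, run1_heavy: float,
                        run2_light: float, run2_heavy: float,
                        tolerance: float = 0.3) -> Tuple[float, bool]:
    """Average the 10 h / 2.5 h ratio over two inversely labeled LC-MS runs.

    Run 1: light = 2.5 h, heavy = 10 h.  Run 2: light = 10 h, heavy = 2.5 h.
    Returns the mean ratio and a flag that is False when the two runs disagree
    by more than `tolerance` times the mean (a possible artifact).
    """
    intensities = (run1_light, run1_heavy, run2_light, run2_heavy)
    if min(intensities) <= 0:
        raise ValueError("singlets (zero intensity) need separate handling")
    ratio_run1 = run1_heavy / run1_light   # 10 h over 2.5 h
    ratio_run2 = run2_light / run2_heavy   # 10 h over 2.5 h, labels swapped
    mean_ratio = (ratio_run1 + ratio_run2) / 2
    consistent = abs(ratio_run1 - ratio_run2) <= tolerance * mean_ratio
    return mean_ratio, consistent


# Made-up peak intensities for one peptide pair.
mean_ratio, consistent = inverse_label_ratio(100.0, 400.0, 380.0, 100.0)
```

A ratio that flips direction between the two runs, rather than inverting with the labels, fails the consistency check and would be disregarded as a labeling or proteolysis artifact. Singlets from extreme regulation are rejected by this sketch and would need to be matched across runs by mass instead.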

Conclusions

The great advantage of this method is that it allows both the selection and quantification of a very important class of peptides. Another advantage is that proteome digest complexity is reduced to the extent that it is more likely to fit within the analytical capacity of an LC-MS system. Still another advantage is that knowing that all selected peptides will have two or more histidine residues facilitates peptide identification. It is concluded that the methods described here enable the recognition, identification, and quantification of changes in proteins in a complex biological system while it is responding to an environmental stimulus. These methods should be of broad utility in the study of other biological systems as well. The limitation of this method is that it depends on either a single peptide or a small number of peptides for the identification of proteins. When such a peptide is common to closely related proteins, such as isoenzymes or other multimeric proteins with a common subunit, it is difficult to unequivocally identify the parent protein.

Figure 5. Up- and down-regulation in protein expression observed by differentially coded isoforms of peptides in yeast extracts using the inverse labeling strategy. (a) No change, (b) 4-fold up-regulation, and (c) an extreme case of up-regulation.

Acknowledgment. The authors gratefully acknowledge financial support from grant 5R01 GM 59996-04. Dr. Miroslav Sedlak (Laboratory of Renewable Resources Engineering, Potter Engineering Center, Purdue University, West Lafayette, IN) is acknowledged for providing yeast samples. We also thank Dr. Mark G. Goebl (Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN) for helpful discussion on yeast proteins.

References

(1) Aneeta; Sanan-Mishra, N.; Tuteja, N.; Kumar Sopory, S. Biochem. Biophys. Res. Commun. 2002, 296, 1063-1068.
(2) Carlson, D. M. Biochimie 1988, 70, 1689-1695.
(3) Kieliszewski, M. J.; Shpak, E. Cell. Mol. Life Sci. 2001, 58, 1386-1398.
(4) Xie, Z.; Merchant, S. Biochim. Biophys. Acta 1998, 1365, 309-318.
(5) Odani, S.; Koide, T.; Ono, T. J. Biol. Chem. 1987, 262, 10 502-10 505.
(6) Weiskirchen, R.; Moser, M.; Weiskirchen, S.; Erdel, M.; Dahmen, S.; Buettner, R.; Gressner, A. M. Biochem. J. 2001, 359, 485-496.
(7) Coyle, P.; Philcox, J. C.; Carey, L. C.; Rofe, A. M. Cell. Mol. Life Sci. 2002, 59, 627-647.
(8) Bornstein, P. Matrix Biol. 2002, 21, 217-226.
(9) Leikauf, G. D.; Borchers, M. T.; Prows, D. R.; Simpson, L. G. Chest 2002, 121, 166S-182S.
(10) Rogers, G. E.; Powell, B. C. J. Invest. Dermatol. 1993, 101, 50S-55S.
(11) Retaux, S.; Bachy, I. Mol. Neurobiol. 2002, 26, 269-281.
(12) Gupta, R. K.; Dobritsa, S. V.; Stiles, C. A.; Essington, M. E.; Liu, Z.; Chen, C.; Serpersu, E. H.; Mullin, B. C. J. Protein Chem. 2002, 21, 529-536.
(13) Kirschke, C. P.; Huang, L. J. Biol. Chem. 2003, 278, 4096-4102.
(14) Ranieri-Raggi, M.; Martini, D.; Sabbatini, A. R. M.; Moir, A. J. G.; Raggi, A. Biochim. Biophys. Acta 2003, 1645, 81-88.
(15) Picello, E.; Damiani, E.; Margreth, A. Biochem. Biophys. Res. Commun. 1992, 186, 659-667.
(16) Dathan, N.; Zaccaro, L.; Esposito, S.; Isernia, C.; Omichinski, J. G.; Riccio, A.; Pedone, C.; Di, B. B.; Fattorusso, R.; Pedone, P. V. Nucleic Acids Res. 2002, 30, 4945-4951.
(17) Kho, R.; Nguyen, L.; Torres-Martinez, C. L.; Mehra, R. K. Biochem. Biophys. Res. Commun. 2000, 272, 29-35.
(18) Misra, P.; Qi, C.; Yu, S.; Shah, S. H.; Cao, W.; Rao, M. S.; Thimmapaya, B.; Zhu, Y.; Reddy, J. K. J. Biol. Chem. 2002, 277, 20 011-20 019.
(19) Moreau, S.; Thomson, R. M.; Kaiser, B. N.; Trevaskis, B.; Guerinot, M. L.; Udvardi, M. K.; Puppo, A.; Day, D. A. J. Biol. Chem. 2002, 277, 4738-4746.
(20) Li, L.; Kaplan, J. J. Biol. Chem. 1997, 272, 28 485-28 493.
(21) Colpas, G. J.; Brayman, T. G.; Ming, L. J.; Hausinger, R. P. Biochem. 1999, 38, 4078-4088.
(22) Aebersold, R.; Mann, M. Nature 2003, 422, 198-207.
(23) Gevaert, K.; Vandekerckhove, J. Electrophoresis 2000, 21, 1145-1154.
(24) Wolters, D. A.; Washburn, M. P.; Yates, J. R., III. Anal. Chem. 2001, 73, 5683-5690.
(25) Geng, M.; Ji, J.; Regnier, F. E. J. Chromatogr. A. 2000, 870, 295-313.
(26) Ren, D.; Penner, N. A.; Slentz, B. E.; Mirzaei, H.; Regnier, F. E. J. Proteome Res. 2003, 2, 321-329.
(27) Chakraborty, A.; Regnier, F. E. J. Chromatogr. A. 2002, 949, 173-184.
(28) Hatzimanikatis, V.; Choe, L. H.; Lee, K. H. Biotechnol. Prog. 1999, 15, 312-318.
(29) Goffeau, A.; Barrell, B. G.; Bussey, H.; Davis, R. W.; Dujon, B.; Feldman, H.; Galibert, F.; Hoheisel, J. D.; Jacq, C.; Johnston, M.; Louis, E. J.; Mewes, H. W.; Murakami, Y.; Philippsen, P.; Tettelin, H.; Oliver, S. G. Science 1996, 274, 563-567.
(30) Velculescu, V. E.; Zhang, L.; Zhou, W.; Vogelstein, J.; Basrai, M. A.; Bassett, D. E., Jr.; Hieter, P.; Vogelstein, B.; Kinzler, K. W. Cell 1997, 88, 243-251.
(31) Wodicka, L.; Hong, H.; Mittmann, M.; Ho, M.-H.; Lockhart, D. J. Nat. Biotechnol. 1997, 15, 1359-1367.
(32) Perrot, M.; Sagliocco, F.; Mini, T.; Monribot, C.; Schneider, U.; Shevchenko, A.; Mann, M.; Jenö, P.; Boucherie, H. Electrophoresis 1999, 20, 2280-2298.
(33) Futcher, B.; Latter, G. I.; Monardo, P.; McLaughlin, C. S.; Garrels, J. I. Mol. Cell. Biol. 1999, 19, 7357-7368.
(34) Guttman, A.; Csapo, Z.; Robbins, D. Proteomics 2002, 2, 469-474.
(35) Yanagida, M. J. Chromatogr. B. 2002, 771, 89-106.
(36) Washburn, M. P.; Wolters, D.; Yates, J. R., III. Nat. Biotechnol. 2001, 19, 242-247.
(37) Wang, Y. K.; Ma, Z.; Quinn, D. F.; Fu, E. W. Anal. Chem. 2001, 73, 3742-3750.
(38) Sedlak, M.; Ho, N. W. Y. Enzyme Microb. Technol. 2001, 28, 16-24.
(39) Ji, J.; Chakraborty, A.; Geng, M.; Zhang, X.; Amini, A.; Bina, M.; Regnier, F. E. J. Chromatogr. B. 2000, 745, 197-210.
(40) MASCOT, Matrix Science (http://www.matrixscience.com/search_form_select.html).
(41) Chaga, G. S. J. Biochem. Biophys. Methods 2001, 49, 313-334.
(42) Gaberc-Porekar, V.; Menart, V. J. Biochem. Biophys. Methods 2001, 49, 335-360.
(43) Sulkowski, E. BioEssays 1989, 10, 170-175.
(44) Hansen, P.; Lindeberg, G. J. Chromatogr. A. 1995, 690, 155-159.
