Multitagging Proteomic Strategy to Estimate Protein Turnover Rates in

Feb 25, 2010 - Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, Minnesota 55455. Received August 31, 2009...
Multitagging Proteomic Strategy to Estimate Protein Turnover Rates in Dynamic Systems

Karthik P. Jayapal,† Siguang Sui,† Robin J. Philp,‡ Yee-Jiun Kok,‡ Miranda G. S. Yap,‡ Timothy J. Griffin,§ and Wei-Shou Hu*,†

Department of Chemical Engineering and Materials Science, University of Minnesota, 421 Washington Avenue SE, Minneapolis, Minnesota 55455, Bioprocessing Technology Institute, Agency for Science Technology and Research, 20 Biopolis Way, #06-01, Centros, Singapore 138668, and Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, Minnesota 55455

Received August 31, 2009

Current techniques for quantitative proteomics focus mainly on measuring overall protein dynamics, which is the net result of protein synthesis and degradation. Understanding the rates of synthesis and degradation is essential to fully appreciate cellular dynamics and to bridge the gap between transcriptome and proteome data. Protein turnover rates can be estimated through “label-chase” experiments employing stable isotope-labeled precursors; however, the implicit assumption of steady state in such analyses may not be applicable to many intrinsically dynamic systems. In this study, we present a novel extension of the “label-chase” concept using SILAC and a secondary labeling step with iTRAQ reagents to estimate protein turnover rates in Streptomyces coelicolor cultures undergoing transition from exponential growth to stationary phase. Such processes are of significance in Streptomyces biology as they pertain to the onset of synthesis of numerous therapeutically important secondary metabolites. The dual labeling strategy enabled decoupling of labeled peptide identification and quantification of degradation dynamics at the MS and MS/MS scans, respectively. Tandem mass spectrometry analysis of these multitagged proteins enabled estimation of degradation rates for 115 highly abundant proteins in S. coelicolor. We compared the rate constants obtained using this dual labeling approach with those from a SILAC-only analysis (assuming steady state) and show that significant differences are generally observed only among proteins displaying considerable temporal dynamics and that the directions of these differences are largely consistent with theoretical predictions.

Keywords: protein turnover • half-life • SILAC • iTRAQ • Streptomyces coelicolor

* To whom correspondence should be addressed. 421 Washington Avenue SE, Minneapolis, MN 55455-0132. Phone: (612) 625-0546. Fax: (612) 6267246. E-mail: [email protected].
† Department of Chemical Engineering and Materials Science, University of Minnesota.
‡ Bioprocessing Technology Institute.
§ Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota.
10.1021/pr9007738 © 2010 American Chemical Society. Journal of Proteome Research 2010, 9, 2087–2097. Published on Web 02/25/2010.

Introduction

Molecular tools like DNA microarrays and quantitative mass spectrometry are increasingly being employed to capture a snapshot of the cellular transcriptome or proteome at any given moment. The information obtained from these studies is the net effect of two counteracting processes, namely synthesis and degradation of mRNA or proteins. Mathematically, even a much simplified model for protein dynamics will yield, at steady state, a protein concentration dependency on mRNA concentration, rate of translation and rate of protein degradation.1 It is the diversity in the values of translation and protein degradation rate constants that accounts for the frequently observed discordances between mRNA and proteins. A reasonable correlation between mRNA and protein dynamics will be observed only when the protein degradation rate constant is much smaller than the specific cell growth rate. Not surprisingly, several previous studies have reported poor to moderate correlations (Spearman rank correlation of about 0.2 to 0.7) between mRNA and protein abundances.2-6 Previous work in our lab has also suggested that only about 65% of genes in the bacterium Streptomyces coelicolor show a reasonable correlation between mRNA and protein dynamics.7 Thus, protein turnover is an important, but as yet under-explored, dimension of cellular dynamics that needs further attention.

In the past two decades, two common approaches have been adopted to study protein turnover. One approach involves the use of reagents like cycloheximide to inhibit translation in eukaryotes, followed by measuring the decrease in protein concentrations through 2-D PAGE or quantitative Western blots.8-10 However, the use of external chemical reagents in such experiments may prompt one to question the physiological relevance of the results obtained. In the second approach, the synthesis/degradation dynamics of an otherwise stable pool of proteins are followed using metabolic tracers. Identification of proteins that are being rapidly synthesized under various conditions has been carried out through pulse-chase experiments using radioactive 35S-methionine followed by 2-D PAGE and autoradiography.11-15 This technique, however, generally yields only qualitative results, and protein identification may also be complicated by the presence of other interfering proteins in the gel. Besides, when possible, it is desirable to avoid using 35S-labeled amino acids because of safety concerns.16

Beynon and co-workers have presented an elegant experimental design using metabolic labeling of proteins with stable isotope tagged amino acids (SILAC) followed by LC-MS/MS.17-19 Metabolically labeled (or unlabeled) proteins were transferred to an unlabeled (or labeled) medium and the rate of loss (or gain) of label was monitored using mass spectrometry. Their approach involved 2-D PAGE separation of proteins before protein spot excision, digestion and analysis by mass spectrometry. The results yielded turnover rates of some of the most abundant proteins in Saccharomyces cerevisiae, chicken skeletal muscle and human adenocarcinoma cells.

SILAC experiments yield quantitative values for the fraction of heavy (labeled) to total (labeled + unlabeled) protein levels, that is, IH(t)/[IH(t) + IL(t)], for every protein, at any given time t. However, to estimate the first order rate constant of protein degradation, one needs to estimate what fraction, f, of labeled protein at time t0 is remaining at any later time t, that is, f = IH(t)/IH(t0). Under conditions of complete labeling at t0, the two denominator terms, IH(t) + IL(t) and IH(t0), will be equal when a steady state assumption can be made (i.e., no net synthesis/degradation for every protein i). Thus quantitative information from SILAC is sufficient to estimate protein turnover rates under conditions of steady state. The steady state assumption, however, is not applicable during analysis of dynamic processes like cellular differentiation.
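The step from the SILAC ratio to f under these two conditions can be written out explicitly. The lines below are a short worked restatement of the argument above, using only the symbols already defined; they are a sketch for illustration and not a derivation taken from the cited work.

$$
\begin{aligned}
I_H(t) + I_L(t) &= I_H(t_0) + I_L(t_0) && \text{(steady state: total level of the protein is unchanged)} \\
I_L(t_0) &= 0 && \text{(complete labeling at } t_0\text{)} \\
\Rightarrow\quad f = \frac{I_H(t)}{I_H(t_0)} &= \frac{I_H(t)}{I_H(t) + I_L(t)}
\end{aligned}
$$

If either condition fails, the SILAC ratio on the right no longer equals f, which is the case considered in the rest of this work.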
The focus of this work is on Streptomyces coelicolor, a multicellular differentiating bacterium that is commonly used as a model organism for the genus Streptomyces. Streptomycetes account for over two-thirds of naturally occurring antibiotics that are in clinical use today.20 To overcome the limitation imposed by the steady state assumption, we present here a multitagging proteomic approach incorporating the principles of both SILAC and iTRAQ21 labeling systems to estimate protein turnover rate constants in S. coelicolor. To our knowledge, this is the first time that a multitagging proteomic strategy has been devised to estimate global protein turnover rates.

Materials and Methods

Bacterial Strains and Culture Conditions. S. coelicolor M145 spores were generated by cultivating the mycelia on Mannitol-Soy flour agar plates.22 Liquid cultures were grown in incubated shakers (220 rpm; 30 °C) in a previously described defined medium23 with the following modifications. Instead of glutamic acid as the sole nitrogen source, an amino acid mix containing the following amounts per liter of medium was added: alanine, 59 mg; arginine, 70 mg; asparagine, 11 mg; aspartic acid, 39 mg; cysteine, 4.5 mg; glutamine, 18 mg; glutamic acid, 70 mg; glycine, 35 mg; histidine, 17 mg; isoleucine, 18 mg; leucine, 64 mg; lysine, 14 mg; methionine, 12 mg; phenylalanine, 21 mg; proline, 34 mg; serine, 25 mg; threonine, 35 mg; tryptophan, 15 mg; tyrosine, 18 mg; and valine, 48 mg. The relative amounts of amino acids were chosen based on the frequency of their occurrence in the S. coelicolor theoretical proteome. In addition, a vitamin mix consisting of riboflavin, thiamine, niacin, folic acid, biotin, inositol and pyridoxal HCl was added, each at 0.5 µg per liter of medium. During the labeling phase of culture, [13C6,15N4]-arginine (Cambridge Isotopes, MA) was used instead of normal arginine. Spores were inoculated in siliconized conical flasks with stainless steel coiled springs at concentrations of 10^7 per mL of culture medium. Antifoam 289 was added at 0.05% (v/v) concentration. Cell growth was monitored by measuring optical density at 450 nm of dispersed (sonicated) mycelia. For transfer of cells from labeled to unlabeled medium, cells were harvested by brief centrifugation at 4 °C, washed twice with ice-cold PBS and transferred to unlabeled medium. Samples for proteomic analysis were harvested periodically by rapid chilling in a dry ice/ethanol bath followed by centrifugation. Cell pellets were stored at -80 °C until further analysis.

Cell Lysis. Frozen cell pellets were pulverized by grinding in liquid nitrogen and cellular contents were solubilized in 50 µL of lysis buffer (8 M urea, 4% CHAPS) supplemented with 4 mM phenylmethylsulfonyl fluoride. The volumes were brought up to 400 µL each with dissolution buffer (0.5 M triethylammonium bicarbonate, pH 8.0) and protein assays were then carried out using the Coomassie Plus Bradford assay (Pierce Research Instruments, Singapore). In the SILAC method, aliquots of 100 µg protein from each sample were applied directly to gel electrophoresis. In the SILAC/iTRAQ method, aliquots of 100 µg protein from each sample were processed for labeling with iTRAQ according to the manufacturer’s instructions (Applied Biosystems, Foster City, CA). The labeled samples were mixed and concentrated in a SpeedVac to reduce volatile content, before diluting 10× with SCX loading buffer (10 mM KH2PO4, 25% acetonitrile, pH 3.0).

Cation Exchange Chromatography for iTRAQ-Labeled Sample. Injection of the iTRAQ-labeled sample was performed in multiple aliquots onto an SCX column (PolyLC 2 mm × 150 mm, Nestgroup, Southborough, MA).
Separation of peptides was performed by developing a 2-step KCl gradient: from 0% to 20% salt buffer (10 mM KH2PO4, 20% acetonitrile, 500 mM KCl, pH 3) over 40 min, followed by an increase to 100% salt buffer over 20 min, at a flow rate of 200 µL/min with fractions collected every 1.5 min. Fractions were desalted using C-18 spin columns (Vivapure, Sartorius, Singapore) and eluted in 70% acetonitrile, 0.25% formic acid. Acetonitrile in the eluant was eliminated by SpeedVac and each fraction was reconstituted with 1% formic acid, 2% methanol in Milli-Q water for mass spectrometric analysis.

Electrophoresis and In-Gel Digestion for SILAC Sample. Aliquots of 100 µg protein from each sample were loaded onto a 13 cm long 10% acrylamide gel (Biorad) and run at a constant current of 30 mA. The gel was subsequently stained with Coomassie blue, and each lane was cut into 48 bands of equal width. Each band was cut into 1-2 mm^2 pieces and washed in a buffer of 25 mM ammonium bicarbonate in 50% acetonitrile. Samples were then incubated with 25 mM TCEP and 55 mM iodoacetamide in 100 mM ammonium bicarbonate at 37 °C for 1 h, to reduce and alkylate cysteine residues, and digested overnight with 0.2 mg trypsin (Promega V5111) in 25 mM ammonium bicarbonate at 37 °C. Peptides were extracted from the gel with buffers containing 0.1% formic acid in 70% acetonitrile, followed by 0.1% formic acid in 100% acetonitrile. Organic content was removed by drying in a SpeedVac, and peptides were resolubilised in 9 mL loading buffer (1% formic acid, 2% methanol).

Mass Spectrometry. Nanoscale liquid chromatography-tandem mass spectrometry (LC-MS/MS) was performed using a QSTAR-XL hybrid quadrupole-time-of-flight tandem mass spectrometer (Applied Biosystems) coupled to an LC-Packings (Sunnyvale, CA) system comprising a FAMOS autoinjector unit, a SWITCHOS 10 port valve unit, and an ULTIMATEPLUS nanoflow pumping unit. An injection volume of 10 µL from each sample was made onto a reversed-phase C-18 peptide trapping cartridge (300 µm × 5 mm, LC-Packings) in a flow of 0.1% formic acid for 5 min at 25 µL/min. Following the wash step, the flow from the pumping unit was diverted back through the trapping cartridge at 100 nL/min. Peptides were eluted from the cartridge by application of a gradient from 0 to 90% acetonitrile in 0.1% formic acid over 40 min at 100 nL/min, and separated by passing through a C-18 reversed phase column (packed in-house with 5 µm particle size packing material from Column Engineering, Ontario, CA). Peptides eluting from the column were sprayed directly into the orifice of the mass spectrometer, which was run in IDA (information dependent acquisition) mode selecting all 2+ to 4+ charged ions with signal intensity greater than eight counts per second over the specified mass range. For collision-induced dissociation, nitrogen gas was used at a setting of four and the collision energy was set to automatic, allowing increased energy with increasing ion mass.

Protein Identification and Quantification. Protein identifications and quantifications were carried out by searching the raw data files (*.wiff) against a database containing predicted S. coelicolor protein sequences (ftp://ftp.sanger.ac.uk/pub/S_coelicolor/whole_genome/Sco.prot_fas) appended with a reversed sequence database (for estimation of false identification rates) and a list of common contaminant protein sequences. ProteinPilot software v2.0 was used for all searches. In the SILAC approach, trypsin specificity, cysteine alkylation with MMTS, and SILAC (Arg + 10 Da) were chosen along with other default options.
In the SILAC-iTRAQ multitagging approach, all the settings were the same except that the data dictionary in the software was modified to include searches for [13C6,15N4]-arginine while iTRAQ was chosen as the labeling agent. Peptide spectra with a confidence level >90% were selected and exported to Microsoft Excel 2003 for further analysis. In the SILAC approach, heavy-labeled arginine-containing peptides were selected for quantification of averaged IH(t)/IL(t) at the protein level. With the steady state assumption that the protein abundance is constant, the ratio of heavy to light protein abundance at time t, IH(t)/IL(t), could be used to calculate the ratio of heavy abundance at time t to that at time 0, IH(t)/IH(t0). Details are given in the Results. In the SILAC-iTRAQ approach, peptides containing lysines were selected for the quantification of total protein dynamics, while peptides containing the user-defined (heavy) arginines were selected for the calculation of protein degradation constants. As samples from t = 0, 2, 4, and 8 h were labeled with iTRAQ reagents 114, 115, 116 and 117, respectively, the intensity ratios of iTRAQ reagents 115, 116 and 117 to 114 were representative of the ratios of peptide abundances at 2, 4 and 8 h to those at 0 h, respectively. The average ratio over multiple peptide spectra from the same protein was taken as the ratio of the protein abundance at a given time point to its quantity at time 0 h, IH(t)/IH(t0). The IH(t)/IH(t0) values obtained for these proteins were fit to an exponential decay curve and a first order rate constant for label loss was estimated for every protein.
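The iTRAQ-based calculation described above (reporter ratios relative to 114, averaging over spectra, exponential fit) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' spreadsheet: the function names and the reporter intensities are hypothetical, no iTRAQ bias correction is applied, and the fit is a least-squares line for ln f = -kt forced through the origin, which is one of several reasonable ways to fit e^(-kt).

```python
import math
from statistics import mean

# Sampling times (h) paired with iTRAQ reporters 114, 115, 116, 117
TIMES_H = [0, 2, 4, 8]


def label_fraction(reporters):
    """Ratio of each reporter intensity to the 114 (t = 0 h) channel."""
    base = reporters[0]
    return [x / base for x in reporters]


def protein_fraction(peptide_spectra):
    """Average the per-spectrum ratios at each time point for one protein."""
    ratios = [label_fraction(s) for s in peptide_spectra]
    return [mean(col) for col in zip(*ratios)]


def decay_constant(times_h, f):
    """Least-squares k for ln f = -k t, fitted through the origin (assumed choice)."""
    num = sum(t * math.log(v) for t, v in zip(times_h, f))
    den = sum(t * t for t in times_h)
    return -num / den


# Hypothetical reporter intensities (114, 115, 116, 117) from two spectra
# of heavy-arginine peptides belonging to the same protein
spectra = [
    [1000.0, 800.0, 640.0, 410.0],
    [500.0, 410.0, 320.0, 205.0],
]

f = protein_fraction(spectra)          # IH(t)/IH(t0) at 0, 2, 4, 8 h
k = decay_constant(TIMES_H, f)         # first order rate constant, per hour
half_life_h = math.log(2) / k          # corresponding half-life in hours
```

Forcing the fit through the origin uses the fact that the ratio at t = 0 h is 1 by construction; an ordinary regression with a free intercept would be an equally simple alternative.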

Results

Concept and Experimental Design. The experiment was characterized by two culture phases: a “labeling phase”, when intracellular proteins were labeled using a stable isotope-containing medium, followed by a “chase phase”, when the rate of label loss was measured in an unlabeled medium. Complete labeling of intracellular proteins was not achieved during the labeling phase due to endogenous amino acid synthesis. The extent of label incorporation (R) before cell transfer to unlabeled medium was experimentally determined using mass spectrometry. Let IH(t0) and IL(t0) be the mass spectrometric peak intensities of heavy labeled and unlabeled peptides, respectively, from protein i at time t = 0, just prior to cell transfer. If cells had attained steady state in the labeling phase,

$$R = \frac{I_H(t_0)}{I_H(t_0) + I_L(t_0)} \tag{1}$$
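Equation 1 can be evaluated per peptide and then summarized across peptides. The snippet below is a minimal sketch: the intensity values are hypothetical, and using the median as the summary and the range as a consistency check are assumptions made for illustration, not steps stated in the text.

```python
from statistics import median


def label_incorporation(i_heavy, i_light):
    """Eq 1: fraction of heavy label in a peptide at t0."""
    return i_heavy / (i_heavy + i_light)


# Hypothetical (heavy, light) peak intensities at t0 for three peptides
peptides_t0 = [(680.0, 320.0), (1400.0, 600.0), (330.0, 170.0)]

r_values = [label_incorporation(h, l) for h, l in peptides_t0]
r_median = median(r_values)              # summary value of R (assumed choice)
spread = max(r_values) - min(r_values)   # small spread suggests consistent labeling
```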

The value of the right-hand side of eq 1 determined from all peptides is expected to be approximately the same if cells had truly attained steady state during the labeling phase, and this will equal the fraction of labeled amino acids in the intracellular free amino acid pool. Let IH(t) and IL(t) be the peak intensities of the peptide from protein i at any later time t after transfer of cells to unlabeled medium. If a fraction f of protein i (labeled + unlabeled) at time t0 remains after time t, then the ratio of heavy to total protein can be written as

$$\frac{\text{labeled protein at } t}{\text{total protein at } t} = \frac{I_H(t)}{I_H(t) + I_L(t)} = \frac{f \cdot R \cdot p^{i}(t_0)}{p^{i}(t)} \tag{2}$$

where pi represents the total amount of intracellular protein i. If there is no net synthesis or degradation of protein i (steady state assumption), then pi(t0) = pi(t). Equation 2 can then be rearranged to calculate f, which can be fit to an exponential decay curve, e^(-kt), to estimate the degradation rate constant k. The steady state assumption is, however, invalid for fast growing organisms like bacteria, where intracellular protein levels change dynamically with time. In this paper we develop a novel multitagging approach drawing from the concepts of SILAC and iTRAQ to estimate protein turnover rates in dynamic organisms. The approach aims to directly estimate the fraction f using iTRAQ ion peaks, thereby bypassing the need for the steady state assumption. A schematic of the approach is illustrated in Figure 1 using arginine as an example for the labeling amino acid. In this approach, proteins extracted from cell samples at t = t0, t1, t2, and t3 after transfer to unlabeled medium are digested with trypsin, labeled with iTRAQ reagents, fractionated and analyzed by LC-MS/MS as shown in Figure 1a. When peptide mixtures are ionized and examined in the mass spectrometer (MS scans), all arginine-containing peptides are resolved into heavy and light peaks, both of which represent peptides from all four time points (Figure 1b). When these peptides are further analyzed by tandem mass spectrometry (MS/MS scans), the iTRAQ tags are fragmented and relative ratios for heavy and light peptides at each time point are obtained, as shown in Figure 1b. The quantitation region of the MS/MS spectra (iTRAQ tags) corresponding to heavy (labeled) peptides, therefore, provides a direct estimation of the decrease in labeled protein concentrations over time without the need for the steady state assumption.

$$f = \mathrm{iTQ}_H = \frac{I_H(t)}{I_H(t_0)} \tag{3}$$

where the right-hand side ratio is directly estimated from iTRAQ analysis. This fraction, f, can be fit to an exponential decay curve, e^(-kt), to calculate a protein degradation rate constant. Peptide sequences that do not contain arginine residues provide no information on the rate of label loss. However, quantitative information from such peptides reflects the net dynamics of proteins (Figure 1b). This information is required to perform normalization (iTRAQ bias correction) because, unlike SILAC ratios, iTRAQ ratios are not internally normalized.

Figure 1. Schematic of the multitagging experimental strategy used and illustrations of peptide quantitation. (a) Protein samples isolated at time t = 0, 2, 4 and 8 h are labeled with iTRAQ reagents 114 (yellow), 115 (teal), 116 (green) and 117 (purple), respectively. Each peptide contains heavy (red) or light (blue) isotopic arginine as well as one of the iTRAQ tags. The labeled peptides are mixed and analyzed by mass spectrometry. (b) In this example, the iTRAQ tag fragments observed in tandem MS for a peptide from SCO2620 (cell division trigger factor) with no arginine (INQQVTVK, m/z = 609.4 Da) are used to report the overall dynamics of the protein. Another peptide from the same protein with an arginine residue (LNVSQEELTEHLMR, unlabeled m/z = 614.9 Da) is resolved into heavy and light peaks during MS scans. The iTRAQ tags derived from further fragmentation of each of these heavy and light fractions provide the rate of heavy label loss (decreasing trend) or light label incorporation (increasing trend) into the protein.

Growth Kinetics and Sampling of Streptomyces coelicolor Cells. Streptomyces coelicolor is a relatively fast growing bacterium in which antibiotic synthesis is accompanied by rapid changes in proteomic profiles. S. coelicolor M145 cells were cultured in a completely defined medium (Figure 2) containing maltose as the primary carbon source. All 20 amino acids were added in excess quantities to promote adequate amino acid uptake and minimize intracellular amino acid interconversions. Protein labeling was accomplished by using [13C6,15N4]-arginine instead of normal arginine for the first part of the culture. Arginine was chosen for labeling because trypsin cleaves proteins after every lysine or arginine (unless followed by proline) and because the theoretical proteome of S. coelicolor is about four times more abundant in arginine than in lysine. Cells were cultivated in this heavy labeled medium for 24 h (at least 5-6 doublings starting from spores), after which they were harvested and transferred to unlabeled medium. To minimize physiological changes that may occur due to the change of media, spent medium derived from a parallel unlabeled S. coelicolor culture was used for the chase phase of culture. Periodic samples were harvested from this part of the culture at t = 0, 2, 4 and 8 h after transfer to unlabeled medium for mass spectrometry analysis.

Figure 2. Growth-time curve of S. coelicolor in defined medium with isotopic arginine labeling. The time at which a fraction of the heavy labeled cells were harvested and transferred to a preconditioned unlabeled medium is shown. The remaining cells continued to be cultivated in heavy arginine medium and the complete growth curve is shown.

Estimation of Protein Degradation Rates. Cell samples were analyzed using both the SILAC and SILAC-iTRAQ multitagging approaches. For the SILAC approach, total protein isolated from each of the time points was first separated on a one-dimensional SDS-PAGE gel. Gel bands were sliced from each sample and analyzed by mass spectrometry. The extent of label incorporation (R) in the t = 0 h sample was calculated to be 0.68. The intracellular proteins were not completely labeled with heavy arginine even though the culture was started using a very small quantity of biomass (spores) in the labeled medium, most likely because S. coelicolor synthesizes arginine through endogenous mechanisms. After the cells were transferred to the unlabeled medium, the fraction of label in proteins steadily decreased over time, reaching a median value of 0.1 among identified proteins at t = 8 h. Mass spectrometry analysis of the t = 2, 4 and 8 h samples identified a total of 493 proteins with ProteinPilot identification confidence ≥90%. However, only 246 of these proteins were identified with quantification peaks from two or more peptides at ≥3 time points and with Pearson’s coefficient (r) for the log(f) vs t straight line fit greater than or equal to 0.85 (Supplemental Figure S1, Supporting Information). For the SILAC-iTRAQ multitagging approach, samples from t = 0, 2, 4, and 8 h were labeled with iTRAQ reagents 114, 115, 116, and 117, respectively. The labeled samples were pooled together and analyzed using 2-D LC-MS. A total of 182 proteins were identified with ProteinPilot identification confidence ≥90% using this approach. Of these, only 115 had quantification peaks from two or more peptides at ≥3 time points and Pearson’s coefficient (r), as described above, greater than or equal to 0.85 (Supplemental Figure S2, Supporting Information).
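The SILAC-only calculation and the acceptance criteria described above can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names and example values are hypothetical, the steady-state form of eq 2 (f = heavy fraction / R) is assumed, and the threshold is applied to the absolute value of Pearson's r because log(f) decreases with time, which is one reading of the stated criterion.

```python
import math

R = 0.68  # extent of label incorporation at t = 0 h, as reported in the text


def silac_fraction(heavy_fraction, r=R):
    """Steady-state form of eq 2: f = [IH/(IH + IL)] / R."""
    return heavy_fraction / r


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_k(times_h, f_values):
    """Negative least-squares slope of ln(f) versus t (free intercept)."""
    logs = [math.log(f) for f in f_values]
    n = len(times_h)
    mt, ml = sum(times_h) / n, sum(logs) / n
    sxy = sum((t - mt) * (y - ml) for t, y in zip(times_h, logs))
    sxx = sum((t - mt) ** 2 for t in times_h)
    return -sxy / sxx


def passes_filter(times_h, f_values, n_peptides, min_abs_r=0.85):
    """At least 2 peptides, at least 3 time points, and |r| >= 0.85 for ln(f) vs t."""
    if n_peptides < 2 or len(times_h) < 3:
        return False
    logs = [math.log(f) for f in f_values]
    return abs(pearson_r(times_h, logs)) >= min_abs_r


# Hypothetical heavy fractions IH/(IH + IL) for one protein at t = 2, 4 and 8 h
times_h = [2, 4, 8]
heavy_fraction = [0.50, 0.37, 0.10]

f_values = [silac_fraction(h) for h in heavy_fraction]
accepted = passes_filter(times_h, f_values, n_peptides=3)
k_silac = fit_k(times_h, f_values)  # per hour, valid only under steady state
```

The same `passes_filter` check could be applied to f values obtained from iTRAQ reporter ratios, since the text states that the identical criteria were used for both approaches.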
Comparison of SILAC and Multitagging Approaches. To compare the SILAC and SILAC-iTRAQ multitagging approaches described earlier, we considered what type of error will be encountered in the SILAC approach for proteins with rapid changes in intracellular concentration. Equation 2 can be rewritten as:

$$f = \frac{1}{R} \cdot \frac{p^{i}(t)}{p^{i}(t_0)} \cdot \frac{I_H(t)}{I_H(t) + I_L(t)} \tag{4}$$

Therefore, for proteins whose concentrations increase with time, the term pi(t)/pi(t0) will be larger than 1.0. Hence the “SILAC-estimated” value of f (assuming that pi(t) = pi(t0)) will be smaller than the true value from eq 4. This will result in larger degradation constant (k) estimates than the correct value. Similarly, for proteins whose concentrations decrease over time, the SILAC-estimated k will be smaller than the actual value. The SILAC-iTRAQ multitagging approach, as mentioned earlier, is not affected by the overall dynamics of the protein and hence should provide a degradation constant estimate closer to reality in all cases. We sought to assess the value of pi(t)/pi(t0) (i.e., the overall dynamics of each protein) using peptides containing no arginine residue in the SILAC-iTRAQ multitagging approach. The iTRAQ ratio (iTQNA) for any protein calculated using only those peptides with no arginine will not be influenced by label loss kinetics and hence will provide a true estimation of the overall dynamics of that protein.

$$\mathrm{iTQ}_{NA} = \frac{I_{NA}(t)}{I_{NA}(t_0)} \tag{5}$$

where the subscript NA represents “no arginine” (Supplemental Figure S3, Supporting Information). To assess the overall dynamics of proteins, we computed Spearman’s rank correlation (rs) between iTQNA(t) and time (t) for every protein. On the basis of the value of rs, we classified proteins as follows:
a. if 0.8 ≤ rs ≤ 1: iTQNA(t) and t are highly correlated, “monotonically increasing” protein
b. if -1 ≤ rs ≤ -0.8: iTQNA(t) and t are highly anticorrelated, “monotonically decreasing” protein
c. if -0.8