Anal. Chem. 2005, 77, 2739-2744

Assessing Reproducibility of a Protein Dynamics Study Using in Vivo Labeling and Liquid Chromatography Tandem Mass Spectrometry

Henrik Molina,†,‡ Giovanni Parmigiani,§,| and Akhilesh Pandey*,†

McKusick-Nathans Institute for Genetic Medicine and the Department of Biological Chemistry, The Sidney Kimmel Comprehensive Cancer Center, and Department of Biostatistics, Bloomberg School of Public Health, Center, Johns Hopkins University, Baltimore, Maryland 21205, and Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense 5230, Denmark

Measuring the dynamics of protein abundance in cells in response to stimuli such as growth factors or drugs requires analysis of more than one time point. Proteomic approaches have traditionally been used to measure only one state at a time because quantitation is difficult, especially when mass spectrometry is used as a readout. Isotopically labeled reagents have recently been introduced that allow comparison of two or three different states by mass spectrometry. Here, we evaluate the reproducibility of an experiment that measures three states simultaneously through stable isotope labeling of cells with amino acids in cell culture (SILAC) using light, medium, and heavy versions of amino acids. The major goal of this study was to assess the reproducibility of such experiments in combination with liquid chromatography tandem mass spectrometry (LC-MS/MS). Our results show that it is possible to obtain reproducible quantitative data to study protein dynamics based on our analysis of more than 220 peptide sets derived from 20 proteins from 3 different LC-MS/MS runs.

Approaching systems biology requires measurement of dynamics of various biological processes as a crucial component. Quantitative proteomics using mass spectrometry is a valuable tool for this purpose. Although some studies have addressed quantitation by comparing abundance of peptides from different samples (refs 1, 2), most of the quantitative approaches rely on the incorporation of stable isotopes into the sample that is to be measured. Several labeling methods have been developed over the past few years that involve chemical modifications, in vitro, by heavy isotope containing tags, or labeling, in vivo, by heavy isotope containing nutrients (refs 3-10). For instance, in vitro methods include biotinylated isotope coded affinity tags for labeling cysteine residues (ref 3), labeling of the peptide carboxy terminal with heavy oxygen (refs 11, 12), or labeling of the amino terminus (refs 4, 13). In vivo labeling approaches include the use of heavy isotope-containing media (e.g., ¹⁵N) (refs 7, 8) or amino acids (refs 9, 14).

Although the above-mentioned methods allow us to measure relative differences, older analytical methods have been reintroduced to measure absolute quantities, by spiking samples with isotopically labeled synthetic versions of one or more components contained in the sample (refs 15, 16). Although absolute quantitation is tempting and useful for some problems, the process is tedious and cannot easily be undertaken on a proteome-wide scale. In vivo labeling approaches are easy, result in efficient and uniform labeling of the proteome, and allow relative quantitation to be carried out on a global scale.

A recently published study evaluated the reproducibility of in vivo labeling of yeast cells using ¹⁵N-labeled media (ref 10). In this study, we have assessed the reproducibility and accuracy of an in vivo labeling method, SILAC, combined with LC-MS/MS, for relative quantitation of three different states simultaneously. Our study of reproducibility and accuracy of SILAC involved mixing of cell lysates derived from cells labeled with three isotopically different forms of arginine in a 2:3:4 ratio. The labeled proteins were resolved by 1D-gel electrophoresis and subjected to three independent analyses by liquid chromatography tandem mass spectrometry. From measuring the relative ratios of more than 600 peptides, we conclude the following: (1) most peptide sets exhibited the expected abundance ratio based on mixing, and (2) averaging ratios from multiple peptide sets from a protein provided accurate ratios and intra- and interexperimental reproducibility. Thus, this method should be useful for measuring dynamics in any cell culture-based system where stable isotope containing amino acids can be used.

* To whom correspondence should be addressed. E-mail: pandey@jhmi.edu.
† McKusick-Nathans Institute for Genetic Medicine and the Department of Biological Chemistry, Johns Hopkins University.
‡ University of Southern Denmark.
§ The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University.
| Department of Biostatistics, Bloomberg School of Public Health, Center, Johns Hopkins University.

10.1021/ac048204b CCC: $30.25 Published on Web 03/22/2005. © 2005 American Chemical Society.
Analytical Chemistry, Vol. 77, No. 9, May 1, 2005

(1) Chelius, D.; Zhang, T.; Wang, G.; Shen, R. F. Anal. Chem. 2003, 75, 6658-6665.
(2) Wang, W.; Zhou, H.; Lin, H.; Roy, S.; Shaler, T. A.; Hill, L. R.; Norton, S.; Kumar, P.; Anderle, M.; Becker, C. H. Anal. Chem. 2003, 75, 4818-4826.
(3) Gygi, S. P.; Rist, B.; Gerber, S. A.; Turecek, F.; Gelb, M. H.; Aebersold, R. Nat. Biotechnol. 1999, 17, 994-999.
(4) Munchbach, M.; Quadroni, M.; Miotto, G.; James, P. Anal. Chem. 2000, 72, 4047-4057.
(5) Goodlett, D. R.; Keller, A.; Watts, J. D.; Newitt, R.; Yi, E. C.; Purvine, S.; Eng, J. K.; von Haller, P.; Aebersold, R.; Kolker, E. Rapid Commun. Mass Spectrom. 2001, 15, 1214-1221.
(6) Cagney, G.; Emili, A. Nat. Biotechnol. 2002, 20, 163-170.
(7) Oda, Y.; Huang, K.; Cross, F. R.; Cowburn, D.; Chait, B. T. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 6591-6596.
(8) Conrads, T. P.; Alving, K.; Veenstra, T. D.; Belov, M. E.; Anderson, G. A.; Anderson, D. J.; Lipton, M. S.; Pasa-Tolic, L.; Udseth, H. R.; Chrisler, W. B.; Thrall, B. D.; Smith, R. D. Anal. Chem. 2001, 73, 2132-2139.
(9) Ong, S. E.; Blagoev, B.; Kratchmarova, I.; Kristensen, D. B.; Steen, H.; Pandey, A.; Mann, M. Mol. Cell. Proteomics 2002, 1, 376-386.
(10) Washburn, M. P.; Ulaszek, R. R.; Yates, J. R., 3rd. Anal. Chem. 2003, 75, 5054-5061.
(11) Yao, X.; Freas, A.; Ramirez, J.; Demirev, P. A.; Fenselau, C. Anal. Chem. 2001, 73, 2836-2842.
(12) Bantscheff, M.; Dumpelfeld, B.; Kuster, B. Rapid Commun. Mass Spectrom. 2004, 18, 869-876.
(13) Lemmel, C.; Weik, S.; Eberle, U.; Dengjel, J.; Kratt, T.; Becker, H. D.; Rammensee, H. G.; Stevanovic, S. Nat. Biotechnol. 2004, 22, 450-454.
(14) Blagoev, B.; Ong, S. E.; Kratchmarova, I.; Mann, M. Nat. Biotechnol. 2004, 22, 1139-1145.
(15) Gerber, S. A.; Rush, J.; Stemman, O.; Kirschner, M. W.; Gygi, S. P. Proc. Natl. Acad. Sci. U.S.A. 2003, 100, 6940-6945.
(16) Barnidge, D. R.; Dratz, E. A.; Martin, T.; Bonilla, L. E.; Moran, L. B.; Lindall, A. Anal. Chem. 2003, 75, 445-451.

EXPERIMENTAL SECTION

Chemicals and Media. DMEM, dialyzed serum, and custom DMEM medium lacking arginine were purchased from Invitrogen (Carlsbad, CA). ¹³C₆¹⁴N₄- and ¹³C₆¹⁵N₄-arginine were purchased from Cambridge Isotope Laboratories (Andover, MA). NP-40 was purchased from Calbiochem (San Diego, CA). Solvents for liquid chromatography were purchased as follows: heptafluorobutyric acid (Sigma-Aldrich Corp., St. Louis, MO), glacial acetic acid (Fisher Scientific, Fairlawn, NJ), and HPLC-grade acetonitrile (J. T. Baker, Phillipsburg, NJ).

Preparation of Three Different Labeled Versions of HeLa Cells. HeLa cells were grown in DMEM medium supplemented with 10% dialyzed fetal bovine serum (Invitrogen). The cells were adapted and grown in DMEM medium lacking arginine and supplemented with either ¹³C₆¹⁴N₄- or ¹³C₆¹⁵N₄-arginine and 10% dialyzed fetal bovine serum.
One 90% confluent 15-cm dish each of HeLa cells grown in normal, ¹³C₆¹⁴N₄-labeled, or ¹³C₆¹⁵N₄-labeled arginine was lysed in modified RIPA buffer (50 mM Tris-HCl pH 7.4, 150 mM NaCl, 1% NP-40, and 0.25% sodium deoxycholate).

Mixing of Labeled Samples. Protein concentrations in the lysates from the three differently labeled HeLa cells were measured by the modified Lowry method. The three labeled HeLa cell lysates were mixed in a ratio of 2:3:4, corresponding to ¹²C₆¹⁴N₄-arginine/¹³C₆¹⁴N₄-arginine/¹³C₆¹⁵N₄-arginine (light/medium/heavy). The combined lysates were resolved by SDS-PAGE and stained with colloidal Coomassie (Invitrogen kit).

Protein Digestion and LC-MS/MS Analysis. Gel bands were excised and digested with trypsin, and the extracted peptides were subjected to LC-MS/MS analysis as follows. The samples were injected onto a specially prepared trap column (i.d. 75 µm, length 5 cm) packed with C18 particles (ODS-A YMC), using an autosampler (1100-microwell plate autosampler, Agilent Technologies, Palo Alto, CA). Peptides were eluted from the trap onto an analytical C18 column (i.d. 75 µm, length 10 cm, Vydac, MS218) with a gradient increasing from 10% solvent B/90% solvent A to 50% solvent B/50% solvent A in 30 min (A: 0.4% acetic acid, 0.05% heptafluorobutyric acid; B: 90% acetonitrile, 0.4% acetic acid, 0.05% heptafluorobutyric acid). All flow and gradients were delivered by a nanoflow pump (Agilent Technologies). The LC setup was connected to an ion trap mass spectrometer (LC/MSD Trap XCT, Agilent Technologies) using a nanoelectrospray source from Proxeon A/S (Odense, Denmark). The data were acquired in standard enhanced mode (8100 m/z per s) scanning from m/z 300 to 1200 and selecting the three most intense ions in each cycle for fragmentation.

Figure 1. Schematic of the strategy used to assess reproducibility of the protein dynamics study. HeLa cells grown in media containing normal arginine, ¹³C₆-arginine, or ¹³C₆¹⁵N₄-arginine were lysed and the lysates mixed in the ratio 2:3:4. After SDS-PAGE, the gel bands were digested with trypsin and analyzed by LC-MS/MS. Peptides containing arginine were found by database searching and confirmed by visual inspection of the mass spectra.

Figure 2. LC-MS/MS base peak chromatograms of a representative protein sample. Three base peak chromatograms from the triplicate analysis of a protein band are shown in three different colors and can be almost superimposed on each other.
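The solvent gradient described in the LC-MS/MS paragraph can be turned into a mobile-phase composition at any time point. The sketch below assumes the ramp from 10% to 50% solvent B is linear over the 30 min, which the text does not state explicitly; the function names are illustrative only. Because solvent B is 90% acetonitrile and solvent A is listed without acetonitrile, the acetonitrile content is 0.9 times the percentage of B.

```python
def percent_b(t_min, start_b=10.0, end_b=50.0, duration_min=30.0):
    """Percent solvent B at time t_min (minutes).

    Assumes a linear ramp between start_b and end_b; this linearity is an
    assumption, not something the Experimental Section specifies.
    """
    t = min(max(t_min, 0.0), duration_min)  # clamp to the gradient window
    return start_b + (end_b - start_b) * t / duration_min


def percent_acetonitrile(t_min, acn_in_b=90.0):
    """Acetonitrile content of the mobile phase; solvent A is listed without any."""
    return percent_b(t_min) * acn_in_b / 100.0


for t in (0, 15, 30):
    print(t, percent_b(t), percent_acetonitrile(t))
```

Under this assumption the peptides elute between roughly 9% and 45% acetonitrile.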

Figure 3. Three replicate analyses of two representative doubly charged peptides from three different isotopic states. MS spectrum of a peptide corresponding to VSHLLGINVTDFTR (myosin) (panel A) or IGGIGTVPVGR (elongation factor 2) (panel B) is shown. A difference of 6 Da between the light and medium and 4 Da between the medium and heavy forms of the peptides is indicated at the top. The figures show that the relative observed ratios are indeed similar between the three replicates and between different proteins.
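The 6 Da and 4 Da spacings in Figure 3 follow from the isotope composition of the labels. The sketch below computes them from the standard mass differences between ¹³C and ¹²C and between ¹⁵N and ¹⁴N, then converts them to m/z spacings for a doubly charged peptide. It assumes one arginine per peptide, which holds for both peptides shown (each ends in a single R); the function names are illustrative.

```python
# Mass differences between the heavy and light isotopes, in Da.
DELTA_13C = 1.003355  # 13C minus 12C
DELTA_15N = 0.997035  # 15N minus 14N


def arginine_label_shift(n_13c, n_15n):
    """Mass added to one arginine by replacing n_13c carbons and n_15n nitrogens."""
    return n_13c * DELTA_13C + n_15n * DELTA_15N


def mz_spacing(mass_shift, charge, n_arginine=1):
    """Spacing on the m/z axis for a peptide with n_arginine labeled residues."""
    return n_arginine * mass_shift / charge


medium = arginine_label_shift(6, 0)  # 13C6 14N4-arginine versus light
heavy = arginine_label_shift(6, 4)   # 13C6 15N4-arginine versus light

print("light to medium (Da):", round(medium, 3))
print("medium to heavy (Da):", round(heavy - medium, 3))
print("m/z spacing at 2+:", round(mz_spacing(medium, 2), 3),
      round(mz_spacing(heavy - medium, 2), 3))
```

For a 2+ ion the three members of a triplet are therefore about 3 and 2 m/z units apart, which is why isotopic resolution on doubly charged peptides matters for the measurement.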

Database Searching and Data Analysis. MS/MS ion trap data were searched against the RefSeq (http://www.ncbi.nlm.nih.gov/) protein database using Mascot (ref 14). Search parameters were as follows: full tryptic constraints and mass accuracies of precursor and peptide fragments of 1.3 and 0.5 Da, respectively. ¹³C₆¹⁴N₄-arginine, ¹³C₆¹⁵N₄-arginine, and oxidation of methionine residues were added as variable modifications. Peptides found to contain arginine were identified from the database search output and validated manually. The mass spectra of the validated peptides were retrieved using the elution time and m/z value for the respective peptide. Using the MS spectra, the ratios of ¹³C₆¹⁴N₄-arginine to normal (medium/light) and of ¹³C₆¹⁵N₄-arginine to normal (heavy/light) were calculated using an averaged abundance of the monoisotopic ions from each of the three peptides in a triplet.

Statistical Analysis. To measure the reproducibility of fold changes, we analyzed the data using two analysis of variance models (ref 17). The first uses protein as the only factor and serves to quantify the reproducibility that can be expected on average from fold changes at the protein level, averaged over peptides. The second uses peptides as the only factor and serves to quantify the reproducibility of individual peptide measurements, as well as the random error. All analyses were performed on ratios in the logarithmic scale and the results reported in the original scale after conversion using log-normal theory. The fit of the Gaussian distribution to the model residuals was found to be excellent, based on inspection of quantile-quantile plots. All calculations were performed using the statistical program R (ref 18).

(17) Box, G. E. P.; Hunter, W. G.; Hunter, J. S. Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building; Wiley-Interscience: New York, 1978.
(18) Ihaka, R.; Gentleman, R. J. Comput. Graphical Statistics 1996, 5, 299-314.

RESULTS AND DISCUSSION

Design of the Experiment. To assess the reproducibility of stable isotope labeling of cells coupled to LC-MS/MS, we designed an experiment using proteins harvested from HeLa cells that were grown in media containing three different isotopic forms of arginine. In this article, we refer to the ¹²C₆¹⁴N₄ isotopic version of arginine as “light”, the ¹³C₆¹⁴N₄ version as “medium”, and the ¹³C₆¹⁵N₄ version as “heavy”. The HeLa cell lysates from the above-mentioned states were mixed in a 2:3:4 ratio, respectively. The mixed lysate was resolved by SDS-PAGE, and the bands were excised and analyzed in three individual LC-MS/MS experiments. Figure 1 shows a schematic outlining this strategy.
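The 2:3:4 mixing of light, medium, and heavy lysates fixes the ratios that every peptide triplet should show. A minimal sketch of that arithmetic follows; the function name is illustrative and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def expected_ratios(light, medium, heavy):
    """Expected (medium/light, heavy/light) ratios for lysates mixed light:medium:heavy."""
    return Fraction(medium, light), Fraction(heavy, light)


ml, hl = expected_ratios(2, 3, 4)
print(float(ml), float(hl))  # 1.5 2.0
```

These two values, 1.5 and 2.0, are the reference points against which the measured medium/light and heavy/light ratios are judged.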

Figure 4. Box plots representing medium/light and heavy/light ratios derived from 10 proteins for which at least three peptides were identified. A box plot for each of the triplicate analyses is shown. The dotted line represents the expected ratio, and the solid line indicates the calculated mean.
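The box plots separate variation among peptides of one protein from variation between proteins and runs, which is what the analysis of variance on log ratios quantifies. The sketch below is one way to express that workflow: ratios from averaged monoisotopic intensities, a one-way analysis of variance with protein as the only factor, and a back-transformation using the log-normal mean, exp(mu + sigma^2/2). The paper does not give its exact formulas, so this is an assumed reading, and all intensities and ratios in the example are hypothetical.

```python
import math
from statistics import mean


def triplet_ratios(light, medium, heavy):
    """Medium/light and heavy/light ratios from monoisotopic intensities.

    Each argument is a list of intensities, averaged before the ratio is taken.
    """
    l, m, h = mean(light), mean(medium), mean(heavy)
    return m / l, h / l


def one_way_anova(groups):
    """Return (between-group, within-group) mean squares.

    groups maps a factor level (here a protein) to a list of log ratios.
    """
    values = [v for vals in groups.values() for v in vals]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(vals) * (mean(vals) - grand) ** 2
                     for vals in groups.values())
    ss_within = sum((v - mean(vals)) ** 2
                    for vals in groups.values() for v in vals)
    return ss_between / (k - 1), ss_within / (n - k)


def lognormal_mean(mu, sigma2):
    """Mean on the original scale of a variable whose log is N(mu, sigma2)."""
    return math.exp(mu + sigma2 / 2)


# Hypothetical medium/light ratios for two proteins (not data from the paper).
ratios = {
    "protein_A": [1.55, 1.70, 1.62],
    "protein_B": [1.48, 1.66, 1.59, 1.71],
}
log_ratios = {p: [math.log(r) for r in rs] for p, rs in ratios.items()}
ms_between, ms_within = one_way_anova(log_ratios)
mu = mean([v for vals in log_ratios.values() for v in vals])
print("within-protein variance (log scale):", ms_within)
print("back-transformed mean ratio:", lognormal_mean(mu, ms_within))
```

Working on the log scale treats a 1.5-fold increase and a 1.5-fold decrease symmetrically, which is the usual reason for analyzing ratios this way.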

The experiment was designed in this manner because it allowed us to do the following: (1) measure the ratios of different peptides derived from the same protein (intraexperimental reproducibility), (2) compare the measured ratios for protein sets between the three repeated LC-MS/MS experiments (interexperimental reproducibility), and (3) obtain a large number of peptide sets permitting a robust statistical analysis of the measurement accuracy. We decided to use 2:3:4 as the mixing ratio because it mimics true biological conditions where extreme perturbations might not occur, but rather, small fluctuations need to be measured. Additionally, to mimic true conditions, we used sample amounts that reflected the sample amount that we normally obtain from biological experiments designed to identify proteins from signaling pathways. To assess the interexperimental reproducibility, identical sample amounts were analyzed in three independent LC-MS/MS runs. To further ensure that all analyses were similar, we compared the resulting base peak chromatograms of the triplicate analysis of all liquid chromatography runs. The base peak chromatograms from the three replicate LC-MS/MS experiments of a representative protein band are quite similar, indicating that the experiments are indeed comparable (Figure 2).

Measurements of Relative Abundance Ratios. When comparing isotopically different forms of a peptide, it is important to ensure that the isotopic peaks being compared are derived from the same peptide and not from comigrating peptides. In most automated LC-MS/MS experiments, the peptide ions subjected to MS/MS experiments are selected based on their intensity in an MS survey scan, and peptide sequence information is obtained in subsequent MS/MS experiments. Most data acquisition software used for this purpose is able to exclude peptide ions already fragmented, thereby increasing the number of unique peptides. However, due to several factors, such as a wide dynamic range of the abundance of components and the dynamic nature of an LC-MS/MS experiment, not all peptides will necessarily be subjected to MS/MS. Using mass spectrometers with a high mass accuracy and mass resolution along with comparison of elution times for the respective peptides, one can partly compensate for missing MS/MS experiments. In this study we have chosen to operate our mass spectrometer in a mode that allows us to achieve isotopic resolution on doubly charged peptides, thereby minimizing situations where a coeluting peptide could obscure the measurement. The two sets of ratios measured in this study were the ratio of intensity of medium versus light peptides (designated medium/light) and the ratio of intensity of heavy versus light peptides (designated heavy/light). Figure 3 shows two doubly charged peptide sets corresponding to the peptides VSHLLGINVTDFTR (derived from myosin) and IGGIGTVPVGR (derived from elongation factor 2), respectively, from three independent LC-MS/MS runs. A complete list of peptides that were identified and the observed medium/light and heavy/light ratios is available as Table S1.

Table 1. Standard Deviations Calculated from Labeled Peptides Measured in All Three Experiments

                          medium/light ratio      heavy/light ratio
protein name              mean    std dev (a)     mean    std dev (a)
transketolase             1.63    0.11 (9)        1.85    0.12 (9)
myosin                    1.69    0.14 (17)       2.56    0.25 (17)
heat shock 90 kDa         1.72    0.10 (18)       2.13    0.12 (18)
enolase 1                 1.69    0.09 (12)       2.12    0.10 (12)
moesin                    1.78    0.17 (20)       2.47    0.19 (20)
heat shock 70 kDa         1.74    0.11 (15)       2.08    0.14 (15)
lamin A/C isoform 1       1.71    0.10 (12)       2.07    0.16 (12)
heat shock protein 8      1.70    0.09 (12)       2.05    0.09 (12)
elongation factor 2       1.57    0.20 (12)       2.16    0.22 (12)
actinin, α 4              1.41    0.06 (12)       1.82    0.13 (12)

(a) Values in parentheses are the total number of labeled peptides identified.

Assessing Reproducibility of Quantitation. Digesting differentially encoded forms of a protein after mixing results in a set of peptides that exhibit a fixed mass difference because of incorporation of the isotopic label, and the intensities of the peptides in a set provide the relative abundance of the protein in different states. Because a number of peptides are generated after proteolytic digestion, there are multiple opportunities to measure the relative abundance ratios of any given protein. Ideally, the ratios calculated from different peptide sets from the same protein should be identical. To measure the intraexperimental reproducibility in our SILAC experiments involving three states, we compared the ratios for the proteins for which three or more labeled peptides were identified in each of the three LC-MS/MS experiments (Figure 4). For example, the relative ratios obtained from peptides derived from transketolase are shown as one box plot (red in Figure 4), whereas the three red box plots together represent three independent LC-MS/MS experiments carried out where these three peptide sets were identified.
Thus, each box plot, by itself, reflects the variability across peptides representing the same protein, while the differences across box plots of the same protein reflect replication variability. The expected ratios are indicated as dotted lines, and the calculated mean ratios are shown as a solid line. A simple visual inspection shows that the spread of the ratios within any single experiment (intraexperimental variability) differs from protein to protein. Table 1 lists the standard deviations observed for the individual proteins, which ranged from 0.03 to 0.26 for the medium/light ratio (mean 1.64) and 0.02 to 0.29 for the heavy/light ratio (mean 2.15) with some proteins showing more variability than others.
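To make the per-protein accuracy in Table 1 easier to read, the mean ratios can be expressed as relative deviations from the expected values of 1.5 and 2.0. The sketch below copies the mean values from Table 1; the relative-error summary and the unweighted averages are illustrative additions, not statistics reported by the authors.

```python
EXPECTED_ML = 1.5  # medium/light expected from the 2:3:4 mix
EXPECTED_HL = 2.0  # heavy/light expected from the 2:3:4 mix

# (protein, medium/light mean, heavy/light mean), copied from Table 1.
TABLE_1 = [
    ("transketolase", 1.63, 1.85),
    ("myosin", 1.69, 2.56),
    ("heat shock 90 kDa", 1.72, 2.13),
    ("enolase 1", 1.69, 2.12),
    ("moesin", 1.78, 2.47),
    ("heat shock 70 kDa", 1.74, 2.08),
    ("lamin A/C isoform 1", 1.71, 2.07),
    ("heat shock protein 8", 1.70, 2.05),
    ("elongation factor 2", 1.57, 2.16),
    ("actinin, alpha 4", 1.41, 1.82),
]


def relative_error(observed, expected):
    """Signed deviation of observed from expected, as a fraction of expected."""
    return (observed - expected) / expected


for name, ml, hl in TABLE_1:
    print(f"{name:22s} {relative_error(ml, EXPECTED_ML):+.1%} "
          f"{relative_error(hl, EXPECTED_HL):+.1%}")

# Unweighted averages over the ten proteins (an illustrative summary only).
mean_ml = sum(ml for _, ml, _ in TABLE_1) / len(TABLE_1)
mean_hl = sum(hl for _, _, hl in TABLE_1) / len(TABLE_1)
print(round(mean_ml, 3), round(mean_hl, 3))
```

Most medium/light means lie above 1.5, so the deviation looks systematic rather than random across proteins; the table alone does not identify its cause.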

Figure 5. Histogram plot of medium/light and heavy/light ratios using all the peptide sets from analysis in triplicate. Ratios are counted in bins of size 0.2.
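The binning behind such a histogram can be sketched as follows. The caption gives only the bin size of 0.2, so placing the bin edges at multiples of 0.2 starting from zero is an assumption, as is rounding ratios to two decimals before binning; the input values are hypothetical. Integer arithmetic in hundredths avoids floating-point surprises at the bin edges.

```python
from collections import Counter

BIN_WIDTH_HUNDREDTHS = 20  # bin size 0.2, expressed in hundredths


def bin_ratios(ratios):
    """Count ratios per bin; keys are the lower bin edges (0.0, 0.2, 0.4, ...).

    Assumes ratios are meaningful to two decimals and that bin edges sit at
    multiples of 0.2 (both are assumptions, not stated in the caption).
    """
    counts = Counter()
    for r in ratios:
        hundredths = round(r * 100)
        lower = (hundredths // BIN_WIDTH_HUNDREDTHS) * BIN_WIDTH_HUNDREDTHS
        counts[lower / 100] += 1
    return dict(sorted(counts.items()))


# Hypothetical medium/light ratios, for illustration only.
example = [1.41, 1.57, 1.63, 1.69, 1.78, 1.80]
print(bin_ratios(example))
```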

Calculating the Spread for a Triple State SILAC Experiment. In a SILAC experiment, only labeled peptides can be used to measure the relative protein expression ratio. Often, depending on the amount and complexity of the protein sample to be analyzed, only a few peptides are available for measuring the relative protein expression. In this part of the experiment, we assess the spread of the measurements using all peptide sets measured in this study, thereby creating a statistically sound data set. A total of 220 peptide sets was used to obtain overall relative quantitation values. Figure 5 shows a histogram of the medium/light ratio and heavy/light ratio for the proteins identified in this study. Using this data set, we calculated the average values of the two ratios, 1.65 and 2.07 for medium/light and heavy/light ratios, respectively. These figures are relatively close to the expected ratios of 1.5 and 2.0. The standard deviation for all measurements in the three experiments was calculated to be 0.39 for the medium/light ratio (mean 1.65) and 0.48 for the heavy/light ratio (mean 2.07). For an average protein, the medium/light and heavy/light ratios can reliably measure changes of 1.8-fold upregulation or 1.8-fold downregulation at a confidence level of 95% from a single replicate. If the experiment is replicated three times, the ratios can reliably detect 1.57-fold upregulation or 1.57-fold downregulation at a confidence level of 95%.

CONCLUSIONS

We have measured the reproducibility of an experiment designed to measure the dynamics of protein abundance by in vivo labeling using SILAC in a cell culture-based system. From our results we conclude that the use of multiple peptide sets from the same protein provides adequate accuracy for measuring dynamics. It is essential to obtain several peptides from each protein in order to obtain more accurate ratios, although there is still variability with protein as a factor. To ensure identification of multiple peptides from the same protein, the biological experiments must be designed in such a way that adequate coverage is obtained. This could include use of labeling methods that provide quantitation of several peptides (and not simply N- or C-termini, or cysteine-containing peptides), scaling-up of the experiment to obtain a sufficient amount of sample, use of chromatographic techniques to separate proteins or peptides, and use of high-resolution mass spectrometers that can provide adequate resolution to distinguish peptides of interest from other unrelated peptides. Overall, we have demonstrated that it is possible to measure alterations in protein abundance reliably and reproducibly even when the alterations are modest (1.5-2-fold).

ACKNOWLEDGMENT

This work was supported by contract N01 HV28180 from the National Heart, Lung, and Blood Institute and grant U54 RR020839 from the National Institutes of Health. A.P. is also supported by Sidney Kimmel Scholar and Beckman Young Investigator awards.

SUPPORTING INFORMATION AVAILABLE

Additional information as noted in text. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review December 3, 2004. Accepted February 14, 2005. AC048204B