Quantitative Bottom-Up Proteomics Depends on ... - ACS Publications

Article pubs.acs.org/ac

Quantitative Bottom-Up Proteomics Depends on Digestion Conditions

Mark S. Lowenthal,* Yuxue Liang, Karen W. Phinney, and Stephen E. Stein

Biomolecular Measurement Division, National Institute of Standards and Technology, 100 Bureau Drive, Stop 8314, Gaithersburg, Maryland 20899-8315, United States

Supporting Information

ABSTRACT: Accurate quantification is a fundamental requirement in the fields of proteomics and biomarker discovery, and for clinical diagnostic assays. To demonstrate the extent of quantitative variability in measurable peptide concentrations due to differences among “typical” protein digestion protocols, the model protein, human serum albumin (HSA), was subjected to enzymatic digestion using 12 different sample preparation methods and, separately, was examined through a comprehensive timecourse of trypsinolysis. A variety of digestion conditions were explored, including differences in digestion time, denaturant, source of enzyme, sample cleanup, and denaturation temperature, among others. Timecourse experiments compared differences in relative peptide concentrations for tryptic digestions ranging from 15 min to 48 h. A predigested stable isotope-labeled (15N) form of the full-length HSA protein, expressed in yeast, was spiked into all samples prior to LC-MS analysis to compare yields of numerous varieties of tryptic peptides. Relative quantification was achieved by normalization of integrated extracted ion chromatograms (XICs) using liquid chromatography-tandem mass spectrometry (LC-MS/MS) by multiple-reaction monitoring (MRM) on a triple quadrupole (QQQ) MS. Related peptide fragmentation transitions, and multiple peptide charge states, were monitored for validation of quantitative results. The results demonstrate that protein concentration was not equal to tryptic peptide concentration for most peptides, including so-called “proteotypic” peptides. Peptide release during digestion displayed complex kinetics dependent on digestion conditions and, by inference, on denatured protein structure. Hydrolysis rates at tryptic cleavage sites were also shown to be affected by differences in nearest and next-nearest amino acid residues. The data suggesting nonstoichiometry of enzymatic protein digestions emphasize the often overlooked difficulties of routine absolute protein quantification and highlight the need for suitable internal standards and isotope dilution techniques.

Quantitative accuracy in proteomics measurements is fundamental to the ongoing search for disease-state biomarkers. As evidence mounts suggesting that disease diagnosis can only fully be understood in biochemical terms through a comprehensive analysis,1−6 it follows that small changes in numerous measurands could reflect relevant pathological changes, increasing the need for quantitative accuracy. Proteins, however, which differ greatly in their susceptibility to digestion, do not lend themselves easily to quantification through universal conditions, but instead those conditions must be optimized independently. Therefore, standardizing measurement approaches is crucial for any comparative proteomics experiment. A great number of publications based on quantifying protein biomarkers are available in the literature. Yet, quantitative accuracy is often based on generic, nonoptimized digestion protocols favored by individual investigators. Not surprisingly, only a limited number of useful clinical protein assays are currently available. Traditional mass spectrometry (MS)-based bottom-up proteomics relies on the use of proteolytic enzymes prior to quantitative analysis, such as trypsin, to cleave high molecular weight proteins into smaller peptide chains that are more amenable to MS analysis. Yet, it is often tacitly assumed that the efficiency of enzymatic digestion of a protein, or proteome, is at or near stoichiometric levels, a questionable assumption.7,8 In fact, even so-called “proteotypic” peptides, unique peptides most likely to be detectable by mass spectrometry techniques, do not necessarily fit the “one-peptide-from-one-protein” model when subjected to suboptimal sample preparation.8−10 Trypsin is the most commonly used protease for generation of MS-amenable peptides due to its high cleavage efficiency and high specificity targeting C-terminal to basic amino acid residues (lysine and arginine), resulting primarily in multiply charged peptides ideally suited for identification by tandem mass spectrometry. However, in order for trypsin to cleave embedded

This article not subject to U.S. Copyright. Published 2013 by the American Chemical Society

Received: August 15, 2013 Accepted: December 2, 2013 Published: December 2, 2013

dx.doi.org/10.1021/ac4027274 | Anal. Chem. 2014, 86, 551−558


regions of native proteins, it is essential to first unfold secondary and tertiary structures to expose embedded cleavage sites. One of the most variable aspects of this process is the choice of technique for denaturing the protein. This is typically done using a denaturing agent, such as a surfactant (Rapigest, Triton, CHAPS), a chaotropic agent (SDS, urea), organic solvent (ACN, MeOH), or heat. Given the wide range of available reagents, it is not surprising that detailed methods of digestion vary greatly between laboratories, and sometimes between samples. For example, the duration of enzyme reaction or protein denaturation, the method of denaturation, the choice of detergents or surfactants, desalting methods, or the choice of a trypsin source may all have large effects on digestion efficiency, peptide yields, and chemical modification. As the search for protein biomarkers of disease shifts away from single targeted analytes toward panels of measurands quantified in unison, it is ever more apparent that the effects of digestion protocols on quantitative yields of peptides need to be re-examined and debated. The National Institute of Standards and Technology (NIST) is actively involved in improving quantitative reproducibility and accuracy through expression of stable isotope-labeled full-length proteins for use as internal standards in protein-based reference materials development.11−15 For most proteomics studies that require absolute quantification, selecting proper internal standards means working with proteins rather than synthetic tryptic peptides to account for digestion variability during sample preparation (and more so when prefractionation or enrichment steps are required for quantification of low-abundance proteins). This work describes the development of an LC-MS/MS (MRM) approach for measuring relative peptide concentrations for a variety of digestion protocols and then evaluating the inherent variability associated with those methods.
Human serum albumin (HSA) was selected as a model protein with the digest of its isotopically labeled analog (15N HSA) employed as an internal standard. HSA is by far the most abundant human plasma protein, constituting ∼55% of the plasma proteome.16 It is a good model protein for digestion studies because of its high solubility and monomeric tertiary structure consisting of 17 disulfide bonds, only one of which is reduced in the absence of detergents.17 HSA is a relatively large protein consisting of 58 potential [fully] tryptic peptides (greater than four amino acids in length) with highly variable hydrolysis rates. This manuscript identifies peptides in a conventional shotgun mode which are then quantitatively measured by MRM-MS. HSA peptides were quantified from several cleavage classes (fully tryptic, semitryptic, missed cleavages, chymotryptic) and modification types (Supporting Information Figure S-1), and multiple fragmentation transitions (precursor ion → product ion) and charge states were monitored from selected peptides for method validation. This is part of an ongoing effort to emphasize the need for consistency of experimental digestion techniques and to better measure and control variability in proteomics. Digestion variability is not a new idea within the proteomics community; however, there are surprisingly few examples of thorough literature reports detailing digestion variability studies.8−10,18−22 One mission of the National Institute of Standards and Technology (NIST), to support measurement capabilities for the proteomics community, calls for researchers to revisit the issues of quantitative accuracy and analytical consistency. Standardization of digestion protocols in the proteomics community is complicated by the wide variety of differing applications and goals of this scientific field. However, one motivation of this work is to quantitatively determine the degree of measurement variability over a range of practical conditions in quantitative proteomics to inform practitioners of the possibly large dependence of their measurements on the detailed conditions of their analysis.



EXPERIMENTAL SECTION

General Procedure. One mg of human serum albumin (HSA, Lee Biosolutions, St. Louis, MO) was subjected to enzymatic digestion following one of 12 sample preparation protocols, described below. In all cases where HSA disulfide bonds were chemically reduced, 5 μL of 200 mmol/L dithiothreitol (DTT) in 50 mmol/L tris-HCl, pH 8.0, was added and incubated at 60 °C for one hour, followed by alkylation using 20 μL of 200 mmol/L iodoacetamide (IAM) solution in 50 mmol/L tris-HCl, pH 8, with storage at room temperature in the dark for one hour. Excess IAM was quenched by further addition of 20 μL of 200 mmol/L DTT solution in 50 mmol/L tris-HCl, pH 8, with room temperature incubation for one hour. Reduced and alkylated HSA samples were subjected to trypsin digestion (final protein-to-trypsin ratio of 50-to-1) at 37 °C for either 2 or 18 h for each of the twelve unique denaturation approaches described. In all, 24 unique preparations were assayed, with the 2 and 18 h digests serving as quantitative replicates. All digestion reactions were quenched by the addition of 10 μL of 50% formic acid, with the pH of the solution adjusted to ∼3. The final concentration of each peptide can be estimated based on the concentration of digested protein, equivalent to 1 mg/mL HSA. HSA digests were stored at −20 °C until needed.

Denaturation Conditions.

(1) Urea_NoZeba: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Urea concentration was reduced to 0.6 mol/L prior to the addition of 20 μg of Promega (Madison, WI) trypsin.

(2) Urea_Zeba: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Alkylated protein solutions were desalted using Zeba (Pierce) spin columns (7 kDa molecular weight cutoff (MWCO)) prior to addition of 20 μg of Promega trypsin.

(3) Guanidine_RT: 6 mol/L guanidine hydrochloride (HCl) in 100 mmol/L tris buffer incubated at room temperature for ten minutes. The alkylated protein solution was desalted using Zeba spin columns (7 kDa MWCO) prior to addition of 20 μg of Promega trypsin.

(4) Guanidine_HT: 6 mol/L guanidine HCl in 100 mmol/L tris buffer incubated at 85 °C for one hour. The alkylated protein solution was desalted using Zeba spin columns (7 kDa MWCO) prior to addition of 20 μg of Promega trypsin.

(5) HT: Heating at high temperature (85 °C) for one hour in 100 mmol/L tris buffer prior to addition of 20 μg of Promega trypsin.

(6) Rapigest: 0.2% Rapigest (Waters) solution in 100 mmol/L tris buffer incubated at 85 °C for five minutes prior to addition of 20 μg of Promega trypsin.

(7) Methanol: 20% methanol in 100 mmol/L tris buffer incubated at room temperature for ten minutes prior to addition of 20 μg of Promega trypsin.

(8) Urea_LysC_NoZeba: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Following alkylation, Promega Lys-C was added to give a protease-to-substrate ratio of 1-to-50. The solution was incubated for two hours at 30 °C. The urea concentration was reduced to 0.6 mol/L prior to addition of 20 μg of Promega trypsin.

(9) Urea_LysC_Zeba: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Following alkylation, the protein solution was desalted using Zeba spin columns (7 kDa MWCO). Promega Lys-C was added to give a protease-to-substrate ratio of 1-to-50. The solution was incubated for two hours at 30 °C. The urea concentration was reduced to 0.6 mol/L prior to addition of 20 μg of Promega trypsin.

(10) TFE: 50% 2,2,2-trifluoroethanol (TFE) in 100 mmol/L tris buffer incubated at room temperature for ten minutes prior to addition of 20 μg of Promega trypsin.

(11) 1x Sigma Trypsin: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Urea concentration was reduced to 0.6 mol/L prior to the addition of 20 μg of Sigma T1426 trypsin.

(12) 5x Sigma Trypsin: 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes. Urea concentration was reduced to 0.6 mol/L prior to the addition of 100 μg of Sigma T1426 trypsin.

Timecourse Studies. A timecourse for tryptic digestion efficiency was performed at six time points using approximately equimolar mixtures of labeled and unlabeled HSA digests. Digestion times for the 14N HSA varied from 15 min to 48 h using a single two hour 15N HSA spike-in digest. Both digestions were done in 6 mol/L urea with 100 mmol/L tris buffer and 20 μg of Promega porcine trypsin, as described above (1). LC-MS/MS (MRM) analysis was performed in triplicate. Peak area ratios were plotted versus time and an optimal digestion time was assessed for each peptide.

Sample and Internal Standard Preparation. Full-length, stable isotope-labeled recombinant 15N HSA was kindly provided by Dr. Illarion Turko (IBBR/UMD/NIST) and was synthesized in yeast by DNA vector transformation. Purified 15N HSA was digested using a single denaturation protocol identical to the urea protocol described above. Briefly, the protein was denatured using 6 mol/L urea in 100 mmol/L tris buffer incubated at room temperature for ten minutes, reduced using DTT, and alkylated (IAM) as described above. Urea concentration was reduced to 0.6 mol/L prior to the addition of 20 μg of Promega (Madison, WI) trypsin. The labeled protein was digested for 2 h at 37 °C and stored at −20 °C until needed (∼1 mg/mL). Sample preparation was done by mixing equal amounts of 14N HSA digests with 15N HSA internal standard digest. This was confirmed by the finding that similar total abundances of peptide ions were generated from 14N and 15N proteins. Five μL of each 1 mg/mL digest was added to 90 μL of 0.1% formic acid in water, with the resulting peptide solution concentration estimated to be 50 ng/μL. Samples were stored at −20 °C prior to LC-MS/MS analysis.

Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) Analysis (MRM). Liquid chromatographic separation was achieved using a Repro-Sil-Pur (Eksigent) C18-AQ reversed-phase analytical chip column (75 μm × 15 cm, 3 μm particles, 120 Å) with a matched-phase trap chip column (200 μm × 0.5 mm) in a trap-elute setup optimized for the Eksigent (Applied Biosystems) cHiPLC-Nanoflex, with a flow rate of 300 nL/min provided by an Eksigent nanoLC-1D plus system and a sample loading flow rate of 3 μL/min. Peptide elution was accomplished using an increasing linear gradient of organic/aqueous solvent (ACN/H2O) up to 50% B over 40 min, followed by a column wash and re-equilibration. Mobile phases A and B consisted of 0.1% (v/v) formic acid in H2O and ACN, respectively. Column temperature was maintained at 35 °C; autosampler plate temperature control was set at 5 °C. Blank injections were monitored for sample carry-over. The nanoLC system was coupled in-line with an AB Sciex API 4000 QTRAP mass spectrometer (functioning in triple quadrupole mode) equipped with a NanoSpray III ion source. Ions were detected using a multiple-reaction monitoring (MRM) method in positive ion mode. The MRM methods were created for the 14N/15N HSA digests using the best two transitions per peptide determined experimentally. The MRM fragmentation parameters (collision energy and declustering potential) were optimized experimentally for each peptide transition; representative plots are provided (Supporting Information Figure S-2). Supporting Information Table S-1 annotates all monitored peptide sequences, charge states, modifications, cleavage types, 14N and 15N fragmentation transitions, and ion fragment types. Data acquisition and peak integration were performed using Analyst version 1.5 software (Applied Biosystems). All peak integrations were visually inspected for consistency, and in some cases, manual integration was necessary. Peptide transition selection criteria based on the NIST Library of Peptide Fragmentation Mass Spectra23,24 are described in Supporting Information.
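The quantities stated in the Experimental Section can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the quoted 50 ng/μL refers to each isotopic form (14N or 15N) in the final 100 μL mixture, which is the reading consistent with the stated volumes.

```python
def protein_to_enzyme_ratio(protein_ug, enzyme_ug):
    """Mass ratio of substrate protein to protease (e.g., 50 means 50-to-1)."""
    return protein_ug / enzyme_ug


def spiked_sample(light_ul=5.0, heavy_ul=5.0, diluent_ul=90.0, stock_mg_per_ml=1.0):
    """Concentrations after mixing 14N (light) and 15N (heavy) digests with diluent.

    Defaults follow the volumes given in the text; 1 mg/mL equals 1000 ng/uL.
    """
    total_ul = light_ul + heavy_ul + diluent_ul
    stock_ng_per_ul = stock_mg_per_ml * 1000.0
    return {
        "total_volume_ul": total_ul,
        "light_ng_per_ul": light_ul * stock_ng_per_ul / total_ul,
        "heavy_ng_per_ul": heavy_ul * stock_ng_per_ul / total_ul,
    }


if __name__ == "__main__":
    # 1 mg HSA with 20 ug trypsin, and with 100 ug trypsin (5x Sigma Trypsin)
    print(protein_to_enzyme_ratio(1000.0, 20.0))
    print(protein_to_enzyme_ratio(1000.0, 100.0))
    print(spiked_sample())
```

Under these assumptions, 20 μg of trypsin per mg of HSA gives the stated 50-to-1 ratio, and the 5x Sigma Trypsin condition corresponds to 10-to-1.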



RESULTS AND DISCUSSION

Selection of Peptides. Quantitative bottom-up proteomics often relies on the assumption that each tryptic peptide in a digest is present at a concentration equal to a constant fraction of the concentration of its precursor protein, overlooking variations in yield due to sample preparation and digestion conditions. For this work, HSA was selected as a model protein for studying digestion variability using the enzyme trypsin. In theory, trypsin cleaves peptide chains with high specificity at the carboxyl side of the basic amino acids lysine and arginine, except when either is N-terminally adjacent to proline.25 Experimentally, however, there are many unusual cleavage types and modifications found in a tryptic protein digest. A theoretical HSA digestion produces 58 fully tryptic peptides greater than 400 Da (fewer than ∼4 residues makes identification highly nonspecific); see Supporting Information Table S-2. If one allows for one or two missed cleavages, the number of theoretical HSA fully tryptic peptides increases to 140 and 221 peptides, respectively. Yet, this is a small fraction of what is observed in practice. A consensus HSA library24,26 (http://peptide.NIST.gov) annotates over 1,041 experimentally observed peptides from 2,921 unique spectra, and counting. Peptides are observed in many different charge states, from a number of cleavage classes, and with various posttranslational modifications, both chemical (from sample preparation) and biological (Supporting Information Figure S-1). In fact, fully tryptic, unmodified peptides with no missed cleavages constitute a small percentage of observed peptides and represent only a fraction of the total peptide abundance. Discovery of a modified or unusually cleaved peptide requires only enough ions to trigger an MS2 acquisition; however, quantitative measurement requires a sufficient flux of ions above a detection threshold.
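The idealized cleavage rule described above (cleavage C-terminal to lysine or arginine, but not when the next residue is proline) can be sketched as an in silico digest. This is a minimal illustration, not the authors' workflow: the function name, the toy sequence, and the default minimum length of five residues (standing in for "greater than four amino acids") are assumptions, and the rule deliberately ignores the semitryptic, chymotryptic, and modified peptides that real digests contain.

```python
import re


def tryptic_peptides(sequence, missed_cleavages=0, min_length=5):
    """List fully tryptic peptides, allowing up to `missed_cleavages` missed sites."""
    # A cleavage site lies after every K or R that is not followed by P.
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    # Peptide boundaries: sequence start, each internal site, sequence end.
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # Joining k + 1 adjacent fragments corresponds to k missed cleavages.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptide = sequence[bounds[i]:bounds[j]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    toy = "AAAKPGGGRDDDKEEER"  # hypothetical sequence, not HSA
    print(tryptic_peptides(toy, missed_cleavages=0, min_length=1))
    print(tryptic_peptides(toy, missed_cleavages=1, min_length=1))
```

In the toy sequence the lysine in "KP" is not cleaved, so the first fragment runs through to the arginine. Allowing one missed cleavage adds the two joined neighbors, which shows how the count of theoretical peptides grows with each additional missed cleavage.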
Thus, the majority of HSA peptides from a typical digest are too low in abundance for reliable quantitative measurement. As a result, many of the least abundant HSA peptides were selectively removed from quantitative assays because they were shown to yield irreproducible results. Peptides selected for quantitative analysis included the more easily detectable tryptic peptides, series of single fully tryptic peptides and the related 1-, 2-, and 3-missed cleavages of that peptide, single peptides in more than one charge state, and many peptides determined to be interesting because of their uniqueness.

Multiple Reaction Monitoring (MRM). To directly compare each of the 12 digestion protocols described above, replicate LC-MS/MS (MRM) assays were performed for the peptide transitions from a total of 169 HSA peptides and/or peptide charge states (Supporting Information Table S-1). Two fragmentation transitions were monitored from each precursor peptide, for both the 14N and 15N isotopic forms (676 total transitions). Because of the large number of transitions, the MRM analysis was split into multiple acquisition methods. Representative MRM total ion chromatograms are provided in Supporting Information Figure S-3a−c for an equimolar mixture of 14N and 15N HSA digests. A single digestion of 15N HSA was used as an internal standard and spiked into each of the 12 unique 14N HSA digest preparations. Relative quantification was achieved by evaluating peak area ratios (14N/15N) for monitored peptides. Overall, good agreement is observed between related transitions from the same precursor peptide ion as well as between different charge state ions of the same precursor peptide. Both criteria were used to assess consistency of the assays. For example, Figure 1 presents 14N/15N peak area ratios measured for the 12 different

Table 1. Summary of the Variability of Digestion Efficiency among HSA Peptides for Given Digestion Types

digestion method     mean relative       median relative     standard deviation   % CV
                     peptide abundance   peptide abundance   between peptides
Urea_Zeba            0.98                0.76                1.0                  102
methanol             0.64                0.50                0.72                 113
Rapigest             0.70                0.55                0.95                 136
Urea_LysC_NoZeba     1.1                 0.68                1.6                  150
TFE                  0.78                0.51                1.2                  154
1× Sigma Trypsin     0.66                0.38                1.2                  182
Urea_NoZeba          1.1                 0.86                3.1                  270
Urea_LysC_Zeba       0.96                0.52                2.8                  292
5× Sigma Trypsin     2.1                 0.81                6.5                  316
Guanidine_HT         0.78                0.49                3.9                  500
Guanidine_RT         1.2                 0.69                6.7                  573
high temperature     0.82                0.19                10.7                 1305
totals               0.98                0.58                3.4                  341

Table 1 presents a summary of the variability observed among all 169 monitored peptides for each digestion condition. Overall, the mean 14N/15N peak area ratio, as expected for an equimolar protein mix, is near unity (0.98), yet this value differs greatly between digestion types, where the mean value ranges from 0.64 (methanol) up to 2.1 (5× Sigma Trypsin). Even more dramatic is the standard deviation among peptides using any given digestion type, which is always greater than the measurement mean. Coefficients of variation (% CV) for digestion types range from 102% (Urea_Zeba) to upward of 1305% (high temperature). Certainly these values are exaggerated by outlier peptides; yet, the data strongly suggest that the one-size-fits-all simplification made for global proteomic digestions is unsatisfactory. A more practical examination of this data set considering only proteotypic peptides supports this conclusion, and is discussed below. Two distinct factors affecting cleavage were responsible for a large fraction of this variability.25 One was the proximity of acidic amino acid residues near the cleavage site (DDNPNLPR or VHTECCHGDLLECADDR), which is known to suppress cleavage rates. Another source of variability was the presence of multiple nearby R/K residues, apparently leading to varying degrees of competition between potential tryptic cleavage sites (LVRPEVDVMCTAFHDNEETFLK[K]) (Figure 2). In addition, hydrophilic peptides typically show more variability among digestion conditions due to losses during analysis. Also, peptides with missed cleavages and those due to irregular (semitryptic) cleavage are often found to be highly variable (Supporting Information Table S-3). In fact, among the 82 most reproducibly measured peptides assayed in this work, semitryptics (including LYEYAR, ALVLIAF, VFDEFKPL, and AQYLQQCPFEDHVK) represent the four most variably digested peptides (333%, 275%, 243%, and 239%, respectively) among the 12 digestion conditions. Considering all 169 HSA peptides, the average CV among the 12 digestion techniques was determined to be ∼72%, with a median CV of ∼65%.
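The quantities discussed here follow from two simple definitions: the relative abundance of a peptide is its 14N/15N peak area ratio, and the % CV is the standard deviation divided by the mean, times 100. The sketch below is an illustration under stated assumptions: the function names and the peak areas are hypothetical, and the sample standard deviation is used because the text does not say which estimator the authors applied. The Table 1 check uses only the rounded mean and standard deviation values reported in the table.

```python
from statistics import mean, stdev


def peak_area_ratio(light_area, heavy_area):
    """Relative abundance: 14N (light) peak area over 15N (heavy) peak area."""
    return light_area / heavy_area


def percent_cv(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def percent_cv_from_summary(mean_value, sd_value):
    """% CV from an already reported mean and standard deviation."""
    return 100.0 * sd_value / mean_value


if __name__ == "__main__":
    # Hypothetical integrated peak areas (light, heavy) for one peptide
    # measured under three digestion conditions.
    areas = [(4.0e5, 8.0e5), (6.0e5, 8.0e5), (1.2e6, 8.0e5)]
    ratios = [peak_area_ratio(light, heavy) for light, heavy in areas]
    print(ratios, percent_cv(ratios))

    # Rows of Table 1 as (mean, standard deviation between peptides).
    table1 = {"Urea_Zeba": (0.98, 1.0), "Rapigest": (0.70, 0.95), "TFE": (0.78, 1.2)}
    for method, (m, sd) in table1.items():
        print(method, round(percent_cv_from_summary(m, sd)))
```

For the three rows shown, the recomputed values round to the tabulated 102%, 136%, and 154%. Other rows agree only approximately, presumably because the table reports rounded means and standard deviations.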
This represents a statistical problem for many quantitative proteomics studies that search for biological and diagnostic changes at much lower variances. In HSA, the fraction of total ion intensity for peptides with missed cleavages declined from one-third to one-eighth for digestion times increasing from 2 to 18 h, while semitryptic products typically increased in relative concentration

Figure 1. 14N/15N peak area ratios (relative abundance) measured for three charge states of a representative peptide (ADDKETCFAEEGKK) using 12 different digestion conditions (two hours of digestion time). Different charge states of the same peptide were used to validate the measurement approach.

digestion conditions (as described in Experimental Section) at two hours of digestion time for three charge states of a representative peptide (ADDKETCFAEEGKK). For this peptide, variability is observed in peak area ratios between different digestion types; however, consistency is found among all three peptide charge state ions for any given digestion protocol. For this representative peptide, which has two missed lysine cleavages, the peak area ratios are generally less than unity (mean = 0.50), suggesting that the internal standard digestion protocol is more favorable for quantifying this given peptide.

Digestion Variability. Variability in peak area ratio of a given peptide for different digest types is notably high (67.8%), suggesting the importance of sample preparation. However, the three monitored charge states (+2, +3, and +4) exhibit similar relative changes under varying digestion conditions (