Article pubs.acs.org/jpr
Cutting Edge Proteomics: Benchmarking of Six Commercial Trypsins
Jakob Bunkenborg,*,† Guadalupe Espadas,‡ and Henrik Molina*,‡,§
† Department of Clinical Biochemistry, Copenhagen University Hospital Hvidovre, DK-2650 Hvidovre, Denmark
‡ Center for Genomic Regulation, 08003 Barcelona, Spain
§ The Rockefeller University, 1230 York Avenue, New York, New York 10065, United States
Supporting Information
ABSTRACT: Tryptic digestion is an important component of most proteomics experiments, and trypsin is available from many sources with a cost that varies by more than 1000-fold. This high-mass-accuracy LC−MS study benchmarks six commercially available trypsins with respect to autolytic species and sequence specificity. The analysis of autolysis products led to the identification of a number of contaminating proteins and the generation of a list of peptide species that will be present in tryptic digests. Intriguingly, many of the autolysis products were nontryptic peptides, specifically peptides generated by C-terminal cleavage at asparagine residues. Both porcine and bovine trypsins were demonstrated to be tyrosine O-sulfated. Using both a label-free and a tandem mass tag (TMT) labeling approach, a comparison of the digestion of a standard protein mixture using the six trypsins demonstrated that, apart from the least expensive bovine trypsin, the trypsins were equally specific. The semitryptic activity led to a better sequence coverage for abundant substrates at the expense of low-abundance species. The label-free analysis was shown to be more sensitive to unique features from the individual digests that were lost in the TMT-multiplexing study.
KEYWORDS: proteomics, trypsin, digestion, artifacts, tyrosine sulfation, mass spectrometry, label-free and TMT quantitation
■ INTRODUCTION
Trypsin plays an important role in bottom-up mass spectrometry-based proteomics as the single most utilized protease. The widespread use of trypsin is due to its high catalytic activity and specificity in cleaving proteins into peptides at the C-terminal side of arginine and lysine residues. The proteolytic products generated are typically peptides with a length of 10−12 residues and contain either arginine or lysine at the C-terminus. Most tryptic peptides behave well under reversed-phase conditions, and since they contain basic sites at the N-terminus and C-terminus, they are easily protonated and ionized. Importantly, the protonated peptides yield information-dense fragmentation spectra in tandem mass spectrometry experiments, from which the primary peptide sequence and modifications can be identified.
Many proteomics experiments rely on protein identification via nano-LC−MS/MS where often less than 5 μg of a protein sample is used per LC−MS/MS experiment. Because the trypsin-to-substrate ratio in most proteomics experiments ranges between 1:20 and 1:100, only very little trypsin is required for such an experiment. The cost of trypsins typically used in proteomics experiments is approximately $1 per microgram, and trypsin’s financial impact on an experiment is therefore marginal compared to that of other chemicals, not to mention salary costs, together with depreciation and service contracts for LC and MS hardware. However, proteomics experiments focusing on post-translationally modified peptides often require the proteolysis of tens of milligrams of protein for a successful enrichment outcome. This is exemplified by studies focusing on lysine acetylation,1 cysteine oxidation,2 and phosphorylation.3 For such studies, the cost of trypsin becomes significant.
Trypsin is available from many commercial sources with prices varying by more than 1000-fold. Given the importance of trypsin, there are relatively few studies on the specificity of commercial trypsins, and these have given differing results: a study of the tryptic proteolysis of freshly isolated mouse liver proteins using conventional proteomics settings concluded that trypsin solely cleaves C-terminal to arginine and lysine.4 An in-depth study on the proteolysis of five bovine standard proteins by Picotti et al. found that while tryptic peptides accounted for 80% of the overall ion current, the number of partly tryptic peptides amounted to 75% of the peptide identifications.5 The findings of Picotti et al. are supported by a recent study by Burkhart et al.6 testing the digestion of human platelet proteins using different trypsins.
We decided to compare six commercially available trypsins differing in origin, vendor, and price and tested these trypsin preparations with respect to contaminating proteins, autodigestion profiles,7 and specificity. For the analysis we used a “standard proteome” composed of eight purchased reference proteins as the trypsin
Received: February 13, 2013
Published: June 19, 2013
© 2013 American Chemical Society
dx.doi.org/10.1021/pr4001465 | J. Proteome Res. 2013, 12, 3631−3641
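The distinction drawn in the Introduction between fully tryptic and partly tryptic peptides can be illustrated with a minimal sketch. It assumes the simplest cleavage rule stated above (cleavage C-terminal to every arginine and lysine, with protein termini counted as valid ends) and uses a hypothetical toy sequence; the function names are illustrative and are not taken from the software used in this study. The sketch distinguishes only three classes, whereas the comparison later in the article uses seven specificity groups.

```python
def tryptic_sites(protein):
    """Return cut positions consistent with trypsin (assumed rule: after K or R).

    Position i means a cut between residues i-1 and i (0-based).
    The protein N- and C-termini are included as valid peptide ends.
    """
    sites = {0, len(protein)}
    for i, residue in enumerate(protein):
        if residue in "KR":
            sites.add(i + 1)
    return sites


def digest(protein, missed_cleavages=0):
    """In-silico digest allowing up to `missed_cleavages` skipped sites."""
    cuts = sorted(tryptic_sites(protein))
    peptides = []
    for i in range(len(cuts) - 1):
        last = min(i + 2 + missed_cleavages, len(cuts))
        for j in range(i + 1, last):
            peptides.append(protein[cuts[i]:cuts[j]])
    return peptides


def classify(protein, peptide):
    """Label a peptide as 'tryptic', 'semitryptic', or 'nontryptic'.

    Uses the first occurrence of the peptide in the protein.
    """
    start = protein.find(peptide)
    if start < 0:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)
    sites = tryptic_sites(protein)
    tryptic_ends = (start in sites) + (end in sites)
    return {2: "tryptic", 1: "semitryptic", 0: "nontryptic"}[tryptic_ends]


# Hypothetical toy sequence, chosen only for illustration.
toy = "AAKGGGRNDEKLL"
fully_cleaved = digest(toy)
labels = {p: classify(toy, p) for p in ("GGGR", "GGGRN", "GGRN")}
```

Here `GGGRN` ends after an asparagine rather than after arginine or lysine, so it is labeled semitryptic; this is the kind of product referred to in the abstract as arising from C-terminal cleavage at asparagine residues.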
Journal of Proteome Research
A 5 μg sample of each trypsin was separated using a NuPAGE 1D gel with the MES SDS running buffer system and silver stained in accordance with the manufacturer’s protocol (Invitrogen, Grand Island, NY).
Fisher, Odense, Denmark) or a Dionex 3000 series HPLC instrument (Thermo Fisher, Sunnyvale, CA). Peptides were separated on reversed-phase chromatography columns packed with C18 particles (Nikkyo Technos Co., Ltd., Japan): 100 μm i.d. with 5 μm particles or 75 μm i.d. with 3 μm particles. The binary gradients were generated with solvents A (0.1% formic acid) and B (80% acetonitrile, 0.1% formic acid) at 500 nL/min, increasing from 3% B to 15% B in 4 min, followed by a more gradual increase to 45% B in either 38 min (autodigest analysis) or 108 min (digest comparison). After the gradient, the column was washed for 11 min with 90% B. Peptides were loaded directly onto the analytical column at 1.5−2 μL/min using a wash volume of 4−5 times the injection volume. Survey spectra were acquired over the mass range of 350−2000 m/z using an FTMS automatic gain control (AGC) target set to 1E6 for full scans. MS spectra were measured using two microscans at a resolution of 30 000. A lock mass of m/z 445.120024 was used. Dynamic exclusion (30 s) and charge state filtering disqualifying singly charged peptides were activated. All data were recorded in profile mode. For fragmentation experiments, the isolation window was set to 2.0 m/z, and an AGC value of 5E4 was used for higher energy collisional dissociation (HCD) experiments. One microscan was used for all MS/MS experiments, which were measured using m/z 100 as the lowest mass. The maximum injection times for MS/MS experiments in the ion trap and Orbitrap were 25 and 250 ms, respectively. The 10 most intense multiply charged ions were fragmented per cycle. For tandem mass tag experiments, a normalized collision energy of “45” was used. For all other experiments, a normalized collision energy of “35” was used.
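The gradient program described above can be written as a piecewise function of time, which makes the two method lengths easy to compare. The following is a minimal sketch, not the instrument method itself: it assumes the gradient clock starts at 0 min with 3% B and that the switch to the 90% B wash is an instantaneous step, neither of which is stated in the text. The function and parameter names are hypothetical.

```python
FLOW_NL_PER_MIN = 500.0  # flow rate stated in the text


def percent_b(t_min, ramp_min=38.0):
    """Percentage of solvent B at time t_min (minutes).

    ramp_min is 38 for the autodigest analysis or 108 for the digest
    comparison. Assumptions: time zero is the start of the gradient and
    the change to the 90% B wash is an instantaneous step.
    """
    first_ramp_end = 4.0
    second_ramp_end = first_ramp_end + ramp_min
    wash_end = second_ramp_end + 11.0
    if t_min < 0 or t_min > wash_end:
        raise ValueError("time outside the programmed method")
    if t_min <= first_ramp_end:
        # 3% B to 15% B in 4 min
        return 3.0 + (15.0 - 3.0) * t_min / first_ramp_end
    if t_min <= second_ramp_end:
        # 15% B to 45% B over ramp_min
        return 15.0 + (45.0 - 15.0) * (t_min - first_ramp_end) / ramp_min
    # column wash
    return 90.0


def method_volume_ul(ramp_min=38.0):
    """Total solvent volume (in microliters) delivered over one method."""
    total_min = 4.0 + ramp_min + 11.0
    return FLOW_NL_PER_MIN * total_min / 1000.0


short_volume = method_volume_ul(38.0)
long_volume = method_volume_ul(108.0)
```

Under these assumptions the autodigest method runs for 53 min and the digest-comparison method for 123 min, wash included.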
Trypsin Specificity Analysis
Data Analysis
Eight standard proteins, bovine lactalbumin (cat. no. L6010), bovine thyroglobulin (cat. no. T1001), bovine α-casein (cat. no. C6780), fetuin from fetal calf serum (cat. no. F3385), bovine apotransferrin (cat. no. T1428), bovine serum albumin (cat. no. A9647), avian lysozyme (cat. no. L7651), and ovalbumin (cat. no. 05438), were purchased from Sigma-Aldrich; see also Supplementary Text 1 and Supplementary Table 1 (Supporting Information) for details. Standard proteins were dissolved in 6 M urea/150 mM TEAB and mixed in equimolar proportions (12 nmol per protein). The protein mixture was reduced with dithiothreitol (DTT; Sigma-Aldrich), alkylated using iodoacetamide (Sigma-Aldrich), and diluted to [...] 90% of the MS signal while nontryptic peptides accounted for less than 0.5% of the measured peptide signal (Figure 2B,C). These observations closely match those of Picotti et al.5 and Burkhart et al.,6 and the finding of a large number of not fully tryptic autolysis and substrate products across the tested trypsins contradicts to some extent Olsen et al.’s title “Trypsin cleaves exclusively C-terminal to arginine and lysine residues”.4 However, it is important to see this observation in context. Though we identified many semi- and nontryptic peptides, the
peptides increases, and there seems to be a secondary slower proteolysis, particularly at asparagine residues.
Comparison of Six Trypsins
In a more detailed comparison of the six different trypsins, we analyzed the peptides generated from digestions of the reference protein mix. We only considered peptides matched by MS/MS more than once. In the first analysis, we grouped the peptides on the basis of enzymatic specificity (seven groups, ranging from fully tryptic to not tryptic). For the label-free quantitation (LFQ) experiment, we summed the intensities of the precursor ions by peptide type for each trypsin. For the tandem mass tag experiment, we used the intensities of the six diagnostic ions representing each of the six trypsins. Figure 4 shows the summed intensities normalized by the median intensity for each of the six trypsins, grouped according to the seven peptide-specificity groups. Results from the TMT (light gray) and the LFQ (darker gray) experiments are plotted together and show that the results from the two quantitative strategies are consistent. The figure shows that there is little difference among the six trypsins for tryptic peptides with no missed cleavages (Figure 4A), while the porcine trypsin from Sigma and the bovine trypsin from Worthington-Biochem resulted in a slightly higher level of peptides with missed tryptic cleavage events (Figure 4B,C). The bovine trypsin from Sigma stands out with regard to a much higher nontryptic cleavage activity (Figure 4E−G). We then used individual peptides to add further details to the comparison of the six trypsins. For the TMT experiment, we selected all peptides that could be matched to our standard proteome (a total of 729 different sequences). For the LFQ experiment, we imposed the additional requirement that the results between the technical replicates had to be consistent. By consistency, we mean that a valid peptide should be measured, or not measured, in each of the three technical replicates. In addition, we only considered peptides with retention time CVs of