Integrating Carbamylation and Ultraviolet Photodissociation Mass

Article Cite This: Anal. Chem. XXXX, XXX, XXX-XXX

pubs.acs.org/ac

Integrating Carbamylation and Ultraviolet Photodissociation Mass Spectrometry for Middle-Down Proteomics

James D. Sanders, Sylvester M. Greer, and Jennifer S. Brodbelt*

Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States

ABSTRACT: The most popular bottom-up proteomics workflow uses trypsin to enzymatically cleave proteins C-terminal to lysine and arginine residues prior to LC-MS/MS analysis of the resulting peptides. The high frequency of these residues generates short peptides, some of which are too small or uninformative for optimal analysis and which potentially contribute to gaps in sequence coverage of proteins. Analysis of larger peptides, termed “middle-down”, has the potential to span greater sections of protein sequences if the larger peptides are adequately characterized based on their fragmentation patterns. We describe a strategy to generate larger peptides in conjunction with successful characterization by ultraviolet photodissociation (UVPD) for MS/MS analysis in a middle-down workflow, as demonstrated for proteins from E. coli lysates. The larger peptides are produced via modification of lysine residues by carbamylation of proteins. Carbamylation of proteins followed by tryptic digestion produced peptides similar to those expected from Arg-C proteolysis, yet with fewer missed and nonspecific cleavages. UVPD provides excellent sequence coverage of the larger peptides that are often less well characterized by traditional collision-based activation methods.

Received: August 21, 2017 Accepted: October 16, 2017 Published: October 16, 2017

DOI: 10.1021/acs.analchem.7b03396 Anal. Chem. XXXX, XXX, XXX−XXX

Over the past two decades, mass spectrometry has cemented its position as a primary tool for high-throughput proteomics workflows.1−4 Central to this success has been the utilization of proteolytic enzymes to cleave proteins into small peptides to enable the bottom-up approach.1−4 Peptides are more easily separated by liquid chromatographic methods and more readily identified by tandem mass spectrometry (MS/MS) than are intact proteins.5 While numerous enzymes offering a wide variety of amino acid specificities are available for proteolysis,6,7 trypsin has remained by far the most popular owing to its high proteolytic efficiency C-terminal to Arg and Lys residues. Trypsin typically results in production of peptides smaller than 3 kDa, which are well-suited for standard LC-MS methods.4−6

Despite the enormous success of bottom-up workflows, other approaches that characterize intact proteins (top-down)8,9 or use limited proteolysis to generate large peptides (middle-down)6,7,10−15 have gained popularity. These methods address some of the limitations of bottom-up strategies.15,16 For example, smaller-sized proteolytic peptides may be too short for effective MS/MS analysis, resulting in loss of information about sequence regions containing high frequencies of the residues targeted by the protease, such as Arg and Lys for trypsin-based methods. Moreover, although proteins may be confidently identified based on analysis of just a few representative peptides, the short length of tryptic peptides often means that large stretches of the protein sequence remain uncharacterized, thus potentially resulting in missed sequence mutations and post-translational modifications.16 Even if modifications of specific residues are found in the collections of peptides analyzed by bottom-up methods, mapping of combinatorial patterns of modifications is rarely possible because the sequence stretches are too short. Top-down methods entail direct MS and MS/MS analysis of intact proteins, thus allowing the potential for complete characterization of mutations and all post-translational modifications of proteins at the expense of sensitivity, less efficient separation methods for proteins, and the more demanding technical complications of data analysis and scoring.8,9 Middle-down strategies which generate and analyze large (3−15 kDa) peptides offer a compelling compromise between the common bottom-up and the more challenging top-down approaches.10−15

Capitalizing on the potential attributes of middle-down methods is constrained by two key requirements: (i) robust means to produce large peptides, and (ii) MS/MS methods well-suited for characterization of the resulting large peptides. Protein processing to generate suitable middle-down size peptides has been implemented via several options, typically based on using proteolytic enzymes or chemical conditions with very specific cleavage-site selectivity6,7,14,17 or adopting proteolysis conditions that limit protein cleavage.10,11,15 For example, OmpT is an aspartyl protease that only cleaves between adjacent nonterminal basic residues (e.g., Arg-Arg), thus yielding very large peptides owing to the low frequency of this motif.14 Another highly specific protease is IdeS, which cleaves immunoglobulins below the hinge region to produce large antibody fragments.18 Cyanogen bromide, although no longer commonly used, cleaves at the C-terminal side of methionine residues, thus leading to large peptides owing to the low frequency of methionine in proteins.19 For the alternative limited proteolysis approach, solution conditions may be established to prevent denaturation of the proteins of interest or to reduce the efficiency of the protease, thus restricting the frequency of proteolytic cleavage to generate larger peptides than those produced using bottom-up proteolysis conditions.10,11 A third option for generating larger peptides entails derivatization of proteins to convert standard proteolysis sites into noncleavable sites, thus modulating the size of peptides produced.20,21 In one recent study, Lys residues were chemically modified to restrict proteolytic cleavage by trypsin to Arg residues, resulting in identification of a greater number of proteins and higher sequence coverage.20 This general type of strategy has also facilitated the analysis of histones,21 in which propionylation of lysine residues prior to trypsinolysis quenched cleavage at the Lys sites. The success of these derivatization methods motivated one aspect of the present study, in which carbamylation, rather than propionylation, of lysine residues is employed as a means to restrict trypsin proteolysis to Arg sites.

The large size of peptides in the middle-down approach makes them more difficult to characterize by traditional collision-based MS/MS methods such as collision-induced dissociation (CID) or higher-energy collisional dissociation (HCD). The fragmentation mechanism promoted by collisional activation methods typically requires mobile protons in order to achieve extensive dissociation, and the efficiency of fragmentation decreases with peptide size and other factors owing to the redistribution of internal energy through many vibrational modes.22 This latter factor generally diminishes fragmentation efficiency as the size of the peptide ion increases. To address this size-based limitation of the fragmentation efficiency of collisional activation methods, several alternative activation techniques more suitable for larger peptides and proteins have been explored in recent years. For example, electron-based methods such as electron capture dissociation (ECD) and electron transfer dissociation (ETD) yield more extensive sequence coverage of larger peptides and afford better retention of PTMs compared to collision-based methods.23−25 These attributes have led to widespread adoption of ETD for proteomics applications,26,27 with the caveat that charge reduction is a significant (and generally uninformative) competing process for peptides in low charge states or with low charge density.

Another promising alternative activation method for proteomics applications is ultraviolet photodissociation (UVPD).28−43 UVPD provides extensive sequence coverage and preserves PTMs.28,36,38,39 Unlike ETD, UVPD has proven to be effective for peptides in low charge states and for both large and small peptides42 as well as intact proteins.29,40 An additional attribute is that UVPD, unlike collisional activation methods, generates informative fragmentation patterns of deprotonated peptides, thus making it suitable for analysis of the acidic proteome, which extends the breadth and depth of proteomic analysis.34,43

Here we describe a method to generate large peptides with high efficiency and to characterize them with high sequence coverage by UVPD-MS, as summarized in Scheme S1. The strategy entails carbamylation of the side chains of Lys residues of proteins prior to trypsin proteolysis. Carbamylation prevents tryptic cleavage at Lys sites, thus limiting proteolysis to less frequent Arg sites and creating large peptides suitable for middle-down proteomic workflows. UVPD is used to enhance sequence coverage of the resulting larger peptides. Use of this method enables analysis of lysine-rich portions of proteins that might be unobservable using trypsin alone while still taking advantage of the outstanding robustness, efficiency, and specificity of trypsin as a protease. We evaluate this method by comparing it to both trypsin (without protein derivatization) and Arg-C proteolysis of E. coli whole cell lysate, and the performance of UVPD is compared to HCD as the MS/MS method of the middle-down workflow.



EXPERIMENTAL SECTION

Materials. Urea, ammonium bicarbonate, dithiothreitol, iodoacetamide, and HPLC grade solvents were purchased from Merck Millipore (Billerica, MA) and used without further purification. Equine myoglobin, bovine ubiquitin, and bovine serum albumin were obtained from Sigma-Aldrich. Sequencing grade trypsin was purchased from Promega (Madison, WI), and sequencing grade Arg-C protease was purchased from Worthington Biochemical Corporation (Lakewood, NJ).

Derivatization and Digestion. Carbamylation was performed prior to tryptic digestion using established protocols.43,44 Briefly, protein samples were suspended in an 8 M urea solution buffered at pH 8 with 150 mM ammonium bicarbonate (ABC) and incubated at 80 °C for 4 h. Urea concentration was reduced to below 1 M by diluting samples 10:1 in ABC buffer prior to disulfide reduction and alkylation. Disulfide bonds were reduced by addition of dithiothreitol (DTT) to a concentration of 5 mM and incubation at 55 °C for 50 min and subsequently alkylated by addition of iodoacetamide to a concentration of 15 mM and incubation in the dark for 50 min. Alkylation was quenched by a second addition of DTT to a final concentration of 10 mM. Samples prepared for Arg-C digestion were reduced and alkylated in the same manner using 90 mM Tris buffer in place of ABC. Prior to addition of the Arg-C protease, 8.5 mM CaCl2, 5 mM DTT, and 0.5 mM ethylenediaminetetraacetic acid (EDTA) were added in accordance with the manufacturer’s protocol. Trypsin or Arg-C was then added at a protein to protease ratio of 50:1, and solutions were incubated overnight at 37 °C. After digestion, solutions were desalted using either a C18 spin column or a 3 kDa molecular weight cutoff (MWCO) centrifuge filter.

LC-MS/MS. All samples were analyzed using a Thermo Fisher Scientific Orbitrap Fusion Lumos mass spectrometer (Thermo Fisher Scientific, San Jose, CA) modified to enable UVPD in the dual linear ion trap using a 500 Hz 193 nm excimer laser (Coherent, Santa Clara, CA) as described previously.45 Separations were performed using a Dionex Ultimate 3000 nano LC system equipped with trap (3 cm × 100 μm i.d.) and analytical columns (30 cm polymer reversed-phase (PLRP) for carbamylated samples or 15 cm C18 for unmodified samples, each 75 μm i.d.) packed in-house. Both C8 and PLRP stationary phases were evaluated for the analytical separations via nanoLC. While both proved to be suitable, PLRP provided the best separation for the widest variety of peptides and was used for all carbamylated samples. Peptides were eluted using water with 0.1% formic acid as mobile phase A and acetonitrile with 0.1% formic acid as mobile phase B applied as a gradient at a net flow rate of 300 nL/min. After loading approximately 1 μg onto the precolumn trap for 5 min, the analytical gradient was run from 2% to 50% B over the course of 60 min for single protein samples and 120 min for E. coli whole cell lysate samples. Mass spectra were acquired using a resolution of 120 000 for MS1 spectra and 60 000 for MS/MS spectra. A single microscan was used for each scan event. Precursor ions were automatically selected for either HCD or UVPD using the top-speed data-dependent acquisition mode. HCD was performed using a normalized collision energy (NCE) setting of 30, and UVPD was performed using two 5 ns pulses at 3 mJ per pulse. Data were searched against appropriate protein databases using Byonic software (Protein Metrics, San Carlos, CA) with a tolerance of 10 ppm for precursor ions, 20 ppm for fragment ions, and a 1% false discovery rate cutoff and allowing a maximum of two missed cleavages. Scores calculated by Byonic range from 0 to 1000 and are an indicator of the correctness of the peptide spectral match, with scores over 500 deemed highly confident.
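To make the search tolerances concrete, the following minimal Python sketch shows how a parts-per-million (ppm) mass tolerance translates into an acceptance window for an observed m/z value. It is an illustration only and does not reproduce how Byonic applies its tolerances internally; the function and constant names are hypothetical, and only the 10 ppm and 20 ppm values are taken from the text above.

```python
# Illustrative sketch: ppm mass error and tolerance windows.
# Function and constant names are hypothetical, not part of any search engine API.

PRECURSOR_TOL_PPM = 10.0  # precursor tolerance quoted in the text
FRAGMENT_TOL_PPM = 20.0   # fragment tolerance quoted in the text


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observation, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def tolerance_window(mz: float, tol_ppm: float) -> tuple[float, float]:
    """Lower and upper m/z bounds accepted around a theoretical m/z."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


if __name__ == "__main__":
    # A theoretical m/z of 1000 accepts roughly +/- 0.01 at 10 ppm.
    print(tolerance_window(1000.0, PRECURSOR_TOL_PPM))
```

Because the window scales with m/z, a fixed ppm tolerance corresponds to a wider absolute window for heavier ions, which is relevant for the larger peptides and fragments targeted by a middle-down workflow.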

Peptide Sizes and Protein-Level Sequence Coverage. Owing to the reduction in the number of sites available for tryptic cleavage after carbamylation of lysine side chains, we expected the average size of peptides produced to be larger than those generated from a standard tryptic digestion of proteins. Furthermore, we speculated that the longer peptides might cover areas of protein sequences that were missed by trypsin alone, providing additional complementary characterization. Proteins containing lysine-rich regions were of particular interest because these proteins may generate numerous small peptides when subjected to conventional trypsin proteolysis. Carbamylated and unmodified proteins and E. coli whole cell lysate were digested with trypsin and analyzed in triplicate to characterize the peptides produced by each method as well as the coverage of protein sequences. One benchmark example is illustrated for myoglobin, a lysine-rich protein containing 19 Lys in its 153-residue sequence. Whereas tryptic digestion of myoglobin would be expected to generate numerous small peptides, including ones with as few as one to four residues, tryptic digestion after carbamylation produced only a handful of much larger peptides with the longest containing over 100 residues.

UVPD has been previously shown to yield higher sequence coverage of large peptides and proteins compared to HCD,29,40−42 so it was expected to perform particularly well for the larger peptides generated by middle-down workflows. An example of the HCD and UVPD mass spectra obtained for the largest middle-down sized peptide from myoglobin is shown in Figure S1, and the corresponding deconvoluted HCD and UVPD mass spectra, which underscore the much larger range of sizes of fragment ions produced by UVPD compared to HCD, are shown in Figure S2. The peptide (12.7 kDa) identified in Figure S1 contained 16 Lys, all carbamylated, and produced charge states ranging from 8+ to 15+ upon ESI. UVPD yielded a sequence coverage of 77% for this large peptide; HCD gave a sequence coverage of 33% that was mostly focused in 25-residue stretches encompassing the C- and N-termini. This example confirmed the high efficiency of carbamylation, the ability to produce high charge states of a peptide containing 16 nonbasic Lys residues, a single Arg, and 10 His, and the good sequence coverage afforded by UVPD.

Figure 1 shows an example of the HCD and UVPD mass spectra obtained for one large peptide generated from the carbamylation/tryptic digestion procedure of E. coli lysate. The 50-residue peptide (∼5 kDa) with a sequence of SKTIATENAPAAIGPYVQGVDLGNMIITSGQIPVNPKTGEVPADVAAQAR originates from the parent protein 2-iminobutanoate/2-iminopropanoate deaminase. For both HCD and UVPD, the dominant fragments are b- and y-type ions, with a few additional c- and z-type ions observed in the UVPD mass spectrum. Most interesting is the production of significantly longer sequence ions upon UVPD; the largest sequence ion observed upon HCD contains 18 amino acids, whereas the fragment ion size extends up to 35 residues upon UVPD. While both methods provide sufficient coverage to confidently identify the peptide and its protein of origin, the stretch between G14 and I32 is not covered by any fragments in the HCD mass spectrum, whereas every residue in the sequence is covered by at least three fragments in the UVPD spectrum. Although the UVPD spectrum shows only moderately better sequence coverage than HCD (53% compared to 45% for HCD), the presence of larger fragments such as y24 and b35 provides coverage of the middle portion of the peptide.
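The notion of sequence coverage used above can be illustrated with a short Python sketch. It assumes one common definition, namely the fraction of inter-residue backbone bonds that are cleaved by at least one observed fragment ion; the text does not state which definition underlies the reported percentages, so this is an assumption. The function names are hypothetical, and the fragment list contains only the two ions named in the text (b35 and y24), so the computed value is not expected to match the reported coverages.

```python
import re

# Illustrative sketch: sequence coverage as the fraction of backbone bonds
# cleaved by at least one fragment ion (an assumed definition).

PEPTIDE = "SKTIATENAPAAIGPYVQGVDLGNMIITSGQIPVNPKTGEVPADVAAQAR"  # from the text

N_TERMINAL_IONS = set("abc")  # fragments containing the peptide N-terminus
C_TERMINAL_IONS = set("xyz")  # fragments containing the peptide C-terminus


def cleaved_bonds(fragments: list[str], length: int) -> set[int]:
    """Map labels such as 'b35' or 'y24' to backbone bond positions.

    Bond k lies between residues k and k+1 (1-based), so a peptide of
    `length` residues has bonds 1 .. length-1.
    """
    bonds = set()
    for label in fragments:
        match = re.fullmatch(r"([abcxyz])(\d+)", label)
        if match is None:
            raise ValueError(f"unrecognized fragment label: {label}")
        ion_type, size = match.group(1), int(match.group(2))
        if not 1 <= size < length:
            raise ValueError(f"fragment size out of range: {label}")
        # An N-terminal ion of n residues ends at bond n; a C-terminal ion
        # of n residues starts at bond length - n.
        bonds.add(size if ion_type in N_TERMINAL_IONS else length - size)
    return bonds


def sequence_coverage(fragments: list[str], length: int) -> float:
    """Fraction of backbone bonds explained by at least one fragment."""
    return len(cleaved_bonds(fragments, length)) / (length - 1)


if __name__ == "__main__":
    example = ["b35", "y24"]  # the two large ions named in the text
    print(sorted(cleaved_bonds(example, len(PEPTIDE))))
    print(sequence_coverage(example, len(PEPTIDE)))
```

Under this definition, b35 and y24 map to bonds 35 and 26 of the 50-residue peptide, both of which fall in the middle portion of the sequence that the shorter HCD fragments do not reach.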



RESULTS AND DISCUSSION

The performance of the two-part carbamylation/tryptic digestion strategy to generate middle-down size peptides was evaluated using individual benchmark proteins (ubiquitin, myoglobin, carbonic anhydrase) and then applied to E. coli whole cell lysate. UVPD was used to characterize the array of middle-down peptides, and the performance metrics of UVPD were evaluated in comparison to HCD with respect to both the number of peptides and proteins identified as well as the depth of characterization facilitated by each activation method. Additionally, Arg-C was used to generate peptides with sequences identical to those produced upon carbamylation/tryptic digestion in order to compare the efficiency and specificity of both proteolysis methods.

Optimization of Cleanup Procedure and LC Stationary Phase. The combination of larger peptide size and the greater hydrophobicity of carbamylated peptides significantly increases the affinity of the peptides for the C18 stationary phase normally used for both sample cleanup and chromatographic separation in conventional bottom-up proteomic workflows. While larger peptides were being created upon tryptic digestion of carbamylated proteins, they were often significantly under-represented in the LC-MS/MS data sets. We hypothesized that these larger peptides were not completely eluting from the C18 spin columns used for sample cleanup after digestion, thus motivating the exploration of alternative cleanup procedures and stationary phases. A method using molecular weight cutoff centrifuge filters with a 3 kDa cutoff was adapted for this purpose.12 While this type of membrane filter is designed to retain only species with molecular weights greater than 3 kDa, it was found that by performing a limited number of “rinses” (i.e., rinsing the sample with ∼400 μL buffer after adding the sample into the filter cartridge) smaller species could be retained. While this method inherently involves some sample loss, it allows the ratio of small (<3 kDa) to large (>3 kDa) peptides to be modulated by changing the number of rinses performed. Three rinses produced ample enrichment of larger peptides while still maintaining a sufficient number of smaller peptides to provide good sequence coverage at the protein level in the present study. The number and volume of rinses can be tailored to customize the enrichment of the larger peptides as desired.


peptides was 2.0 kDa based on HCD-MS analysis, and 11% had a mass greater than 3 kDa. Interestingly, the average mass of identified peptides by UVPD-MS increased to 2.2 kDa and the portion of peptides with a mass greater than 3 kDa increased to 18%. The increase in average peptide mass observed for the UVPD results is rationalized by the fact that UVPD has proven to be particularly effective for characterizing larger peptides, ones for which HCD may give inadequate sequence coverage for confident identification. To assess the extent to which the experimental findings for the carbamylation/trypsin approach reflect the theoretical size distribution of peptides generated by both digestion protocols, two in silico digestions were performed on the 5,973 protein sequences found in the E. coli K12 database downloaded from www.uniprot.org. A theoretical mass was calculated for each peptide composed of four or more amino acid residues, and these masses were plotted as a histogram in Figure S3. The set of theoretical peptides generated by cleaving protein sequences after both Arg and Lys residues returned an average mass of 1.5 kDa, and 9% of the peptides had masses greater than 3 kDa. The peptide set generated by cleaving sequences only after Arg residues (to mimic the outcome of carbamylation) had an average mass of 2.3 kDa, and 25% of peptides had masses greater than 3 kDa. Comparison of these theoretical distributions to the experimental findings (Figure 2) indicated that the size distribution of unmodified tryptic peptides identified by HCD is similar to the theoretical distribution. However, when carbamylated peptides are analyzed using HCD, the larger (>3 kDa) peptides are underrepresented when compared to the theoretical distribution. When the same carbamylated peptides are analyzed by UVPD, the size distribution of peptides more closely mirrors what would be expected based on the in silico results, thus reflecting the capability of UVPD to evaluate longer peptides. 
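The in silico digestion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' script: it assumes complete cleavage with no missed cleavages and no proline rule, uses unmodified monoisotopic residue masses, and takes a plain list of sequence strings as input. The function names and the toy sequence are hypothetical; only the two cleavage rules (after Lys and Arg, or after Arg only), the minimum length of four residues, and the 3 kDa threshold come from the text.

```python
# Illustrative sketch of an in silico digestion and peptide size summary.
# Assumptions: complete cleavage, no missed cleavages, no proline rule,
# unmodified monoisotopic masses. Names are hypothetical.

MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056


def digest(sequence: str, cleave_after: str) -> list[str]:
    """Cleave C-terminal to every residue listed in `cleave_after`."""
    peptides, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass (Da) of an unmodified peptide."""
    return sum(MONOISOTOPIC_RESIDUE_MASS[r] for r in peptide) + WATER_MASS


def size_summary(proteins, cleave_after, min_length=4, cutoff_da=3000.0):
    """Return (mean mass in kDa, fraction of peptides above the cutoff)."""
    masses = [
        peptide_mass(p)
        for sequence in proteins
        for p in digest(sequence, cleave_after)
        if len(p) >= min_length
    ]
    if not masses:
        raise ValueError("no peptides of sufficient length")
    mean_kda = sum(masses) / len(masses) / 1000.0
    fraction_large = sum(m > cutoff_da for m in masses) / len(masses)
    return mean_kda, fraction_large


if __name__ == "__main__":
    toy = "AAKGGRPPKR"  # hypothetical sequence, for illustration only
    print(digest(toy, "KR"))  # trypsin-like rule
    print(digest(toy, "R"))   # Arg-only rule, mimicking carbamylation
```

Applying `size_summary` once with `"KR"` and once with `"R"` to a full proteome would give the two theoretical distributions being compared; whether the carbamyl mass was added to Lys in the Arg-only set is not stated in the text, so it is left out here.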
To compare the performance of HCD and UVPD for the characterization of carbamylated middle-down size peptides on a larger scale, the peptides found in all three replicate samples analyzed using each activation method were compiled using the peptide spectral match (PSM) with the highest score for each individual peptide. Considering only peptides that were identified in all three replicates of each type of experiment, 1711 peptides were identified by HCD and 1204 peptides were identified by UVPD, with 1092 peptides common to both. A Venn diagram showing this overlap is shown in Figure S4. For each carbamylated peptide found in common, the difference in score obtained by HCD versus UVPD is graphically displayed in Figure 3. While HCD typically identifies a greater number of peptides compared to UVPD (average of 3555 peptides per run for HCD versus 3151 peptides per run for UVPD), UVPD identifies peptides that, on average, are somewhat larger (Figure 2) and more highly charged (Figure S5). UVPD also provides more confident scores than HCD for nearly 75% of peptides (Figure 3). While UVPD yielded generally higher scores across all mass ranges, the performance advantage of UVPD compared to HCD especially increases with the size of the peptide, as illustrated by the box plots shown as a function of peptide mass bins in Figure S6. The scoring advantage of UVPD is particularly notable for peptides larger than 4.5 kDa, likely because HCD typically produces shorter fragment ions that do not encompass large portions of these longer sequences and inconsistently span the midsections. Examples of the total sequence coverage obtained for two representative E. coli proteins, 30S ribosomal protein S4 and

Figure 1. (a) HCD and (b) UVPD mass spectra of a 50 residue peptide (4+) identified from a tryptic digest of E. coli proteins after carbamylation. The green Lys residues indicate carbamylated sites.

Collisional activation methods have consistently afforded scantier coverage in the middle regions of large peptides and intact proteins. The ability to generate fragments that contain more than 50% of the sequence of large peptides and proteins is a hallmark of UVPD and can provide critical insights into the locations of PTMs or point mutations. E. coli peptides identified by both HCD and UVPD methods were sorted into bins according to their molecular mass, and the average percentage of the population of each bin was plotted as a histogram (Figure 2). Tryptic digestion produced detectable peptides with an average mass of 1.6 kDa, and less than 3% of identified peptides had a mass greater than 3 kDa. For the carbamylation/trypsin protocol, the average mass of

Figure 2. Histogram of binned peptide sizes from E. coli lysate digested with trypsin without or after carbamylation and analyzed using HCD or UVPD.


of the sequence are captured in larger peptides, providing coverage of portions of proteins that might otherwise be unmapped in standard tryptic digests. A second protein, elongation factor Tu2 shown in Figure S7, provides further examples of additional protein sequence coverage that can be obtained by using this approach. For this protein, there are nine segments outlined with red boxes that represent stretches only mapped by UVPD or HCD in conjunction with carbamylation. To facilitate the examination of protein level sequence coverage on a larger scale, the proteins identified in at least two out of three replicates for each of the three experimental conditions (unmodified/trypsin-HCD, carbamylated/trypsin-HCD, and carbamylated/trypsin-UVPD) were categorized in a Venn diagram format shown in Figure 5. Although the total

Figure 3. Comparison of scores between HCD and UVPD for peptides identified in all three HCD and all three UVPD replicates of carbamylated/trypsin E. coli lysate, ranked by difference in score between HCD and UVPD. Data above the origin represents peptides better characterized by UVPD (blue points).

elongation factor Tu2, by HCD and UVPD are shown in Figures 4 and S7. For each of the two proteins, the sequence is

Figure 5. Venn diagram showing proteins found in at least two of three replicate tryptic digests of either carbamylated or unmodified E. coli lysate, using HCD for the unmodified lysate and either HCD or UVPD for carbamylated samples.

number of identified proteins was greatest using HCD of the conventional unmodified tryptic digest, 83 additional proteins were uniquely identified by MS/MS analysis of the tryptic digests of the carbamylated proteins. Based on the Venn diagram analysis in Figure 5, 465 proteins were found in common by all three of the methods. The sequence coverages obtained for each of these 465 proteins were compared for each of the three methods, and these results are shown graphically in Figure 6 as difference plots. The magnitude of each difference score reflects the magnitude of the difference in coverage. The results show that while there is a negligible difference in coverage when comparing the carbamylated/trypsin-HCD

Figure 4. Protein-level sequence coverage of 30S ribosomal protein S4 from E. coli K12 cell lysate. Blue underlines indicate peptides identified in an unmodified tryptic digest, while red and gold underlines indicate peptides from a sample that was carbamylated prior to tryptic digestion (identified by HCD and UVPD, respectively). Stretches encompassed in red boxes were only identified by carbamylation/trypsin. All Lys residues are highlighted in green font; all Arg residues are highlighted in purple font.

shown with color-coding to indicate the regions covered by three methods: unmodified protein/trypsin digestion/HCD, carbamylated protein/trypsin digestion/HCD, and carbamylated protein/trypsin digestion/UVPD. Stretches encompassed by blue boxes are those uniquely identified by HCD analysis of the unmodified tryptic peptides (none in Figure 4 and three in Figure S7). The regions encompassed by red boxes indicate those uniquely identified by HCD or UVPD analysis of the carbamylated tryptic peptides. Among the carbamylated peptides in Figure 4, those identified by HCD are underscored in red, and those identified by UVPD are underscored in gold. While all three methods provide some unique areas of coverage, advantages of the carbamylation/middle-down strategy are apparent. Several portions of the protein sequence, such as SGVR (beginning at S23), NYYK (beginning at N74), VK (beginning at V155), and EK (beginning at E166), produce short peptides upon trypsin proteolysis owing to the close spacing of Lys residues. Such short peptides, ones containing fewer than six residues, are frequently missed in conventional bottom-up approaches and are typically nondiagnostic if identified. By preventing cleavage at Lys residues, these areas

Figure 6. Comparison of the difference in protein level sequence coverage for MS/MS analysis of tryptic digests of E. coli lysate for proteins identified in all three experiments (HCD of peptides generated from tryptic digestion of unmodified samples (unmod), and HCD or UVPD of peptides generated by tryptic digestion of carbamylated samples (carbam)).


Figure S9 shows a comparison between MS1 spectra of two peptides from the E. coli protein elongation factor Tu2 with the same primary sequence (GSALKALEGDAEWEAKILELAGFLDSYIPEPER) generated by either carbamylation/trypsin or Arg-C that differ only by the addition of the carbamylation modification (43.0058 Da) to each of the two Lys residues of the peptide for the carbamylated sample. While this peptide is found in both the 3+ and 4+ charge states in both samples, the 3+ charge state is favored for the carbamylated peptide whereas the 4+ charge state is favored for the peptide generated by Arg-C, an outcome reflecting the effect of the reduced basicity resulting from carbamylation of the primary amines. Figure S10 shows a comparison of the UVPD mass spectra of the 3+ charge state of this peptide, which yields nearly identical sequence coverage between the carbamylated/trypsin and Arg-C versions.
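The mass relationship between the two versions of this peptide follows from simple arithmetic, sketched below in Python. The sketch assumes that only the Lys side chains are modified, as stated in the text, and uses the quoted carbamylation mass of 43.0058 Da together with a proton mass of 1.007276 Da; the function names are hypothetical and no observed m/z values from the spectra are reproduced.

```python
# Illustrative sketch: effect of Lys carbamylation on peptide mass and m/z.
# Assumes only Lys side chains are modified; names are hypothetical.

CARBAMYL_MASS = 43.0058  # Da per modified Lys, as quoted in the text
PROTON_MASS = 1.007276   # Da

PEPTIDE = "GSALKALEGDAEWEAKILELAGFLDSYIPEPER"  # from the text


def carbamylated_mass(unmodified_mass: float, n_lys: int) -> float:
    """Neutral mass after carbamylation of `n_lys` Lys residues."""
    return unmodified_mass + n_lys * CARBAMYL_MASS


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the protonated ion [M + zH]z+."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mz_shift(n_lys: int, charge: int) -> float:
    """Difference in m/z between carbamylated and unmodified peptide ions
    of the same charge state."""
    return n_lys * CARBAMYL_MASS / charge


if __name__ == "__main__":
    n_lys = PEPTIDE.count("K")  # two Lys residues in this sequence
    for charge in (3, 4):
        print(charge, round(mz_shift(n_lys, charge), 4))
```

For a given charge state the two forms are therefore separated by 2 × 43.0058 / z in m/z, which makes the carbamylated and Arg-C peptides easy to distinguish in MS1 spectra even though their sequences are identical.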

method to the unmodified/trypsin-HCD method, the sequence coverages obtained for the carbamylated/trypsin-UVPD method were greater than those obtained by the unmodified/trypsin-HCD procedure for nearly 75% of the proteins. This outcome is attributed to the superior performance of UVPD for characterizing larger peptides.

Comparison of Carbamylation/Trypsin to Arg-C. Because the carbamylation/trypsin method should produce peptides with identical sequences to those produced by the protease Arg-C, the two strategies were compared for an E. coli lysate. While Arg-C exhibits a preference for cleavage C-terminal to Arg residues, it is known to produce cleavages C-terminal to Lys residues and, to a lesser extent, other residues as well.7 The specificity of Arg-C was thus compared to the carbamylation/trypsin method as a means of assessing the ability to mitigate sample complexity and enhance the effectiveness of targeted experiments. A nonspecific search was performed on the data sets from both the carbamylation/trypsin and Arg-C digests run in triplicate and analyzed using HCD. To compare cleavage specificity between the two methods, the search results were sorted based on the residue N-terminal to each identified peptide so that the total number of peptides generated by cleavage C-terminal to each of the 20 common amino acid residues could be counted. Figure 7 shows



CONCLUSIONS

Carbamylation of Lys residues prior to tryptic digestion provides a simple, robust, and more selective alternative to Arg-C for generation of middle-down size peptides resulting from enzymatic cleavage C-terminal to Arg residues. Carbamylation of Lys proceeds with extremely high efficiency, resulting in less basic residues that are no longer recognized by trypsin. UVPD generates rich fragmentation patterns that can facilitate more complete characterization of peptides compared to collision-based methods, particularly for larger peptides and/or ones in low charge states generated in middle-down proteomic workflows. UVPD returned greater peptide ID scores than HCD for over 65% of the peptides identified by both MS/MS methods. Combining HCD and UVPD in the middle-down workflow resulted in identification of unique proteins not found using a conventional bottom-up strategy and filled in gaps in sequence coverage missed based on analysis of standard tryptic peptides.



ASSOCIATED CONTENT

* Supporting Information

Figure 7. Relative abundance of peptides classified based on their Cterminal residue generated by cleavage with Arg-C (green) or by trypsin digestion after carbamylation (orange) obtained from E. coli lysate. Peptides containing the N-terminus of the protein, with or without the initiator methionine residue, were excluded.

S

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.7b03396. Workflow diagram, raw and deconvoluted MS/MS spectra, theoretical size distributions of peptides, Venn diagram of peptide identifications, histogram of peptide charge states, peptide identification scores, and sequence coverages (PDF)

the frequencies of cleavages C-terminal to each residue for carbamylated/trypsin and Arg-C digests. For the Arg-C samples, on average 80% of peptides were the result of cleavage C-terminal to Arg residues, 6.7% originated from cleavage C-terminal to Lys residues, and the remaining 13.3% arose from other nonspecific cleavages, in agreement with a previous study.6 For the carbamylated/trypsin samples, 91% of identified peptides resulted from cleavage C-terminal to Arg residues, 1.3% occurred from cleavage C-terminal to Lys residues, and the remaining 7.7% originated from other nonspecific cleavages. These results demonstrate that the carbamylation strategy provides superior specificity of cleavage sites compared to Arg-C. Because carbamylation converts primary amines to less basic carbamyl derivatives, peptides produced by the carbamylation/ trypsin are expected to be found in lower charge states compared to those produced by Arg C. The histogram shown in Figure S8 indicates that nearly 90% of the peptides identified in the carbamylation/trypsin data set were found in the 2+ and 3+ charge states, compared to 60% in Arg-C experiments.
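The cleavage-specificity tally described in the comparison of carbamylation/trypsin to Arg-C can be illustrated with a short Python sketch. It is not the authors' analysis code: the function name `cleavage_profile`, the input layout (one single-letter flanking residue per identified peptide, with `None` marking a peptide that contains the protein N-terminus), and the toy input are all assumptions made for illustration.

```python
from collections import Counter


def cleavage_profile(preceding_residues):
    """Percentage of peptides produced by cleavage after Arg, Lys, or any other residue.

    preceding_residues: iterable with one entry per identified peptide, giving the
    single-letter code of the residue N-terminal to that peptide in the protein.
    None marks a peptide containing the protein N-terminus; such peptides are
    excluded (hypothetical convention chosen for this sketch).
    """
    counts = Counter(r for r in preceding_residues if r is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no peptides with a preceding residue")
    arg = counts["R"]
    lys = counts["K"]
    other = total - arg - lys  # all nonspecific cleavages pooled together
    return {
        "Arg": 100.0 * arg / total,
        "Lys": 100.0 * lys / total,
        "other": 100.0 * other / total,
    }


# Toy input for illustration only; these are not data from the study.
toy = ["R"] * 8 + ["K"] + ["L"] + [None]
profile = cleavage_profile(toy)
print(profile)
```

Pooling every residue other than Arg and Lys into a single class mirrors how the percentages are reported in the text; a per-residue breakdown such as the one plotted in Figure 7 could be read directly from the `Counter` instead.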



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Phone: (512)-471-0028.

ORCID

Jennifer S. Brodbelt: 0000-0003-3207-0217

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

We acknowledge the following funding sources: NSF (grant CHE1402753) and the Welch Foundation (grant F-1155). This material is based upon work supported by the National Science Foundation Graduate Research Fellowship (grant DGE1610403) awarded to J.S. Funding from the UT System for support of the UT System Proteomics Core Facility Network is gratefully acknowledged.



REFERENCES

(1) Bensimon, A.; Heck, A. J. R.; Aebersold, R. Annu. Rev. Biochem. 2012, 81 (1), 379−405.
(2) Cox, J.; Mann, M. Annu. Rev. Biochem. 2011, 80 (1), 273−299.
(3) Mann, M.; Kulak, N. A.; Nagaraj, N.; Cox, J. Mol. Cell 2013, 49 (4), 583−590.
(4) Yates, J. R. J. Am. Chem. Soc. 2013, 135 (5), 1629−1640.
(5) Zhang, Y.; Fonslow, B. R.; Shan, B.; Baek, M.-C.; Yates, J. R. Chem. Rev. 2013, 113 (4), 2343−2394.
(6) Tsiatsiani, L.; Heck, A. J. R. FEBS J. 2015, 282 (14), 2612−2626.
(7) Giansanti, P.; Tsiatsiani, L.; Low, T. Y.; Heck, A. J. R. Nat. Protoc. 2016, 11 (5), 993−1006.
(8) Toby, T. K.; Fornelli, L.; Kelleher, N. L. Annu. Rev. Anal. Chem. 2016, 9 (1), 499−519.
(9) Catherman, A. D.; Skinner, O. S.; Kelleher, N. L. Biochem. Biophys. Res. Commun. 2014, 445 (4), 683−693.
(10) Xu, P.; Peng, J. Anal. Chem. 2008, 80 (9), 3438−3444.
(11) Valkevich, E. M.; Sanchez, N. A.; Ge, Y.; Strieter, E. R. Biochemistry 2014, 53 (30), 4979−4989.
(12) Cannon, J. R.; Edwards, N. J.; Fenselau, C. J. Mass Spectrom. 2013, 48 (3), 340−343.
(13) Boyne, M. T.; Garcia, B. A.; Li, M.; Zamdborg, L.; Wenger, C. D.; Babai, S.; Kelleher, N. L. J. Proteome Res. 2009, 8 (1), 374−379.
(14) Wu, C.; Tran, J. C.; Zamdborg, L.; Durbin, K. R.; Li, M.; Ahlf, D. R.; Early, B. P.; Thomas, P. M.; Sweedler, J. V.; Kelleher, N. L. Nat. Methods 2012, 9 (8), 822−824.
(15) Cristobal, A.; Marino, F.; Post, H.; van den Toorn, H. W. P.; Mohammed, S.; Heck, A. J. R. Anal. Chem. 2017, 89 (6), 3318−3325.
(16) Duncan, M. W.; Aebersold, R.; Caprioli, R. M. Nat. Biotechnol. 2010, 28 (7), 659−664.
(17) Swatkoski, S.; Gutierrez, P.; Wynne, C.; Petrov, A.; Dinman, J. D.; Edwards, N.; Fenselau, C. J. Proteome Res. 2008, 7 (2), 579−586.
(18) Vincents, B.; von Pawel-Rammingen, U.; Björck, L.; Abrahamson, M. Biochemistry 2004, 43 (49), 15540−15549.
(19) Simpson, R. J. Cold Spring Harbor Protoc. 2007, 3, doi: 10.1101/pdb.prot4704.
(20) Golghalyani, V.; Neupärtl, M.; Wittig, I.; Bahr, U.; Karas, M. J. Proteome Res. 2017, 16 (2), 978−987.
(21) Lin, S.; Garcia, B. A. In Methods in Enzymology; Elsevier: New York, 2012; Vol. 512, pp 3−28.
(22) Wysocki, V. H.; Tsaprailis, G.; Smith, L. L.; Breci, L. A. J. Mass Spectrom. 2000, 35 (12), 1399−1406.
(23) Chi, A.; Huttenhower, C.; Geer, L. Y.; Coon, J. J.; Syka, J. E. P.; Bai, D. L.; Shabanowitz, J.; Burke, D. J.; Troyanskaya, O. G.; Hunt, D. F. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (7), 2193−2198.
(24) Mikesh, L. M.; Ueberheide, B.; Chi, A.; Coon, J. J.; Syka, J. E. P.; Shabanowitz, J.; Hunt, D. F. Biochim. Biophys. Acta, Proteins Proteomics 2006, 1764 (12), 1811−1822.
(25) Good, D. M.; Wirtala, M.; McAlister, G. C.; Coon, J. J. Mol. Cell. Proteomics 2007, 6 (11), 1942−1951.
(26) Ledvina, A. R.; Beauchene, N. A.; McAlister, G. C.; Syka, J. E. P.; Schwartz, J. C.; Griep-Raming, J.; Westphall, M. S.; Coon, J. J. Anal. Chem. 2010, 82 (24), 10068−10074.
(27) Riley, N. M.; Hebert, A. S.; Dürnberger, G.; Stanek, F.; Mechtler, K.; Westphall, M. S.; Coon, J. J. Anal. Chem. 2017, 89 (12), 6367−6376.
(28) Madsen, J. A.; Boutz, D. R.; Brodbelt, J. S. J. Proteome Res. 2010, 9 (8), 4205−4214.
(29) Shaw, J. B.; Li, W.; Holden, D. D.; Zhang, Y.; Griep-Raming, J.; Fellers, R. T.; Early, B. P.; Thomas, P. M.; Kelleher, N. L.; Brodbelt, J. S. J. Am. Chem. Soc. 2013, 135 (34), 12646−12651.
(30) Brodbelt, J. S. Chem. Soc. Rev. 2014, 43 (8), 2757.
(31) Reilly, J. P. Mass Spectrom. Rev. 2009, 28 (3), 425−447.
(32) Ly, T.; Julian, R. R. Angew. Chem., Int. Ed. 2009, 48 (39), 7130−7137.
(33) Zhang, L.; Reilly, J. P. J. Proteome Res. 2010, 9 (6), 3025−3034.
(34) Madsen, J. A.; Xu, H.; Robinson, M. R.; Horton, A. P.; Shaw, J. B.; Giles, D. K.; Kaoud, T. S.; Dalby, K. N.; Trent, M. S.; Brodbelt, J. S. Mol. Cell. Proteomics 2013, 12 (9), 2604−2614.
(35) Greer, S. M.; Parker, W. R.; Brodbelt, J. S. J. Proteome Res. 2015, 14 (6), 2626−2632.
(36) Fort, K. L.; Dyachenko, A.; Potel, C. M.; Corradini, E.; Marino, F.; Barendregt, A.; Makarov, A. A.; Scheltema, R. A.; Heck, A. J. R. Anal. Chem. 2016, 88 (4), 2303−2310.
(37) Madsen, J. A.; Kaoud, T. S.; Dalby, K. N.; Brodbelt, J. S. Proteomics 2011, 11 (7), 1329−1334.
(38) Madsen, J. A.; Ko, B. J.; Xu, H.; Iwashkiw, J. A.; Robotham, S. A.; Shaw, J. B.; Feldman, M. F.; Brodbelt, J. S. Anal. Chem. 2013, 85 (19), 9253−9261.
(39) Robinson, M. R.; Taliaferro, J. M.; Dalby, K. N.; Brodbelt, J. S. J. Proteome Res. 2016, 15 (8), 2739−2748.
(40) Cannon, J. R.; Cammarata, M. B.; Robotham, S. A.; Cotham, V. C.; Shaw, J. B.; Fellers, R. T.; Early, B. P.; Thomas, P. M.; Kelleher, N. L.; Brodbelt, J. S. Anal. Chem. 2014, 86 (4), 2185−2192.
(41) Cannon, J. R.; Martinez-Fonts, K.; Robotham, S. A.; Matouschek, A.; Brodbelt, J. S. Anal. Chem. 2015, 87 (3), 1812−1820.
(42) Cotham, V. C.; Brodbelt, J. S. Anal. Chem. 2016, 88 (7), 4004−4013.
(43) Greer, S. M.; Cannon, J. R.; Brodbelt, J. S. Anal. Chem. 2014, 86 (24), 12285−12290.
(44) Angel, P. M.; Orlando, R. Rapid Commun. Mass Spectrom. 2007, 21 (10), 1623−1634.
(45) Klein, D. R.; Holden, D. D.; Brodbelt, J. S. Anal. Chem. 2016, 88 (1), 1044−1051.
