Complementary or Alternative to Tryptic Digestion? - ACS Publications


Article

ArgC-like digestion: complementary or alternative to tryptic digestion? Vahid Golghalyani, Moritz Neupärtl, Ilka Wittig, Ute Bahr, and Michael Karas J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00921 • Publication Date (Web): 04 Jan 2017 Downloaded from http://pubs.acs.org on January 5, 2017


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

ArgC-like digestion: complementary or alternative to tryptic digestion?

AUTHOR NAMES
Vahid Golghalyani1*, Moritz Neupärtl1, Ilka Wittig2,3,4, Ute Bahr1, Michael Karas1*

AUTHOR ADDRESS
1 Institute of Pharmaceutical Chemistry, Goethe-University, Frankfurt am Main, Germany
2 Functional Proteomics, Centre for Biochemistry, Medical School, Goethe-University, Frankfurt, Germany
3 Cluster of Excellence “Macromolecular Complexes”, Goethe University, Frankfurt am Main, Germany
4 German Center of Cardiovascular Research (DZHK), Partner site RheinMain, Frankfurt, Germany

KEYWORDS Shotgun proteomics, ArgC, trypsin, amino modification, protein modification, on bead digestion, acetylation, reductive methylation, carbethoxylation, propionylation

ACS Paragon Plus Environment


ABSTRACT

Enzymatic digestion of complex protein samples is often performed with multiple proteases to improve protein identification and characterization. Combining trypsin with ArgC is one option to enhance sequence coverage in bottom-up proteomics. However, the low selectivity of this endoprotease detracts from the benefit of the combination. Our approach here is to mimic ArgC digestion by chemically modifying all lysine residues in proteins so that trypsin can only cleave C-terminal to arginine. Four different amine modifications (dimethylation, acetylation, propionylation and carbethoxylation) were tested and the protocols were optimized. A nearly complete conversion of the primary amines was achieved for all modifications. Tryptic digestion of Escherichia coli lysate proteins after acylation of lysine residues shows the most significant improvement compared to data obtained from the ArgC digest. After propionylation, 9216 unique peptides identified 1439 proteins, which represents about 150 additional proteins compared to a conventional tryptic digestion, owing to a reasonable reduction of the sample complexity and higher fragmentation efficiencies of the peptides. It is therefore concluded that the ArgC-like digestion should no longer be regarded as a complementary approach, but as a viable and superior alternative to conventional trypsin digestion.


TEXT

1 Introduction

“Bottom-up” proteomics is the most frequently used approach to identify and characterize proteins by mass spectrometry. Complex protein samples are proteolytically digested and the generated peptide mixture is subsequently subjected to separation and analysis by liquid chromatography and mass spectrometry (LC-MS/MS). The proteolytic enzyme trypsin is still the workhorse protease, due to its high cleavage specificity and efficiency and its stability under a wide variety of reaction conditions. Cleavage C-terminal to arginine and lysine residues leads to a high number of peptides in the mass range suitable for LC separation and effective fragmentation by tandem mass spectrometry. In the best case, the highly basic residue at the C-termini of peptides leads to prominent and easily interpretable y-ion series.

However, trypsin has certain limitations. An inadequate distribution of trypsin cleavage sites in certain protein domains results in peptides that are either too long or too short for mass spectrometric analysis. Several multi-enzyme approaches have been published that improve the sequence coverage of proteins by combining trypsin digestion with alternative proteases such as ArgC, GluC, chymotrypsin, LysC, LysN or AspN.1–3 The use of alternative enzymes with respect to the shortcomings of the nearly exclusive use of trypsin in proteomics is comprehensively summarized in a state-of-the-art review.4 Compared to trypsin, most of these proteases cleave proteins in a less specific manner, and the absence of a terminal basic amino acid causes a less efficient fragmentation.

One attractive alternative to trypsin is the endopeptidase ArgC, which preferentially cleaves at the C-terminus of arginine residues and thus theoretically generates a less complex sample, enabling faster and easier analysis and containing a higher amount of unique peptides. In practice, however, ArgC has not been used frequently because of two main drawbacks: its lack of specificity5 and its high cost. In the work presented here we demonstrate a strategy to circumvent these drawbacks by chemically derivatizing the side chain of lysine to block trypsin cleavage at this site and to induce an ArgC-like digest. At the same time, the chemical modification of internal lysine residues renders the resulting peptides less hydrophilic and more amenable to RP-LC separation, and the decrease in basicity improves fragmentation of the peptides with CID owing to a better distribution of the mobile protons.6

Because of their high reactivity, peptide and protein amino groups are favored sites of modification, and numerous protocols can be found in the literature.7,8 For our approach we chose four amine modifications which promised the production of homogeneous products and have so far been applied successfully only to peptides and small proteins.

Propionylation of the ε-amino group of the lysine side chain by propionic acid anhydride (PA) has been used to chemically derivatize histones, which otherwise yield too many small fragments after trypsin digestion due to their high number of basic amino acids.9 In another approach it has been used to label the N-terminus of peptides.10

Dimethyl-labeling (DM) is frequently used as a labeling strategy,11–14 its most prominent application being the multiplex quantification of proteins.15 In a different approach, the reductive methylation of amines was performed prior to modification of carboxyl groups to analyze proteins’ carboxy termini.16

Sulfo-NHS-Acetate (NHS) contains an NHS ester moiety, which acetylates amino groups quantitatively under mild conditions. One example of its use is to block the ε-amino group of lysine on protein level in a so-called N-terminomics approach.17


Diethyl pyrocarbonate (DEPC) is the most reactive reagent. Reactions have been reported with residues containing a nucleophilic group.18 Thus it was used for carbethoxylation of histidine residues to analyze the fragmentation behavior19 or the catalytic site of enzymes.20 To restrict the modification to lysine residues, the addition of hydroxylamine is necessary to reverse unwanted reactions. Additionally, the selectivity can be increased by using high pH values.

The main reason why chemical modifications are rarely used in the analysis of complex protein samples is the often incomplete labeling, which complicates MS identification as well as quantification. Even small amounts of side reactions lead to a severe increase in the complexity of the peptide mixture.21 To overcome this problem, a purification method is used here which allows harsher reaction conditions and multiple incubation steps to increase the modification efficiency.

In recent years, new methods for proteomic sample preparation were introduced which circumvent many of the challenges associated with traditional protein purification methods, which are mainly based on precipitation. The introduction of the filter-aided approach FASP22 was an advance, but it has not yet been extended to chemical protein modification. Another method was introduced recently, based on the immobilization of proteins and peptides on the hydrophilic surface of carboxylate-coated paramagnetic beads by a mechanism similar to hydrophilic interaction chromatography (HILIC) or electrostatic repulsion hydrophilic interaction chromatography (ERLIC).23

In the following we elaborate on the use of NHS-activated magnetic beads to bind proteins covalently prior to reduction, alkylation and amine modification, with the aim of simplifying and optimizing the purification steps while minimizing sample loss. By immobilizing the proteins onto the beads, reaction yields can be enhanced by applying higher amounts of organic solvents and reagents without sacrificing the subsequent proteolytic digestion of the proteins. Proteomic identification of the differently modified proteins after the ArgC-like digestion was compared to the standard tryptic digestion and the gain in information was evaluated.

Figure 1. Workflow to modify lysine residues in proteins to block tryptic cleavage at this site and to compare the results with ArgC and trypsin digestion. Proteins immobilized on NHS-activated beads were reduced and alkylated before either being digested with trypsin or ArgC (left and middle branch) or being subjected to different modifications (right branch) and digested tryptically afterwards. The resulting samples were measured using LC-MS/MS and evaluated subsequently. The procedure is exemplified here for a hypothetical peptide containing cleavage sites for ArgC and trypsin.
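The difference between the branches of the workflow can be illustrated with a minimal in silico digest. The sketch below is not part of the published method: the peptide sequence is invented for illustration, and the cleavage rules are simplified (complete digestion, no missed cleavages, no proline exception). Blocking lysine is modeled simply by removing K from the set of cleavable residues.

```python
def digest(sequence, cleave_after):
    """Cut `sequence` C-terminal to every residue listed in `cleave_after`.

    Simplified model: complete digestion, no missed cleavages and no
    proline exception.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # C-terminal remainder without a cleavage site
        peptides.append(sequence[start:])
    return peptides


# Hypothetical peptide, invented for this example.
SEQUENCE = "MAGKLSEERVTKDLQRGYA"

tryptic = digest(SEQUENCE, "KR")   # unmodified protein, trypsin
argc_like = digest(SEQUENCE, "R")  # lysines blocked, trypsin cuts only after R

print(tryptic)    # ['MAGK', 'LSEER', 'VTK', 'DLQR', 'GYA']
print(argc_like)  # ['MAGKLSEER', 'VTKDLQR', 'GYA']
```

The ArgC-like digest yields fewer and longer peptides, each ending in arginine except for the protein C-terminus.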


3 Materials and Methods

3.1 Sample preparation

The proteins human apo-transferrin and alcohol dehydrogenase from Saccharomyces cerevisiae (Sigma Aldrich) and the Escherichia coli lysate sample supplied by Bio-Rad were dissolved in triethylammonium bicarbonate buffer (TEAB) (100 mM, pH = 8.5, Sigma Aldrich) and subsequently bound to the NHS Mag Sepharose beads (GE Healthcare Life Sciences) according to the supplier’s protocol. Briefly, the beads were activated by adding 1 mM cold HCl solution after removing the isopropanol from the stock suspension. Concurrently the activation solution was replaced by the protein solution. For 100 µg protein sample, 40 µL bead stock suspension were used. The binding process was quenched after 45 minutes of incubation at room temperature by replacing the buffer with ammonium bicarbonate (160 mM, Sigma Aldrich). After binding, the immobilized proteins were reduced with DTT (30 mg/mL, Sigma Aldrich) for 45 min at 57 °C and alkylated with IAA (Roth) for 45 min at room temperature in the dark. The beads were washed two times with 100 mM TEAB to ensure an ammonia-free buffer for further modification steps.

All subsequently described amino modifications were stopped by removing the reaction buffer. The beads were washed with an ammonia-containing buffer, and afterwards the sample was transferred to a clean tube (crucial) and washed again before adding the digestion buffer. If an unwanted modification of peptide N-termini is observed after digestion, an additional quenching step, in which 2 µL hydroxylamine solution (50 wt. %, Sigma Aldrich) are added, can be integrated between the washing steps.

3.2 Derivatization

Propionic acid anhydride (PA). 480 µL methanol (HiPerSolv CHROMANORM, gradient grade) and 120 µL TEAB (1.0 M, pH = 8.5) were added to the immobilized proteins. Before adding 6 µL propionic acid anhydride (Sigma Aldrich), the bead suspension was cooled down to 4 °C. After 2 hours of incubation at 4 °C the reaction was stopped.

Sulfo-NHS-Acetate (NHS). The reaction takes place in a solution containing 2 mg/mL Sulfo-NHS-Acetate (Thermo Fisher) dissolved in a mixture of 50% DMSO (Roth) and 50% TEAB (200 mM) for 1 hour at room temperature.

Diethyl pyrocarbonate (DEPC). The reaction solution contains 10% DEPC (Sigma Aldrich) dissolved in acetonitrile. 120 µL of this solution were added to the beads, which were suspended in 480 µL cold (4 °C) TEAB (1 M, pH = 8.5). The reaction was stopped after 1 hour at 4 °C.

Dimethyl-Labeling (DM). Beads were suspended in 200 µL 100 mM TEAB before adding 200 µL 2-picoline-borane (28 mg/mL, Sigma Aldrich) dissolved in ACN. After adding 200 µL formaldehyde (36%, Sigma Aldrich), the reaction was incubated for 1 hour at room temperature.

Substance P (Sigma Aldrich) was derivatized according to published protocols9,15,17,19, which are summarized in the supplementary data (Supplementary_1, page 3-4).
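Each of the four reagents adds a characteristic mass to every modified amine, which is what a database search later has to account for. The following sketch derives these monoisotopic mass shifts from elemental compositions. The compositions are assumptions based on the standard chemistry of each reagent (propionyl, acetyl, dimethyl, carbethoxy); they are not values taken from the manuscript.

```python
# Monoisotopic atomic masses in unified atomic mass units.
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Net elemental composition gained per modified amine.
# Assumed from the usual reaction products, not quoted from the paper.
TAG_COMPOSITION = {
    "PA": {"C": 3, "H": 4, "O": 1},    # propionylation
    "NHS": {"C": 2, "H": 2, "O": 1},   # acetylation
    "DM": {"C": 2, "H": 4},            # dimethylation
    "DEPC": {"C": 3, "H": 4, "O": 2},  # carbethoxylation
}


def mass_shift(protocol, n_sites=1):
    """Monoisotopic mass added when `n_sites` amines carry the tag."""
    per_site = sum(ATOM_MASS[element] * count
                   for element, count in TAG_COMPOSITION[protocol].items())
    return per_site * n_sites


for name in TAG_COMPOSITION:
    # Two sites correspond to a peptide with a free N-terminus and one lysine.
    print(f"{name}: {mass_shift(name):.4f} per site, "
          f"{mass_shift(name, 2):.4f} for two sites")
```

A peptide with one lysine and a free N-terminus, such as Substance P, would be expected to shift by twice the per-site value once both amines are converted.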


3.3 Digestion

Trypsin. The washed protein-carrying beads were suspended in 90 µL 50 mM TEAB. 1 µg of trypsin (Sigma Aldrich) diluted in 10 µL 1 mM HCl was added and digestion was carried out at 37 °C overnight (ca. 18 h).

ArgC. Digestion was performed according to the supplier’s protocol. Briefly, the beads were suspended in 80 µL digestion buffer (50 mM Tris/HCl, 10 mM CaCl2, pH = 7.6). 1 µg of ArgC (Roche) diluted in 10 µL of the digestion buffer was added prior to adding 10 µL working solution (50 mM Tris/HCl, 10 mM CaCl2, 50 mM DTT, pH = 7.6). Digestion was carried out at 37 °C overnight (ca. 18 h). Digestion was stopped by adding 1 µL 100% TFA and the samples were purified with C18 spin columns (Pierce, Thermo Fisher).

3.4 Mass spectrometry

MALDI MS. MS measurements for optimizing the single modification protocols were conducted on a MALDI LTQ Orbitrap XL (Thermo Scientific) in Fourier transform MS mode, positive polarity, 30 000 resolution and mass-to-charge range 800–4000. All other MALDI MS spectra were measured with a MALDI TOF/TOF Analyzer (ABI 4800, Applied Biosystems) with a Nd:YAG laser (355 nm), positive polarity and mass-to-charge range 800–4000. All samples were spotted using α-cyano-4-hydroxycinnamic acid (Bruker) as matrix (3 mg/mL in 70% ACN, 0.1% TFA).

LC-MS/MS. LC-MS/MS measurements were performed on a Thermo Scientific Q Exactive equipped with an ultra-high performance liquid chromatography unit (Thermo Scientific Dionex Ultimate 3000) and a Nanospray Flex Ion-Source (Thermo Scientific). Peptides were loaded onto a C18 reversed-phase precolumn (Thermo Scientific) followed by separation on a 2.4 µm Reprosil C18 resin (Dr. Maisch GmbH, Germany) in-house packed picofrit emitter tip (diameter 100 µm, 15 cm long, New Objective) using a gradient from 5% mobile phase A (4% acetonitrile, 0.1% formic acid) to 30% mobile phase B (80% acetonitrile, 0.1% formic acid) for 70 min, followed by a second gradient to 60% B for 30 min at a flow rate of 300 nL/min. The LC run was finished by washout with 99% B for 5 min and re-equilibration in 1% B.

MS data were recorded by a data-dependent acquisition Top10 method selecting the most abundant precursor ions in positive mode for HCD fragmentation. The full MS scan range was 300 to 2000 m/z with a resolution of 70000 and an automatic gain control (AGC) value of 3 × 10^6 total ion counts with a maximal ion injection time of 160 ms. Only higher charged ions (2+ or greater) were selected for MS/MS scans with a resolution of 17500, an isolation window of 2 m/z and an automatic gain control value set to 10^5 ions with a maximal ion injection time of 150 ms. Selected ions were excluded for a time frame of 30 s following the fragmentation event. Full scan MS data were acquired in profile mode by Xcalibur software.

3.5 Data Analysis

All vendor-specific files were converted to an open source raw format. Thermo RAW files were converted to mzML with MSConvert24, whereas T2D files from the MALDI TOF/TOF were converted to mzXML with “T2D Converter by Bathy”. MALDI raw files in an open source format could be loaded into mMass25 to perform signal processing such as peak picking and deisotoping and to generate peak lists, which were saved as mgf files. LC-MS/MS Thermo RAW files were directly converted to mgf files with MSConvert. All database searches were performed with Mascot 2.4. The results of the database searches were extracted from the resulting dat file by using the MSParser from Matrix Science. All the mentioned tools (except for the T2D file converter) were integrated into KNIME26 to automate the processes. An example workflow is attached to the supplementary data (Supplementary_1, page 16, figure S9).

Mascot search. A detailed description for each sample is attached in the supplementary data (Supplementary_1, page 14 – 15, table S1 and S2). Briefly, for the modification protocols each search was performed taking carbamidomethylation of cysteine and the particular modification of lysine as fixed modifications. Methionine oxidation and modification of the protein N-terminus were taken as variable modifications, and the specified enzyme was ArgC allowing no missed cleavage. Peak lists generated with the MALDI-Orbitrap were searched with 10 ppm peptide mass tolerance; the submitted MALDI-TOF/TOF files were searched with 80 ppm peptide mass tolerance. Peptide identification of MALDI data was done via peptide mass fingerprint (PMF). LC-MS/MS files were searched with 5 ppm peptide mass tolerance and 0.1 Da MS/MS mass tolerance allowing one missed cleavage. Identification was done by choosing an MS/MS ion search. The specified enzyme was ArgC in all cases except for the trypsin digestion without prior amino modification. SwissProt (version 04.2016) was selected as database. The decoy MS/MS ion search was performed with a shuffled database to estimate the false discovery rate. All MS/MS database searches were concurrently percolated27 to improve the sensitivity and specificity of the results.

Design of Experiment (DoE). The DoE-based optimization of the reaction conditions was done with MODDE (Version 11.0.1.1878) by “MKS Umetrics AB”.
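The idea behind a decoy search can be sketched in a few lines. The example below uses the common decoy-to-target ratio above a score threshold as the FDR estimate; this is an assumption for illustration and does not reproduce how Mascot or the percolation step compute their values. All scores are invented.

```python
def fdr_percent(target_scores, decoy_scores, threshold):
    """Estimate the false discovery rate (in %) as decoy hits / target hits.

    Only matches scoring above `threshold` are counted. This simple ratio
    is one common convention, assumed here for illustration.
    """
    n_target = sum(score > threshold for score in target_scores)
    n_decoy = sum(score > threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0  # nothing accepted, so nothing can be a false discovery
    return 100.0 * n_decoy / n_target


# Hypothetical ion scores from a target and a shuffled (decoy) database search.
target = [45.2, 30.1, 22.7, 18.4, 14.9, 12.0, 9.3]
decoy = [15.5, 11.2, 8.7]

print(fdr_percent(target, decoy, threshold=13))  # 1 decoy / 5 targets
```

Raising the threshold trades sensitivity for a lower estimated FDR, which is why the accepted score cutoff matters when protocols are compared.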


2 Results and Discussion

2.1 Optimization of lysine modifications on protein level

Lysine modification efficiency was first assessed on peptide level by applying the aforementioned published protocols9,15,17,19 (Supplementary_1, page 3-4). Substance P was chosen because it contains only one internal lysine and can thus be modified twice. MALDI mass spectra of unmodified and modified (flipped spectra) Substance P are shown in the supplementary data (Supplementary_1, page 5, figure S1). The annotated mass shifts reflect the introduction of two tags for each modification (N-terminus and lysine side chain). The standard protocols result in a complete conversion of Substance P’s amino groups without any remaining precursor mass. However, transferring these protocols to the derivatization of intact proteins was not successful, as reactions were incomplete and side reactions occurred. Thus, higher modification efficiencies needed to be achieved by improving the reaction conditions and sample purification.

For each modification the reaction conditions were optimized by analyzing the effects of different factors (e.g. solvent, amount of reagent, temperature and time) on the result. Those factors which had a significant effect were selected for further optimization. The experimental design, including variations of the chosen factors, was created with MODDE, selecting a fractional factorial design to find ideal conditions. Intensity coverage (calculated by dividing the sum of the ion intensities of the matched peaks by the sum of all peak intensities in the spectrum) was chosen as the measure of derivatization quality. A Mascot PMF (peptide mass fingerprint) search was performed with the highest specificity, i.e. no missed cleavages were considered.

The procedure for the optimization of the Sulfo-NHS protocol is given here as an example. The standard protocol, which gives excellent results on peptide level, was applied to the protein alcohol dehydrogenase (ADH), a homotetramer of 37 kDa subunits. Assuming a complete conversion of the primary amines, a tryptic digestion of this protein should produce only three or four fragments in the accessible mass range (see table 1). The MALDI spectrum of the protein digest before optimization shows the incomplete modification (see figure 2A, upper trace). Screening experiments revealed that the addition of DMSO and an increasing amount of derivatization reagent have a significant effect on the modification efficiency. The experimental design was created by defining the factors (DMSO and derivatization reagent amount) and the response (intensity coverage). The result is the contour plot in figure 2C, which shows how intensity coverage is affected by varying the assessed factors. The best results were obtained with a reaction buffer containing more than 50% DMSO (in 100 mM TEAB) and a Sulfo-NHS-Acetate concentration above 1.8 mg/mL. Peak lists for calculating intensity coverage were generated without setting any intensity threshold, before the signal processing steps (peak picking and deisotoping) were applied. Under these preconditions, an intensity coverage of 80% indicates a very high conversion rate. Applying this protocol to ADH, the tryptic digestion results in the spectrum shown in figure 2A (lower trace, flipped spectrum). Only the predicted peaks are observed, demonstrating a complete conversion of the amino groups.

All other protocols were optimized in a similar way, and in summary each optimization gave improved results (Supplementary_1, page 8-11, figure S3 – S6). The contour plots as well as the MALDI mass spectra after optimization are given in the supplementary data (Supplementary_1, page 12 – 13, figure S7 and S8).
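The intensity coverage defined above (summed intensity of matched peaks divided by the summed intensity of all peaks) is straightforward to compute. The sketch below is an illustration rather than the authors' KNIME workflow: peaks are matched to expected m/z values within a ppm tolerance, and the example spectrum is invented.

```python
def intensity_coverage(peaks, expected_mz, tol_ppm=80.0):
    """Fraction of the total ion intensity explained by expected peptides.

    peaks       -- list of (m/z, intensity) tuples
    expected_mz -- m/z values predicted for the fully modified digest
    tol_ppm     -- matching tolerance in parts per million
    """
    total = sum(intensity for _, intensity in peaks)
    if total == 0:
        return 0.0
    matched = sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - ref) / ref * 1e6 <= tol_ppm for ref in expected_mz)
    )
    return matched / total


# Hypothetical spectrum: two expected fragments plus one unexplained signal.
spectrum = [(968.48, 500.0), (1660.86, 300.0), (1200.60, 200.0)]
expected = [968.48, 1660.85, 3436.91]

coverage = intensity_coverage(spectrum, expected)
print(f"intensity coverage: {coverage:.0%}")
```

Because unmatched signals from incomplete labeling or side reactions stay in the denominator, the value drops as soon as the conversion is not clean.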


Figure 2. Optimizing the modification (acetylation) efficiency on protein level. (A) MALDI mass spectra of alcohol dehydrogenase before (upper trace) and after optimization (lower trace, flipped spectrum) of the standard protocols. Only those three peptides which are predicted in the accessible mass range are observed. (B) Acetylation of amino groups with Sulfo-NHS-Acetate. (C) Contour plot of the reaction optimization. The percentage of DMSO and the concentration of blocking reagent (100% is equal to 2 mg/mL) in the reaction buffer affect the intensity coverage and thus the conversion efficiency. The color scale ranges from blue (30%) to red (80%) intensity coverage.

Table 1. Expected and observed tryptic fragments after complete derivatization and digestion of alcohol dehydrogenase.

Sequence                          Expected mass   Measured mass   Modifications (position)
EALDFFAR                          968.48          968.48          -
VLGIDGGEGKEELFR                   1660.85         1660.86         Acetyl: 10
GLVKSPIKVVGLSTLPEIYEKMEKGQIVGR    3436.91         -               Acetyl: 4, 8, 21, 24
GLVKSPIKVVGLSTLPEIYEKMEKGQIVGR    3452.91         3452.91         Acetyl: 4, 8, 21, 24; Oxidation: 22
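The expected masses in Table 1 can be reproduced from the sequences. The sketch below assumes that the tabulated values are monoisotopic masses of singly protonated peptides, [M+H]+, as is usual for MALDI spectra; standard residue masses and modification deltas are used, and the function name is illustrative.

```python
# Monoisotopic residue masses of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056      # added once per peptide for the free termini
PROTON = 1.00728      # charge carrier of the [M+H]+ ion
ACETYL = 42.01056     # mass shift per acetylated lysine
OXIDATION = 15.99491  # mass shift per oxidized methionine


def mh_plus(sequence, n_acetyl=0, n_oxidation=0):
    """Monoisotopic [M+H]+ of a peptide carrying the given modifications."""
    backbone = sum(RESIDUE_MASS[residue] for residue in sequence)
    return (backbone + WATER + PROTON
            + n_acetyl * ACETYL + n_oxidation * OXIDATION)


LONG_PEPTIDE = "GLVKSPIKVVGLSTLPEIYEKMEKGQIVGR"

print(f"{mh_plus('EALDFFAR'):.2f}")
print(f"{mh_plus('VLGIDGGEGKEELFR', n_acetyl=1):.2f}")
print(f"{mh_plus(LONG_PEPTIDE, n_acetyl=4):.2f}")
print(f"{mh_plus(LONG_PEPTIDE, n_acetyl=4, n_oxidation=1):.2f}")
```

Each lysine in these ArgC-like fragments carries one acetyl group, so the number of acetylations simply equals the number of K residues in the sequence.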


2.2 Comparison of different lysine modifications in proteins

In order to evaluate which lysine modification protocol results in the best ArgC-like fragmentation, the optimized modification protocols were applied to the model protein human apo-transferrin, which has more cleavage sites than alcohol dehydrogenase. The degree of modification was again calculated from the intensity coverage as described above and compared to the results from trypsin and ArgC digestions of unmodified proteins. Six technical replicates were performed for each protocol and each of the resulting samples was spotted eight times on the MALDI target. Each spot was measured with 5000 laser shots on a 4800 MALDI TOF/TOF mass spectrometer. Spectra from the same sample were summed up before peak picking and deisotoping. Generated peak lists were subjected to a Mascot PMF search with ArgC or trypsin as enzyme, allowing no missed cleavages. Modifications were selected depending on the applied protocol.

The results are summarized in the bar plot in figure 3A, showing that lysine acylation with PA or NHS yields the most specific results with 65 or 59% intensity coverage, respectively. DM- and DEPC-modified proteins exhibit somewhat lower intensity coverages. According to this plot, digestion with ArgC leads by far to the highest amount of unspecific and unpredicted signals, which makes spectra interpretation challenging. The low yield of specific fragments can mainly be explained by its known trypsin-like behavior5. Even trypsin shows a decreased digestion efficiency in comparison to most of the ArgC-like digests, probably due to a higher amount of missed cleavage sites, which will be discussed in the next section.

To verify these results, a complex protein sample, an Escherichia coli lysate, was subjected to all four modification procedures. After trypsin digestion the results were compared to the ArgC-digested lysate. The protocol with the best performance was finally compared to a conventional tryptic digestion in the next section. Each digestion was carried out in triplicate, and the generated peptide mixtures were analyzed by LC-MS/MS as described in the method part. Generated Thermo RAW files were converted to mgf files with ProteoWizard’s MSConvert and afterwards submitted to Mascot to perform an MS/MS ion search.

For each sample which was modified prior to digestion with trypsin, the database search was performed two times with different search parameters, which are described in detail in the supplementary data (Supplementary_1, page 14 – 15, table S1 and S2). In the first database search (table S1) all modifications except modification of the protein N-terminus and methionine oxidation were set as fixed and ArgC was chosen as enzyme; thus the search space was kept small and the results of each sample were comparable to those of the ArgC and trypsin digestions. The second search (table S2) was performed to detect peptides resulting from incomplete protein modifications or unwanted side reactions. To enable their detection, trypsin was set as enzyme and 8 missed cleavages were allowed. Except for the carbamidomethylation of cysteine, all other modifications were set as variable. Additionally, the acetylation of the protein N-terminus was considered and, for samples treated with DEPC, the carbethoxylation of histidines as reported before.20 After generating the target and decoy peptide lists with MSParser, only peptides with an ion score above the identity threshold (~13) were considered further for protein identification.

The results of the first database search are listed in table 2, which shows the mean number of peptide-spectrum matches (PSMs), unique peptides and unique proteins and their corresponding false discovery rates (FDR). The best values were obtained after propionylation of the primary amines: 1439 proteins and 9216 unique peptides could be identified. While all derivatization reactions that reduce the basicity of the lysine side chain yield comparable results, propionylation is superior with respect to the number of identified peptides. Modification protocols that do not decrease the basicity of proteins produced the lowest number of peptides and proteins. Although the highest number of peptides was generated with a conventional trypsin digestion (11178), the protein matches were lower compared to most of the ArgC-like digestions.

The MS/MS results of the tryptic digests show the highest number of assigned peptides. About 44% of the MS/MS spectra are associated with peptides from Escherichia coli proteins. Among the ArgC-like digests the highest number of assigned peptides is observed after acylation of the amino groups: about 35% (NHS) and 38% (PA).


Table 2. Protein and peptide identification results.

Protocol   #unique peptides   #unique proteins   FDR peptides [%]   FDR proteins [%]
PA         9216               1439               0.40               0.91
NHS        8683               1402               0.44               1.45
DM         7166               1206               0.40               1.35
DEPC       7868               1425               0.47               1.64
ArgC       4624               1164               0.24               0.78
Trypsin    11178              1287               0.52               2.71
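The score filtering and the FDR values reported in Table 2 can be illustrated with a minimal Python sketch. It assumes a hypothetical list of PSM records that carry an ion score and a decoy flag; the record layout, the function names and the example values are invented for illustration and do not reproduce the Mascot or MSParser output. The FDR is estimated here as the ratio of decoy to target hits, which is a common convention but not necessarily the exact formula used in this study, and a single fixed threshold of 13 is used as a simplification of the approximate identity threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """One peptide-spectrum match (hypothetical record layout)."""
    peptide: str
    ion_score: float
    is_decoy: bool


def filter_by_threshold(psms, identity_threshold=13.0):
    """Keep PSMs whose ion score lies above the identity threshold."""
    return [p for p in psms if p.ion_score > identity_threshold]


def estimate_fdr_percent(psms):
    """Estimate the FDR as decoy hits / target hits, in percent (assumed formula)."""
    targets = sum(1 for p in psms if not p.is_decoy)
    decoys = sum(1 for p in psms if p.is_decoy)
    return 100.0 * decoys / targets if targets else 0.0


# Invented example data, not taken from the study.
example_psms = [
    PSM("AAGR", 45.0, False),
    PSM("LLER", 20.5, False),
    PSM("QQWR", 9.0, False),   # below the threshold, discarded
    PSM("RTTK", 14.2, True),   # decoy hit that passes the threshold
    PSM("GGNR", 31.0, False),
    PSM("PPDR", 16.0, False),
]

accepted = filter_by_threshold(example_psms)
fdr_percent = estimate_fdr_percent(accepted)
print(len(accepted), fdr_percent)  # 5 accepted PSMs, FDR 25.0 %
```

In a real evaluation the same two steps would be applied to the target and decoy lists of each replicate before averaging.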

The second Mascot search, which considers incomplete amino modifications and unwanted side reactions, revealed that the higher number of unassigned peptides has several causes. One is the reaction of residual amino-blocking reagent with the N-terminal amino groups of the proteolytic peptides. The database search revealed that, except for reductive alkylation, all modifications also occur on the N-termini of peptides. DEPC shows the highest reactivity and/or resistance to washing: on average 695 ± 352 N-terminally modified peptides were detected, accounting for 6 ± 3.4% of the total measured intensity. N-terminal peptide modifications originating from the acylation reagents PA and NHS occur to approximately the same extent (PA: 511 ± 343, NHS: 468 ± 155), making up to 3% of the total measured intensity. The high standard deviations estimated from 3 replicates suggest that in at least one sample the number of post-modified peptides is low, and one can assume that variations in the reaction or purification conditions have a large impact. We therefore believe that the extent can probably be reduced by integrating an additional quenching step with hydroxylamine and additional washing steps.


Although the number of post-modified peptides appears to be high in most samples, the intensity ratio between modified and unmodified peptides is only in the low percent range. Furthermore, the second database search identified incompletely modified peptides for each derivatization, which contribute to the higher number of unassigned peptides in the first database search. The extent of incomplete modification is in the same range regardless of the modification used. On average about 500 additional peptides could be identified, which make up to 3% of the total measured intensity. The number of identified peptides that originate from the protein N-terminus and carry an N-terminal acetylation is negligibly small, whereas the number of peptides with a modified histidine after treatment with DEPC is about 250 on average, making up to 2% of the total measured intensity.

ArgC is frequently used in multi-enzyme approaches1,3 to enhance the sequence coverage of proteins and protein identification in complex samples. To address the complementarity of the protein information from ArgC-like and trypsin digestions, the distribution of the sequence coverages of all proteins identified after trypsin digestion of Escherichia coli lysate and the increments contributed by the different ArgC-like digests are shown in the boxplots in Figure 3C. The increase in sequence coverage was calculated with a self-written Python script by adding the additional peptide fragment masses from the ArgC-like digestions to the trypsin peak lists of each identified protein. Each digestion results in a shift to higher sequence coverages once the gained information is taken into account; the largest shift is observed for the PA and NHS protocols, indicating the highest complementarity to a conventional tryptic digestion.
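The idea behind the coverage increment can be sketched in a few lines of Python. This is not the authors' script, which is not reproduced in the text: the sketch works on peptide sequences rather than on peak lists, and the protein sequence, the peptide lists and the function names are hypothetical. Coverage is taken as the percentage of residues of a protein that are covered by at least one identified peptide.

```python
def covered_positions(protein_seq, peptides):
    """Return the set of residue indices covered by any of the peptides."""
    covered = set()
    for pep in peptides:
        start = protein_seq.find(pep)
        while start != -1:
            covered.update(range(start, start + len(pep)))
            start = protein_seq.find(pep, start + 1)
    return covered


def sequence_coverage(protein_seq, peptides):
    """Sequence coverage in percent of the protein length."""
    if not protein_seq:
        return 0.0
    return 100.0 * len(covered_positions(protein_seq, peptides)) / len(protein_seq)


# Invented toy protein and peptide identifications, for illustration only.
protein = "MKAAAKGGGRLLLKEEERPP"
tryptic_peptides = ["AAAK", "EEER"]
argc_like_peptides = ["MKAAAKGGGR"]

coverage_trypsin = sequence_coverage(protein, tryptic_peptides)
coverage_combined = sequence_coverage(protein, tryptic_peptides + argc_like_peptides)
coverage_gain = coverage_combined - coverage_trypsin
print(coverage_trypsin, coverage_combined, coverage_gain)
```

In this toy case the longer ArgC-like peptide covers residues that neither tryptic peptide reaches, which is the kind of increment summarized per protein in the boxplots.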


The results show that chemical blocking of lysine prior to digestion with trypsin, independent of the choice of chemical reagent, is an excellent alternative to an ArgC digestion. The enzyme ArgC shows inferior results in each assessed parameter. Although dimethylation is favorable because of the lack of post-digestion modifications of peptides, the acylation of proteins, and especially propionylation, is superior with respect to specificity, proteome coverage and complementarity to trypsin. Furthermore, it is one of the most economical modifications. Therefore, the results from the PA protocol are compared below with a conventional trypsin digestion of the same Escherichia coli sample.


Figure 3. Comparison of the results on protein and proteome level. (A) Intensity coverage after tryptic digestion of lysine-modified human apotransferrin (lines 1-4) and unmodified protein (line 6) as well as after ArgC digestion (line 5). The propionylated protein gives the most specific results. All modifications are more specific than ArgC digestion. (B) The numbers of unassigned and assigned peptides (above the identity threshold) from the Mascot MS/MS queries are reproducibly high across the triplicates of each modification. Among the modification protocols, the highest number of assigned peptides is observed for the PA protocol. (C) Boxplot combined with a swarm plot visualizing the distribution of the sequence coverages of the proteins identified with a conventional trypsin digestion (blue box). Combining the sequence information obtained from trypsin digestion with that of each of the ArgC-like digestions results in all cases in a shift to higher sequence coverages, with the highest value for the PA protocol (red box).


2.3 Comparison of ArgC-like (PA) and trypsin digestion

The propionylation of lysines prior to digestion with trypsin proved to be the best alternative to an enzymatic ArgC digestion. In the following, the results of this protocol are compared with a conventional trypsin digestion of the same Escherichia coli lysate. To obtain comparable results, the unmodified samples were also immobilized on magnetic beads prior to tryptic digestion. MS measurements and data analysis were carried out as described in the previous part. The matched peptides and proteins result from a Mascot search that considered one missed cleavage, propionylation of the protein N-terminus as a variable modification and propionylation of lysine as a fixed modification. Again, only peptides with an ion score above the identity threshold (~13) were considered for protein identification. Figure 4A displays the Venn diagram of the number of unique proteins identified via tryptic digestion of modified and unmodified Escherichia coli lysate and the number of overlapping proteins identified by both methods. 1187 proteins were found in both digests, 252 were identified only after modification with PA and 100 only with trypsin. Combining both results, 1539 proteins could be identified. Although the highest number of peptides is identified in the tryptic digest (see Table 2), the number of identified proteins is about 10% higher with the PA protocol, probably because of a slight reduction in sample complexity.

This assumption is supported by the scatter plot in Figure 4B. Each spot represents a protein; the x- and y-values show the mean number of peptides assigned to this protein after tryptic digestion without and with PA acylation, respectively. The size and darkness of the spots reflect the abundance of the proteins, which is estimated by the normalized spectral abundance factor (NSAF).28 Even though a first look at the scatter plot gives a different impression, the majority of the spots are in fact above the dashed line (519 vs. 432), indicating that proteins are identified with a higher number of assigned peptides in the ArgC-like than in the tryptic digestion. The large spread in the scatter below the line is presumably caused by the most abundant proteins, which have many more assigned peptides after digestion with trypsin. Elongation factor Tu 1 (EFTU1_ECOLI), for example, yields approximately 400 more assigned peptides after tryptic digestion than after ArgC-like digestion, which increases the complexity of the peptide sample without promoting protein identification. The lower number of assigned peptides for highly abundant proteins seems to be one reason for the higher protein identification rate with the PA protocol.
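For readers unfamiliar with the NSAF, the following Python sketch shows how it is commonly computed: the spectral count of each protein is divided by its length, and the result is normalized by the sum of these ratios over all proteins. The sketch also counts how many proteins lie above or below the diagonal of a scatter plot such as Figure 4B, assuming that the dashed line is y = x. All protein names, counts and lengths are invented, and the function names are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein.

    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def count_about_diagonal(points):
    """Count points above (y > x) and below (y < x) the line y = x."""
    above = sum(1 for x, y in points if y > x)
    below = sum(1 for x, y in points if y < x)
    return above, below


# Invented spectral counts and protein lengths, for illustration only.
counts = {"prot_A": 100, "prot_B": 10, "prot_C": 20}
lengths = {"prot_A": 400, "prot_B": 200, "prot_C": 100}
abundance = nsaf(counts, lengths)

# Invented (tryptic, ArgC-like) mean peptide counts per protein.
points = [(3, 5), (10, 4), (2, 2), (1, 6)]
above, below = count_about_diagonal(points)
print(abundance, above, below)
```

Proteins that fall exactly on the diagonal are counted in neither group, which is one possible convention; the text does not state how such ties were handled.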


Figure 4. Complementarity of results. (A) Venn diagram of the number of unique proteins identified via tryptic digestion of PA-modified (252) and unmodified (100) Escherichia coli lysate and the number of overlapping proteins identified by both methods (1187). (B) Scatter plot. Each dot represents the mean number of peptides per protein generated after tryptic (x-value) and ArgC-like (y-value) digestion. The size and darkness of each spot reflect the abundance of the protein, which is estimated by the normalized spectral abundance factor (NSAF).28
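The counts in the Venn diagram can be cross-checked with simple set arithmetic. The Python sketch below first shows the set operations on two hypothetical accession lists and then recomputes the totals from the three counts given in the text (1187 shared, 252 only with PA, 100 only with trypsin). The accession names and the function name are placeholders.

```python
def venn_counts(set_a, set_b):
    """Return the sizes (only_a, shared, only_b, union) for two identifier sets."""
    shared = set_a & set_b
    return len(set_a - set_b), len(shared), len(set_b - set_a), len(set_a | set_b)


# Hypothetical accession lists, for illustration only.
pa_proteins = {"P1", "P2", "P3", "P4"}
trypsin_proteins = {"P3", "P4", "P5"}
toy_counts = venn_counts(pa_proteins, trypsin_proteins)

# Consistency check with the counts reported in the text.
only_pa, shared, only_trypsin = 252, 1187, 100
total_pa = shared + only_pa                    # proteins found with the PA protocol
total_trypsin = shared + only_trypsin          # proteins found with trypsin alone
combined = shared + only_pa + only_trypsin     # union of both protocols
relative_gain = 100.0 * (total_pa - total_trypsin) / total_trypsin
print(toy_counts, total_pa, total_trypsin, combined, round(relative_gain, 1))
```

The totals agree with Table 2 (1439 and 1287 proteins) and with the combined number of 1539 proteins; the relative gain evaluates to roughly 11.8%, in line with the "about 10%" stated in the text.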


Even though it may be unexpected or even counterintuitive given the reduced number of cleavage sites for ArgC-like compared to tryptic digestion, the distributions of the peptide masses for the two digestion reactions do not differ strongly (see Figure 5A); the composition of the fragments was therefore analyzed further. For all our data evaluations one missed cleavage was taken into consideration. Since ArgC-like fragments are inherently longer than tryptic ones, the number of peptides with one missed cleavage that fall into the optimal mass range for fragmentation should be small. Figure 5C visualizes the proportion of peptides with a missed cleavage site among the identified peptides as a function of the mass range. Approximately 14% of the matched tryptic peptides exhibit one missed cleavage site, making up to 12% of the total measured intensity. For the mimicked ArgC peptides the values are only about 4% for both number and intensity. The iceLogos29 in Figure 5E, which show the sequence context of tryptic peptides with a missed cleavage, point to two main factors that decrease the cleavage efficiency of trypsin. One is an acidic environment caused by the presence of aspartic and glutamic acid around the cleavage site, by which lysine is more affected than arginine. The second is the presence of an additional cleavage site, in particular arginine, in the vicinity, which also disrupts the cleavage process. The lower number of peptides with missed cleavages is thus a direct result of blocking the lysine residues, since cleavage at arginine seems to be more robust against an acidic environment and is mainly affected by an adjacent second arginine. Both influencing factors are in accordance with results reported before.30–32

The propionylated peptides also differ from common tryptic peptides in their fragmentation behavior. The quality of peptide fragmentation was estimated using Percolator-adjusted Mascot ion scores as an indicator. The chart in Figure 5B shows a broader ion score distribution that is shifted to the right for the modified peptides compared to the trypsin distribution. Several factors are known to influence the quality of fragment ion spectra and thus the ion scores: the length of the peptide (m/z value), its charge and its composition.33 Regarding composition, the number of basic residues in the chain can lead to unassignable fragments. The influence on the fragmentation behavior is illustrated by violin plots showing the distribution of ion scores in the defined mass ranges for both protocols (Figure 5D). Whereas no significant difference is observable in the lower mass range, the distributions differ more strongly in the higher mass range, where the ion score distribution for the PA protocol is considerably shifted to higher values compared to that of trypsin.
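The difference between tryptic and ArgC-like cleavage, and the way missed cleavages are counted, can be made concrete with a small in-silico digestion in Python. The sketch assumes the widely used simplified rules (cleavage C-terminal to K or R for trypsin, C-terminal to R only for ArgC-like digestion, no cleavage before proline); these rules, the toy sequence, the peptide list and the function names are assumptions for illustration and are not taken from the search settings of this study.

```python
import re


def cleave(sequence, residues):
    """Split a sequence C-terminal to the given residues, except before proline."""
    pattern = rf"(?<=[{residues}])(?!P)"
    return [fragment for fragment in re.split(pattern, sequence) if fragment]


def missed_cleavages(peptide, residues):
    """Count internal cleavage sites that were left uncleaved in a peptide."""
    return sum(
        1
        for i, aa in enumerate(peptide[:-1])
        if aa in residues and peptide[i + 1] != "P"
    )


def fraction_with_missed(peptides, residues):
    """Percentage of peptides that contain at least one missed cleavage."""
    if not peptides:
        return 0.0
    hits = sum(1 for pep in peptides if missed_cleavages(pep, residues) > 0)
    return 100.0 * hits / len(peptides)


# Invented toy sequence, for illustration only.
toy_protein = "MKTAYIAKQRQISFVKSHFSR"
tryptic = cleave(toy_protein, "KR")      # cleavage after K and R
argc_like = cleave(toy_protein, "R")     # lysine blocked, cleavage after R only

# Invented list of "identified" tryptic peptides, one with a missed cleavage.
identified = ["TAYIAK", "TAYIAKQR", "QISFVK", "SHFSR"]
missed_percent = fraction_with_missed(identified, "KR")
print(tryptic, argc_like, missed_percent)
```

With lysine excluded as a cleavage site, the same sequence yields fewer and longer peptides, and a peptide such as TAYIAKQR no longer counts as containing a missed cleavage.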


Figure 5. Comparison of results from tryptic digestions of modified (PA) and unmodified Escherichia coli lysate. (A) Calculated m/z values of matched peptides. (B) Distribution of the Percolator-adjusted ion scores. (C) Fractions of peptides with one missed cleavage in the same mass ranges as defined before. (D) Each violin consists of two halves: the left half visualizes the ion score distribution of the trypsin protocol and the right half that of the PA protocol. Five distinct mass ranges are defined, with one violin per mass range showing the ion score distributions of both protocols within that range. (E) Sequence context of all peptides with a missed cleavage (trypsin: n = 709, PA: n = 192). IceLogo plots show the occurrence of particular residues normalized to their occurrence in the input protein database. For the iceLogo analysis, all peptides with one missed cleavage for each protocol (left: trypsin, right: PA) were used. The Escherichia coli SwissProt database was used as background.


Conclusion

The unsatisfactory results after digestion with ArgC, which are mainly caused by the lack of cleavage specificity, were the starting point for the search for alternative methods to generate ArgC-like peptides. The chemical modification of the protein amino groups and subsequent digestion with trypsin were tested with different reagents after binding the proteins covalently to magnetic beads to simplify purification between the steps. Propionic acid anhydride gives the best results among the derivatization reagents, and compared to an ArgC digestion the results are superior with respect to cleavage efficiency, proteome coverage and complementarity to trypsin. Although ArgC-like digestions are more time-consuming, laborious and slightly more expensive than a direct tryptic digestion, the direct comparison of the results reveals that both methods are comparable and that, unexpectedly, the ArgC-like results are even superior in many of the assessed parameters. The main reasons for the superiority of ArgC-like digestions are a higher number of identified proteins, a lower occurrence of peptides with missed cleavages and higher average ion scores, which translate into higher proteome coverage, cleavage efficiency and fragmentation efficiency.


SUPPORTING INFORMATION

The following files are available free of charge at the ACS website http://pubs.acs.org:

Supplementary_1 (pdf file): This document contains the cited amino modification protocols, MALDI-MS spectra of substance P after applying the protocols to it, experimental worksheets, the results of the optimization of the reaction conditions and the MALDI-MS spectra of modified and digested alcohol dehydrogenase before and after optimizing the reaction conditions. Additionally, it contains the Mascot search parameters and the KNIME workflow used to create Figure 3.

Peptide_list (Excel document): Excel document with the identified peptides after performing the first Mascot database search.

AUTHOR INFORMATION

Corresponding Author
*Michael Karas, [email protected]
*Vahid Golghalyani, [email protected]


Author Contributions
The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Funding Sources
IW was supported by Deutsche Forschungsgemeinschaft (SFB815, project Z1).

ABBREVIATIONS

PA, propionic acid anhydride; DM, dimethyl-labeling; NHS, Sulfo-NHS-Acetate; DEPC, diethyl pyrocarbonate

REFERENCES

(1) Guo, X.; Trudgian, D. C.; Lemoff, A.; Yadavalli, S.; Mirzaei, H. Confetti: A Multiprotease Map of the HeLa Proteome for Comprehensive Proteomics. Molecular & Cellular Proteomics 2014, 13 (6), 1573–1584.
(2) Choudhary, G.; Wu, S.-L.; Shieh, P.; Hancock, W. S. Multiple Enzymatic Digestion for Enhanced Sequence Coverage of Proteins in Complex Proteomic Mixtures Using Capillary LC with Ion Trap MS/MS. Journal of Proteome Research 2003, 2 (1), 59–67.
(3) Swaney, D. L.; Wenger, C. D.; Coon, J. J. Value of Using Multiple Proteases for Large-Scale Mass Spectrometry-Based Proteomics. Journal of Proteome Research 2010, 9 (3), 1323–1329.
(4) Giansanti, P.; Tsiatsiani, L.; Low, T. Y.; Heck, A. J. R. Six Alternative Proteases for Mass Spectrometry–based Proteomics beyond Trypsin. Nature Protocols 2016, 11 (5), 993–1006.
(5) Krueger, R. J.; Hobbs, T. R.; Mihal, K. A.; Tehrani, J.; Zeece, M. G. Analysis of Endoproteinase Arg C Action on Adrenocorticotrophic Hormone by Capillary Electrophoresis and Reversed-Phase High-Performance Liquid Chromatography. J. Chromatogr. 1991, 543 (2), 451–461.
(6) Frese, C. K.; Altelaar, A. F. M.; Hennrich, M. L.; Nolting, D.; Zeller, M.; Griep-Raming, J.; Heck, A. J. R.; Mohammed, S. Improved Peptide Identification by Targeted Fragmentation Using CID, HCD and ETD on an LTQ-Orbitrap Velos. Journal of Proteome Research 2011, 10 (5), 2377–2388.
(7) Geoghegan, K. F. Modification of Amino Groups. In Current Protocols in Protein Science; Coligan, J. E., Dunn, B. M., Speicher, D. W., Wingfield, P. T., Eds.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1996; p 15.2.1-15.2.18.
(8) Lundblad, R. L. Chemical Reagents for Protein Modification, Fourth edition; CRC Press, Taylor & Francis Group: Boca Raton, 2014.
(9) Garcia, B. A.; Mollah, S.; Ueberheide, B. M.; Busby, S. A.; Muratore, T. L.; Shabanowitz, J.; Hunt, D. F. Chemical Derivatization of Histones for Facilitated Analysis by Mass Spectrometry. Nat. Protocols 2007, 2 (4), 933–938.
(10) Zhang, X.; Jin, Q. K.; Carr, S. A.; Annan, R. S. N-Terminal Peptide Labeling Strategy for Incorporation of Isotopic Tags: A Method for the Determination of Site-Specific Absolute Phosphorylation Stoichiometry. Rapid Communications in Mass Spectrometry 2002, 16 (24), 2325–2332.
(11) Hsu, J.-L.; Huang, S.-Y.; Chow, N.-H.; Chen, S.-H. Stable-Isotope Dimethyl Labeling for Quantitative Proteomics. Analytical Chemistry 2003, 75 (24), 6843–6852.
(12) She, Y.-M.; Rosu-Myles, M.; Walrond, L.; Cyr, T. D. Quantification of Protein Isoforms in Mesenchymal Stem Cells by Reductive Dimethylation of Lysines in Intact Proteins. PROTEOMICS 2012, 12 (3), 369–379.
(13) Liu, M.-C.; Lin, Y.-R.; Huang, M.-F.; Tsai, D.-C.; Liang, S.-S. Mass Spectrometry Signal Enhancement by Reductive Amination. International Journal of Mass Spectrometry 2015, 387, 16–23.
(14) Krusemark, C. J.; Frey, B. L.; Smith, L. M.; Belshaw, P. J. Complete Chemical Modification of Amine and Acid Functional Groups of Peptides and Small Proteins. In Gel-Free Proteomics; Gevaert, K., Vandekerckhove, J., Eds.; Humana Press: Totowa, NJ, 2011; Vol. 753, pp 77–91.
(15) Boersema, P. J.; Raijmakers, R.; Lemeer, S.; Mohammed, S.; Heck, A. J. R. Multiplex Peptide Stable Isotope Dimethyl Labeling for Quantitative Proteomics. Nat. Protocols 2009, 4 (4), 484–494.
(16) Schilling, O.; Barré, O.; Huesgen, P. F.; Overall, C. M. Proteome-Wide Analysis of Protein Carboxy Termini: C Terminomics. Nature Methods 2010, 7 (7), 508–511.
(17) McDonald, L.; Beynon, R. J. Positional Proteomics: Preparation of Amino-Terminal Peptides as a Strategy for Proteome Simplification and Characterization. Nat Protoc 2006, 1 (4), 1790–1798.
(18) Mühlrad, A.; Hegyi, G.; Horányi, M. Studies on the Properties of Chemically Modified Actin III. Carbethoxylation. Biochimica et Biophysica Acta (BBA) - Protein Structure 1969, 181 (1), 184–190.
(19) Willard, B. B.; Kinter, M. Effects of the Position of Internal Histidine Residues on the Collision-Induced Fragmentation of Triply Protonated Tryptic Peptides. J. Am. Soc. Mass Spectrom. 2001, 12 (12), 1262–1271.
(20) Miles, E. W. [41] Modification of Histidyl Residues in Proteins by Diethylpyrocarbonate. In Methods in Enzymology; Elsevier, 1977; Vol. 47, pp 431–442.
(21) Ong, S.-E.; Mann, M. Mass Spectrometry–based Proteomics Turns Quantitative. Nature Chemical Biology 2005, 1 (5), 252–262.
(22) Manza, L. L.; Stamer, S. L.; Ham, A.-J. L.; Codreanu, S. G.; Liebler, D. C. Sample Preparation and Digestion for Proteomic Analyses Using Spin Filters. PROTEOMICS 2005, 5 (7), 1742–1745.
(23) Hughes, C. S.; Foehr, S.; Garfield, D. A.; Furlong, E. E.; Steinmetz, L. M.; Krijgsveld, J. Ultrasensitive Proteome Analysis Using Paramagnetic Bead Technology. Molecular Systems Biology 2014, 10 (10), 757–757.
(24) Holman, J. D.; Tabb, D. L.; Mallick, P. Employing ProteoWizard to Convert Raw Mass Spectrometry Data. In Current Protocols in Bioinformatics; Bateman, A., Pearson, W. R., Stein, L. D., Stormo, G. D., Yates, J. R., Eds.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2014; p 13.24.1-13.24.9.
(25) Strohalm, M.; Kavan, D.; Novák, P.; Volný, M.; Havlícek, V. mMass 3: A Cross-Platform Software Environment for Precise Analysis of Mass Spectrometric Data. Anal. Chem. 2010, 82 (11), 4648–4651.
(26) Berthold, M. R.; Cebron, N.; Dill, F.; Gabriel, T. R.; Kötter, T.; Meinl, T.; Ohl, P.; Thiel, K.; Wiswedel, B. KNIME - the Konstanz Information Miner: Version 2.0 and beyond. ACM SIGKDD Explorations Newsletter 2009, 11 (1), 26.
(27) Käll, L.; Canterbury, J. D.; Weston, J.; Noble, W. S.; MacCoss, M. J. Semi-Supervised Learning for Peptide Identification from Shotgun Proteomics Datasets. Nature Methods 2007, 4 (11), 923–925.
(28) Paoletti, A. C.; Parmely, T. J.; Tomomori-Sato, C.; Sato, S.; Zhu, D.; Conaway, R. C.; Conaway, J. W.; Florens, L.; Washburn, M. P. Quantitative Proteomic Analysis of Distinct Mammalian Mediator Complexes Using Normalized Spectral Abundance Factors. Proc. Natl. Acad. Sci. U.S.A. 2006, 103 (50), 18928–18933.
(29) Colaert, N.; Helsens, K.; Martens, L.; Vandekerckhove, J.; Gevaert, K. Improved Visualization of Protein Consensus Sequences by iceLogo. Nature Methods 2009, 6 (11), 786–787.
(30) Monigatti, F.; Berndt, P. Algorithm for Accurate Similarity Measurements of Peptide Mass Fingerprints and Its Application. J. Am. Soc. Mass Spectrom. 2005, 16 (1), 13–21.
(31) Siepen, J. A.; Keevil, E.-J.; Knight, D.; Hubbard, S. J. Prediction of Missed Cleavage Sites in Tryptic Peptides Aids Protein Identification in Proteomics. J. Proteome Res. 2007, 6 (1), 399–408.
(32) Yen, C.-Y.; Russell, S.; Mendoza, A. M.; Meyer-Arendt, K.; Sun, S.; Cios, K. J.; Ahn, N. G.; Resing, K. A. Improving Sensitivity in Shotgun Proteomics Using a Peptide-Centric Database with Reduced Complexity: Protease Cleavage and SCX Elution Rules from Data Mining of MS/MS Spectra. Anal. Chem. 2006, 78 (4), 1071–1084.
(33) Dongré, A. R.; Jones, J. L.; Somogyi, Á.; Wysocki, V. H. Influence of Peptide Composition, Gas-Phase Basicity, and Chemical Modification on Fragmentation Efficiency: Evidence for the Mobile Proton Model. Journal of the American Chemical Society 1996, 118 (35), 8365–8374.

