
Technical Note

Reversible Lysine Derivatization Enables Improved Arg-C (iArg-C) Digestion, a Highly Specific Arg-C Digestion Using Trypsin
Zhen Wu, Jichang Huang, Jianan Lu, and Xumin Zhang
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b04410 • Publication Date (Web): 20 Dec 2017


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Reversible Lysine Derivatization Enables Improved Arg-C (iArg-C) Digestion, a Highly Specific Arg-C Digestion Using Trypsin

Zhen Wu1, Jichang Huang1, Jianan Lu1, Xumin Zhang*,1

1 State Key Laboratory of Genetic Engineering, Department of Biochemistry, School of Life Sciences, Fudan University, Shanghai 200438, China

Keywords: Citraconylation / proteomics / LC-MS/MS / Arg-C / Lys-C / trypsin

*Corresponding author:
Dr. Xumin Zhang
E-mail: [email protected]
Tel: +86 21 51630575

ACS Paragon Plus Environment

Abstract

The bottom-up proteomics approach has become an important strategy in diverse areas of biological research, and enzymatic digestion is essential to this technology. The endopeptidase Arg-C, which catalyzes the hydrolytic cleavage of peptide bonds C-terminal to arginine, could be an important protease in bottom-up proteomics. However, it has seldom been applied because of its low specificity and high cost. In this report, a reversible amine derivatization method (citraconylation and decitraconylation) was introduced and optimized to achieve a true Arg-C digestion using trypsin. The combination of reversible derivatization and trypsin digestion (termed iArg-C digestion, for improved Arg-C digestion) resulted in 64.2% more peptide identifications (11,925 ± 199 vs 7,262 ± 59) and significantly higher cleavage specificity (95.6% vs 73.6%) than conventional Arg-C digestion. Comparison of iArg-C digestion with the widely used trypsin and Lys-C digestions revealed that iArg-C performed slightly better than Lys-C, although not as well as trypsin. Therefore, the well-established iArg-C digestion method is a promising approach for proteomics studies and could be used as the first-choice alternative to trypsin digestion in order to achieve higher proteome coverage. Data are available via ProteomeXchange with identifier PXD007994.

Introduction

During the last two decades, proteomics has become the core technology for high-throughput protein characterization and quantification. However, unlike genomics studies, proteomics studies often suffer from low coverage at both the protein and the amino acid level.1

In bottom-up proteomics analysis, samples are enzymatically digested, then fractionated and analyzed by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS).2 Trypsin is by far the most frequently used protease in bottom-up proteomics studies because it yields MS-favorable proteolytic peptides with high specificity at reasonable cost.3-5 It has been reported that the most used proteases in proteomics studies rank as trypsin > Lys-C > chymotrypsin > Glu-C > pepsin, and that 96% of studies used trypsin for protein digestion.4 However, trypsin also has limitations. For example, more than half of tryptic peptides are too small (≤ 6 residues) for MS identification, and thus trypsin alone cannot achieve high sequence coverage.6
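To illustrate why peptide length matters here, the following minimal sketch performs an in-silico digestion of a toy sequence and counts the peptides of six residues or fewer. The sequence, the function names `digest` and `short_fraction`, and the simplified rule (cleave after every listed residue, with no proline exception and no missed cleavages) are illustrative assumptions and are not taken from the study.

```python
def digest(sequence, cleavage_residues):
    """Split `sequence` C-terminal to every residue in `cleavage_residues`.

    Simplified rule: no proline exception and no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleavage_residues:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # trailing peptide without a cleavage residue
        peptides.append(sequence[start:])
    return peptides


def short_fraction(peptides, max_length=6):
    """Fraction of peptides with at most `max_length` residues."""
    return sum(len(p) <= max_length for p in peptides) / len(peptides)


# Hypothetical toy sequence, chosen only for illustration.
TOY_SEQUENCE = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

tryptic = digest(TOY_SEQUENCE, "KR")     # trypsin-like: cleave after Lys and Arg
arg_c_like = digest(TOY_SEQUENCE, "R")   # Arg-C-like: cleave after Arg only
```

For this toy sequence the trypsin-like rule gives seven peptides, six of which have six residues or fewer, whereas the Arg-C-like rule gives four longer peptides, only one of which is that short.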

Multi-enzyme digestion approaches have been developed to improve proteome coverage.7-10 Lys-C cleaves peptide bonds C-terminal to Lys,11,12 whereas Arg-C cleaves peptide bonds C-terminal to Arg.13,14 Because both proteases produce peptides with a basic residue at the C-terminus, they are natural candidates as alternatives to trypsin. In addition, both proteases produce longer and more highly charged peptides than trypsin, which are favored by ETD analysis and middle-down proteomics.15 However, Arg-C has found far fewer applications than Lys-C because of its unsatisfactory specificity and high cost (> 30 times more expensive than trypsin).5,14 Amine derivatization approaches have been adopted to achieve Arg-C-like digestion with trypsin; the most used strategies are irreversible, including dimethylation,16,17 propionylation18-20 and acetylation.21,22

A special reversible amine derivatization approach has been reported using citraconylation and decitraconylation.23-25 Citraconyl groups react with amines at high pH (pH > 8), and the reaction can be reversed at low pH (pH < 4). In proteomics applications, proteins are first citraconylated and digested with trypsin, and the peptide solution is then adjusted to acidic pH to remove the citraconyl groups (Figure 1). By this means, trypsin works exactly like Arg-C and Lys is restored to its original form, which facilitates studies on Lys modifications, ETD analysis, middle-down proteomics and also consecutive proteolytic digestion similar to the method described by Wiśniewski et al.26 The reversible derivatization has been tested by a few groups.27,28 However, these studies focused on a single protein analyzed by matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS), and the reaction conditions varied greatly (10 µL/mL solution vs 3.0 g/g protein); one study even used Tris-HCl buffer,27 which contains a primary amine and is therefore not compatible with an amine derivatization reaction. A systematic investigation of the reaction, especially for proteome-scale studies, has thus been lacking.

In this study, we systematically optimized the reversible derivatization method and evaluated iArg-C digestion in large-scale proteomics studies. We found that iArg-C digestion performs much better than conventional Arg-C digestion and slightly better than Lys-C digestion. We conclude that iArg-C digestion can be applied as the first-choice alternative to the most used trypsin digestion.

EXPERIMENTAL SECTION

Materials and Chemicals

Microcon YM-10 filters (10-kDa cutoff) were purchased from Merck-Millipore (Bedford, MA). Dithiothreitol (DTT), acrylamide, triethylammonium bicarbonate (TEAB), guanidine hydrochloride, citraconic anhydride and trifluoroacetic acid (TFA) were purchased from Sigma-Aldrich (St. Louis, MO, USA). Sodium hydroxide was from Sinopharm (Shanghai, China). Mass spectrometry grade trypsin was obtained from Promega (cat. no. V528A) (Madison, WI), MS grade Lys-C was from Wako Chemicals (cat. no. 129-02541) (Osaka, Japan) and Arg-C was obtained from Roche (cat. no. 11.370.529.001) (Indianapolis, IN). All other reagents and solvents were used without further purification.

Citraconylation reaction

After reduction and alkylation, E. coli or HeLa cell proteins in lysis buffer (4 M guanidine hydrochloride and 100 mM TEAB, pH 8.7) were subjected to citraconylation. Citraconic anhydride and NaOH were added to the solution in five portions to complete the citraconylation reaction. For example, to reach a final concentration of 200 mM citraconic anhydride, 5 µL of 2 M citraconic anhydride (dissolved in acetonitrile (ACN)) was added to 200 µL of protein sample, immediately followed by 5 µL of 4 M NaOH, and the solution was kept at 25 °C in a Thermomixer with shaking at 800 rpm for 10 min. This procedure was repeated four more times. Finally, the solution was kept for 1 h to complete the citraconylation reaction.
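The reagent arithmetic in the example above can be checked with a short calculation. This is a minimal sketch; the function name is hypothetical, and it assumes that volumes are additive and treats the concentration as nominal (total anhydride added divided by final volume, ignoring consumption and hydrolysis).

```python
def nominal_final_concentration_mM(sample_uL, stock_M, reagent_uL, base_uL, additions):
    """Nominal reagent concentration after repeated reagent + base additions.

    Assumes additive volumes and ignores reagent consumption or hydrolysis.
    """
    total_umol = stock_M * reagent_uL * additions                # mol/L x µL = µmol
    final_volume_uL = sample_uL + additions * (reagent_uL + base_uL)
    return 1000.0 * total_umol / final_volume_uL                 # µmol/µL = M, x1000 = mM


# Values from the protocol: 200 µL sample, five additions of
# 5 µL 2 M citraconic anhydride, each followed by 5 µL 4 M NaOH.
anhydride_mM = nominal_final_concentration_mM(200, 2.0, 5, 5, 5)
naoh_umol = 4.0 * 5 * 5        # total NaOH added, in µmol
anhydride_umol = 2.0 * 5 * 5   # total citraconic anhydride added, in µmol
```

Under these assumptions the five additions deliver 50 µmol of anhydride into a final volume of 250 µL, which is the stated 200 mM, together with 100 µmol of NaOH (two equivalents per anhydride).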

Digestion

The FASP method was adapted for digestion in Microcon YM-10 filters.29 After three rounds of buffer exchange with digestion buffer (100 mM TEAB, pH 8.0), digestion was carried out at 37 °C for 12 h using trypsin (enzyme/protein ratio of 1:50). After digestion, the solution was collected by filtration and the filter was washed twice with 10% ACN; all filtrates were pooled and vacuum-dried to remove ACN.

Decitraconylation reaction

TFA was added to a final concentration of 1% (~pH 2.0) and the acidified solution was kept at room temperature for 2 h to complete the decitraconylation reaction.

Nanoflow LC-ESI-MS/MS

LC-ESI-MS/MS analysis was performed using a nanoflow EASY-nLC 1000 system coupled to an LTQ Orbitrap Elite mass spectrometer. A two-column system was adopted for all analyses. Samples were first loaded onto an Acclaim PepMap100 C18 Nano Trap Column (5 µm, 100 Å, 100 µm i.d. × 2 cm; Thermo Fisher Scientific, Sunnyvale, CA) and then analyzed on an Acclaim PepMap RSLC C18 column (2 µm, 100 Å, 75 µm i.d. × 25 cm; Thermo Fisher Scientific, Sunnyvale, CA). The mobile phases consisted of Solvent A (0.1% formic acid) and Solvent B (0.1% formic acid in ACN). The peptides were eluted using the following gradient: 2-5% B in 3 min, 5-28% B in 160 min, 28-35% B in 5 min, 35-90% B in 2 min and 90% B for 10 min, at a flow rate of 200 nL/min. Data-dependent analysis was employed in MS analysis: the 15 most abundant ions in each MS scan were automatically selected and fragmented in HCD mode. All experiments were carried out in duplicate.
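The gradient described above can be written as a small table to make the total run time explicit. The sketch below is illustrative only: the names `GRADIENT`, `total_minutes` and `percent_b_at` are hypothetical, and linear ramps within each segment are an assumption, since the text gives only the segment endpoints.

```python
# Each segment: (start %B, end %B, duration in minutes), as listed in the text.
GRADIENT = [
    (2, 5, 3),
    (5, 28, 160),
    (28, 35, 5),
    (35, 90, 2),
    (90, 90, 10),
]


def total_minutes(segments=GRADIENT):
    """Total gradient length in minutes."""
    return sum(duration for _, _, duration in segments)


def percent_b_at(t_min, segments=GRADIENT):
    """Solvent B percentage at time `t_min`, assuming linear ramps."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start_b, end_b, duration in segments:
        if t_min <= elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return start_b + (end_b - start_b) * fraction
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")
```

The five segments add up to a 180 min method, of which 160 min is the main 5-28% B separation ramp.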

Data Analysis

The raw data were analyzed with Proteome Discoverer (version 1.4, Thermo Fisher Scientific) using an in-house Mascot server (version 2.3, Matrix Science, London, UK).30 The E. coli protein database (20161228, 4,304 sequences) and the human protein database (20160213, 20,186 sequences) were downloaded from UniProt. Data were searched using the following parameters: up to two missed cleavage sites allowed; mass tolerance of 10 ppm for MS and 0.05 Da for MS/MS fragment ions; propionamidation on cysteine as a fixed modification; oxidation on methionine as a variable modification. Additional enzyme-specific parameters were as follows: for citraconylated samples, Arg-C/P as the enzyme, with protein N-terminal citraconylation and citraconylation on lysine as variable modifications; for Arg-C, Lys-C or trypsin digested samples, Arg-C/P, Lys-C/P or trypsin/P as the enzyme, respectively. For the analysis of wrong cleavage sites, the enzymes were changed to the corresponding semi-specific enzyme. The Target Decoy PSM Validator incorporated in Proteome Discoverer and the Mascot expectation value were used to validate the search results, and only hits with FDR ≤ 0.01 and Mascot expectation value ≤ 0.05 were accepted for discussion. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRoteomics IDEntifications (PRIDE) partner repository with the dataset identifier PXD007994.31,32
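As a rough illustration of two of the thresholds listed above, the sketch below expresses the 10 ppm precursor tolerance and the acceptance criteria (FDR ≤ 0.01 and expectation value ≤ 0.05) as plain functions. The function names and example m/z values are hypothetical; this is not how Proteome Discoverer or Mascot implement these checks internally.

```python
def within_ppm(observed_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the relative mass error is within `tolerance_ppm`."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tolerance_ppm


def accept_hit(fdr, expectation_value, max_fdr=0.01, max_expectation=0.05):
    """Apply both acceptance thresholds stated in the text."""
    return fdr <= max_fdr and expectation_value <= max_expectation
```

For example, at m/z 500 a 10 ppm window corresponds to ± 0.005, so an observed value of 500.004 passes while 500.006 does not.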


RESULTS AND DISCUSSION

Optimization of citraconylation and decitraconylation reactions

To determine the optimal concentration of citraconic anhydride, E. coli proteins were derivatized with citraconic anhydride at different concentrations: 0, 100, 200 and 400 mM.

It is inappropriate to judge the derivatization efficiency by the percentage of identified citraconylated Lys, since citraconyl groups are probably lost during LC-MS/MS analysis (~pH 2.7). Because citraconylated Lys cannot be cleaved by trypsin, the number of K-end peptides (peptides with a C-terminal Lys) in a well-derivatized sample should be far lower than in an underivatized sample. Therefore, the percentage of K-end peptides was used to evaluate the derivatization efficiency.
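The K-end metric described above amounts to a simple count over the identified peptide sequences. The sketch below is a hedged illustration; the function name and the example peptides are hypothetical, and it counts every peptide ending in Lys without treating protein C-terminal peptides specially, which the text does not specify.

```python
def k_end_percentage(peptides):
    """Percentage of identified peptides whose C-terminal residue is Lys (K)."""
    if not peptides:
        return 0.0
    k_end = sum(1 for peptide in peptides if peptide.endswith("K"))
    return 100.0 * k_end / len(peptides)


# Hypothetical identifications: two peptides end in K, two end in R.
example_peptides = ["TAYIAK", "QLEER", "SHFSR", "QISFVK"]
example_percentage = k_end_percentage(example_peptides)
```

A value close to zero indicates that almost no cleavage occurred after Lys, which is the expected outcome when the Lys side chains are blocked efficiently.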

Data were searched using trypsin as the enzyme with up to four missed cleavage sites (allowing more than four missed cleavages did not lead to more identified peptides). As summarized in Figure 2A, 200 and 400 mM citraconic anhydride resulted in 1.4% and 0.9% K-end peptides, respectively, much lower than in the underivatized sample (42.5%), indicating that high citraconylation efficiency can be achieved when the citraconic anhydride concentration is at least 200 mM.

The identification capacity for the desired peptides was also examined. For this purpose, the underivatized sample was searched using trypsin as the enzyme and the derivatized samples were searched using Arg-C as the enzyme; for all samples, up to two missed cleavage sites were allowed. As shown in Figure 2B, 200 mM citraconic anhydride resulted in the highest number of identifications among the derivatized samples. Therefore, 200 mM citraconic anhydride was chosen for derivatization.

It is well known that citraconyl groups can be completely removed at acidic pH. After digestion, TFA was added to the solution to a final concentration of 1% (v/v). The acidified solution was incubated at different temperatures (25 and 37 °C) for different times (1 and 2 h). Figure 2B illustrates the removal efficiency. All conditions worked very well, with no significant difference between them. To avoid unexpected side effects at the higher temperature, we chose 25 °C and 2 h as the decitraconylation conditions. All identification results can be found in Table S-1.

Citraconylation on Ser, Thr and Trp was also evaluated. About 2.0%, 1.5% and 0.2% of peptides were identified with citraconylated Ser, Thr and Trp, respectively, with peak areas two orders of magnitude smaller than those of the corresponding unmodified peptides. Therefore, these side reactions also appear to be reversible under the current experimental conditions, and their effects are negligible.

Using the optimized conditions, we applied iArg-C digestion to HeLa cell proteins. Similar to the E. coli sample, the percentages of K-end peptides and citraconylated Lys were 0.7% and 1.3%, respectively, suggesting that the reaction conditions worked well regardless of sample origin. HeLa cell proteins were used for the following experiments. A detailed iArg-C digestion protocol can be found in the SI.

iArg-C performs better than Arg-C in proteomics analysis

Subsequently, we carried out a systematic comparison of iArg-C and Arg-C for proteomics analysis. The identification results can be found in Table S-2. We evaluated digestion performance in three respects: identification capacity, digestion efficiency and specificity.

As shown in Figure 3A and 3B, iArg-C digestion clearly performs much better in identification capacity and cleavage specificity. iArg-C digestion identified 11,925 ± 199 peptides corresponding to 2,779 ± 36 proteins, about 64.2% more peptides and 19.9% more proteins than Arg-C digestion. Moreover, iArg-C digestion led to a very low wrong cleavage rate (4.4%) owing to the high specificity of trypsin, whereas Arg-C digestion led to a rather high wrong cleavage rate (26.4%). This superior specificity is highly beneficial for confident large-scale identification.

In terms of digestion efficiency, iArg-C digestion resulted in 93.3% completely digested peptides (zero missed cleavage sites), lower than the 96.9% obtained by Arg-C digestion. Arg-C thus seems to have better cleavage ability than trypsin; however, iArg-C still yielded considerably more completely digested peptides than Arg-C (11,129 ± 189 vs 7,038 ± 37).

Therefore, by taking advantage of reversible amine derivatization and the high specificity of trypsin, iArg-C demonstrated much higher specificity and resulted in considerably more peptide and protein identifications than Arg-C.
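The headline percentages in this comparison follow directly from the reported means, as the short check below shows. It is only a sketch: the helper name `percent_more` is hypothetical, and specificity is taken here to mean 100% minus the wrong cleavage rate, which matches the specificity values given in the Abstract.

```python
def percent_more(value, reference):
    """Relative increase of `value` over `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# Mean peptide identifications reported for iArg-C and Arg-C.
peptide_gain = percent_more(11925, 7262)

# Specificity as the complement of the wrong cleavage rate (assumed definition).
iarg_c_specificity = 100.0 - 4.4
arg_c_specificity = 100.0 - 26.4

# Completely digested peptides estimated from the reported percentages.
iarg_c_complete = 0.933 * 11925
arg_c_complete = 0.969 * 7262
```

The estimated counts of completely digested peptides fall within the reported spreads (11,129 ± 189 and 7,038 ± 37); small differences are expected because the published figures are averages over duplicate runs.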

iArg-C performs slightly better than Lys-C in proteomics analysis

Next, we compared iArg-C with the two most used proteases, trypsin and Lys-C (Table S-3). Figure 4A depicts the identification performance of the three digestion methods. Lys-C identified 11,312 ± 30 peptides corresponding to 2,652 ± 14 proteins, while trypsin identified 19,656 ± 141 peptides corresponding to 3,129 ± 35 proteins. As expected, trypsin outperformed the other two, and iArg-C identified slightly more peptides and proteins than Lys-C.

Figure 4B shows the digestion efficiencies of the different proteases. Lys-C and trypsin resulted in 96.5% and 81.1% completely digested peptides, respectively. The lower digestion efficiency of trypsin is ascribed to its low cleavage ability at Lys.33,34 It was found that 76.2% of all missed cleavage sites were at Lys.

Proteases other than trypsin are often used to acquire results complementary to those from trypsin digestion in order to increase proteome coverage. As illustrated in Figure 4C, the number of identified proteins and sequenced amino acids increases as more proteases are included. On average, iArg-C identified 416 proteins that were not identified by trypsin, while Lys-C identified 307 such proteins. Combining the three proteases led to the identification of 3,758 ± 4 proteins and 427,256 ± 380 amino acids in total, 20.1% and 60.5% more than with trypsin alone.
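The complementarity analysis above is essentially set arithmetic on protein identifiers. The following sketch uses made-up accession labels and hypothetical function names to show the idea, and then checks the reported 20.1% protein gain against the reported means.

```python
def complementary(reference_ids, other_ids):
    """Proteins identified by the other protease but not by the reference."""
    return set(other_ids) - set(reference_ids)


def percent_gain(combined_count, reference_count):
    """Relative increase of the combined result over the reference, in percent."""
    return 100.0 * (combined_count - reference_count) / reference_count


# Hypothetical identifier sets, for illustration only.
trypsin_ids = {"P1", "P2", "P3"}
iarg_c_ids = {"P2", "P4"}
lys_c_ids = {"P3", "P4", "P5"}

only_iarg_c = complementary(trypsin_ids, iarg_c_ids)
all_proteins = trypsin_ids | iarg_c_ids | lys_c_ids

# Reported means: 3,758 proteins combined vs 3,129 with trypsin alone.
reported_gain = percent_gain(3758, 3129)
```

Proteins found by both alternative proteases but not by trypsin (such as the placeholder "P4" here) are counted only once in the union, so the combined gain is smaller than the sum of the individual gains.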

Figure 4D shows the wrong cleavage rates of the different proteases. Trypsin again outperformed iArg-C and Lys-C. Although iArg-C digestion also uses trypsin, its wrong cleavage rate is about twice that of trypsin digestion (4.4% vs 2.1%) because it has only about half as many desired cleavage sites as trypsin. Lys-C performed the worst, with an 8.6% wrong cleavage rate.

The reproducibility of the different digestion approaches was also examined. 68%, 60%, 66% and 61% of peptides were identified in both replicate experiments using iArg-C, Arg-C, Lys-C and trypsin, respectively, indicating acceptable identification reproducibility of the iArg-C method. Among the co-identified peptides, 86%, 80%, 78% and 85% were found with a relative standard deviation (RSD) < 20% in terms of peak area for the iArg-C, Arg-C, Lys-C and trypsin digestion approaches, respectively. Therefore, the iArg-C approach showed reproducibility comparable to that of trypsin in both qualitative and quantitative proteomics studies.

Taken together, iArg-C performs slightly better than Lys-C in terms of identification capacity, complementarity to trypsin and digestion specificity.
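The quantitative reproducibility criterion used above (RSD < 20% in peak area) can be sketched as follows. The text does not state whether the sample or the population standard deviation was used; this example assumes the sample standard deviation, and the function names and peak areas are hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(peak_areas):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * stdev(peak_areas) / mean(peak_areas)


def fraction_reproducible(peak_area_pairs, max_rsd=20.0):
    """Fraction of co-identified peptides whose replicate RSD is below `max_rsd`."""
    passing = sum(1 for areas in peak_area_pairs if rsd_percent(areas) < max_rsd)
    return passing / len(peak_area_pairs)


# Hypothetical duplicate peak areas for three co-identified peptides.
example_pairs = [(100.0, 120.0), (100.0, 150.0), (80.0, 82.0)]
example_fraction = fraction_reproducible(example_pairs)
```

With only two replicates, the sample RSD reduces to √2·|a − b|/(a + b) × 100, so under this assumption a 20% threshold corresponds to the two peak areas differing by roughly a third.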

Cleavage specificity of different proteases

The specificities of iArg-C, Arg-C, Lys-C and trypsin were further investigated (Figure 5). As expected, iArg-C and trypsin showed very similar amino acid distributions of wrong cleavage sites, with the exception of Lys. In the iArg-C results, Lys ranked third after Phe and Asn, accounting for 11% of total wrong cleavage sites, and it ranked sixth after Phe, His, Tyr, Asn and Met when the background was taken into account. Considering that K-end peptides are more readily identified by MS, the true proportion could be far lower than observed. These observations confirm the high efficiency of the citraconylation reaction.

In Lys-C digestion, Arg accounted for the majority of the wrong cleavage sites, indicating that Lys-C has some cleavage ability at Arg residues (Figure 5C). A similar situation was observed for Lys in Arg-C digestion (Figure 5D).
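The ranking of wrong cleavage sites with and without background correction can be illustrated with a small sketch. It assumes that "background" refers to the natural frequency of each amino acid, which the text does not define explicitly; the function names, the residue list and the background frequencies below are all hypothetical.

```python
from collections import Counter


def wrong_site_shares(cleavage_residues, expected="R"):
    """Share of each unexpected residue among all wrong cleavage sites."""
    wrong = Counter(r for r in cleavage_residues if r not in expected)
    total = sum(wrong.values())
    return {aa: count / total for aa, count in wrong.items()} if total else {}


def background_normalized(shares, background):
    """Divide each share by the residue's background frequency."""
    return {aa: share / background[aa] for aa, share in shares.items()}


def ranked(values):
    """Residues sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


# Hypothetical cleavage-site residues and background frequencies.
sites = list("R" * 10 + "F" * 4 + "K" * 3 + "N" * 2)
background = {"F": 0.04, "K": 0.07, "N": 0.04}

raw = wrong_site_shares(sites)
normalized = background_normalized(raw, background)
```

In this made-up example Lys ranks second by raw share but drops to third after normalization, the same kind of shift the text reports for Lys in the iArg-C data (third by share, sixth after background correction).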


CONCLUSIONS

In this report, we systematically optimized the reversible amine derivatization method (citraconylation and decitraconylation) and successfully applied it to iArg-C digestion, a true high-specificity and low-cost Arg-C digestion using trypsin. Although not comparable to trypsin, iArg-C performed slightly better in all tested aspects than Lys-C, the second most used protease, and much better than conventional Arg-C digestion. Therefore, iArg-C could be used as the first-choice alternative to trypsin digestion. In addition, it could find more use in studies on Lys modifications, ETD analysis, middle-down proteomics and consecutive proteolytic digestion.

The well-developed iArg-C digestion offers researchers a new digestion option and could become an important component of the digestion toolbox for different proteomics studies.

ASSOCIATED CONTENT

Supporting Information

iArg-C digestion protocol (PDF)

Tables of detailed identification results (XLSX)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel.: +86 21 5163 0575.

ORCID

Xumin Zhang: 0000-0002-2810-6363

Notes

The authors declare no competing financial interest.

ACKNOWLEDGEMENTS

This work was supported by the National Natural Science Foundation of China (31470806), the starting funding for Xumin Zhang from Fudan University and the Research Fund of the State Key Laboratory of Genetic Engineering, Fudan University.

REFERENCES

(1) Gstaiger, M.; Aebersold, R. Nat. Rev. Genet. 2009, 10, 617-627.
(2) Zhang, Y. Y.; Fonslow, B. R.; Shan, B.; Baek, M. C.; Yates, J. R. Chem. Rev. 2013, 113, 2343-2394.
(3) Olsen, J. V.; Ong, S. E.; Mann, M. Mol. Cell. Proteomics 2004, 3, 608-614.
(4) Tsiatsiani, L.; Heck, A. J. FEBS J. 2015, 282, 2612-2626.
(5) Giansanti, P.; Tsiatsiani, L.; Low, T. Y.; Heck, A. J. R. Nat. Protoc. 2016, 11, 993-1006.
(6) Swaney, D. L.; Wenger, C. D.; Coon, J. J. J. Proteome Res. 2010, 9, 1323-1329.
(7) Choudhary, G.; Wu, S. L.; Shieh, P.; Hancock, W. S. J. Proteome Res. 2003, 2, 59-67.
(8) Biringer, R. G.; Amato, H.; Harrington, M. G.; Fonteh, A. N.; Riggins, J. N.; Hühmer, A. F. Brief Funct. Genomic Proteomic 2006, 5, 144-153.
(9) Guo, X. F.; Trudgian, D. C.; Lemoff, A.; Yadavalli, S.; Mirzaei, H. Mol. Cell. Proteomics 2014, 13, 1573-1584.
(10) Wisniewski, J. R. Anal. Chem. 2016, 88, 5438-5443.
(11) Jekel, P. A.; Weijer, W. J.; Beintema, J. J. Anal. Biochem. 1983, 134, 347-354.
(12) Raijmakers, R.; Neerincx, P.; Mohammed, S.; Heck, A. J. R. Chem. Commun. 2010, 46, 8827-8829.
(13) Mitchell, W. M.; Harrington, W. F. J. Biol. Chem. 1968, 243, 4683-4692.
(14) Krueger, R. J.; Hobbs, T. R.; Mihal, K. A.; Tehrani, J.; Zeece, M. G. J. Chromatogr. 1991, 543, 451-461.
(15) Molina, H.; Horn, D. M.; Tang, N.; Mathivanan, S.; Pandey, A. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 2199-2204.
(16) Hsu, J. L.; Huang, S. Y.; Chow, N. H.; Chen, S. H. Anal. Chem. 2003, 75, 6843-6852.
(17) Boersema, P. J.; Raijmakers, R.; Lemeer, S.; Mohammed, S.; Heck, A. J. Nat. Protoc. 2009, 4, 484-494.
(18) Garcia, B. A.; Mollah, S.; Ueberheide, B. M.; Busby, S. A.; Muratore, T. L.; Shabanowitz, J.; Hunt, D. F. Nat. Protoc. 2007, 2, 933-938.
(19) Sidoli, S.; Yuan, Z. F.; Lin, S.; Karch, K.; Wang, X. S.; Bhanu, N.; Arnaudo, A. M.; Britton, L. M.; Cao, X. J.; Gonzales-Cope, M.; Han, Y. M.; Liu, S. C.; Molden, R. C.; Wein, S.; Afjehi-Sadat, L.; Garcia, B. A. Proteomics 2015, 15, 1459-1469.
(20) Golghalyani, V.; Neupartl, M.; Wittig, I.; Bahr, U.; Karas, M. J. Proteome Res. 2017, 16, 978-987.
(21) Choudhary, C.; Kumar, C.; Gnad, F.; Nielsen, M. L.; Rehman, M.; Walther, T. C.; Olsen, J. V.; Mann, M. Science 2009, 325, 834-840.
(22) Baeza, J.; Dowell, J. A.; Smallegan, M. J.; Fan, J.; Amador-Noguez, D.; Khan, Z.; Denu, J. M. J. Biol. Chem. 2014, 289, 21326-21338.
(23) Dixon, H. B. F.; Perham, R. N. Biochem. J. 1968, 109, 312-314.
(24) Habeeb, A. F. S. A.; Atassi, M. Z. Biochemistry 1970, 9, 4939-4944.
(25) Shetty, J. K.; Kinsella, J. E. Biochem. J. 1980, 191, 269-272.
(26) Wisniewski, J. R.; Mann, M. Anal. Chem. 2012, 84, 2631-2637.
(27) Kadlcik, V.; Strohalm, M.; Kodicek, M. Biochem. Bioph. Res. Co. 2003, 305, 1091-1093.
(28) Son, Y. J.; Kim, C. K.; Kim, Y. B.; Kweon, D. H.; Park, Y. C.; Seo, J. H. Biotechnol. Progr. 2009, 25, 1064-1070.
(29) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat. Methods 2009, 6, 359-360.
(30) Perkins, D. N.; Pappin, D. J. C.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551-3567.
(31) Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolome, S., et al. Nat. Biotechnol. 2014, 32, 223-226.
(32) Vizcaino, J. A.; Csordas, A.; del-Toro, N.; Dianes, J. A.; Griss, J.; Lavidas, I.; Mayer, G.; Perez-Riverol, Y.; Reisinger, F.; Ternent, T.; Xu, Q. W.; Wang, R.; Hermjakob, H. Nucleic Acids Res. 2016, 44, D447-D456.
(33) Glatter, T.; Ludwig, C.; Ahrne, E.; Aebersold, R.; Heck, A. J. R.; Schmidt, A. J. Proteome Res. 2012, 11, 5145-5156.
(34) Huesgen, P. F.; Lange, P. F.; Rogers, L. D.; Solis, N.; Eckhard, U.; Kleifeld, O.; Goulas, T.; Gomis-Ruth, F. X.; Overall, C. M. Nat. Methods 2015, 12, 55-58.

Figure legends

Figure 1. Workflow of the iArg-C digestion approach. After citraconylation at high pH and digestion using trypsin, the resulting peptides are decitraconylated at low pH. By this means, trypsin works exactly like Arg-C.

Figure 2. Optimization of citraconylation and decitraconylation reactions. (A) The number of identified peptides (the left y axis) and the percentage of K-end peptides (the right y axis). (B) The number of identified peptides (the left y axis) and the occurrence of citraconylated Lys (the right y axis).

Figure 3. Comparison of iArg-C and Arg-C digestion. (A) The number of MS/MS scans and peptide spectrum matches (PSMs) (the left y axis) and the number of unique peptides and proteins (the right y axis). (B) Proportion of identified peptides with 0, 1 or 2 missed cleavage sites. (C) The number and percentage of wrong cleavage sites.

Figure 4. Comparison of iArg-C, Lys-C and trypsin digestion. (A) The number of MS/MS scans and PSMs (the left y axis) and the number of unique peptides and proteins (the right y axis). (B) Proportion of identified peptides with 0, 1 or 2 missed cleavage sites. (C) The number of proteins (the left y axis) and sequenced amino acids (the right y axis) identified by trypsin or the three proteases combined. (D) The number and percentage of wrong cleavage sites.

Figure 5. The distribution of wrong cleavage sites of the four different proteases. (A) iArg-C digestion, (B) trypsin digestion, (C) Lys-C digestion and (D) Arg-C digestion.
