
Technical Note

Lys-C/Arg-C, a More Specific and Efficient Digestion Approach for Proteomics Studies
Zhen Wu, Jichang Huang, Jingnan Huang, Qingqing Li, and Xumin Zhang
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b02448 • Publication Date (Web): 19 Jul 2018
Downloaded from http://pubs.acs.org on July 20, 2018


Analytical Chemistry

Lys-C/Arg-C, a More Specific and Efficient Digestion Approach for Proteomics Studies

Zhen Wu1, Jichang Huang1, Jingnan Huang1, Qingqing Li1, Xumin Zhang*,1

1 State Key Laboratory of Genetic Engineering, Department of Biochemistry, School of Life Sciences, Fudan University, Shanghai 200438, China

Keywords: Lys-C / Arg-C / trypsin / DIA / mass spectrometry

*Corresponding author:
Dr. Xumin Zhang
E-mail: [email protected]
Tel: +86 21 31246575

ACS Paragon Plus Environment


Abstract

Bottom-up approaches are now predominantly adopted in proteomics studies, and they necessitate a proteolysis step prior to MS analysis. Trypsin is usually the protease of choice because of its high specificity and MS-favored proteolytic products. Many efforts have been made to develop a superior digestion approach, but few have succeeded, especially in large-scale proteomics studies. Herein, we report a new tandem digestion using Lys-C and Arg-C, termed Lys-C/Arg-C, which proved to be more specific and efficient than trypsin digestion. Reanalysis of our previous data (Anal Chem. 90(3):1554-1559, 2018) revealed that both Lys-C and Arg-C are trypsin-like proteases and perform better when their data are searched with trypsin as the enzyme. In particular, for Arg-C the identification capacity increases 2.6-fold and becomes comparable with that of trypsin. The good complementarity, high digestion efficiency and high specificity of Lys-C and Arg-C prompted us to develop the Lys-C/Arg-C digestion. We systematically evaluated Lys-C/Arg-C digestion using qualitative and quantitative proteomics approaches and confirmed that it outperforms the widely used trypsin and Lys-C/trypsin digestions in digestion specificity, efficiency and identification capacity. We therefore conclude that Lys-C/Arg-C digestion is a strong candidate for the next-generation digestion approach in both qualitative and quantitative proteomics studies. Data are available via ProteomeXchange with identifier PXD009797.


Introduction

The bottom-up approach has become the most widely used strategy in proteomics analysis with the rapid development of mass spectrometers and computational tools.1 A typical bottom-up workflow involves digestion by proteases, separation by liquid chromatography (LC), identification and quantification by tandem mass spectrometry (MS/MS), and data analysis by computational tools.2 Quantitative strategies, e.g. stable isotope labeling and label-free approaches, have become an important component of proteomics analysis, and they rely exclusively on correctly cleaved peptides.3

A number of digestion protocols have been developed utilizing different proteases and digestion conditions.4-6 Although some alternative proteases have been applied in proteomics analysis, trypsin is still the most frequently used and often gives the best results because of its high specificity and MS-favored proteolytic products.7 Trypsin is a serine protease that specifically cleaves peptide bonds C-terminal to Lys and Arg residues (Lys-X and Arg-X bonds).8 Tryptic peptides thus possess a basic N-terminus and a C-terminal basic residue, making them readily amenable to MS identification.9 Tandem double and triple protease combinations have also been utilized to improve the proteolytic efficiency for comprehensive proteomics analysis.10-12 Lys-C is the second most widely used protease in bottom-up proteomics studies, with efficient cleavage at Lys-X bonds.13 Since trypsin has a relatively low cleavage ability at Lys-X bonds,9,14 tandem Lys-C/trypsin digestion has been applied in order to obtain more fully cleaved peptides (FCPs) for quantitative studies.15,16 However, some studies have reported that fewer peptides are identified from Lys-C/trypsin digestion than from trypsin digestion.17-21 Despite great efforts, developing a better digestion approach has proven difficult.22-24

It is commonly held that Lys-C cleaves Lys-X bonds, whereas Arg-C cleaves Arg-X bonds.25 However, we previously showed that Lys-C has a certain cleavage ability at Arg-X bonds, while Arg-C has a considerable cleavage ability at Lys-X bonds.26 From this perspective, both Lys-C and Arg-C could perform trypsin-like digestion with different cleavage propensities. The complementarity of these two proteases inspired us to investigate the digestion performance of the two proteases in combination.

Herein, we systematically evaluated the tandem Lys-C/Arg-C digestion in terms of identification capacity, cleavage specificity and cleavage efficiency. Our results demonstrate that Lys-C/Arg-C digestion performs clearly better than the widely used trypsin and Lys-C/trypsin digestions in both qualitative and quantitative analyses. We conclude that tandem Lys-C/Arg-C digestion would provide a better solution for bottom-up proteomics studies.


EXPERIMENTAL SECTION

Materials and Chemicals

Dithiothreitol (DTT), acrylamide, triethylammonium bicarbonate (TEAB), guanidine hydrochloride, Tris base, trifluoroacetic acid (TFA), acetonitrile (ACN), and formic acid (FA) were purchased from Sigma-Aldrich (St. Louis, MO, USA). Hydrochloric acid was from Sinopharm shares (Shanghai, China). Microcon YM-10 filters (10-kDa cutoff) were purchased from Merck-Millipore (Bedford, MA). Mass spectrometry grade trypsin was obtained from Promega (V528A, Madison, WI), Lys-C from Wako Chemicals (129-02541, Osaka, Japan) and Arg-C from Roche (11.370.529.001, Indianapolis, IN). The iRT kit was purchased from Biognosys (Switzerland).

Protein Extraction from HeLa Cells

HeLa (human epithelial carcinoma) cells (ATCC) were maintained in DMEM supplemented with 10% fetal bovine serum in a humidified atmosphere with 5% CO2 at 37 °C and were grown as monolayer cultures in 10 cm tissue culture plates. HeLa cells were harvested by centrifugation at 2,000 g and 4 °C for 20 min. The harvested cells were washed three times with PBS buffer and then resuspended in lysis buffer containing 4 M guanidine hydrochloride and 100 mM TEAB. The suspension was sonicated on ice for 6 min (2 s sonication with 5 s intervals), and the supernatant was collected by centrifugation at 20,000 g for 20 min at 4 °C. A 20 µL aliquot was kept for protein determination using the Bradford assay. Subsequently, the sample was reduced by incubation with 10 mM DTT at 37 °C for 45 min, followed by alkylation with 100 mM acrylamide for 1 h at room temperature. The excess acrylamide was quenched by adding 40 mM DTT. The protein solution was diluted to 1 mg/mL with lysis buffer before further use.

Protein Digestion Using FASP Method

The FASP method was adapted for the following procedures as previously described.5,27 For each digestion approach, three vials containing 50 µg of protein were used for the triplicate experiment. The samples were transferred to Microcon YM-10 filters and centrifuged at 13,800 g for three rounds of buffer displacement with digestion buffer. After buffer displacement, digestion was carried out using the different proteases at an enzyme/protein ratio of 1:50. For Lys-C/Arg-C digestion, Lys-C proteolysis was carried out in 50 mM Tris-HCl (pH 8.0) at 37 °C for 6 h, and Arg-C digestion was then performed overnight with 5 mM DTT, 8.5 mM calcium chloride and 0.5 mM EDTA. For Lys-C/trypsin digestion, after Lys-C digestion at 37 °C for 6 h, trypsin was added for overnight digestion. After digestion, the peptide solution was collected by filtration and the filter was washed twice with 15% ACN. All filtrates were pooled and concentrated by vacuum drying to a final concentration of about 1 mg/mL. The detailed Lys-C/Arg-C digestion protocol can be found in Supplemental Information.

Data Dependent Acquisition (DDA) Analysis

LC-ESI-MS/MS analysis was performed using a nanoflow EASY-nLC 1000 system (Thermo Fisher Scientific, Odense, Denmark) coupled to an LTQ Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, Bremen, Germany). For all analyses, 1 µg of peptides was used. Samples were first loaded onto a homemade pre-column (100 µm i.d. × 2 cm; 5 µm, ReproSil-Pur 120 C18-AQ, Dr. Maisch GmbH, Germany) and then analyzed on a homemade analytical column (75 µm i.d. × ~25 cm; 2.4 µm, ReproSil-Pur 120 C18-AQ, Dr. Maisch GmbH, Germany). The mobile phases consisted of solvent A (0.1% formic acid) and solvent B (0.1% formic acid in ACN). The peptides were eluted using the following gradient: 2-5% B in 3 min, 5-28% B in 100 min, 28-35% B in 5 min, 35-90% B in 2 min and 90% B for 10 min at a flow rate of 200 nL/min. Data acquisition was set to obtain one MS scan at a resolution of 60,000 (m/z 350-1,600, automatic gain control (AGC) target of 1e6 and maximum ion injection time (IIT) of 100 ms), followed by 15 MS/MS scans of the most intense ions using HCD (normalized collision energy (NCE) of 35; isolation width of 2 m/z; resolution of 15,000; AGC target of 5e4 and maximum IIT of 100 ms). Dynamic exclusion of 60 s was applied with a precursor mass tolerance of 10 ppm.
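As a quick consistency check, the LC gradient described above can be written as a small table and summed. The following is a minimal sketch rather than part of the original method: the names `GRADIENT`, `total_run_time` and `percent_b_at` are hypothetical, and linear ramps within each segment are an assumption.

```python
# Gradient segments as (start %B, end %B, duration in min), taken from the text.
GRADIENT = [
    (2, 5, 3),
    (5, 28, 100),
    (28, 35, 5),
    (35, 90, 2),
    (90, 90, 10),
]


def total_run_time(gradient):
    """Sum of all segment durations in minutes."""
    return sum(duration for _, _, duration in gradient)


def percent_b_at(gradient, t):
    """Solvent B percentage at time t (min), assuming linear ramps (an assumption)."""
    elapsed = 0.0
    for start, end, duration in gradient:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + (end - start) * fraction
        elapsed += duration
    raise ValueError("t is beyond the end of the gradient")


print(total_run_time(GRADIENT))      # total gradient length in minutes
print(percent_b_at(GRADIENT, 53.0))  # %B halfway through the 100 min segment
```

Under these assumptions the five segments add up to a 120 min method, most of which is the shallow 5-28% B ramp.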

Data Independent Acquisition (DIA) Analysis

The LC conditions in DIA analysis were the same as those for DDA analysis. To facilitate retention time calibration, the iRT kit was spiked into samples according to the supplier’s protocol.28,29 The DIA method consisted of a survey scan from 350 to 1,600 m/z at a resolution of 60,000 and 21 predefined MS/MS scans of contiguous precursor windows at a resolution of 15,000. The precursor window widths were set to 50 m/z in the range of 350-450 m/z, 25 m/z in the range of 450-800 m/z, 50 m/z in the range of 800-900 m/z, 100 m/z in the range of 900-1,100 m/z, and 500 m/z in the range of 1,100-1,600 m/z. The data were analyzed by Spectronaut 9.0 (Biognosys, Switzerland)30 using default settings as previously reported.31
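The window scheme above can be expanded into explicit isolation windows to confirm that it yields the stated 21 MS/MS scans. This is a minimal sketch under the assumption that windows are contiguous and non-overlapping, as the text suggests; `WINDOW_SCHEME` and `build_windows` are hypothetical names, not instrument method parameters.

```python
# (range start, range end, window width) in m/z, as listed in the text.
WINDOW_SCHEME = [
    (350, 450, 50),
    (450, 800, 25),
    (800, 900, 50),
    (900, 1100, 100),
    (1100, 1600, 500),
]


def build_windows(scheme):
    """Return contiguous (lower, upper) isolation windows covering every range."""
    windows = []
    for start, end, width in scheme:
        lower = start
        while lower < end:
            windows.append((lower, lower + width))
            lower += width
    return windows


windows = build_windows(WINDOW_SCHEME)
print(len(windows))  # number of MS/MS windows per cycle
for lower, upper in windows:
    print(f"{lower}-{upper} m/z")
```

The five ranges contribute 2, 14, 2, 2 and 1 windows, respectively, which matches the 21 predefined MS/MS scans.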

Database Searching and Data Analysis

The raw data were analyzed by Proteome Discoverer (version 1.4, Thermo Fisher Scientific) using an in-house Mascot Server (version 2.3, Matrix Science, London, UK).32 The human database (20160213, 20,186 sequences) was downloaded from UniProt. Data were searched using the following parameters: trypsin/P as the enzyme; up to two missed cleavage sites allowed; mass tolerance of 10 ppm for MS and 0.05 Da for MS/MS fragment ions; propionamidation on cysteine as a fixed modification; oxidation on methionine as a variable modification. Semi-trypsin/P was set as the enzyme for the analysis of wrong cleavage sites. The Target Decoy PSM Validator incorporated in Proteome Discoverer and the Mascot expectation value were used to validate the search results, and only hits with FDR ≤ 0.01 and Mascot expectation value ≤ 0.05 were accepted for discussion. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD009797.33,34
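To illustrate what the enzyme setting and the missed cleavage allowance mean for the search space, the sketch below performs a simple in silico digestion. It is an illustration only and not how Mascot is implemented; the function names and the toy sequence are hypothetical. With `residues="KR"` and no proline rule it mimics a trypsin/P-like setting, while `residues="K"` or `residues="R"` mimics strict Lys-C or Arg-C settings.

```python
def cleavage_points(sequence, residues="KR", block_before_proline=False):
    """Indices at which the sequence is cut (the cut is C-terminal to a listed residue)."""
    points = []
    for i, aa in enumerate(sequence[:-1]):
        if aa in residues:
            if block_before_proline and sequence[i + 1] == "P":
                continue  # classic trypsin rule: no cleavage before proline
            points.append(i + 1)
    return points


def digest(sequence, residues="KR", missed_cleavages=2, block_before_proline=False):
    """Return all peptides with at most `missed_cleavages` internal uncleaved sites."""
    bounds = [0] + cleavage_points(sequence, residues, block_before_proline) + [len(sequence)]
    last = len(bounds) - 1
    peptides = []
    for i in range(last):
        for j in range(i + 1, min(i + 1 + missed_cleavages, last) + 1):
            peptides.append(sequence[bounds[i]:bounds[j]])
    return peptides


toy = "MKRPEPTIDEKAA"  # hypothetical sequence, for illustration only
print(digest(toy, missed_cleavages=0))                # trypsin/P-like rule
print(digest(toy, residues="K", missed_cleavages=0))  # strict Lys-C rule
print(digest(toy, missed_cleavages=1))                # one missed cleavage allowed
```

A peptide such as `R` or `PEPTIDEK` in this toy example is only in the search space when cleavage at Arg-X is allowed, which is why data from a protease with unexpected Arg-X or Lys-X activity can yield more identifications under a trypsin-type setting.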


RESULTS AND DISCUSSION

Trypsin-like digestion by Lys-C and Arg-C

Our previous studies revealed that Lys-C has a certain cleavage ability at Arg-X bonds, whereas Arg-C has a considerable cleavage ability at Lys-X bonds.26 Considering that Lys-C and Arg-C might perform trypsin-like digestion, the data were re-searched using trypsin as the enzyme with different numbers of missed cleavage sites.

As shown in Figure 1A, when the number of allowed missed cleavage sites varies from zero to four, the number of peptides identified from Lys-C digestion increases from 10,914 to 11,315 when using Lys-C as the enzyme, and from 7,570 to 13,353 when using trypsin as the enzyme. Consequently, the number of identified peptides increases by about 18% (11,315 vs 13,353) when switching the enzyme setting from Lys-C to trypsin. More strikingly, as shown in Figure 1B, Arg-C digestion yields 11,697 more peptides (7,262 vs 18,959) when switching the enzyme setting from Arg-C to trypsin, which raises the identification capacity 2.6-fold and brings it very close to that of trypsin digestion (20,012).
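The percentages and fold changes quoted above follow directly from the peptide counts. A minimal sketch of the arithmetic, using only the numbers given in the text (the helper names are hypothetical):

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


def fold_change(before, after):
    """Ratio of `after` to `before`."""
    return after / before


# Lys-C data, four missed cleavages: Lys-C setting vs trypsin setting.
lysc_gain = percent_increase(11315, 13353)

# Arg-C data: Arg-C setting vs trypsin setting.
argc_extra = 18959 - 7262
argc_fold = fold_change(7262, 18959)

# Arg-C data searched as trypsin, relative to genuine trypsin digestion.
argc_vs_trypsin = 18959 / 20012 * 100

print(f"Lys-C gain: {lysc_gain:.1f}%")
print(f"Arg-C extra peptides: {argc_extra}, fold change: {argc_fold:.1f}")
print(f"Arg-C (as trypsin) relative to trypsin: {argc_vs_trypsin:.1f}%")
```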

The distributions of the cleavage and miscleavage sites were then evaluated. For this purpose, the identification results searched using trypsin as the enzyme with four missed cleavage sites were used (Table S-1). As demonstrated in Figure 1C, Lys-C mainly cleaves Lys-X bonds (91.3%), and Arg-C preferentially cleaves Arg-X bonds (61.6%). However, compared with the cleavage ratio at Arg-X bonds by Lys-C (8.7%), Arg-C presents a clearly higher cleavage ratio at Lys-X bonds (38.4%). The miscleavage sites show the opposite propensity to the cleavage sites. For trypsin digestion, the percentage of miscleaved Lys-X bonds was 76.5%, indicating a lower cleavage ability at Lys-X bonds, which is consistent with previous studies.14,18,24

Since the wrong cleavage sites of Lys-C and Arg-C are mainly contributed by Arg-X and Lys-X, respectively,26 and the identification rates increase dramatically when the two proteases are treated as trypsin, we re-examined their specificity according to trypsin criteria (Figure 1D). The wrong cleavage sites of Lys-C and Arg-C digestions account for 1.1% and 0.7%, respectively, far lower than the values obtained when they are treated as Lys-C or Arg-C (8.6% and 26.4%, respectively),26 whereas the wrong cleavage sites of trypsin account for 2.2%. Therefore, once considered as trypsin-like proteases, Lys-C and Arg-C are extremely specific, even significantly more specific than trypsin, the most specific protease used in proteomics studies so far.24 A detailed illustration of the wrong cleavage sites can be found in Figure S-1. Wrong cleavage occurs most frequently at His, Asn, Tyr, Met, Ser and Ala (in descending order) for Lys-C digestion; at His, Tyr, Met, Ala and Cys for Arg-C digestion; and at Phe, His, Tyr, Asn and Met for trypsin digestion. Compared with Lys-C and trypsin digestions, the wrong cleavage sites of Arg-C digestion seem more random and show no significant bias toward particular amino acid residues, whereas Lys-C favors cleavage at basic residues and trypsin favors cleavage at basic and hydrophobic residues, the latter attributable to its chymotrypsin activity.35-38

Taken together, Lys-C and Arg-C can perform trypsin-like digestion, and to some extent the two proteases can be considered as trypsin with a certain cleavage bias toward Lys or Arg and a higher specificity than genuine trypsin.

Combination of Lys-C and Arg-C

We subsequently evaluated Lys-C, Arg-C and trypsin digestions in terms of the cleavage efficiency at K-X and R-X bonds. The cleavage efficiency at different K/R-X bonds is expressed as the percentage of observed cleavage sites, $\frac{\text{No. cleaved}}{\text{No. cleaved} + \text{No. miscleaved}} \times 100\%$. As shown in Figure 2A, trypsin is more efficient than the other two proteases, with the only exception of K/R-P. However, Lys-C is by far the best in terms of cleavage efficiency at K-X, especially when the following amino acid is Asp, Glu, or Pro (Figure 2B); likewise, Arg-C is clearly the best in terms of cleavage efficiency at R-X, especially when the following amino acid is Pro (Figure 2C).

Therefore, on the basis of the high-specificity trypsin-like digestion, the complementarity of cleavage priority and the high cleavage efficiency, the combination of Lys-C and Arg-C possibly surpasses trypsin alone.
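The cleavage-efficiency percentage defined in this section can be tallied per following residue X from a list of identified peptides. The sketch below is an assumption-laden illustration: it counts each site once per peptide, takes the residue following each peptide in the protein as a given input, and uses hypothetical names and toy peptides; the authors' exact counting rules are not described in this section.

```python
from collections import Counter


def cleavage_efficiency(peptides):
    """Percent cleaved at K/R-X, keyed by the following residue X.

    `peptides` is an iterable of (peptide, following_residue) pairs, where
    `following_residue` is the residue after the peptide C-terminus in the
    protein ("" at the protein C-terminus).
    """
    cleaved = Counter()
    miscleaved = Counter()
    for peptide, following in peptides:
        # Internal K/R residues are sites that were left uncleaved.
        for i, aa in enumerate(peptide[:-1]):
            if aa in "KR":
                miscleaved[peptide[i + 1]] += 1
        # A C-terminal K/R followed by a residue is an observed cleavage.
        if peptide[-1] in "KR" and following:
            cleaved[following] += 1
    efficiency = {}
    for residue in set(cleaved) | set(miscleaved):
        total = cleaved[residue] + miscleaved[residue]
        efficiency[residue] = 100.0 * cleaved[residue] / total
    return efficiency


# Toy input, for illustration only.
example = [("PEPTIDEK", "A"), ("AAKPLLR", "G"), ("GGK", "P"), ("SSR", "A")]
print(cleavage_efficiency(example))
```

In the toy input, the K-P bond is cleaved once and missed once, so the efficiency for X = P is 50%, while the sites followed by A and G are always cleaved.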

Proteomics analysis of Lys-C/Arg-C digestion

Next, we carried out a systematic analysis of the tandem Lys-C/Arg-C digestion using HeLa cell lysates to evaluate its potential in proteomics studies.

The widely applied trypsin and Lys-C/trypsin digestions were employed for comparison. Since the tandem digestion approaches involve two proteases, each at a protease-to-sample ratio of 1:50, trypsin digestion was carried out at two different protease-to-sample ratios, 1:50 and 1:25, for a fair comparison. All experiments were carried out in triplicate. The identification capacity, digestion efficiency and specificity were evaluated in turn. All identification results can be found in Table S-2.

The identification results of the four approaches are illustrated in Figure 3A. Lys-C/Arg-C digestion clearly outperforms the other digestions, identifying 14,221 peptides corresponding to 2,398 proteins, at least 12.3% more peptides and 11.9% more proteins than the others. We further analyzed the overlaps of the different digestion approaches (Figure S-2). Out of the total 28,427 peptides, 4,705 were identified solely by Lys-C/Arg-C, about three times more than by the others. Moreover, among the peptides identified solely by each approach, missed cleaved peptides (MCPs) account for only 14.1% in Lys-C/Arg-C, whereas they account for 38.5%, 41.5% and 42.0% in Lys-C/trypsin, trypsin (1:25) and trypsin (1:50), respectively, indicating the different digestion mode of the Lys-C/Arg-C approach. The superior identification capacity shows that Lys-C/Arg-C digestion has great potential to be widely used in bottom-up proteomics studies. Comparison of Lys-C/trypsin and the two trypsin digestions revealed that the three approaches performed comparably, identifying similar numbers of peptides/proteins. The overlaps of the triplicate experiments for Lys-C/Arg-C, Lys-C/trypsin, and the two trypsin digestions were 67%, 69%, 70%, and 67%, respectively, indicating good reproducibility of these approaches (Figure S-3).


We further investigated the digestion efficiency of the four approaches. High digestion efficiency is of great help for quantitative studies. Currently, much quantitative work relies only on unique FCPs, since MCPs often exist in multiple forms and would deteriorate the quantitation accuracy. As shown in Figure 3B, the percentage of FCPs from Lys-C/Arg-C is the highest at 91.3%, followed by 87.8% from Lys-C/trypsin. Regarding FCPs only, Lys-C/trypsin surpasses the two trypsin digestions, and Lys-C/Arg-C is the best, with at least 19.2% more FCPs than the others. Moreover, the similar numbers of identified peptides/proteins and the similar digestion efficiency (79.9% and 80.5% of FCPs) of the two trypsin digestions suggest that increasing the trypsin-to-sample ratio from 1:50 to 1:25 has only a minor effect. Figure 3C illustrates the percentage of peptides with different miscleavage sites. Lys-C/Arg-C digestion is more balanced between the two cleavage sites, whereas Lys-C/trypsin digestion results in significantly more miscleavage sites at R-X and both trypsin digestions result in significantly more miscleavage sites at K-X. Taken together, Lys-C/Arg-C digestion is more efficient and results in more FCPs than Lys-C/trypsin and trypsin digestions.
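The FCP percentage used as the measure of digestion efficiency can be sketched as follows. This assumes that a peptide counts as fully cleaved when it contains no internal Lys or Arg (the trypsin/P rule used for the database search); the function names and the toy peptide list are hypothetical.

```python
def missed_cleavages(peptide, residues="KR"):
    """Number of internal K/R residues, i.e. sites left uncleaved."""
    return sum(1 for aa in peptide[:-1] if aa in residues)


def fcp_percentage(peptides):
    """Percentage of fully cleaved peptides (FCPs) in a list of identifications."""
    fully_cleaved = sum(1 for p in peptides if missed_cleavages(p) == 0)
    return 100.0 * fully_cleaved / len(peptides)


# Toy identifications, for illustration only.
identified = ["PEPTIDEK", "AAKPLLR", "SSR", "GGKVVK"]
for peptide in identified:
    kind = "FCP" if missed_cleavages(peptide) == 0 else "MCP"
    print(peptide, kind)
print(fcp_percentage(identified))
```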

Since the combination of two proteases could deteriorate the cleavage specificity, we subsequently assessed the cleavage specificity. Figure 3D shows the wrong cleavage rates of the different digestion approaches. The wrong cleavage rate of Lys-C/Arg-C digestion was 1.1%, whereas the wrong cleavage rates of the other three digestions are very close to one another, ranging from 1.9% to 2.2%, about twice that of Lys-C/Arg-C digestion. Compared with the wrong cleavage rates of Lys-C, Arg-C and trypsin alone (1.1%, 0.7%, and 2.2%, respectively; Figure 1D), the wrong cleavage rate of a protease combination is not the sum of the rates of both proteases but is very close to that of the less specific protease. Detailed information on the wrong cleavage sites is given in Figure S-4. The wrong cleavage sites of a tandem digestion are also mainly contributed by the less specific protease. For example, the distribution pattern of wrong cleavage sites from Lys-C/Arg-C digestion is very similar to that from Lys-C alone (Figure S-1A), and likewise, the distribution pattern of wrong cleavage sites from Lys-C/trypsin is nearly identical to that from trypsin alone. To summarize, Lys-C/Arg-C digestion outperforms Lys-C/trypsin and trypsin digestions in terms of digestion specificity, and the combination of two proteases does not increase the wrong cleavage rate.

Therefore, tandem Lys-C/Arg-C digestion performs clearly the best in terms of identification capacity, digestion efficiency and specificity.

Analysis of the cleavage efficiency at K/R-X

Next, we investigated the cleavage efficiency of the Lys-C/Arg-C, Lys-C/trypsin and trypsin digestion approaches at different K/R-X bonds. As shown in Figure 4A, Lys-C/Arg-C again performs best in most cases, particularly in the cleavage at K/R-P, owing to the combination of high cleavage efficiency at K-P by Lys-C and at R-P by Arg-C. Compared with trypsin, Lys-C/Arg-C and Lys-C/trypsin have higher cleavage efficiency when the following amino acid is Asp, Glu, or Arg. It should also be mentioned that the cleavage efficiency of Lys-C/Arg-C is slightly lower than that of Lys-C alone at K-X or Arg-C alone at R-X (Figure 2B and C), on average 2.6% and 7.2% lower, respectively. These results indicate that the two proteases may digest each other and thereby impair the digestion efficiency to a minor extent. The markedly smaller effect at K-X is possibly because Lys-C is added 6 h before Arg-C and is therefore less affected. A similar observation was made for Lys-C/trypsin.

The cleavage efficiency of trypsin at different K/R-X bonds has been discussed in previous studies.18,39-43 It was well accepted that trypsin cleaves at K/R-X with the exception of K/R-P. However, Rodriguez et al. challenged this definition and argued that cleavage at K/R-P is not rare and is comparable to that at K/R-C and K/R-W.41 Our results show that the cleavage efficiency of trypsin at K/R-P, in particular K-P, is far lower than at other sites and that the cleavage efficiency at K/R-D/E is relatively low, but no significantly low cleavage efficiency was observed at K/R-C and K/R-W. Our results are in agreement with the original report by Thiede et al.39 However, the cleavage frequency at K/R-P is significantly higher than that of non-specific cleavages, and K/R-P should be included as a trypsin substrate, as suggested by Rodriguez et al.41 In particular, K/R-P should definitely be considered in Lys-C/trypsin and Lys-C/Arg-C digestions, since the corresponding cleavage efficiency is dramatically improved.

DIA analysis of Lys-C/Arg-C digestion

Recently, Glatter et al. compared the Lys-C/trypsin and trypsin digestions and observed significantly more FCPs with higher intensities from Lys-C/trypsin digestion, which would improve the accuracy of quantitative proteomics studies.18 We thus conducted a DIA quantification analysis to determine which approach produces more FCPs with higher intensities. Spectronaut was used for DIA data analysis.30,31 The peptides detected in triplicate DIA experiments were used to evaluate the reproducibility of the different digestion approaches. As demonstrated in Figure S-5, the percentage of peptides detected in all three replicates is slightly higher for the Lys-C/Arg-C approach than for the others.

In total, 15,574 peptides were quantified across all samples, of which 14,473 are FCPs and 1,101 are MCPs (Table S-3). We globally evaluated the reproducibility of the different digestion approaches by principal component analysis (PCA) (Figure S-6). The PCA demonstrated a tight grouping of the samples from the same digestion method, confirming the high reproducibility of each approach. Lys-C/Arg-C, Lys-C/trypsin and trypsin digestions are clearly separated, but no separation can be observed between the two trypsin digestions. We also evaluated the correlation of the intensities of the same identified peptides in two replicates of the same approach, and the results indicated the high accuracy of the different digestion approaches (Figure S-7).

We further performed detailed comparisons of all identified peptides by calculating the intensity ratios and the corresponding p-values. As demonstrated in Figure 5A-C, Lys-C/Arg-C digestion results in far more FCPs and fewer MCPs with higher intensities in all comparisons (fold change > 2, p-value < 0.01). When compared with trypsin, 578 and 569 FCPs with higher intensities were observed from Lys-C/Arg-C digestion, at least 2 times more than from trypsin alone (180 and 141) for 1:50 and 1:25, respectively. In contrast, significantly fewer higher-intensity MCPs were observed from Lys-C/Arg-C digestion than from the two trypsin approaches (29 vs 195 and 27 vs 163 for 1:50 and 1:25, respectively). Similarly, significantly more higher-intensity FCPs (326 vs 112) and fewer higher-intensity MCPs (40 vs 175) were observed from Lys-C/Arg-C digestion than from Lys-C/trypsin digestion.
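The selection of higher-intensity peptides by the two thresholds (fold change > 2, p-value < 0.01) can be sketched as below. The intensity ratios and p-values are treated as given inputs, since the statistical test is not specified in this section; `classify`, `tally` and the toy values are hypothetical.

```python
from collections import Counter


def classify(ratio, p_value, fold_threshold=2.0, p_threshold=0.01):
    """Classify a peptide by its intensity ratio (approach A / approach B)."""
    if p_value >= p_threshold:
        return "unchanged"
    if ratio > fold_threshold:
        return "higher in A"
    if ratio < 1.0 / fold_threshold:
        return "higher in B"
    return "unchanged"


def tally(peptides):
    """Count peptides per (FCP or MCP, class); input rows are (is_fcp, ratio, p_value)."""
    counts = Counter()
    for is_fcp, ratio, p_value in peptides:
        kind = "FCP" if is_fcp else "MCP"
        counts[(kind, classify(ratio, p_value))] += 1
    return counts


# Toy rows, for illustration only.
rows = [(True, 3.2, 0.001), (True, 1.4, 0.0005), (False, 0.3, 0.002), (False, 4.0, 0.2)]
for key, count in sorted(tally(rows).items()):
    print(key, count)
```

Counting the `("FCP", "higher in A")` and `("MCP", "higher in A")` classes for each pairwise comparison gives numbers of the kind reported above, such as 578 FCPs versus 180.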

We next conducted a detailed analysis of the higher-intensity FCPs obtained by the different digestion approaches. Among the higher-intensity FCPs, Lys-end peptides accounted for 72% in Lys-C/Arg-C digestion but only about 32% in trypsin digestion. Moreover, compared with trypsin, the cleavage efficiency of Lys-C/Arg-C at K/R-X bonds was markedly improved, especially when X is Pro or Lys, indicating the superior cleavage efficiency of Lys-C/Arg-C. In the comparison of Lys-C/Arg-C with Lys-C/trypsin, cleavage showed no significant bias toward Lys (41% vs 46%) or Arg (59% vs 54%).
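
As a rough illustration of how such Lys-end and Arg-end proportions can be tallied, the Python sketch below classifies peptides by their C-terminal residue. The function names and the example sequences are hypothetical, and it assumes that every internal K or R (including one followed by Pro) counts as a missed cleavage, which may differ from the rules applied in the actual database search.

```python
def c_terminal_class(peptide):
    """Return 'Lys-end', 'Arg-end', or 'other' from the C-terminal residue."""
    last = peptide[-1].upper()
    if last == "K":
        return "Lys-end"
    if last == "R":
        return "Arg-end"
    return "other"  # e.g. a protein C-terminal peptide


def missed_cleavages(peptide):
    """Count internal K/R residues, i.e. K/R-X bonds left uncleaved."""
    return sum(1 for residue in peptide[:-1].upper() if residue in "KR")


def lys_end_fraction(peptides):
    """Fraction of fully cleaved peptides (no internal K/R) ending in Lys."""
    fcps = [p for p in peptides if missed_cleavages(p) == 0]
    if not fcps:
        return 0.0
    return sum(c_terminal_class(p) == "Lys-end" for p in fcps) / len(fcps)


# Hypothetical, non-empty peptide sequences used only for illustration.
example = ["LVNELTEFAK", "AEFVEVTK", "YLYEIAR", "SLHTLFGDELCK", "KQTALVELLK"]
print(lys_end_fraction(example))
```

The last example sequence carries an internal Lys and is therefore treated as an MCP and excluded from the fraction.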

The comparisons of Lys-C/trypsin with the two trypsin digestions are illustrated in Figure 5D-F. Consistent with previous quantitative studies, Lys-C/trypsin digestion improved the yield of FCPs compared with trypsin alone (ref 18). Moreover, the two trypsin digestions did not differ significantly from each other, indicating that, within the tested range (1:50 vs 1:25), the trypsin concentration has a minimal effect on digestion efficiency.

In summary, Lys-C/Arg-C digestion yielded significantly more higher-intensity FCPs than trypsin and Lys-C/trypsin digestions, which should improve quantitative accuracy and thus facilitate large-scale quantitative proteomics analysis.

Practical Limitations

The main limitation of Lys-C/Arg-C digestion is the price of Arg-C, which is roughly 30 times that of trypsin (200 RMB vs 7 RMB per µg). The high price of Arg-C may hamper its application when large amounts of sample must be digested. Nevertheless, considering that 25-100 µg of sample (the amount recommended for FASP digestion) is usually sufficient for most proteomics studies (ref 44), the extra spending on protease is rather minimal. It is also likely that recombinant Arg-C will become commercially available and lower the price once the approach is widely adopted.
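
The arithmetic behind this cost estimate can be sketched as follows. The two prices are those quoted above; the 1:50 (w/w) enzyme-to-protein ratio used as the default, and the function names, are assumptions made only for illustration, and the cost of Lys-C is ignored.

```python
ARG_C_RMB_PER_UG = 200.0   # price quoted in the text
TRYPSIN_RMB_PER_UG = 7.0   # price quoted in the text


def protease_cost(sample_ug, enzyme_to_protein, price_per_ug):
    """Cost in RMB of the protease needed for one digestion."""
    return sample_ug * enzyme_to_protein * price_per_ug


def extra_cost(sample_ug, enzyme_to_protein=1 / 50):
    """Extra RMB spent when Arg-C replaces trypsin at the same ratio.

    The default 1:50 (w/w) ratio is an assumption for illustration only.
    """
    arg_c = protease_cost(sample_ug, enzyme_to_protein, ARG_C_RMB_PER_UG)
    trypsin = protease_cost(sample_ug, enzyme_to_protein, TRYPSIN_RMB_PER_UG)
    return arg_c - trypsin


# Price ratio, then the extra cost at both ends of the 25-100 µg range.
print(round(ARG_C_RMB_PER_UG / TRYPSIN_RMB_PER_UG, 1))
for sample_ug in (25, 100):
    print(sample_ug, round(extra_cost(sample_ug), 2))
```

Because the extra cost scales linearly with both the sample amount and the enzyme-to-protein ratio, a different ratio simply rescales the result.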

CONCLUSIONS

Currently, quantitative proteomics studies rely mainly on the comparison of FCPs and thus require a digestion approach with high cleavage efficiency and specificity. Trypsin is usually the protease of choice because of its high specificity and MS-favorable proteolytic products, and developing a better digestion approach has proven difficult.

We propose Lys-C/Arg-C digestion as an improvement over the widely used trypsin and Lys-C/trypsin digestions. We systematically evaluated Lys-C/Arg-C digestion with respect to identification capacity, digestion specificity, and digestion efficiency. Our results show that Lys-C/Arg-C digestion performed best in all tested aspects and delivers superior performance in both qualitative and quantitative proteomics studies. Moreover, the Lys-C/Arg-C digestion method has been used to prepare different sample types in our lab, and the superior performance has been reproducibly observed. We therefore conclude that the Lys-C/Arg-C digestion approach offers a new choice for large-scale proteomics studies and has great potential to replace the currently widely used trypsin and Lys-C/trypsin approaches.

ASSOCIATED CONTENT

Supporting Information

Lys-C/Arg-C digestion protocol, analysis of wrong cleavage sites, and reproducibility analysis of identification and quantification results (PDF)

Tables of detailed identification results (XLSX)

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]. Tel.: +86 21 3124 6575.

ORCID

Xumin Zhang: 0000-0002-2810-6363

Notes

The authors declare no competing financial interest.

ACKNOWLEDGEMENTS

We thank Dr. Yanhong Li and Miss Lin Huang for their help with MS analysis. This work was supported by the National Natural Science Foundation of China (31470806), the start-up funding for Xumin Zhang from Fudan University, and the Research Fund of the State Key Laboratory of Genetic Engineering, Fudan University.

References

(1) Domon, B.; Aebersold, R. Science 2006, 312, 212-217.
(2) Zhang, Y. Y.; Fonslow, B. R.; Shan, B.; Baek, M. C.; Yates, J. R. Chem. Rev. 2013, 113, 2343-2394.
(3) Domon, B.; Aebersold, R. Nat. Biotechnol. 2010, 28, 710-721.
(4) Hervey, W. J., 4th; Strader, M. B.; Hurst, G. B. J. Proteome Res. 2007, 6, 3054-3061.
(5) Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M. Nat. Methods 2009, 6, 359-360.
(6) Proc, J. L.; Kuzyk, M. A.; Hardie, D. B.; Yang, J.; Smith, D. S.; Jackson, A. M.; Parker, C. E.; Borchers, C. H. J. Proteome Res. 2010, 9, 5422-5437.
(7) Tsiatsiani, L.; Heck, A. J. R. FEBS J. 2015, 282, 2612-2626.
(8) Olsen, J. V.; Ong, S. E.; Mann, M. Mol. Cell. Proteomics 2004, 3, 608-614.
(9) Vandermarliere, E.; Mueller, M.; Martens, L. Mass Spectrom. Rev. 2013, 32, 453-465.
(10) MacCoss, M. J.; McDonald, W. H.; Saraf, A.; Sadygov, R.; Clark, J. M.; Tasto, J. J.; Gould, K. L.; Wolters, D.; Washburn, M.; Weiss, A.; Clark, J. I.; Yates, J. R., 3rd. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 7900-7905.
(11) Bian, Y. Y.; Ye, M. L.; Song, C. X.; Cheng, K.; Wang, C. L.; Wei, X. L.; Zhu, J.; Chen, R.; Wang, F. J.; Zou, H. F. J. Proteome Res. 2012, 11, 2828-2837.
(12) Meyer, J. G.; Kim, S.; Maltby, D. A.; Ghassemian, M.; Bandeira, N.; Komives, E. A. Mol. Cell. Proteomics 2014, 13, 823-835.
(13) Jekel, P. A.; Weijer, W. J.; Beintema, J. J. Anal. Biochem. 1983, 134, 347-354.
(14) Huesgen, P. F.; Lange, P. F.; Rogers, L. D.; Solis, N.; Eckhard, U.; Kleifeld, O.; Goulas, T.; Gomis-Ruth, F. X.; Overall, C. M. Nat. Methods 2015, 12, 55-58.
(15) Ebhardt, H. A.; Sabido, E.; Huttenhain, R.; Collins, B.; Aebersold, R. Proteomics 2012, 12, 1185-1193.
(16) Guo, X. F.; Trudgian, D. C.; Lemoff, A.; Yadavalli, S.; Mirzaei, H. Mol. Cell. Proteomics 2014, 13, 1573-1584.
(17) Klammer, A. A.; MacCoss, M. J. J. Proteome Res. 2006, 5, 695-700.
(18) Glatter, T.; Ludwig, C.; Ahrne, E.; Aebersold, R.; Heck, A. J. R.; Schmidt, A. J. Proteome Res. 2012, 11, 5145-5156.
(19) Chen, E. I.; Cociorva, D.; Norris, J. L.; Yates, J. R. J. Proteome Res. 2007, 6, 2529-2538.
(20) Nilsson, T.; Mann, M.; Aebersold, R.; Yates, J. R.; Bairoch, A.; Bergeron, J. J. Nat. Methods 2010, 7, 681-685.
(21) Wisniewski, J. R.; Mann, M. Anal. Chem. 2012, 84, 2631-2637.
(22) Swaney, D. L.; Wenger, C. D.; Coon, J. J. J. Proteome Res. 2010, 9, 1323-1329.
(23) Peng, M.; Taouatas, N.; Cappadona, S.; van Breukelen, B.; Mohammed, S.; Scholten, A.; Heck, A. J. R. Nat. Methods 2012, 9, 524-525.
(24) Giansanti, P.; Tsiatsiani, L.; Low, T. Y.; Heck, A. J. R. Nat. Protoc. 2016, 11, 993-1006.
(25) Krueger, R. J.; Hobbs, T. R.; Mihal, K. A.; Tehrani, J.; Zeece, M. G. J. Chromatogr. 1991, 543, 451-461.
(26) Wu, Z.; Huang, J. C.; Lu, J. N.; Zhang, X. M. Anal. Chem. 2018, 90, 1554-1559.
(27) Zhang, Y.; He, Q. Z.; Ye, J. Y.; Li, Y. H.; Huang, L.; Li, Q. Q.; Huang, J. N.; Lu, J. N.; Zhang, X. M. Anal. Chem. 2015, 87, 10354-10361.
(28) Gillet, L. C.; Navarro, P.; Tate, S.; Rost, H.; Selevsek, N.; Reiter, L.; Bonner, R.; Aebersold, R. Mol. Cell. Proteomics 2012, 11.
(29) Bruderer, R.; Bernhardt, O. M.; Gandhi, T.; Reiter, L. Proteomics 2016, 16, 2246-2256.
(30) Bruderer, R.; Bernhardt, O. M.; Gandhi, T.; Miladinovic, S. M.; Cheng, L. Y.; Messner, S.; Ehrenberger, T.; Zanotelli, V.; Butscheid, Y.; Escher, C.; Vitek, O.; Rinner, O.; Reiter, L. Mol. Cell. Proteomics 2015, 14, 1400-1410.
(31) Huang, J. N.; Wang, J.; Li, Q. Q.; Zhang, Y.; Zhang, X. M. J. Proteome Res. 2018, 17, 212-221.
(32) Perkins, D. N.; Pappin, D. J. C.; Creasy, D. M.; Cottrell, J. S. Electrophoresis 1999, 20, 3551-3567.
(33) Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolome, S., et al. Nat. Biotechnol. 2014, 32, 223-226.
(34) Vizcaino, J. A.; Csordas, A.; Del-Toro, N.; Dianes, J. A.; Griss, J.; Lavidas, I.; Mayer, G.; Perez-Riverol, Y.; Reisinger, F.; Ternent, T.; Xu, Q. W.; Wang, R.; Hermjakob, H. Nucleic Acids Res. 2016, 44, D447-D456.
(35) Casey, R.; Lang, A. Biochim. Biophys. Acta 1976, 434, 184-188.
(36) Keil, B. Protein Sequences & Data Analysis 1987, 1, 13-20.
(37) Perona, J. J.; Craik, C. S. J. Biol. Chem. 1997, 272, 29987-29990.
(38) Burkhart, J. M.; Schumbrutzki, C.; Wortelkamp, S.; Sickmann, A.; Zahedi, R. P. J. Proteomics 2012, 75, 1454-1462.
(39) Thiede, B.; Lamer, S.; Mattow, J.; Siejak, F.; Dimmler, C.; Rudel, T.; Jungblut, P. R. Rapid Commun. Mass Spectrom. 2000, 14, 496-502.
(40) Siepen, J. A.; Keevil, E. J.; Knight, D.; Hubbard, S. J. J. Proteome Res. 2007, 6, 399-408.
(41) Rodriguez, J.; Gupta, N.; Smith, R. D.; Pevzner, P. A. J. Proteome Res. 2008, 7, 300-305.
(42) Bunkenborg, J.; Espadas, G.; Molina, H. J. Proteome Res. 2013, 12, 3631-3641.
(43) Fannes, T.; Vandermarliere, E.; Schietgat, L.; Degroeve, S.; Martens, L.; Ramon, J. J. Proteome Res. 2013, 12, 2253-2259.
(44) Wisniewski, J. R. Anal. Chem. 2016, 88, 5438-5443.

Figure legends

Figure 1. Comparison of Lys-C, Arg-C, and trypsin digestion. (A) The number of unique peptides identified from Lys-C digestion when searched as Lys-C or trypsin with 0 to 4 missed cleavage sites. (B) The number of unique peptides identified from Arg-C digestion when searched as Arg-C or trypsin with 0 to 4 missed cleavage sites. (C) The proportion of cleaved K/R-X bonds (left y-axis) and miscleaved K/R-X bonds (right y-axis). (D) The percentage of wrong cleavage sites. N/C-term cleavage denotes cleavage at the N- or C-terminus of the identified peptide. The wrong cleavage percentage is calculated by dividing the number of wrong cleavages by the total number of cleavages.

Figure 2. The cleavage efficiencies of Lys-C, Arg-C, and trypsin at (A) K/R-X bonds, (B) K-X bonds, and (C) R-X bonds.

Figure 3. Comparison of Lys-C/Arg-C, Lys-C/trypsin, and the two trypsin digestion approaches. (A) The number of MS/MS scans and peptide spectrum matches (PSMs) (left y-axis) and the number of unique peptides and proteins (right y-axis). (B) The proportion of identified peptides with 0, 1, or 2 missed cleavage sites (MCS) (left y-axis) and the number of FCPs (right y-axis). (C) The distribution of missed cleavage sites. (D) The percentage of wrong cleavage sites. N/C-term cleavage denotes cleavage at the N- or C-terminus of the identified peptide. The wrong cleavage percentage is calculated by dividing the number of wrong cleavages by the total number of cleavages.

Figure 4. Analysis of the cleavage efficiencies of the different digestion approaches. The cleavage efficiencies at (A) K/R-X bonds, (B) K-X bonds, and (C) R-X bonds.

Figure 5. Volcano plots of peptide intensities for the different digestion approaches. The numbers of peptides with an intensity fold change > 2 and p-value < 0.01 are indicated. (A) Lys-C/Arg-C vs trypsin (1:50), (B) Lys-C/Arg-C vs trypsin (1:25), (C) Lys-C/Arg-C vs Lys-C/trypsin, (D) Lys-C/trypsin vs trypsin (1:50), (E) Lys-C/trypsin vs trypsin (1:25), and (F) trypsin (1:25) vs trypsin (1:50).
