Antibody-Free Approach for the Global Analysis of Protein Methylation


Technical Note

An Antibody-free Approach for the Global Analysis of Protein Methylation Keyun Wang, Mingming Dong, Jiawei Mao, Yan Wang, Yan Jin, Mingliang Ye, and Hanfa Zou Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.6b02872 • Publication Date (Web): 01 Nov 2016 Downloaded from http://pubs.acs.org on November 1, 2016


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Keyun Wang†,‡, Mingming Dong†,‡, Jiawei Mao†,‡, Yan Wang†,‡, Yan Jin†,‡, Mingliang Ye†,‡,*, Hanfa Zou†,‡,#

† CAS Key Lab of Separation Sciences for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China

‡ University of Chinese Academy of Sciences, Beijing 100049, China

* To whom correspondence should be addressed: (M.L. Ye) Phone: +86-411-84379620. Fax: +86-411-84379620. E-mail: [email protected].

# Deceased April 25, 2016

ACS Paragon Plus Environment


Abstract

Protein methylation is receiving increasing attention for its important regulatory role in diverse biological processes, including epigenetic regulation of gene transcription, RNA processing, DNA damage repair, and signal transduction. Global analysis of protein methylation at the proteome level requires enrichment of methylated peptides in their various forms; unfortunately, immunoaffinity purification can enrich only a subset of them because no pan-specific antibody is available. Because methylation does not significantly alter the physicochemical properties of arginine or lysine residues, chemical approaches for global methylome analysis are still in their infancy. In this study, by exploiting the fact that methylation of Arg and Lys inhibits protease cleavage at these sites, we developed an antibody-free method to enrich methylated peptides, which enabled the identification of 887 methylation forms on 768 sites from HepG2 cells. This technique allows the simultaneous analysis of both Lys and Arg methylation, although it performs better for the identification of Arg methylation. It should find broad application in studying methylation-regulated biological processes.


Introduction

Protein methylation, well known for its role in epigenetic regulation, remains one of the most challenging post-translational modifications (PTMs) to study in high throughput, mainly because of the lack of an efficient enrichment method. Protein methylation occurs predominantly on lysine and arginine residues, which are modified by S-adenosyl-L-methionine (SAM)-dependent methyltransferases that add one, two or three methyl groups to a lysine (K) residue and one or two methyl groups to an arginine (R) residue1. Methionine is an essential amino acid that mammalian cells cannot synthesize de novo, and it is also the precursor of S-adenosyl-L-methionine, the sole donor of the methyl group2. When a heavy stable-isotope form of methionine is used in the cell culture medium, heavy methyl groups are introduced by methyltransferases onto lysine and arginine residues, which improves the confidence of methylation identification3. Lysine methylation is best known for regulating histone function and is involved in epigenetic regulation of gene transcription4,5. Arginine methylation has been reported to regulate RNA processing, gene transcription, DNA damage repair, and signal transduction5-7. Studying the roles of methylation in the regulation of biological processes requires detailed analysis of the methylated proteins involved. However, the complexity of the methylome and the low stoichiometry of protein methylation pose a serious technical challenge.

A crucial step in the proteomic analysis of protein methylation is the specific isolation of methylated peptides from the complex peptide mixture derived from digestion of a proteome sample. Because methylation does not significantly alter the physicochemical properties of arginine or lysine residues, it is challenging to develop an enrichment method as efficient as IMAC (immobilized metal ion affinity chromatography) is for phosphorylation8. Thus, although this important modification was first found almost 60 years ago9, methodology for global methylome analysis lags severely behind that for other major PTMs, such as protein phosphorylation8, protein ubiquitination10,11 and protein acetylation12. To enable in-depth exploration of this important modification, an efficient approach that makes high-throughput, global analysis of protein methylation possible is urgently needed. To date, a variety of approaches have been developed to enrich methylated proteins or peptides for proteome analysis. Methylated proteins can be captured by binding domains such as 3xMBT13,14 and the chromo domain of HP1β15; however, these binding domains are unable to efficiently enrich methylated peptides from protein digests, which makes it difficult to determine the methylation sites13. Antibodies have also been developed to enrich methylated peptides, which enabled the identification of methylation sites3,16-21. However, the structural diversity introduced by the addition of different numbers of methyl groups makes it impossible to develop highly efficient pan-specific antibodies for the global capture of methylated proteins or peptides, and thus the antibodies currently used in methylation studies mostly target one specific methylation state. Among them, antibodies targeting arginine methylation are more successful, and over 1000 arginine methylation sites could be identified in a single study18. However, sequence-independent anti-lysine methylation antibodies with adequate affinity and specificity for proteome analysis are still lacking20. In a previous study, a chemical proteomics approach involving chemical propionylation of mono-methylated lysine and immunoaffinity enrichment of the modified mono-methylated peptides was developed for the analysis of mono-methyllysine20, which resulted in the identification of 446 lysine mono-methylation sites in 398 proteins. Recently, more than 1800 lysine mono-methylation sites were identified in a single study using immunoaffinity enrichment22, which provided novel insights into the cellular activity and substrate recognition of SMYD2 as well as the global landscape and regulation of protein mono-methylation. Technologies based on protein engineering in tandem with SAM analogue cofactors and bioorthogonal click chemistry have also been used to profile the substrates of protein methylation23-25. However, this strategy, though useful for identifying methylated proteins, is unsuitable for locating the modified sites, which makes it difficult to use widely in high-throughput investigations. It therefore remains a daunting challenge to develop methods that enable the simultaneous enrichment of peptides with different arginine and lysine methylation states for global analysis of protein methylation.

Many proteases, such as trypsin and caspases, cleave peptide bonds in proteins in a specific way. However, PTMs at the cleavage sites can inhibit the binding of the enzyme to its substrate and thereby prevent cleavage. For example, phosphorylation of caspase substrates at or near the caspase cleavage site by protein kinases prevents caspase-catalyzed cleavage, representing an important mechanism for the regulation of caspase activity26. Trypsin cleaves at the carboxy-terminal (C-terminal) side of Arg and Lys, which are the target residues of methylation. However, if Arg and Lys carry PTMs such as methylation or acetylation, the modified residues cannot be efficiently bound by the substrate-binding site of trypsin, resulting in poor digestion efficiency and thus miscleavage. Unlike acetylation of lysine side chains, which neutralizes their positive charge, methylation of Arg and Lys does not change the overall charge. Therefore, the resulting methylated peptides have one more positive charge than most of the fully cleaved tryptic peptides. In principle, such methylated peptides could be separated from other peptides according to their charge difference using strong cation exchange chromatography (SCX)27,28. However, such endeavors have not been very successful27. Recently, an interesting approach termed the charge-suppressing strategy29 was developed to improve SCX separation performance by using chemical reactions to eliminate the charges on unmodified R/K residues and peptide N-termini. However, two chemical reaction steps were required. Herein, we developed a simple SCX method that enriches methylated peptides without any chemical derivatization (Figure 1). In this method, two types of interfering peptides, peptides with miscleavage sites and histidine (H)-containing peptides, were significantly reduced by digestion with multiple enzymes and by SCX separation conducted at high pH, respectively.

Experimental Procedures


hM-SILAC labeling and cell lysis for MS analysis. The hM-SILAC (heavy methionine-SILAC) labeling medium was prepared by adding L-arginine, L-lysine and L-methionine-(methyl-13C,D3) at final concentrations of 0.398 mM, 0.798 mM and 0.2 mM, respectively, to custom-purchased DMEM medium (Gibco) lacking L-methionine, L-arginine and L-lysine and supplemented with 10% dialyzed fetal bovine serum (Gibco). HepG2 cells were grown at 37 °C in a humidified atmosphere containing 5% CO2 for at least 8 cell doublings in the hM-SILAC labeling medium.
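As a rough check on why several doublings are needed, the fraction of pre-existing (light) material can be estimated under a simple dilution model. The sketch below assumes that protein made before the medium switch is only diluted by cell division and that everything made afterwards is heavy; protein turnover and methionine recycling are ignored, and the function name is illustrative rather than part of the published protocol.

```python
def residual_light_fraction(doublings: int) -> float:
    """Fraction of pre-existing (light) material left after n doublings.

    Assumption: pure dilution by cell division, no protein turnover.
    """
    if doublings < 0:
        raise ValueError("doublings must be non-negative")
    return 0.5 ** doublings


if __name__ == "__main__":
    for n in (5, 6, 8):
        print(f"{n} doublings: {residual_light_fraction(n):.2%} light")
```

Under this model, eight doublings leave less than 0.4% of the original light material.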

Cells were harvested by trypsin digestion, washed with PBS buffer (pH 7.4), and lysed by sonication in 8 M urea, 1% (v/v) Triton X-100, 65 mM DTT, 1 mM PMSF, 1% (v/v) protease inhibitor cocktail and 50 mM Tris-Cl buffer (pH 7.5). After centrifugation at 25,000 g for 1 h, the supernatant, which contained the total proteins of HepG2 cells, was collected, and the protein concentration was measured by the Bradford method. A two-step multi-enzyme digestion FASP protocol (refer to the supplemental material for optimization of the digestion procedure) was applied to digest the protein sample. Proteins (1 mg) were reduced with 20 mM DTT at 37 °C for 2 h and alkylated with 40 mM iodoacetamide in the dark at RT for 40 min. The solution was transferred to a 10-kDa-MWCO filter (Millipore) and centrifuged at 14,000 g for 15 min to remove the urea lysis buffer, followed by washing three times with 50 mM NH4HCO3 (pH 8.1). After 0.4 ml of 50 mM NH4HCO3 (pH 8.1) was added to the filter, trypsin and Lys-C were each added at an enzyme-to-protein ratio of 1:50, and the sample was incubated at 37 °C for 16 h. The filter was then centrifuged at 14,000 g to remove the trypsin and Lys-C, and the filtrate was collected. The filter was washed by adding 0.4 ml of 50 mM NH4HCO3 (pH 8.1) and centrifuging at 14,000 g for 15 min, and the filtrate was collected again. The filtrates were combined, 500 mM CaCl2 was added to a final concentration of 5 mM, and 10× activation buffer (50 mM Tris-HCl (pH 7.6–7.9), 50 mM DTT and 2 mM EDTA) was added to a final concentration of 1×. Arg-C (Worthington Biochemical) was then added at an enzyme-to-protein ratio of 1:50, and the sample was incubated at 37 °C for 16 h for further digestion. The digestion was quenched by acidification with TFA to 1% (vol/vol) on ice, and the resulting peptides were desalted with a 60 mg HLB cartridge.
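The reagent amounts implied by the ratios and concentrations above follow from simple proportions. The sketch below is illustrative only: the helper names are hypothetical, and the 0.8 ml combined filtrate volume is an assumption (two 0.4 ml collections, ignoring losses on the filter and the later addition of activation buffer).

```python
def enzyme_mass_ug(protein_ug: float, ratio: float = 1 / 50) -> float:
    """Mass of protease for a given enzyme-to-protein ratio (w/w)."""
    return protein_ug * ratio


def stock_volume_ul(sample_ul: float, stock_mM: float, final_mM: float) -> float:
    """Volume of stock to add so the final mixture reaches final_mM.

    Solves stock_mM * v = final_mM * (sample_ul + v) for v.
    """
    if final_mM >= stock_mM:
        raise ValueError("final concentration must be below the stock")
    return sample_ul * final_mM / (stock_mM - final_mM)


if __name__ == "__main__":
    # Protease needed for 1 mg (1000 ug) of protein at 1:50, in ug.
    print(enzyme_mass_ug(1000.0))
    # 500 mM CaCl2 stock needed to bring an assumed 800 ul to 5 mM, in ul.
    print(round(stock_volume_ul(800.0, 500.0, 5.0), 2))
```

For 1 mg of protein, a 1:50 ratio corresponds to 20 µg of each protease, and roughly 8 µl of the 500 mM CaCl2 stock brings the assumed 0.8 ml of filtrate to 5 mM.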

Preparation of the SCXtip. Pack a small piece of degreasing cotton into a 200 µl pipet tip as the sieve and compact it with a fused silica capillary. Carefully transfer a suspension containing 2 mg of SCX beads (Sepax) into the tip. Because the separation is conducted in the high pH range, polymer-matrix SCX beads that are stable under alkaline conditions should be used. Spin at 1000 g to pack the SCX beads properly. Add 2 × 200 µl of 5 mM BRUB buffer (pH 12) to wash the SCXtip by centrifugation at 1000 g for 5 min. Equilibrate the SCXtip by washing with 3 × 200 µl of loading buffer (60% (vol/vol) ACN, 40% (vol/vol) 5 mM BRUB buffer (pH 2.5)). Note that the 5 mM BRUB buffers30 used in this study contained 5 mM acetic acid, 5 mM phosphoric acid and 5 mM boric acid and were adjusted to the corresponding pH with 1 M NaOH.

Separation of synthetic methylated peptides by high pH SCXtip. A mixture of five synthetic methylated peptides (0.05 mg in total, 0.01 mg per peptide: EIAQDF-K(Me1)-TDLR, EIAQDF-K(Me2)-TDLR, EIAQDF-K(Me3)-TDLR, SG-R(Me1)-GGNFGFGDSR and N-R(Me2S)-GAGGFGGGGGTR) was dissolved in 600 µl of loading buffer (60% (vol/vol) ACN, 40% (vol/vol) 5 mM BRUB buffer (pH 2.5)). Load the sample by applying 3 × 200 µl onto the SCXtip and collect the flow-through. Then wash the tip sequentially with 90% ACN, 5 mM BRUB buffer (pH 9); 85% ACN, 5 mM BRUB buffer (pH 9); 80% ACN, 5 mM BRUB buffer (pH 9); 60% ACN, 5 mM BRUB buffer (pH 9); and 30% ACN, 5 mM BRUB buffer (pH 12). Collect the corresponding effluents and subject them to MALDI-TOF MS analysis. To test the binding performance of the SCXtip, 0.05 mg of methylated peptides and 1 mg of BSA tryptic peptides were dissolved in 600 µl of loading buffer (60% (vol/vol) ACN, 40% (vol/vol) 5 mM BRUB buffer (pH 2.5)). Load the sample by applying 3 × 200 µl onto the SCXtip, collect the flow-through and subject it to MALDI-TOF MS analysis.

Methylproteomics analysis using high pH SCXtips. Reconstitute the desalted peptide samples from hM-SILAC HepG2 cells in 600 µl of loading buffer (60% (vol/vol) acetonitrile, 40% (vol/vol) 5 mM BRUB buffer (pH 2.5)). Load the sample by applying 3 × 200 µl onto the SCXtip with slow spinning at ~500 g. Wash the SCXtip by applying 3 × 200 µl of washing buffer (80% (vol/vol) acetonitrile, 20% (vol/vol) 5 mM BRUB buffer (pH 9)) onto the spin tip via moderate centrifugation at ~1000 g (~40 µl/min) for ~15 min. Elute the bound peptides by applying 3 × 200 µl of elution buffer 1 (60% (vol/vol) acetonitrile, 40% (vol/vol) 5 mM BRUB buffer (pH 9)) via moderate centrifugation at ~1000 g (~40 µl/min) for ~15 min, and collect the eluted fraction. Then apply, in turn, elution buffer 2 (60% (vol/vol) acetonitrile, 40% (vol/vol) 5 mM BRUB buffer (pH 10)), elution buffer 3 (60% (vol/vol) acetonitrile, 40% (vol/vol) 5 mM BRUB buffer (pH 11)), elution buffer 4 (30% (vol/vol) acetonitrile, 70% (vol/vol) 5 mM BRUB buffer (pH 12)) and elution buffer 5 (5 mM BRUB buffer (pH 12), 1 M NaCl). Dry down the eluted peptide samples using a vacuum concentrator, then reconstitute the peptides in 150 µl of 1% (vol/vol) TFA. Desalt the peptides using self-made HLB/SDB-XC StageTips following the protocol described by Rappsilber et al.31 Dry down the desalted peptide samples using a vacuum concentrator.
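The stepwise wash and elution scheme can be written down as data, which makes the arithmetic behind the spin times easy to check. In the sketch below the class and field names are hypothetical; the compositions are those listed in the protocol text, elution buffer 5 is assumed to contain no acetonitrile because none is listed, and the spin time is estimated from the stated flow of ~40 µl/min.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    acn_percent: int          # % (vol/vol) acetonitrile
    ph: float                 # pH of the 5 mM BRUB component
    nacl_molar: float = 0.0   # added salt, if any


# Compositions as listed in the protocol text.
SCHEME = [
    Step("load", 60, 2.5),
    Step("wash", 80, 9),
    Step("elution 1", 60, 9),
    Step("elution 2", 60, 10),
    Step("elution 3", 60, 11),
    Step("elution 4", 30, 12),
    Step("elution 5", 0, 12, nacl_molar=1.0),  # 0% ACN is an assumption
]


def spin_minutes(aliquots: int = 3, aliquot_ul: float = 200.0,
                 flow_ul_per_min: float = 40.0) -> float:
    """Approximate time to pass all aliquots through the tip."""
    return aliquots * aliquot_ul / flow_ul_per_min


if __name__ == "__main__":
    for step in SCHEME:
        print(f"{step.name}: {step.acn_percent}% ACN, pH {step.ph}")
    print(f"~{spin_minutes():.0f} min per 3 x 200 ul step")
```

Three 200 µl aliquots at about 40 µl/min take about 15 min, which matches the spin time quoted for the wash and first elution steps.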

MS analysis. MALDI-TOF MS was used to characterize the enriched sample from a semicomplex sample, i.e., the mixture of synthetic methylated peptides. MALDI analysis was performed on an AB Sciex 5800 MALDI-TOF/TOF mass spectrometer with a pulsed Nd/YAG laser at 355 nm using DHB as the MALDI matrix. LC-MS/MS was used to analyze enriched samples of higher complexity, such as those originating from hM-SILAC cells. For the LC system, a nano-HPLC Dionex UltiMate 3000 (Thermo Scientific) was connected to a Q-Exactive mass spectrometer. The injected sample was first captured on a trapping column (SunChrom C18, 5 µm, 3 cm × 200 µm) before being separated on an analytical column (SunChrom C18, 3 µm, 25 cm × 75 µm). Peptides were chromatographically separated using a 180 min gradient at a column flow rate of ~300 nl/min. The column effluent was introduced directly into the ESI source of the MS, and HCD fragmentation was used on the Q-Exactive MS. The mass spectrometer was operated in a ‘Top 10’ data-dependent acquisition mode with dynamic exclusion enabled (30 s). Survey scans (mass range 300-1750 m/z) were acquired at a resolution of 70,000 at 200 m/z, with the 10 most abundant multiply charged (z ≥ 2) ions selected with a 2 m/z isolation window for HCD fragmentation. MS/MS scans were acquired at a resolution of 70,000 at 200 m/z.

Reconstitute the enriched peptides from hM-SILAC HepG2 cells by adding an appropriate amount of 0.1% (vol/vol) TFA (60 µl of 0.1% TFA to elution fraction 1, and 10 µl of 0.1% TFA to the other elution fractions). Analyze 6 µl of sample from each elution fraction by LC-MS/MS with the flow rate adjusted to 300 nl/min.

Data processing and analysis. The MS data were searched against the UniProt human database (downloaded on August 26, 2015; containing 147,854 protein sequence entries) using the MaxQuant (v1.3.0.5) software32,33 for peptide and protein identification. The mass tolerances for MS and MS/MS were set to 6 ppm and 20 ppm, respectively. The enzyme was set to Trypsin/P (cleavage C-terminal to lysine and arginine) with up to 5 missed cleavages. Oxidation (M), Mono-H-methylation (K, R), Di-H-methylation (K, R), Tri-H-methylation (K) and protein N-terminal acetylation were set as variable modifications. Carbamidomethyl on Cys and heavy methionine (M) were searched as fixed modifications. When setting the modification parameters (Table S-5), the position of Mono-H-methylation was set to “anywhere”, while the positions of Di-H-methylation and Tri-H-methylation were set to “notCterm”. The FDR at the peptide, protein and site levels was set to below 1%, and a localization probability of ≥0.75 was required, as is usually done in other large-scale PTM identification studies34-37. The cutoff value of the MaxQuant score was set to 40 to ensure high-confidence peptide identification. Default settings were used for all other parameters in MaxQuant. Raw MS/MS data can be downloaded from http://www.iprox.cn; the name of the project is “Approach for the Global Analysis of Protein Methylation” and the project ID is IPX00075000. When Motif-X38 and WebLogo (http://weblogo.berkeley.edu/logo.cgi) were used to analyze the potential motifs and sequence characteristics, the ±6 residues flanking the methylation sites were subjected to analysis. GO analysis was conducted using DAVID bioinformatics resources, with the UniProt accession numbers of the identified methylated proteins as input.
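For orientation, the monoisotopic mass shifts behind the heavy-methyl modifications searched above can be computed from atomic masses. This is a sketch rather than the authors' search configuration: the actual modification definitions are given in Table S-5, and the constant and function names below are illustrative. A light methylation replaces H with CH3 (a net gain of CH2), whereas a heavy methylation from L-methionine-(methyl-13C,D3) replaces H with 13CD3.

```python
# Monoisotopic atomic masses in Da (standard values).
H = 1.00782503
D = 2.01410178
C12 = 12.0
C13 = 13.00335484

LIGHT_METHYL = C12 + 2 * H       # net gain of CH2
HEAVY_METHYL = C13 + 3 * D - H   # lose H, gain 13C and three D


def methyl_shift(n_groups: int, heavy: bool = True) -> float:
    """Mass shift in Da for n methyl groups on one residue."""
    if n_groups not in (1, 2, 3):
        raise ValueError("expected 1, 2 or 3 methyl groups")
    return n_groups * (HEAVY_METHYL if heavy else LIGHT_METHYL)


if __name__ == "__main__":
    for n in (1, 2, 3):
        light = methyl_shift(n, heavy=False)
        heavy = methyl_shift(n)
        print(f"{n} methyl: light {light:.4f} Da, heavy {heavy:.4f} Da")
```

Each heavy methyl group is about 4.02 Da heavier than its light counterpart, which is what separates heavy-labeled methylation from isobaric amino acid substitutions.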


Results and Discussion


Methylation on Arg and Lys prohibits the cleavage by Trypsin

Trypsin cleaves at the carboxy-terminal (C-terminal) side of Arg and Lys, which are the target residues of methylation. We systematically investigated whether methylation of Arg and Lys inhibits trypsin cleavage by using synthetic peptides with different methylation forms. As shown in Figure 2 and Figure S-1, the unmethylated peptide was fully cleaved, the mono-methylated (K and R) peptides were only partially cleaved (less than 30%), and no obvious cleavage was found for the di-methylated (K and R) and tri-methylated (K) peptides even though these peptides were incubated with trypsin for 16 h. These results confirmed that trypsin cuts poorly at methylated residues.
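The behaviour described above can be mimicked with a small in-silico digestion. The sketch below is a simplification under stated assumptions: it applies the Trypsin/P rule (cleave C-terminal to every K or R), treats any methylated K/R as completely uncleavable (whereas the mono-methylated peptides were in fact partially cleaved), and uses an illustrative function name and 0-based position convention.

```python
def digest(sequence: str, methylated: frozenset = frozenset()) -> list:
    """Cleave after each K/R unless that position is methylated.

    methylated holds 0-based indices of methylated K/R residues.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR" and i not in methylated:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


if __name__ == "__main__":
    seq = "EIAQDFKTDLR"  # backbone of one synthetic test peptide
    print(digest(seq))                   # unmethylated backbone
    print(digest(seq, frozenset({6})))   # K at index 6 methylated
```

With the lysine unmodified the backbone splits into two fully cleaved peptides; with the lysine marked as methylated it survives as a single peptide carrying an internal K.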


Therefore missed cleavage would occur on methylated R/K residues when the

2

methylated proteins are digested by trypsin (Figure 2E). As the resulting methylated

3

peptides have one more positive charges than majority of the fully cleaved tryptic

4

peptides, these methylated peptides could be enriched from the protein digest by using

5

SCX in principle due to the charge difference

6

types of unmethylated peptides makes the conventional SCX approach less

7

effective27. Firstly, the trypsin digestion generates many peptides with missed

8

cleavage sites even they are not methylated. Indeed, we found that 14.3% of identified

9

peptides contains missed cleavage sites when the protein sample was incubated with for

overnight

digestion.

27,28

Secondly,

. However, the presence of two

10

trypsin

there

11

histidine-containing peptides (>18%) in the tryptic digest. Because the conventional

12

SCX separation is conducted under acidic condition (typically pH 2.7), the histidine

13

residues are also positively charged and so the histidine containing peptides have the

14

same charges as the methylated peptides. To circumvent above problems, we

15

presented an integrated approach (Figure 1) by combining multiple enzyme digestion

16

and high pH SCX separation.

17

Enhanced digestion by utilizing multiple enzymes

18

To reduce the interference of the unmethylated peptides with miscleavage sites on the

19

analysis of methylated methods, we proposed the enhanced digestion approach by

20

using multiple enzymes. In addition to trypsin, Lys-C and Arg-C were also used.

21

Arg-C and Lys-C, respectively cleaving carboxy-terminuses of Arg and Lys, share the

22

cleavage specificity with trypsin. It was reported that Lys-C can compensate for the 13

ACS Paragon Plus Environment

are

huge

amount

of

Analytical Chemistry

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 33

1

evident trypsin proteolytic deficiency at lysine residues39 and Arg-C can overcome the

2

trypsin proteolytic deficiency at C-terminus of arginine residues next to proline

3

Thus, the using of extra Lys-C and Arg-C are expected to reduce the miscleavage rate.

4

After systematically investigation of different digestion schemes (Table S-1), we

5

found the two-step multi-enzyme digestion FASP protocol had the best performance.

6

In this protocol, the protein samples were firstly digested by addition of Trypsin and

7

Lys-C, and after removal of the enzymes and addition of activation buffer, Arg-C was

8

added for further digestion. This multiple enzyme digestion strategy significantly

9

reduced the miscleavage of the background peptides, from 14.3% to 4.8% (Table S-1).

10

Clearly this digestion strategy could strongly reduce the background interference

11

resulted from missed cleavage. However, one concern is whether the miscleavage

12

induced by methylation modification is conserved after such digestion scheme. By

13

using the synthetic methylated peptides as the test samples, we found, as shown in

14

Figure S-2, the methylated sites still poorly cleaved under such digestion condition.

15

This indicated that the enhanced digestion would not compromise the SCX-based

16

methylproteomics analysis.

17

SCX separation conducted at high pH range

18

We then investigated if the methylated peptides could be captured by the SCXtip

19

under

20

(EIAQDF-K(Me1)-TDLR,

21

SG-R(Me1)-GGNFGFGDSR, N-R(Me2S)-GAGGFGGGGGTR) (0.05mg in total)

22

was loaded onto SCXtip under acidic condition. No methylated peptides were found

above

condition.

Mixture

of

five

synthetic

EIAQDF-K(Me2)-TDLR,

14

ACS Paragon Plus Environment

methylated

40

,

peptides

EIAQDF-K(Me3)-TDLR,

Page 15 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Analytical Chemistry

1

in the flow-through solution (Figure S-4), indicating all the methylated peptides were

2

captured by the SCXtip. A more challenging sample, the mixture of 0.05 mg

3

methylated peptides and 1 mg BSA tryptic peptides, was also used for the evaluation.

4

Although many peptides were found in the flow-through solution (Figure S-5),

5

methylated peptides are preferably retained on the SCXtip. This is because the

6

methylated peptides have higher charges and so stronger binding affinity to SCX. We

7

then investigate the washing and elution conditions. When a series of buffers were

8

used to elute the bound peptides, the methylated peptides were firstly found in the

9

washing step of 80% ACN, 5mM BRUB buffer (pH 9) (Figure S-4). Considering the

10

methylated peptides (EIAQDFK*TDLR) with three acidic residues are quite weak to

11

be retained on the SCX, this buffer was used as the washing buffer in the in-depth

12

investigation of global methylproteomes. After the washing step, the bound

13

methyl-peptides could be eluted with BRUB buffers by further increasing the pH (up

14

to pH 12) into several fractions to increase the methylproteome coverage.

Conventional SCX separation is usually conducted under acidic conditions (pH 2.7)27,28. However, under such conditions histidine is also charged, and histidine-containing tryptic peptides are the major contaminant in the SCX fractions in which most methylated peptides were detected27. To alleviate the interference of histidine-containing peptides, we proposed an SCXtip (Figure S-3) separation approach. The protein digest was first loaded onto the SCXtip at low pH as usual. The key to this new approach is to wash the SCXtip with a high-pH buffer (pH 9) prior to the elution of the bound methylated peptides. This washing step is expected to remove the histidine-containing peptides from the SCXtip by reducing their charges. To evaluate the performance of this method, we compared the percentages of histidine-containing peptides in the flow-through fraction (60% ACN, 5 mM BRUB, pH 2.5), the washing fraction (80% ACN, 5 mM BRUB, pH 9) and the elution fractions obtained with higher-pH buffers. The percentages of identified peptides containing histidine in the flow-through, washing and elution fractions were about 4.3%, 33% and 24.3%, respectively (Table S-2). For comparison, the sample without SCX pretreatment was also analyzed, and the percentage of histidine-containing peptides was found to be 18.5%. The very low percentage of histidine-containing peptides in the flow-through fraction (4.3%) compared with the untreated sample (18.5%) indicated that the majority of those peptides were bound to the SCXtip when the sample was loaded. A significant fraction of these bound histidine-containing peptides was then washed away during the washing step, as evidenced by the fact that over 30% of the peptides in the washing fraction contained histidine. As a result, less than 25% of the identified peptides in the eluted fractions contained histidine, whereas this percentage was over 60% in the fraction of peptides with charges above +3 in the conventional SCX separation, where the methylated peptides were present27. Clearly, this new approach could alleviate the interference of histidine-containing peptides.
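
The charge argument behind the high-pH wash can be illustrated with a simple Henderson-Hasselbalch estimate. The sketch below is only a rough illustration: the pKa values are generic textbook-style approximations rather than values from this study, the function name `net_charge` and the short test sequences are hypothetical, and the effects of organic solvent, methylation and neighboring residues on pKa are ignored.

```python
# Approximate pKa values (assumptions for illustration, not taken from the paper).
PKA_POS = {"K": 10.5, "R": 12.5, "H": 6.0}   # groups that carry +1 when protonated
PKA_NEG = {"D": 3.9, "E": 4.1}               # groups that carry -1 when deprotonated
PKA_NTERM, PKA_CTERM = 8.0, 3.6


def net_charge(seq, ph):
    """Estimate the net charge of an unmodified peptide at a given pH."""

    def pos(pka):
        # Fraction of a basic group that is protonated (charge +1).
        return 1.0 / (1.0 + 10 ** (ph - pka))

    def neg(pka):
        # Fraction of an acidic group that is deprotonated (charge -1).
        return -1.0 / (1.0 + 10 ** (pka - ph))

    charge = pos(PKA_NTERM) + neg(PKA_CTERM)
    for aa in seq:
        if aa in PKA_POS:
            charge += pos(PKA_POS[aa])
        elif aa in PKA_NEG:
            charge += neg(PKA_NEG[aa])
    return charge


def histidine_contribution(ph):
    """Charge added by one histidine: compare two hypothetical peptides."""
    return net_charge("AHK", ph) - net_charge("AAK", ph)


for ph in (2.5, 9.0):
    print(ph, round(histidine_contribution(ph), 3))
```

Under these assumptions a histidine adds almost a full positive charge at the loading pH and almost none at pH 9, which is the difference the washing step relies on.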

Global methylome analysis of heavy methionine labeled HepG2 cells

Methylation is indistinguishable from several amino acid substitutions because the introduced mass shifts of 14 Da, 28 Da and 42 Da are similar to the mass differences between many amino acids (see Table S-3, based on the Unimod database, http://www.unimod.org). Thus, for the identification of methylation sites by MS, heavy methyl-SILAC (stable isotope labeling by amino acids in cell culture) needs to be used to improve the identification confidence13. In this study, the cell culture was supplemented with [13CD3]methionine to generate the biological precursor of the labeled methyl donor, [13CD3]SAM, so that the mass shifts introduced by mono-, di- and tri-methylation are 18, 36 and 54 Da, respectively (see Table S-5). These mass differences differ from those of amino acid substitutions, and no identical mass difference is introduced by other modifications (see Table S-4, based on the Unimod database, http://www.unimod.org), which is beneficial for reducing false positive matches. The labeling efficiency of heavy methyl-SILAC (hM-SILAC) was confirmed by the substitution rate of heavy methionine, which indicated that more than 98% substitution was achieved (see Table S-6). Thus, in the database search, heavy methionine was set as a fixed modification.
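
The nominal shifts of 14/28/42 Da (light) and 18/36/54 Da (heavy) follow directly from the elemental composition of the methyl group. The short sketch below works this out from standard monoisotopic masses; the function name `methyl_shift` is hypothetical, and each methylation is treated as the replacement of one hydrogen by CH3 or 13CD3.

```python
# Standard monoisotopic masses in Da.
M_H, M_D = 1.007825, 2.014102
M_C12, M_C13 = 12.000000, 13.003355


def methyl_shift(n_groups, heavy=False):
    """Mass added by n methyl groups, each replacing one H with CH3 or 13CD3."""
    if heavy:
        per_group = M_C13 + 3 * M_D - M_H   # +13C, +3 D, -1 H
    else:
        per_group = M_C12 + 3 * M_H - M_H   # +12C, +3 H, -1 H
    return n_groups * per_group


for n in (1, 2, 3):
    light = methyl_shift(n)
    heavy = methyl_shift(n, heavy=True)
    print(n, round(light, 4), round(heavy, 4), round(heavy - light, 4))
```

The heavy and light forms of the same peptide therefore differ by roughly 4 Da per methyl group.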

We applied this integrated approach to the global identification of the methylome of heavy methionine labeled HepG2 cells. After controlling the FDR at the peptide, protein and site levels to < 1%, requiring a site localization probability ≥0.75 and setting the MaxQuant score cutoff to 40, direct evidence of modification was obtained for ~708 methylation forms on ~612 sites per technical replicate using only ~1 mg of total cellular proteins. As shown in Figure S-6A, methylated peptide identifications accounted for more than 5% of the total identified peptides in the elution fractions (except one fraction with 2.4%), while in the flow-through and washing fractions the methylated peptides accounted for around 0.25% of the total peptides. Compared with the untreated sample, where only 0.8% of identifications were methylated peptides, around 6.25-fold enrichment was achieved by this SCXtip approach. Clearly, the methylated peptides were successfully enriched by the SCXtip from such a complex sample. Combining the results from 3 technical replicates, 768 methylation sites, including 68 on lysine residues and 700 on arginine residues (Figure S-6B and Table S-10), were identified on 287 proteins, corresponding to 887 different methylation forms. Compared with the results obtained by the conventional SCX approach, where only 39 Meth-R sites were identified from 4–6 mg of peptides27, the SCXtip approach obtained many more identifications from less starting sample (~612 methylation sites from 1 mg of peptides), indicating the excellent performance of this new approach.
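
The quoted enrichment factor is a simple ratio of the methylated-peptide percentages. A minimal sketch of that arithmetic is given below; the function name `fold_enrichment` is hypothetical, and because the elution fractions are reported as "more than 5%", the value obtained with 5% should be read as a lower-bound estimate.

```python
def fold_enrichment(fraction_pct, untreated_pct):
    """Ratio of the methylated-peptide percentage in a fraction to the untreated sample."""
    return fraction_pct / untreated_pct


UNTREATED = 0.8  # % methylated peptide identifications without SCX pretreatment

print(fold_enrichment(5.0, UNTREATED))    # elution fractions (lower bound)
print(fold_enrichment(0.25, UNTREATED))   # flow-through and washing fractions
```

A ratio below 1 for the flow-through and washing fractions indicates that these fractions are depleted of methylated peptides relative to the untreated digest.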

Among the 768 methylation sites, around 62.1% (477) were identified in all 3 replicates and more than 76.9% (591) were identified in at least 2 replicates (Figure S-6B). The high percentage of overlapping identifications and the similar numbers of identifications (621, 616 and 599 sites for the three replicates) indicate good reproducibility of this integrated approach. The methyl-peptides bound to the SCXtip were eluted into 5 fractions to increase the methylproteome coverage. The overlap between consecutive fractions was quite low (Figure S-6C), indicating good separation between fractions.
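
Overlap statistics of this kind can be computed from per-replicate site lists with a few set operations. The sketch below is an illustration only: the function name `replicate_overlap` and the toy site identifiers are hypothetical and are not data from this study.

```python
from collections import Counter


def replicate_overlap(replicates):
    """Return (total sites, sites found in all replicates, sites found in >= 2)."""
    # Count in how many replicates each site appears (duplicates within a replicate ignored).
    counts = Counter(site for rep in replicates for site in set(rep))
    total = len(counts)
    in_all = sum(1 for c in counts.values() if c == len(replicates))
    in_two_plus = sum(1 for c in counts.values() if c >= 2)
    return total, in_all, in_two_plus


# Toy input with made-up site identifiers.
reps = [
    {"P1_R10", "P1_R20", "P2_K5"},
    {"P1_R10", "P1_R20"},
    {"P1_R10", "P3_R7"},
]
total, in_all, in_two_plus = replicate_overlap(reps)
print(total, in_all, in_two_plus)

# The percentages reported in the text follow from the stated counts.
print(round(100 * 477 / 768, 1), round(100 * 591 / 768, 1))
```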

Evaluation of the confidence of the methylated peptide identifications

Because the methylated peptide identifications were obtained with the well-established MaxQuant software under strict criteria, they are expected to be highly confident. However, it is of interest to evaluate the confidence of the identifications by other methods. For this purpose, we determined the rate of random matches using control datasets in which heavy methylated peptides (-13CD3) did not exist. Three MS raw files from a sample without heavy labeling (light labeled sample, +14 for -CH3) were searched against the database in the same way as the dataset of the heavy labeled sample, with +18 (-13CD3) set as a variable modification. Because no heavy isotopic labeling was introduced, the identifications with heavy labeling (+18, -13CD3) were false positive matches (random matches). At the PSM (peptide-spectrum match) level, the percentages of methylated peptide matches in these samples were 0.028% (3/10396), 0.028% (3/10500) and 0.058% (6/10364), which indicated that the rate of random matches was extremely low. These results statistically demonstrate the high confidence of the methylated peptide identifications.

Manual validation is an effective approach to further improve the identification confidence. One of the important criteria is the presence of continuously matched b-series and y-series ions. In the 12 spectra of the methylated peptide identifications in the control datasets described above (these identifications were false positive matches, as no heavy methylation was introduced), only 2 or 3 b/y ions were matched continuously (Figure S-10). Thus, the following criteria were applied to manually check the identifications: at least 4 continuous b/y ions were required if they matched major peaks, or at least 6 continuous b/y ions if they matched minor peaks, and most of the major peaks needed to be assigned. In total, 414 peptide-spectrum matches (Figure S-11) passed these criteria, and the corresponding methylation sites were termed high-confidence sites. The remaining methylation sites were termed low-confidence sites (201 spectra, Figure S-12). It is well known that methylated Arg sites have glycine-arginine-rich consensus sequences7,16. This gives us a rough approach to estimate the confidence of methylation site identifications. We compared the sequence patterns of the methylated Arg sites in the high- and low-confidence groups as well as the Arg sites identified in the control experiment. As shown in Figure S-7, there was no glycine-arginine-rich consensus sequence for the methylated sites in the control experiment, while the consensus sequences were observed for the sites in both the high- and low-confidence groups. The high consistency of the sequence pattern of the sites in the low-confidence group with the known pattern indicated that the majority of these sites were also true positive identifications. Removing these identifications would result in the loss of many valuable methylated sites. For this reason, we provide all the sites and their spectra to allow researchers with different interests and expertise to evaluate them by themselves. These sites are given separately (Figures S-11 and S-12). If confidence is the main concern, only the sites in the high-confidence group should be considered. Otherwise, the sites in the low-confidence group should also be considered.

We compared the methylation sites (Table S-10) identified in this study with the methylation sites included in PhosphoSitePlus (http://www.phosphosite.org/). More than one third (37.6%) of the identified sites had been reported before. The high percentage of known sites also indicated the high confidence of this dataset. The percentages of known sites in the high-confidence and low-confidence groups were 42.6% and 35.2%, respectively. The percentage of known sites on Arg (39%) is greater than that on Lys (25%), mainly because methylation on Arg residues is easier to isolate and identify, and so more known Arg sites are included in the PhosphoSitePlus database. We also compared the spectral counts of the sites in the high-confidence and low-confidence groups. The average spectral count for the methylated peptides in the low-confidence group (3.8 counts/peptide) was lower than that in the high-confidence group (5.2 counts/peptide), indicating that more low-abundance methylation sites were present in the low-confidence group.
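
The ion-continuity part of the manual validation criteria (at least 4 continuous ions on major peaks, or at least 6 on minor peaks) can be expressed as a small check. The sketch below is a simplified illustration, not the procedure used in this study: it assumes that a continuous run is counted within a single ion series (b or y), it takes the matched ion indices as given, and it does not model the additional requirement that most major peaks be assigned. The function names are hypothetical.

```python
def longest_run(matched_indices):
    """Length of the longest run of consecutive ion indices (e.g. b2, b3, b4)."""
    best = run = 0
    prev = None
    for i in sorted(set(matched_indices)):
        run = run + 1 if prev is not None and i == prev + 1 else 1
        best = max(best, run)
        prev = i
    return best


def passes_continuity(b_ions, y_ions, major_peaks=True):
    """Apply the 4-ion (major peaks) or 6-ion (minor peaks) continuity threshold."""
    required = 4 if major_peaks else 6
    return max(longest_run(b_ions), longest_run(y_ions)) >= required


# Hypothetical matched ion indices for one spectrum.
print(passes_continuity(b_ions=[2, 3, 4, 5], y_ions=[1, 3]))
print(passes_continuity(b_ions=[2, 3, 4, 5], y_ions=[1, 3], major_peaks=False))
```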

Methylome data covering all five methylation states

The 887 different methylation forms were identified on 768 methylation sites, indicating that around 15.5% of the methylation sites were identified with at least 2 methylation forms. Mono- and di-methylated Arg were simultaneously observed on 105 sites, while mono-, di- and tri-methylated Lys were simultaneously observed on only 6 sites (Figure 3A and Figure 3B). eEF1A1 is an important protein responsible for the enzymatic delivery of aminoacyl tRNAs to the ribosome41. K165 of this protein was identified in all 3 methylation forms, whereas in the UniProt database this site is only annotated as di-methylated; our study thus provides evidence for the coexistence of 3 methylation forms on this residue. Histone H3.3 represents an epigenetic imprint of transcriptionally active chromatin42, and K27 and K36 of this protein were identified in 3 methylation states. Annotated MS/MS spectra of K*SAPATGGVK (K28) are shown in Figure 4, in which all three methylation (K) forms of this methylated peptide were confidently identified. As annotated in the UniProt database, the above 3 sites can all be modified by acetylation; thus the simultaneous identification of diverse methylation forms is crucial for understanding the complex cross-talk between acetylation and methylation at these sites. KHDRBS1 is involved in a variety of cellular processes, including alternative splicing and cell cycle regulation43, and 10 of its sites (R284, R289, R291, R302, R304, R310, R315, R320, R325, R331) were identified in 2 methylation (R) forms. Additionally, diverse methylation (R) forms were found on 8 sites (R45, R57, R63, R65, R70, R78, R223, R232) of THO complex subunit 4, a factor for mRNA transport44. As monomethylated arginine is often a transition state toward dimethylation7, the simultaneous capture of different methylation forms would be helpful in revealing the possible underlying cross-talk between different methylation states.

Bioinformatics analysis of identified methylome

To analyze the sequence specificity of the identified methylome, the motif-x tool38 was used. As can be seen in Figure S-8, several known motifs, including RG, RGG, PR and RXXXP, were significantly enriched. Of the 700 methylation sites on Arg, 66.4% occurred at RG and RGG sites of glycine-arginine-rich (GAR) sequences, while sequences containing proline at any or all of the positions x(P)R(P)xxx(P) accounted for 22.7% of the identified methylation sites, largely in accordance with previous reports16,27. However, no significant motif was found for Lys residues, and the sequence characteristics of Lys sites are evidently different from those of Arg sites (Figure 3D), implying that methylation on Arg and Lys residues follows different regulation mechanisms. In the GO cellular compartment analysis, as shown in Figure 5A, the number of methylated proteins was greater in the nucleosolic than in the cytosolic fraction, suggesting a predominant occurrence of protein methylation in the nucleus. As can be seen in Figure 5B, gene ontology (GO) classification analysis revealed enrichment of methylated proteins involved in RNA processing (P=1.2×10^-16) and RNA metabolic process (P=2.7×10^-20). Additionally, methylated proteins were enriched in molecular functions including nucleic acid binding (P=4.6×10^-15), protein binding (P=3.8×10^-10) and nucleotide binding (P=9.8×10^-8).
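
The grouping of methyl-Arg sites into GAR-type (RG, RGG) and proline-containing (x(P)R(P)xxx(P)) classes discussed above can be sketched as a simple window classification. This is not the motif-x algorithm, which performs a statistical enrichment analysis; it is only a hypothetical illustration of how a site could be assigned to one of these classes, with proline checked at positions -1, +1 and +4 relative to the arginine. The function name and the example windows are made up.

```python
def classify_arg_site(window, center=None):
    """Assign a methyl-Arg site to a motif class from a window centred on the Arg."""
    if center is None:
        center = len(window) // 2
    if window[center] != "R":
        raise ValueError("window must be centred on an arginine")
    following = window[center + 1:center + 3]
    if following.startswith("GG"):
        return "RGG"
    if following.startswith("G"):
        return "RG"
    minus1 = window[center - 1] if center > 0 else ""
    plus1 = window[center + 1:center + 2]
    plus4 = window[center + 4:center + 5]
    if "P" in (minus1, plus1, plus4):
        return "proline"
    return "other"


# Made-up 9-residue windows with the arginine at the central position.
for w in ("AAAGRGGFA", "AAAARGAAA", "SSAPRAAAA", "AAAARAAAP", "AAAARAAAA"):
    print(w, classify_arg_site(w))
```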

The relatively small size of the methyl group and the structural diversity introduced by the addition of different numbers of methyl groups make it impossible to develop highly efficient pan-specific antibodies for the global capture of methylated proteins or peptides. Antibodies currently used in methylation studies are developed to target one specific methylation state, and these antibodies also suffer from low enrichment efficiency, especially for lysine methylation13,20. Different methylation states, although highly correlated, cannot be enriched with a single antibody, even for a single lysine or arginine residue. Thus, antibody-based enrichment is not the optimal option for the global investigation of protein methylation. In contrast, the antibody-free approach has the potential to enrich methylated peptides in all methylation forms. Indeed, all five methylation states, mono-, di- and tri-methylated Lys and mono- and di-methylated Arg, were observed in this study. However, it should be mentioned that 700 (91.1%) of the methylated sites occurred on arginine residues. The prevalence of arginine methylation has also been observed in multiple studies, for example 96.1% in the conventional SCX and HILIC approach27 (10 on Lys and 249 on Arg), 83.9% in the OFFGEL approach16 (18 on Lys and 94 on Arg) and 86.2% in the antibody-based approach18 (160 on Lys and 1000 on Arg). Most likely, Methyl-K occurs much less frequently than Methyl-R in proteins in vivo. Nevertheless, we believe that the SCXtip separation developed in this study has biases. Firstly, this method may be biased toward some methylation forms because of the different cleavage efficiencies at the methylated residues or the different affinities of the methylated peptides for SCX. The low number of Methyl-K identifications may arise because monomethylation of K cannot efficiently inhibit trypsin cleavage and because methyl-K has weaker binding affinity for the SCX material. Secondly, as it exploits charge differences between methylated peptides and fully cleaved tryptic peptides, the SCXtip is unable to differentiate methylated peptides with high acidic residue content or with phosphorylated residues from unmethylated tryptic peptides. As shown in Figure S-9, the percentage of acidic residues in the eluted fractions is lower than that in the loading and washing fractions, which indicates that methylated peptides with high acidic residue content may be retained poorly on the SCXtip and are thus prone to be lost during the loading and washing process. Besides, methylated peptides with ≥2 methylation sites (miscleavage sites) have higher affinity for SCX, which indicates a bias toward detecting such methylated peptides.

In summary, by exploiting the fact that methylation on Arg and Lys inhibits protease cleavage at these sites, we developed a chromatography-based method to enrich methylated peptides. Compared with the immunoaffinity purification approach, a significant advantage of this antibody-free approach is that it allows the enrichment of methylated peptides with different methylation forms. The combination of the enhanced protein digestion method and high-pH SCXtip separation dramatically improved the performance of site-specific characterization of protein methylation at the proteome level. Our approach should thus find broad application in understanding the regulation of biological processes by protein methylation at the system level.

Acknowledgments: This work was supported, in part, by funds from the China State Key Basic Research Program Grants (2016YFA0501402, 2013CB911202, 2012CB910101, and 2012CB910604) and the National Natural Science Foundation of China (21235006, 21321064, 21535008, 81430072, 81361128015). MY is a recipient of the National Science Fund of China for Distinguished Young Scholars (21525524).

Notes: The authors declare that they have no competing interests.

REFERENCES

(1) Afjehi-Sadat, L.; Garcia, B. A. Curr Opin Chem Biol 2013, 17, 12-19.
(2) Lau, H. T.; Lewis, K. A.; Ong, S. E. Methods Mol Biol 2014, 1188, 161-175.
(3) Ong, S.-E.; Mittler, G.; Mann, M. Nat Methods 2004, 1, 119-126.
(4) Dillon, S. C.; Zhang, X.; Trievel, R. C.; Cheng, X. Genome Biol 2005, 6, 227.
(5) Lee, D. Y.; Teyssier, C.; Strahl, B. D.; Stallcup, M. R. Endocr Rev 2005, 26, 147-170.
(6) Bedford, M. T.; Richard, S. Mol Cell 2005, 18, 263-272.
(7) Bedford, M. T.; Clarke, S. G. Mol Cell 2009, 33, 1-13.
(8) Zhou, H.; Ye, M.; Dong, J.; Corradini, E.; Cristobal, A.; Heck, A. J.; Zou, H.; Mohammed, S. Nat Protoc 2013, 8, 461-480.
(9) Ambler, R. P.; Rees, M. W. Nature 1959, 184, 56-57.
(10) Kim, W.; Bennett, E. J.; Huttlin, E. L.; Guo, A.; Li, J.; Possemato, A.; Sowa, M. E.; Rad, R.; Rush, J.; Comb, M. J.; Harper, J. W.; Gygi, S. P. Mol Cell 2011, 44, 325-340.
(11) Udeshi, N. D.; Mertins, P.; Svinkina, T.; Carr, S. A. Nat Protoc 2013, 8, 1950-1960.
(12) Guan, K. L.; Yu, W.; Lin, Y.; Xiong, Y.; Zhao, S. Nat Protoc 2010, 5, 1583-1595.
(13) Carlson, S. M.; Moore, K. E.; Green, E. M.; Martin, G. M.; Gozani, O. Nat Protoc 2014, 9, 37-50.
(14) Moore, K. E.; Carlson, S. M.; Camp, N. D.; Cheung, P.; James, R. G.; Chua, K. F.; Wolf-Yadlin, A.; Gozani, O. Mol Cell 2013, 50, 444-456.
(15) Liu, H.; Galka, M.; Mori, E.; Liu, X.; Lin, Y. F.; Wei, R.; Pittock, P.; Voss, C.; Dhami, G.; Li, X.; Miyaji, M.; Lajoie, G.; Chen, B.; Li, S. S. Mol Cell 2013, 50, 723-735.
(16) Bremang, M.; Cuomo, A.; Agresta, A. M.; Stugiewicz, M.; Spadotto, V.; Bonaldi, T. Mol Biosyst 2013, 9, 2231-2247.
(17) Cao, X. J.; Arnaudo, A. M.; Garcia, B. A. Epigenetics 2013, 8, 477-485.
(18) Guo, A.; Gu, H.; Zhou, J.; Mulhern, D.; Wang, Y.; Lee, K. A.; Yang, V.; Aguiar, M.; Kornhauser, J.; Jia, X.; Ren, J.; Beausoleil, S. A.; Silva, J. C.; Vemulapalli, V.; Bedford, M. T.; Comb, M. J. Mol Cell Proteomics 2014, 13, 372-387.
(19) Geoghegan, V.; Guo, A.; Trudgian, D.; Thomas, B.; Acuto, O. Nat Commun 2015, 6, 6758.
(20) Wu, Z.; Cheng, Z.; Sun, M.; Wan, X.; Liu, P.; He, T.; Tan, M.; Zhao, Y. Mol Cell Proteomics 2015, 14, 329-339.
(21) Sylvestersen, K. B.; Horn, H.; Jungmichel, S.; Jensen, L. J.; Nielsen, M. L. Mol Cell Proteomics 2014, 13, 2072-2088.
(22) Olsen, J. B.; Cao, X. J.; Han, B.; Chen, L. H.; Horvath, A.; Richardson, T. I.; Campbell, R. M.; Garcia, B. A.; Nguyen, H. Mol Cell Proteomics 2016, 15, 892-905.
(23) Islam, K.; Bothwell, I.; Chen, Y.; Sengelaub, C.; Wang, R.; Deng, H.; Luo, M. J Am Chem Soc 2012, 134, 5909-5915.
(24) Bothwell, I. R.; Islam, K.; Chen, Y.; Zheng, W.; Blum, G.; Deng, H.; Luo, M. J Am Chem Soc 2012, 134, 14905-14912.
(25) Wang, R.; Zheng, W.; Yu, H.; Deng, H.; Luo, M. J Am Chem Soc 2011, 133, 7648-7651.
(26) Tozser, J.; Bagossi, P.; Zahuczky, G.; Specht, S. I.; Majerova, E.; Copeland, T. D. Biochem J 2003, 372, 137-143.
(27) Uhlmann, T.; Geoghegan, V. L.; Thomas, B.; Ridlova, G.; Trudgian, D. C.; Acuto, O. Mol Cell Proteomics 2012, 11, 1489-1499.
(28) Beausoleil, S. A.; Jedrychowski, M.; Schwartz, D.; Elias, J. E.; Villen, J.; Li, J.; Cohn, M. A.; Cantley, L. C.; Gygi, S. P. Proc Natl Acad Sci U S A 2004, 101, 12130-12135.
(29) Ning, Z.; Star, A. T.; Mierzwa, A.; Lanouette, S.; Mayne, J.; Couture, J. F.; Figeys, D. Chem Commun 2016, 52, 5474-5477.
(30) Britton, H. T. S.; Robinson, R. A. Journal of the Chemical Society (Resumed) 1931, 1456-1462.
(31) Rappsilber, J.; Mann, M.; Ishihama, Y. Nat Protoc 2007, 2, 1896-1906.
(32) Cox, J.; Neuhauser, N.; Michalski, A.; Scheltema, R. A.; Olsen, J. V.; Mann, M. J Proteome Res 2011, 10, 1794-1805.
(33) Cox, J.; Mann, M. Nat Biotechnol 2008, 26, 1367-1372.
(34) Scholz, C.; Weinert, B. T.; Wagner, S. A.; Beli, P.; Miyake, Y.; Qi, J.; Jensen, L. J.; Streicher, W.; McCarthy, A. R.; Westwood, N. J.; Lain, S.; Cox, J.; Matthias, P.; Mann, M.; Bradner, J. E.; Choudhary, C. Nat Biotechnol 2015, 33, 415-423.
(35) Humphrey, S. J.; Azimifar, S. B.; Mann, M. Nat Biotechnol 2015, 33, 990-995.
(36) Sharma, K.; D'Souza, R. C.; Tyanova, S.; Schaab, C.; Wisniewski, J. R.; Cox, J.; Mann, M. Cell Rep 2014, 8, 1583-1594.
(37) Zielinska, D. F.; Gnad, F.; Schropp, K.; Wisniewski, J. R.; Mann, M. Mol Cell 2012, 46, 542-548.
(38) Schwartz, D.; Gygi, S. P. Nat Biotechnol 2005, 23, 1391-1398.
(39) Saveliev, S.; Bratz, M.; Zubarev, R.; Szapacs, M.; Budamgunta, H.; Urh, M. Nat Methods 2013, 10.
(40) Keil, B. Essential Substrate Residues for Action of Endopeptidases; Springer Berlin Heidelberg, 1992, p 43-228.
(41) Zhang, Y. E. Dev Cell 2011, 20, 289-290.
(42) Tagami, H.; Ray-Gallet, D.; Almouzni, G.; Nakatani, Y. Cell 2004, 116, 51-61.
(43) Kalsotra, A.; Cooper, T. A. Nat Rev Genet 2011, 12, 715-729.
(44) Suganuma, H.; Kumada, M.; Omi, T.; Gotoh, T.; Lkhagvasuren, M.; Okuda, H.; Kamesaki, T.; Kajii, E.; Iwamoto, S. FEBS J 2005, 272, 2696-2704.

Figures

Figure 1. The integrated approach for global analysis of protein methylation. Methylated lysine/arginine are labeled in hM-SILAC medium. Proteins are then digested with multiple enzymes, followed by enrichment of the methylated peptides with the high-pH SCXtip.

Figure 2. Methylation inhibits cleavage by trypsin. (A-D) MALDI mass spectrometric analyses of synthetic peptides with and without methylation after digestion with trypsin. (A) Unmethylated peptide, EIAQDF-K-TDLR. (B) Mono-methylated (K) peptide, EIAQDF-K(Me1)-TDLR. (C) Di-methylated (K) peptide, EIAQDF-K(Me2)-TDLR. (D) Tri-methylated (K) peptide, EIAQDF-K(Me3)-TDLR. (E) The methylation-induced miscleavage gives the methylated peptides one extra charge after trypsin digestion.

Figure 3. The global view of the methylproteome of HepG2 cells. (A) The overlap of mono-, di- and tri-methylated lysine residues. (B) The overlap of mono- and di-methylated arginine residues. (C) Distribution of identified methylation types on lysine and arginine residues. (D) Sequence logo representation of the identified Methyl-K and Methyl-R sites.

Figure 4. An example of annotated MS/MS spectra of a peptide (from H3F3B, Histone H3.3) identified using our protocol, with all three methylation (K) states identified. y ions, b ions and fragment ions with a neutral loss are shown in red, blue and yellow, respectively.

Figure 5. GO classification and motif analysis of the identified methylome. (A) Cellular compartment terms associated with methylated proteins. (B) GO molecular functions and GO biological processes enriched in the identified methylated proteins.


For TOC only
