MT-MAMS: Protein Methyltransferase Motif Analysis by Mass

Article

MT-MAMS: protein methyltransferase motif analysis by mass spectrometry Joshua J. Hamey, Ryan J. Separovich, and Marc R. Wilkins J. Proteome Res., Just Accepted Manuscript • Publication Date (Web): 29 Aug 2018 Downloaded from http://pubs.acs.org on August 29, 2018

Journal of Proteome Research

MT-MAMS: protein methyltransferase motif analysis by mass spectrometry

Joshua J. Hamey1, Ryan J. Separovich1, Marc R. Wilkins1*

1 School of Biotechnology and Biomolecular Sciences, University of New South Wales, New South Wales, 2052, Australia

*To whom correspondence should be addressed: Marc R. Wilkins. Tel.: +61 2 9385 3633. E-mail address: [email protected]

Abstract

Protein methyltransferases often recognise their substrates through linear sequence motifs. The determination of these motifs is critical to understanding methyltransferase mechanism, function and drug targeting. Here we describe MT-MAMS (methyltransferase motif analysis by mass spectrometry), a quantitative approach to characterise methyltransferase substrate recognition motifs. In MT-MAMS, peptide sets are synthesised which contain all amino acid substitutions at single positions within a template sequence. These are then incubated with the methyltransferase of interest in the presence of deuterated S-adenosyl methionine (D3-AdoMet). The use of this heavy methyl donor gives unique mass shifts to methylated peptides, allowing their unambiguous quantification by mass spectrometry. The stoichiometry of methylation resulting from each substitution is then derived, and finally the methyltransferase substrate recognition motif is generated. We validated MT-MAMS by application to lysine methyltransferase G9a, generating the substrate recognition motif (TKRN)-(A>RS>G)-(R>>K)-K-(STRCKMAQHG)-Φ; this is highly similar to that previously determined by peptide arrays. We then determined the recognition motif of yeast lysine elongation factor methyltransferase 1 (Efm1) to be (Y>FW)-K-^P-G-G-Φ. This is a new type of lysine methyltransferase recognition motif that only contains non-charged residues, excluding the target lysine. We further determined recognition motifs of major yeast and human arginine methyltransferases Hmt1 and PRMT1, revealing them to be ^(DE)-^(DE)-R-(G>>A)-(GN>RAW)-(FYW>ILKHM) and ^(DE)-^(DE)-R-(G>>N)-(GR>ANK)-(K>YHMFILW), respectively. These motifs expand significantly on the canonical RGG recognition motif and include the negative specificity of these enzymes, a feature unique to MT-MAMS. Finally, we show that MT-MAMS can be used to generate insights into the processivity of protein methyltransferases.

Key words

Methyltransferase, protein methylation, motif, substrate recognition, enzyme specificity, synthetic peptides, quantitative mass spectrometry, G9a, PRMT1, Hmt1

Introduction

Protein methyltransferases are promising targets for the development of new therapeutics [1]. Like kinases, methyltransferases often recognise their substrate proteins through short linear sequence motifs [2-7]. Quantitative characterisation of kinase recognition motifs led to advances in kinase drug design [8] and was crucial to understanding the effects of inhibitors [9]. For protein methyltransferases, there is thus a pressing need for methods to quantitatively determine their substrate recognition motifs, especially given that methylation is of such high functional importance [10]. Existing motif detection methods that use peptide arrays [11, 12], while useful, suffer a number of drawbacks, most notably an inability to distinguish mono-, di- and tri-methylation.

Here we have developed methyltransferase motif analysis by mass spectrometry (MT-MAMS). MT-MAMS uses mass spectrometry to accurately and simultaneously quantify the methylation of a peptide and its variants that carry single amino acid substitutions. Using a template sequence from a known substrate, peptide mixtures are synthesised wherein a single position near the target residue is substituted to all 20 amino acids (Figure 1A). These 20-peptide sets, representing all possible amino acids at one position in the template sequence, are then assayed with the methyltransferase, before the stoichiometry of methylation on each substituted peptide is quantified by mass spectrometry (Figure 1A). Since methylation confers the same mass shift as many amino acid substitutions, assays are carried out using a deuterated form of S-adenosyl methionine (D3-AdoMet). This confers a +17.0345 Da, +34.0690 Da or +51.1034 Da mass shift to mono-, di- or tri-methylated peptides, respectively. These mass shifts are completely unique and distinguishable with high resolution mass spectrometry from all amino
acid substitutions that are present in the 20-peptide set. The only exceptions are substitutions of the isobaric amino acids leucine and isoleucine, which are quantified together. Together, this process allows the quantification of enzyme-mediated mono-, di- or tri-methylation for each substituted peptide, which is then used to develop the substrate recognition motif (Figure 1A).

Experimental Section

Cloning, expression and purification of proteins

G9a with an N-terminal GST tag was purchased from Sigma-Aldrich (SRP0135). Efm1 was cloned into pET15b from Saccharomyces cerevisiae genomic DNA, with a C-terminal 6xHis tag, by Gibson assembly with the Gibson Assembly® Cloning Kit (New England Biolabs). PRMT1v2 and PRMT1v1 in pET15b were a kind gift from Dr. Jason Low at The University of Sydney. Yeast eEF1A and Hmt1 were cloned into pET15b previously [13, 14]. Proteins were expressed in Escherichia coli Rosetta (DE3) and purified according to previous methods [13], except for PRMT1v2 and PRMT1v1, which were purified on a Profinia™ Affinity Chromatography Protein Purification System (BioRad) using the ‘Native IMAC’ method.

Methylation assays

For MT-MAMS, synthetic 20-peptide sets (2 μM per peptide, 40 μM total) (ChinaPeptides) were incubated with or without methyltransferase (1 μM) in an in vitro methylation buffer (50 mM HEPES, 20 mM NaCl, 1 mM EDTA, pH 7.4) in the presence of S-Adenosyl-L-methionine-D3 (S-methyl-D3) tetra(p-Toluenesulfonate) salt (500 μM) (Medical Isotopes) overnight at 30 °C (Efm1
and Hmt1) or 37 °C (G9a, PRMT1v2 and PRMT1v1). Assay samples were then cleaned up for mass spectrometry with 100 μL Bond Elut OMIX C18 tips (Agilent) according to the manufacturer’s instructions. For the whole-protein eEF1A methylation assay, purified eEF1A (1 μM) was incubated with or without Efm1 (1 μM) in the in vitro methylation buffer in the presence of 500 μM AdoMet overnight at 30 °C. Protein methylation assays were separated by SDS-PAGE, eEF1A gel bands were digested with trypsin or LysargiNase (Proteolysis Lab, IBMB-CSIC Barcelona Science Park, Barcelona, Spain) and samples were prepared for mass spectrometry according to previous methods [13], except that the LysargiNase digestion buffer was 50 mM HEPES, 5 mM CaCl2, pH 7.5.

Mass spectrometry

MT-MAMS samples were analysed by LC-MS/MS on a Fusion Lumos, a Q Exactive Plus or an LTQ Orbitrap Velos (Thermo Fisher Scientific), according to previous methods [13]. In particular, precursor scans were acquired with a resolution of 60,000 at m/z 200 for the Fusion Lumos, 70,000 at m/z 200 for the Q Exactive Plus or 30,000 at m/z 400 for the LTQ Orbitrap Velos. eEF1A methylation assay samples were analysed by LC-MS/MS on a Q Exactive Plus (Thermo Fisher Scientific) in the same way, except that inclusion lists containing the m/z values for the methylated peptides of interest were used [13]. All mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE [15] partner repository with the dataset identifier PXD009515.

Data analysis

To obtain the theoretical mass of peptides methylated with D3-AdoMet, 17.0345 Da (monomethyl), 34.0690 Da (dimethyl) or 51.1034 Da (trimethyl) was added to the monoisotopic mass of the unmethylated peptide. Extracted ion chromatograms (XICs) for all methylation states of peptides were obtained in Thermo Xcalibur Qual Browser 2.2 SP1.48 at ±10 ppm of the theoretical m/z of the monoisotopic peak, with a filter to only analyse precursor (MS1) spectra. The areas under the curve of XIC peaks corresponding to methylated and unmethylated peptides were determined using the ICIS algorithm, with a constrained peak width corresponding to 5% of the peak height and a tailing factor of 9, and a multiplet resolution of 2. Methylation stoichiometry was then determined as the percentage of each methylation state compared to the total abundance of the peptide, given by the summed abundances of all methylation states.

Methyltransferase substrate recognition motifs were constructed by first calculating the amount of information present at each particular position in the motif, expressed as bits, as is used for motifs derived from sequence alignments [16]. First, a methylation score ($m_x$) for each peptide was calculated as follows:

$$m_x = s_{x:\mathrm{me1}} + (2 \times s_{x:\mathrm{me2}}) + (3 \times s_{x:\mathrm{me3}})$$

where $s_{x:\mathrm{me1/2/3}}$ represents the stoichiometry of mono-, di- or tri-methylation for peptide $x$. Methylation scores for all substituted peptides in a single 20-peptide set (corresponding to a single position in the motif) were then normalised by dividing the methylation score of peptide $x$ by the summed methylation scores of all peptides in the set, to give a normalised methylation score ($n_x$):

$$n_x = \frac{m_x}{\sum_{i=1}^{N} m_i}$$

where $N$ represents the number of different amino acid substitution possibilities being analysed (19 for G9a and Efm1 and 18 for Hmt1, PRMT1v2 and PRMT1v1) and $m_i$ represents the methylation score $m$ of the peptide containing amino acid $i$. Secondly, the total information at the site ($I_{\mathrm{total}}$) was calculated by subtracting the observed entropy from the maximum possible entropy [16]:

$$I_{\mathrm{total}} = \log_2 N - \left( - \sum_{i=1}^{N} n_i \log_2 n_i \right)$$

where $n_i$ represents the normalised methylation score of the peptide containing amino acid $i$. Finally, the total information at the site was multiplied by the normalised methylation score for each different amino acid substitution, giving a scaled representation of the amino acid preference.

For generating negative substrate recognition motifs of Hmt1, PRMT1v2 and PRMT1v1, the above procedure was followed except that a negative methylation score ($\mathit{negm}_x$) was used instead:

$$\mathit{negm}_x = s_{x:\mathrm{me1}} + (2 \times s_{x:\mathrm{me0}})$$

where $s_{x:\mathrm{me0/1}}$ represents the stoichiometry of unmethylated or monomethylated peptide $x$.

Raw data from the whole-protein eEF1A methylation assay were converted to Mascot Generic Format (.mgf) using RawConverter [17] (v. 1.0.0.0). Converted data were searched against the SwissProt database (2015_11, 549,832 sequences to 2015_12, 550,116 sequences) and the contaminants database (10062014) using Mascot (v. 2.4, Matrix Science) hosted by the Walter
and Eliza Hall Institute for Medical Research (Melbourne, Australia). The following settings were used: Taxonomy: Saccharomyces cerevisiae; Enzyme: Trypsin or LysargiNase; Max missed cleavages: 2; Precursor ion tolerance: 4 ppm; Fragment ion tolerance: 10 mmu; Peptide charge: 2+, 3+ and 4+; Instrument: Q-Exactive_Gen; Variable modifications: Oxidation (M), Methyl (K), Dimethyl (K), Trimethyl (K) and Methyl (DE).

Differentiating target-site and substituted arginine methylation by parallel reaction monitoring

Quantification of methylation at arginines substituted in positions +1, +2 and +3 by Hmt1, PRMT1v2 and PRMT1v1 was performed on a Fusion Lumos (Thermo Fisher Scientific) by parallel reaction monitoring (PRM) using electron-transfer dissociation (ETD). Inclusion lists were generated of the triply- and quadruply-charged arginine-substituted peptide (GGFGGPR[RG/GR]YGGYSR or GGFGGPRGGRGGYSR) carrying up to four methyl groups. The instrument was then set to run in targeted MS2 mode with one MS1 scan acquired in the Orbitrap (scan range = 300-1500 m/z, resolution = 60,000, automated gain control target = $4 \times 10^{5}$, maximum injection time = 50 ms) followed by 10 consecutive MS2 scans of inclusion-listed m/z values. Precursors were isolated (isolation width = 3 m/z), fragmented by ETD (“Use Calibrated Charge-Dependent ETD Parameters” set to “True” and “ETD Supplemental Activation” set to “False”) and then fragment ions were analysed in the Orbitrap (scan range = 350-1000 m/z, resolution = 30,000, automated gain control target = $5 \times 10^{4}$, maximum injection time = 54 ms). Extracted ion chromatograms (±10 ppm) were obtained for all fragment ions that differentiate localisation between the two arginines, and the areas under the curve of these XIC peaks were obtained as above. Next, for each fragment ion type (e.g. the c7 ion), the
percentage that corresponds to target-site arginine methylation, as opposed to introduced arginine methylation, was calculated. The percentages for all fragment ions deriving from a single precursor ion were then averaged and used to correct the abundance of the precursor ion.

Results and Discussion

To validate motif determination by MT-MAMS, we applied it to human lysine trimethyltransferase G9a, a SET-domain containing enzyme for which a substrate recognition motif has already been described [2]: (NTGS)-(GCS)-R-K-(TGQSVMA)-(FVILA). Seven 20-peptide sets, derived from a histone H3 peptide template (ARTKQTARKSTGGKA; the target lysine, K9, is the ninth residue), were synthesised and assayed with G9a to analyse positions -4 to +3 relative to the target lysine. In accordance with the described activity of G9a, peptides were found to be mono-, di- or tri-methylated (Figure S-1). Notably, the methylation stoichiometries obtained by MT-MAMS showed a high degree of reproducibility (Figure S-2). Compared to the previously described motif [2], MT-MAMS successfully recapitulated the strong preference for arginine in the -1 position, as well as the preference for hydrophobic residues in the +2 position (Figure 1B). In the -3 and -2 positions, we observed preferred residues that were very similar to those seen previously [2] (ARSGC in the -2 position and TKRNSQGH in the -3 position) (Figure 1B). Interestingly, we observed that G9a could also accept lysine in the -1 position. This may explain its ability to methylate K373 in p53 [18] and K303 in ERα [19], both of which have lysine in this -1 position. Overall, MT-MAMS determined the substrate recognition motif of G9a to be (TKRN)-(A>RS>G)-(R>>K)-K-(STRCKMAQHG)-Φ, where Φ represents any hydrophobic residue. This motif largely agrees with, but refines, the motif described previously [2].

We next applied MT-MAMS to a non-histone lysine monomethyltransferase without a described substrate recognition motif, the Saccharomyces cerevisiae SET-domain enzyme elongation factor methyltransferase 1 (Efm1) [20]. In vitro, we found that Efm1 could monomethylate its substrate protein, elongation factor 1A, at both K30 and K253 (Figure S-3A,B,C). This is due to an apparent recognition motif that extends from positions -2 to +4 relative to the target lysine (Figure S-3D). 20-peptide sets were therefore synthesised for these six positions, using template sequences corresponding to both K30 (HLIYKCGGIDK; the target lysine is the fifth residue) and K253 (QDVYKIGGIGT; the target lysine is the fifth residue), and assayed with Efm1. As expected, Efm1 exclusively catalysed monomethylation on all peptides (Figures S-4 and S-5). Remarkably, MT-MAMS generated near-identical substrate recognition motifs from either template sequence (Figure 1C), demonstrating that motif determination is independent of sequence context. The high dynamic range afforded by mass spectrometric analysis (>1000×) allowed us to definitively assign a complete absence of methylation when proline is substituted at the +1 position (Figures S-4 and S-5), indicating that Efm1 is intolerant of backbone inflexibility at this position. Overall, MT-MAMS revealed the substrate recognition motif of Efm1 to be (Y>FW)-K-^P-G-G-Φ, where ^P represents any amino acid except proline. This recognition motif is unique among lysine methyltransferases as it does not require charged residues, apart from the target lysine (Figure 2).
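As an illustration of how a motif written in this notation could be applied, the following minimal sketch translates the Efm1 motif (Y>FW)-K-^P-G-G-Φ into a regular expression and locates candidate target lysines in a sequence. The residue set used for Φ and the function name are assumptions made for the example and are not part of the published method; a regular expression also captures only which residues are allowed, not the order of preference (Y over F and W).

```python
import re

# Assumption: "Φ" (any hydrophobic residue) is taken here as A, V, I, L, M, F, W, Y.
# The text does not enumerate this set, so adjust it as needed.
HYDROPHOBIC = "AVILMFWY"

# (Y>FW)-K-^P-G-G-Φ as a regular expression:
# position -1 is Y, F or W; then the target K; +1 is anything but P; +2 and +3 are G; +4 is hydrophobic.
EFM1_MOTIF = re.compile(rf"[YFW]K[^P]GG[{HYDROPHOBIC}]")


def find_motif_sites(sequence: str) -> list[int]:
    """Return 1-based positions of lysines whose sequence context matches the motif.

    The lysine is the second residue of each match, hence start + 2.
    Hypothetical helper for illustration only.
    """
    return [match.start() + 2 for match in EFM1_MOTIF.finditer(sequence)]


# The two template peptides used for Efm1 both match, with the target lysine at residue 5.
print(find_motif_sites("HLIYKCGGIDK"))  # [5]
print(find_motif_sites("QDVYKIGGIGT"))  # [5]
# Substituting proline at +1 removes the match, mirroring the ^P position of the motif.
print(find_motif_sites("HLIYKPGGIDK"))  # []
```

A scan of this kind only flags sequences that are compatible with the motif; it says nothing about whether a site is methylated in vivo.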

While substrate recognition motifs of many lysine methyltransferases have been elucidated, very few arginine methyltransferases have had their full substrate recognition motifs determined. We therefore applied MT-MAMS to Hmt1 and PRMT1, the predominant arginine methyltransferases in S. cerevisiae and human, respectively [21]. In the case of PRMT1, we analysed two different isoforms thought to have different substrate specificities [22]: PRMT1v2, the main isoform, and PRMT1v1. These seven-beta-strand methyltransferases catalyse mono- and asymmetric di-methylation in RGG sequences; however, there is evidence that they have broader specificity [23]. For all three methyltransferases, we utilised a template sequence derived from a known Hmt1 substrate, Npl3 (GGFGGPRGGYGGYSR; the target arginine is the seventh residue). Eight 20-peptide sets corresponding to positions -4 to +4 were synthesised and assayed with each methyltransferase. All three enzymes produced mono- and/or di-methylation on substrate peptides (Figures S-6, S-7 and S-8). On initial analysis, we observed that peptides with arginine substituted in the +1, +2 and +3 positions appeared to be strongly preferred by all three methyltransferases (Figure S-9). However, reanalysis of these peptides by parallel reaction monitoring (PRM) revealed that the introduced arginine was in fact being methylated, giving a false-positive signal (Figures S-10, S-11 and S-12). This type of detailed analysis is not possible with array-based methods and explains why previous PRMT1 substrate recognition motifs have been strongly biased towards arginine [12]. We therefore corrected the methylation stoichiometries of arginine-substituted peptides in the +1, +2 and +3 positions to obtain the motifs of Hmt1 (Figure 3A), PRMT1v2 (Figure 3B) and PRMT1v1 (Figure 3C). The substrate recognition motif of Hmt1 was revealed to be ^(DE)-^(DE)-R-(G>>A)-(GN>RAW)-(FYW>ILKHM) (Figure 3A), while both isoforms of PRMT1
recognised the motif ^(DE)-^(DE)-R-(G>>N)-(GR>ANK)-(K>YHMFILW) (Figure 3B,C). This indicates that RG is the most important feature for substrate recognition by these enzymes, as a glycine in the +2 position is less strictly preferred than glycine in the +1 position. It also shows that the +3 position plays a role in substrate recognition, with Hmt1 preferring aromatic residues and PRMT1 preferring positively charged or hydrophobic residues. The motif also shows a striking disallowance of acidic residues in the -1 and -2 positions for Hmt1, PRMT1v2 and PRMT1v1. Since MT-MAMS quantifies the unmethylated peptide alongside its methylated forms, we could use a reversed scoring method to develop these negative specificity profiles for all three enzymes. We note that array-based methods cannot detect negative specificity, as they do not quantify the unmethylated peptide. Overall, MT-MAMS confirmed and significantly expanded upon the canonical RGG motif recognised by Hmt1 and PRMT1. It will be of interest to elucidate the substrate recognition motifs of other PRMTs by MT-MAMS, particularly PRMT5, since it is a novel drug target [24].

We finally explored whether the methylation stoichiometry data generated by MT-MAMS could generate insights into methyltransferase processivity. For all peptides methylated by G9a, Hmt1, PRMT1v2 and PRMT1v1, we plotted the fraction of maximum methylation events (observed methylation score divided by the maximum possible methylation score) against the stoichiometries of un-, mono-, di- and/or tri-methylated peptide. We simultaneously plotted the expected methylation stoichiometries for a distributive (non-processive) enzyme, as has been demonstrated previously [25]. This revealed that G9a does not catalyse trimethylation of H3K9 peptides through a completely distributive or processive mode of action (Figure 4A). Instead, it catalyses dimethylation in a distributive manner, while subsequent trimethylation is
substantially less favourable (Figure 4B). This supports previous observations that G9a is a strong dimethyltransferase and a much weaker trimethyltransferase [26]. For Hmt1 and both PRMT1 isoforms, similar analyses revealed that dimethylation occurred at a higher rate than would be expected from a distributive mode of action (Figure 4C-E). This supports previous observations that PRMT1 catalyses dimethylation in a partially processive manner [27]. MT-MAMS can therefore provide insight into enzyme processivity, due to its ability to detect and quantify all methylation states.

Conclusions

We have shown that MT-MAMS is an effective method for elucidating methyltransferase substrate recognition motifs. When compared to peptide array-based methods, it has the advantages of quantifying different degrees of methylation, of measuring negative specificity and of being able to resolve methylation site localisation ambiguity. Since the majority of protein methyltransferases likely recognise linear motifs, MT-MAMS will be widely applicable in characterising this important class of enzymes and will aid in the development of therapeutics to target them.

Acknowledgements

The authors thank A/Prof. Mark Raftery, Dr. Ling Zhong and Sydney Liu Lau for their maintenance of the Orbitrap mass spectrometers housed at the Bioanalytical Mass Spectrometry Facility within the Mark Wainwright Analytical Centre of the University of New South Wales. This work was supported by the Australian Research Council (DP170100108).

Supporting Information

The following supporting information is available free of charge at the ACS website http://pubs.acs.org

Figure S-1: G9a MT-MAMS methylation stoichiometries.
Figure S-2: MT-MAMS is highly reproducible.
Figure S-3: Efm1 monomethylates eEF1A at lysines 30 and 253 in vitro due to the presence of a common motif.
Figure S-4: Efm1 MT-MAMS methylation stoichiometries with K30 sequence template.
Figure S-5: Efm1 MT-MAMS methylation stoichiometries with K253 sequence template.
Figure S-6: Hmt1 MT-MAMS methylation stoichiometries.
Figure S-7: PRMT1v2 MT-MAMS methylation stoichiometries.
Figure S-8: PRMT1v1 MT-MAMS methylation stoichiometries.
Figure S-9: Apparent Hmt1 and PRMT1 motifs show preference for arginine in downstream positions.
Figure S-10: Arginines substituted at the +1, +2 and +3 positions are methylated by Hmt1.
Figure S-11: Arginines substituted at the +1, +2 and +3 positions are methylated by PRMT1v2.
Figure S-12: Arginines substituted at the +1, +2 and +3 positions are methylated by PRMT1v1.

Figure 1: MT-MAMS workflow and application to lysine methyltransferases G9a and Efm1. (A) The MT-MAMS method showing the workflow for a single representative position in a motif. First, 20-peptide sets are synthesised wherein a single position near the target residue (lysine, arginine or any methylated residue) is substituted to every other amino acid. These are then individually assayed with or without the methyltransferase. D3-AdoMet is used as the methyl donor in order to give a unique mass shift to methylated peptides (+17.0345 Da for monomethyl, +34.0690 Da for dimethyl and +51.1034 Da for trimethyl). After analysis by LC-MS/MS, the relative abundance of un-, mono-, di- and/or tri-methylated peptide is quantified from the enzyme-treated sample, giving the methylation stoichiometries resulting from each substitution. Stoichiometry data for all positions in the motif are then used to generate a sequence logo representation of the full substrate recognition motif. (B) Sequence logo representation of G9a lysine methyltransferase specificity as determined by MT-MAMS. 20-peptide sets corresponding to positions -4, -3, -2, -1, +1, +2 and +3 in a template sequence from histone H3 (ARTKQTARKSTGGKA; the target lysine, K9, is the ninth residue) were assayed with G9a in triplicate and analysed by LC-MS/MS. The target lysine K9 is shown as the maximum possible height of a stack ($\log_2 19 \approx 4.25$) for the sake of reference. Inset: Zoom-in of positions +1/S10 and +2/T11. (C) Sequence logo representation of lysine methyltransferase Efm1 specificity as determined by MT-MAMS. 20-peptide sets corresponding to positions -2, -1, +1, +2, +3 and +4 in template sequences from K30 (HLIYKCGGIDK; the target lysine is the fifth residue) (top) or K253 (QDVYKIGGIGT; the target lysine is the fifth residue) (bottom) were assayed with Efm1 in triplicate and analysed by LC-MS/MS. The target lysines K30 and K253 are shown as the maximum possible height of a stack ($\log_2 19 \approx 4.25$) for the sake of reference.
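The mass shifts and stoichiometries described for the workflow can be checked with a few lines of arithmetic. The sketch below derives the D3-methyl mass shift from standard monoisotopic atomic masses and converts XIC peak areas into percentage stoichiometries. The function names and the example peak areas are hypothetical and chosen purely for illustration; they are not taken from the study's data.

```python
# Monoisotopic atomic masses in Da (standard values).
MASS_H = 1.00782503   # hydrogen-1
MASS_D = 2.01410178   # deuterium
MASS_C = 12.0         # carbon-12

# Adding one CD3 group replaces one hydrogen on the target residue:
# net change = +C +3D -H.
CD3_SHIFT = MASS_C + 3 * MASS_D - MASS_H


def methyl_shift(n_methyl: int) -> float:
    """Mass shift (Da) for a peptide carrying n_methyl CD3 groups."""
    return n_methyl * CD3_SHIFT


def stoichiometry(xic_areas: dict[str, float]) -> dict[str, float]:
    """Percentage of each methylation state relative to the summed XIC areas."""
    total = sum(xic_areas.values())
    if total <= 0:
        raise ValueError("no signal detected for this peptide")
    return {state: 100.0 * area / total for state, area in xic_areas.items()}


for n in (1, 2, 3):
    print(n, round(methyl_shift(n), 4))

# Hypothetical peak areas for one substituted peptide (not real data).
example_areas = {"me0": 6.0e6, "me1": 3.0e6, "me2": 1.0e6}
print(stoichiometry(example_areas))
```

Rounded to four decimal places, the computed shifts agree with the +17.0345, +34.0690 and +51.1034 Da values quoted in the caption.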

Figure 2: Efm1 has a unique substrate recognition motif among lysine methyltransferases. The substrate recognition motifs of lysine methyltransferases characterised to date (ATXR5 [28], Cl4 [29], Dim-5 [30], Efm1 (this study), G9a [2], NSD1 [4], SET7/9 [3], SET8 [31], SUV39H1 [7], SUV39H2 [32], SUV4-20H1 [6] and SUV4-20H2 [6]) are shown according to the preferred amino acids in positions around the target lysine. In the case of multiple amino acid preferences, the order indicates a decreasing preference. Blank squares are positions that do not constitute the motif. Dashes indicate positions within the motif where there is no preferred amino acid. Every lysine methyltransferase characterised so far recognises at least one positively or negatively charged residue, excluding the target lysine, whereas Efm1 does not.

Figure 3: MT-MAMS uncovers the substrate recognition motifs of yeast and human arginine methyltransferases Hmt1 and PRMT1. MT-MAMS was employed to determine the positive (top) and negative (bottom) substrate recognition motifs of arginine methyltransferases Hmt1 (A), PRMT1v2 (B) and PRMT1v1 (C), as visualised by sequence logo representation. 20-peptide sets corresponding to positions -4, -3, -2, -1, +1, +2, +3 and +4 in a template sequence from Hmt1 substrate Npl3 (GGFGGPRGGYGGYSR; the target arginine is the seventh residue) were assayed in triplicate with either Hmt1, PRMT1v2 or PRMT1v1 and analysed by LC-MS/MS. Cysteine-substituted peptides were not detected in these analyses. The target arginine is shown as the maximum possible height of a stack ($\log_2 18 \approx 4.17$) for the sake of reference.
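The stack heights in these sequence logos follow from the scoring described under Data analysis. The sketch below is one possible implementation, written from that description rather than from the authors' own code: function names are hypothetical, the negative score follows our reading of the text (unmethylated peptide weighted twice, monomethylated once), and peptides with a score of zero are assumed to contribute nothing to the entropy.

```python
import math


def methylation_score(s_me1: float, s_me2: float = 0.0, s_me3: float = 0.0) -> float:
    """Positive score: stoichiometries weighted by number of methyl groups."""
    return s_me1 + 2 * s_me2 + 3 * s_me3


def negative_methylation_score(s_me0: float, s_me1: float) -> float:
    """Reversed score for negative motifs (assumed form: me1 + 2 x me0)."""
    return s_me1 + 2 * s_me0


def stack_heights(scores: dict[str, float]) -> dict[str, float]:
    """Letter heights in bits for one motif position.

    scores maps each substituted amino acid to its (positive or negative)
    methylation score; N is the number of amino acids supplied.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("no methylation observed at this position")
    normalised = {aa: m / total for aa, m in scores.items()}
    # Observed entropy; terms with a zero score are skipped (0 * log 0 taken as 0).
    entropy = -sum(p * math.log2(p) for p in normalised.values() if p > 0)
    information = math.log2(len(scores)) - entropy
    return {aa: p * information for aa, p in normalised.items()}


# With 18 detectable amino acids, a position where only glycine is methylated
# reaches the maximum stack height of log2(18), about 4.17 bits.
amino_acids = "ADEFGHIKLMNPQRSTVW"
only_glycine = {aa: (100.0 if aa == "G" else 0.0) for aa in amino_acids}
print(round(stack_heights(only_glycine)["G"], 2))
```

When every amino acid is methylated equally, the observed entropy equals the maximum and all letter heights collapse to zero, which is why uninformative positions appear flat in the logos.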


Figure 4: MT-MAMS provides insights into methyltransferase processivity. (A,B) The G9a-catalysed methylation stoichiometries obtained for all peptides analysed by MT-MAMS (n = 133) were plotted (as points) relative to their fraction of maximum methylation events (methylation score divided by the maximum possible methylation score, i.e. 300) and co-plotted (as lines) with the expected stoichiometries for an enzyme that catalyses trimethylation distributively (A) or dimethylation distributively followed by trimethylation (B). (C,D,E) The Hmt1-catalysed (C), PRMT1v2-catalysed (D) or PRMT1v1-catalysed (E) methylation stoichiometries obtained for all peptides analysed by MT-MAMS (n = 144) were plotted (as points) relative to their fraction of maximum methylation events (methylation score divided by the maximum possible methylation score, i.e. 200) and co-plotted (as lines) with the expected stoichiometries for an enzyme that catalyses dimethylation distributively.
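To make the x-axis of these plots concrete, the Python sketch below computes a methylation score and its fraction of the maximum, then generates one possible set of expected stoichiometries. Two assumptions are made and should be read as such. First, the score is taken to be the sum of each methylation state's percentage weighted by its number of methyl groups, which is consistent with the stated maxima of 300 (trimethylation) and 200 (dimethylation). Second, the expected curves use a simple binomial model in which each possible methylation event occurs independently with probability equal to the fraction of maximum methylation events. The model actually used for the plotted lines, including the two-stage model of panel B, is not given in this caption, and all function names here are hypothetical.

```python
from math import comb


def methylation_score(stoich_percent):
    """Sum of percentages weighted by methyl count (assumed definition).

    `stoich_percent[k]` is the percentage of peptide carrying k methyl groups.
    """
    return sum(k * pct for k, pct in stoich_percent.items())


def fraction_of_max(score, max_methyls):
    """Score divided by the maximum possible score (300 for 3, 200 for 2)."""
    return score / (100.0 * max_methyls)


def expected_distributive(fraction, max_methyls):
    """Binomial expectation (an assumed model, not necessarily the authors').

    Returns percentages for 0, 1, ..., max_methyls methyl groups.
    """
    return [
        100.0 * comb(max_methyls, k) * fraction**k
        * (1.0 - fraction) ** (max_methyls - k)
        for k in range(max_methyls + 1)
    ]


# Hypothetical lysine peptide: 30% mono-, 20% di-, 10% trimethylated.
score = methylation_score({1: 30.0, 2: 20.0, 3: 10.0})
print(score)                      # 100.0
print(fraction_of_max(score, 3))  # one third of the maximum of 300

# Hypothetical arginine peptide at half of the maximum of 200.
print(expected_distributive(0.5, 2))  # [25.0, 50.0, 25.0]
```

Under this model, observed points that fall close to the binomial curves would be consistent with distributive catalysis, whereas an excess of the fully methylated state at low overall methylation would point towards processivity.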


References

1. Copeland, R.A., Solomon, M.E. & Richon, V.M. Protein methyltransferases as a target class for drug discovery. Nat Rev Drug Discov 8, 724-732 (2009).
2. Rathert, P. et al. Protein lysine methyltransferase G9a acts on non-histone targets. Nat Chem Biol 4, 344-346 (2008).
3. Dhayalan, A., Kudithipudi, S., Rathert, P. & Jeltsch, A. Specificity analysis-based identification of new methylation targets of the SET7/9 protein lysine methyltransferase. Chem Biol 18, 111-120 (2011).
4. Kudithipudi, S., Lungu, C., Rathert, P., Happel, N. & Jeltsch, A. Substrate specificity analysis and novel substrates of the protein lysine methyltransferase NSD1. Chem Biol 21, 226-237 (2014).
5. Kusevic, D., Kudithipudi, S. & Jeltsch, A. Substrate Specificity of the HEMK2 Protein Glutamine Methyltransferase and Identification of Novel Substrates. J Biol Chem 291, 6124-6133 (2016).
6. Weirich, S., Kudithipudi, S. & Jeltsch, A. Specificity of the SUV4-20H1 and SUV4-20H2 protein lysine methyltransferases and methylation of novel substrates. J Mol Biol 428, 2344-2358 (2016).
7. Kudithipudi, S., Schuhmacher, M.K., Kebede, A.F. & Jeltsch, A. The SUV39H1 Protein Lysine Methyltransferase Methylates Chromatin Proteins Involved in Heterochromatin Formation and VDJ Recombination. ACS Chem Biol 12, 958-968 (2017).
8. al-Obeidi, F.A., Wu, J.J. & Lam, K.S. Protein tyrosine kinases: structure, substrate specificity, and drug discovery. Biopolymers 47, 197-223 (1998).
9. Miller, C.J. & Turk, B.E. Homing in: Mechanisms of Substrate Targeting by Protein Kinases. Trends Biochem Sci 43, 380-394 (2018).
10. Biggar, K.K. & Li, S.S. Non-histone protein methylation as a regulator of cellular signalling and function. Nat Rev Mol Cell Biol 16, 5-17 (2015).
11. Kudithipudi, S., Kusevic, D., Weirich, S. & Jeltsch, A. Specificity analysis of protein lysine methyltransferases using SPOT peptide arrays. J Vis Exp, e52203 (2014).
12. Gayatri, S. et al. Using oriented peptide array libraries to evaluate methylarginine-specific antibodies and arginine methyltransferase substrate motifs. Sci Rep 6, 28718 (2016).
13. Hamey, J.J. et al. Novel N-terminal and Lysine Methyltransferases That Target Translation Elongation Factor 1A in Yeast and Human. Mol Cell Proteomics 15, 164-176 (2016).
14. Hart-Smith, G., Low, J.K., Erce, M.A. & Wilkins, M.R. Enhanced methylarginine characterization by post-translational modification-specific targeted data acquisition and electron-transfer dissociation mass spectrometry. J Am Soc Mass Spectrom 23, 1376-1389 (2012).
15. Vizcaino, J.A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res 44, D447-456 (2016).
16. Schneider, T.D. & Stephens, R.M. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18, 6097-6100 (1990).
17. He, L., Diedrich, J., Chu, Y.Y. & Yates, J.R., 3rd. Extracting Accurate Precursor Information for Tandem Mass Spectra by RawConverter. Anal Chem 87, 11361-11367 (2015).
18. Huang, J. et al. G9a and Glp methylate lysine 373 in the tumor suppressor p53. J Biol Chem 285, 9636-9641 (2010).
19. Zhang, X. et al. G9a-mediated methylation of ERalpha links the PHF20/MOF histone acetyltransferase complex to hormonal gene expression. Nat Commun 7, 10810 (2016).
20. Lipson, R.S., Webb, K.J. & Clarke, S.G. Two novel methyltransferases acting upon eukaryotic elongation factor 1A in Saccharomyces cerevisiae. Arch Biochem Biophys 500, 137-143 (2010).
21. Tang, J. et al. PRMT1 is the predominant type I protein arginine methyltransferase in mammalian cells. J Biol Chem 275, 7723-7730 (2000).
22. Goulet, I., Gauvin, G., Boisvenue, S. & Cote, J. Alternative splicing yields protein arginine methyltransferase 1 isoforms with distinct activity, substrate specificity, and subcellular localization. J Biol Chem 282, 33009-33021 (2007).
23. Wooderchak, W.L. et al. Substrate profiling of PRMT1 reveals amino acid sequences that extend beyond the "RGG" paradigm. Biochemistry 47, 9456-9466 (2008).
24. Chan-Penebre, E. et al. A selective inhibitor of PRMT5 with in vivo and in vitro potency in MCL models. Nat Chem Biol 11, 432-437 (2015).
25. Frederiks, F. et al. Nonprocessive methylation by Dot1 leads to functional redundancy of histone H3K79 methylation states. Nat Struct Mol Biol 15, 550-557 (2008).
26. Zhang, X. et al. Structural basis for the product specificity of histone lysine methyltransferases. Mol Cell 12, 177-185 (2003).
27. Osborne, T.C., Obianyo, O., Zhang, X., Cheng, X. & Thompson, P.R. Protein arginine methyltransferase 1: positively charged residues in substrate peptides distal to the site of methylation are important for substrate binding and catalysis. Biochemistry 46, 13370-13381 (2007).
28. Bergamin, E. et al. Molecular basis for the methylation specificity of ATXR5 for histone H3. Nucleic Acids Res 45, 6375-6387 (2017).
29. Kusevic, D., Kudithipudi, S., Iglesias, N., Moazed, D. & Jeltsch, A. Clr4 specificity and catalytic activity beyond H3K9 methylation. Biochimie 135, 83-88 (2017).
30. Rathert, P., Zhang, X., Freund, C., Cheng, X. & Jeltsch, A. Analysis of the substrate specificity of the Dim-5 histone lysine methyltransferase using peptide arrays. Chem Biol 15, 5-11 (2008).
31. Kudithipudi, S., Dhayalan, A., Kebede, A.F. & Jeltsch, A. The SET8 H4K20 protein lysine methyltransferase has a long recognition sequence covering seven amino acid residues. Biochimie 94, 2212-2218 (2012).
32. Schuhmacher, M.K., Kudithipudi, S., Kusevic, D., Weirich, S. & Jeltsch, A. Activity and specificity of the human SUV39H2 protein lysine methyltransferase. Biochim Biophys Acta 1849, 55-63 (2015).
