QM/MM Investigation of Substrate and Product Specificities of Suv4-20h2


Article

QM/MM Investigation of Substrate and Product Specificities of Suv4-20h2: How Does This Enzyme Generate Dimethylated H4K20 from Mono-methylated Substrate? Ping Qian, Haobo Guo, Liang Wang, and Hong Guo J. Chem. Theory Comput., Just Accepted Manuscript • Publication Date (Web): 10 May 2017 Downloaded from http://pubs.acs.org on May 15, 2017


Journal of Chemical Theory and Computation is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Chemical Theory and Computation

QM/MM Investigation of Substrate and Product Specificities of Suv4-20h2: How Does This Enzyme Generate Di-methylated H4K20 from Mono-methylated Substrate?

Ping Qian1*, Haobo Guo2, Liang Wang1, and Hong Guo2,3*

1 Shandong Agricultural University, Chemistry and Material Science Faculty, Tai’an 271018, Shandong, People’s Republic of China

2 Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA

3 UT/ORNL Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA

* Corresponding authors. E-mail: [email protected]; Fax: +1(865)974-6306; E-mail: [email protected]


ACS Paragon Plus Environment


ABSTRACT. Protein lysine methyltransferases (PKMTs) catalyze the methylation of lysine residues on histone proteins in the regulation of chromatin structure and gene expression. In contrast to many other PKMTs, for which unmodified lysine is the methylation target, the enzymes in the Suv4-20 family generate the di-methylated product (H4K20me2) exclusively from the mono-methylated H4K20 substrate (H4K20me1). The origin of this substrate/product specificity is still not clear. Here, molecular dynamics (MD) and free energy (potential of mean force) simulations are undertaken using quantum mechanical/molecular mechanical (QM/MM) potentials to understand the substrate/product specificities of Suv4-20h2, a member of the Suv4-20 family. The free energy barriers for mono-, di- and tri-methylation in Suv4-20h2 obtained from the simulations correlate well with the specificities observed experimentally: di-methylation of the H4K20me1 substrate is allowed, whereas mono-methylation of H4K20 and tri-methylation of H4K20me2 are prohibited. It is demonstrated that the relatively efficient di-methylation results from an effective transition state (TS) stabilization through strengthening of the CH···O interactions, as well as from the presence of a cation-π interaction at the transition state. The simulations also show that the failures of Suv4-20h2 to catalyze the mono-methylation and tri-methylation are due, respectively, to a less effective TS stabilization and to the inability of the reactant complex containing H4K20me2 to adopt a reactive (near attack) configuration for methyl transfer. The results suggest that care must be exercised when predicting substrate specificity based only on the existence of near attack configurations in the substrate complexes.


Introduction

Protein lysine methyltransferases (PKMTs) methylate lysine residues on the histone amino termini, and such methylation is one of the important posttranslational modifications of histone proteins.1-3 Considerable efforts have been made to understand the structural and energetic origins of PKMT specificity.4, 5 Nevertheless, questions remain concerning the features of the enzymes’ active sites and the interactions that control their specificity. This is especially true for the Suv4-20 enzymes (e.g., Suv4-20h1 and Suv4-20h2), which are among the most important PKMTs for histone H4 lysine 20 (H4K20) methylation.6, 7 Indeed, Suv4-20h1, Suv4-20h2 and other enzymes in this family exhibit appreciable methylation activity on the H4K20me1 substrate, but not on H4K20 (i.e., H4K20me0).8-10 This is in contrast to many other SET domain methyltransferases, which target unmodified lysine on their substrates in the generation of mono-, di- or tri-methylated lysine in the products. Although a number of biochemical and structural studies have been performed, it is still not clear why the Suv4-20 enzymes can add a methyl group only to H4K20me1, but not to H4K20me0 or H4K20me2.

SET domain PKMT catalysis exhibits product specificity, i.e., a given PKMT produces a specific methylation product (mono-, di- or tri-methylated lysine) from an unmodified lysine residue in the substrate (with the exception of the enzymes in the Suv4-20 family; see above).11, 12 Some important features of the active sites that control the product specificity have been identified. One such feature is the so-called Phe/Tyr switch,13 in which the nature of the residue (Phe or Tyr) at a unique position of the active site controls the methylation state of the product, with Phe associated with a higher methylation level and Tyr with a lower methylation level.11-21 For instance, the wild-type DIM-5 produces the tri-methylated product from an unmodified lysine, whereas its F281Y mutant (in which the Phe/Tyr switch residue is mutated to Tyr) generates mono- and di-methylated lysine in the products.13 Similarly, PR-Set7 (also known as SET8) can be converted from an H4K20 mono-methylase to a di-methylase when Y334 (the Phe/Tyr switch residue) is replaced by Phe.17 The Phe/Tyr switch phenomenon may be explained by a change in the size of the active site.22-27 For instance, the active site of PR-Set7 may be rather crowded for the addition of two methyl groups,24 and the deletion of the hydroxyl group on Y334 and the water dissociation from the active site due to the Y334→F mutation can create additional space for the second methylation reaction to occur.17, 24

Although PR-Set7 is an H4K20 mono-methylase, the knockout of PR-Set7 (SET8) in mice was found to lead to the loss of all three methylated forms of H4K20, not just H4K20me1.28 A sequential model was then proposed in which H4K20me1 produced by PR-Set7 serves as the substrate for the Suv4-20 enzymes, which catalyze the next level of methylation.29 This model is consistent with the observation that the double knockout of the Suv4-20h1 and Suv4-20h2 enzymes led to a reduction of H4K20me2/me3 along with an increase in H4K20me1.30, 31 Additional biochemical studies8-10 demonstrated that the Suv4-20h1 and Suv4-20h2 enzymes indeed prefer mono-methylated H4K20 peptides as substrates, as the activity of these enzymes on the H4K20me1 peptides was found to be 3- to 28-fold higher than that on H4K20me0. Moreover, the Suv4-20 enzymes were unable to generate H4K20me3 from the H4K20me2 substrate.

The X-ray crystallographic structure of the Suv4-20h2 complex containing AdoHcy and the H4K20me2 peptide (i.e., the product complex from the methylation of H4K20me1) has been determined.8 Although this experimental structure is extremely useful, it is not sufficient to answer some of the important questions concerning the specificities of the Suv4-20 enzymes. For instance, the crystal structure showed that Ser161 at the active site forms a hydrogen bond to the H4K20me2 Nζ atom, but the question remains as to whether the same hydrogen bond would also be present in the reactant complex containing H4K20me2 and thus play a role in inhibiting the tri-methylation, as suggested. Phe191 in Suv4-20h2 presumably interacts with the methyl group of H4K20me1 based on the X-ray structure.
This so-called CH3-π hydrogen bond was proposed to effectively lock the K20me1 Nζ group in a position suitable for the methyl transfer to H4K20me1.8 However, it is still not clear whether the lack of this interaction in Suv4-20h2 complexed with H4K20me0 would be responsible for the failure of the enzyme to catalyze mono-methylation.8 Indeed, previous studies have shown that di-methyltransferases are capable of catalyzing the mono-methylation (i.e., the first methyl transfer from S-adenosyl-L-methionine, or AdoMet, to unmodified lysine), in addition to the di-methylation (i.e., the second methyl transfer to mono-methylated lysine).22-27 Similar observations have been made for tri-methyltransferases. For these enzymes, the lack of the methyl group on the unmodified lysine residue seems to have no major effect on the orientation of lysine Nζ or on the energetics of the first methyl transfer.23, 25 Thus, the question is why this could become an issue for the Suv4-20h enzymes.

Previous studies of the PKMT methylation reactions have mainly focused on the product specificity of the enzymes, as the target lysine is generally unmodified in those cases. The observation that the enzymes in the Suv4-20 family prefer the mono-methylated H4K20 peptide as the substrate generates new questions concerning the factors that determine the substrate specificity for lysine and methyl lysine. Previous experimental investigations have not provided detailed energetic explanations for this question, and computer simulations, which were used earlier for other PKMTs,22-27, 32-44 may be applied to address the specificity questions for the Suv4-20 enzymes. In the present study, MD and potential of mean force (PMF) simulations were undertaken using QM/MM potentials to study the Suv4-20h2-catalyzed reactions and to understand the substrate and product specificities of the enzyme. The free energy barriers for mono-, di- and tri-methylation obtained from these simulations are in line with the experimental reports that the enzyme can only catalyze di-methylation of the mono-methylated substrate, whereas it is inactive on the unmodified or di-methylated H4K20 substrate. The origin of this specificity is discussed, and the differences in the free-energy barriers of the methyl transfers were found to account for the experimentally observed specificities of the enzyme.

Methods

Quantum mechanical/molecular mechanical (QM/MM) free energy and MD simulations were performed to study the active-site dynamics of Suv4-20h2 and to determine the free energy profiles of mono-, di- and tri-methylation as a result of the methyl transfers from S-adenosyl-L-methionine (AdoMet) to the ε-amino groups of H4K20me0, H4K20me1 and H4K20me2 in Suv4-20h2, respectively, using the CHARMM program.45 The –CH2–CH2–S+(Me)–CH2– portion of S-adenosyl-L-methionine and the sidechains of lysine/methyl-lysine were treated quantum mechanically and the rest of the system molecular mechanically in the simulations; a larger QM region was also used to determine the QM effects on certain key interactions and free energy barriers (see below). The QM and MM regions were separated using the link-atom approach.46 For the solvent, a modified TIP3P water model47 was used. The simulations were performed using the stochastic boundary molecular dynamics method.48 The system was separated into a reaction zone and a reservoir region, with the reaction zone containing a reaction region (a sphere with radius r of 20 Å) and a buffer region (20 Å ≤ r ≤ 22 Å). The reference center was the Nζ atom of the lysine/methyl lysine residue. The DFTB3 method49-52 was applied to the QM atoms and the CHARMM potential function (PARAM27)53 was adopted for the MM atoms.

High-level quantum mechanical calculations (e.g., B3LYP/6-31G** or MP2/6-31G**) are still too time-consuming for free energy simulations of enzymes. The semi-empirical approach adopted here has generated results that seem to be quite reasonable for a number of systems.51, 52 Indeed, we have used a small model system to compare the energetics of the methyl transfer based on the B3LYP/6-31G** and DFTB3 methods. The potential energy curves obtained are rather similar, and the difference in the energy barriers is only about 1.6 kcal/mol (see Supporting Information). Moreover, the bond breaking and making events studied in this work are similar SN2 reaction processes, and much of the error for the three methyl transfers presumably cancels out when the relative free energy barriers are compared. The relative values of the free-energy barriers, rather than their absolute values, are likely to be more important for the substrate/product specificity and are less sensitive to the quantum mechanical method used in the study because of this cancellation of errors. The systems studied in this work contained around 5500 atoms, including about 700-800 water molecules.
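The partitioning into reaction region, buffer region and reservoir described above can be illustrated with a short sketch. This is not the CHARMM setup used in the study; it only shows how the two radii quoted in the text (20 Å and 22 Å) divide atoms by their distance from the reference center (the Nζ atom). The function name, zone labels and coordinate format are assumptions introduced here for illustration.

```python
import math

# Radii (in Å) are taken from the Methods text; all names are illustrative.
REACTION_RADIUS = 20.0
BUFFER_RADIUS = 22.0


def classify_atom(position, center,
                  r_reaction=REACTION_RADIUS, r_buffer=BUFFER_RADIUS):
    """Assign an atom to a zone by its distance from the reference center.

    position, center: (x, y, z) tuples in Å. In this sketch the boundary
    value r = 20 Å is counted as part of the reaction region.
    """
    r = math.dist(position, center)
    if r <= r_reaction:
        return "reaction"   # inner sphere around the N-zeta atom
    if r <= r_buffer:
        return "buffer"     # shell between 20 and 22 Å
    return "reservoir"      # everything outside the reaction zone
```

For example, with the center at the origin, an atom at (21, 0, 0) falls in the buffer shell, while an atom at (30, 0, 0) belongs to the reservoir.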
The structures of the reactant complexes for mono-, di- and tri-methylation were generated from the structure of the protein complex (PDB code: 4AU7) containing Suv4-20h2, AdoHcy and the H4K20me2 peptide (i.e., the product complex resulting from the di-methylation of H4K20me1).8 To obtain the reactant complexes, AdoMet was generated by manually adding a methyl group to AdoHcy. For the H4K20me0 methylation, the two methyl groups on the H4K20me2 peptide in the crystallographic complex were manually deleted to generate the target lysine. For the H4K20me1 methylation, one of the two methyl groups (i.e., CMH3, which points to AdoHcy in the crystallographic complex) was manually deleted to generate the mono-methylated target lysine. The steepest descent and adopted-basis Newton-Raphson methods were used to optimize the stochastic boundary systems, which were then heated from 50.0 to 298.15 K in 50 ps. The time step for integration of the equations of motion was 1 fs, and the coordinates were saved every 50 fs. The QM/MM MD simulations were performed for 3 ns for each of the reactant complexes for mono-, di- and tri-methylation, and the distribution maps of r(CM-Nζ) and θ were generated based on the data from 1~1.5 ns.

A previous study showed that the SN2 methyl transfers were normally more efficient when the AdoMet S-CH3 group could be aligned well with the Nζ lone pair of electrons in the reactant complex, i.e., with smaller θ angles and shorter CM-Nζ distances.24 Here θ is defined as the angle between the CM-Sδ direction (r2) and the direction of the electron lone pair (r1) (see Figure 1a). Therefore, we generated the distribution maps of r(CM-Nζ) and θ from the simulations to examine the connection between the distributions and the substrate/product specificities.

The umbrella sampling method54 along with the Weighted Histogram Analysis Method (WHAM)55 was used to calculate the free energy changes for the methyl transfers from AdoMet to H4K20me0, H4K20me1 or H4K20me2 in the enzyme. Two sets of free energy simulations with different QM regions were undertaken. For the first set of simulations, involving a relatively small QM region, the –CH2–CH2–S+(Me)–CH2– portion of AdoMet and the sidechains of lysine/methyl-lysine were described by the quantum mechanical approach; the rest of the system was described by the molecular mechanical method. For this first set of simulations, the free energy profiles were determined for all three methylation reactions. For the second set of simulations, containing a relatively large quantum mechanical region, the sidechains of Phe191 and Tyr217 were also treated by QM, in addition to the QM atoms used in the first set of simulations. Phe191 and Tyr217 participate in the key interactions during substrate binding and the methyl transfers (see below), and determining the effects of the QM treatment of these two residues on the free energy barriers would be of interest. In the case of the larger QM region, the free energy simulations were undertaken for the mono- and di-methylation. The linear combination of r(CM-Nζ) and r(CM-Sδ) [i.e., R = r(CM-Sδ) – r(CM-Nζ)] was used as the reaction coordinate (see Figure 1a). For the first/second/third methyl transfer process, 25/23/34 windows were used, respectively, and production runs of 50 ps were undertaken after 50 ps of equilibration for each window. The harmonic biasing potentials had force constants of 50-400 kcal mol–1 Å–2. For each of the five systems (three with the relatively small QM region and two with the relatively large QM region), five independent PMF simulations were performed (i.e., 25 PMF simulations in total). The free energy barriers and statistical errors were taken as the average values and standard deviations from the five runs, respectively, for each system.

In order to determine whether the results obtained from the simulations are meaningful, we compared the structure of the active site for the di-methylation product generated from the PMF simulations (with the relatively large QM region) with the corresponding experimental X-ray structure (PDB ID: 4AU7) in Figure 1b. Figure 1b shows that the experimental active site was reproduced quite well by the simulations; the average error is less than 0.2 Å for the key interaction distances (or only ~0.1 Å if the distance between the C=O group of Ile181 and CM1 is not considered). Good agreement was also obtained for the di-methylation product based on the relatively small QM region. The ability of our simulations to reproduce the experimental structure of the product complex suggests that these simulations are meaningful.

Results and Discussion

The representative structures of the active site for the reactant complexes obtained from the simulations for the mono-, di- and tri-methylation with the H4K20me0, H4K20me1 and H4K20me2 substrates, respectively, are given in Figure 2a-c. The distribution map of r(CM-Nζ) and θ is also given in each case.
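The geometric quantities used above, r(CM-Nζ), θ and the reaction coordinate R = r(CM-Sδ) – r(CM-Nζ), can be computed from Cartesian coordinates as sketched below. Two points are assumptions made here and not taken from the paper: the lone-pair direction r1 is approximated as the unit vector opposite to the sum of the three unit bond vectors from Nζ to its substituents, and r2 is read as the vector pointing from CM to Sδ, so that θ approaches 0° for an ideal in-line SN2 arrangement. The function names are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _unit(v):
    """Return v scaled to unit length (v must be non-zero)."""
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def lone_pair_direction(n_zeta, substituents):
    """ASSUMPTION: approximate the N-zeta lone-pair direction (r1) as the
    unit vector opposite to the sum of the unit N-zeta -> substituent bonds.
    Not defined for a perfectly planar nitrogen (zero vector sum)."""
    total = (0.0, 0.0, 0.0)
    for atom in substituents:
        bond = _unit(_sub(atom, n_zeta))
        total = tuple(t + b for t, b in zip(total, bond))
    return _unit(tuple(-t for t in total))


def attack_geometry(n_zeta, substituents, c_m, s_delta):
    """Return (r(CM-N-zeta) in Å, theta in degrees, R in Å)."""
    r1 = lone_pair_direction(n_zeta, substituents)
    r2 = _unit(_sub(s_delta, c_m))            # ASSUMPTION: CM -> S-delta
    cos_theta = sum(a * b for a, b in zip(r1, r2))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against round-off
    theta = math.degrees(math.acos(cos_theta))
    r_cn = math.dist(c_m, n_zeta)
    r_cs = math.dist(c_m, s_delta)
    return r_cn, theta, r_cs - r_cn            # R = r(CM-S) - r(CM-N)
```

With this sign convention R is negative in the reactant complex (short CM-Sδ bond, long CM···Nζ contact) and becomes positive as the methyl group is transferred to Nζ.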
Phe191 interacts with CM1H3 in the X-ray structure (see Figure 1b).8 It was hypothesized that the lack of this interaction in Suv4-20h2 complexed with H4K20 (H4K20me0) might lead to a poor orientation of lysine Nζ, making the methyl transfer from AdoMet to H4K20me0 (mono-methylation) inefficient.4a The reactant structures for mono-methylation obtained from the simulations do not support this suggestion. Indeed, Figure 2a shows that the target lysine has its lone pair of electrons on Nζ well aligned with the methyl group of AdoMet, and there is a large population of structures with relatively short r(CM…Nζ) distances (e.g., ≤ 3.5 Å) and small values of the θ angle (e.g., ≤ 30°). The average distance between Nζ and CM of the methyl group is about 3.7 Å, with the θ angle in the range of 0-45° for most of the structures produced by the simulations. As shown in Figure 2a, Ser161 accepts a hydrogen bond from the sidechain of the target lysine. This interaction is likely to be important for generating the good alignment between the methyl group of AdoMet and the lone pair of electrons of the ε-amino group observed in the simulations. It is interesting to note that the transferable methyl group of AdoMet forms, or has the tendency to form, carbon-oxygen hydrogen bonds with the C=O groups of Phe160, Ala179 and Ile181 as well as with the sidechain of Tyr217. CH···O hydrogen bonding interactions exist widely in protein structures and have been the subject of extensive quantum mechanical calculations.56, 57 For the AdoMet-dependent methyltransferases, it has been suggested58 that the CH···O hydrogen bonds involving the methyl groups may represent an important feature arising from convergent evolution. The existence of extensive CH···O hydrogen bonds in Suv4-20h2 is consistent with this structural feature of AdoMet-dependent methyltransferases. Nevertheless, Suv4-20h2 failed to catalyze the methylation reaction for the H4K20me0 peptide,8 even though the active site structure seems to have a significant population of reactive (near attack) conformations as well as extensive CH···O hydrogen bonds.

Figure 2b shows that for di-methylation the lone pair of electrons on Nζ in the reactant complex can also be aligned well with the transferable methyl group. This good alignment is likely to be achieved through the CM1H3···Phe191 interaction as well as the hydrogen bond interaction involving Ser161. These two interactions seem to effectively lock the K20me1 group in a position suitable for the methylation reaction, supporting the previous suggestion based on the experimental structure of the product complex and consistent with the activity of the enzyme on H4K20me1.8 The average distance between Nζ and CM is around 3.5 Å, with the θ angle mainly between 0° and 40°. The carbon-oxygen hydrogen bonds mentioned above for the reactant complex of mono-methylation also exist here for di-methylation.
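The populations of reactive (near attack) conformations discussed above could be estimated from trajectory frames along the lines of the sketch below. The cutoffs of 3.5 Å and 30° are the example values quoted in the text rather than a definition given by the authors, and the function name and the list-of-pairs input format are assumptions made for illustration.

```python
def near_attack_fraction(samples, r_max=3.5, theta_max=30.0):
    """Fraction of frames with r(CM-N-zeta) <= r_max (Å) and theta <= theta_max (deg).

    samples: iterable of (r, theta) pairs, one per saved trajectory frame.
    The default cutoffs are the example values from the text and can be changed.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no frames supplied")
    hits = sum(1 for r, theta in samples if r <= r_max and theta <= theta_max)
    return hits / len(samples)
```

As the mono-methylation case shows, a sizeable near-attack population by itself does not imply that the reaction is catalyzed, so such a fraction is best read alongside the free energy barriers.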


The structure of the reactant complex for tri-methylation is given in Figure 2c. This structure is significantly different from the experimental structure and from the average structure of the product complex from the di-methylation (Figure 1b). For instance, the Nζ(CH3)2 group has undergone a rotation such that one of the methyl groups is now involved in the CM2H3-π hydrogen bond/hydrophobic interactions with both Trp174 and Phe191 in the reactant complex. The X-ray structure (Figure 1b) and the reactant structures for mono- and di-methylation (Figure 2a-b) showed that Ser161 forms a hydrogen bond to the Nζ atom of lysine or methyl lysine. It was suggested8 that accommodating a third methyl group would require breaking this hydrogen bond, leading to a low activity of Suv4-20h2 toward the H4K20me2 substrate. It should be mentioned that the crystal structure corresponds to the product complex generated from the methylation of H4K20me1, and it is likely that H4K20me2 is protonated in this product complex and could only donate a hydrogen bond to Ser161. The reactant complex for tri-methylation, on the other hand, should have Nζ of H4K20me2 deprotonated in order to accept the third methyl group. Figure 2c shows that the hydrogen bond interaction between Ser161 and deprotonated H4K20me2 seems to be quite weak in the reactant complex of tri-methylation (with a distance of 4.6 Å). Thus, the energetic cost of breaking this hydrogen bond involving Ser161 may not be the main reason for the failure of the enzyme to catalyze the methyl transfer to H4K20me2. As is evident from the structure shown in Figure 2c, the methyl group of AdoMet and the lone pair of electrons of H4K20me2 cannot be well aligned. The average distance between Nζ and CM is about 5 Å, and the θ angle is in the range of 80-140°.
Thus, the failure to generate reactive (near attack) conformations for the reactant complex is likely to be one of the main reasons that Suv4-20h2 is not a tri-methylase (see below for more discussion).

The free-energy profiles for mono-, di- and tri-methylation in Suv4-20h2 are plotted in Figure 3a based on the models with the relatively small QM region. The free energy barrier for mono-methylation was found to be quite high (23.9 kcal/mol). Therefore, the enzyme is likely to be inefficient in catalyzing the corresponding methylation reaction, consistent with the previous experimental observation.8 It is of interest to note from Figure 3a that the free energy barrier for di-methylation (17.9 kcal/mol) is much lower than those for mono-methylation (by as much as 6 kcal/mol) and tri-methylation (by 5.1 kcal/mol). Thus, the di-methylation reaction is likely to be much more efficient than the mono-methylation and tri-methylation processes. This agrees with the experimental finding that Suv4-20h2 generates the di-methylated lysine product exclusively from the mono-methylated H4K20 substrate.8 The results are in contrast with our previous studies on other PKMTs, where it was shown that di-methylases (e.g., G9a-like protein) or tri-methylases (e.g., DIM-5) are all capable of catalyzing the first methyl transfer from AdoMet to the unmodified target lysine, in addition to the second methyl transfer (for di-methylases) or the second and third methyl transfers (for tri-methylases) to the corresponding methylated lysine.22-27 It is of interest to note that the transition state for mono-methylation (R ~ 0.65) is more advanced than those for di-methylation and tri-methylation (R ~ 0.3−0.4). The free-energy profiles for mono- and di-methylation based on the models with the relatively large QM region are plotted in Figure 3b. These two profiles are quite similar to the corresponding profiles in Figure 3a, suggesting that including Phe191 and Tyr217 in the QM region does not have major energetic effects on the free energy profiles and activation barriers.

We have defined an energy triplet (0, Δ2-1W, Δ3-1W) for wild-type PKMTs earlier to describe the relative free-energy barriers for methyl transfers determined from the QM/MM simulations;25 a similar energy triplet has also been defined for mutated PKMTs. Here the second parameter (Δ2-1W) is the difference between the free energy barriers for the second and first methyl transfers, to the mono-methylated and unmodified lysine, respectively. The third parameter (Δ3-1W) is the difference between the free energy barriers for the third and first methyl transfers, to the di-methylated and unmodified lysine, respectively.
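As a rough numerical illustration of the energy triplet defined above, the sketch below builds the triplet from the barriers reported for the small QM region and converts a barrier difference into a relative rate. Several points are assumptions made here: the tri-methylation barrier of 23.0 kcal/mol is inferred from the stated 5.1 kcal/mol difference from di-methylation, the rate estimate uses simple transition state theory with identical prefactors, and the function names and the `reference` parameter are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1


def energy_triplet(barriers, reference=0):
    """Barriers (kcal/mol) relative to the chosen reference barrier.

    barriers: (mono, di, tri) free energy barriers.
    reference: index of the barrier used as zero (0 = first methyl transfer,
    as in the definition given in the text).
    """
    ref = barriers[reference]
    return tuple(round(b - ref, 1) for b in barriers)


def relative_rate(barrier_a, barrier_b, temperature=298.15):
    """k_a / k_b from the barrier difference.

    ASSUMPTION: transition state theory with equal prefactors, so the ratio
    is exp(-(dG_a - dG_b) / RT); 298.15 K is the simulation temperature.
    """
    return math.exp(-(barrier_a - barrier_b) / (GAS_CONSTANT * temperature))


# Mono and di barriers are quoted in the text; the tri barrier is inferred
# here as 17.9 + 5.1 kcal/mol.
suv420h2_barriers = (23.9, 17.9, 17.9 + 5.1)
triplet = energy_triplet(suv420h2_barriers)       # relative to mono-methylation
di_over_mono = relative_rate(17.9, 23.9)          # on the order of 10^4
```

Under these assumptions, a 6 kcal/mol lower barrier corresponds to a di-methylation rate about four orders of magnitude higher than that of mono-methylation at 298.15 K. The `reference` index is left as a parameter so that other choices of reference barrier can be explored.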
Such energy triplets have been successfully used to understand the product specificity of several PKMTs and their mutants.23-25, 27 For instance, the energy triplet for SET7/9 is (0, 5, 8), so that the free energy barrier for the second or third methyl transfer is much higher than the first barrier, consistent with the fact that the enzyme is a mono-methylase. For DIM-5, the energy triplet is (0, 0, -1). Therefore, all three methyl transfers from AdoMet to the target lysine/methyl lysine can be catalyzed by DIM-5, consistent with the fact that DIM-5 is a tri-methylase. In this earlier definition of the energy triplet, the free energy barrier for the mono-methylation is used as the reference (i.e., 0 for the first parameter). This is because the PKMTs studied previously all have substrates containing an unmodified target lysine. For instance, DIM-5 is capable of catalyzing the first methyl transfer, in addition to the second and third methyl transfers, in order to generate the tri-methylated lysine in the product. As discussed above, Suv4-20h2 is different: it produces the di-methylated lysine product exclusively from the mono-methylated H4K20 substrate (H4K20me1), and the enzyme is not active on the peptide with unmodified lysine (H4K20me0). It appears that a modified version of the energy triplet, in which the methyl transfer process with the lowest barrier is used as the reference, may be more informative. With this new definition, the energy triplet for Suv4-20h2 is (6, 0, 5.1) for the models with the relatively small QM region, showing that only the di-methylation of the H4K20me1 substrate can be catalyzed by this enzyme (i.e., mono- and tri-methylation have free energy barriers that are more than 5 kcal/mol higher).

The results of our simulations given above correlate well with the experimental observation that Suv4-20h2 generates H4K20me2 from H4K20me1. However, the question remains concerning the origin of this substrate/product specificity. One possibility is that there is a more effective transition state stabilization for the di-methylation process. In Figure 4a-c, the representative structures of the active site near the transition state for mono-, di- and tri-methylation are compared. As shown in Figure 4a and b, one of the main differences between these two structures is the existence of the methyl group (CM1H3) on H4K20me1 (Figure 4b), which is involved in the interactions with Phe191 as well as the CH···O hydrogen bonds with the C=O groups of Ala178 and Ile181.
Such interactions do not exist in the structure for mono-methylation in Figure 4a. Comparison of Figure 4b with Figure 2b (the structure of the reactant complex for di-methylation) shows that these interactions seem to strengthen as the system moves from the reactant complex to the transition state. For instance, the distances between CM1 and O for the two CH···O hydrogen bonds involving Ala-178 and Ile-181 decrease from 4.2 and 4.8 Å to 3.7 and 3.9 Å, respectively (for the models with the small QM region). Such strengthening of the electrostatic interactions is probably due to the formation of positive charge on CM1H3 during the CMH3+ transfer. In addition to the enhancement of the CH···O hydrogen bonds, the formation of the positive partial charge on CM1H3 may also allow this group to be involved in a cation-π interaction with Phe-191 in the transition state. Cation-π interactions are attractive interactions between cations and the negative electrostatic potentials of π systems and are believed to make an important contribution to biological recognition and catalysis,59,60 including the recognition of methylated lysine and arginine on histone proteins.61,62 Nevertheless, to the best of our knowledge, the possible contribution of the cation-π interaction to the substrate/product specificity of PKMTs suggested here has not been discussed previously. The existence of the cation-π interaction seems to be supported by the decrease of the average distance between CM1 and Phe-191 from 3.9 Å to 3.6 Å in going from the reactant complex (Figure 2b) to the transition state (Figure 4b).

In addition to the interactions involving the methyl group (CM1H3) on H4K20me1, the general strengthening of the interactions involving the transferable methyl group (CMH3) in going from the reactant state to the transition state may also help to lower the free energy barrier for di-methylation. Table 1 lists the average O···CM distances obtained from the free energy simulations near the reactant, transition and product states for the three methyl transfers. It is of interest to note that for di-methylation, four of the five average O···CM distances become shorter in going from the reactant to the transition state, indicating a potential increase in the strength of the interactions at the transition state; the only exception is the interaction involving Ala-179. The strengthening of the interactions for mono-methylation is less pronounced than that for di-methylation.
Thus, the examination of the active site structures suggests that there is a stronger transition state stabilization for di-methylation than for mono-methylation.

Comparison of the free energy curves for di-methylation and tri-methylation in Figure 3 demonstrates that although the free energy barrier for tri-methylation is about 5.1 kcal/mol higher than that for di-methylation, much of this difference seems to exist already before the bond breaking/making event. Indeed, the free energy difference at R ~ -1.2 Å is already as much as 5 kcal/mol. Comparison of Figures 2b and 4b demonstrates that the structure of the reactant complex for di-methylation (Figure 2b) is rather similar to that of the transition state (Figure 4b). Thus, the generation of such a TS-like reactant conformation through binding already contributes to the TS stabilization for di-methylation compared to the case for tri-methylation. Indeed, comparison of the structures in Figures 2c and 4c demonstrates that the reactant structure for tri-methylation (Figure 2c) is considerably different from the corresponding transition state structure (Figure 4c). Thus, the energetic cost of producing the TS-like structure from this reactant structure would be higher, and this is likely to lead to a higher activation barrier for tri-methylation. Similar arguments were made for other PKMTs.23-25

Conclusions

Quantum mechanical/molecular mechanical MD and free energy simulations were undertaken for the histone H4K20 methyltransferase Suv4-20h2. Although the Suv4-20 enzymes have been a subject of previous experimental investigations, the question remains as to why the enzymes in this family generate the di-methylated lysine product (H4K20me2) exclusively from the mono-methylated H4K20 substrate (H4K20me1). The free energy barriers of the methyl transfers for mono-, di- and tri-methylation in Suv4-20h2 obtained from the simulations provide explanations for the experimental observations concerning the origin of the unique substrate/product specificity of the enzymes in this family. It was shown that the relatively low free energy barrier for di-methylation compared to that for mono-methylation is likely due to a more effective TS stabilization achieved through the strengthening of the CH···O interactions involving the CM1H3 and CMH3 groups as well as the presence of a cation-π interaction between Phe-191 and CM1H3 at the transition state. The relatively high free energy barrier for tri-methylation compared to that for di-methylation seems to be due to the failure of the reactant complex to adopt a reactive (near attack) configuration for the methyl transfer.
The simulations for the reactant complexes showed that although reactive (near attack) conformations for mono-methylation could be formed, the free energy barrier is still quite high. Thus, care must be exercised when predicting the substrate/product specificity based only on the existence of near attack configurations in the substrate complexes.
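As an illustration of the modified energy triplet used in this work, the following minimal sketch recomputes it from the free energy barriers reported for the models with the relatively small QM region (23.9, 17.9, and 23.0 kcal/mol for mono-, di-, and tri-methylation). The function names, the temperature of 298.15 K, and the simple transition-state-theory rate ratio exp(ΔΔG/RT) are assumptions made here for illustration; they are not taken from the simulation protocol.

```python
import math

# Free energy barriers (kcal/mol) reported for the small-QM-region models.
BARRIERS = {"mono": 23.9, "di": 17.9, "tri": 23.0}

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def energy_triplet(barriers, reference=None):
    """Return each barrier relative to a reference barrier.

    With reference=None the lowest barrier is used (the modified definition);
    reference="mono" reproduces the earlier convention.
    """
    ref = min(barriers.values()) if reference is None else barriers[reference]
    return {state: round(value - ref, 1) for state, value in barriers.items()}


def relative_rate(ddg, temperature=298.15):
    """Hypothetical helper: TST-style rate ratio k_ref / k for a barrier
    that is ddg (kcal/mol) higher than the reference barrier.
    The temperature is an assumed value, not one quoted in the text."""
    return math.exp(ddg / (R_KCAL * temperature))


triplet = energy_triplet(BARRIERS)
print("triplet (lowest barrier as reference):", triplet)
print("triplet (mono-methylation as reference):", energy_triplet(BARRIERS, "mono"))
for state, ddg in triplet.items():
    print(f"{state}: di-methylation faster by a factor of ~{relative_rate(ddg):.1e}")
```

Under these assumptions the triplet is (6.0, 0.0, 5.1), and barrier differences of 5 to 6 kcal/mol correspond to rate ratios of roughly 5 × 10^3 to 2.5 × 10^4, which is one way to read the statement that only di-methylation is effectively catalyzed.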

Acknowledgment

This work was supported in part by grant 0817940 from the National Science Foundation (H.G.). Oak Ridge National Laboratory is managed by UT-Battelle, LLC for the US Department of Energy under contract number DE-AC05-00OR22725. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1053575. The research was also aided by the National Natural Science Foundation of China (No. 20903063 to P.Q.), a grant from the Postdoctoral Foundation of Shandong Agricultural University in China (No. 76335 to P.Q.), and the China Scholarship Council (No. 201408370020 to P.Q.). We thank Dr. Martin Karplus for a gift of the CHARMM program.

Supporting Information

Supporting Information Available: A comparison of the energetics for the methyl transfer based on a small model system using the B3LYP/6-31G** and DFTB3 methods. This material is available free of charge via the Internet at http://pubs.acs.org.

REFERENCES

(1) Strahl, B. D.; Allis, C. D. The Language of Covalent Histone Modifications. Nature 2000, 403, 41-45.
(2) Jenuwein, T. The Epigenetic Magic of Histone Lysine Methylation. FEBS J. 2006, 273, 3121-3135.
(3) Martin, C.; Zhang, Y. The Diverse Functions of Histone Lysine Methylation. Nat. Rev. Mol. Cell Biol. 2005, 6, 838-849.
(4) Del Rizzo, P. A.; Trievel, R. C. Molecular Basis for Substrate Recognition by Lysine Methyltransferases and Demethylases. BBA-Gene Regul. Mech. 2014, 1839, 1404-1415.
(5) Cortopassi, W. A.; Kumar, K.; Duarte, F.; Pimentel, A. S.; Paton, R. S. Mechanisms of Histone Lysine-Modifying Enzymes: A Computational Perspective on the Role of the Protein Environment. J. Mol. Graph. Model. 2016, 67, 69-84.
(6) Mozzetta, C.; Boyarchuk, E.; Pontis, J.; Ait-Si-Ali, S. Sound of Silence: The Properties and Functions of Repressive Lys Methyltransferases. Nat. Rev. Mol. Cell Biol. 2015, 16, 499-513.
(7) Balakrishnan, L.; Milavetz, B. Decoding the Histone H4 Lysine 20 Methylation Mark. Crit. Rev. Biochem. Mol. Biol. 2010, 45, 440-452.
(8) Southall, S. M.; Cronin, N. B.; Wilson, J. R. A Novel Route to Product Specificity in the Suv4-20 Family of Histone H4K20 Methyltransferases. Nucleic Acids Res. 2014, 42, 661-671.
(9) Weirich, S.; Kudithipudi, S.; Jeltsch, A. Specificity of the Suv4-20h1 and Suv4-20h2 Protein Lysine Methyltransferases and Methylation of Novel Substrates. J. Mol. Biol. 2016, 428, 2344-2358.
(10) Wu, H.; Siarheyeva, A.; Zeng, H.; Lam, R.; Dong, A. P.; Wu, X. H.; Li, Y. J.; Schapira, M.; Vedadi, M.; Min, J. Crystal Structures of the Human Histone H4K20 Methyltransferases Suv420h1 and Suv420h2. FEBS Lett. 2013, 587, 3859-3868.
(11) Xiao, B.; Wilson, J. R.; Gamblin, S. J. Set Domains and Histone Methylation. Curr. Opin. Struc. Bio. 2003, 13, 699-705.
(12) Cheng, X.; Collins, R. E.; Zhang, X. Structural and Sequence Motifs of Protein (Histone) Methylation Enzymes. Annu. Rev. Bioph. Biom. 2005, 34, 267-294.
(13) Zhang, X.; Yang, Z.; Khan, S. I.; Horton, J. R.; Tamaru, H.; Selker, E. U.; Cheng, X. Structural Basis for the Product Specificity of Histone Lysine Methyltransferases. Mol. Cell 2003, 12, 177-185.
(14) Collins, R. E.; Tachibana, M.; Tamaru, H.; Smith, K. M.; Jia, D.; Zhang, X.; Selker, E. U.; Shinkai, Y.; Cheng, X. In Vitro and in Vivo Analyses of a Phe/Tyr Switch Controlling Product Specificity of Histone Lysine Methyltransferases. J. Biol. Chem. 2005, 280, 5563-5570.
(15) Couture, J. F.; Collazo, E.; Brunzelle, J. S.; Trievel, R. C. Structural and Functional Analysis of Set8, a Histone H4 Lys-20 Methyltransferase. Gene. Dev. 2005, 19, 1455-1465.
(16) Couture, J. F.; Collazo, E.; Hauk, G.; Trievel, R. C. Structural Basis for the Methylation Site Specificity of Set7/9. Nat. Struct. Mol. Biol. 2006, 13, 140-146.
(17) Couture, J. F.; Dirk, L. M. A.; Brunzelle, J. S.; Houtz, R. L.; Trievel, R. C. Structural Origins for the Product Specificity of Set Domain Protein Methyltransferases. Proc. Natl. Acad. Sci. USA 2008, 105, 20659-20664.
(18) Dillon, S. C.; Zhang, X.; Trievel, R. C.; Cheng, X. The Set-Domain Protein Superfamily: Protein Lysine Methyltransferases. Genome Biol. 2005, 6, 227-236.
(19) Qian, C.; Zhou, M. M. Set Domain Protein Lysine Methyltransferases: Structure, Specificity and Catalysis. Cell. Mol. Life Sci. 2006, 63, 2755-2763.
(20) Xiao, B.; Jing, C.; Wilson, J. R.; Walker, P. A.; Vasisht, N.; Kelly, G.; Howell, S.; Taylor, I. A.; Blackburn, G. M.; Gamblin, S. J. Structure and Catalytic Mechanism of the Human Histone Methyltransferase Set7/9. Nature 2003, 421, 652-656.
(21) Zhang, X.; Tamaru, H.; Khan, S. I.; Horton, J. R.; Keefe, L. J.; Selker, E. U.; Cheng, X. Structure of the Neurospora Set Domain Protein DIM-5, a Histone H3 Lysine Methyltransferase. Cell 2002, 111, 117-127.
(22) Chu, Y. Z.; Guo, H. QM/MM MD and Free Energy Simulation Study of Methyl Transfer Processes Catalyzed by PKMTs and PRMTs. Interdiscip. Sci. 2015, 7, 309-318.
(23) Chu, Y. Z.; Yao, J. Z.; Guo, H. QM/MM MD and Free Energy Simulations of G9a-Like Protein (GLP) and Its Mutants: Understanding the Factors That Determine the Product Specificity. PLoS One 2012, 7.
(24) Chu, Y. Z.; Xu, Q.; Guo, H. Understanding Energetic Origins of Product Specificity of Set8 from QM/MM Free Energy Simulations: What Causes the Stop of Methyl Addition During Histone Lysine Methylation? J. Chem. Theory Comput. 2010, 6, 1380-1389.
(25) Xu, Q.; Chu, Y. Z.; Guo, H. B.; Smith, J. C.; Guo, H. Energy Triplets for Writing Epigenetic Marks: Insights from QM/MM Free-Energy Simulations of Protein Lysine Methyltransferases. Chem.-Eur. J. 2009, 15, 12596-12599.
(26) Guo, H. B.; Guo, H. Mechanism of Histone Methylation Catalyzed by Protein Lysine Methyltransferase Set7/9 and Origin of Product Specificity. Proc. Natl. Acad. Sci. USA 2007, 104, 8797-8802.
(27) Yao, J. Z.; Chu, Y. Z.; An, R.; Guo, H. Understanding Product Specificity of Protein Lysine Methyltransferases from QM/MM Molecular Dynamics and Free Energy Simulations: The Effects of Mutation on Set7/9 Beyond the Tyr/Phe Switch. J. Chem. Inf. Model. 2012, 52, 449-456.
(28) Oda, H.; Okamoto, I.; Murphy, N.; Chu, J.; Price, S. M.; Shen, M. M.; Torres-Padilla, M. E.; Heard, E.; Reinberg, D. Monomethylation of Histone H4-Lysine 20 Is Involved in Chromosome Structure and Stability and Is Essential for Mouse Development. Mol. Cell. Biol. 2009, 29, 2278-2295.
(29) Yang, H. B.; Pesavento, J. J.; Starnes, T. W.; Cryderman, D. E.; Wallrath, L. L.; Kelleher, N. L.; Mizzen, C. A. Preferential Dimethylation of Histone H4 Lysine 20 by Suv4-20. J. Biol. Chem. 2008, 283, 12085-12092.
(30) Kuo, A. J.; Song, J.; Cheung, P.; Ishibe-Murakami, S.; Yamazoe, S.; Chen, J. K.; Patel, D. J.; Gozani, O. The BAH Domain of ORC1 Links H4K20me2 to DNA Replication Licensing and Meier-Gorlin Syndrome. Nature 2012, 484, 115-119.
(31) Schotta, G.; Sengupta, R.; Kubicek, S.; Malin, S.; Kauer, M.; Callen, E.; Celeste, A.; Pagani, M.; Opravil, S.; De La Rosa-Velazquez, I. A.; Espejo, A.; Bedford, M. T.; Nussenzweig, A.; Busslinger, M.; Jenuwein, T. A Chromatin-Wide Transition to H4K20 Monomethylation Impairs Genome Integrity and Programmed DNA Rearrangements in the Mouse. Gene. Dev. 2008, 22, 2048-2061.
(32) Zhang, X.; Bruice, T. C. Enzymatic Mechanism and Product Specificity of Set-Domain Protein Lysine Methyltransferases. Proc. Natl. Acad. Sci. USA 2008, 105, 5728-5732.
(33) Zhang, X.; Bruice, T. C. Catalytic Mechanism and Product Specificity of Rubisco Large Subunit Methyltransferase: QM/MM and MD Investigations. Biochemistry 2007, 46, 5505-5514.
(34) Hu, P.; Wang, S.; Zhang, Y. How Do Set-Domain Protein Lysine Methyltransferases Achieve the Methylation State Specificity? Revisited by Ab Initio QM/MM Molecular Dynamics Simulations. J. Am. Chem. Soc. 2008, 130, 3806-3813.
(35) Hu, P.; Zhang, Y. K. Catalytic Mechanism and Product Specificity of the Histone Lysine Methyltransferase Set7/9: An Ab Initio QM/MM-FE Study with Multiple Initial Structures. J. Am. Chem. Soc. 2006, 128, 1272-1278.
(36) Yue, Y. F.; Chu, Y. Z.; Guo, H. Computational Study of Symmetric Methylation on Histone Arginine Catalyzed by Protein Arginine Methyltransferase PRMT5 through QM/MM MD and Free Energy Simulations. Molecules 2015, 20, 10032-10046.
(37) Yao, J. Z.; Guo, H. B.; Chaiprasongsuk, M.; Zhao, N.; Chen, F.; Yang, X. H.; Guo, H. Substrate-Assisted Catalysis in the Reaction Catalyzed by Salicylic Acid Binding Protein 2 (SABP2), a Potential Mechanism of Substrate Discrimination for Some Promiscuous Enzymes. Biochemistry 2015, 54, 5366-5375.
(38) Qian, P.; Zhao, N.; Chen, F.; Guo, H. Understanding Substrate Specificity of Related Plant Methylesterases (MESs) from Computational Investigations. Chem. J. Chinese U. 2015, 36, 2283-2291.
(39) Yue, Y. F.; Guo, H. Quantum Mechanical/Molecular Mechanical Study of Catalytic Mechanism and Role of Key Residues in Methylation Reactions Catalyzed by Dimethylxanthine Methyltransferase in Caffeine Biosynthesis. J. Chem. Inf. Model. 2014, 54, 593-600.
(40) Chu, Y. Z.; Li, G. H.; Guo, H. QM/MM MD and Free Energy Simulations of the Methylation Reactions Catalyzed by Protein Arginine Methyltransferase PRMT3. Can. J. Chem. 2013, 91, 605-612.
(41) Yao, J. Z.; Xu, Q.; Chen, F.; Guo, H. QM/MM Free Energy Simulations of Salicylic Acid Methyltransferase: Effects of Stabilization of TS-Like Structures on Substrate Specificity. J. Phys. Chem. B 2011, 115, 389-396.
(42) Xu, Q.; Yao, J. Z.; Wlodawer, A.; Guo, H. Clarification of the Mechanism of Acylation Reaction and Origin of Substrate Specificity of the Serine-Carboxyl Peptidase Sedolisin through QM/MM Free Energy Simulations. J. Phys. Chem. B 2011, 115, 2470-2476.
(43) Xu, Q.; Guo, H. B.; Wlodawer, A.; Nakayama, T.; Guo, H. The QM/MM Molecular Dynamics and Free Energy Simulations of the Acylation Reaction Catalyzed by the Serine-Carboxyl Peptidase Kumamolisin-As. Biochemistry 2007, 46, 3784-3792.
(44) Xu, Q.; Guo, H.; Wlodawer, A. The Importance of Dynamics in Substrate-Assisted Catalysis and Specificity. J. Am. Chem. Soc. 2006, 128, 5994-5995.
(45) Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. CHARMM - a Program for Macromolecular Energy, Minimization, and Dynamics Calculations. J. Comput. Chem. 1983, 4, 187-217.
(46) Field, M. J.; Bash, P. A.; Karplus, M. A Combined Quantum-Mechanical and Molecular Mechanical Potential for Molecular-Dynamics Simulations. J. Comput. Chem. 1990, 11, 700-733.
(47) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79, 926-935.
(48) Brooks, C. L.; Brunger, A.; Karplus, M. Active-Site Dynamics in Protein Molecules - a Stochastic Boundary Molecular-Dynamics Approach. Biopolymers 1985, 24, 843-865.
(49) Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.; Suhai, S.; Seifert, G. Self-Consistent-Charge Density-Functional Tight-Binding Method for Simulations of Complex Materials Properties. Phys. Rev. B 1998, 58, 7260-7268.
(50) Cui, Q.; Elstner, M.; Kaxiras, E.; Frauenheim, T.; Karplus, M. A QM/MM Implementation of the Self-Consistent Charge Density Functional Tight Binding (SCC-DFTB) Method. J. Phys. Chem. B 2001, 105, 569-585.
(51) Christensen, A. S.; Kubar, T.; Cui, Q.; Elstner, M. Semiempirical Quantum Mechanical Methods for Noncovalent Interactions for Chemical and Biochemical Applications. Chem. Rev. 2016, 116, 5301-5337.
(52) Lu, X.; Gaus, M.; Elstner, M.; Cui, Q. Parametrization of DFTB3/3ob for Magnesium and Zinc for Chemical and Biological Applications. J. Phys. Chem. B 2015, 119, 1062-1082.
(53) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.; Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.; Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T. K.; Mattos, C.; Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.; Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe, M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 1998, 102, 3586-3616.
(54) Torrie, G. M.; Valleau, J. P. Monte-Carlo Free-Energy Estimates Using Non-Boltzmann Sampling - Application to Subcritical Lennard-Jones Fluid. Chem. Phys. Lett. 1974, 28, 578-581.
(55) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.; Rosenberg, J. M. The Weighted Histogram Analysis Method for Free-Energy Calculations on Biomolecules. 1. The Method. J. Comput. Chem. 1992, 13, 1011-1021.
(56) Guo, H. B.; Gorin, A.; Guo, H. A Peptide-Linkage Deletion Procedure for Estimate of Energetic Contributions of Individual Peptide Groups in a Complex Environment: Application to Parallel β-Sheets. Interdiscip. Sci.: Comput. Life Sci. 2009, 1, 12-20.
(57) Guo, H. B.; Beahm, R. F.; Guo, H. Stabilization and Destabilization of the Cδ-H···O=C Hydrogen Bonds Involving Proline Residues in Helices. J. Phys. Chem. B 2004, 108, 18065-18072.
(58) Horowitz, S.; Dirk, L. M. A.; Yesselman, J. D.; Nimtz, J. S.; Adhikari, U.; Mehl, R. A.; Scheiner, S.; Houtz, R. L.; Al-Hashimi, H. M.; Trievel, R. C. Conservation and Functional Importance of Carbon-Oxygen Hydrogen Bonding in AdoMet-Dependent Methyltransferases. J. Am. Chem. Soc. 2013, 135, 15536-15548.
(59) Ma, J. C.; Dougherty, D. A. The Cation-Pi Interaction. Chem. Rev. 1997, 97, 1303-1324.
(60) Dougherty, D. A. The Cation-Pi Interaction. Accounts Chem. Res. 2013, 46, 885-893.
(61) Taverna, S. D.; Li, H.; Ruthenburg, A. J.; Allis, C. D.; Patel, D. J. How Chromatin-Binding Modules Interpret Histone Modifications: Lessons from Professional Pocket Pickers. Nat. Struct. Mol. Biol. 2007, 14, 1025-1040.
(62) Beaver, J. E.; Waters, M. L. Molecular Recognition of Lys and Arg Methylation. ACS Chem. Biol. 2016, 11, 643-653.

Figure Captions

Fig. 1 (a) The relative orientation of AdoMet and K20me0 in the reactant complex. θ is defined as the angle between the two vectors r1 and r2. Here, r1 is the direction of the lone pair of electrons on Nζ, and r2 is the vector pointing from CM to Sδ. (b) Comparison of the X-ray structure (Left) with the representative active-site structure (Right) of the di-methylation product from the QM/MM free energy simulations with the relatively large QM region. Suv4-20h2 is shown in sticks, and AdoHcy and lysine are in balls and sticks. For clarity, only three atoms of AdoHcy (C5', CG, and Sδ) and the residues that are close to H4-K20 are shown. The average distances from the simulations are given (in angstroms).

Fig. 2 (a) Representative active-site structure of the reactant complex for mono-methylation containing AdoMet (methyl donor) and H4-K20me0, along with the r(CM···Nζ) and θ distributions obtained from the QM/MM MD simulations. Suv4-20h2 is shown in sticks, and AdoMet and lysine are in balls and sticks. For clarity, only four atoms of AdoMet (C5', CG, Sδ, and CM) and the residues that are close to H4-K20 are shown. The average distances for some important interactions obtained from the simulations are also given (in angstroms). (b) The active-site structure of the reactant complex for di-methylation containing AdoMet and H4-K20me1, along with the r(CM···Nζ) and θ distributions. (c) The active-site structure of the reactant complex for tri-methylation containing AdoMet and H4-K20me2, along with the r(CM···Nζ) and θ distributions.

Fig. 3 Free energy (potential of mean force) profiles for the methylation reactions in Suv4-20h2 as a function of the reaction coordinate [R = r(CM···Sδ) - r(CM···Nζ)] based on the H4-K20me0, H4-K20me1 and H4-K20me2 substrates, respectively. (a) The free energy profiles for the models with the relatively small QM region (see Methods). Mono-methylation of H4-K20me0: blue line with a free energy barrier of 23.9 kcal/mol. Di-methylation of H4-K20me1: red line with a free energy barrier of 17.9 kcal/mol. Tri-methylation of H4-K20me2: orange line with a free energy barrier of 23.0 kcal/mol. (b) The free energy profiles for the models with the relatively large QM region. Mono-methylation of H4-K20me0: blue line with a free energy barrier of 23.9 kcal/mol. Di-methylation of H4-K20me1: orange line with a free energy barrier of 17.0 kcal/mol.

Fig. 4 (a) Representative active-site structures near the transition state for mono-methylation obtained from the free energy simulations. Left: the structure based on the model with the relatively small QM region. Right: the structure based on the model with the relatively large QM region. (b) Representative active-site structures near the transition state for di-methylation. Left: the structure based on the model with the relatively small QM region. Right: the structure based on the model with the relatively large QM region. (c) Representative active-site structure near the transition state for tri-methylation based on the model with the relatively small QM region.
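The two geometric quantities defined in the captions, the reaction coordinate R = r(CM···Sδ) - r(CM···Nζ) and the angle θ between r1 and r2, can be evaluated from Cartesian coordinates as in the minimal sketch below. The function names and the example coordinates are hypothetical, and the lone-pair direction r1 is simply passed in as a vector, since how it is constructed from the Nζ substituents is not specified here.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def distance(a, b):
    """Distance between two points (same units as the input coordinates)."""
    return _norm(_sub(a, b))


def reaction_coordinate(cm, s_delta, n_zeta):
    """R = r(CM...S_delta) - r(CM...N_zeta)."""
    return distance(cm, s_delta) - distance(cm, n_zeta)


def attack_angle(lone_pair_dir, cm, s_delta):
    """Angle theta (degrees) between r1 (lone-pair direction on N_zeta,
    supplied by the caller) and r2 (vector pointing from CM to S_delta)."""
    r2 = _sub(s_delta, cm)
    dot = sum(x * y for x, y in zip(lone_pair_dir, r2))
    cos_theta = dot / (_norm(lone_pair_dir) * _norm(r2))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against round-off
    return math.degrees(math.acos(cos_theta))


# Made-up collinear geometry (angstroms), for illustration only.
s_delta = (0.0, 0.0, 0.0)
cm = (1.8, 0.0, 0.0)
n_zeta = (4.8, 0.0, 0.0)
lone_pair = (-1.0, 0.0, 0.0)  # assumed to point from N_zeta toward CM

print(f"R = {reaction_coordinate(cm, s_delta, n_zeta):.2f} A")
print(f"theta = {attack_angle(lone_pair, cm, s_delta):.1f} deg")
```

In this convention, negative values of R correspond to the reactant side (the methyl group is still closer to Sδ than to Nζ), and small θ values indicate that the lone pair on Nζ is well aligned with the CM-Sδ bond.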

Table 1 Average hydrogen bonding distances, r(O···CM) (Å), between the carbon atom of the transferable methyl group and oxygen atoms of different residues. For each methyl transfer, the three columns give the values near the reactant, near the TS, and near the product.

Hydrogen bond type    Mono-methylation             Di-methylation               Tri-methylation
                      reactant   TS     product    reactant   TS     product    reactant   TS     product
Ile181 C=O···CM       4.11       4.17   4.24       4.38       4.15   4.42       4.20       3.95   4.66
Tyr217 O···CM         3.36       3.50   3.69       3.45       3.34   3.69       3.45       3.44   4.76
Phe160 C=O···CM       3.46       3.42   3.54       3.79       3.49   3.49       3.45       3.32   3.16
Ser161 O···CM         4.24       3.86   3.67       4.54       4.34   3.66       4.22       5.31   3.79
Ala179 C=O···CM       3.22       3.44   3.71       3.35       3.56   3.53       3.22       3.38   3.27
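The comparison drawn from Table 1 in the discussion (for di-methylation, four of the five O···CM distances shorten between the reactant and the transition state, with Ala179 as the exception) can be checked with a short script. The sketch below transcribes the Table 1 values by hand; the dictionary layout and the function name are illustrative choices rather than part of the original analysis.

```python
# Average O...CM distances (angstroms) transcribed from Table 1.
# Each tuple is (near reactant, near TS, near product).
TABLE1 = {
    "Ile181 C=O": {"mono": (4.11, 4.17, 4.24), "di": (4.38, 4.15, 4.42), "tri": (4.20, 3.95, 4.66)},
    "Tyr217 O":   {"mono": (3.36, 3.50, 3.69), "di": (3.45, 3.34, 3.69), "tri": (3.45, 3.44, 4.76)},
    "Phe160 C=O": {"mono": (3.46, 3.42, 3.54), "di": (3.79, 3.49, 3.49), "tri": (3.45, 3.32, 3.16)},
    "Ser161 O":   {"mono": (4.24, 3.86, 3.67), "di": (4.54, 4.34, 3.66), "tri": (4.22, 5.31, 3.79)},
    "Ala179 C=O": {"mono": (3.22, 3.44, 3.71), "di": (3.35, 3.56, 3.53), "tri": (3.22, 3.38, 3.27)},
}


def shortened_contacts(table, state):
    """Return the contacts whose O...CM distance is shorter near the TS
    than near the reactant for the given methylation state."""
    return [name for name, cols in table.items() if cols[state][1] < cols[state][0]]


for state in ("mono", "di", "tri"):
    names = shortened_contacts(TABLE1, state)
    print(f"{state}: {len(names)} of {len(TABLE1)} contacts shorten ->", ", ".join(names))
```

A simple count of shortened contacts ignores the size of each change, so it should be read only as a quick consistency check on the qualitative statement in the text.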
