
Article

Binding Analysis of Methyl-CpG Binding Domain of MeCP2 and Rett Syndrome Mutations Ye Yang, Tugba G Kucukkal, Jing Li, Emil Alexov, and Weiguo Cao ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.6b00450 • Publication Date (Web): 29 Jun 2016 Downloaded from http://pubs.acs.org on June 29, 2016


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


ACS Chemical Biology

Binding Analysis of Methyl-CpG Binding Domain of MeCP2 and Rett Syndrome Mutations

Ye Yang1,#, Tugba G. Kucukkal2,#, Jing Li1, Emil Alexov2,* and Weiguo Cao1,*

1 Department of Genetics and Biochemistry, Clemson University, Room 049 Life Sciences Facility, 190 Collings Street, Clemson, SC 29634, USA

2 Department of Physics, Clemson University, 118 Kinard Laboratory, Clemson, SC 29634, USA

# Contributed equally

* Corresponding Author: email: [email protected]; Tel.: (864) 656-4176; email: [email protected]; Tel.: 864-908-4796

Running Title: Binding analysis of MeCP2-MBD


Abstract

Methyl-CpG binding protein 2 (MeCP2) binds to methylated cytosine in CpG islands through its methyl-CpG binding domain (MBD). Here, the effects of Rett syndrome-causing missense mutations on the binding affinity of the MBD for cytosine (C), methylcytosine (mC), hydroxymethylcytosine (hmC), formylcytosine (fC) and carboxylcytosine (caC) in the CpG dinucleotide are investigated. MeCP2-MBD binds mC-containing variants of double-stranded CpG more strongly than any other cytosine-modified CpG, with the strongest affinity for mC/mC. Thirteen MBD missense mutations show reduced binding affinity for mC/mC, ranging from a 2-fold decrease for T158M to an 88-fold decrease for R111G. The binding affinities of these mutants for C/C are also reduced to various degrees, except for T158M. Consistent with free energy perturbation analysis, correlation of binding affinity with protein unfolding allows the mutations to be grouped into three clusters. The first cluster includes mutations that have a higher tendency to unfold and a lower affinity for mC/mC and C/C. Mutations in the second cluster have similar structural stability but varied affinity for mC/mC and C/C. R111G and A140V belong to the third cluster, in which the loss of protein flexibility may underlie the reduction in binding affinity for mC/mC and C/C. Most notably, R111 emerges as the key structural element that modulates the specific contacts with mCpG. Implications of the results for the mCpG binding mechanism of MeCP2-MBD are discussed. These analyses provide new insights into the structure-function relationships in MeCP2-MBD and offer new clues to their roles in the pathology of Rett syndrome.


Introduction

Methylation at the C5 position of cytosine (C) in the CpG islands located in promoter regions is a major cellular mechanism for regulating gene transcription 1, either by directly affecting the binding of transcription factors 2, 3 or through mediation by the methyl-CpG-binding protein (MBP) family 4, 5. Methyl-C (mC) can be oxidized by ten-eleven translocation (Tet) proteins to hydroxymethyl-C (hmC), formyl-C (fC) and carboxyl-C (caC) 6. Among them, fC and caC can be excised by thymine DNA glycosylase (TDG) and replaced by cytosine through the conventional base excision repair (BER) pathway 6, 7. It is recognized that not only mC and hmC, but also fC and caC, may serve as epigenetic marks 8.

In humans, five members have been identified in the MBP family, each containing a methyl-CpG binding domain (MBD) that specifically binds to a symmetric methyl-CpG (mCpG) dinucleotide pair 9 (Figure 1a). The founding member of the MBP family is methyl-CpG binding protein 2 (MeCP2), which was initially reported to interact selectively with methyl-CpG 10. MeCP2 has four distinctive domains: the methyl-CpG binding domain, the transcriptional repression domain (TRD) 11, the NCOR-SMRT interaction domain (NID) 12, and three AT-hook motifs 10 (Figure 1a). The TRD and NID are involved in the recruitment of gene repressors, for instance the deacetylase HDAC3 12. The three AT-hook motifs can compete with histone H1 for compacting chromatin after being redirected by the MBD to specific DNA regions 13. Thus, MeCP2 may serve as a mediator between chromatin remodeling and the two epigenetic gene repression mechanisms, DNA methylation and histone deacetylation (reviewed in 14).

MeCP2 was found to be highly expressed in neural cells during development and also in adult neurons 15, 16. Most cases of Rett syndrome (RTT) are caused by mutations in MeCP2, which are located both within and outside the MBD region 17, 18. This X-linked discrete neurologic disorder occurs almost exclusively in girls, with a frequency of 1 in 10,000 live female births and clinical symptoms including growth arrest, speech loss and other autistic features 19, 20. MeCP2 mutations in males often cause death in infancy or severe neonatal encephalopathy with RTT-like symptoms. Furthermore, more than 95% of RTT cases result from de novo mutations in the MECP2 gene, which are not present in the patients' families 18. The close relevance of MeCP2 to neurologic disorder and the nonhereditary nature of RTT provide a good model for understanding the mechanism by which a single MeCP2 mutation survives developmental selection and impacts epigenetic gene regulation.

About 42% of RTT cases are identified with frame-shift or truncation mutations, most of which are located downstream of the MBD region. Another 46% of RTT cases are caused by missense mutations, about half of which are found in the MBD region (residues 77-167) 18. Given the critical role of the initial binding of MeCP2 to 5mC-containing DNA, the binding behavior of the MBD region of MeCP2 toward 5mC has been widely investigated 21, 22. Five RTT mutants in the MBD were found to be dysfunctional in mCpG binding 23. Later, structural studies of the MBD-DNA complex revealed that most of the high-frequency RTT mutations (R106W, F155S and T158M) are not involved in direct interaction with mCpG bases 24 but cause changes in secondary structure and thermal stability 25. The unfolding and structural dynamics of some frequent RTT mutations have been investigated previously 26, 27.

With the recognition of the complexity of mC oxidation and demethylation, the ability of MeCP2 and its clinical mutants to bind to the various oxidative states of mC has become an important subject of investigation. However, the outcome of previous binding analyses on mC and hmC is controversial, and the information on fC and caC is limited. While three studies show that MeCP2 has a stronger binding affinity for mC, a recent study indicates that MeCP2 has similar affinity for mC and hmC 21, 22, 28, 29. One of these studies, a proteomics analysis, also shows that MeCP2 binds to fC and caC with similar affinity, which is less than 50% of the affinity for hmC, suggesting that MBD family proteins may be involved in gene regulation by interacting with different oxidative derivatives of mC 29. However, quantitative data on the binding affinity of MeCP2 for fC and caC are not available.

In this study, we measured the binding affinity of MeCP2-MBD for C, mC, hmC, fC and caC and determined the Kd values. The MBD region of wild-type (wt) MeCP2 showed stronger affinity for mC than for its oxidative derivatives. Circular dichroism analysis of the wt and 13 clinical mutants indicated small changes in secondary structure content. In comparison with the wt MeCP2-MBD, the clinical mutants showed variable but profound weakening in binding affinity for mC. Correlation analysis between binding affinity and protein stability categorized the clinical mutations into three classes. The correlation between protein function, stability, and pathogenicity is also discussed.


Results and Discussion

Binding of wt MeCP2-MBD to CpG, mCpG and oxidized mCpG derivatives. To study the binding affinity of MeCP2-MBD for CpG, mCpG and oxidized mCpG derivatives, we designed an oligodeoxynucleotide substrate based on the human BDNF promoter sequence, which is a target of MeCP2, as described in Methods. The target sequence contained a CpG dinucleotide in which the cytosine was substituted by mC, hmC, fC or caC. We initially measured the binding affinity of the wt MeCP2-MBD using gel mobility shift analysis. Representative gel images and quantitative data are shown in Figure 2, and the complete Kd values are presented as a column chart in Figure 3. The binding affinities of MeCP2-MBD followed the order mC > fC, caC, hmC > C. This work determines, for the first time, the binding affinity of MeCP2-MBD for all five forms of cytosine. It is clear that MeCP2-MBD has the strongest affinity for mC, which is several-fold greater than that for fC, caC and hmC and 10-fold greater than that for unmodified C (Figure 3). Regardless of whether the strand opposite the mC strand contains hmC, fC or caC, the Kd values are comparable to that of C/mC, indicating that binding to mC dominates the Kd values in the hemi-methylated CpG island (Figure 3). Furthermore, mC/mC DNA has the lowest Kd value, which is 63-fold smaller than that of C/C DNA, suggesting that MeCP2-MBD has adapted to binding the fully methylated CpG island. Within the narrow range of Kd values, fC DNA has a slightly stronger affinity than caC and hmC DNA, but all Kd values are 2- to 3-fold lower than the Kd for C/C DNA. These results suggest that MeCP2-MBD has evolved some ability to distinguish oxidized mCpG from unmodified C/C. The data presented are consistent with previous studies indicating that MeCP2-MBD binds preferentially to mC rather than hmC 21, 22, 29, but are not in keeping with the previous study showing that MeCP2 has similar affinity for mC and hmC 28. Another quantitative finding from this study is that MeCP2-MBD has similar affinity for the three oxidized mC DNAs (Figure 3), which is not consistent with the previous study showing that fC and caC have less than 50% of the affinity of hmC 29.

Binding of mutant MeCP2-MBD to CpG and mCpG. Rett syndrome mutations are found across the MBD region (Figure 1b). To determine how clinical mutations affect the binding affinity for C and mC, we measured the Kd values of 13 mutant MeCP2-MBD proteins for C/C and mC/mC DNAs. The gel mobility shift images are shown in Figure 4, Figure S2 and Figure S3, and the complete Kd values are presented in Table 1. The conserved L100V mutation is located in a loop region that is distal to the protein-DNA interface. The Kd value for mC/mC was modestly increased to 21.6 nM, and that for C/C was slightly increased to 568 nM. R106, located in the first β strand, forms hydrogen bonds with the mainchain carboxyl oxygens of T158 and V159 (Figure 1b and Figure 5a). T158 is part of the Asx-ST motif that interacts with the phosphate backbone. R106W increased the Kd for mC/mC over 12-fold to 76.5 nM but only slightly increased the Kd for C/C (Table 1). On the other hand, R106Q increased the Kd for mC/mC over 24-fold to 153 nM and increased the Kd for C/C over 3-fold to 1389 nM (Table 1). R111 interacts with the guanine of the mCpG island through hydrogen bonds with O6 and N7 (Figure 5b). The R111G mutation increased the Kd for mC/mC 88-fold to 554.6 nM and the Kd for C/C 3-fold to 1210 nM (Table 1). R133 is another residue that interacts with a guanine of the mCpG island, on the strand opposite to the one with which R111 interacts (Figure 1d and Figure 5b). As the second most frequent clinical mutation in the MBD region, R133C increased the Kd for mC/mC over 15-fold to 99.8 nM and increased the Kd for C/C 3-fold. The R133H mutation increased the Kd for mC/mC over 12-fold to 77.9 nM and the Kd for C/C less than 2-fold (Table 1). The adjacent mutation S134C increased the Kd for mC/mC over 4-fold and the Kd for C/C less than 2-fold. The P152R and F155S mutations, located in a loop region, increased the Kd for mC/mC over 4- and 3-fold, respectively, with an over 2-fold increase in the Kd values for C/C (Table 1). The Asx-ST motif, which interacts with the phosphate backbone, comprises residues 156 to 161 (DFTVTG) (Figure 5a). D156E increased the Kd for mC/mC over 7-fold to 47.9 nM and only slightly increased the Kd for C/C (Table 1). As the most frequent clinical mutation in the MBD region, T158M increased the Kd for mC/mC only a little over 2-fold while essentially maintaining the same affinity for C/C (Table 1). On the other hand, the T158A mutation increased the Kd for mC/mC over 4-fold and the Kd for C/C over 3-fold (Table 1). To assess whether these clinical mutations caused any changes in secondary structure, we measured the secondary structure content using far-UV circular dichroism. Small changes in the contents of α helix, β strands or turns were observed, but none of the mutations caused significant changes (Figure S1 and Table S1).

Stability, binding affinity and clustering of RTT mutants. In a previous comprehensive study, we determined the stabilities of the wt MeCP2-MBD and 13 MBD mutants by urea-induced unfolding monitored by circular dichroism 26. The 13 mutants can be categorized into three clusters based on their folding stabilities. To correlate protein stability with DNA binding affinity for mC and C, the free energy differences between each mutant and the wt MeCP2-MBD are expressed as ∆∆GmC and ∆∆GC, respectively (Table 1 and Figure 6). The free energy differences between binding to mC and C are expressed as ∆∆GmC/C (Figure 6a, b). Evidently, all mutants reduce the binding affinity for mC more profoundly than that for C (Figure 6c, d). The correlations between binding affinity and stability, along with computational results, are discussed for each cluster below.
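As a rough illustration of how fold changes in Kd translate into free energy differences, the sketch below applies the relation ∆∆G = −RT ln(Kd,wt/Kd,mutant), which equals RT ln of the fold increase in Kd. The temperature (298.15 K), the kcal/mol units and the function names are assumptions made for illustration only, and the inputs are rounded fold changes quoted in the text rather than the entries of Table 1.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3   # gas constant, kcal/(mol*K)
ASSUMED_TEMPERATURE_K = 298.15   # assumption: the text does not state a temperature


def ddg_from_kd(kd_wt, kd_mutant, temperature=ASSUMED_TEMPERATURE_K):
    """Return ddG = -RT ln(Kd_wt / Kd_mutant) in kcal/mol.

    A positive value means the mutant binds more weakly than the wild type.
    """
    return -R_KCAL_PER_MOL_K * temperature * math.log(kd_wt / kd_mutant)


def ddg_from_fold(fold_increase_in_kd, temperature=ASSUMED_TEMPERATURE_K):
    """Same quantity expressed through the fold increase Kd_mutant / Kd_wt."""
    return R_KCAL_PER_MOL_K * temperature * math.log(fold_increase_in_kd)


# Rounded fold increases in Kd for mC/mC quoted in the text (lower bounds where
# the text says "over N-fold"); illustrative inputs, not the Table 1 entries.
approx_fold_mc = {"T158M": 2, "R106W": 12, "R133C": 15, "R106Q": 24, "R111G": 88}

for mutant, fold in approx_fold_mc.items():
    print(f"{mutant}: ddG_mC ~ {ddg_from_fold(fold):.2f} kcal/mol")
```

Under these assumptions, an 88-fold increase in Kd corresponds to roughly 2.7 kcal/mol and a 2-fold increase to roughly 0.4 kcal/mol, which gives a sense of the energetic scale separating R111G from T158M.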


L100V, S134C, P152R and D156E. Mutants in this cluster are the least stable and tend to unfold (Figure 6) 26. L100V, S134C and P152R are not involved directly or indirectly in binding to DNA. Root mean square fluctuation (RMSF) analysis of the Cα atoms of S134C indicates increased flexibility upstream of position 134, which may cause the decreased stability of the mutant (Figure 7a). In addition, S134 makes two intra-protein hydrogen bonds, although relatively infrequently; upon mutation, both hydrogen bonds are lost (Table S5). Simulations also show that the decreased stability of P152R partly results from increased flexibility in the helix, in the loop between the second strand and the helix, and in the vicinity of position 152 (Figures 1c, d and 7b). D156, as part of the Asx motif 30, forms a hydrogen bond with the main chain amino group of T158 through its carboxyl sidechain 24. The mutational effects on binding to mC and C are modest, with that of D156E being the greatest among them (Figures 1c, d and 6b). The modest binding effects are likely due to the instability of the mutants rather than to direct effects on protein-DNA interactions. The greater effect of D156E on binding to mC may be caused by the specific requirement for Asp or Asn in the Asx motif, which provides structural support for the phosphate backbone interaction mediated by the adjacent ST turn 24. A more detailed analysis of hydrogen bonds in our simulations shows that although D156E maintains the same number of hydrogen bonds within the protein (Table S5), their occupancies are significantly lower, indicating a destabilizing effect on the Asx motif. In agreement with the experimental data, all four mutations cause a decreased number of intra-protein hydrogen bonds and a significant increase in the solvent accessibility of the protein (Tables S2 and S5), which together suggest a significant loss of stability for these mutants. Note that none of the mutations in this cluster causes a significant change in salt bridges (Table S6). The simulations also provide measurable evidence for the modest decrease in binding. First, these four mutations cause an increase in the mC8-R111 and mC33-R133 center-of-mass distances, with the largest effect for S134C (Table S5). The interaction energy of these four mutants with the DNA is also reduced, again with the largest effect for S134C (Table S4). The binding free energies for mC obtained from free energy perturbation (FEP) simulations agree well with the experimental results (Figure S5).

R106Q, R106W, R133H, R133C, F155S, T158A and T158M. The second cluster of mutants has folding stabilities similar to that of the wt MeCP2-MBD but a wide range of binding affinities (Figure 6). Simulations indicate that these mutations also cause a decreased number of intra-protein hydrogen bonds; however, the effect is not as pronounced as for the mutations in the first cluster (Table S6). Also, all mutants in this cluster show increased solvent accessible area (Table S2), but as shown in Table S5, salt bridges are not affected significantly for most of the mutations in this cluster, which is consistent with the modest change in stability. Except for F155S, all mutants in this cluster are either directly or indirectly involved in binding to DNA. R133, located in a loop region, inserts itself between the C and the G of the CpG island through its guanidyl moiety to form two hydrogen bonds with the O6 and N7 groups of the guanine 24. Unsurprisingly, R133C is the second most frequent RTT mutation. T158 is part of the ST motif, which forms one sidechain-to-mainchain and one mainchain-to-mainchain hydrogen bond with residues at the i+3 and i+4 positions 30. In MeCP2-MBD, the ST motif is further stabilized by two hydrogen bonds from the sidechain of R106 to the mainchains of T158 and V159 24. V159 forms a mainchain hydrogen bond with the phosphate backbone 24. The structural arrangement is unusual in that it has the Asx motif and ST motif organized in tandem, with R106, located in a neighboring β strand, providing additional support to T158 and V159. In such a well-organized structure, the R106W, R106Q, T158A and T158M mutations are all RTT-causing. Mutation of R106 results in the loss of two sidechain hydrogen bonds. We speculate that the amide sidechain in R106Q may be repulsive toward either T158 or V159. The effects of these mutations are likely limited to the local environment, based on the CD and urea-induced unfolding analyses (Figure S1 and Table S1) 26. However, T158A is slightly less stable than the rest of the mutants in this cluster. Due to the loss of sidechain hydrogen bonding, T158A may negatively affect the V159-phosphate backbone interaction, which results in the largest reduction in binding affinity for C/C DNA (Table 1). T158M, the most frequent RTT mutation, is unusual in that it has the least reduction in binding to mC/mC, and its binding affinity for C/C is essentially the same as that of the wt MeCP2-MBD (Table 1). Our results are consistent with a previous study that showed a 2-fold decrease in binding to mC/mC in Xenopus MeCP2 31. T158M may well exert its effects through other yet-to-be-discovered mechanisms, which could contribute to its being the most frequent mutation in RTT.

R111G and A140V. R111G and A140V represent mutations that enhance folding stability while causing a substantial reduction in DNA binding. A140 is located in the middle of the sole α helix (Figure 1d). Although this helix is not directly involved in interaction with DNA, the A140V mutation may affect interactions with neighboring secondary structural elements or transmit its effects distally through dynamic motions. R111G has a profound effect on binding to mC/mC and C/C (Table 1). Quantitatively, the glycine substitution reduces the binding affinity for mC/mC 88-fold, the largest reduction among all RTT mutations investigated. The decrease in binding affinity for C/C is also among the largest (Table 1). These properties make R111G stand out in the various plots presented in Figure 6b-e. Consistent with this, simulations indicate the most drastic changes in hydrogen bonding with DNA and in interaction energy for this mutation (Tables S4 and S5). R111G also causes the most pronounced increase in the center-of-mass distances for both mC8-R111 and mC33-R133 (Table S5), which also supports the observed reduction in binding affinity (Table 1).

Besides the profound effect of R111G on binding, another salient binding property is revealed in our gel mobility analysis. Although both R111G and R133C reduce binding affinity for mC/mC, the patterns appear to be different (Figure 4). R133C maintains a weaker but still distinct bound band with mC/mC (Figure 4). Other mutants show a similarly distinct binding pattern with mC/mC (Figure S2). On the other hand, R111G not only substantially increases the Kd but also shows a loss of any distinct bound band with mC/mC at all concentrations (Figure 4). Based on the MBD-mC/mC crystal structure, R111 and R133 are responsible for interacting with the two guanine bases of the CpG island on the two strands 24. The lack of a distinct binding pattern in R111G indicates that R111 is essential not only for its own interaction with one guanine but also for the interaction of R133 with the guanine on the opposite strand. Because R111G is more stable, i.e., less flexible, its more rigid structure may severely interfere with interactions with DNA both locally and globally. The peculiarity of R111G is also seen in simulations. R111G causes a substantial increase in both the mC8-R111 and mC33-R133 distances, such that the increase in the latter caused by R111G is nearly the same as that caused by R133C (Table S5). Therefore, the R111G substitution affects the interaction not only in the vicinity of position 111 but also in the vicinity of position 133, to the same degree as R133C does. Also, hydrogen bonding with DNA is affected more profoundly by R111G than by R133C, although both cause a significant reduction in hydrogen bonding (Table S5). Thus, R111 plays a key role in mediating the interaction of MeCP2-MBD with CpG and mCpG.

So far, the R111G mutation has been reported in only one patient, found in Germany 32. The rarity of the R111G mutation may correlate with its severe defects in DNA binding. The patient with the R111G mutation may have other compensatory mutations/genes that partially rescue the functional defects of this mutation. This phenomenon may hint at a developmental selective force that removes severe and possibly lethal mutations from human populations.

This study demonstrates that the MeCP2-MBD has the highest binding affinity for mCpG pairs. Thirteen Rett syndrome mutations cause a reduction in binding affinity for mCpG but various changes in stability. The correlation between binding affinity and stability implies various molecular mechanisms underlying clinical symptoms. The inferred developmental selection due to the mutational defect raises the new possibility that MeCP2 may play an important role not only in individual neural development but also in early embryo development.


Methods

MeCP2-MBD and substrates preparation. The human MeCP2 and 13 mutant MBD proteins were prepared as reported in our previous work (Figure 1b, c and d) 26. The sequence of the oligodeoxynucleotide substrate was derived from the promoter region of brain-derived neurotrophic factor (BDNF), an MBD-specific binding target (top strand: 5’-FAM-GCCCTGGAANGGAACTCTTCTGGCC-3’ and bottom strand: 5’-GGCCAGAAGAGTTCMGTTCCAGGGC-3’, where M and N indicate replacement with the epigenetically important DNA bases cytosine, 5-methylcytosine, 5-hydroxymethylcytosine, 5-formylcytosine or 5-carboxylcytosine). To yield all possible combinations of modified CpG pairs, the two complementary strands, with the unlabeled strand in 1.5-fold excess (15 µM), were mixed, incubated at 90°C for 3 min, and allowed to form duplex DNA substrates at room temperature for more than 30 min. The no-CpG non-specific binding trap DNA was prepared by annealing two complementary strands (top strand: 5’-GATCCTGGTATGAACTCTTCTGACC-3’ and bottom strand: 5’-GGTCAGAAGAGTTCATACCAGGATC-3’) at a 1:1 ratio.

Gel mobility shift assay. The DNA binding affinity of wild-type and mutant MeCP2-MBD proteins was measured according to a previously reported method with modifications 21, 22. The concentration of free DNA is given as follows:

$$
\mathrm{DNA}_{\mathrm{free}} = \mathrm{DNA}_{\mathrm{total}} - \frac{\left(K_d + \mathrm{MBD}_{\mathrm{total}} + \mathrm{DNA}_{\mathrm{total}}\right) - \sqrt{\left(K_d + \mathrm{MBD}_{\mathrm{total}} + \mathrm{DNA}_{\mathrm{total}}\right)^{2} - 4\,\mathrm{MBD}_{\mathrm{total}}\,\mathrm{DNA}_{\mathrm{total}}}}{2} \tag{1}
$$

where MBDtotal is the initial total MBD concentration and DNAtotal is the initial concentration of DNA. The dissociation constant (Kd) of MeCP2 for DNA was determined by nonlinear curve fitting with Equation 1.
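The fitting step can be sketched as follows. This is a minimal illustration, not the authors' fitting procedure: it assumes 1:1 binding, uses the standard mass-action (quadratic) solution for free DNA, and replaces a general nonlinear least-squares routine with a golden-section search over log10(Kd). The function names, the DNA concentration (1 nM), the titration points and the "true" Kd used to generate synthetic data are all hypothetical.

```python
import math


def free_dna(mbd_total, dna_total, kd):
    """Free DNA concentration for 1:1 binding (all concentrations in nM)."""
    s = kd + mbd_total + dna_total
    bound = (s - math.sqrt(s * s - 4.0 * mbd_total * dna_total)) / 2.0
    return dna_total - bound


def sum_sq_error(kd, titration, dna_total):
    """Sum of squared residuals between predicted and observed free DNA."""
    return sum((free_dna(m, dna_total, kd) - obs) ** 2 for m, obs in titration)


def fit_kd(titration, dna_total, kd_min=1e-3, kd_max=1e5, iterations=200):
    """Least-squares Kd by golden-section search on log10(Kd).

    Assumes the error has a single minimum between kd_min and kd_max.
    """
    a, b = math.log10(kd_min), math.log10(kd_max)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sum_sq_error(10.0 ** c, titration, dna_total) < sum_sq_error(
            10.0 ** d, titration, dna_total
        ):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free titration generated from a hypothetical Kd of 6.3 nM.
DNA_TOTAL_NM = 1.0  # assumed labeled-DNA concentration
TRUE_KD_NM = 6.3    # hypothetical value used only to build the example data
mbd_points = [0.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0, 200.0]
titration = [(m, free_dna(m, DNA_TOTAL_NM, TRUE_KD_NM)) for m in mbd_points]

fitted_kd = fit_kd(titration, DNA_TOTAL_NM)
print(f"fitted Kd = {fitted_kd:.3f} nM")
```

With real gel data, the observed free DNA at each MBD concentration would come from band quantification, and a dedicated nonlinear regression package would normally also report confidence intervals for Kd.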


The differences in free energy change for mutant binding to mC (∆∆GmC) and C (∆∆GC) were calculated with Equations 2 and 3, respectively:

$$
\Delta\Delta G_{mC} = -RT \ln\left(\frac{K_{d,mC,wt}}{K_{d,mC,mutant}}\right) \tag{2}
$$

$$
\Delta\Delta G_{C} = -RT \ln\left(\frac{K_{d,C,wt}}{K_{d,C,mutant}}\right) \tag{3}
$$

where Kd,mC,wt is the dissociation constant of MBD-WT binding to symmetric mCpG and Kd,mC,mutant is the dissociation constant of the MBD mutant binding to symmetric mCpG; Kd,C,wt and Kd,C,mutant are defined analogously for binding to unmodified CpG. Positive ∆∆GmC and ∆∆GC values indicate decreased binding affinity of the mutants in comparison with wild-type MBD, and vice versa.

The differences in apparent free energy change between mC and C (∆∆GmC/C) for MBD-WT and MBD mutants were calculated with Equation 4:

$$
\Delta\Delta G_{mC/C} = -RT \ln\left(\frac{K_{d,mC}}{K_{d,C}}\right) \tag{4}
$$

where a higher ∆∆GmC/C value indicates higher binding specificity of the MBD variant for mC/mC in comparison with C/C in the presence of 20 ng µl-1 non-specific binding trap.

Circular dichroism spectrum and secondary structure estimation. Circular dichroism (CD) analysis of MeCP2-MBD proteins was performed as reported previously 26. The secondary structure of each MBD protein was estimated by applying the calculated mean residue ellipticity to CONTINLL as supported by DICHROWEB 33-35. Reference set 7 on DICHROWEB, which includes 43 native and 5 denatured structures, was used to obtain more reliable results 36.

Conformational sampling. The X-ray crystal structure of MeCP2 MBD bound to mC/mC DNA 24 was used as the initial structure. First, the MBD was extracted and the 13 mutant structures were modeled using the VMD software 37. Then, the wild-type and mutant MBD structures (in the absence of DNA) were subjected to fast backbone sampling to generate 20 conformations of each using the CONCOORD program 38. CONCOORD first identifies the interatomic interactions (such as covalent or nonbonded) in the starting structure and classifies them according to their strength. Distance restraints are generated according to the strength of each interaction. Then, the coordinates are perturbed and all distances are corrected iteratively until the distance restraints are satisfied. After that, each structure was joined with the DNA structure from the original PDB entry (ID: 3C2I), followed by a 10 ps relaxation through Generalized Born Implicit Solvent (GBIS) simulations as implemented in NAMD, in which the top four and bottom five nucleotide pairs of the DNA were kept fixed 39, 40. The short GBIS relaxation also ensured that there was no overlap between MBD and DNA atoms or any other immediate structural defects in the systems. The resulting structures (21 including the original initial structure) were analyzed for the hydrogen bonding network within the protein and with DNA, salt bridges, solvent accessible surface area, radii of gyration in the vicinity of mutation positions, interaction energy between DNA and MBD, root mean square deviation (RMSD) of the MBD and root mean square fluctuation (RMSF) of the Cα atoms of each residue in the MBD.

Analyses of conformational sampling. All analyses were performed using the VMD program. All protein RMSDs for the backbone atoms were found to be within about 2.5 Å when the structured parts (strands and helix) were aligned prior to the RMSD calculation. RMSF calculations were performed for the Cα atoms of the MBD. The distance cutoff for salt bridges was set to 4 Å. A hydrogen bond was considered to be formed when the distance between the donor and acceptor was less than 3.0 Å and the angle formed by the donor, hydrogen and acceptor was less than 20 degrees.
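The geometric criteria for hydrogen bonds and salt bridges can be sketched in a few lines. The example assumes that the 20 degree cutoff refers to the deviation of the donor-hydrogen-acceptor angle from linearity (180 degrees), a convention commonly used by molecular analysis tools; this interpretation, the coordinates and the function names are assumptions for illustration and are not the analysis scripts used in this work.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points given as (x, y, z) in Å."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(y * y for y in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.0, max_angle=20.0):
    """Donor-acceptor distance below max_dist and D-H...A within max_angle
    of linear (assumed meaning of the angle cutoff)."""
    if distance(donor, acceptor) >= max_dist:
        return False
    deviation_from_linear = 180.0 - angle_deg(donor, hydrogen, acceptor)
    return deviation_from_linear < max_angle


def is_salt_bridge(charged_atom_a, charged_atom_b, cutoff=4.0):
    """Oppositely charged atoms closer than the cutoff (4 Å in the text)."""
    return distance(charged_atom_a, charged_atom_b) < cutoff


# Hypothetical coordinates: a linear, short contact and a strongly bent one.
linear = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0))
bent = is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.5, 1.5, 0.0))
print(linear, bent)
```

In a trajectory analysis, such a test would be applied to every donor-acceptor pair in each of the 21 structures, and the fraction of structures in which a given bond is present gives its occupancy.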

Binding free energy calculations through free energy perturbation (FEP). Binding free energy between the MBD (wild-type and 13 mutants) and mC/mC DNA was calculated through 60 ns (30 ns forward and 30 ns backward) FEP simulations utilizing NAMD.41 Langevin dynamics with periodic boundary conditions was performed in an NPT (constant pressure, constant temperature) ensemble using the CHARMM22 force field42 with CMAP corrections.43,44 Van der Waals (VDW) interactions were truncated at a 10 Å cutoff, with a switching function applied between 8 and 10 Å. Long-range electrostatic interactions were treated with the particle mesh Ewald (PME) method.45 Each dual topology structure was subjected to 700 steps of minimization and 100 ps of equilibration. After that, the alchemical transformation was carried out in a total of 14 intermediate steps, with 30 ns of sampling in each of the forward and backward directions. The overall free energy change was obtained from the combined forward and backward calculations using the Bennett acceptance ratio (BAR) method, as implemented in the parseFEP plugin of VMD.46 More details of our FEP protocol were provided previously.26
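To make the BAR step concrete, the following is a minimal Python sketch of the estimator for a single alchemical window. It is not the parseFEP implementation: the function names, the bisection bracket of ±100 energy units, and the input layout (lists of energy differences) are assumptions made for illustration.

```python
import math

def fermi(x):
    """Fermi function 1 / (1 + exp(x)), written to avoid overflow."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def bar_free_energy(w_forward, w_reverse, beta, lo=-100.0, hi=100.0, tol=1e-10):
    """Solve the BAR self-consistency equation for delta_F by bisection.

    w_forward: U1 - U0 evaluated on samples drawn from state 0
    w_reverse: U0 - U1 evaluated on samples drawn from state 1
    beta:      1 / (k_B * T) in units consistent with the energies
    lo, hi:    assumed bracket for delta_F (hypothetical default)
    """
    m = math.log(len(w_forward) / len(w_reverse))

    def g(df):
        lhs = sum(fermi(m + beta * (w - df)) for w in w_forward)
        rhs = sum(fermi(-m + beta * (w + df)) for w in w_reverse)
        return lhs - rhs  # increases monotonically with df

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid  # root lies below mid
        else:
            lo = mid  # root lies at or above mid
    return 0.5 * (lo + hi)
```

In a stratified protocol such as the one described above, an estimator of this kind would be applied to each intermediate window and the per-window free energies summed to give the overall change.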

Acknowledgements
We thank the Latour laboratory at Clemson University for the use of the CD spectrophotometer and for assistance, supported by a grant from NIGMS of the National Institutes of Health under award number 5P20GM103444. This project was supported in part by Clemson University, NIH GM090141 to W.C., and NIH R01GM093937 to E.A. We also thank members of the Cao and Alexov labs for help and discussions.

References
1. Buck-Koehntop, B. A., and Defossez, P. A. (2013) On how mammalian transcription factors recognize methylated DNA, Epigenetics 8, 131-137.
2. Bell, A. C., and Felsenfeld, G. (2000) Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene, Nature 405, 482-485.
3. Hark, A. T., Schoenherr, C. J., Katz, D. J., Ingram, R. S., Levorse, J. M., and Tilghman, S. M. (2000) CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus, Nature 405, 486-489.
4. Georgel, P. T., Horowitz-Scherer, R. A., Adkins, N., Woodcock, C. L., Wade, P. A., and Hansen, J. C. (2003) Chromatin compaction by human MeCP2. Assembly of novel secondary chromatin structures in the absence of DNA methylation, J. Biol. Chem. 278, 32181-32188.
5. Chen, W. G., Chang, Q., Lin, Y., Meissner, A., West, A. E., Griffith, E. C., Jaenisch, R., and Greenberg, M. E. (2003) Derepression of BDNF transcription involves calcium-dependent phosphorylation of MeCP2, Science 302, 885-889.
6. He, Y. F., Li, B. Z., Li, Z., Liu, P., Wang, Y., Tang, Q., Ding, J., Jia, Y., Chen, Z., Li, L., Sun, Y., Li, X., Dai, Q., Song, C. X., Zhang, K., He, C., and Xu, G. L. (2011) Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA, Science 333, 1303-1307.
7. Maiti, A., and Drohat, A. C. (2011) Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites, J. Biol. Chem. 286, 35334-35338.
8. Wu, H., and Zhang, Y. (2015) Charting oxidized methylcytosines at base resolution, Nat. Struct. Mol. Biol. 22, 656-661.

9. Adkins, N. L., and Georgel, P. T. (2011) MeCP2: structure and function, Biochem. Cell Biol. 89, 1-11.
10. Lewis, J. D., Meehan, R. R., Henzel, W. J., Maurer-Fogy, I., Jeppesen, P., Klein, F., and Bird, A. (1992) Purification, sequence, and cellular localization of a novel chromosomal protein that binds to methylated DNA, Cell 69, 905-914.
11. Meehan, R. R., Lewis, J. D., and Bird, A. P. (1992) Characterization of MeCP2, a vertebrate DNA binding protein with affinity for methylated DNA, Nucleic Acids Res. 20, 5085-5092.
12. Lyst, M. J., Ekiert, R., Ebert, D. H., Merusi, C., Nowak, J., Selfridge, J., Guy, J., Kastan, N. R., Robinson, N. D., de Lima Alves, F., Rappsilber, J., Greenberg, M. E., and Bird, A. (2013) Rett syndrome mutations abolish the interaction of MeCP2 with the NCoR/SMRT corepressor, Nat. Neurosci. 16, 898-902.
13. Baker, S. A., Chen, L., Wilkins, A. D., Yu, P., Lichtarge, O., and Zoghbi, H. Y. (2013) An AT-hook domain in MeCP2 determines the clinical course of Rett syndrome and related disorders, Cell 152, 984-996.
14. Bogdanovic, O., and Veenstra, G. J. (2009) DNA methylation and methyl-CpG binding proteins: developmental requirements and function, Chromosoma 118, 549-565.
15. Banerjee, A., Castro, J., and Sur, M. (2012) Rett syndrome: genes, synapses, circuits, and therapeutics, Front Psychiatry 3, 34.
16. Shahbazian, M. D., Antalffy, B., Armstrong, D. L., and Zoghbi, H. Y. (2002) Insight into Rett syndrome: MeCP2 levels display tissue- and cell-specific differences and correlate with neuronal maturation, Hum. Mol. Genet. 11, 115-124.

17. Amir, R. E., Van den Veyver, I. B., Wan, M., Tran, C. Q., Francke, U., and Zoghbi, H. Y. (1999) Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2, Nat. Genet. 23, 185-188.
18. Christodoulou, J., Grimm, A., Maher, T., and Bennetts, B. (2003) RettBASE: The IRSA MECP2 variation database - a new mutation database in evolution, Hum. Mutat. 21, 466-472.
19. Rett, A. (1966) [On a unusual brain atrophy syndrome in hyperammonemia in childhood], Wien. Med. Wochenschr. 116, 723-726.
20. Hagberg, B., Aicardi, J., Dias, K., and Ramos, O. (1983) A progressive syndrome of autism, dementia, ataxia, and loss of purposeful hand use in girls: Rett's syndrome: report of 35 cases, Ann. Neurol. 14, 471-479.
21. Valinluck, V., Tsai, H. H., Rogstad, D. K., Burdzy, A., Bird, A., and Sowers, L. C. (2004) Oxidative damage to methyl-CpG sequences inhibits the binding of the methyl-CpG binding domain (MBD) of methyl-CpG binding protein 2 (MeCP2), Nucleic Acids Res. 32, 4100-4108.
22. Khrapunov, S., Warren, C., Cheng, H., Berko, E. R., Greally, J. M., and Brenowitz, M. (2014) Unusual characteristics of the DNA binding domain of epigenetic regulatory protein MeCP2 determine its binding specificity, Biochemistry 53, 3379-3391.
23. Yusufzai, T. M., and Wolffe, A. P. (2000) Functional consequences of Rett syndrome mutations on human MeCP2, Nucleic Acids Res. 28, 4172-4179.
24. Ho, K. L., McNae, I. W., Schmiedeberg, L., Klose, R. J., Bird, A. P., and Walkinshaw, M. D. (2008) MeCP2 binding to DNA depends upon hydration at methyl-CpG, Mol. Cell 29, 525-531.

25. Ghosh, R. P., Horowitz-Scherer, R. A., Nikitina, T., Gierasch, L. M., and Woodcock, C. L. (2008) Rett syndrome-causing mutations in human MeCP2 result in diverse structural changes that impact folding and DNA interactions, J. Biol. Chem. 283, 20523-20534.
26. Kucukkal, T. G., Yang, Y., Uvarov, O., Cao, W., and Alexov, E. (2015) Impact of Rett Syndrome Mutations on MeCP2 MBD Stability, Biochemistry 54, 6357-6368.
27. Kucukkal, T. G., and Alexov, E. (2015) Structural, Dynamical, and Energetical Consequences of Rett Syndrome Mutation R133C in MeCP2, Comput. Math. Methods Med. 2015, 746157.
28. Mellen, M., Ayata, P., Dewell, S., Kriaucionis, S., and Heintz, N. (2012) MeCP2 binds to 5hmC enriched within active genes and accessible chromatin in the nervous system, Cell 151, 1417-1430.
29. Spruijt, C. G., Gnerlich, F., Smits, A. H., Pfaffeneder, T., Jansen, P. W., Bauer, C., Munzel, M., Wagner, M., Muller, M., Khan, F., Eberl, H. C., Mensinga, A., Brinkman, A. B., Lephikov, K., Muller, U., Walter, J., Boelens, R., van Ingen, H., Leonhardt, H., Carell, T., and Vermeulen, M. (2013) Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives, Cell 152, 1146-1159.
30. Wan, M., Lee, S. S., Zhang, X., Houwink-Manville, I., Song, H. R., Amir, R. E., Budden, S., Naidu, S., Pereira, J. L., Lo, I. F., Zoghbi, H. Y., Schanen, N. C., and Francke, U. (1999) Rett syndrome and beyond: recurrent spontaneous and familial MECP2 mutations at CpG hotspots, Am. J. Hum. Genet. 65, 1520-1529.
31. Ballestar, E., Yusufzai, T. M., and Wolffe, A. P. (2000) Effects of Rett syndrome mutations of the methyl-CpG binding domain of the transcriptional repressor MeCP2 on selectivity for association with methylated DNA, Biochemistry 39, 7100-7106.

32. Laccone, F., Huppke, P., Hanefeld, F., and Meins, M. (2001) Mutation spectrum in patients with Rett syndrome in the German population: Evidence of hot spot regions, Hum. Mutat. 17, 183-190.
33. Provencher, S. W., and Glockner, J. (1981) Estimation of globular protein secondary structure from circular dichroism, Biochemistry 20, 33-37.
34. Lobley, A., Whitmore, L., and Wallace, B. A. (2002) DICHROWEB: an interactive website for the analysis of protein secondary structure from circular dichroism spectra, Bioinformatics 18, 211-212.
35. Whitmore, L., and Wallace, B. A. (2008) Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases, Biopolymers 89, 392-400.
36. Abdul-Gader, A., Miles, A. J., and Wallace, B. A. (2011) A reference dataset for the analyses of membrane protein secondary structures and transmembrane residues using circular dichroism spectroscopy, Bioinformatics 27, 1630-1636.
37. Humphrey, W., Dalke, A., and Schulten, K. (1996) VMD: visual molecular dynamics, J. Mol. Graphics 14, 33-38.
38. de Groot, B. L., van Aalten, D. M. F., Scheek, R. M., Amadei, A., Vriend, G., and Berendsen, H. J. C. (1997) Prediction of protein conformational freedom from distance constraints, Proteins: Struct., Funct., Bioinf. 29, 240-251.
39. Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R. D., Kalé, L., and Schulten, K. (2005) Scalable molecular dynamics with NAMD, J. Comput. Chem. 26, 1781-1802.
40. Tanner, D. E., Chan, K. Y., Phillips, J. C., and Schulten, K. (2011) Parallel Generalized Born Implicit Solvent Calculations with NAMD, J. Chem. Theory Comput. 7, 3635-3642.

41. Chipot, C. (2007) Free Energy Calculations: Theory and Applications in Chemistry and Biology, Springer-Verlag, Berlin.
42. MacKerell, A. D., Bashford, D., Bellott, M., Dunbrack, R. L., Evanseck, J. D., Field, M. J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F. T., Mattos, C., Michnick, S., Ngo, T., Nguyen, D. T., Prodhom, B., Reiher, W. E., Roux, B., Schlenkrich, M., Smith, J. C., Stote, R., Straub, J., Watanabe, M., Wiorkiewicz-Kuczera, J., Yin, D., and Karplus, M. (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins, J. Phys. Chem. B 102, 3586-3616.
43. MacKerell, A. D., Feig, M., and Brooks, C. L. (2004) Improved treatment of the protein backbone in empirical force fields, J. Am. Chem. Soc. 126, 698-699.
44. MacKerell, A. D., Feig, M., and Brooks, C. L. (2004) Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations, J. Comput. Chem. 25, 1400-1415.
45. Darden, T., York, D., and Pedersen, L. (1993) Particle Mesh Ewald - an N.Log(N) Method for Ewald Sums in Large Systems, J. Chem. Phys. 98, 10089-10092.
46. Liu, P., Dehez, F., Cai, W., and Chipot, C. (2012) A toolkit for the analysis of free-energy perturbation calculations, J. Chem. Theory Comput. 8, 2606-2616.

Figure Legends
Figure 1. Domain architecture in methyl binding proteins and MeCP2. a) Schematic illustration of the methyl-CpG binding domain conserved among MBP family proteins. b) MeCP2-MBD Rett syndrome mutation frequencies overlaid on the primary sequence and the secondary structure of MeCP2-MBD. Bars: mutation frequency; for sites with multiple mutations, the bars represent the combined number of mutations. The residues highlighted in green are present in the MBD-DNA interface. The α-helix (blue tube), β-strands (orange arrows) and turns (gray arcs) are drawn according to the crystal structure (PDB: 3C2I). c) MBD-mC/mC structure based on PDB entry 3C2I. d) Rett mutations highlighted on the MBD-mC/mC structure. Orange: residues that directly interact with CpG bases; purple: residues involved in the Asx-ST motif; green: residues present in the MBD-DNA interface; blue: residues that do not interact with DNA directly.

Figure 2. Representative images of gel mobility shift analysis of MeCP2-MBD on duplex DNA with modified CpG pair. Representative images of MeCP2-MBD binding to a) C/C, b) mC/mC, c) hmC/hmC, d) fC, e) caC. f) Curve fitting of experimental data.

Figure 3. Dissociation constants of MeCP2-MBD binding to various oxidative states of CpG-containing DNA. Dissociation constants of MeCP2-MBD binding to a) C/C, b) mC/mC, c) hmC/hmC, d) fC, e) caC. Binding analysis and calculations were performed as described in Materials and Methods. Data are averages of three independent experiments.

Figure 4. Representative images of gel mobility shift analysis of MeCP2-MBD mutants on duplex DNA containing C/C or mC/mC. a)-d) Representative images for MeCP2-MBD mutants. e) Curve fitting of experimental data for MeCP2-MBD mutants binding to C/C. f) Curve fitting of experimental data for MeCP2-MBD mutants binding to mC/mC.

Figure 5. Interactions between MeCP2-MBD and modified CpG. a) Closeup view of Asx-ST motif. b) Direct base contacts in MBD-mC/mC structure. Green circles: water molecules. C5 methyl group is shown in yellow.

Figure 6. Correlation of differences of free energy changes in DNA binding and protein unfolding. a) Schematic illustration of free energy changes in DNA binding and unfolding. b) Differences in free energy changes in binding to C/C (∆∆GC) versus differences in free energy changes in binding to mC/mC (∆∆GmC). c) Differences in free energy changes in binding to mC/mC (∆∆GmC) versus changes in unfolding free energy (∆∆Gunfolding). d) Differences in free energy changes in binding to C/C (∆∆GC) versus changes in unfolding free energy (∆∆Gunfolding). e) Differences in binding free energy changes of mC/mC to C/C (∆∆GmC/C) versus changes in unfolding free energy (∆∆Gunfolding).

Figure 7. RMSF profiles for P152R, S134C and A140V for CA atoms of all residues in MBD. Residues 105-111 and 119-125 are the beta strands (orange arrows) and residues 135-145 form the helix region (blue tube).

Table 1. Summary of Kd values and ∆∆G of wild-type and mutant MeCP2-MBD proteins^a

MBD Variants | Mutation Frequency^b | Kd,mC (nM) | ∆∆GmC (kcal mol^-1) | Kd,C (nM) | ∆∆GC (kcal mol^-1) | ∆∆GmC/C (kcal mol^-1)
WT |  | 6 ± 0.5 | N/A | 398 ± 49 | N/A | 2.48 ± 0.07
L100V | 7 | 21.6 ± 3.6 | 0.76 ± 0.10 | 568 ± 102 | 0.21 ± 0.11 | 1.94 ± 0.11
R106W | 121 | 76.5 ± 20.2 | 1.51 ± 0.16 | 596 ± 97 | 0.24 ± 0.09 | 1.22 ± 0.09
R106Q | 19 | 153 ± 22.1 | 1.92 ± 0.09 | 1389 ± 221 | 0.74 ± 0.1 | 1.31 ± 0.10
R111G | 1 | 554.6 ± 145.7 | 2.68 ± 0.16 | 1210 ± 209 | 0.66 ± 0.1 | 0.46 ± 0.10
R133C | 186 | 99.8 ± 11.9 | 1.66 ± 0.15 | 1203 ± 392 | 0.65 ± 0.05 | 1.47 ± 0.05
R133H | 8 | 77.9 ± 18.8 | 1.52 ± 0.07 | 629 ± 54 | 0.27 ± 0.19 | 1.24 ± 0.19
S134C | 19 | 28.2 ± 1.0 | 0.92 ± 0.02 | 759 ± 184 | 0.38 ± 0.14 | 1.95 ± 0.14
A140V | 28 | 48.5 ± 8.6 | 1.24 ± 0.11 | 1205 ± 197 | 0.66 ± 0.1 | 1.9 ± 0.10
P152R | 62 | 31.1 ± 14.3 | 0.97 ± 0.31 | 996 ± 205 | 0.54 ± 0.12 | 2.05 ± 0.12
F155S | 2 | 22.6 ± 2.7 | 0.78 ± 0.07 | 884 ± 179 | 0.47 ± 0.13 | 2.17 ± 0.13
D156E | 14 | 47.9 ± 8.6 | 1.23 ± 0.10 | 509 ± 118 | 0.15 ± 0.14 | 1.4 ± 0.14
T158M | 393 | 14.2 ± 1.5 | 0.51 ± 0.07 | 351 ± 54 | -0.07 ± 0.09 | 1.9 ± 0.09
T158A | 2 | 27.6 ± 7.5 | 0.9 ± 0.08 | 1437 ± 73 | 0.76 ± 0.03 | 2.34 ± 0.03

a Dissociation constants were measured and averaged from three independent experiments.
b Patients with indicated mutants out of 4841 Rett syndrome patients. Data retrieved from RettBASE: RettSyndrome.org Variation Database (mecp2.chw.edu.au).
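The ∆∆G columns of Table 1 can be related to the measured Kd values with a short Python sketch. It assumes the standard relation ∆∆G = RT ln(Kd,mutant / Kd,WT) for the binding columns and RT ln(Kd,C / Kd,mC) for the selectivity column, at an assumed temperature of 298.15 K. Neither the formula nor the temperature is stated in this section, although for the rows checked these choices appear to reproduce the tabulated values to within about 0.01 kcal mol^-1. The function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ddg_binding(kd_mutant_nM, kd_wt_nM, temperature_K=298.15):
    """Change in binding free energy of a mutant relative to wild type.

    Assumes ddG = R * T * ln(Kd_mutant / Kd_WT); T = 298.15 K is an assumption.
    """
    return R_KCAL * temperature_K * math.log(kd_mutant_nM / kd_wt_nM)

def ddg_selectivity(kd_c_nM, kd_mc_nM, temperature_K=298.15):
    """Free energy preference for mC/mC over C/C DNA for a single variant.

    Assumes ddG = R * T * ln(Kd_C / Kd_mC).
    """
    return R_KCAL * temperature_K * math.log(kd_c_nM / kd_mc_nM)

# Example using the L100V and WT rows of Table 1
l100v_ddg_mc = ddg_binding(21.6, 6.0)        # compare with 0.76 in Table 1
l100v_ddg_c = ddg_binding(568.0, 398.0)      # compare with 0.21 in Table 1
wt_selectivity = ddg_selectivity(398.0, 6.0)  # compare with 2.48 in Table 1
```

Because the ratio of two Kd values is dimensionless, the same functions apply whether the constants are given in nM or any other common concentration unit.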
