
Letter

The N6-Position of Adenine is a Blind Spot for TAL-Effectors that Enables Effective Binding of Methylated and Fluorophore-Labeled DNA Sarah Flade, Julia Jasper, Mario Giess, Matyas Juhasz, Andreas Dankers, Grzegorz Kubik, Oliver Koch, Elmar Weinhold, and Daniel Summerer ACS Chem. Biol., Just Accepted Manuscript • Publication Date (Web): 11 May 2017 Downloaded from http://pubs.acs.org on May 12, 2017


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


ACS Chemical Biology



ACS Paragon Plus Environment


The N6-Position of Adenine is a Blind Spot for TAL-Effectors that Enables Effective Binding of Methylated and Fluorophore-Labeled DNA

Sarah Flade (1), Julia Jasper (1), Mario Gieß (1), Matyas Juhasz (2), Andreas Dankers (2), Grzegorz Kubik (1), Oliver Koch* (1), Elmar Weinhold* (2) and Daniel Summerer* (1)

(1) Department of Chemistry and Chemical Biology, TU Dortmund University, Otto-Hahn-Str. 6, 44227 Dortmund (Germany)
(2) Institute of Organic Chemistry, RWTH Aachen University, Landoltweg 1, 52056 Aachen (Germany)

ABSTRACT: Transcription-activator-like effectors (TALEs) are programmable DNA binding proteins widely used for genome targeting. TALEs consist of multiple concatenated repeats, each selectively recognizing one nucleobase via a defined repeat variable diresidue (RVD). Effective use of TALEs requires knowledge about their binding ability to epigenetic and other modified nucleobases occurring in target DNA. However, aside from epigenetic cytosine-5 modifications, the binding ability of TALEs to modified DNA is unknown. Here, we study the binding of TALEs to the epigenetic nucleobase N6-methyladenine (6mA), which is found in prokaryotic and, as recently shown, also in eukaryotic genomes. We find that the natural, adenine (A)-binding RVD NI is insensitive to 6mA. Model-assisted structure-function studies reveal accommodation of 6mA by RVDs with altered hydrophobic surfaces and abilities of hydrogen bonding to the N6-amino group or N7-atom of A. Surprisingly, this tolerance of N6-substitution was transferable to bulky N6-alkynyl substituents usable for click chemistry and even to a large rhodamine dye, establishing the N6 position of A as the first site of DNA that offers label introduction within TALE target sites without interference. These findings will guide future in vivo studies with TALEs and expand their applicability as DNA capture probes for analytical applications in vitro.
Transcription-activator-like effectors (TALEs) are DNA binding proteins used for genome engineering, transcription regulation as well as epigenome editing and analysis in a wide range of organisms.1-3 TALEs contain a central DNA binding domain consisting of multiple concatenated repeats that each selectively recognize one nucleobase via one of two variable amino acids (the repeat variable diresidue, RVD).32,33 This recognition occurs via a predictable code, offering programmable sequence selectivity (RVDs with amino acids NI, NN, NG and HD at positions 12 and 13 within each TALE repeat preferentially bind to A, G, T, and C, respectively;4-5 see Fig. 1A). The central, critical property of TALEs for any genome targeting approach is their sequence selectivity and propensity to bind genomic off-target sites. Consequently, RVDs with maximal selectivity for each canonical nucleobase are required, and mutational analyses have already afforded improved RVDs.6-7 A second critical property of TALEs is their sensitivity to epigenetic nucleobases occurring in target genomes, which can abolish their desired function. For example, RVD HD has been described to be sensitive to 5-methylcytosine (5mC),8-10 which has motivated the establishment of tailored RVDs with similar binding to C and 5mC.9 Moreover, mutational analyses have afforded TALE repeats with selectivity or promiscuity in the full range of epigenetic 5-modified C nucleobases, i.e. including 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC),11-13 which are generated from 5mC by ten-eleven-translocation (TET) dioxygenases.14-18 However, the binding ability of TALEs to other modified nucleobases is not known.

Methylation of the 6-amino group of A to N6-methyladenine (6mA, Fig. 1B) has long been known to be a major epigenetic mark in bacteria, archaea, and fungi.19 Here, 6mA can serve as a control element of protein-DNA interactions involved in genome defense, DNA replication/repair, and transcription regulation.20 Notably, transcription regulation by 6mA is often involved in important bacterial pathogenic processes,21 such as the regulation of secretion systems,22 adhesion,23 and the control of glucan and potentially bacteriocin production.24 Recently, 6mA has also been discovered in diverse eukaryotic organisms25-27 including vertebrates and even mammals28-31 (for mouse cells, contradictory data have been reported very recently).32 Though 6mA generally occurs at low levels, multiple lines of evidence for its involvement in transcription regulation were found.
For example, whereas 6mA generally correlated with active transcription in Chlamydomonas25 and Drosophila melanogaster,26 it correlated with silencing of LINE-1 transposons in the mouse.28 Moreover, DNA methyltransferases and oxidative demethylases were identified for 6mA in several organisms, and 6mA levels were found to be dynamic during development.33 In light of this widespread occurrence of 6mA in organisms from all domains of life, it is important to understand the binding ability of TALEs to this epigenetic nucleobase.
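The RVD recognition code summarized in the introduction (NI, NN, NG and HD preferentially binding A, G, T and C) can be illustrated with a small lookup. This is only a sketch for orientation: the function name `rvds_for_target` is hypothetical and not part of the study, and the mapping ignores known secondary preferences such as RVD NN also binding A.

```python
# Canonical RVD code as stated in the text: NI->A, NN->G, NG->T, HD->C.
# Keys are target nucleobases, values are the RVDs that preferentially bind them.
RVD_FOR_BASE = {"A": "NI", "G": "NN", "T": "NG", "C": "HD"}


def rvds_for_target(target: str) -> list[str]:
    """Return one RVD per nucleobase of a DNA target written 5'->3'.

    Hypothetical helper for illustration; raises KeyError for
    characters other than A, C, G, T.
    """
    return [RVD_FOR_BASE[base] for base in target.upper()]


# Example: the GATC motif that contains the A/6mA position studied here.
print(rvds_for_target("GATC"))
```

Because the study finds NI to be insensitive to N6-methylation, a design based on this lookup would not need a separate entry for 6mA opposite an A-binding repeat.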

Figure 1. General features of TALEs, recognition of nucleobase A and methylation of A to 6mA. A: Cartoon showing features of employed TALEs. The amino acid sequence of one representative TALE repeat is shown on top, with RVD amino acids 12 and 13 marked with a grey box. TALE repeats with selectivity for canonical nucleobases are shown on the right with RVDs specified. B: Structures of nucleobases A and 6mA. C: Structure of the interaction of TALE RVD NI with A as found in a crystal structure (pdb entry 3UGM).34 Distances are indicated in Ångström and shown as dashed black lines.

To study the interaction of the natural TALE RVD NI with the nucleobases A and 6mA (Fig. 1C shows a crystal structure of this RVD bound to A34), we constructed TALE_Pap2_(NI) targeting a 17 nt sequence around a regulatory GATC site within the E. coli pap operon (Pap2, Fig. 2A). Methylation of this and a second GATC site by Dam DNA-methyltransferase controls binding of the leucine-responsive regulatory protein and ultimately expression of pyelonephritis-associated pili (Pap) that mediate adhesion of pathogenic E. coli.35-37 We generally constructed TALEs based on a Xanthomonas axonopodis scaffold38 by golden gate assembly.39 TALEs contained an N-terminal thioredoxin or green fluorescent protein (GFP) domain, a shortened TALE NTR (+136 amino acids starting from canonical repeat 1) and a C-terminal His6 tag (for protein sequences, see SI). We expressed TALEs in E. coli and purified them by Ni-NTA chromatography (SI Fig. 1). We employed TALE_Pap2_(NI) in electromobility shift assays (EMSA) using the GFP fluorophore as read-out, together with unlabeled DNA oligonucleotide duplexes containing the Pap2 sequence with either an A or a 6mA at position 7 opposite RVD NI (Pap2 and Pap2_6mA, Fig. 2A). NI is the most frequently found RVD for binding of A in natural TALEs,1 and has been shown to be selective in the context of the four canonical nucleobases,4-5, 7, 40 but how NI interacts with 6mA is unknown. I13 forms a hydrophobic surface at the Hoogsteen face of A with atomic distances between 3.8 and 4.1 Å (Fig. 1C) and is found in different conformations in crystal structures.34, 41-42 It can be postulated that this involves CH-N hydrogen bonding, a rarely occurring and weak interaction.43 However, an additional hydrophobic interaction to 6mA should be conceivable, since an intact hydrogen bond between the N6-amino hydrogen of A and the O4-atom of the paired T nucleobase would orient the A N6-methyl group towards the side chain of I13 of the RVD. EMSA revealed the formation of single, defined TALE-DNA complexes with increased electromobility compared to the free TALE protein, but without differences in binding to A- or 6mA-containing DNA (Fig. 2B and 2C).

Figure 2. Interaction of nonpolar TALE RVDs with A and 6mA in DNA. A: Target sequences of the pap operon used in this study. GATC sites are underlined with A/6mA of interest in bold red. B: EMSA with TALE_Pap2_(NI) and varying concentrations of DNA duplexes with sequence Pap2 shown in Fig. 2A and a single A or 6mA at position 7. C: Quantification of EMSAs as in Fig. 2B using a DNA/TALE ratio of 6.5 : 1. D: Principle of TALE-controlled primer extension. Competitive binding of TALE to DNA inhibits primer extension by KF(exo-). TALE is shown as cartoon with the RVD varied in this study in red. X and Y denote two different nucleobases. E: PAGE analysis of primer extension reactions containing 25 mU KF(exo-), 41.5 µM dNTP, 8.325 nM primer-template complex, and different concentrations of TALE_Pap2_(NI) resulting in the indicated TALE/DNA ratios. Primer and extension product are marked with a black and grey arrow, respectively. F: Binding analysis of TALE_Pap2_(NI) with varying repeat numbers to Pap2 DNA containing single A or 6mA by primer extension as shown in Fig. 2D. TALE/DNA ratio was 30 : 1. Indicated repeat number includes T-binding repeat 0. AU = arbitrary units. G: Binding analysis of TALE_Pap2_(NI), _(NL) and _(NV) to Pap2 DNA containing single A or 6mA by primer extension as shown in Fig. 2D. TALE/DNA ratio was 20 : 1. H - J: Superimposed models of interactions between TALE_Pap2_(NI), _(NL), and _(NV) and DNA containing A or 6mA opposite the varied RVD. TALE is shown as cartoon, with relevant nucleobases and residue 13 (numbering within each repeat) of the varied RVD (bold red) and RVDs of adjacent repeats shown as sticks. Grey = model containing A, light brown = model containing 6mA.

However, TALE_Pap2_(NI) bound the nucleobases T, G and C with strongly reduced affinity, confirming that RVD NI in the chosen sequence context exhibited its expected A-selectivity (SI Fig. 12). Because of the low nucleobase selectivities typically observed for TALEs in EMSA,44 we next performed primer extension assays that exhibit far higher selectivities and should reveal even subtle binding differences.10, 12 This assay relies on the ability of TALEs to competitively inhibit DNA polymerase binding to primer-template complexes (Fig. 2D). We hybridized a 5´-32P-labeled primer with an oligonucleotide template containing sequence Pap2 (pPap2 and tPap2, see the SI) as before and incubated it with TALE_Pap2_(NI). Subsequently, we added dNTP and the Klenow fragment of E. coli DNA polymerase I (3´-5´-exo-, KF(exo-)), incubated the mixture for 15 min at room temperature and resolved it by denaturing polyacrylamide gel electrophoresis (PAGE). We quantified the extension product and used it as a measure of TALE binding. However, we again did not observe differences in binding of TALE_Pap2_(NI) to DNA containing an A or 6mA opposite the single RVD NI. This observation was consistent over various TALE concentrations (SI Fig. 2) and was also made for Pap5, a second sequence of the pap operon (Fig. 2A and SI Fig. 3). To exclude that a potential selectivity was masked by excessive binding energy contributed by the 15 remaining canonical TALE repeat-nucleobase interactions, we repeated the experiments with truncated versions of TALE_Pap2_(NI) targeting only 16 or 15 nt sequences. However, no difference was observed (Fig. 2F). This indicates that the N6-methyl group of 6mA is well accommodated by RVD NI and does not undergo measurable hydrophobic interactions.
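The arithmetic behind the primer extension read-out can be sketched as follows. The primer-template concentration (8.325 nM) is taken from the Figure 2 caption. The normalization of band intensities is an assumption for illustration only, since the text does not specify how the extension product was quantified, and all function names and example intensities are hypothetical.

```python
# Primer-template complex concentration from the Figure 2 caption (nM).
PRIMER_TEMPLATE_NM = 8.325


def tale_concentration_nm(tale_to_dna_ratio: float,
                          dna_nm: float = PRIMER_TEMPLATE_NM) -> float:
    """TALE concentration (nM) implied by a given TALE/DNA ratio."""
    return tale_to_dna_ratio * dna_nm


def extension_fraction(primer_signal: float, product_signal: float) -> float:
    """Fraction of primer converted to extension product in one gel lane.

    Assumed normalization: product / (primer + product).
    """
    return product_signal / (primer_signal + product_signal)


def relative_extension(fraction_with_tale: float,
                       fraction_without_tale: float) -> float:
    """Extension relative to a TALE-free control; lower values mean
    stronger inhibition of the polymerase, i.e. stronger TALE binding."""
    return fraction_with_tale / fraction_without_tale


# TALE concentrations implied by the ratios used in Figures 2 to 4.
for ratio in (20, 30, 60):
    print(ratio, round(tale_concentration_nm(ratio), 2), "nM")

# Made-up band intensities, for illustration only.
with_tale = extension_fraction(primer_signal=75.0, product_signal=25.0)
control = extension_fraction(primer_signal=25.0, product_signal=75.0)
print(round(relative_extension(with_tale, control), 3))
```

Under this assumed normalization, equal relative extension for A- and 6mA-containing templates corresponds to the absence of a binding difference reported above.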
Previous structural studies have shown that an A nucleobase can be accommodated by RVDs with alternative amino acids at position 13, providing the possibility to further probe the interactions between TALEs and 6mA.41 For this, we first aimed to modulate the hydrophobic surface of RVD NI by presenting the shorter V or the more flexible L residue at position 13 (TALE_Pap2_(NV) and TALE_Pap2_(NL)). In both cases, primer extension assays revealed that TALE binding to A and 6mA was comparable to that of TALE_Pap2_(NI) (Fig. 2G). This observation was also made in the second sequence context Pap5 (SI Fig. 3). In agreement with crystal structures,41 these data indicate a pronounced flexibility of the interaction of nonpolar RVDs with A, supporting the model that no specific interaction takes place with this nucleobase.41 However, the data also indicate that the presence of the N6-methyl group in 6mA is tolerated and does not lead to increased hydrophobic interactions. To rationalize these findings, we modelled Pap2_(NI), Pap2_(NV) and Pap2_(NL) with both A and 6mA. Briefly, based on available crystal structures, the base pairs of the DNA strand were mutated to the core of the Pap2 DNA sequence (AAAAGATCGGT) using Chimera. A homology model was created for each RVD variant using MOE2016.08 and the Amber10_EHT force field, including selected atoms as environment for induced fit (for details, see SI). The Pap2_(NI) model reveals that the terminal methyl group of I13 can simply rotate to accommodate the 6mA (see Figure 2H). A similar change in conformation is observed for Pap2_(NL) (see Figure 2I). This illustrates that in both RVDs the side chains can easily avoid unfavorable interactions with the 6mA methyl group by relatively small conformational changes. In the case of Pap2_(NV), the shorter side chain of V13 has a larger distance to the 6mA methyl group so that no rotation is required to accommodate 6mA (see Figure 2J). These findings imply that the methyl group of 6mA can be well accommodated by all three RVDs. The larger side chains of I13 and L13 thereby adopt conformations that avoid steric clash without undergoing hydrophobic interactions strong enough to result in a measurable change in selectivity.
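The core Pap2 sequence quoted above contains the GATC site recognized by Dam. A short sketch shows why this site is present on both strands: GATC is its own reverse complement. The helper names are hypothetical, and the indices refer only to the 11 nt core sequence given in the text, not to the full 17 nt target.

```python
# Complement table for the four canonical nucleobases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = "GATC") -> list[int]:
    """Return 0-based start indices of all occurrences of motif in seq."""
    seq = seq.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


# Core of the Pap2 target sequence as given in the modelling description.
CORE_PAP2 = "AAAAGATCGGT"

print(find_sites(CORE_PAP2))                      # site on the sense strand
print(reverse_complement("GATC") == "GATC")       # palindromic motif
print(find_sites(reverse_complement(CORE_PAP2)))  # site on the antisense strand
```

The palindromic motif means that each strand carries one A within GATC, which is consistent with the modification of both strands described for the Dam-catalyzed labeling reactions.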

Figure 3. Interaction of polar TALE RVDs with A and 6mA in DNA. A: Binding analysis of TALE_Pap2_(NN) to Pap2 DNA containing single A or 6mA by primer extension as shown in Fig. 2D. TALE/DNA ratio was 60 : 1. B: Superimposed models of interactions between TALE_Pap2_(NN) and DNA containing A or 6mA opposite the varied RVD. C: Binding analysis of TALE_Pap2_(NE) to Pap2 DNA containing single A or 6mA by primer extension as shown in Fig. 2D. TALE/DNA ratio was 60 : 1. D: Superimposed models of interactions between TALE_Pap2_(NE) and DNA containing A or 6mA opposite the varied RVD.

We next aimed to investigate the interactions of polar RVDs with A and 6mA. RVD NN is known to bind both G and A,4-5, 7, 40 and a hydrogen bond between the amide amino group of N13 and the N7-atom of A has been observed in crystal structures.41 However, in primer extensions, TALE_Pap2_(NN) exhibited comparable binding to 6mA and A, indicating little influence of the N6-methyl group on overall affinity (Fig. 3A). The model of TALE_Pap2_(NN) shows that N13 of the interacting RVD can slightly rotate to accommodate the 6mA methyl group, resulting in a loss of the hydrogen bond between its amino group and N7 of A. The observation that this does not weaken the affinity of the TALE indicates that the corresponding hydrogen bond might be relatively weak. This is in accordance with the available X-ray structure, where N13 adopts different conformations in chains A and B of pdb entry 4OSJ39, resulting in a distance of 4.1 Å between the N-atom of the N13 amino group and N7 in chain B and a loss of the hydrogen bond (see SI, Fig. 4). The electron densities of the N13 residues are not well resolved in either chain (SI Fig. 5), implying that the hydrogen bond to N7 of A is relatively weak, so that the rotation of N13 for accommodation of 6mA does not result in reduced binding.

Figure 4. Modification of DNA at the N6-position of A and insensitivity of TALEs for such modifications. A: Modification of A at the N6 position by the use of E. coli Dam DNA methyltransferase (EcoDam) and cofactor analogs of S-adenosylmethionine (SAM).45-46 B - D: ESI-TOF analysis of DNA oligonucleotides containing the Pap2 sequence without labeling (B), labeled as in Fig. 4A using cofactor 1b (C) and cofactor 1c (D). s = sense strand. a = antisense strand. Masses: s calc: 14977.8, s found: 14977.9; a calc: 15162.8, a found: 15163.9; s (2b) calc: 15082.8, s (2b) found: 15083; a (2b) calc: 15267.8, a (2b) found: 15268; s (2c) calc: 15713.1, s (2c) found: 15714; a (2c) calc: 15898.1, a (2c) found: 15898.7. E: Analysis of TALE_Pap2_(NI) interaction with DNA containing sequence Pap2 reacted with or without cofactors 1b or 1c by primer extension reactions as in Fig. 2D. TALE/DNA ratio was 60 : 1. F: Superimposed models of interactions between TALE_Pap2_(NI) and DNA containing two A or 2a nucleobases, one opposite RVD NI in the TALE-bound DNA strand (green sticks) and one in the antisense DNA strand in the palindromic GATC in sequence Pap2 (yellow sticks). Structures are labeled as in Fig. 2 H - J. Grey = model containing A, light brown = model containing 2a. G: Same as Fig. 4F, but as a differently oriented surface representation.
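The masses listed in the Figure 4 caption can be cross-checked with a few lines of arithmetic. The sketch below computes, for each strand, the deviation between found and calculated mass and the calculated mass shift introduced by each label. The values are copied from the caption; the unit (assumed Da) is not stated there, and the helper names are hypothetical.

```python
# (calculated, found) masses from the Figure 4 caption, keyed by
# (strand, label). "s" = sense strand, "a" = antisense strand.
# Units are assumed to be Da; the caption does not state them.
MASSES = {
    ("s", "unlabeled"): (14977.8, 14977.9),
    ("a", "unlabeled"): (15162.8, 15163.9),
    ("s", "2b"): (15082.8, 15083.0),
    ("a", "2b"): (15267.8, 15268.0),
    ("s", "2c"): (15713.1, 15714.0),
    ("a", "2c"): (15898.1, 15898.7),
}


def deviation(strand: str, label: str) -> float:
    """Found minus calculated mass, rounded to one decimal place."""
    calc, found = MASSES[(strand, label)]
    return round(found - calc, 1)


def calculated_shift(strand: str, label: str) -> float:
    """Calculated mass gain of a labeled strand over the unlabeled strand."""
    labeled_calc = MASSES[(strand, label)][0]
    unlabeled_calc = MASSES[(strand, "unlabeled")][0]
    return round(labeled_calc - unlabeled_calc, 1)


for strand, label in MASSES:
    print(strand, label, "found - calc =", deviation(strand, label))

for label in ("2b", "2c"):
    for strand in ("s", "a"):
        print(strand, label, "calculated shift =", calculated_shift(strand, label))
```

Both strands show the same calculated shift for a given label, as expected when each strand receives exactly one substituent at its single GATC adenine.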

Next, we aimed to assess the particularly interesting polar RVD NE. In vivo activity assays with TALE nucleases and transcription activators bearing this RVD suggested no binding to A,7, 47 but compared to N13, the E13 side chain can reach further into the DNA major groove towards the N6 amino group of A, and indeed, a direct hydrogen bond has been observed in a crystal structure (Fig. 3C).41 Hence, this interaction may be more directly influenced by N6-methylation. Strikingly, even TALE_Pap2_(NE) did not exhibit differential binding to A and 6mA in primer extensions (Fig. 3C). In the model, the large side chain of E13 is completely re-oriented, resulting in loss of the hydrogen bond to N6 and establishment of a hydrogen bond to its own backbone amide. As a result of this conformational change, N13 of the preceding RVD is also re-oriented to avoid unfavorable interactions (Fig. 3D). As for Pap2_(NN), the promiscuity of RVD NE implies that the hydrogen bond to N6 is weak and can be replaced by a backbone hydrogen bond. In the crystal structure, E13 adopts only slightly different conformations in chains A and B, resulting in an O – N6 distance of 3.1 - 3.2 Å (SI Fig. 6). Again, the electron density is not well resolved for either chain (SI Fig. 7), suggesting that the respective hydrogen bond is weak and that the E13 side chain can rotate to accommodate 6mA without a reduction of binding. Taken together, these data indicate that the TALE repeat scaffold, though giving rise to a wide range of selective RVDs for T, G, C, and epigenetic modifications of the latter,6-7, 11-13 is highly promiscuous towards A and 6mA, even when significant structural changes are introduced. This methylation insensitivity provides robust binding of TALEs to user-defined DNA sequences containing potentially methylated As.
The major-groove-directed DNA recognition mode makes TALEs attractive capture probes for analytical applications that can deliver information that Watson-Crick hybridization probes cannot. For example, TALE repeats selective for epigenetic nucleobases presenting unique groups in the major groove have been identified, such as 5mC and 5hmC.11-13 This enabled the design of TALE probes for the direct in vitro detection of 5mC and 5hmC at user-defined genomic sites by solid phase affinity enrichment.44 Such applications could greatly benefit from the applicability of fluorescently labeled DNA for detection. However, the preferred attachment site of labels to DNA is exactly the major-groove-directed 5-position of pyrimidines that is tightly recognized by TALEs.32,33 Though the N6-position of A is also directed towards the major groove, the tolerance of methylation raised the question of whether this position may be a permissive site also for larger substituents. An attractive strategy for label introduction at A is the use of S-adenosylmethionine (SAM; 1a in Fig. 4A) analogs with substituents replacing the electrophilic methyl group at the sulfonium center. These can act as cofactors for DNA methyltransferases, allowing the substituents to be transferred to the nucleophilic N6-position of A (Fig. 4A).48-50 We were interested in whether the dialkynyl substituent transferred by cofactor 1b (Fig. 4A) in sequence Pap2 would interfere with binding of TALE_Pap2_(NI), since this substituent does not contain branch points and provides a certain flexibility, but can be used for subsequent labeling with bulky fluorophores via Cu(I)-catalyzed [3+2] azide-alkyne cycloadditions.51 We found that E. coli Dam DNA methyltransferase (EcoDam) was able to accept 1b as a cofactor (SI Fig. 5). Taking advantage of this, we subjected oligonucleotide duplexes containing the Pap2 target sequence to a labeling reaction containing EcoDam and 80 µM of the cofactor 1b (see SI). Analysis by electrospray-ionization time-of-flight (ESI-TOF) mass spectrometry suggested quantitative labeling of both DNA strands with 1b (Fig. 4B, C and SI Fig. 10). However, no difference in binding was observed for TALE_Pap2_(NI) between DNA containing an A or a 2b nucleobase, indicating that this larger substituent is well tolerated by TALEs, even when present on both DNA strands (Fig. 4E). Modelling studies suggested that the substituents at the two 2b nucleobases, even though presented in the major groove, can align and thread through the open half of the groove towards the surface of the complex (Fig. 4F, G). This model also suggested that TALEs could tolerate even larger substituents when connected via the terminal alkyne moiety of 1b. We repeated the labeling reaction with cofactor 1c bearing a substituent derived from 1b by Cu(I)-catalyzed [3+2] azide-alkyne cycloaddition with a bulky tetramethylrhodamine (TAMRA) fluorophore connected via a tetraethylene glycol linker (Fig. 4A). ESI-TOF analysis again indicated labeling with 1c, albeit with reduced yield compared to 1b (Fig. 4A, C, SI Fig. 11). Nevertheless, primer extensions still did not reveal any difference in binding of TALE_Pap2_(NI) to DNA containing a single A or 2c nucleobase (Fig. 4E). This suggests that the large TAMRA substituent of 2c is indeed surface exposed and that other large fluorophores at the N6 position of A may be tolerated by TALEs as well. In conclusion, we report the first insights into the binding ability of TALEs to DNA modified at a position other than the 5-position of C.
We study interactions with A and its methylated counterpart 6mA, an epigenetically modified nucleobase widely found in organisms from all domains of life. Unlike for C/5mC, we find that 6mA is well tolerated by the natural, A-binding TALE RVD NI and does not require the engineering of universal repeats to avoid methylation sensitivity. 6mA can thus likely be ignored in TALE design for in vivo genome targeting applications, including genome engineering and transcriptional control. This currently includes experiments in a large number of prokaryotes, but also in a growing number of eukaryotic organisms revealed by ultrasensitive mass spectrometry methods to contain 6mA and likely to use it as a transcriptional regulatory element. Structure-function studies show that the observed 6mA tolerance of TALEs also holds true for RVDs with varying hydrophobic surfaces and with differential potential to form hydrogen bonds with A and 6mA. This suggests a limited selectivity potential of the TALE repeat scaffold for these two nucleobases, which is in marked contrast to the high selectivities observed for RVDs binding to G, T, C and epigenetic modifications of the latter.6-7, 11-13 Finally, this tolerance was transferable to bulky substituents bearing alkyne moieties and even a fluorophore, offering a route to the use of labeled DNA in in vitro analytical applications relying on TALEs as capture probes. Taken together, our study establishes the N6-position of A as a “blind spot” of TALEs, a finding that will be useful for TALE design in future studies of genome targeting in vivo and in vitro.


ASSOCIATED CONTENT Supporting Information. Experimental procedures, oligonucleotide and protein sequences, data of protein expressions/purifications, biochemical assays and modelling studies. This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION Corresponding Authors [email protected]; [email protected]; [email protected].

Funding Sources No competing financial interests have been declared. This work was supported by grants from the Deutsche Forschungsgemeinschaft (Su 726/5-1 in SPP1784 and Su 726/6-1 in SPP1623). E.W. acknowledges financial support from the German-Israeli Foundation for Scientific Research and Development (I-1196195.9/2012). O.K. acknowledges a grant from the German Federal Ministry for Education and Research (Grant No. BMBF 1316053).

ACKNOWLEDGMENT We acknowledge support by the TU Dortmund, RWTH Aachen University, the Zukunftskolleg of the University of Konstanz and the Konstanz Research School Chemical Biology. We thank A. J. Bogdanove and D. F. Voytas for TALE assembly plasmids obtained via Addgene.

REFERENCES 1. Boch, J.; Bonas, U., Xanthomonas AvrBs3 family-type III effectors: discovery and function. Ann. Rev. Phytopath. 2010, 48, 419-36. 2. Bogdanove, A. J.; Voytas, D. F., TAL effectors: customizable proteins for DNA targeting. Science 2011, 333, 1843-6. 3. Thakore, P. I.; Black, J. B.; Hilton, I. B.; Gersbach, C. A., Editing the epigenome: technologies for programmable transcription and epigenetic modulation. Nat. Methods 2016, 13, 127-37. 4. Moscou, M. J.; Bogdanove, A. J., A simple cipher governs DNA recognition by TAL effectors. Science 2009, 326, 1501. 5. Boch, J.; Scholze, H.; Schornack, S.; Landgraf, A.; Hahn, S.; Kay, S.; Lahaye, T.; Nickstadt, A.; Bonas, U., Breaking the code of DNA binding specificity of TAL-type III effectors. Science 2009, 326, 1509-12. 6. Miller, J. C.; Zhang, L.; Xia, D. F.; Campo, J. J.; Ankoudinova, I. V.; Guschin, D. Y.; Babiarz, J. E.; Meng, X.; Hinkley, S. J.; Lam, S. C.; Paschon, D. E.; Vincent, A. I.; Dulay, G. P.; Barlow, K. A.; Shivak, D. A.; Leung, E.; Kim, J. D.; Amora, R.; Urnov, F. D.; Gregory, P. D.; Rebar, E. J., Improved specificity of TALEbased genome editing using an expanded RVD repertoire. Nat. Methods 2015, 12, 465-71. 7. Yang, J.; Zhang, Y.; Yuan, P.; Zhou, Y.; Cai, C.; Ren, Q.; Wen, D.; Chu, C.; Qi, H.; Wei, W., Complete decoding of TAL effectors for DNA recognition. Cell Res. 2014, 24, 628-631. 8. Bultmann, S.; Morbitzer, R.; Schmidt, C. S.; Thanisch, K.; Spada, F.; Elsaesser, J.; Lahaye, T.; Leonhardt, H., Targeted transcriptional activation of silent oct4 pluripotency gene by combining designer TALEs and inhibition of epigenetic modifiers. Nucleic Acids Res. 2012, 40, 5368-5377. 9. Valton, J.; Dupuy, A.; Daboussi, F.; Thomas, S.; Marechal, A.; Macmaster, R.; Melliand, K.; Juillerat, A.; Duchateau, P., Overcoming Transcription Activator-like Effector (TALE) DNA

5

ACS Paragon Plus Environment

Page 7 of 8

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Chemical Biology

Binding Domain Sensitivity to Cytosine Methylation. J. Biol. Chem. 2012, 287, 38427-38432.
10. Kubik, G.; Schmidt, M. J.; Penner, J. E.; Summerer, D., Programmable and highly resolved in vitro detection of 5-methylcytosine by TALEs. Angew. Chem. Int. Ed. Engl. 2014, 53, 6002-6.
11. Maurer, S.; Giess, M.; Koch, O.; Summerer, D., Interrogating Key Positions of Size-Reduced TALE Repeats Reveals a Programmable Sensor of 5-Carboxylcytosine. ACS Chem. Biol. 2016, 11, 3294-3299.
12. Kubik, G.; Summerer, D., Achieving single-nucleotide resolution of 5-methylcytosine detection with TALEs. Chembiochem 2015, 16, 228-31.
13. Kubik, G.; Batke, S.; Summerer, D., Programmable sensors of 5-hydroxymethylcytosine. J. Am. Chem. Soc. 2015, 137, 2-5.
14. Tahiliani, M.; Koh, K. P.; Shen, Y.; Pastor, W. A.; Bandukwala, H.; Brudno, Y.; Agarwal, S.; Iyer, L. M.; Liu, D. R.; Aravind, L.; Rao, A., Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 2009, 324, 930-5.
15. Kriaucionis, S.; Heintz, N., The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 2009, 324, 929-30.
16. Ito, S.; Shen, L.; Dai, Q.; Wu, S. C.; Collins, L. B.; Swenberg, J. A.; He, C.; Zhang, Y., Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 2011, 333, 1300-3.
17. He, Y. F.; Li, B. Z.; Li, Z.; Liu, P.; Wang, Y.; Tang, Q.; Ding, J.; Jia, Y.; Chen, Z.; Li, L.; Sun, Y.; Li, X.; Dai, Q.; Song, C. X.; Zhang, K.; He, C.; Xu, G. L., Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 2011, 333, 1303-7.
18. Pfaffeneder, T.; Hackner, B.; Truss, M.; Munzel, M.; Muller, M.; Deiml, C. A.; Hagemeier, C.; Carell, T., The Discovery of 5-Formylcytosine in Embryonic Stem Cell DNA. Angew. Chem. Int. Ed. Engl. 2011, 50, 7008-7012.
19. Wion, D.; Casadesus, J., N6-methyl-adenine: an epigenetic signal for DNA-protein interactions. Nat. Rev. Microbiol. 2006, 4, 183-92.
20. Lobner-Olesen, A.; Skovgaard, O.; Marinus, M. G., Dam methylation: coordinating cellular processes. Curr. Opin. Microbiol. 2005, 8, 154-160.
21. Heithoff, D. M.; Sinsheimer, R. L.; Low, D. A.; Mahan, M. J., An essential role for DNA adenine methylation in bacterial virulence. Science 1999, 284, 967-70.
22. Garcia-Del Portillo, F.; Pucciarelli, M. G.; Casadesus, J., DNA adenine methylase mutants of Salmonella typhimurium show defects in protein secretion, cell invasion, and M cell cytotoxicity. Proc. Natl. Acad. Sci. USA 1999, 96, 11578-11583.
23. Hernday, A.; Braaten, B.; Low, D., The intricate workings of a bacterial epigenetic switch. Adv. Exp. Med. Biol. 2004, 547, 83-9.
24. Banas, J. A.; Biswas, S.; Zhu, M., Effects of DNA Methylation on Expression of Virulence Genes in Streptococcus mutans. Appl. Environ. Microb. 2011, 77, 7236-7242.
25. Fu, Y.; Luo, G. Z.; Chen, K.; Deng, X.; Yu, M.; Han, D.; Hao, Z.; Liu, J.; Lu, X.; Dore, L. C.; Weng, X.; Ji, Q.; Mets, L.; He, C., N(6)-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas. Cell 2015, 161, 879-92.
26. Zhang, G.; Huang, H.; Liu, D.; Cheng, Y.; Liu, X.; Zhang, W.; Yin, R.; Zhang, D.; Zhang, P.; Liu, J.; Li, C.; Liu, B.; Luo, Y.; Zhu, Y.; Zhang, N.; He, S.; He, C.; Wang, H.; Chen, D., N(6)-methyladenine DNA modification in Drosophila. Cell 2015, 161, 893-906.
27. Greer, E. L.; Blanco, M. A.; Gu, L.; Sendinc, E.; Liu, J.; Aristizabal-Corrales, D.; Hsu, C. H.; Aravind, L.; He, C.; Shi, Y., DNA Methylation on N(6)-Adenine in C. elegans. Cell 2015, 161, 868-78.

28. Wu, T. P.; Wang, T.; Seetin, M. G.; Lai, Y.; Zhu, S.; Lin, K.; Liu, Y.; Byrum, S. D.; Mackintosh, S. G.; Zhong, M.; Tackett, A.; Wang, G.; Hon, L. S.; Fang, G.; Swenberg, J. A.; Xiao, A. Z., DNA methylation on N(6)-adenine in mammalian embryonic stem cells. Nature 2016, 532, 329-33.
29. Liang, D.; Wang, H.; Song, W.; Xiong, X.; Zhang, X.; Hu, Z.; Guo, H.; Yang, Z.; Zhai, S.; Zhang, L. H.; Ye, M.; Du, Q., The decreased N6-methyladenine DNA modification in cancer cells. Biochem. Biophys. Res. Commun. 2016, 480, 120-125.
30. Zhou, C.; Liu, Y.; Li, X.; Zou, J.; Zou, S., DNA N6-methyladenine demethylase ALKBH1 enhances osteogenic differentiation of human MSCs. Bone Res. 2016, 4, 16033.
31. Liu, J.; Zhu, Y.; Luo, G. Z.; Wang, X.; Yue, Y.; Wang, X.; Zong, X.; Chen, K.; Yin, H.; Fu, Y.; Han, D.; Wang, Y.; Chen, D.; He, C., Abundant DNA 6mA methylation during early embryogenesis of zebrafish and pig. Nat. Commun. 2016, 7, 13052.
32. Schiffers, S.; Ebert, C.; Rahimoff, R.; Kosmatchev, O.; Steinbacher, J.; Bohne, A. V.; Spada, F.; Michalakis, S.; Nickelsen, J.; Muller, M.; Carell, T., Quantitative LC-MS Provides No Evidence for m6dA or m4dC in the Genome of Mouse Embryonic Stem Cells and Tissues. Angew. Chem. Int. Ed. Engl. 2017, doi: 10.1002/anie.201700424.
33. Luo, G. Z.; Blanco, M. A.; Greer, E. L.; He, C.; Shi, Y., DNA N(6)-methyladenine: a new epigenetic mark in eukaryotes? Nat. Rev. Mol. Cell Biol. 2015, 16, 705-10.
34. Mak, A. N. S.; Bradley, P.; Cernadas, R. A.; Bogdanove, A. J.; Stoddard, B. L., The Crystal Structure of TAL Effector PthXo1 Bound to Its DNA Target. Science 2012, 335, 716-719.
35. Hernday, A.; Krabbe, M.; Braaten, B.; Low, D., Self-perpetuating epigenetic pili switches in bacteria. Proc. Natl. Acad. Sci. USA 2002, 99 Suppl 4, 16470-6.
36. Hernday, A. D.; Braaten, B. A.; Low, D. A., The mechanism by which DNA adenine methylase and PapI activate the pap epigenetic switch. Mol. Cell 2003, 12, 947-57.
37. van der Woude, M.; Braaten, B.; Low, D., Epigenetic phase variation of the pap operon in Escherichia coli. Trends Microbiol. 1996, 4, 5-9.
38. Miller, J. C.; Tan, S. Y.; Qiao, G. J.; Barlow, K. A.; Wang, J. B.; Xia, D. F.; Meng, X. D.; Paschon, D. E.; Leung, E.; Hinkley, S. J.; Dulay, G. P.; Hua, K. L.; Ankoudinova, I.; Cost, G. J.; Urnov, F. D.; Zhang, H. S.; Holmes, M. C.; Zhang, L.; Gregory, P. D.; Rebar, E. J., A TALE nuclease architecture for efficient genome editing. Nat. Biotechnol. 2011, 29, 143-149.
39. Cermak, T.; Doyle, E. L.; Christian, M.; Wang, L.; Zhang, Y.; Schmidt, C.; Baller, J. A.; Somia, N. V.; Bogdanove, A. J.; Voytas, D. F., Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting. Nucleic Acids Res. 2011, 39, e82.
40. Cong, L.; Zhou, R.; Kuo, Y. C.; Cunniff, M.; Zhang, F., Comprehensive interrogation of natural TALE DNA-binding modules and transcriptional repressor domains. Nat. Commun. 2012, 3, 968.
41. Deng, D.; Yan, C.; Wu, J.; Pan, X.; Yan, N., Revisiting the TALE repeat. Protein Cell 2014, 5, 297-306.
42. Deng, D.; Yan, C.; Pan, X.; Mahfouz, M.; Wang, J.; Zhu, J. K.; Shi, Y.; Yan, N., Structural basis for sequence-specific recognition of DNA by TAL effectors. Science 2012, 335, 720-3.
43. Taylor, R.; Kennard, O., Crystallographic Evidence for the Existence of C-H...O, C-H...N, and C-H...Cl Hydrogen-Bonds. J. Am. Chem. Soc. 1982, 104, 5063-5070.
44. Rathi, P.; Maurer, S.; Kubik, G.; Summerer, D., Isolation of Human Genomic DNA Sequences with Expanded Nucleobase Selectivity. J. Am. Chem. Soc. 2016, 138, 9910-8.
45. Grunwald, A.; Dahan, M.; Giesbertz, A.; Nilsson, A.; Nyberg, L. K.; Weinhold, E.; Ambjornsson, T.; Westerlund, F.; Ebenstein, Y., Bacteriophage strain typing by rapid single molecule analysis. Nucleic Acids Res. 2015, 43, e117.
46. Gilboa, T.; Torfstein, C.; Juhasz, M.; Grunwald, A.; Ebenstein, Y.; Weinhold, E.; Meller, A., Single-Molecule DNA Methylation Quantification Using Electro-optical Sensing in Solid-State Nanopores. ACS Nano 2016, 10, 8861-8870.
47. Juillerat, A.; Pessereau, C.; Dubois, G.; Guyot, V.; Marechal, A.; Valton, J.; Daboussi, F.; Poirot, L.; Duclert, A.; Duchateau, P., Optimized tuning of TALEN specificity using non-conventional RVDs. Sci. Rep. 2015, 5, 8150.
48. Dalhoff, C.; Lukinavicius, G.; Klimasauskas, S.; Weinhold, E., Direct transfer of extended groups from synthetic cofactors by DNA methyltransferases. Nat. Chem. Biol. 2006, 2, 31-32.
49. Lukinavicius, G.; Lapiene, V.; Stasevskij, Z.; Dalhoff, C.; Weinhold, E.; Klimasauskas, S., Targeted labeling of DNA by methyltransferase-directed transfer of activated groups (mTAG). J. Am. Chem. Soc. 2007, 129, 2758-2759.


50. Hanz, G. M.; Jung, B.; Giesbertz, A.; Juhasz, M.; Weinhold, E., Sequence-specific Labeling of Nucleic Acids and Proteins with Methyltransferases and Cofactor Analogues. J. Vis. Exp. 2014, 22, e52014.
51. Lukinavicius, G.; Tomkuviene, M.; Masevicius, V.; Klimasauskas, S., Enhanced Chemical Stability of AdoMet Analogues for Improved Methyltransferase-Directed Labeling of DNA. ACS Chem. Biol. 2013, 8, 1134-1139.

Table of Contents Artwork
