
This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

Perspective Cite This: Biochemistry 2019, 58, 450−467

pubs.acs.org/biochemistry

Insights into the Biochemistry, Evolution, and Biotechnological Applications of the Ten-Eleven Translocation (TET) Enzymes
Mackenzie J. Parker, Peter R. Weigele, and Lana Saleh*


Research Department, New England Biolabs, Inc., 240 County Road, Ipswich, Massachusetts 01938, United States

ABSTRACT: A tight link exists between patterns of DNA methylation at carbon 5 of cytosine and differential gene expression in mammalian tissues. Indeed, aberrant DNA methylation results in various human diseases, including neurologic and immune disorders, and contributes to the initiation and progression of various cancers. Proper DNA methylation depends on the fidelity and control of the underlying mechanisms that write, maintain, and erase these epigenetic marks. In this Perspective, we address one of the key players in active demethylation: the ten-eleven translocation enzymes or TETs. These enzymes belong to the Fe2+/α-ketoglutarate-dependent dioxygenase superfamily and iteratively oxidize 5-methylcytosine (5mC) in DNA to produce 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxycytosine. The latter three bases may convey additional layers of epigenetic information in addition to being intermediates in active demethylation. Despite the intense interest in understanding the physiological roles TETs play in active demethylation and cell regulation, less has been done, in comparison, to illuminate details of the chemistry and factors involved in regulating the three-step oxidation mechanism. Herein, we focus on what is known about the biochemical features of TETs and explore questions whose answers will lead to a more detailed understanding of the in vivo modus operandi of these enzymes. We also summarize the membership and evolutionary history of the TET/JBP family and highlight the prokaryotic homologues as a reservoir of potentially diverse functionalities awaiting discovery. Finally, we spotlight sequencing methods that utilize TETs for mapping 5mC and its oxidation products in genomic DNA and comment on possible improvements in these approaches.

In mammals, 5-methylcytosine (5mC) is found in approximately 1.5% of genomic DNA (gDNA).1 5mC plays a role in several epigenetic processes, such as silencing of repetitive elements, genomic imprinting, X-chromosome inactivation, and regulation of gene expression during development and cellular specialization.2,3 Tissue- and cell-specific DNA methylation patterns are established early during embryogenesis and during primordial germ cell (PGC) maturation by two related DNA-(C5-cytosine)-methyltransferases (C5-cytosine-MT), DNMT3A and DNMT3B (Figure 1),4−6 with the help of the stimulatory factor DNMT3L.7−9 During replication, another C5-cytosine-MT, DNMT1, is guided to hemimethylated sites on nascent coding strands by E3 ubiquitin-protein ligase (UHRF1) to restore symmetrical methylation (Figure 1).10,11 Guidance of DNMT1 by UHRF1 is dependent on the ability of the latter to cooperatively bind both hemimethylated DNA and methylated histone H3K9.12−14 The reverse process, erasure of methylation, is known to occur by both passive and active mechanisms. Both mechanisms contribute to the demethylation events in preimplanted embryos preceding re-establishment of methylation patterns by DNMT3A and DNMT3B, whereas only active demethylation is thought to occur during PGC maturation.15−20 At these two developmental junctures, the entire genome undergoes global demethylation in a process that is essential for cellular-lineage determination and resetting of the life cycle. Downregulation of UHRF1 during PGC arrest in G2 prior to epigenetic remodeling21 is suggested to result in inefficient recruitment of DNMT1 to DNA during repeated rounds of replication, thus leading to passive demethylation (Figure 1).17,19,22,23 In contrast, active demethylation is replication-independent and relies on enzymatic activity. A major player in active demethylation pathways is the ten-eleven translocation (TET) dioxygenase, which catalyzes three iterative Fe2+- and α-ketoglutarate (aKG)-dependent oxidations of 5mC to yield 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxycytosine (5caC) (Figure 2A).24−27 This enzyme and thymine-DNA glycosylase (TDG) are currently thought to compose the main pathway for active demethylation in mammals (Figure 1). The latter excises 5fC and 5caC from DNA far more efficiently than its long-known substrate, thymine (T) in T:G mismatches,28 producing an abasic site that is then replaced with cytosine (C) via the base excision repair (BER) pathway.24,28 Consistent with this model, disruption of mouse tdg results in increased levels of DNA methylation at certain genomic loci,29,30 while its overexpression in human embryonic kidney (HEK) 293 cells

© 2018 American Chemical Society

Received: November 13, 2018 Revised: December 18, 2018 Published: December 20, 2018

DOI: 10.1021/acs.biochem.8b01185 Biochemistry 2019, 58, 450−467


results in lower levels of 5fC and 5caC with little or no effect on 5mC and 5hmC.31 The TET-TDG-BER-mediated demethylation pathway (Figure 1) does not account for all events of active demethylation that occur in the cell. Other enzymes involved in active demethylation include the activation-induced cytidine deaminase (AID), which has a critical function in epigenetic reprogramming in mouse PGCs, and its deficiency interferes with genome-wide erasure of DNA methylation patterns.32 The growth arrest DNA-damage-inducible protein 45a (Gadd45a) has also been implicated in DNA demethylation by stimulating nucleotide excision repair in Xenopus laevis embryos and mammalian cells,33−36 but these findings are challenged by results reporting neither global- nor locus-specific methylation increases in Gadd45a−/− mice.37 TET enzymes (TETs) have also been suggested to affect passive demethylation at certain sites by interfering with 5mC maintenance via oxidation of hemimethylated intermediates to hemihydroxymethylated forms, thus interfering with the activity of DNMT1 (DNMT1 was suggested to have low activity on these sites).38−40 Recent data, however, conflicted with these observations and showed that DNMT3A and DNMT3B exhibit activity toward hemihydroxymethylated sites.38,40 In addition, UHRF1 has been shown to bind 5hmC and target DNMT1 to these sites.41−43 It was further suggested that UHRF1, DNMT1, and TET may function together as a complex to maintain 5hmC during DNA replication.44 Several excellent reviews discussing the seminal advancements over the past decade in the biology of decoding active demethylation and the role of TET in this process are available.45,46 However, key questions regarding the chemical mechanism of TET, factors that control its iterative oxidation, how it targets specific genomic sites with the goal of active demethylation versus depositing epigenetic marks for gene regulation, and how it differentiates 5mC from T, have not been very well addressed. Furthermore, experimental efforts to understand the functional diversity of the TET family and to elucidate possible sequence−structure−function relationships by biochemically characterizing various TETs from different

Figure 1. Main pathways for DNA methylation and demethylation. Red lines indicate modified DNA strands and black lines nascently synthesized strands. DNMT3A/3B catalyze de novo methylation, while the UHRF1/DNMT1 complex maintains methylation after replication. Blue arrows indicate passive demethylation, which results in dilution of 5mC or its oxidized forms during replication. Gold arrows indicate TET-TDG-BER-mediated active demethylation.
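The replication-dependent dilution of 5mC that defines passive demethylation (Figure 1) can be illustrated with a short calculation. The following is a minimal sketch, not a model taken from the cited studies: it assumes a single symmetric CpG site, that the parental strand always keeps its mark, and that maintenance methylation copies the mark onto the new strand with a hypothetical probability `maintenance_efficiency`. Under those assumptions the expected methylated fraction is multiplied by (1 + efficiency)/2 at every round of replication.

```python
def methylated_fraction(rounds, maintenance_efficiency=0.0):
    """Expected fraction of strands still carrying 5mC at one CpG site.

    Assumption (hypothetical, for illustration only): at each replication the
    parental strand keeps its mark and the new strand receives it with
    probability `maintenance_efficiency`, so the mean fraction is multiplied
    by (1 + efficiency) / 2 per round.
    """
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    per_round = (1.0 + maintenance_efficiency) / 2.0
    return per_round ** rounds


if __name__ == "__main__":
    # Compare no maintenance (pure dilution) with 90% efficient maintenance.
    for n in range(5):
        print(n, methylated_fraction(n), methylated_fraction(n, 0.9))
```

With no maintenance the mark halves at every division, whereas fully efficient maintenance preserves it indefinitely; intermediate efficiencies give a slower geometric decay.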

Figure 2. Reactions catalyzed by members of the TET/JBP family. (A) Iterative 5mC oxidation catalyzed by mammalian TET1/2/3, Naegleria gruberi TET1 (NgTET1) (major activity), and Coprinopsis cinerea TET (CcTET). (B) JBP1/2 catalyze the oxidation of T to 5hmU in the first step of base J biosynthesis. (C) Iterative T oxidation catalyzed by NgTET1 (minor activity). aKG = α-ketoglutarate; O2 = molecular oxygen; Suc = succinate; CO2 = carbon dioxide; UDP-Glu = uridine diphosphoglucose.


Figure 3. (A) Domain architectures of the hTET paralogues and NgTET1. (B) Crystal structure of hTET2 truncated CD in complex with 5mC-containing DNA (PDB entry 4NM6).67 The layers of the DSBH are colored as follows: α-helical layer = green, major β-sheet = dark blue, minor β-sheet = magenta. Random coil regions between elements of the DSBH are colored light blue, and the low complexity insert is colored orange. All metals are indicated. The left inset is a blow-up of the Zn3 binding site, and the right inset is a blow-up of the active site showing ligands to Fe2+ and NOG (w = water molecule). (C) Crystal structure of NgTET1 in complex with 5mC-containing DNA (PDB entry 4LT5).59 The layers of the DSBH and random coil regions outside of these layers are colored as in (B). The inset is a blow-up of the active site showing ligands to Mn2+ and NOG (w = water molecule). In both structures, only the base of the targeted 5mC is shown for clarity. Metal−ligand interactions and hydrogen bonds are shown as dashed lines. Atoms of residue side chains and DNA are shown in stick representation and are colored according to heteroatom: red = O, blue = N, orange = P, yellow = S, rust = Fe, purple = Mn, gray = Zn.

organisms would almost certainly lead to improved function-prediction methods and unexplored novel functionalities. In this Perspective, we highlight literature that addresses aspects of these outstanding issues and offer our outlook on questions that need attention from biochemists in the community.

DISCOVERY OF THE BIOCHEMICAL ACTIVITY OF TETS

TETs were originally named after a chromosomal translocation identified in patients with acute myeloid or lymphocytic leukemia in the early 2000s that fuses the mixed-lineage leukemia 1 gene located on chromosome 10 with the tet1 gene on chromosome 11.47,48 However, the function of TETs remained unknown until 2009, when emerging bioinformatic clues implicated these enzymes as potential 5-methylpyrimidine dioxygenases. Iterative PSI-BLAST searches seeded initially with the predicted oxygenase domains of the Trypanosoma brucei base J-binding proteins JBP1 and JBP2 as queries revealed homologous regions within three human proteins, TET1, TET2, and TET3, and their orthologues in the genomes of other metazoans (i.e., animals that undergo development from an embryo).27,49 JBP1 and JBP2 were previously shown to oxidize T to form 5-hydroxymethyluracil (5hmU), an intermediate in the biosynthesis of base J (5-(β-D-glucosyl)methyluracil) (Figure 2B),50,51 which acts as an RNA polymerase II termination factor in Leishmania and perhaps T. brucei.52 At the time, oxidized T derivatives had not been observed in mammalian genomes outside of the context of DNA damage, and therefore, TETs seemed likely candidates as catalysts of 5mC oxidation instead. The first evidence for the 5mC dioxygenase activity of mammalian TETs was presented by Tahiliani et al. in 2009.27


DOMAIN ARCHITECTURE AND FUNCTIONAL DIVERGENCE OF METAZOAN TETS

Metazoan TETs are large, multidomain proteins with architectures resembling that of DNMT1 and other chromatin-binding proteins (Figure 3A). The N-terminal end typically contains a CXXC DNA-binding domain and nuclear localization signal sequences, whereas the C-terminal end houses the CD,27 which adopts a double-stranded β-helix (DSBH) fold characteristic of members of the Fe2+/aKG-dependent dioxygenase superfamily (Figure 3B).67 The CD of metazoan TETs is notably interrupted by a large low-complexity insert of unknown function that splits the core DSBH in two (Figure 3A).27 Moreover, relative to lower eukaryotic TETs like NgTET1, the CD of metazoan enzymes is elongated at the N-terminal end to include a cysteine-rich region27 (Figure 3A) that binds three Zn2+ ions to help stabilize the enzyme fold and allow it to properly interact with the DNA substrate (Figure 3B).67 Within the gnathostome (jawed) vertebrate clade of metazoans, either a gene triplication event or two independent duplications occurred during the course of their evolution that resulted in the existence of three TET paralogues (TET1, TET2, and TET3) in these organisms.68 Different expression patterns have arisen among the paralogues, signaling distinct biological functions for each in developmental processes and the routine maintenance of methylation patterns in various tissues. For example, TET1 and TET2 are highly expressed in mESCs during the blastocyst stage, while TET3 is highly enriched in mouse oocytes and early zygotes.69 Remarkably, TET3 seems to have experienced different evolutionary pressures than TET1 and TET2 since it has a lower number of divergent amino acid substitutions in comparison.70 Furthermore, tet1 and tet2 exhibit more frequent codon diversification in coding regions outside of the CXXC domain and the DSBH and Cys-rich components of the CD. 
However, strong selective constraints are observed within the CXXC and CDs of all three paralogues, emphasizing their functional importance to the enzyme.70 The functional divergence of gnathostome TET paralogues can also be inferred from their unique motifs and/or domains. TET1s have recently been shown to have a “before CXXC” or BC domain that is involved in chromatin binding (Figure 3A).71 TET2s have three unique motifs: an approximately 380-amino acid long glutamine-rich region (20% Gln content) upstream of the CD; a moderately conserved short Gln-rich region within the low-complexity insert; and an N-terminal proline-rich region containing short poly-Pro repeats that are about 20 amino acids in length (Figure 3A).72 The roles of these three motifs in TET2’s function are unknown and require further study. A chromosomal inversion that occurred during evolution also resulted in severing of the CXXC domain of TET2s to form an adjacent gene called Idax (Figure 3A).73 TET3s contain two previously unidentified sequence motifs of unknown function termed Element 1 and Element 2 (Figure 3A). Element 1 is situated downstream of the CXXC domain and is highly conserved in all mammalian TET3 proteins, whereas Element 2 is located within the DSBH region of the CD.72 The diverse combination of unique motifs in the various TET paralogues likely dictates their function in the cell. Further functional diversity of gnathostome TET paralogues can be generated through the expression of alternative splicing isoforms. Full-length TET1 is detected only during early

In vivo immunostaining experiments in HEK 293 cells overexpressing human TET1 (hTET1) demonstrated a positive correlation between the enzyme’s presence and reduced 5mC levels in gDNA. The disappearance of 5mC also coincided with the formation of a new base, which was identified as 5hmC by thin-layer chromatography (TLC) and mass spectrometry. TET1’s involvement in these correlations was confirmed by purifying recombinant hTET1 catalytic domain (CD) from Sf9 insect cells and showing by TLC that it converted 5mC in synthetic oligonucleotides to 5hmC with an absolute dependence on Fe2+ and aKG. The discovery of TET’s activity coincided with emerging reports that 5hmC is actually a stable component of gDNA in a variety of vertebrate cell types,26,27 leading to speculations about the roles these enzymes play in active demethylation and/or generating new layers of epigenetic control. A more definitive role for TETs in active demethylation was established two years later, when two groups independently demonstrated, using a variety of analytical techniques, that the three mouse TET (mTET) paralogues could catalyze further oxidation of 5hmC in oligonucleotides and gDNA to produce 5fC and 5caC in an Fe2+- and aKG-dependent manner.24,25 The results of these studies rationalized the observation of 5fC in mouse embryonic stem cell (mESC) gDNA in an earlier report by Pfaffeneder et al.53 The discovery that TETs could produce 5caC triggered relatively fruitless searches for a decarboxylase similar to the thymine salvage enzyme isoorotate decarboxylase54−56 that would catalyze decarboxylation of 5caC and re-establish C on genomic sites without resorting to the resource intensive TDG-BER pathway. One study found that treatment of [1,3-15N]5caC-labeled DNA with mESC extract resulted in product containing isotopically labeled C, suggesting that such a decarboxylase may exist.57 However, the nature of the enzyme responsible for this activity awaits identification. 
Another study reported that mammalian DNMTs catalyze decarboxylation of 5caC to yield C in vitro.58 Whether this occurs in vivo has yet to be demonstrated. Subsequent to the discovery of mammalian TETs, the activities of homologues from other eukaryotes such as Naegleria gruberi (NgTET1),59,60 Coprinopsis cinerea (CcTET),61 Apis mellifera (AmTET),62 and Drosophila melanogaster (droTET)63,64 were investigated. Like their mammalian counterparts, NgTET1 and CcTET iteratively oxidize 5mC to produce 5hmC, 5fC, and 5caC (Figure 2A).59−61 Interestingly, NgTET1 also harbors minor iterative T-oxygenase activity in vitro, producing 5hmU, 5-formyluracil (5fU), and 5-carboxyuracil (5caU) (Figure 2C).60 Evidence suggesting that mTET can also oxidize T to form 5hmU in ESCs has been reported,65 but this activity could not be reproduced in vitro.60 In contrast, AmTET has been reported to produce only 5hmC,62 while conflicting reports have suggested that droTET is either a DNA-specific N6-methyl-2′-deoxyadenosine (6mA) demethylase64 or an RNA-specific 5mC dioxygenase.63 These results are surprising considering that both AmTET and droTET exhibit absolute conservation of active-site residues27,66 that are involved in hydrogen-bonding or stacking interactions with the pyrimidine ring of 5mC in mammalian TETs (discussed in more detail in later sections). Structural studies of AmTET and droTET will be invaluable at respectively illuminating the main elements that dictate a single-step oxidation outcome or an alternative substrate specificity.


results showed that intermediate products, specifically 5hmC, accumulate to concentrations greater than the input enzyme concentration (see Figure 2C in ref 88), confirming that TETs indeed release the oxidized products after each turnover. Our proof for the distributive physical and chemical behaviors of TETs explains in part how these enzymes allow different oxidized forms of 5mC to stably accumulate at different locales in the genome. This process could also be influenced by the enzyme’s localized concentration in accessible regions of chromatin. At sites where active demethylation is expected to occur, TETs are speculated to be present in large amounts to drive the removal of errant methylation. Conversely, sites with relatively stable 5hmC marks are speculated to have low local enzyme concentrations.86 If this hypothesis is accurate, then TET’s distributive mode of action coupled with the local enzyme concentration dictates the occurrence of iterative versus single oxidation events. TETs may be recruited to or excluded from specific genomic sites by interacting with other chromatin-associated proteins. For example, TET1 initially targets 5mC sites in gene enhancer regions of mESCs to produce 5hmC. TET1 then interacts with the protein SALL4A and subsequently recruits TET2 to complete the iterative oxidation of 5hmC to 5fC and 5caC.89 Furthermore, the N-terminal CXXC domain could play a role in localizing TETs to specific genomic sites. Typically, CXXC domains are thought to anchor chromatin-modifying enzymes to DNA by binding to non-methylated CpG islands. 
However, a study of the CXXC domain of mTET3 revealed that it had a higher affinity for 5caCpG sites relative to non-methylated sites, and it was postulated that anchoring TET3 to 5caCpG keeps the enzyme localized in areas of the genome where C methylation is undesirable, such as at transcriptional start sites.75 Although there is no current evidence to support this intriguing proposal, it raises questions of whether there could be other variants of the CXXC domain that preferentially interact with other 5xC (x = m, hm, f, ca) derivatives and thus help regulate TET activity through chromatin localization. An intriguing observation regarding the ability of mammalian TETs, as well as CcTET, to generate 5fC and 5caC is the dependence of these oxidation reactions on the presence of ascorbate and ATP.24,61,90−92 The mechanism by which ascorbate enhances the three-step oxidation reaction is not quite understood. One study suggested that activation of TET by ascorbate results from a direct interaction of this small molecule with its C-terminus, as evidenced by intrinsic fluorescence changes of TET CD with increasing ascorbate concentration.92 However, the experiments in that study were performed under equilibrium binding conditions for iron, introducing the possibility that changes in the protein’s fluorescence were merely due to conformational events occurring from iron in the bound and unbound states. Therefore, this conclusion needs further experimental support. 
We believe that the most probable explanations for the enhancing effect of ascorbate on TET activity are (1) it keeps free Fe in a reduced and therefore kinetically labile state for binding to the enzyme and (2) it reactivates the Fe3+−OH form of the enzyme in cases where the main substrate (R−CH3 in Figure 4) is absent or not properly positioned at the active site, similar to what is seen with prolyl-4-hydroxylase.93,94 In contrast to ascorbate, there have been no reported studies on how ATP stimulates 5fC and 5caC production with mammalian TETs and CcTET. It is tempting to speculate

embryonic development, whereas adult somatic tissues weakly express an isoform, called TET1s, that is missing the CXXC domain. The presence versus absence of the CXXC domain on TET1 is predicted to control epigenetic memory erasure.71 A non-enzymatic isoform of TET2 that lacks the CD is implicated in mast cell proliferation in humans, while the full-length protein is reported to be less efficient at this function.74 Lastly, full-length TET3 and TET3s, also lacking the CXXC domain, are important for neuronal differentiation, while another isoform called TET3o, which differs from TET3s by only one exon, is involved in oocyte generation and fertilization.75



THE MODUS OPERANDI OF MAMMALIAN TETS

The global demethylation events that occur in preimplanted embryos and PGCs are severely attenuated in tet1/2/3 knockouts, clearly demonstrating the role of mammalian TETs in active demethylation.76 In somatic tissues, however, isotope-labeling experiments showed that the vast majority of 5hmC, 5fC, and 5caC detected in mammalian gDNA exist for prolonged periods as stable marks77,78 and that their levels in various tissues do not correlate with those of their precursors.79,80 Furthermore, many reader proteins that can recognize and interpret 5hmC, 5fC, and 5caC marks as epigenetic information have been identified, including DNA glycosylases, repair proteins, transcription factors, and chromatin regulators.42,43,81−83 These observations thus show that mammalian TETs, in addition to erasing methylation marks, deposit 5hmC, 5fC, and possibly 5caC on the genome to add new layers of epigenetic control in transcriptional regulation, cellular development, and lineage specification (Figure 1). The opposing nature of TETs’ dual functionality requires cells to implement strict regulatory measures to control the enzymes’ activity in generating oxidized derivatives of 5mC. A few of these measures have been uncovered during the past decade, including chemical and structural properties inherent to TETs (discussed in the next section), putative allosteric effects on their activity by small molecules,84 their recruitment/exclusion by interacting proteins,85 and their localization in accessible regions of the chromatin.86 However, many aspects of these regulatory mechanisms are still poorly understood. 
There are two distinct physical modes by which TETs can locate 5mC sites in DNA: (i) a distributive mode, in which they differentiate target versus nontarget sites by either random three-dimensional diffusion or a combined one-dimensional sliding and three-dimensional hopping mechanism, or (ii) a processive mode, in which they slide along DNA between methylated sites. Once a site is found, TETs can utilize one of two chemical modes in oxidizing the substrate: (i) iterative oxidation, where they oxidize 5mC to 5caC without releasing the substrate, or (ii) distributive oxidation, where they release oxidized product from the active site after a single turnover. We have unequivocally shown in vitro that the physical and chemical modus operandi of NgTET1 and the CDs of mTET1 and mTET2 are distributive, meaning that these enzymes fully dissociate from the product and DNA after a single-oxidation step.87 This behavior was revealed by examining the distribution of products formed at specific times during the course of the reaction, which was found to be dependent on the corresponding substrate abundance and the kinetics of a particular oxidation step.87 A contrary conclusion was argued by Crawford et al.;88 however, careful investigation of their


Figure 4. Consensus hydroxylation mechanism for Fe2+/aKG-dependent dioxygenases.95−97
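The distributive chemical mode discussed in the modus operandi section, in which product is released after every turnover, implies that each oxidized intermediate builds up and decays according to the relative rates of the steps that make and consume it. The following is a minimal sketch under stated assumptions, not the kinetic analysis used in the cited work: each step is treated as irreversible and pseudo-first-order, the rate constants are hypothetical placeholders rather than measured values, and simple Euler integration is used to keep the example dependency-free.

```python
def simulate_distributive_oxidation(k=(1.0, 0.3, 0.1), t_end=50.0, dt=0.01):
    """Euler integration of 5mC -> 5hmC -> 5fC -> 5caC as first-order steps.

    The rate constants in `k` are hypothetical placeholders, not measured
    values. Returns a list of (time, [5mC, 5hmC, 5fC, 5caC]) tuples in which
    the four fractions always sum to 1.
    """
    state = [1.0, 0.0, 0.0, 0.0]  # start with 5mC only
    trajectory = [(0.0, state[:])]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        # Flux through each oxidation step at the current composition.
        flux = [k[j] * state[j] for j in range(3)]
        state[0] -= flux[0] * dt
        state[1] += (flux[0] - flux[1]) * dt
        state[2] += (flux[1] - flux[2]) * dt
        state[3] += flux[2] * dt
        trajectory.append((i * dt, state[:]))
    return trajectory


def peak(trajectory, species_index):
    """Return (time, fraction) at which one species is most abundant."""
    return max(((t, s[species_index]) for t, s in trajectory),
               key=lambda point: point[1])


if __name__ == "__main__":
    traj = simulate_distributive_oxidation()
    print("5hmC peak (time, fraction):", peak(traj, 1))
```

In this toy model a slower second step lets 5hmC accumulate transiently to a large fraction of the total substrate, which is the qualitative signature of released intermediates; a strictly iterative mechanism, in which the substrate is held until 5caC is formed, would not show such free intermediates.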

that hydrolysis or allosteric binding of the nucleotide triggers structural rearrangements in the enzyme’s active site that make 5hmC and 5fC better substrates.

STRUCTURAL AND MUTATIONAL STUDIES OF TETS

Crystal structures of two TETs, NgTET1 and hTET2 truncated CD (hTET2-TCD), in complex with double-stranded DNA oligonucleotides containing an internal fully (x = m, hm) or hemimodified (x = f) 5xCpG site have provided insights into their substrate specificity and catalytic mechanism.59,67,98,99 NgTET1, which consists of a minimally decorated CD and short, unstructured N-terminal extension, adopts a three-layered jelly roll fold consisting of a distorted DSBH (eight-stranded major sheet and four-stranded minor sheet) and an α-helical layer that packs against the outer surface of the major sheet (Figure 3C).100 The open end of the DSBH, enlarged by the unequal number of strands in the major and minor sheets, serves as the active site and entrance to the Fe2+ and aKG binding sites located deeper within the β-helical core. In comparison, the CD of hTET2, although similar, is expanded through eight insertions, including the large low-complexity insert noted in the previous section, and one deletion. To facilitate crystallization of the hTET2 CD, the low-complexity insert, which is ∼300 residues long and predicted to be unstructured, was replaced with a 15-residue GS linker to yield hTET2-TCD (Figure 3B).67 Although largely disordered, the location of the GS linker in the hTET2-TCD structures is quite intriguing and suggests that the low-complexity insert may play a role in DNA binding and/or regulation of the enzyme. The linker appears to reach into the major groove of the DNA substrate on the side of the duplex opposite the active site, suggesting that the low-complexity insert may do the same in the full-length enzyme. It has been noted previously that the insert bears homology to the C-terminal domain of Saccharomyces cerevisiae RNA polymerase II.101 Mammalian RNA polymerase II bears a similar C-terminal domain, and studies have suggested that post-translational modifications (phosphorylation, Arg methylation, sumoylation) in this domain help control the enzyme’s activity.49,102,103 It is unknown whether similar modifications can occur in the C-terminal domain of S. cerevisiae RNA polymerase II and, by extension, the low-complexity insert of mammalian TETs, but such similarities suggest that this element at the very least plays a role in regulating metazoan TET activity. The Cys-rich region of metazoan TETs is often called a domain in the literature. This description is likely a misnomer given that in the hTET2-TCD structure, the Cys-rich region does not form a distinct domain but instead wraps around the DSBH core and introduces three Cys3His Zn2+ binding sites (Zn1−Zn3) (Figure 3B).67,99 The Zn2 and Zn3 sites (Figure 3B inset) have ligands from both the Cys-rich region and DSBH and appear to be important in stabilizing two flexible loops that interact with the DNA substrate and, in the case of Zn2, another loop containing the His (H1382) and Asp (D1384) residues that coordinate to the active-site Fe2+. Zn1 is located distal to the active site, but its removal by mutational truncation of hTET2-TCD abolishes the enzymatic activity.67 Therefore, the zinc sites are thought to confer stability to hTET2-TCD’s tertiary structure, with Zn2 and Zn3 further playing a potential role in substrate binding and catalysis. Despite the structural differences, the active sites of NgTET1 and hTET2-TCD are nearly superimposable, indicating that their chemical mechanisms are highly similar, if not identical. The Fe2+-binding site has the HX(D/E)XnH


Figure 5. Structures of the active sites of hTET2-TCD and NgTET1 with different 5xC-containing oligonucleotides. Top row: hTET2-TCD structures with (A) 5mC (PDB entry 4NM6), (B) 5hmC (PDB entry 5DEU), and (C) 5fC (PDB entry 5D9Y).67,99 Bottom row: NgTET1 structures with (D) 5mC (PDB entry 4LT5) and (E) 5hmC (PDB entry 5CG8).59,98 Residues proposed to be important in substrate recognition and binding are shown, and hydrogen-bonding interactions are indicated with dashed lines. Only the substrate nucleotide inserted into the active site is shown, and the two His, one carboxylate facial triad ligating Fe2+/Mn2+ has been omitted for clarity. Atoms are shown in stick representation. Residue side chains are colored by heteroatom, whereas 5xC, aKG/NOG, metal ions, and water molecules are colored according to element: gray = C, red = O, blue = N, dark orange = P, rust = Fe, purple = Mn.

motif characteristic of most non-halogenating Fe2+/aKG-dependent dioxygenases that coordinates the metal ion in a two His, one carboxylate facial triad (Figure 3B,C). The cosubstrate aKG (or N-oxalylglycine (NOG), an unreactive analogue) is bound to Fe2+ in the “off-line” configuration,104 with the 1-carboxylate coordinated opposite the distal His residue (h, H1881; Ng, H279) and the 2-keto oxygen opposite Asp (h, D1384; Ng, D231) (Figure 3B,C). Co-substrate binding is further stabilized by hydrophobic, van der Waals, and hydrogen-bonding interactions with the protein, including two Arg residues that neutralize the 1- and 5-carboxylate groups (Figure 3B,C). The octahedral coordination geometry of Fe2+ is completed by a water molecule, which is proposed to be displaced by O2 in the TET·Fe2+·aKG·substrate complex to initiate turnover (Figure 4). These waters are positioned away from the C5 substituents of the targeted base (C−O distances = 4.3−4.8 Å), a situation that is similar to structures of other Fe2+/aKG-dependent dioxygenase·substrate complexes with aKG bound in the off-line mode.105−107 In the latter cases, it has been proposed that either aKG or the oxo ligand of Fe4+=O reorients during the course of the reaction mechanism to position the activated oxygen close to the substrate for H atom abstraction.104 Along these lines, a computational study of hTET2’s reaction mechanism suggested that the peroxysuccinate bridge (Figure 4) reorients concomitantly with aKG decarboxylation to position the latent oxo group closer to the substrate (C−O distances = 2.5−3.7 Å).108 As expected from studies of other Fe2+/aKG-dependent dioxygenases, ligands to

Fe2+ and aKG are completely conserved, and their mutation abolishes the 5xC oxidation activity.67 NgTET1 and hTET2-TCD interact with the minor groove of their DNA substrate primarily through hydrogen-bonding/ electrostatic interactions that occur between flexible loops surrounding the active site and the DNA’s phosphate backbone. For hTET2-TCD, the substrate-interacting loop buttressed by Zn2 also contributes a patch of mostly hydrophobic residues (1290−1296) that pack against the interior of the DNA duplex 3′ from the 5xC being oxidized. These hydrophobic residues are essential for activity since their mutation to Ala nearly eliminates 5mC oxidation but only mildly affects the KD of the enzyme−DNA interaction.67 The interactions induce significant distortions from B-form DNA by introducing kinks of 40° (hTET2-TCD) (Figure 3B) or 65° (NgTET1) (Figure 3C), causing the 5xC base targeted for oxidation to flip out of the duplex and insert into the active site. In NgTET1, the base-stacking interactions surrounding the orphaned guanine are maintained, and the protein hydrogen-bonds with N2 of this base via a serine (S148) on a hairpin loop that is inserted into the widened minor groove.59 In contrast, the orphaned guanine is pushed out of the duplex in hTET2-TCD by a Tyr (Y1294) and Met (M1293) occupying the space vacated by the flipped 5xC.67 Consistent with activity data,60,67 fully versus hemimodified DNA appears to be indistinguishable by both enzymes since no protein−base interactions are observed with the (un)modified C in the CpG of the opposite strand (Figure 3B,C). 456

DOI: 10.1021/acs.biochem.8b01185 Biochemistry 2019, 58, 450−467

Perspective

In vitro activity assays clearly demonstrate a preference of both TETs for CpG sites.60,67 In the case of NgTET1, this preference may be explained by the observed interaction of the protein with the guanine 3′ of the targeted 5xC via hydrogen bonds between a Gln (Q310) and the base’s N1 and N2 atoms. Although mutation of Q310 reduced NgTET1-mediated 5mC oxidation by 60%, the effects on the enzyme’s specificity for methylated non-CpG dinucleotides were not explored.59 In the case of hTET2-TCD, the enzyme appears to make no specific interactions with the adjacent G:C base pair other than a general stacking interaction with the inserted Y1294.67 This interaction was proposed to be the mechanism by which hTET2 distinguishes CpG from non-CpG sites,67 but the substrate specificity of a M1293A/Y1294A double mutant was not explored. Biochemical studies on TET variants with mutations in regions proximal to DNA may reveal interactions and/or factors not readily observed in the crystal structure that could narrow their specificity to the 5xCpG dinucleotide.

In both TETs, the targeted 5xC is stabilized by stacking interactions with an aromatic residue (h, Y1902; Ng, F295) and the guanidino group of the Arg that is hydrogen-bonded to C1 of aKG/NOG (h, R1261; Ng, R224) (Figure 5).59,67,98,99 Hydrogen bonds between polar residues and the Watson−Crick base-pairing face of the pyrimidine supply additional binding energy and might provide the determinants that select 5mC over T (discussed in more detail in a later section). Specifically, a His residue from both proteins (h, H1904; Ng, H297) donates a hydrogen bond to N3 of the base, and an Asn (h, N1387) or Asp (Ng, D234) accepts a hydrogen bond from the exocyclic N4 amine (Figure 5).
Mutation of either residue severely diminishes the ability of both TETs to oxidize 5xC.59,67 The proteins also exhibit distinct additional interactions with 5xC that may depend, at least in the case of hTET2-TCD, on the oxidation state of the base. In NgTET1, N147 donates a hydrogen bond to the exocyclic O2 of both 5mC and 5hmC (Figure 5D,E).59,98 This interaction is important for catalysis, as mutation of N147 to D reduces the 5mC oxidation activity by over 40%.59 In contrast, interactions between exocyclic O2 and hTET2-TCD appear to possibly depend on the oxidation state of the C5 substituent. A fourth His residue in the enzyme’s active site (H1386) adopts three different conformations with 5mC-, 5hmC-, and 5fC-containing substrates, and only with 5hmC does it appear capable of donating a hydrogen bond to the base’s O2 atom (N−O distance = 2.8 Å) (Figure 5B).67,99 Another potential oxidation-state dependent interaction in hTET2-TCD structures is a water-mediated hydrogen bond between T1393 and the exocyclic amino nitrogen (N4) of 5hmC and 5fC (Figure 5B,C).99 Mutational studies exploring the roles of H1386 and T1393 in oxidation of 5hmC and 5fC by hTET2-TCD are needed.

5XC SUBSTRATE PREFERENCES OF TETS

Although TETs can oxidize 5mC, 5hmC, and 5fC, the efficiencies of these reactions differ significantly. Oxidation of 5mC to 5hmC is about 3.5-fold faster than oxidation of 5hmC to 5fC, which in turn is about 1.5-fold faster than oxidation of 5fC to 5caC for mammalian TETs,25,67 NgTET1,60 and CcTET.61 Electrophoretic mobility-shift assays and fluorescence polarization and surface plasmon resonance measurements suggested that the substrate preference of hTET2 CD is not a result of different substrate-binding affinities.99 Furthermore, similar substrate preferences were reported regardless of sequence content (AT-rich vs CG-rich) and length of the DNA substrate.99 Closer examination of these data, however, revealed lower 5hmC and 5fC oxidation rates with CG-rich substrates, indicating a possible effect of unmethylated CpG sites on the rates of oxidation. Furthermore, a noticeable inverse correlation between substrate length and oxidation rate for all three bases is also seen in these data, which is interesting considering the distributive behavior of TET binding. This might imply that hTET2 CD follows a sliding and hopping mechanism in search of its substrate.

Examination of the NgTET1 (refs 59, 98) and hTET2-TCD (refs 67, 99) structures reveals no apparent direct contacts between the protein and the C5 substituent of the base inserted into the active site (Figure 5). This observation is consistent with the enzymes’ ability to accommodate and oxidize 5mC, 5hmC, and 5fC but raises the question of how TETs properly orient a C−H bond of the substituent toward the Fe center for H-atom abstraction during turnover (Figure 4). The absence of any steric or non-covalent bonding restraints allows for free rotation about the C5−C(substituent) bond, at least in the cases of 5mC and 5hmC. Unless 5fC is hydrated to form the gem-diol, rotation about the C5−C(formyl) bond will likely be restricted because of (1) its partial double-bond character resulting from conjugation with the pyrimidine ring’s π electrons and (2) an intramolecular hydrogen bond between the formyl group’s oxygen atom and the exocyclic N4 amine (Figure 5C). The three H atoms of methyl groups are chemically equivalent, and thus, C5−CH3 bond rotation in 5mC should minimally affect its oxidation, as a C−H bond will always be oriented toward the Fe center. In contrast, 5hmC and 5fC have respectively one and two fewer C−H bonds that can be activated by the enzyme. Factoring in the distinct C−H bond-dissociation energies of a methyl, hydroxymethyl, and formyl group, these differences may explain the observed substrate preferences of TETs for 5mC-containing substrates over those with 5hmC and 5fC.

An unexplored complexity that could influence the oxidation of 5hmC and 5fC is the fact that their C5 substituents contain polar functional groups that may hydrogen-bond with components of the metal center and therefore affect its chemistry. This issue is evident in the structures of hTET2-TCD and NgTET1 complexed with 5hmC-containing oligonucleotides. In the case of the former, the OH group of the targeted base is oriented such that it is within hydrogen-bonding distance of both the 1-carboxylate group of NOG (2.7−3.4 Å) and R1261 (3.3−3.4 Å) and thus could affect the binding and positioning of the co-substrate (Figure 5B).99 In the case of NgTET1, the orientation of the OH group allows it to hydrogen-bond with the water (3.3 Å) coordinated to the Fe2+ and thus possibly interfere with O2 binding and activation (this structure has been suggested to alternatively reflect the product state of the enzyme after 5mC oxidation). The OH group is also within hydrogen-bonding distance of the 1-carboxylate group of aKG (2.9 Å) but not R224 (4.4 Å) (Figure 5E).98 In silico analysis of the hTET2 reaction mechanism suggested altered reactivity of the metal center by showing a potential hydrogen bond between the OH group of 5hmC and the Fe3+−peroxy bridge that could affect reorientation of the metal ligands after the decarboxylation step.108 Spectroscopic and crystallographic studies focused on Fe-oxidation intermediate states should provide insight into the effects of polar interactions on the reactivity of TET’s Fe2+ center.

Figure 6. Gene maps of prokaryotic loci encoding TET/JBP homologues. Representative prokaryotic gene clusters containing a TET/JBP homologue are grouped and labeled according to their co-associations with other predicted DNA-modifying enzymes. TET/JBP co-associations may be predictive of their substrate choice with respect to 5mC vs T. For example, only those homologues associating with a predicted C5-cytosine-MT are anticipated to oxidize 5mC. Phage and bacterial names are indicated. Sequence names consisting of three alphanumeric strings uniquely identify metagenome sequence contigs obtained from the Joint Genome Institute’s (JGI) Integrated Microbial Genomes-Virus (IMG/VR) data set: the first number corresponds to the IMG Genome ID, the second corresponds to the Gold Analysis Project ID, and the third identifies the contig from within that project’s data set.114 Gene maps were generated using Geneious 10.2.6 (https://www.geneious.com).

A hydrophobic pocket identified in the active sites of both NgTET1 and hTET2 may also be key in dictating the substrate preferences of these enzymes. This pocket consists of the aromatic residue stacking with the inserted base (h, Y1902; Ng, F295), a valine (h, V1900; Ng, V293), and either A212 (Ng) or T1372 (h) (Figure 5).98,109 The high conservation of these residues among various TET homologues from other organisms109 strongly supports their involvement in the enzyme’s function. Mutation of the small residues (Val and Ala/Thr) in either TET to ones possessing bulkier side chains causes the enzyme’s activity to stall after one round of oxidation with a 5mC-containing substrate.98,109 It was speculated in both cases that the larger side chains reduce the pocket’s volume, leading to steric clashes between the protein and the C5 substituents of 5hmC and 5fC that occlude them from the active site. Following this logic, Gly substitutions at these positions should allow both TETs to oxidize 5hmC and 5fC at rates similar to that of 5mC; this prediction turned out to be incorrect. It is notable in these studies that all mutations reduced the overall 5mC oxidation activity of NgTET1 and hTET2-TCD,98,109 indicating that other factors not immediately apparent in the crystal structures contribute to the substrate preferences of these enzymes.
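To make the relative oxidation rates quoted in this section concrete, the following Python sketch integrates a simple irreversible scheme, 5mC → 5hmC → 5fC → 5caC, in which each step is treated as pseudo-first-order. Only the 3.5-fold and 1.5-fold ratios come from the text; the function names, the arbitrary time unit, the normalization of the first rate constant to 1, and the first-order treatment itself are assumptions made for illustration and are not a kinetic model proposed by the authors.

```python
def relative_rate_constants(k_5mc=1.0, fold_5mc_over_5hmc=3.5, fold_5hmc_over_5fc=1.5):
    """Return (k1, k2, k3) for 5mC->5hmC, 5hmC->5fC, 5fC->5caC in arbitrary units."""
    k_5hmc = k_5mc / fold_5mc_over_5hmc      # 5hmC oxidation is ~3.5-fold slower
    k_5fc = k_5hmc / fold_5hmc_over_5fc      # 5fC oxidation is a further ~1.5-fold slower
    return k_5mc, k_5hmc, k_5fc


def simulate(t_end, dt=1e-3, constants=None):
    """Explicit Euler integration of the sequential scheme, starting from pure 5mC."""
    k1, k2, k3 = constants if constants is not None else relative_rate_constants()
    mc, hmc, fc, cac = 1.0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        d1 = k1 * mc * dt     # amount of 5mC converted to 5hmC in this step
        d2 = k2 * hmc * dt    # amount of 5hmC converted to 5fC
        d3 = k3 * fc * dt     # amount of 5fC converted to 5caC
        mc -= d1
        hmc += d1 - d2
        fc += d2 - d3
        cac += d3
    return {"5mC": mc, "5hmC": hmc, "5fC": fc, "5caC": cac}


if __name__ == "__main__":
    for t in (0.5, 1.0, 5.0, 20.0):
        fractions = simulate(t)
        print(t, {base: round(x, 3) for base, x in fractions.items()})
```

Under these assumptions 5hmC accumulates transiently because it is formed faster than it is consumed, which is one simple way to picture why the intermediate oxidation states can persist.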

EVOLUTIONARY HISTORY AND PHYLOGENETIC CLASSIFICATION OF TETS

As a result of their shared catalytic properties and detectable sequence similarity, TETs and JBPs have been grouped by Aravind and co-workers into the so-called TET/JBP family.49 Homologues of these enzymes can be found in all domains of life from viruses to humans. Sequence and phylogenetic analyses performed by this group suggested that the TET/JBPs of bacteriophages are the most ancestral of the 5-methylpyrimidine dioxygenases.110 Interestingly, the reactions catalyzed by these enzymes (Figure 2) also resemble those of thymine-7-hydroxylase (T7H), a fungal salvage enzyme that can iteratively oxidize thymine to 5-hydroxymethyluracil, 5-formyluracil, and 5-carboxyuracil.111 The mechanistic similarities strongly suggest that TETs, JBPs, and T7Hs may share a common ancestor. Prokaryotic and viral TET/JBPs show conserved neighborhood linkages to several distinct sets of genes predicted to encode other DNA-modifying enzymes, including methyltransferases (MTs), glycosyltransferases, and enzymes with known112 or tentative T-hypermodification110 activity. One feature many of these co-occurring genes share is that their gene products are predicted to require a hydroxyl as an acceptor group in their reactions. As a result, the product of a TET/JBP-like activity could provide the chemical handle for further elaboration. These groupings are reminiscent of biosynthetic gene clusters, and their observed combinatorial permutations hint at a large diversity of nucleotide modifications that await discovery.

We carefully examined these prokaryotic homologues and found that they can be further classified into defined subgroups on the basis of gene-neighborhood associations (Figure 6). A tempting prediction that arises from our classification is that a TET-like function (5mC-oxidizing) versus JBP-like function (T-oxidizing) is contingent on the presence of a C5-cytosine-MT-like gene. For example, the TET/JBP homologue of Proteobacteria bacterium TMED261 is associated with a predicted C5-cytosine-MT and glycosyltransferase, suggestive of a pathway in which C is first methylated and then oxidized to 5hmC for subsequent glycosylation. In contrast, the absence of neighboring C5-cytosine-MTs in Mycobacterium chelonae strain D16R7 and Mycobacterium phage Nigel (Nigel) suggests that T may be targeted for oxidation. In both systems, the 5hmU formed is hypothesized to be further modified by products of neighboring genes. Other MTs, such as a predicted DNA-N6A-MT in Persicivirga phage P12024L and a FkbM-like MT in cyanophage MED4−184, are also observed to cluster with TET/JBP homologues. The only characterized FkbM-like MT to date catalyzes the methylation of an oxo group during the biosynthesis of a macrolide antibiotic,113 and thus, we predict that the pairing of such a gene with a TET/JBP homologue in the absence of C5-cytosine-MT will result in the formation of 5-methoxymethyluridine as the product of this pathway.

What role could these base modifications play in the physiology of a bacterium or virus? In cells, one obvious possibility is that they are a component of a bacterial restriction-modification system that protects the host’s gDNA against resident restriction endonucleases. In phages, such modifications could be used to block restriction endonucleases expressed by their hosts or, alternatively, could play a role in their morphogenesis. It has been noted that phage TET/JBP homologues are often flanked by a parB-like homologue upstream and large terminase subunit gene (e.g., Nigel) downstream (Figure 6).110 The ParB protein family has been shown to be involved in chromosome and plasmid segregation in cellular organisms, and the terminase subunit is part of a DNA packaging motor used during assembly of the virus particle.115 Therefore, it was suggested that DNA modification by TET/JBP homologues and the associated base-modifying enzymes might be used to define packaging start/end points during viral morphogenesis.68 Further studies are warranted to determine the extent and sequence specificity of base modification in the DNA of phages encoding TET/JBP homologues and whether such modifications coincide with the termini of viral genomic DNA.

Aravind and co-workers state that eukaryotes commandeered TET/JBPs from bacteriophages and, through the course of evolution, repurposed these enzymes as generators of epigenetic marks.68 Acquisition of these genes likely occurred through two distinct phyletic patterns:49 (i) lateral gene transfer, as observed in animals, Acanthamoeba, Naegleria, kinetoplastids, bacteria, phages, and certain algae, and (ii) a massive gene expansion, often with 10 or more copies, as seen in Coprinopsis and Laccaria.116 In the latter case, the TET/JBP homologue is frequently coupled with active transposons that are predicted to have played important roles in speciation during evolution.116 Interestingly, TET/JBP genes from Coprinopsis, Laccaria, and metazoans are strongly correlated with a C5-cytosine-MT, suggesting that the corresponding enzymes likely act on 5mC. In contrast, the acquired element in kinetoplastids appears to have strictly included only the genes necessary for base J synthesis (a JBP and glucosyltransferase). In summary, if a correlation indeed exists between the spread of these nucleic-acid modifying enzymes and the evolution of DNA methylation in eukaryotes, then a future focus on prokaryotic TET/JBP characterization might uncover novel biochemistry and yield a deeper understanding of the evolution of regulatory base modifications in all organisms.
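The gene-neighborhood prediction described in this section and in the Figure 6 caption amounts to a simple rule: a TET/JBP homologue that co-occurs with a predicted C5-cytosine-MT is anticipated to oxidize 5mC, and one that does not is anticipated to act on T. The Python sketch below encodes only that rule. The function name, the string labels for neighboring genes, and the example dictionaries are hypothetical shorthand for illustration; the neighbor lists include only the genes the text names, not full annotations of the loci, and the rule itself is described by the authors as a tempting prediction rather than an established fact.

```python
C5_MT = "C5-cytosine-MT"  # label assumed here for a predicted C5-cytosine methyltransferase


def predict_substrate(neighboring_genes):
    """Predict the pyrimidine oxidized by a TET/JBP homologue from its gene neighbors.

    Returns "5mC" (TET-like) when a predicted C5-cytosine-MT is present and
    "T" (JBP-like) otherwise, following the rule stated in the text.
    """
    return "5mC" if C5_MT in neighboring_genes else "T"


# Illustrative neighborhoods assembled only from genes mentioned in the text.
examples = {
    "Proteobacteria bacterium TMED261": {C5_MT, "glycosyltransferase"},
    "Mycobacterium phage Nigel": {"parB-like", "terminase large subunit"},
    "cyanophage MED4-184": {"FkbM-like MT"},
}

if __name__ == "__main__":
    for name, genes in examples.items():
        print(f"{name}: predicted substrate {predict_substrate(genes)}")
```

A rule this coarse is easy to falsify experimentally, which is part of its appeal: each predicted substrate can be checked against the activity of the purified homologue.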



SELECTIVITY OF 5MC OVER T FOR OXIDATION BY TETS

Residues interacting with N3 and the exocyclic functional group bonded to C4 of the pyrimidine ring are expected to play a key role in dictating the substrate specificity of TETs and JBPs.60 The hydrogen-bond donating/accepting capacity is distinct at these two positions of the ring for T (N3 = donating, O4 = accepting) versus 5xC (N3 = accepting, N4 = donating). Therefore, in selecting one pyrimidine over the other, a protein should provide functional groups that match the hydrogen-bonding ability at each position. Thymidylate synthases (TSs) are good examples of this chemical logic.117 Canonical TSs that produce 2′-deoxythymidine monophosphate from 2′-deoxyuridine monophosphate (dUMP) use the amido NH2 of an Asn side chain to donate a hydrogen bond to O4 of the pyrimidine ring (Figure 7A).118,119 In contrast, the TS homologue gp42 of phage T4, which has been characterized to hydroxymethylate 2′-deoxycytidine monophosphate (dCMP), harbors Asp, which accepts a hydrogen bond from N4 of cytosine, at the corresponding position (Figure 7B).120 In fact, replacing this Asn with Asp in canonical TSs increases the rate of turnover for dCMP by over 7 orders of magnitude relative to the wild-type enzyme.118 Similarly, replacing the corresponding Asp in gp42 with Asn results in an enzyme that prefers dUMP over its natural substrate, dCMP.118

Figure 7. Base selectivity in thymidylate synthases (TSs), TETs, and JBPs. (A) In canonical TSs, selectivity for dUMP is dictated by the use of the amido-NH2 group of Asn to donate a hydrogen bond to O4 of uracil. (B) The selectivity of TS homologue gp42 from phage T4 for dCMP is dictated by the use of the side-chain carboxylate of Asp to accept a hydrogen bond from N4 of cytosine. (C) In TETs, selectivity for 5mC may be dictated by having a hydrogen-bond acceptor (h, amido O of Asn; Ng, Asp side-chain carboxylate) to interact with N4 and a hydrogen-bond donating His to interact with N3 of the pyrimidine ring. (D) Residues thought to interact with T in JBPs on the basis of the sequence alignments shown in Figure 8.

Sequence alignments of representative TET/JBPs (Figure 8) reveal some distinct trends in the known or predicted base-interacting residues of these proteins. 5mC dioxygenases, such as mammalian TET1/2/3 and NgTET1, utilize His and Asn/Asp to respectively interact with N3 and N4 of the base (Figure 7C). As shown in the structures of hTET2-TCD, the side chain of N1387 is oriented such that the amido O can accept a hydrogen bond from N4 of the 5xC substrates (Figure 5A−C).67,99 Similarly, H1904 serves as a hydrogen-bond donor to N3 of the pyrimidine ring. These two positions are almost completely conserved in all predicted 5mC dioxygenases from multicellular organisms (Figure 8). On the basis of sequence alignments between TETs and JBPs, the corresponding positions in T dioxygenases (e.g., Leishmania and T. brucei JBP1/2) are predicted to use Arg to interact with N3 and Asp with O4 (Figure 7D). These observations are surprising given that the hydrogen-bonding potential of the residues (R = donating, D = accepting) seems discordant with the positions of the pyrimidine ring with which they are proposed to interact. It may be possible that the positioning of T and the orientation of the Arg and Asp side chains in the active sites of these enzymes allow these residues to essentially swap roles, resulting in hydrogen bonding of Arg with O4 and Asp with N3. A structure of a T dioxygenase will shed more light on this mystery.

Figure 8. Sequence alignment of hTET1, 2, 3 (Q8NFU7, Q6N021, and O43151), mTET1, 2, 3 (XP_011241810, XP_011241810, and XP_006505839), AmTET (XM_006561197.3), droTET (AAF47691.4), CcTET (XP_001831108.2), T. brucei JBP1 and -2 (XP_829420.1 and Q57X81.1), Leishmania major JBP1 and -2 (XP_001681321.1 and YP_007674071.1), phage Med4−184 (YP_007674071.1), phage Nigel (YP_002003841.1), TMED261 (OUX44518.1), and NgTET1 (XP_002667965). Residues highlighted in red are ligands to Fe2+ and aKG; those in green are residues with potential hydrogen bonds to the pyrimidine ring of 5xC or T; and those in cyan are residues that constitute the active-site scaffold. Open boxes correspond to residues that have been identified structurally to have the function determined by their color code but do not align properly with respective residues from other organisms (e.g., hTET2 R1261 and NgTET1 R224 are both ligands to aKG but are not in alignment with each other). Secondary structural elements parsed and numbered from PDB entries 4LT5 (NgTET1)59 and 4NM6 (hTET2-TCD)67 are displayed for the indicated sequences in the alignment. This figure was generated using ESPript 3.0.121

Table 1. Comparison of Key Features of TET-Dependent 5(h)mC-Sequencing Methods

method | C readout | type of method | advantages | disadvantages
BS-seq | 5mC + 5hmC | chemical deamination followed by NGS sequencing | foundation of many other methods; single-base resolution | DNA degradation; 5mC and 5hmC are indistinguishable
TAB-seq | 5hmC | enzymatic glucosylation and oxidation followed by BS-seq | direct detection of 5hmC; single-base resolution | DNA degradation; TET- and BGT-sequence biases
TAmC-seq | 5mC | enrichment followed by NGS sequencing | outperforms antibody-based sequencing methods | TET- and BGT-sequence biases; no absolute quantification of DNA methylation status
TET-mediated SMRT-seq | 5mC and 5hmC | single-molecule; dependent on TET activity and polymerase kinetics | single-molecule; long reads; direct detection of 5mC and 5hmC; single-base resolution | hard to distinguish IPDs of (C and 5mC) and (5fC and 5caC); TET-sequence bias
enzyme-dependent cytosine-modification sequencing | 5mC and 5hmC | enzymatic glucosylation/oxidation/deamination | low DNA input; detection of 5mC and 5hmC; single-base resolution | sequence biases for TET, BGT, and APOBEC3A

Biochemical evidence indicates, however, that the specificity for 5mC or T is more complicated than suggested by our predictions. NgTET1, which exhibits both 5mC- and T-oxidizing activities,60 uses D234 to interact with N4 of 5xC (Figure 5D,E).59,98 Mutation of this residue to Asn or Ala increases the efficiency of T oxidation by NgTET1;60 we thought that Asn would be the better substitution to achieve this aim, as it could hydrogen-bond with O4, but the data showed that D234A is more efficient at oxidizing T.60 Another example is droTET, which has the same base-interacting residues (Asn and His) as mammalian TETs (Figure 8) yet is reported to have 6mA dioxygenase activity.64 Furthermore, many prokaryotic homologues shown in the sequence alignment in Figure 8 are outliers from the general trends noted above, adding further complexity to predicting TET/JBP specificity. Establishing a sequence−structure−function relationship for these enzymes will require biochemical and structural investigations of additional members of the TET/JBP family.
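The hydrogen-bond matching logic used in the selectivity discussion (T: N3 donating, O4 accepting; 5xC: N3 accepting, N4 donating) can be written as a small consistency check. The Python sketch below is a deliberately crude illustration: the donor/acceptor roles assigned to each side chain are the ones the text uses, whereas the dictionary names, the labels such as `Asn_O`, and the all-or-nothing scoring are assumptions introduced here. Real side chains such as His can act as donor or acceptor depending on protonation and geometry, which this sketch ignores.

```python
# Hydrogen-bonding character of the two ring positions discussed in the text.
# "C4" stands for the exocyclic group on C4 (O4 in T, N4 amine in 5xC).
BASE_FACE = {
    "T":   {"N3": "donor",    "C4": "acceptor"},
    "5xC": {"N3": "acceptor", "C4": "donor"},
}

# Role each side chain is taken to play, following the assignments in the text.
RESIDUE_ROLE = {
    "His": "donor",
    "Arg": "donor",
    "Asn_NH2": "donor",    # amido NH2 of Asn, as in canonical thymidylate synthases
    "Asn_O": "acceptor",   # amido O of Asn, as in hTET2 N1387
    "Asp": "acceptor",
}


def is_complementary(base, contacts):
    """True if every residue offers the role opposite to the ring position it contacts."""
    face = BASE_FACE[base]
    return all(RESIDUE_ROLE[residue] != face[position]
               for position, residue in contacts.items())


if __name__ == "__main__":
    print(is_complementary("5xC", {"N3": "His", "C4": "Asn_O"}))  # mammalian TET pattern
    print(is_complementary("5xC", {"N3": "His", "C4": "Asp"}))    # NgTET1 pattern
    print(is_complementary("T", {"N3": "Arg", "C4": "Asp"}))      # JBP pattern as aligned
    print(is_complementary("T", {"N3": "Asp", "C4": "Arg"}))      # JBP pattern with roles swapped
```

In this toy model the JBP pairing predicted from the alignment fails the check for T, while the swapped arrangement passes, which mirrors the discordance and the proposed role swap described in the text.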

TET AS A BIOTECHNOLOGICAL TOOL FOR 5(H)MC SEQUENCING

Gaining an understanding of the diverse roles that 5mC and its oxidized derivatives play in epigenetic regulation requires methods to selectively enrich, detect, and quantify these bases in gDNA samples. It was recognized soon after the discovery of their catalytic properties that TETs hold great potential for both enhancing pre-existing methylome sequencing technologies and developing new ones for profiling genomic 5hmC, 5fC, and 5caC content. Here we briefly review some of the advances that have been made in 5mC and 5hmC sequencing applications due to incorporation of TET as a reagent in the procedural workflow. These methods are listed in Table 1 along with the pros and cons of their utilization.

The current gold standard for profiling genomic 5mC and 5hmC content is bisulfite sequencing (BS-seq). Treatment of denatured DNA with sodium bisulfite results in deamination of C, 5fC, and 5caC to respectively form U, 5fU, and 5caU, all of which are read as T during sequencing.122 In contrast, 5mC and 5hmC are resistant to this chemical deamination and are therefore read as C.24,122 As can be inferred from this description, standard BS-seq allows 5mC + 5hmC to be identified by comparing the sequencing results of a bisulfite-treated sample to that of an untreated control, but the individual 5mC and 5hmC sites are indistinguishable (Table 1).

TET-assisted bisulfite sequencing (TAB-seq) was developed as a means to selectively maintain 5hmC in samples for sequencing detection by coupling the activities of TET and T4 β-glucosyltransferase (BGT).122 BGT transfers a glucosyl group from UDP-glucose to 5hmC to generate 5-(β-D-glucosyl)methylcytosine (5gmC), which is resistant to both oxidation by TET and bisulfite-catalyzed deamination. This enzymatic reaction is used to quantitatively protect all pre-existing 5hmC in a DNA sample during the first step of TAB-seq. Subsequent introduction of excess TET results in the oxidation of 5mC to 5fC and 5caC, the latter of which is deaminated, along with unmodified C, following bisulfite treatment. Sequencing of these samples results in 5gmC being read as C, whereas all other formerly (un)modified Cs are read as T, thus allowing for single-base resolution detection of 5hmC sites in the original DNA sample (Table 1).

While powerful, BS-seq can be laborious and expensive and requires tremendous amounts of sample in order to offset the degradation of DNA that occurs during bisulfite treatment. This issue has led to the development of many alternatives to BS-seq for profiling genome-wide and/or loci-specific 5(h)mC. The typical workflow of these methods involves affinity enrichment of DNA fragments containing a particular base followed by deep sequencing. In TET-assisted 5mC sequencing (TAmC-seq),123 the ability of BGT to accept the analogue UDP-6-azidoglucose is exploited in mapping 5mC sites. After inactivation of pre-existing 5hmC sites with glucose, TET, BGT, and UDP-6-azidoglucose are combined with the glucosylated DNA sample in a one-pot reaction. TET oxidizes 5mC to 5hmC, and the latter is then rapidly converted to 6-azido-β-glucosyl-5-hydroxymethyl-2′-deoxycytosine (N3-5gmC) by BGT. A biotin tag is covalently attached to N3-5gmC via click chemistry for subsequent pulldown and sequencing to map the original 5mC content of the DNA sample. Cross-validation experiments revealed that TAmC-seq outperformed 5mC immunoprecipitation-based sequencing and attained genomic 5mC coverage that approached levels observed in BS-seq (Table 1).123

One proven technique for profiling many base modifications without enrichment or bisulfite treatment is TET-mediated single-molecule real-time sequencing (SMRT-seq). In SMRT-seq, DNA polymerase kinetics are measured to determine the length of time, or interpulse duration (IPD), between two successive nucleotide incorporation events. When the polymerase encounters a modified base on the template strand, the IPD changes in a manner that is characteristic of the modification and its sequence context. Sites of base modification in a gDNA sample can be determined, sometimes at single-base resolution, by comparison of the IPD measured using modified DNA with that measured using an unmodified control (the IPD ratio). While SMRT-seq works particularly well in detecting 6mA and 4-methylcytosine, the effects of 5mC on the IPD ratio are subtle, rendering it very challenging to confidently call a position methylated (Table 1).124 TET-mediated oxidation of 5mC in short oligonucleotides to 5hmC, 5fC, and 5caC was shown to produce similarly patterned yet readily detectable IPD ratios with magnitudes that scale with the oxidation state of the base (5hmC < 5fC ≈ 5caC), thereby providing a means of improving 5mC detection by SMRT-seq (Table 1).124 This beneficial effect was further validated by successfully mapping 95%, 77%, and 90% of the expected 5mC positions in the genomes of Escherichia coli MG1655, Bacillus halodurans C-125, and Helicobacter pylori strain 2295.60,124 TET-mediated SMRT-seq may also be useful for identifying 5hmC and 5fC by using techniques that specifically label these bases to produce distinctive IPD ratios.125

Another alternative to BS-seq is an enzyme-dependent 5(h)mC-sequencing method in which the activities of TET, cytosine deaminase (APOBEC3A), and BGT are orchestrated for mapping of 5mC and 5hmC sites.126,127 APOBEC3A is proficient in deaminating C and 5mC128,129 but displays significantly reduced activity on 5hmC and almost no activity on 5fC, 5caC, and 5gmC.127,130 Therefore, utilizing TET to oxidize all 5mC to higher oxidation states and BGT to capture any remaining 5hmC as 5gmC will block APOBEC3A’s activity on these sites. Since APOBEC3A is specific to single-stranded DNA, gDNA samples need to be denatured subsequent to treatment with TET and BGT by heating in the presence of formamide.126 Similar to other sequencing methods, distinguishing 5mC from C is achieved by comparing the sequencing results for gDNA samples with and without treatment with TET, BGT, and APOBEC3A. In the treated samples, C will be deaminated to U as a result of APOBEC3A’s activity, while former 5mC sites, rendered inert to APOBEC3A by oxidation/glucosylation, will read as C. Distinguishing 5mC and 5hmC sites is achieved by introducing a third sample that is subjected only to the activities of BGT and APOBEC3A. The latter results in deamination of C and 5mC to U and T, respectively, while 5gmC is left intact and is therefore read as C.127 A recent method, termed APOBEC-coupled epigenetic sequencing, also utilizes BGT and APOBEC3A for single-base resolution of 5hmC.131 Since these methods are solely dependent on the activities of DNA-modifying enzymes, they hold great promise in obtaining single-base resolution detection of 5(h)mC with lower DNA input and longer reads compared with BS-based approaches (Table 1).

The methods described above highlight some of the biotechnological potential of utilizing TETs as reagents in the workflows of 5(h)mC-sequencing techniques. However, there are certain aspects of the biochemistry of these enzymes that are undesirable for sequencing applications and require further investigation in order to attenuate or abolish them. Perhaps the largest issue is that currently characterized TETs show a strong preference for 5mC in the context of CpG dinucleotides, which likely results in biasing of data sets against non-CpG methylation sites in DNA. Overcoming such biases may come through characterizing new TET homologues from bacteriophages or prokaryotes. Alternatively, if the factors dictating a preference for CpG sites can be identified, it may be possible to engineer well-studied TETs to possess relaxed sequence specificity. For some applications, it may also be desirable to have TET catalyze only one oxidation and then stop. The 5hmC-stalling mutations noted earlier98,109 are perhaps a step in the right direction, but further optimization is needed in order to maintain robust enzymatic activity.
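The read-out logic of BS-seq, TAB-seq, and the enzyme-dependent (TET/BGT/APOBEC3A) method described in this section can be summarized as a set of lookup tables. The Python sketch below is an idealized illustration: it assumes every enzymatic and chemical step goes to completion, treats excess TET as converting 5mC, 5hmC, and 5fC fully to 5caC, and treats 5hmC as fully protected from APOBEC3A even though the text only reports significantly reduced activity on it. All function and variable names are hypothetical and do not correspond to any published software.

```python
# Base read after sodium bisulfite treatment (U, 5fU and 5caU are read as T).
BISULFITE_READ = {"C": "T", "5fC": "T", "5caC": "T",
                  "5mC": "C", "5hmC": "C", "5gmC": "C"}

# Base read after APOBEC3A treatment (idealized: 5hmC assumed fully protected).
APOBEC_READ = {"C": "T", "5mC": "T",
               "5hmC": "C", "5fC": "C", "5caC": "C", "5gmC": "C"}


def bgt(base):
    """T4 beta-glucosyltransferase: protects 5hmC as 5gmC."""
    return "5gmC" if base == "5hmC" else base


def tet(base):
    """Excess TET, assumed to oxidize 5mC/5hmC/5fC completely; 5gmC is resistant."""
    return "5caC" if base in ("5mC", "5hmC", "5fC") else base


def bs_seq(base):
    return BISULFITE_READ[base]


def tab_seq(base):
    # Glucosylate first, then oxidize, then bisulfite-convert.
    return BISULFITE_READ[tet(bgt(base))]


def enzymatic_full(base):
    # TET oxidation, BGT capture of remaining 5hmC, then APOBEC3A deamination.
    return APOBEC_READ[bgt(tet(base))]


def enzymatic_bgt_only(base):
    # Third sample: BGT and APOBEC3A only.
    return APOBEC_READ[bgt(base)]


def call_from_enzymatic(base):
    """Combine the two enzymatic samples to call C, 5mC or 5hmC at one position."""
    if enzymatic_full(base) == "T":
        return "C"
    return "5mC" if enzymatic_bgt_only(base) == "T" else "5hmC"


if __name__ == "__main__":
    for b in ("C", "5mC", "5hmC"):
        print(b, bs_seq(b), tab_seq(b), enzymatic_full(b),
              enzymatic_bgt_only(b), call_from_enzymatic(b))
```

The table form makes the key distinction easy to see: BS-seq alone reads 5mC and 5hmC identically, whereas TAB-seq or the pair of enzymatic samples separates them.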

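The three-sample readout logic of the enzyme-dependent 5(h)mC-sequencing method can be summarized in a short sketch. The snippet is illustrative only and is not part of the cited workflows: the name `call_cytosine`, the lookup table, and the simplification to a single error-free read per position are assumptions made here for clarity. It takes the base read at a reference cytosine in the TET/BGT/APOBEC3A-treated library and in the BGT/APOBEC3A-only library and returns the inferred state.

```python
# Hypothetical helper, not from the cited methods. It encodes only the readout
# rules stated in the text for a position that is C in the untreated sample.
# Keys: (base read after TET + BGT + APOBEC3A, base read after BGT + APOBEC3A).
READOUT_TO_CALL = {
    ("T", "T"): "C",     # unprotected in both libraries: deaminated to U, read as T
    ("C", "T"): "5mC",   # protected only when TET oxidation precedes deamination
    ("C", "C"): "5hmC",  # glucosylated to 5gmC by BGT, protected even without TET
}


def call_cytosine(tet_bgt_apobec_read: str, bgt_apobec_read: str) -> str:
    """Infer the state of a reference cytosine from the two treated libraries.

    Combinations that the scheme does not predict (for example T then C, or a
    base other than C/T) are reported as "ambiguous" rather than guessed.
    """
    key = (tet_bgt_apobec_read.upper(), bgt_apobec_read.upper())
    return READOUT_TO_CALL.get(key, "ambiguous")


if __name__ == "__main__":
    for reads in [("T", "T"), ("C", "T"), ("C", "C"), ("T", "C")]:
        print(reads, "->", call_cytosine(*reads))
```

In practice each position is covered by many reads, so an actual analysis would compare the fraction of C versus T calls across the libraries rather than single bases.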


CONCLUSIONS AND PERSPECTIVE

Despite the substantial knowledge gained in the past decade regarding the role that TETs play in active demethylation, many key biological questions remain unanswered. The current model of active demethylation, TET-TDG-BER, seems like a rather costly process in terms of cellular resources for removing epigenetic marks. Could DNMTs be linked to the 5caC decarboxylation activity observed with mESC extract,57,58 or is there a 5caC decarboxylase awaiting discovery? How is the TET-TDG-BER pathway invoked versus cytosine deaminase and/or other mechanisms? We have also highlighted evidence implicating TETs as depositors of new layers of epigenetic information, leading to the question of the exact role that 5hmC, 5fC, and 5caC have in vivo. Intense research is currently in progress to address these questions. In our opinion, an important area of the field that seems to be severely lacking is an understanding of the biochemical fundamentals of the TET catalytic reaction. A few biochemical studies and the X-ray crystal structures of TETs with 5xC substrates have illuminated some aspects of catalysis. However, a detailed kinetic examination of the Fe2+-oxidation mechanism with each of the 5xC substrates and an exploration of the inherent regulatory mechanisms that TETs employ to control each oxidation step are of imminent importance for full comprehension of enzyme function. Studies of TET homologues from phage and bacteria, which arguably may be simpler biochemical systems to examine, can help in attaining this comprehension and also provide a stepping stone to understanding the functional evolution and diversity of the TET/JBP superfamily.



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Lana Saleh: 0000-0003-2629-9795

Author Contributions

The manuscript was written through contributions of all authors. All of the authors approved the final version of the manuscript.

DOI: 10.1021/acs.biochem.8b01185 Biochemistry 2019, 58, 450−467


Funding

This work was supported entirely by internal funding from New England Biolabs, Inc.

Notes

The authors declare the following competing financial interest(s): M.J.P., P.R.W., and L.S. are employees of New England Biolabs, Inc., a manufacturer and vendor of molecular biology reagents, including enzyme reagents for epigenetics research. This affiliation does not affect the authors’ impartiality, adherence to journal standards and policies, and availability of data.

ACKNOWLEDGMENTS

We thank Zhiyi Sun, Bill Jack, Andy Gardner, Tom Evans, and Rich Roberts for critical feedback on this Perspective.

ABBREVIATIONS

5mC, 5-methylcytosine; gDNA, genomic DNA; PGC, primordial germ cell; C5-cytosine-MT, C5-cytosine-methyltransferases; UHRF1, E3 ubiquitin-protein ligase; TET, ten-eleven translocation; aKG, α-ketoglutarate; 5hmC, 5-hydroxymethylcytosine; 5fC, 5-formylcytosine; 5caC, 5-carboxycytosine; TDG, thymine-DNA glycosylase; T, thymine; BER, base excision repair; C, cytosine; HEK, human embryonic kidney; AID, activation-induced cytidine deaminase; Gadd45a, growth arrest DNA-damage-inducible protein 45a; JBP, base J-binding protein; 5hmU, 5-hydroxymethyluracil; base J, 5-(β-D-glucosyl)methyluracil; hTET, human TET; TLC, thin-layer chromatography; CD, catalytic domain; mTET, mouse TET; mESC, mouse embryonic stem cell; NgTET1, Naegleria gruberi TET1; CcTET, Coprinopsis cinerea TET; AmTET, Apis mellifera TET; droTET, Drosophila melanogaster TET; 5fU, 5-formyluracil; 5caU, 5-carboxyuracil; 6mA, N6-methyladenosine; DSBH, double-stranded beta-helix; 5xC, 5mC, 5hmC, 5fC, or 5caC (x = m, hm, f, or ca); hTET2-TCD, hTET2 truncated CD; MT, methyltransferase; BS-seq, bisulfite sequencing; TAB-seq, TET-assisted bisulfite sequencing; BGT, T4 β-glucosyltransferase; TAmC-seq, TET-assisted 5mC sequencing; SMRT-seq, single-molecule real-time sequencing; IPD, interpulse duration; APOBEC3A, a human cytosine deaminase

REFERENCES

(1) Lister, R., Pelizzola, M., Dowen, R. H., Hawkins, R. D., Hon, G., Tonti-Filippini, J., Nery, J. R., Lee, L., Ye, Z., Ngo, Q. M., Edsall, L., Antosiewicz-Bourget, J., Stewart, R., Ruotti, V., Millar, A. H., Thomson, J. A., Ren, B., and Ecker, J. R. (2009) Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315−322.
(2) Bogdanović, O., and Lister, R. (2017) DNA methylation and the preservation of cell identity. Curr. Opin. Genet. Dev. 46, 9−14.
(3) Smith, Z. D., and Meissner, A. (2013) DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204−220.
(4) Li, E., Bestor, T. H., and Jaenisch, R. (1992) Targeted mutation of the DNA methyltransferase gene results in embryonic lethality. Cell 69, 915−926.
(5) Okano, M., Bell, D. W., Haber, D. A., and Li, E. (1999) DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell 99, 247−257.
(6) Ravichandran, M., Jurkowska, R. Z., and Jurkowski, T. P. (2018) Target specificity of mammalian DNA methylation and demethylation machinery. Org. Biomol. Chem. 16, 1419−1435.
(7) Chédin, F., Lieber, M. R., and Hsieh, C. L. (2002) The DNA methyltransferase-like protein Dnmt3L stimulates de novo methylation by Dnmt3a. Proc. Natl. Acad. Sci. U. S. A. 99, 16916−16921.
(8) Hata, K., Okano, M., Lei, H., and Li, E. (2002) Dnmt3L cooperates with the Dnmt3 family of de novo DNA methyltransferases to establish maternal imprints in mice. Development 129, 1983−1993.
(9) Jeltsch, A., and Jurkowska, R. Z. (2013) Multimerization of the Dnmt3a DNA methyltransferase and its functional implications. Prog. Mol. Biol. Transl. Sci. 117, 445−464.
(10) Jeltsch, A. (2006) On the enzymatic properties of Dnmt1: specificity, processivity, mechanism of linear diffusion and allosteric regulation of the enzyme. Epigenetics 1, 63−66.
(11) Jeltsch, A., and Jurkowska, R. Z. (2014) New concepts in DNA methylation. Trends Biochem. Sci. 39, 310−318.
(12) Bostick, M., Kim, J. K., Esteve, P. O., Clark, A., Pradhan, S., and Jacobsen, S. E. (2007) UHRF1 plays a role in maintaining DNA methylation in mammalian cells. Science 317, 1760−1764.
(13) Liu, X., Gao, Q., Li, P., Zhao, Q., Zhang, J., Li, J., Koseki, H., and Wong, J. (2013) UHRF1 targets DNMT1 for DNA methylation through cooperative binding of hemi-methylated DNA and methylated H3K9. Nat. Commun. 4, 1563.
(14) Sharif, J., Muto, M., Takebayashi, S., Suetake, I., Iwamatsu, A., Endo, T. A., Shinga, J., Mizutani-Koseki, Y., Toyoda, T., Okamura, K., Tajima, S., Mitsuya, K., Okano, M., and Koseki, H. (2007) The SRA protein Np95 mediates epigenetic inheritance by recruiting Dnmt1 to methylated DNA. Nature 450, 908−912.
(15) Hajkova, P., Erhardt, S., Lane, N., Haaf, T., El-Maarri, O., Reik, W., Walter, J., and Surani, M. A. (2002) Epigenetic reprogramming in mouse primordial germ cells. Mech. Dev. 117, 15−23.
(16) Oswald, J., Engemann, S., Lane, N., Mayer, W., Olek, A., Fundele, R., Dean, W., Reik, W., and Walter, J. (2000) Active demethylation of the paternal genome in the mouse zygote. Curr. Biol. 10, 475−478.
(17) Rougier, N., Bourc’his, D., Gomes, D. M., Niveleau, A., Plachot, M., Pàldi, A., and Viegas-Péquignot, E. (1998) Chromosome methylation patterns during mammalian preimplantation development. Genes Dev. 12, 2108−2113.
(18) Sasaki, H., and Matsui, Y. (2008) Epigenetic events in mammalian germ-cell development: reprogramming and beyond. Nat. Rev. Genet. 9, 129−140.
(19) Weber, M., Hellmann, I., Stadler, M. B., Ramos, L., Pääbo, S., Rebhan, M., and Schübeler, D. (2007) Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat. Genet. 39, 457−466.
(20) Yamazaki, Y., Mann, M. R., Lee, S. S., Marh, J., McCarrey, J. R., Yanagimachi, R., and Bartolomei, M. S. (2003) Reprogramming of primordial germ cells begins before migration into the genital ridge, making these cells inadequate donors for reproductive cloning. Proc. Natl. Acad. Sci. U. S. A. 100, 12207−12212.
(21) Seki, Y., Yamaji, M., Yabuta, Y., Sano, M., Shigeta, M., Matsui, Y., Saga, Y., Tachibana, M., Shinkai, Y., and Saitou, M. (2007) Cellular dynamics associated with the genome-wide epigenetic reprogramming in migrating primordial germ cells in mice. Development 134, 2627−2638.
(22) Jones, P. A., and Taylor, S. M. (1980) Cellular differentiation, cytidine analogs and DNA methylation. Cell 20, 85−93.
(23) Smith, Z. D., and Meissner, A. (2013) The simplest explanation: passive DNA demethylation in PGCs. EMBO J. 32, 318−321.
(24) He, Y. F., Li, B. Z., Li, Z., Liu, P., Wang, Y., Tang, Q., Ding, J., Jia, Y., Chen, Z., Li, L., Sun, Y., Li, X., Dai, Q., Song, C. X., Zhang, K., He, C., and Xu, G. L. (2011) Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303−1307.
(25) Ito, S., Shen, L., Dai, Q., Wu, S. C., Collins, L. B., Swenberg, J. A., He, C., and Zhang, Y. (2011) Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300−1303.

(26) Kriaucionis, S., and Heintz, N. (2009) The nuclear DNA base, 5-hydroxymethylcytosine is present in brain and enriched in Purkinje neurons. Science 324, 929−930.
(27) Tahiliani, M., Koh, K. P., Shen, Y., Pastor, W. A., Bandukwala, H., Brudno, Y., Agarwal, S., Iyer, L. M., Liu, D. R., Aravind, L., and Rao, A. (2009) Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL Partner TET1. Science 324, 930−935.
(28) Maiti, A., and Drohat, A. C. (2011) Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 286, 35334−35338.
(29) Cortázar, D., Kunz, C., Selfridge, J., Lettieri, T., Saito, Y., MacDougall, E., Wirz, A., Schuermann, D., Jacobs, A. L., Siegrist, F., Steinacher, R., Jiricny, J., Bird, A., and Schär, P. (2011) Embryonic lethal phenotype reveals a function of TDG in maintaining epigenetic stability. Nature 470, 419−423.
(30) Cortellino, S., Xu, J., Sannai, M., Moore, R., Caretti, E., Cigliano, A., Le Coz, M., Devarajan, K., Wessels, A., Soprano, D., Abramowitz, L. K., Bartolomei, M. S., Rambow, F., Bassi, M. R., Bruno, T., Fanciulli, M., Renner, C., Klein-Szanto, A. J., Matsumoto, Y., Kobi, D., Davidson, I., Alberti, C., Larue, L., and Bellacosa, A. (2011) Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell 146, 67−79.
(31) Nabel, C. S., Jia, H., Ye, Y., Shen, L., Goldschmidt, H. L., Stivers, J. T., Zhang, Y., and Kohli, R. M. (2012) AID/APOBEC deaminases disfavor modified cytosines implicated in DNA demethylation. Nat. Chem. Biol. 8, 751−758.
(32) Popp, C., Dean, W., Feng, S., Cokus, S. J., Andrews, S., Pellegrini, M., Jacobsen, S. E., and Reik, W. (2010) Genome-wide erasure of DNA methylation in mouse primordial germ cells is affected by AID deficiency. Nature 463, 1101−1105.
(33) Barreto, G., Schafer, A., Marhold, J., Stach, D., Swaminathan, S. K., Handa, V., Doderlein, G., Maltry, N., Wu, W., Lyko, F., and Niehrs, C. (2007) Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature 445, 671−675.
(34) Ma, D. K., Jang, M. H., Guo, J. U., Kitabatake, Y., Chang, M. L., Pow-Anpongkul, N., Flavell, R. A., Lu, B., Ming, G. L., and Song, H. (2009) Neuronal activity-induced Gadd45b promotes epigenetic DNA demethylation and adult neurogenesis. Science 323, 1074−1077.
(35) Rai, K., Huggins, I. J., James, S. R., Karpf, A. R., Jones, D. A., and Cairns, B. R. (2008) DNA demethylation in zebrafish involves the coupling of a deaminase, a glycosylase, and Gadd45. Cell 135, 1201−1212.
(36) Schmitz, K. M., Schmitt, N., Hoffmann-Rohrer, U., Schafer, A., Grummt, I., and Mayer, C. (2009) TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol. Cell 33, 344−353.
(37) Engel, N., Tront, J. S., Erinle, T., Nguyen, N., Latham, K. E., Sapienza, C., Hoffman, B., and Liebermann, D. A. (2009) Conserved DNA methylation in Gadd45a(−/−) mice. Epigenetics 4, 98−99.
(38) Hashimoto, H., Liu, Y., Upadhyay, A. K., Chang, Y., Howerton, S. B., Vertino, P. M., Zhang, X., and Cheng, X. (2012) Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation. Nucleic Acids Res. 40, 4841−4849.
(39) Otani, J., Kimura, H., Sharif, J., Endo, T. A., Mishima, Y., Kawakami, T., Koseki, H., Shirakawa, M., Suetake, I., and Tajima, S. (2013) Cell cycle-dependent turnover of 5-hydroxymethyl cytosine in mouse embryonic stem cells. PLoS One 8, No. e82961.
(40) Ji, D., Lin, K., Song, J., and Wang, Y. (2014) Effects of Tet-induced oxidation products of 5-methylcytosine on Dnmt1- and DNMT3a-mediated cytosine methylation. Mol. BioSyst. 10, 1749−1752.
(41) Frauer, C., Hoffmann, T., Bultmann, S., Casa, V., Cardoso, M. C., Antes, I., and Leonhardt, H. (2011) Recognition of 5-hydroxymethylcytosine by the Uhrf1 SRA domain. PLoS One 6, No. e21306.

(42) Iurlaro, M., Ficz, G., Oxley, D., Raiber, E.-A., Bachman, M., Booth, M. J., Andrews, S., Balasubramanian, S., and Reik, W. (2013) A screen for hydroxymethylcytosine and formylcytosine binding proteins suggests functions in transcription and chromatin regulation. Genome Biol. 14, R119.
(43) Spruijt, C. G., Gnerlich, F., Smits, A. H., Pfaffeneder, T., Jansen, P. W. T. C., Bauer, C., Münzel, M., Wagner, M., Müller, M., Khan, F., Eberl, H. C., Mensinga, A., Brinkman, A. B., Lephikov, K., Müller, U., Walter, J., Boelens, R., van Ingen, H., Leonhardt, H., Carell, T., and Vermeulen, M. (2013) Dynamic Readers for 5-(Hydroxy)Methylcytosine and Its Oxidized Derivatives. Cell 152, 1146−1159.
(44) Shen, L., and Zhang, Y. (2013) 5-Hydroxymethylcytosine: generation, fate, and genomic distribution. Curr. Opin. Cell Biol. 25, 289−296.
(45) Scott-Browne, J. P., Lio, C. J., and Rao, A. (2017) TET proteins in natural and induced differentiation. Curr. Opin. Genet. Dev. 46, 202−208.
(46) Wu, X., and Zhang, Y. (2017) TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet. 18, 517−534.
(47) Lorsbach, R. B., Moore, J., Mathew, S., Raimondi, S. C., Mukatira, S. T., and Downing, J. R. (2003) TET1, a member of a novel protein family, is fused to MLL in acute myeloid leukemia containing the t(10;11)(q22;q23). Leukemia 17, 637−641.
(48) Ono, R., Taki, T., Taketani, T., Taniwaki, M., Kobayashi, H., and Hayashi, Y. (2002) LCX, leukemia-associated protein with a CXXC domain, is fused to MLL in acute myeloid leukemia with trilineage dysplasia having t(10;11)(q22;q23). Cancer Res. 62, 4075−4080.
(49) Iyer, L. M., Tahiliani, M., Rao, A., and Aravind, L. (2009) Prediction of novel families of enzymes involved in oxidative and other complex modifications of bases in nucleic acids. Cell Cycle 8, 1698−1710.
(50) Cliffe, L. J., Kieft, R., Southern, T., Birkeland, S. R., Marshall, M., Sweeney, K., and Sabatini, R. (2009) JBP1 and JBP2 are two distinct thymidine hydroxylases involved in J biosynthesis in genomic DNA of African trypanosomes. Nucleic Acids Res. 37, 1452−1462.
(51) Yu, Z., Genest, P. A., ter Riet, B., Sweeney, K., DiPaolo, C., Kieft, R., Christodoulou, E., Perrakis, A., Simmons, J. M., Hausinger, R. P., van Luenen, H. G. A. M., Rigden, D. J., Sabatini, R., and Borst, P. (2007) The protein that binds to DNA base J in trypanosomatids has features of a thymidine hydroxylase. Nucleic Acids Res. 35, 2107−2115.
(52) van Luenen, H. G. A. M., Farris, C., Jan, S., Genest, P. A., Tripathi, P., Velds, A., Kerkhoven, R. M., Nieuwland, M., Haydock, A., Ramasamy, G., Vainio, S., Heidebrecht, T., Perrakis, A., Pagie, A., van Steensel, B., Myler, P. J., and Borst, P. (2012) Glucosylated hydroxymethyluracil, DNA base J, prevents transcriptional readthrough in Leishmania. Cell 150, 909−921.
(53) Pfaffeneder, T., Hackner, B., Truß, M., Münzel, M., Müller, M., Deiml, C. A., Hagemeier, C., and Carell, T. (2011) The discovery of 5-formylcytosine in embryonic stem cell DNA. Angew. Chem., Int. Ed. 50, 7008−7012.
(54) Fink, R. M., and Fink, K. (1962) Utilization of radiocarbon from thymidine and other precursors of ribonucleic acid in Neurospora crassa. J. Biol. Chem. 237, 2289−2290.
(55) Palmatier, R. D., McCroskey, R. P., and Abbott, M. T. (1970) The enzymatic conversion of uracil 5-carboxylic acid to uracil and carbon dioxide. J. Biol. Chem. 245, 6706−6710.
(56) Smiley, J. A., Angelot, J. M., Cannon, R. C., Marshall, E. M., and Asch, D. K. (1999) Radioactivity-based and spectrophotometric assays for isoorotate decarboxylase: identification of the thymidine salvage pathway in lower eukaryotes. Anal. Biochem. 266, 85−92.
(57) Schiesser, S., Hackner, B., Pfaffeneder, T., Müller, M., Hagemeier, C., Truß, M., and Carell, T. (2012) Mechanism and stem-cell activity of 5-carboxycytosine decarboxylation determined by isotope tracing. Angew. Chem., Int. Ed. 51, 6516−6520.
(58) Liutkevičiūtė, Z., Kriukienė, E., Ličytė, J., Rudytė, M., Urbanavičiūtė, G., and Klimašauskas, S. (2014) Direct decarboxylation of 5-carboxylcytosine by DNA C5-methyltransferases. J. Am. Chem. Soc. 136, 5884−5887.
(59) Hashimoto, H., Pais, J. E., Zhang, X., Saleh, L., Fu, Z. Q., Dai, N., Corrêa, I. R., Jr., Zheng, Y., and Cheng, X. (2014) Structure of a Naegleria Tet-like dioxygenase in complex with 5-methylcytosine DNA. Nature 506, 391−395.
(60) Pais, J. E., Dai, N., Tamanaha, E., Vaisvila, R., Fomenkov, A. I., Bitinaite, J., Sun, Z., Guan, S., Corrêa, I. R., Jr., Noren, C. J., Cheng, X., Roberts, R. J., Zheng, Y., and Saleh, L. (2015) Biochemical characterization of a Naegleria TET-like oxygenase and its application in single molecule sequencing of 5-methylcytosine. Proc. Natl. Acad. Sci. U. S. A. 112, 4316−4321.
(61) Zhang, L., Chen, W., Iyer, L. M., Hu, J., Wang, G., Fu, Y., Yu, M., Dai, Q., Aravind, L., and He, C. (2014) A TET homologue protein from Coprinopsis cinerea (CcTET) that biochemically converts 5-methylcytosine to 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxylcytosine. J. Am. Chem. Soc. 136, 4801−4804.
(62) Wojciechowski, M., Rafalski, D., Kucharski, R., Misztal, K., Maleszka, J., Bochtler, M., and Maleszka, R. (2014) Insights into DNA hydroxymethylation in the honeybee from in-depth analyses of TET dioxygenase. Open Biol. 4, 140110.
(63) Delatte, B., Wang, F., Ngoc, L. V., Collignon, E., Bonvin, E., Deplus, R., Calonne, E., Hassabi, B., Putmans, P., Awe, S., Wetzel, C., Kreher, J., Soin, R., Creppe, C., Limbach, P. A., Gueydan, C., Kruys, V., Brehm, A., Minakhina, S., Defrance, M., Steward, R., and Fuks, F. (2016) Transcriptome-wide distribution and function of RNA hydroxymethylcytosine. Science 351, 282−285.
(64) Zhang, G., Huang, H., Liu, D., Cheng, Y., Liu, X., Zhang, W., Yin, R., Zhang, D., Zhang, P., Liu, J., Li, C., Liu, B., Luo, Y., Zhu, Y., Zhang, N., He, S., He, C., Wang, H., and Chen, D. (2015) N6-methyladenine DNA modification in Drosophila. Cell 161, 893−906.
(65) Pfaffeneder, T., Spada, F., Wagner, M., Brandmayr, C., Laube, S. K., Eisen, D., Truss, M., Steinbacher, J., Hackner, B., Kotljarova, O., Schuermann, D., Michalakis, S., Kosmatchev, O., Schiesser, S., Steigenberger, B., Raddaoui, N., Kashiwazaki, G., Muller, U., Spruijt, C. G., Vermeulen, M., Leonhardt, H., Schar, P., Muller, M., and Carell, T. (2014) Tet oxidizes thymine to 5-hydroxymethyluracil in mouse embryonic stem cell DNA. Nat. Chem. Biol. 10, 574−581.
(66) Hashimoto, H., Zhang, X., Vertino, P. M., and Cheng, X. (2015) The Mechanisms of Generation, Recognition, and Erasure of DNA 5-Methylcytosine and Thymine Oxidations. J. Biol. Chem. 290, 20723−20733.
(67) Hu, L., Li, Z., Cheng, J., Rao, Q., Gong, W., Liu, M., Shi, Y. G., Zhu, J., Wang, P., and Xu, Y. (2013) Crystal structure of TET2-DNA complex: insight into TET-mediated 5mC oxidation. Cell 155, 1545−1555.
(68) Iyer, L. M., Abhiman, S., and Aravind, L. (2011) Natural history of eukaryotic DNA methylation systems. Prog. Mol. Biol. Transl. Sci. 101, 25−104.
(69) Shukla, A., Sehgal, M., and Singh, T. R. (2015) Hydroxymethylation and its potential implication in DNA repair system: a review and future perspectives. Gene 564, 109−118.
(70) Akahori, H., Guindon, S., Yoshizaki, S., and Muto, Y. (2015) Molecular evolution of the TET gene family in mammals. Int. J. Mol. Sci. 16, 28472−28485.
(71) Zhang, W., Xia, W., Wang, Q., Towers, A. J., Chen, J., Gao, R., Zhang, Y., Yen, C. A., Lee, A. Y., Li, Y., Zhou, C., Liu, K., Zhang, J., Gu, T. P., Chen, X., Chang, Z., Leung, D., Gao, S., Jiang, Y. H., and Xie, W. (2016) Isoform switch of TET1 regulates DNA demethylation and mouse development. Mol. Cell 64, 1062−1073.
(72) Liu, D., Li, G., and Zuo, Y. (2018) Function determinants of TET proteins: the arrangements of sequence motifs with specific codes. Briefings Bioinf., bby053.
(73) Ko, M., An, J., Bandukwala, H. S., Chavez, L., Äijö, T., Pastor, W. A., Segal, M. F., Li, H., Koh, K. P., Lähdesmäki, H., Hogan, P. G., Aravind, L., and Rao, A. (2013) Modulation of TET2 expression and 5-methylcytosine oxidation by the CXXC domain protein IDAX. Nature 497, 122−126.

(74) Montagner, S., Leoni, C., Emming, S., Della Chiara, G., Balestrieri, C., Barozzi, I., Piccolo, V., Togher, S., Ko, M., Rao, A., Natoli, G., and Monticelli, S. (2016) TET2 regulates mast cell differentiation and proliferation through catalytic and non-catalytic activities. Cell Rep. 15, 1566−1579.
(75) Jin, S. G., Zhang, Z. M., Dunwell, T. L., Harter, M. R., Wu, X., Johnson, J., Li, Z., Liu, J., Szabó, P. E., Lu, Q., Xu, G. L., Song, J., and Pfeifer, G. P. (2016) Tet3 reads 5-carboxylcytosine through its CXXC domain and is a potential guardian against neurodegeneration. Cell Rep. 14, 493−505.
(76) Dawlaty, M. M., Breiling, A., Le, T., Barrasa, M. I., Raddatz, G., Gao, Q., Powell, B. E., Cheng, A. W., Faull, K. F., Lyko, F., and Jaenisch, R. (2014) Loss of Tet enzymes compromises proper differentiation of embryonic stem cells. Dev. Cell 29, 102−111.
(77) Bachman, M., Uribe-Lewis, S., Yang, X., Burgess, H. E., Iurlaro, M., Reik, W., Murrell, A., and Balasubramanian, S. (2015) 5-Formylcytosine can be a stable DNA modification in mammals. Nat. Chem. Biol. 11, 555−557.
(78) Bachman, M., Uribe-Lewis, S., Yang, X., Williams, M., Murrell, A., and Balasubramanian, S. (2014) 5-Hydroxymethylcytosine is a predominantly stable DNA modification. Nat. Chem. 6, 1049−1055.
(79) Booth, M. J., Branco, M. R., Ficz, G., Oxley, D., Krueger, F., Reik, W., and Balasubramanian, S. (2012) Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934−937.
(80) Song, C. X., Szulwach, K. E., Dai, Q., Fu, Y., Mao, S. Q., Lin, L., Street, C., Li, Y., Poidevin, M., Wu, H., Gao, J., Liu, P., Li, L., Xu, G. L., Jin, P., and He, C. (2013) Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell 153, 678−691.
(81) Hashimoto, H., Olanrewaju, Y. O., Zheng, Y., Wilson, G. G., Zhang, X., and Cheng, X. (2014) Wilms tumor protein recognizes 5-carboxylcytosine within a specific DNA sequence. Genes Dev. 28, 2304−2313.
(82) Mellen, M., Ayata, P., Dewell, S., Kriaucionis, S., and Heintz, N. (2012) MeCP2 binds to 5hmC enriched within active genes and accessible chromatin in the nervous system. Cell 151, 1417−1430.
(83) Yildirim, O., Li, R., Hung, J. H., Chen, P. B., Dong, X., Ee, L. S., Weng, Z., Rando, O. J., and Fazzio, T. G. (2011) Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells. Cell 147, 1498−1510.
(84) Ko, M., An, J., Pastor, W. A., Koralov, S. B., Rajewsky, K., and Rao, A. (2015) TET proteins and 5-methylcytosine oxidation in hematological cancers. Immunol. Rev. 263, 6−21.
(85) Cartron, P. F., Nadaradjane, A., Lepape, F., Lalier, L., Gardie, B., and Vallette, F. M. (2013) Identification of TET1 Partners That Control Its DNA-Demethylating Function. Genes Cancer 4, 235−241.
(86) Rasmussen, K. D., and Helin, K. (2016) Role of TET enzymes in DNA methylation, development, and cancer. Genes Dev. 30, 733−750.
(87) Tamanaha, E., Guan, S., Marks, K., and Saleh, L. (2016) Distributive processing by the iron(II)/alpha-ketoglutarate-dependent catalytic domains of the TET enzymes is consistent with epigenetic roles for oxidized 5-methylcytosine bases. J. Am. Chem. Soc. 138, 9345−9348.
(88) Crawford, D. J., Liu, M. Y., Nabel, C. S., Cao, X.-J., Garcia, B. A., and Kohli, R. M. (2016) Tet2 catalyzes stepwise 5-methylcytosine oxidation by an iterative and de novo mechanism. J. Am. Chem. Soc. 138, 730−733.
(89) Xiong, J., Zhang, Z., Chen, J., Huang, H., Xu, Y., Ding, X., Zheng, Y., Nishinakamura, R., Xu, G. L., Wang, H., Chen, S., Gao, S., and Zhu, B. (2016) Cooperative action between SALL4A and TET proteins in stepwise oxidation of 5-methylcytosine. Mol. Cell 64, 913−925.
(90) Blaschke, K., Ebata, K. T., Karimi, M. M., Zepeda-Martínez, J. A., Goyal, P., Mahapatra, S., Tam, A., Laird, D. J., Hirst, M., Rao, A., Lorincz, M. C., and Ramalho-Santos, M. (2013) Vitamin C induces Tet-dependent DNA demethylation and a blastocyst-like state in ES cells. Nature 500, 222−226.

(91) Minor, E. A., Court, B. L., Young, J. I., and Wang, G. (2013) Ascorbate induces ten-eleven translocation (Tet) methylcytosine dioxygenase-mediated generation of 5-hydroxymethylcytosine. J. Biol. Chem. 288, 13669−13674.
(92) Yin, R., Mao, S. Q., Zhao, B., Chong, Z., Yang, Y., Zhao, C., Zhang, D., Huang, H., Gao, J., Li, Z., Jiao, Y., Li, C., Liu, S., Wu, D., Gu, W., Yang, Y. G., Xu, G. L., and Wang, H. (2013) Ascorbic acid enhances Tet-mediated 5-methylcytosine oxidation and promotes DNA demethylation in mammals. J. Am. Chem. Soc. 135, 10396−10403.
(93) de Jong, L., Albracht, S. P., and Kemp, A. (1982) Prolyl 4-hydroxylase activity in relation to the oxidation state of enzyme-bound iron. The role of ascorbate in peptidyl proline hydroxylation. Biochim. Biophys. Acta, Protein Struct. Mol. Enzymol. 704, 326−332.
(94) Myllylä, R., Majamaa, K., Günzler, V., Hanauske-Abel, H. M., and Kivirikko, K. I. (1984) Ascorbate is consumed stoichiometrically in the uncoupled reactions catalyzed by prolyl 4-hydroxylase and lysyl hydroxylase. J. Biol. Chem. 259, 5403−5405.
(95) Hanauske-Abel, H. M., and Günzler, V. (1982) A stereochemical concept for the catalytic mechanism of prolylhydroxylase: applicability to classification and design of inhibitors. J. Theor. Biol. 94, 421−455.
(96) Mitchell, A. J., Dunham, N. P., Martinie, R. J., Bergman, J. A., Pollock, C. J., Hu, K., Allen, B. D., Chang, W. C., Silakov, A., Bollinger, J. M., Jr., Krebs, C., and Boal, A. K. (2017) Visualizing the Reaction Cycle in an Iron(II)- and 2-(Oxo)-glutarate-Dependent Hydroxylase. J. Am. Chem. Soc. 139, 13830−13836.
(97) Ye, S., Riplinger, C., Hansen, A., Krebs, C., Bollinger, J. M., Jr., and Neese, F. (2012) Electronic structure analysis of the oxygen-activation mechanism by Fe(II)- and alpha-ketoglutarate (alphaKG)-dependent dioxygenases. Chem. - Eur. J. 18, 6555−6567.
(98) Hashimoto, H., Pais, J. E., Dai, N., Corrêa, I. R., Jr., Zhang, X., Zheng, Y., and Cheng, X. (2015) Structure of Naegleria Tet-like dioxygenase (NgTet1) in complexes with a reaction intermediate 5-hydroxymethylcytosine DNA. Nucleic Acids Res. 43, 10713−10721.
(99) Hu, L., Lu, J., Cheng, J., Rao, Q., Li, Z., Hou, H., Lou, Z., Zhang, L., Li, W., Gong, W., Liu, M., Sun, C., Yin, X., Li, J., Tan, X., Wang, P., Wang, Y., Fang, D., Cui, Q., Yang, P., He, C., Jiang, H., Luo, C., and Xu, Y. (2015) Structural insight into substrate preference for TET-mediated oxidation. Nature 527, 118−122.
(100) Aik, W., McDonough, M. A., Thalhammer, A., Chowdhury, R., and Schofield, C. J. (2012) Role of the jelly-roll fold in substrate binding by 2-oxoglutarate oxygenases. Curr. Opin. Struct. Biol. 22, 691−700.
(101) Upadhyay, A. K., Horton, J. R., Zhang, X., and Cheng, X. (2011) Coordinated methyl-lysine erasure: structural and functional linkage of a Jumonji demethylase domain and a reader domain. Curr. Opin. Struct. Biol. 21, 750−760.
(102) Egloff, S., and Murphy, S. (2008) Cracking the RNA polymerase II CTD code. Trends Genet. 24, 280−288.
(103) Sims, R. J., III, Rojas, L. A., Beck, D., Bonasio, R., Schüller, R., Drury, W. J., III, Eick, D., and Reinberg, D. (2011) The C-terminal domain of RNA polymerase II is modified by site-specific methylation. Science 332, 99−103.
(104) Hausinger, R. P. (2004) Fe(II)/alpha-ketoglutarate-dependent hydroxylases and related enzymes. Crit. Rev. Biochem. Mol. Biol. 39, 21−68.
(105) Chang, W. C., Guo, Y., Wang, C., Butch, S. E., Rosenzweig, A. C., Boal, A. K., Krebs, C., and Bollinger, J. M., Jr. (2014) Mechanism of the C5 stereoinversion reaction in the biosynthesis of carbapenem antibiotics. Science 343, 1140−1144.
(106) Clifton, I. J., Doan, L. X., Sleeman, M. C., Topf, M., Suzuki, H., Wilmouth, R. C., and Schofield, C. J. (2003) Crystal structure of carbapenem synthase (CarC). J. Biol. Chem. 278, 20843−20850.
(107) Wilmouth, R. C., Turnbull, J. J., Welford, R. W., Clifton, I. J., Prescott, A. G., and Schofield, C. J. (2002) Structure and mechanism of anthocyanidin synthase from Arabidopsis thaliana. Structure 10, 93−103.

(108) Lu, J., Hu, L., Cheng, J., Fang, D., Wang, C., Yu, K., Jiang, H., Cui, Q., Xu, Y., and Luo, C. (2016) A computational investigation on the substrate preference of ten-eleven-translocation 2 (TET2). Phys. Chem. Chem. Phys. 18, 4728−4738.
(109) Liu, M. Y., Torabifard, H., Crawford, D. J., DeNizio, J. E., Cao, X. J., Garcia, B. A., Cisneros, G. A., and Kohli, R. M. (2017) Mutations along a TET2 active site scaffold stall oxidation at 5-hydroxymethylcytosine. Nat. Chem. Biol. 13, 181−187.
(110) Iyer, L. M., Zhang, D., Burroughs, A. M., and Aravind, L. (2013) Computational identification of novel biochemical systems involved in oxidation, glycosylation and other complex modifications of bases in DNA. Nucleic Acids Res. 41, 7635−7655.
(111) Liu, C. K., Hsu, C. A., and Abbott, M. T. (1973) Catalysis of three sequential dioxygenase reactions by thymine 7-hydroxylase. Arch. Biochem. Biophys. 159, 180−187.
(112) Lee, Y. J., Dai, N., Walsh, S. E., Müller, S., Fraser, M. E., Kauffman, K. M., Guan, C., Corrêa, I. R., Jr., and Weigele, P. R. (2018) Identification and biosynthesis of thymidine hypermodifications in the genomic DNA of widespread bacterial viruses. Proc. Natl. Acad. Sci. U. S. A. 115, E3116−E3125.
(113) Shafiee, A., Motamedi, H., and Chen, T. (1994) Enzymology of FK-506 biosynthesis: purification and characterization of 31-O-desmethylFK-506 O:methyltransferase from Streptomyces sp. MA6858. Eur. J. Biochem. 225, 755−764.
(114) Paez-Espino, D., Chen, I. A., Palaniappan, K., Ratner, A., Chu, K., Szeto, E., Pillay, M., Huang, J., Markowitz, V. M., Nielsen, T., Huntemann, M., Reddy, T. B. K., Pavlopoulos, G. A., Sullivan, M. B., Campbell, B. J., Chen, F., McMahon, K., Hallam, S. J., Denef, V., Cavicchioli, R., Caffrey, S. M., Streit, W. R., Webster, J., Handley, K. M., Salekdeh, G. H., Tsesmetzis, N., Setubal, J. C., Pope, P. B., Liu, W. T., Rivers, A. R., Ivanova, N. N., and Kyrpides, N. C. (2017) IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Res. 45, D457−D465.
(115) Salje, J. (2010) Plasmid segregation: how to survive as an extra piece of DNA. Crit. Rev. Biochem. Mol. Biol. 45, 296−317.
(116) Iyer, L. M., Zhang, D., de Souza, R. F., Pukkila, P. J., Rao, A., and Aravind, L. (2014) Lineage-specific expansions of TET/JBP genes and a new class of DNA transposons shape fungal genomic and epigenetic landscapes. Proc. Natl. Acad. Sci. U. S. A. 111, 1676−1683.
(117) Weigele, P., and Raleigh, E. A. (2016) Biosynthesis and function of modified bases in bacteria and their viruses. Chem. Rev. 116, 12655−12687.
(118) Hardy, L. W., and Nalivaika, E. (1992) Asn177 in Escherichia coli thymidylate synthase is a major determinant of pyrimidine specificity. Proc. Natl. Acad. Sci. U. S. A. 89, 9725−9729.
(119) Matthews, D. A., Appelt, K., Oatley, S. J., and Xuong, N. H. (1990) Crystal structure of Escherichia coli thymidylate synthase containing bound 5-fluoro-2’-deoxyuridylate and 10-propargyl-5,8-dideazafolate. J. Mol. Biol. 214, 923−936.
(120) Graves, K. L., Butler, M. M., and Hardy, L. W. (1992) Roles of Cys148 and Asp179 in catalysis by deoxycytidylate hydroxymethylase from bacteriophage T4 examined by site-directed mutagenesis. Biochemistry 31, 10315−10321.
(121) Robert, X., and Gouet, P. (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res. 42, W320−324.
(122) Yu, M., Hon, G. C., Szulwach, K. E., Song, C.-X., Zhang, L., Kim, A., Li, X., Dai, Q., Shen, Y., Park, B., Min, J.-H., Jin, P., Ren, B., and He, C. (2012) Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368−1380.
(123) Zhang, L., Szulwach, K. E., Hon, G. C., Song, C. X., Park, B., Yu, M., Lu, X., Dai, Q., Wang, X., Street, C. R., Tan, H., Min, J. H., Ren, B., Jin, P., and He, C. (2013) Tet-mediated covalent labeling of 5-methylcytosine for its genome-wide detection and sequencing. Nat. Commun. 4, 1517.
(124) Clark, T. A., Lu, X., Luong, K., Dai, Q., Boitano, M., Turner, S. W., He, C., and Korlach, J. (2013) Enhanced 5-methylcytosine detection in single-molecule, real-time sequencing via Tet1 oxidation. BMC Biol. 11, 4.


(125) Song, C. X., Clark, T. A., Lu, X. Y., Kislyuk, A., Dai, Q., Turner, S. W., He, C., and Korlach, J. (2012) Sensitive and specific single-molecule sequencing of 5-hydroxymethylcytosine. Nat. Methods 9, 75−77.
(126) Vaisvila, R., Sun, Z., Guan, S., Saleh, L., Ettwiller, L., and Davis, T. B. Compositions and methods for analyzing modified nucleotides. U.S. Pat. Appl. 15/441431, July 13, 2017.
(127) Vaisvila, R., Davis, T. B., Guan, S., Sun, Z., Ettwiller, L., and Saleh, L. Compositions and methods for analyzing modified nucleotides. U.S. Pat. Appl. 15/893373, June 21, 2018.
(128) Carpenter, M. A., Li, M., Rathore, A., Lackey, L., Law, E. K., Land, A. M., Leonard, B., Shandilya, S. M., Bohn, M. F., Schiffer, C. A., Brown, W. L., and Harris, R. S. (2012) Methylcytosine and normal cytosine deamination by the foreign DNA restriction enzyme APOBEC3A. J. Biol. Chem. 287, 34801−34808.
(129) Wijesinghe, P., and Bhagwat, A. S. (2012) Efficient deamination of 5-methylcytosines in DNA by human APOBEC3A, but not by AID or APOBEC3G. Nucleic Acids Res. 40, 9206−9217.
(130) Schutsky, E. K., Nabel, C. S., Davis, A. K. F., DeNizio, J. E., and Kohli, R. M. (2017) APOBEC3A efficiently deaminates methylated, but not TET-oxidized, cytosine bases in DNA. Nucleic Acids Res. 45, 7655−7665.
(131) Schutsky, E. K., DeNizio, J. E., Hu, P., Liu, M. Y., Nabel, C. S., Fabyanic, E. B., Hwang, Y., Bushman, F. D., Wu, H., and Kohli, R. M. (2018) Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nat. Biotechnol. 36, 1083−1090.


DOI: 10.1021/acs.biochem.8b01185 Biochemistry 2019, 58, 450−467