This is an open access article published under a Creative Commons Non-Commercial No Derivative Works (CC-BY-NC-ND) Attribution License, which permits copying and redistribution of the article, and creation of adaptations, all for non-commercial purposes.

Review pubs.acs.org/CR

Biosynthesis and Function of Modified Bases in Bacteria and Their Viruses

Peter Weigele
Chemical Biology, New England Biolabs, Ipswich, Massachusetts 01938, United States

Elisabeth A. Raleigh*
Research, New England Biolabs, Ipswich, Massachusetts 01938, United States

Supporting Information

ABSTRACT: Naturally occurring modification of the canonical A, G, C, and T bases can be found in the DNA of cellular organisms and viruses from all domains of life. Bacterial viruses (bacteriophages) are a particularly rich but still underexploited source of such modified variant nucleotides. The modifications conserve the coding and base-pairing functions of DNA, but add regulatory and protective functions. In prokaryotes, modified bases appear primarily to be part of an arms race between bacteriophages (and other genomic parasites) and their hosts, although, as in eukaryotes, some modifications have been adapted to convey epigenetic information. The first half of this review catalogs the identification and diversity of DNA modifications found in bacteria and bacteriophages. What is known about the biogenesis, context, and function of these modifications is also described. The second part of the review places these DNA modifications in the context of the arms race between bacteria and bacteriophages. It focuses particularly on the defense and counter-defense strategies that turn on direct recognition of the presence of a modified base. Where modification has been shown to affect other DNA transactions, such as expression and chromosome segregation, that is summarized, with reference to recent reviews.

CONTENTS
1. Introduction
  1.1. Early Observations of Modified Bases in Prokaryote and Viral DNA
  1.2. Detection and Analysis of Modified Nucleobases
2. Modified Nucleobases Produced by DNA Methyltransferases
  2.1. Protective and Regulatory Functions of DNA-MTs
  2.2. DNA Methyltransferase Structure and Function
  2.3. DNA Methyltransferase Mechanism
3. Modified Nucleobases Found in Bacteriophages
  3.1. Modified Purines in Phages
    3.1.1. N6-Carbamoyl-methyladenine
    3.1.2. 2-Aminoadenine
    3.1.3. 7-Methylguanine
    3.1.4. Deoxyarchaeosine
  3.2. Phage Modified Pyrimidines
    3.2.1. Deoxyuracil (dU)
    3.2.2. 5-Hydroxymethyldeoxyuracil
    3.2.3. Hypermodified Thymidines
    3.2.4. 5-Dihydroxypentauracil
    3.2.5. 5-Methylcytosine
    3.2.6. hm5C and Glucosyl-hm5C of T-Even Phages
    3.2.7. 5-Hydroxycytosine
4. Central Role of Deoxypyrimidine Nucleotide Monophosphate (Hydroxy) Methyltransferases in Generating Modified Pyrimidines
  4.1. Enzymatic Pyrimidine C5 Modification: U versus C, Methyl versus Hydroxymethyl
  4.2. Phylogenetic and Functional Clustering of dYMP (Hydroxy)methyltransferases
  4.3. Phages with Potentially Undiscovered hm5dC Modifications
5. Arms Race: The Biology of Modification and Restriction
  5.1. Modification Protects from Restriction and Causes Sensitivity to It
    5.1.1. RM Type Summary
    5.1.2. Other Reviews: Perspectives on RM Systems
    5.1.3. Interaction of RM Types with Biological Base Modifications
  5.2. Role of Modifications in Virulent Phage Life Cycle
  5.3. Role of Modifications in Temperate Phage Life Cycle
  5.4. Modification Facilitating Migration of RM Systems
    5.4.1. Lysogenic Conversion: EcoP1I and EcoGIII
    5.4.2. Replacement Cassettes DpnI/DpnII/DpnIII
  5.5. Orphan Modifying Enzymes
    5.5.1. Lineage-Conserved Orphan Methyltransferases
    5.5.2. Eroded RM Systems
    5.5.3. Migratory Orphans: Prophage and Plasmid Orphan Modifying Enzymes
  5.6. Restricting Modified DNA: DNA Binding Domain Fusions
    5.6.1. McrBC: DUF3578 DBD-Translocase Fusion, PD(D/E)XK Separate Protein
    5.6.2. MspJI Family: SRA DBD-Mrr-Cat Fusion
    5.6.3. PvuRts1I Family: PD(D/E)XK-SRA DBD Fusion
    5.6.4. Sco5333: SRA DBD-HNH Fusion
    5.6.5. EcoKMrr: Mrr-N DBD-Mrr-Cat Fusion
    5.6.6. EcoKMcrA: EcoMcrA-N DBD-HNH Fusion
    5.6.7. ScoA3McrA: ScoA3McrA(N) DBD-HNH
    5.6.8. GmrSD Family: ParB/Srx DBD-HNH Fusion
    5.6.9. SauUSI PLDc-Helicase-DUF3427 DBD Fusion
    5.6.10. DpnI: PD(D/E)XK-Winged Helix DBD
    5.6.11. GlaI Family, Unidentified Domains
  5.7. Restricting Modified DNA with DNA Repair Enzymes
    5.7.1. Repair Glycosylase UDG
    5.7.2. Repair Nuclease Nfi (Endonuclease V)
  5.8. Inhibition of RE Action
6. Future Directions
Associated Content
  Supporting Information
Author Information
  Corresponding Author
  Notes
  Biographies
Acknowledgments
Abbreviations
References

© 2016 American Chemical Society
Special Issue: Genome Modifying Mechanisms
Received: February 12, 2016
Published: June 20, 2016
Chemical Reviews
DOI: 10.1021/acs.chemrev.6b00114 Chem. Rev. 2016, 116, 12655−12687

1. INTRODUCTION

DNA is more than just combinations of A, G, C, and T. Naturally occurring variations of the canonical nucleotides can be found in the DNA of cellular organisms and viruses from all domains of life. A variety of chemical groups can be biologically appended to the nucleobase portion of a nucleotide, ranging from simple methyl groups in cellular organisms and their viruses, to amino acids, polyamines, monosaccharides, and disaccharides as found in viruses of bacteria. These modifications do not alter the specificity of base pairing; rather they are interpreted by cells, viruses, and mobile DNAs in a context-specific fashion through the interaction of cellular and viral encoded proteins with the modified DNA to distinguish self from nonself, protect DNA from being degraded, and/or control gene regulation. In eukaryotes, DNA modification can profoundly influence gene expression in certain contexts. Because such modifications to the DNA can alter the phenotypic expression of a genome without altering the genotype, per se, the biological information carried by DNA modification (as well as histone modification and other cellular phenomena) is, by convention, referred to as the organism’s “epigenome.” In prokaryotes and their viruses, modified nucleobases appear primarily to be part of an arms race between viruses (and other genomic parasites) and their hosts, though epigenetic functions of modified DNA in bacteria are also beginning to be elucidated.

1.1. Early Observations of Modified Bases in Prokaryote and Viral DNA

The first modified nucleoside, 5-methylcytosine (m5C), was observed in 1925 by Johnson and Coghill in the DNA of Mycobacterium tuberculosis.1 m5C was not confirmed in bacteria again until 1965 by Doskočil and Šormová,2,3 though it had been found in bacteriophage λ.4,5 N6-Methyladenine (m6A) was shown by Dunn and Smith in 1955 to be a minor component of bacterial and bacteriophage DNAs.6,7 N4-Methylcytosine was first shown in 1983 to be a minor component in Bacillus DNA8 and then later shown to be widespread among thermophilic9 and mesophilic bacteria.10 Structures of these nucleobases are shown in Figure 1. Wyatt and Cohen observed complete substitution of cytosine by 5-hydroxymethylcytosine (hm5C) in the T-even bacteriophages,11 and a subset of these bases were subsequently shown to be glucosylated.12

Figure 1. Methylated bases found in bacteria and their viruses.

An understanding of a biological role for methylated bases in bacteria did not emerge until decades later. In the late 1960s and early 1970s, Werner Arber and others noticed that bacteriophages could retain a “memory” of their most recent host that could profoundly affect their ability to infect closely related bacterial strains.13 The basis of this memory was shown to be the methylation of specific DNA sequences by a host-encoded enzyme.14 The methylation of the DNA was essential to the viability of the phage in subsequent rounds of infection on the same host. This phenomenon of host-controlled modification led to the discovery of restriction enzymes, for which Werner Arber, Daniel Nathans, and Hamilton O. Smith shared a Nobel Prize in 1978.
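
The logic of host-controlled modification described in section 1.1 can be illustrated with a toy model. The sketch below is not drawn from the review: the function names are hypothetical, the EcoRI recognition sequence GAATTC is used only as a familiar example, and methylation is simplified to a set of marked site positions on one strand.

```python
# Toy model of host-controlled modification and restriction.
# All names and the example sequence are illustrative assumptions.

def find_sites(dna: str, site: str) -> list[int]:
    """Return the 0-based start of every occurrence of `site` in `dna`."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def methylate(dna: str, site: str) -> set[int]:
    """Host methyltransferase: mark every recognition site as modified."""
    return set(find_sites(dna, site))


def cleavage_positions(dna: str, site: str, methylated: set[int]) -> list[int]:
    """Host endonuclease: sites lacking the protective mark are cut."""
    return [p for p in find_sites(dna, site) if p not in methylated]


SITE = "GAATTC"  # EcoRI recognition sequence, used only as an example
phage_dna = "TTGAATTCAAGGGAATTCTT"

naive_marks: set[int] = set()               # phage last grown on a host lacking the system
adapted_marks = methylate(phage_dna, SITE)  # phage last grown on the modifying host

print(cleavage_positions(phage_dna, SITE, naive_marks))    # sites exposed to cleavage
print(cleavage_positions(phage_dna, SITE, adapted_marks))  # no exposed sites
```

In this simplified picture the phage "memory" is just the set of marks inherited from the previous host; a phage carrying them has no exposed sites, while an unmarked phage is cleaved at every site.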

1.2. Detection and Analysis of Modified Nucleobases

A variety of techniques have been used historically for the detection and characterization of modified nucleotides. DNA samples first had to be decomposed to individual nucleotides, usually by harsh chemical hydrolysis, such as boiling in hydrochloric or formic acid.15 Gentler, more physiological methods employed mixtures of enzymes such as DNase I from bovine pancreas, snake venom phosphodiesterase, and S1 nuclease, and where dephosphorylation was required, “zinc activated” bacterial alkaline phosphatase (BAP) was used.16,17

Chemically or enzymatically prepared nucleotide/nucleoside mixtures could then be applied to various separation methods. The pioneering method, paper chromatography, was succeeded by thin-layer chromatography (TLC) with derivatized cellulose on a glass support. This offered improved resolution and flexibility, based on the same principles.18 Mobilities and positions of nucleotides could be visualized by chemical stains applied to plates after separation, or, in the case of metabolically labeled material, by autoradiography. As such, TLC was used for both analytical and preparative separations. After separation of an unknown nucleotide, the material was scraped off the plates and purified from the stationary support. Nucleotides could be subjected to further analyses, such as NMR, or a series of chemical decompositions to determine the presence of characteristic chemical groups supporting identification of the nucleotide as a purine or pyrimidine with reactive and/or protective substituents. Anion exchange columns, typically DEAE cellulose, have also been used for analytical and preparative separations of nucleosides.19

The introduction of high performance liquid chromatography (HPLC)20,21 and, later, mass spectrometry (MS)22 (reviewed in ref 23) to nucleotide analysis has greatly increased the resolution of nucleoside separations and purifications. Guided by some basic principles, empirical methods can be employed to determine buffer systems giving optimal separation during HPLC. Nucleoside standards with known retention times aid in the identification of sample composition, either by matching a sample peak to a known nucleoside species, or excluding a sample peak from known nucleosides.
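
The use of standards with known retention times can be sketched in a few lines. This is a minimal illustration, not a method from the review: the function name, the tolerance, and all retention times are invented, and real values depend on the column and buffer system.

```python
def match_peaks(sample_peaks, standards, tolerance=0.05):
    """Map each sample retention time (min) to the nearest standard.

    A peak is assigned only if it lies within `tolerance` minutes of a
    standard; otherwise it maps to None, marking a candidate unknown.
    """
    assignments = {}
    for rt in sample_peaks:
        name, ref_rt = min(standards.items(), key=lambda kv: abs(kv[1] - rt))
        assignments[rt] = name if abs(ref_rt - rt) <= tolerance else None
    return assignments


# Hypothetical retention times in minutes (assumed values for illustration)
standards = {"dC": 4.10, "m5dC": 5.85, "dG": 7.20, "dT": 8.05, "dA": 9.60}
sample_peaks = [4.12, 5.83, 6.50, 9.58]

result = match_peaks(sample_peaks, standards)
for rt, name in result.items():
    print(f"{rt:.2f} min -> {name or 'no matching standard'}")
```

A peak that matches no standard, like the one at 6.50 min here, is the kind of signal that would prompt preparative isolation and structural analysis.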

Additionally, HPLC can be used for the preparative separation of nucleosides for downstream structural determination by NMR.24 Although HPLC appears to be the current separation method of choice based on speed, reproducibility, scalability, and easy recovery of material, TLC remains a powerful technique for the analysis of modified nucleotides, particularly in cases where there are no known standards for the unknown modified base, or where the amounts of material are very low relative to the background nucleotides of the sample. A “postlabeling” method enables the experimenter to visualize material present in trace quantities.25,26 Here, DNA samples are first decomposed to individual nucleosides and then rephosphorylated using broad-specificity nucleotide kinases and radioactive ATP substrates. This technique has enabled the detection and visualization of trace quantities of modified nucleotides and also various adducts formed as a result of DNA damage.

Capillary electrophoresis (CE) has been adapted to the study of modified nucleotides.27,28 As with the postlabeling technique, free nucleosides are derivatized to facilitate detection. In this case, the nucleosides are reacted with N-hydroxysuccinimidyl (NHS) esters bearing fluorescent moieties. Due to differences in polarity, mass, and hydrophobicity, the nucleosides themselves have characteristic electrophoretic mobilities through the capillary bed, and because of the fluorescent label, they can be readily detected by the lasers typically found in CE instrumentation.

The strength of conclusions derived from these methods relies greatly on the availability of nucleotide standards. Standards can be purified from biological sources, or in the case of a hypothetical nucleotide modification, the molecule can be built by organic synthesis and then run through the battery of analyses as a candidate standard. If a synthesized molecule behaves the same as the nucleotide under investigation (e.g., same retention time, same mass, same reactive side groups), the identity of the modification can be supported.

More recently, single molecule real-time (SMRT) sequencing technologies have been employed for the detection of modified nucleotides at base resolution in bacterial genomic DNA.29,30 SMRT sequencing optically monitors the kinetics of fluorescently tagged nucleotide incorporation by an engineered DNA polymerase as it progresses along a template strand. When the sequencing polymerase encounters modified bases on the template DNA, the kinetics of the polymerization reaction are altered, and a measurable stalling is observed as the polymerase ratchets to the next base. Machine learning algorithms trained on well-characterized templates containing known modifications can then be used to identify modified bases in experimental samples. Using this methodology, researchers have been able to characterize the sequence-specific distribution of methylated bases (i.e., the “methylome”) of a variety of bacterial strains including important pathogens such as Helicobacter,31 Campylobacter,32 and Salmonella.33
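
The kinetic signal used in SMRT sequencing is commonly summarized as an interpulse duration (IPD) ratio: the mean delay between incorporations at a template position in the native sample, divided by the same quantity for an unmodified control. The sketch below assumes that summary and uses a fixed threshold as a simple stand-in for the trained classifiers mentioned above; the function names, threshold, and all numbers are invented for illustration.

```python
from statistics import mean


def ipd_ratios(native_ipds, control_ipds):
    """Per-position ratio of mean IPD in the native sample to an unmodified control."""
    return [mean(n) / mean(c) for n, c in zip(native_ipds, control_ipds)]


def flag_positions(ratios, threshold=2.0):
    """Return template positions whose IPD ratio meets or exceeds `threshold`."""
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Made-up interpulse durations (seconds) for three template positions,
# each observed in three reads (assumed values, not real instrument output).
native_ipds = [[0.4, 0.5, 0.6], [1.9, 2.1, 2.0], [0.5, 0.5, 0.5]]
control_ipds = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.4, 0.6]]

ratios = ipd_ratios(native_ipds, control_ipds)
print(flag_positions(ratios))  # positions with slowed polymerase kinetics
```

Only the middle position, where the polymerase is roughly four times slower on native DNA than on the control, is flagged as a candidate modified base.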

2. MODIFIED NUCLEOBASES PRODUCED BY DNA METHYLTRANSFERASES

In bacteria, the most common DNA modifications are associated with restriction−modification (RM) systems, wherein an endonuclease and a protective methylation function are grouped together either in a single multidomain protein or in a multigene operon, or alternatively with free-standing methyltransferases involved in epigenetic regulation of cell function, such as virulence. These protective and/or epigenetic functions are provided by the action of methyltransferases with three products: m4C, m6A, and m5C (Figure 1).

2.1. Protective and Regulatory Functions of DNA-MTs

Methylation of DNA bases serves a variety of roles in bacteria. Methyltransferase genes are often paired with a cognate restriction endonuclease gene in bacterial genomes to protect cellular DNA from endonuclease-mediated cleavage used to destroy bacteriophages and other forms of invading DNA. Other functions of methylation are also beginning to be understood, including epigenetic regulation of pathogenesis, control of DNA replication, and directing strand specificity during DNA repair of mismatches deriving from errors of replication and oxidative damage. Bacteriophages can encode methyltransferases as well as restriction endonucleases and are thus obvious vehicles for the horizontal transfer of methyltransferases, as well as a component of epigenetic regulation within their hosts while in the lysogenic state (i.e., integrated into the host genome as a prophage). Such scenarios are discussed in the latter half of this review (see section 5.4 in particular).

2.2. DNA Methyltransferase Structure and Function

The three DNA methyltransferase types (i.e., the m6A-, m4C-, and m5C-methyltransferases) are homologous, with those enzymes catalyzing formation of an exocyclic N−C bond (m6A, m4C) being more closely related. Within the DNA methyltransferases, there are conserved subdomains and motifs important for target recognition, SAM binding, and catalysis.34 The overall architecture of DNA methyltransferases is remarkably conserved: the structural core is composed of a seven-stranded β-sheet containing six parallel strands and a seventh antiparallel strand inserted between the fifth and sixth strands. A bend in the β-sheet between strands four and one occurs at a DNA binding cleft. The catalytic residues and SAM binding sites are arranged in proximity. α-Helices pack against the front and back faces of the sheet. Though this overall strand topology is conserved, the connectivity is not. DNA methyltransferases display a remarkable diversity of circular permutation. Indeed, engineered circular permutations have been constructed and shown to be active, suggesting a viable evolutionary path that could account for the varied order of conserved structural motifs observed in DNA methyltransferases.35,36

2.3. DNA Methyltransferase Mechanism

DNA methyltransferases are bi-bi enzymes catalyzing the transfer of a methyl group from S-adenosyl-L-methionine (SAM) to the exocyclic amine of cytosine (N4) and adenine (N6) or to the C5 of cytosine in situ in the DNA polymer. During the enzyme-catalyzed chemistry of methyl transfer, the target base is flipped out of the DNA helix and bound in the active site of the enzyme. This phenomenon of base flipping was first observed in the crystal structure of the Haemophilus haemolyticus M.HhaI, shown schematically in Figure 2. The base flipping mechanism has subsequently been observed in over 25 different protein/DNA crystal structures, not only of methyltransferases but also of enzymes involved in DNA repair, transcription, and replication, confirming early predictions of the importance of this phenomenon.37,38 Base flipping can occur actively, in which an enzyme couples free energy of binding and conformational changes to the flipping of the base out from the helix, or passively, where an enzyme captures and stabilizes a base that has spontaneously flipped during normal “breathing” of DNA.39

Figure 2. M.HhaI C5 DNA methyltransferase. A cartoon diagram shows the methyltransferase bound to its DNA substrate. α-Helices are colored in red, β-sheets are in yellow, and the DNA helix is in green. The target cytosine is flipped out from the DNA helix through a nearly 180° rotation at the phosphodiester backbone linkage and bound in the active site proximal to SAH, which is shown directly above the flipped out base.

The methylation of a flipped DNA base positioned in the active site of a methyltransferase proceeds via different mechanisms according to the type of transfer product (see Figure 3; reviewed by Jeltsch40). For the aminomethylases, a conserved Tyr or Phe stabilizes the flipped base through stacking interactions, a main chain carbonyl positioned between two Pro residues forms a hydrogen bond with the exocyclic amino group, and a third interaction provided by an Asp, Ser, or Asn leads to deprotonation of the amino group. The activated nitrogen can thus carry out a nucleophilic attack on the methyl electrophile at the C−S bond of SAM, resulting in a methylated base and the formation of S-adenosyl-L-homocysteine (SAH). In the case of the C5 methyltransferases, methylation proceeds via a covalent enzyme−DNA carbanion intermediate formed by the nucleophilic attack of an invariant Cys on the C6 of the target cytosine, followed by attack of the methyl group of SAM by C5. An ensuing β-elimination initiated by deprotonation of C5 results in the release of 5-methylcytosine and SAH.

3. MODIFIED NUCLEOBASES FOUND IN BACTERIOPHAGES

Collectively, bacteriophages contain the greatest diversity of modified bases so far observed in nature. These modified nucleobases include not only the products of host or bacteriophage encoded methyltransferases, but also more unusual modifications not seen in the DNAs of other organisms. The decades of work identifying and characterizing these noncanonical nucleobases have been previously reviewed,41,42 and their genetic basis has more recently been investigated bioinformatically by Aravind and colleagues.43

The general mechanisms by which bacteriophages synthesize modified nucleotides are essentially twofold. Modified nucleotides are primarily generated through the enzymatic modification of the precursors to the DNA polymer: deoxynucleotide monophosphates (dNMPs). Modified dNMPs feed into deoxynucleotide triphosphate (dNTP) pools and become available to DNA polymerases during the lytic cycle of the bacteriophage. This results in a highly substituted phage genomic DNA. Some bacteriophages that incorporate modified nucleotides into their DNA further modify them through secondary modification, or “hypermodification”, of the nucleobase in situ after DNA polymerization.41 The term “hypermodification” was first used to describe base modifications in tRNA44 and later adapted to describe complex modifications in bacteriophage DNA. The hypermodified DNA is then packaged into a protein shell that is a precursor to the mature viral capsid during virion morphogenesis.

Why bacteriophages have so many different modifications is not well understood. A primary function of modifications in phage genomes is to prevent cleavage by host restriction endonucleases. Indeed, many bacteriophage DNAs containing a high percentage of a modified nucleotide are completely resistant to cleavage by a variety of restriction enzymes in vitro. However, DNA modifications in phage DNA are proposed to have other functions such as the regulation of operon expression, as demonstrated in the Bacillus phage SPO1 and the coliphage T4; initiation of DNA packaging into the viral capsid, as shown in coliphage P1;45 and the stability of DNA densely packed within a viral capsid, as seen for the Delftia phage ΦW-14.46 The observed diversity of modifications suggests that bacteriophages have more “freedom” to explore nucleobase variations given that they are partially decoupled from the constraints that govern cells; i.e., bacteriophages draw upon a subset of resources present in a cell, for a limited time span, in order to simply make more phages. This is discussed more extensively in sections 5.2 and 5.3. On the other hand, the modifications also elicit host response in the form
of modification-dependent restriction, as described in section 5.6.

Figure 3. Methyl donor and mechanisms of the DNA methyltransferases. (A) S-Adenosyl-L-methionine and S-adenosyl-L-homocysteine. (B) Reaction mechanism of C5 methylation by HhaI. (C) Reaction mechanism of N6 adenosine methyltransferase.

3.1. Modified Purines in Phages

Bacteriophages are known to have partial and even complete substitution of purines in their DNA by a modified purine. In addition to m6A, there are at least five known modified purines found in bacteriophage DNA, illustrated in Figure 4.

3.1.1. N6-Carbamoyl-methyladenine. The Escherichia coli phage Mu contains an unusual hypermodified purine rendering its DNA resistant to a variety of restriction enzymes.47,48 Approximately 15% of the adenines in this phage are replaced by α-N-(9-D-2′-deoxyribofuranosylpurin-6-yl) glycinamide (or more simply N6-carbamoyl-methyladenine).49 A phage gene, mom, encodes an enzyme responsible for the hypermodification, though the reaction has not been reconstituted in vitro. Recently, the Mom protein has been shown to be a member of the GCN5-related N-acetyltransferase (GNAT) family, suggesting the Mom enzyme uses a coenzyme-A carrier to donate a formamide moiety to an m6A substrate.50

3.1.2. 2-Aminoadenine. In the late 1970s, researchers at the Leningrad State University and Moscow State University isolated the bacteriophage S-2L using a freshwater cyanobacterium of the genus Synechocystis as a host and characterized its virion DNA. They found the adenosine component to be fully substituted by 2,6-diaminopurine.51,52 The genome of this same bacteriophage was sequenced by researchers at the Institut Pasteur (GenBank AX955019.1) and used as the basis of a patent application submitted by them in 2004 (US 2006/0270005 A1 and WO 2003/093461 A8). The authors identified an open reading frame in the S-2L genome sequence with significant identity with adenylosuccinate synthetase (EC 6.3.4.4). They suggested that it encodes a PurA homologue responsible for the synthesis of a 2,6-diaminopurine nucleoside which, after subsequent phosphorylations to the triphosphate form, would be incorporated into newly replicated S-2L DNA via a DNA polymerase.

3.1.3. 7-Methylguanine. The DNA of the Shigella phage DDVI contains guanine methylated at the 7 position, accounting for approximately 1% of the total guanine.53 A guanine methyltransferase specifically expressed during phage infection was subsequently characterized in cell extracts prepared from DDVI infected Shigella sonnei and E. coli and shown to use S-adenosylmethionine as the methyl donor in the reaction.54 It was also demonstrated in these works that DDVI phage prepared from hosts deficient in methionine biosynthesis were restricted in wild type hosts, indicating a protective function for this modification.

3.1.4. Deoxyarchaeosine. Recently, it has been determined that a subset of pathogenic bacteria and the Escherichia phage 9g contain deazaguanosine derivatives in their DNA.55 Deazaguanosine bases are typically found in tRNAs as queuosine and archaeosine. The incorporation of archaeosine into DNA, rather than RNA, was first predicted by Aravind and colleagues.43 For the bacteria containing queuosine, these modified guanosines are
present in DNA at slightly more than 1 per 1000 nucleotides, but for the phage 9g, approximately one-fourth of the guanosines are substituted with archaeosine. The presence of queuosine biosynthetic genes encoded by the phage56 suggests a mechanism for their occurrence in DNA. A queuosine precursor, likely preQ1, is synthesized from GTP. A deazaguanosine transglycosylase-like enzyme then swaps the endogenous guanosine base out of the DNA polymer, leaving the preQ1 base in its place. Subsequent modifications to the base may occur in situ, leading to the final deoxyarchaeosine nucleotides observed within the DNA polymer. The actual steps of the pathway await experimental elucidation.

Figure 4. Modified purines of bacteriophages. Adenine and guanine are shown in the context of their respective base pairs. Only the nucleobase portion is shown; the side groups illustrated are attached at those positions of the purine heterocycle indicated in green. The atoms of the purine heterocycle are numbered according to standard convention.

3.2. Phage Modified Pyrimidines

To date, by far the largest diversity of modified nucleotides observed in nature is found among the substituted pyrimidines of bacteriophages. The structures of nine known pyrimidine modifications are shown in Figure 5. Many details of the biogenesis of these nucleobases have been determined and are discussed below. In almost all cases, the modified pyrimidines enter the DNA replication pathway in the form of pyrimidine monophosphates synthesized by thymidylate synthase (TS) homologues that add methyl or hydroxymethyl groups to the pyrimidine ring of a deoxynucleotide monophosphate. The hydroxymethyl groups can serve as sites for further modification (hypermodification) after DNA replication. A generalized pathway for hypermodified 5-hydroxymethylpyrimidine derivatives is shown in Figure 6. The greater diversity (observed to date) in pyrimidines than purines may result from greater chemical versatility of the underlying modification substrate. Alternatively, it may be a stochastic result of the small number of experimental organisms studied in detail during the heyday of phage biological studies.

3.2.1. Deoxyuracil (dU). Bacteriophage PBS1 and its clear mutant derivative PBS2 are transducing bacteriophages of Bacillus subtilis that contain deoxyuracil (dU) fully replacing thymidine.57 PBS1 accomplishes this pyrimidine substitution by altering the composition of deoxynucleotide triphosphate pools available to the viral polymerase during replication as well as by protecting the newly replicated uracil-DNA from host DNA repair pathways (see Figure 7). Deoxyuracil triphosphates for DNA polymerization are derived from the deamination of deoxycytidine triphosphate (dCTP) by dCTP deaminase (EC 3.5.4.13).58 This deaminase is inhibited by high concentrations of its product, dUTP, suggesting that feedback inhibition ensures a balanced ratio of dCTP vs dUTP. Under normal cellular conditions, dUTP is removed by a host-encoded dUTPase, but this activity is inhibited during infection.59 Residual dTTP is excluded from DNA through the action of a dTTP specific phosphorylase that prevents the thymidine pool from progressing beyond dTDP.60 A dUMP-specific kinase is also expressed during infection, and the dUDP produced as a result is phosphorylated to the triphosphate form by a host nucleotide diphosphokinase (NDK).61 DNA synthesis proceeds by a phage-encoded DNA polymerase.61 Bacterial polymerases are not inhibited by dUTP, and similarly the viral polymerase may utilize dTTP, but
additional mechanisms to exclude residual dTTP from DNA have not been ruled out.

In cellular organisms, most uracil present in DNA is created by the oxidative deamination of cytosine and is therefore mutagenic since an A rather than a G will be paired with the U during DNA replication. Like most bacteria, Bacillus subtilis encodes a uracil DNA glycosylase (UDG) as part of its DNA repair capabilities. During phage infection, this enzyme is inhibited by a phage-encoded protein, the UDG inhibitor (UGI).62 UGI is a protein mimic of B-form DNA that binds tightly to the substrate-binding cleft of UDG, blocking its interaction with DNA.

Figure 5. Modified pyrimidines of bacteriophages. Adenosine and cytosine are shown in the context of their respective base pairs. Only the nucleobase portion is shown. The side groups illustrated are attached at those positions of the pyrimidine heterocycle indicated in green. The atoms of the pyrimidine heterocycles are numbered according to standard convention.

3.2.2. 5-Hydroxymethyldeoxyuracil. A variety of bacteriophages contain 5-hydroxymethyldeoxyuracil (hm5dU) completely substituting for thymidine in their DNA, including the well-characterized Bacillus phage SPO1 as well as Bacillus phages ϕe, SP8, H1, 2C, and SP82. Like the dU DNA phage PBS1, the hm5dU phages manipulate intracellular dNTP pools to minimize dTTP concentration and synthesize hm5dUTP in its place. hm5dUTP biosynthesis begins with the deamination of dCMP to yield dUMP. dUMP is converted to hm5dUMP through the action of dUMP hydroxymethyltransferase, an enzyme homologous to TS (e.g., product of thyA in E. coli). hm5dUMP is phosphorylated to hm5dUDP by a phage encoded kinase and subsequently converted to hm5dUTP by a host encoded NDK. Together with the three remaining canonical dNTPs, hm5dU is incorporated into newly replicated DNA by a phage encoded DNA polymerase. It should be noted that the SPO1 DNA polymerase contains, in addition to a family B polymerase domain, a 3′ → 5′ exonuclease domain, and a third novel domain with similarity to the pyrimidine recognition domain of UDG, suggestive of a thymidine editing function in the polymerase (though this property has not been confirmed). The hm5dU phages also deploy mechanisms to specifically exclude dTTP from the intracellular nucleotide triphosphate
12661

DOI: 10.1021/acs.chemrev.6b00114 Chem. Rev. 2016, 116, 12655−12687

Chemical Reviews

Review

Figure 6. Generalized pathway of hydroxymethylpyrimidine hypermodification in bacteriophages. The pathway can be divided into two stages: before and after DNA replication. On the right are the types enzymatic activities observed converting substrates to products in each step. Those enzymes typically encoded by bacteriophages are indicated in green. The italicized activities are inferred from genetic, biochemical, and bioinformatic evidence but have not yet been fully reconstituted in vitro.

Figure 7. Metabolic pathway of bacteriophage PBS1 dU-DNA synthesis.

3.2.3. Hypermodified Thymidines. The Delf tia phage ΦW-14 and the Bacillus phage SP10 hypermodify their thymidines. These phages use essentially the same nucleotide metabolic program as the hm5dU phages leading up to DNA replication (see Figure 8). Following replication, hm5dU nucleotides within the DNA polymer are further modified to

pool during bacteriophage morphogenesis. A phage induced nucleotidohydrolase converts dTTP to dTMP. Simultaneously, a phage-induced (but unidentified) inhibitor of host thymidylate synthase prevents the methylation of dUMP to dTMP. A host dUTPase may also produce dUMP to channel it back into the pathway leading to hm5dUTP production. 12662


Figure 8. Generalized pathway for hm5dU-DNA synthesis and subsequent hypermodification.

Figure 9. Hypermodification of hm5dU through a pyrophosphorylated intermediate.

yield either a hypermodified base or thymidine. No hm5dU is found within the mature DNA encapsidated by the virion. In the case of ΦW-14, the hypermodified base is called α-putrescinylthymine and consists of putrescine ligated to the 5-methyl carbon of a nucleobase through an N−C bond. For SP10, a glutamate is similarly appended to thymidine in the base α-glutamylthymine, again through an N−C bond. The structures of these two modifications are shown in Figure 9. Both hypermodified base types of phages ΦW-14 and SP10 are believed to utilize a pyrophosphorylated intermediate: 5-(hydroxymethyl)-O-pyrophosphoryluracil (see Figure 9).63,64 The proposed function of the pyrophosphate moiety of the intermediate is to activate the 5C methyl group for nucleophilic attack by a primary amine in putrescine or glutamate. Conversion of hm5dU to T in situ in these phages has similarly been proposed to proceed via the pyrophosphorylated intermediate through the reduction of the 5C methyl moiety. However, direct reduction of the hydroxymethyl group through another enzymatic mechanism,


Figure 10. Metabolic pathway for m5dC DNA synthesis by bacteriophage Xp12.

Figure 11. Pathway of hm5C incorporation into DNA and subsequent glucosylation.


perhaps one using NADH or FADH as an electron donor, seems equally plausible. Though the genomes of ΦW-14 and SP10 have been sequenced, the enzymes responsible for pyrophosphorylation and hypermodification have not been conclusively identified. Recently, bioinformatic analyses conducted by Aravind and colleagues at NCBI have identified genes they propose carry out these functions.43

3.2.4. 5-Dihydroxypentauracil. Another highly modified thymidine nucleotide was discovered in the Bacillus phage SP15.65,66 As in the hm5dU phages, this nucleotide is formed via the enzymatic modification of a nucleotide precursor, here through unknown mechanisms. Once incorporated into DNA, this base is further modified by the addition of sugar moieties.

3.2.5. 5-Methylcytosine. The Xanthomonas oryzae bacteriophage Xp12 contains a fully methylated complement of cytosines.67,68 To date, the only other organism known to have m5C completely substituting for cytosine is the halovirus FH.69 Xp12’s m5C was shown to derive from the methylation of dCMP by a folate-dependent thymidylate synthase-like activity specific to phage-infected cells70,71 (see Figure 10 for metabolic reconstruction). A phage-induced m5dCMP kinase produces m5dCDP. This diphosphate nucleotide is subsequently phosphorylated by a host-encoded NDK to yield m5dCTP. In a manner analogous to the synthesis of dUMP from dCMP in the bacteriophage PBS1, dTTP in Xp12-infected cells is synthesized from m5dCTP by an m5dCTP-specific deaminase.72 Together with the three canonical dNTPs, m5dCTP is incorporated into DNA by a DNA polymerase, likely phage encoded. The in vitro resistance of m5C-DNA to a wide range of restriction endonucleases suggests a protective role for m5C in Xp12.73

3.2.6. hm5C and Glucosyl-hm5C of T-Even Phages. Bacteriophages T2, T4, and T6 (also known as the T-even phages) contain a fully modified complement of cytosines substituted with 5-hydroxymethylcytosine and further modified by glucosylation. A pathway schematic diagramming the synthesis of glucosylated hm5C is shown in Figure 11. Synthesis of hm5C in T-even bacteriophage DNA begins at the nucleotide pool level when dCMP is hydroxymethylated to yield hm5dCMP. In bacteriophage T4, this reaction is catalyzed by the product of gene 42, dCMP hydroxymethyltransferase, a phage-encoded enzyme homologous to the folate-dependent thymidylate synthases.74,75 The hydroxymethylated cytosine monophosphate is converted to the diphosphate form by gp1, T4 dNMP kinase. T4 dNMP kinase is unusual in that it utilizes dGMP and dTMP in addition to hm5dCMP as substrates, yet can exclude dAMP and dCMP. Many of the enzymes in T4’s dNTP biosynthetic pathway have been proposed to associate in a large multisubunit complex converting precursors to nucleotides and channeling these to the T4 DNA polymerase at a replication fork.76 After DNA synthesis, hm5C is glucosylated by one of two DNA glucosyltransferases. The α-glucosyltransferase transfers glucose from UDP-glucose to hm5C in an alpha linkage. Similarly, β-glucosyltransferase glucosylates hm5C but produces a beta linkage between the glucose and nucleobase. The crystal structure of the T4 β-glucosyltransferase has been solved. Like DNA methyltransferases, the enzyme acts upon its target base flipped out of the helix and bound into the active site of the protein (Figure 12). Other phages known to glucosylate their hm5C residues are coliphages T2 and T6, but among the larger family of T4-related bacteriophages,77,78 the presence and degree of glucosylation has not been reported.
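The two cytosine pathways described in sections 3.2.5 and 3.2.6 can be written down as ordered lists of conversion steps, which makes it easy to see which steps are attributed to the phage, which to the host, and which are not named in the text. The following is only an illustrative sketch: the `Step` record, the pathway names, and the helper functions are hypothetical, "phage" is used loosely to cover phage-induced activities whose genes are not identified, and the enzyme for the T4 diphosphate-to-triphosphate step is left as `None` because the text above does not name it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    substrate: str
    product: str
    enzyme: Optional[str]  # None where the text names no enzyme
    source: Optional[str]  # "phage", "host", or None if not stated


# Xp12 route to m5C-DNA (section 3.2.5). "phage" here includes
# phage-induced activities; the polymerase is only "likely" phage encoded.
XP12_M5C: List[Step] = [
    Step("dCMP", "m5dCMP", "TS-like dCMP methylase", "phage"),
    Step("m5dCMP", "m5dCDP", "m5dCMP kinase", "phage"),
    Step("m5dCDP", "m5dCTP", "NDK", "host"),
    Step("m5dCTP", "m5C-DNA", "DNA polymerase", "phage"),
]

# T4 route to glucosylated hm5C-DNA (section 3.2.6).
T4_GLUCOSYL_HM5C: List[Step] = [
    Step("dCMP", "hm5dCMP", "gp42 dCMP hydroxymethyltransferase", "phage"),
    Step("hm5dCMP", "hm5dCDP", "gp1 dNMP kinase", "phage"),
    Step("hm5dCDP", "hm5dCTP", None, None),  # enzyme not named in the text
    Step("hm5dCTP", "hm5C-DNA", "T4 DNA polymerase", "phage"),
    Step("hm5C-DNA", "glucosyl-hm5C-DNA",
         "alpha- or beta-glucosyltransferase", "phage"),
]


def is_connected(pathway: List[Step]) -> bool:
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def steps_from(pathway: List[Step], source: str) -> List[Step]:
    """Steps attributed to the given source ("phage" or "host")."""
    return [s for s in pathway if s.source == source]


def unnamed_steps(pathway: List[Step]) -> List[Step]:
    """Steps for which the text names no enzyme."""
    return [s for s in pathway if s.enzyme is None]
```

For example, `steps_from(XP12_M5C, "host")` returns only the NDK step, reflecting that the phage supplies every other activity in the Xp12 pathway as described.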

Figure 12. Cartoon structure of bacteriophage T4 β-glucosyltransferase bound to DNA (PDB 1SXP). Note the target base flipped out from the DNA helix and bound in the catalytic center of the enzyme. Not shown is a UDP-glucose located proximal to the target base.

3.2.7. 5-Hydroxycytosine. An unusual cytosine modification, 5-hydroxycytosine (h5C), occurs in the Rhizobium phages RL38JI and N17.79 5-Hydroxycytosine is distinct from 5-hydroxymethylcytosine; the 5C is hydroxylated rather than hydroxymethylated. The biosynthesis of this modified base is not known. 5-Hydroxycytosine is further modified by glycosylation in phage RL38JI.

4. CENTRAL ROLE OF DEOXYPYRIMIDINE NUCLEOTIDE MONOPHOSPHATE (HYDROXY)METHYLTRANSFERASES IN GENERATING MODIFIED PYRIMIDINES

Many bacteriophages hypermodify their DNA beginning with a 5-hydroxymethyldeoxypyrimidine monophosphate (hm5dYMP) synthesized by a thymidylate synthase homologue. In order to more fully appreciate the extent to which modified nucleotides occur among the bacteriophages, it is useful to consider the range of reactions possible within this group of enzymes and the key residues governing their substrate choice and product outcome. Within the superfamily of thyA thymidylate synthases (TS) there are homologues that have preferences for dCMP versus dUMP as substrates. Furthermore, there are examples of enzymes that transfer a hydroxymethyl group rather than a methyl group. Thus, there are four possible reactions catalyzed by the members of this enzyme family, which are summarized in Table 1 (for detailed biochemical studies see refs 68, 71, 74, 75, 80, and 81). For the members of this group of enzymes, many of the mechanistic features are the same: a pyrimidine is alkylated by a methylene-tetrahydrofolate (methylene-THF) donor. The enzymes form covalent intermediates with their pyrimidine substrates through a cysteine nucleophile (Cys146, following E. coli TS residue numbering, and hereafter) in the active site, activating the C−H bond of C5 and facilitating methylene transfer from methylene-THF. For the methyl-transfer reactions, the methylene


intermediate is reduced by hydride transfer from THF, resulting in m5dU (otherwise known as T) and dihydrofolate (DHF). In the cases where a hydroxymethyl pyrimidine is the final product, the methylene intermediate is instead attacked by water, producing a hydroxymethylpyrimidine and THF.

4.1. Enzymatic Pyrimidine C5 Modification: U versus C, Methyl versus Hydroxymethyl

Much is known about the structural determinants of substrate specificity, mainly from studies of E. coli thymidylate synthase (ThyA) and the bacteriophage T4 dCMP hydroxymethyltransferase (gp42). A high level of structural conservation can be seen in the crystal structures of ThyA (PDB 1KZI) and T4 Gp42 (PDB 1B5E). A superposition of these two structures results in an overlap with an RMSD of 1.99 Å. The structural similarity of the two enzymes can be appreciated qualitatively in Figure 13, where a monomer of each enzyme is shown bound with its respective substrate in panels A and B. This structural conservation extends to the catalytic center of these two enzymes, as seen in Figure 13C. A single asparagine at position 177 (Asn177) appears to control substrate choice. In E. coli, substitution of Asn177 with an aspartate (D) residue changes the preference of the enzyme from dUMP to dCMP.82 A similar switch of substrate preference is seen with Asn to Asp substitutions at the homologous position (Asn229) in the Lactobacillus casei thymidylate synthase.83 Conversely, the bacteriophage T4 dCMP hydroxymethyltransferase contains an aspartate at this position (D179), and mutation of this residue to asparagine results in an enzyme with an increased preference for dUMP relative to dCMP.84 These findings demonstrate the key role of this amino acid position in substrate specificity of thymidylate synthase and its homologues.

The residues governing the formation of hm5dYTP versus dYTP have not been so clearly delineated.81 In E. coli TS, ThyA Leu143 and Tyr94 have been proposed to serve as gates excluding the entry of water to the active site, disfavoring the formation of hm5dUMP over the preferred dTMP.85,86 A water gating function for these residues in the hydroxymethyltransferases, as well as evidence that water is the hydroxyl donor, is supported by isotopic labeling experiments with a distant homologue of T4 gp42, the cytidylate hydroxymethylase MilA.87

Table 1. Substrates and Products of TS-Family Enzymes

substrate | methylation | hydroxymethylation
dUMP | dTMP (bacteria, eukarya); thymidylate synthase, EC 2.1.1.45 | hm5dUMP (SPO1, SP8, ϕe, ΦW-14, SP10); dUMP hydroxymethylase, EC 2.1.2.−
dCMP | m5dCMP (Xp12 and others?); dCMP methylase, EC 2.1.1.54 (a) | hm5dCMP (T4 and T4-like phage); dCMP hydroxymethylase, EC 2.1.2.8

(a) Activity demonstrated in vitro with purified enzymes, but gene sequence unknown.

4.2. Phylogenetic and Functional Clustering of dYMP (Hydroxy)methyltransferases

The genetic relatedness of TS and TS-like enzymes, hereafter referred to as dYMP (hydroxy)methyltransferases, has long been known. The dUMP hydroxymethyltransferase of SPO1 was mapped to a restriction fragment of this phage’s genomic DNA by hybridization with the T4 dCMP hydroxymethyltransferase gene in Southern blots, and was subsequently cloned and sequenced.80 The sequence of SPO1 dUMP hydroxymethyltransferase was shown to be homologous to E. coli TS. T4 gp42 protein sequence also was shown to be homologous to E. coli TS.74,75 A cladogram of representative THF-dependent dYMP (hydroxy)methyltransferases reveals a clustering of sequences that correlates to their substrates and products (Figure 14; constructed from homology-based alignments88−90). The clade containing dUMP hydroxymethyltransferases includes bacteriophages that have hm5dU in their DNA (e.g., SPO1) as well as phages known to use hm5dU as a starting point for further modification (e.g., ΦW-14 and SP10). That the dYMP (hydroxy)methyltransferases of phages not yet shown to have modified (or even hypermodified) nucleotides cluster with these suggests that these phages incorporate hm5dU into their genomic DNA, but may also have further modified pyrimidines. This clustering predicts the presence of hm5dU in the DNA of phages that have dUMP hydroxymethyltransferase homologues in the SPO1 group. This functional clustering also extends to other classes of the dYMP modifying enzymes. The dCMP hydroxymethyltransferases of bacteriophage T4 and T4-like phages also form a monophyletic clade, while the canonical TS also encoded by these phages cluster together with the TS of bacteria and eukaryotes.
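As a compact restatement of the reaction grid in Table 1 and of the position-177 observations in section 4.1, the sketch below maps a residue at position 177 (E. coli ThyA numbering) and the group transferred to the expected substrate, product, and folate coproduct. It is deliberately simplified and the names (`TS_FAMILY_PRODUCTS`, `predict_reaction`) are hypothetical: the mutagenesis data describe shifts in substrate preference, not absolute specificity, and the code treats Asn as "dUMP" and Asp as "dCMP" only as a rule of thumb.

```python
# (substrate, group transferred) -> product, following Table 1.
TS_FAMILY_PRODUCTS = {
    ("dUMP", "methyl"): "dTMP",
    ("dUMP", "hydroxymethyl"): "hm5dUMP",
    ("dCMP", "methyl"): "m5dCMP",
    ("dCMP", "hydroxymethyl"): "hm5dCMP",
}

# Folate species released: hydride transfer from THF yields DHF (methyl
# transfer); attack by water leaves THF (hydroxymethyl transfer).
FOLATE_COPRODUCT = {"methyl": "DHF", "hydroxymethyl": "THF"}

# Rule of thumb from section 4.1: Asn at position 177 favors dUMP,
# Asp at the homologous position favors dCMP.
SUBSTRATE_BY_RESIDUE_177 = {"N": "dUMP", "D": "dCMP"}


def predict_reaction(residue_177: str, group: str):
    """Return (substrate, product, folate coproduct) for a TS-family enzyme.

    residue_177: one-letter code of the residue at the position
                 homologous to E. coli ThyA Asn177 ("N" or "D").
    group:       "methyl" or "hydroxymethyl".
    """
    if residue_177 not in SUBSTRATE_BY_RESIDUE_177:
        raise ValueError(f"no rule for residue {residue_177!r} at position 177")
    if group not in FOLATE_COPRODUCT:
        raise ValueError(f"unknown transferred group {group!r}")
    substrate = SUBSTRATE_BY_RESIDUE_177[residue_177]
    return substrate, TS_FAMILY_PRODUCTS[(substrate, group)], FOLATE_COPRODUCT[group]
```

Under these assumptions, `predict_reaction("N", "methyl")` corresponds to canonical thymidylate synthase and `predict_reaction("D", "hydroxymethyl")` to the T4 gp42 reaction.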

Figure 13. Conservation of fold and active site in TS and T4 gp42. Cartoon diagrams of (A) E. coli thymidylate synthase (PDB 1KZI) and (B) bacteriophage T4 dCMP hydroxymethyltransferase (PDB 1B5E) monomers. Their respective substrates dUMP and dCMP are shown bound in their catalytic centers. (C) dNMP recognition in the active site of E. coli ThyA and T4 dCMP hydroxymethyltransferase (Gp42). Catalytic residues are not shown. ThyA residue Asn177 makes complementary hydrogen bonds to N3 and C4 keto oxygen of dUMP. Similarly, Asp179 of Gp42 forms hydrogen bonds with N3 and N4 of dCMP.


Figure 14. Phylogenetic and phenotypic clustering of dYMP (hydroxy)methyltransferases. The cladogram was generated using a MAFFT alignment of dYMP (hydroxy)methyltransferases from 50 taxa (Gene Index number and source organism indicated). These were aligned using the FFT-NS-x1000 algorithm with a JTT200 scoring matrix in the Geneious software package. A phylogenetic tree was generated from the alignment using PHYML with default parameters. Other methods yielded similar tree topologies. Substrate preferences are indicated where enzymatic activities were demonstrated experimentally (e.g., dCMP hydroxymethyltransferase). The amino acid residues occupying the homologous positions in E. coli ThyA at 143 and 177 are indicated in the right-hand columns.

Table 2. T4 and T4-like Bacteriophages Encoding Two dYMP (Hydroxy)methyltransferases (a)

phage | thymidylate synthase | dCMP hydroxymethyltransferase | DNA glucosyltransferases (α-GT, β-GT)
Aeromonas phage 44RR2.8t | NP_932561 | NP_932389 |
Bacillus phage G | YP_009015427 | YP_009015622 | YP_009015609, YP_009015616
enterobacteria phage IME08 | YP_003734371 | YP_003734190 |
enterobacteria phage RB69 | NP_861931 | NP_861738 |
enterobacteria phage T4 | AAC12816.1 | NP_049659 | NP_049673, NP_049658
Escherichia phage wV7 | YP_007004972 | YP_007004786 | YP_007004802
Pectobacterium bacteriophage PM2 | YP_009211676 | YP_009211464 |
Salmonella phage S16 | YP_007501266 | YP_007501078 | YP_007501076
Shigella phage pSs-1 | YP_009111048 | YP_009110863 | YP_009110878
Sphingomonas phage PAU | YP_007006632 | YP_007006789 |
Stenotrophomonas phage IME13 | YP_009217609 | YP_009217483 |
Yersinia phage vB YenM TG1 | YP_009200496 | YP_009200310 |
Aeromonas phage 25 | YP_656432 | YP_656269 |

(a) GenBank accession numbers for a canonical TS and dCMP hydroxymethyltransferase from each bacteriophage are indicated. DNA glucosyltransferases, where known, are also indicated.

The functional clustering is further supported at the amino acid sequence level. Indicated next to each protein taxon are two of its amino acids occupying homologous positions identified from the alignments used to generate the tree. By convention, the amino acid positions are numbered according to the E. coli TS protein sequence. The amino acid occupancy at Asn177, a key determinant in substrate specificity (see section 4.1), similarly correlates with functional groupings among the dYMP (hydroxy)methyltransferases. A second residue, Leu143, has been implicated in the resolution of the methylene intermediate


to either a methyl or a hydroxymethyl group85 and is indicated in Figure 14. Which dYMP (hydroxy)methyltransferase catalyzes the methylation of dCMP has not been proved biochemically, but a clade of sequences from phages of Roseobacter and Achromobacter might be candidates. These sequences encode aspartate at position 177, but cluster with the canonical TS sequences of eubacteria.

4.3. Phages with Potentially Undiscovered hm5dC Modifications

Bacteriophage T4 and related T4-like phages encode both a canonical TS and dCMP hydroxymethyltransferase.77,78,91 Examples of such phages are given in Table 2, listing the GenBank Protein Accession number of their thymidylate synthase, dCMP hydroxymethyltransferase, and glucosyltransferase genes. While a subset of these have clear homologues to either α-GT or β-GT, or both (and by logical extension would likely contain glucosylated hydroxymethylcytosines), a number of phages do not appear to have obvious DNA glucosyltransferases. This apparent lack of glucosyltransferase homologues suggests that some of these phages, by analogy to the hm5dU phages and their hypermodified counterparts, might contain hm5C that is not further modified. Alternatively, these phages may contain as yet undiscovered modifications of hm5C. To date, no bacteriophages containing hm5C without further modification have been identified. The reasons for this are not clear, but among the many T-even like phages that have been isolated infecting E. coli, the absence of an “hm5C only” phage might simply be due to the genetic background of the host strains used. Most laboratory strains of E. coli encode one or more endogenous restriction enzymes (McrBC and often McrA) that specifically degrade DNA containing hm5C. Thus, any phage strains containing nonglucosylated hm5C in their DNA would be restricted.
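The screening logic of this section (a dCMP hydroxymethyltransferase is present, but no glucosyltransferase homologue is listed) can be illustrated with the Table 2 entries. This is a minimal sketch, not the authors' analysis: the dictionary name and helper function are hypothetical, and the assignment of each glucosyltransferase accession to a phage is an assumption made here by matching accession-number ranges in the flattened table.

```python
# Phage -> DNA glucosyltransferase accessions listed in Table 2.
# An empty list means no glucosyltransferase is listed for that phage.
# Assumption: accessions are assigned to phages by accession-number range.
LISTED_GLUCOSYLTRANSFERASES = {
    "Aeromonas phage 44RR2.8t": [],
    "Bacillus phage G": ["YP_009015609", "YP_009015616"],
    "enterobacteria phage IME08": [],
    "enterobacteria phage RB69": [],
    "enterobacteria phage T4": ["NP_049673", "NP_049658"],
    "Escherichia phage wV7": ["YP_007004802"],
    "Pectobacterium bacteriophage PM2": [],
    "Salmonella phage S16": ["YP_007501076"],
    "Shigella phage pSs-1": ["YP_009110878"],
    "Sphingomonas phage PAU": [],
    "Stenotrophomonas phage IME13": [],
    "Yersinia phage vB YenM TG1": [],
    "Aeromonas phage 25": [],
}


def candidates_without_listed_gt(table):
    """Phages that encode a dCMP hydroxymethyltransferase (all Table 2
    entries do) but have no glucosyltransferase listed, in table order."""
    return [phage for phage, gts in table.items() if not gts]


if __name__ == "__main__":
    for phage in candidates_without_listed_gt(LISTED_GLUCOSYLTRANSFERASES):
        print(phage)
```

The phages returned are candidates for carrying either unmodified hm5C or an hm5C modification not yet described; absence from the list of accessions is not evidence that glucosylation is absent.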

Figure 15. Phage modification pattern determines how R, M, and MDE enzymes act on entering phage. (top panel) The host DNA is protected from its own cleaving enzymes. The bacterial chromosome (red circle) carries genes for self-M (orange box), self-R (yellow box with green outline), and self-MDE (blue box). The M protein (orange hoop) adds a methyl group (orange dot) to a C or A in a particular sequence context. The R and MDE proteins are unable to cleave the host genome (black ×). R is blocked from cleavage by the methylation of its sequence; the MDE is blocked from cleavage by the absence of modification in a suitable context. (bottom panel) Entering phage may be cleaved. A phage population deriving from different bacterial hosts carries different modification patterns (red, green, and blue particles). The particles adsorb to the host of the top panel and inject DNA into the cells. The blue DNA has a modification pattern (modified at a recognition site) that elicits cleavage by the MDE enzyme, so the phage dies. The green DNA has a modification pattern that elicits cleavage by the R enzyme (unmodified at a recognition site), so the phage dies. The red DNA has the same modification pattern as this host, so neither the R nor the MDE can cleave, the phage multiplies, and the host dies.

5. ARMS RACE: THE BIOLOGY OF MODIFICATION AND RESTRICTION

Bacteriophage DNAs carry enzymatically generated modifications that range from simple methylation at specific sites to global substitution with the elaborate hypermodifications as described above. Most bacterial hosts regulate entry of DNA by cleaving it with restriction enzymes. Figure 15 illustrates the general process, using phage entry as an example. Three components are shown: a site-specific methyltransferase (M), a cognate restriction enzyme (R) recognizing the same site when not modified, and a modification-dependent restriction enzyme (MDE). The RM system and the MDE can coexist as long as the MDE does not recognize the site modified by the M protein. Depending on the modification pattern on the entering DNA, the R, the MDE, or both may cleave it. The cleaved DNA is then sensitive to exonuclease degradation. Escape from restriction may result either when the entering DNA is accompanied by antirestriction proteins, or when such proteins are expressed early (see section 5.8). The host can regulate the efficiency of type I restriction, a phenomenon called “restriction alleviation” (RA). A major component of RA was most recently addressed by the Szczelkun laboratory.92 Restriction may also simply fail. Surviving invaders fall into two groups: those that now carry the imprint of the host and those that override the host’s modifying (and cleaving) enzymes. The latter are all virulent phages, and in most cases the overriding mechanism is nucleotide modification as described above.
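The decision logic described above and in Figure 15 can be sketched as a small model. Everything in it is illustrative: the `Host` record, the `fate` function, and the site labels ("siteA", "siteB") are hypothetical names, the entering DNA is assumed to contain the host's R recognition site, and antirestriction proteins, restriction alleviation, and simple failure of restriction are ignored.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Host:
    rm_site: str                 # site methylated by M; cut by R when unmethylated
    mde_targets: FrozenSet[str]  # modified sites that the MDE recognizes

    def __post_init__(self):
        # The RM system and the MDE can coexist only if the MDE does not
        # recognize the site modified by the host's own M protein.
        if self.rm_site in self.mde_targets:
            raise ValueError("MDE would attack the host's own modified site")


def fate(host: Host, modified_sites: FrozenSet[str]) -> str:
    """Fate of entering DNA carrying modifications at `modified_sites`."""
    cut_by_r = host.rm_site not in modified_sites        # R site unprotected
    cut_by_mde = bool(host.mde_targets & modified_sites)  # MDE target present
    if cut_by_r and cut_by_mde:
        return "cleaved by R and MDE"
    if cut_by_r:
        return "cleaved by R"
    if cut_by_mde:
        return "cleaved by MDE"
    return "escapes restriction"
```

For a host defined as `Host("siteA", frozenset({"siteB"}))`, DNA modified only at "siteA" carries the host's own imprint and escapes, unmodified DNA is cleaved by R, and DNA modified at "siteB" is a target for the MDE.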

5.1. Modification Protects from Restriction and Causes Sensitivity to It

Here we focus on the interaction between host restriction enzymes (R and MDE) and the modification state of the entering bacteriophage DNA. Methyltransferases of both host and phage contribute to the interaction. We begin with a brief summary of R, methyltransferase (M), and MDE types, including references to literature reviews.

5.1.1. RM Type Summary. RM systems for which the restriction activity is blocked by modification within the recognition sequence were classically grouped into three types:

Type I RM systems form a ubiquitous and comparatively unified group. Specificity (S), methylation (M), and restriction (R) subunits form an R2M2S holoenzyme complex that requires SAM, ATP, and Mg2+ for cleavage activity in vitro. m6A is the


Figure 16. Restriction enzymes respond to modified states in different ways. A particular sequence can be found in one of four states: no decoration, unmodified; blue symbols, site-specifically modified; gold ticks, fully substituted for one base (here 5hmC substituted for C, as in T4 lacking glucosyltransferases); gold ticks with gold arrows: hypermodified (here, glucosylated as in wild-type T4). Red arrows, enzyme attack; red arcs, attack thwarted. To the right of each schema, a representative sequence and its fate are drawn. (A) Type I−III RE, exemplified by type II enzyme HhaI (GCGC), will cleave the unmodified site but are typically blocked from cleaving specifically modified (line 2), base-substituted (line 3), or hypermodified (line 4) sites. (B) A nearby sequence recognized by a different RE (e.g., NlaIII, CATG) will still be cleaved even when the DNA is methylated at the first sequence. Base-substituted DNA will be protected from both enzymes, as will hypermodified DNA. (C) Type IV modification dependent enzymes (MDE) exemplified by McrBC require the presence of modification for cleavage to occur. Unmodified DNA is not cleaved, but some specific modification patterns are susceptible to cleavage, as is the base-substituted DNA. Some MDE are blocked by hypermodification, as shown for McrBC. (D) Some MDE strongly prefer hypermodified DNA substrates. Unmodified DNA is not cleaved. Most of those characterized will act on the base-substituted DNA to varying degrees.
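The outcomes in panels A−C of Figure 16 can be summarized in a short lookup, using the two recognition sequences named in the caption (HhaI, GCGC; NlaIII, CATG). This is a simplified sketch under stated assumptions: the function names and state labels are hypothetical, site-specific methylation is assumed to be present only at the sites of one named enzyme, "typically blocked" is treated as always blocked, and the McrBC-like response to site-specific methylation is reported as context-dependent because it depends on the particular modification pattern.

```python
import re

# Recognition sequences named in the Figure 16 caption.
RE_SITES = {"HhaI": "GCGC", "NlaIII": "CATG"}

STATES = ("unmodified", "site-methylated", "base-substituted", "hypermodified")


def classic_re_fate(seq: str, enzyme: str, state: str,
                    methylated_for: str = "HhaI") -> str:
    """Fate of `seq` with a modification-blocked (type I-III style) enzyme.

    `methylated_for` names the enzyme whose sites carry the site-specific
    methylation when state == "site-methylated" (panels A and B).
    """
    if state not in STATES:
        raise ValueError(f"unknown state {state!r}")
    if not re.search(RE_SITES[enzyme], seq):
        return "no site"
    if state == "unmodified":
        return "cleaved"
    if state == "site-methylated":
        # Only the enzyme whose own site is methylated is blocked.
        return "blocked" if enzyme == methylated_for else "cleaved"
    # Base substitution or hypermodification protects every site.
    return "blocked"


def mcrbc_like_fate(state: str) -> str:
    """Fate with a modification-dependent enzyme behaving as in panel C."""
    return {
        "unmodified": "not cleaved",
        "site-methylated": "context-dependent",
        "base-substituted": "cleaved",
        "hypermodified": "blocked",
    }[state]
```

With a sequence such as `"AAGCGCTTCATGAA"`, methylation at the HhaI site blocks HhaI but leaves the NlaIII site cleavable, whereas global base substitution protects both sites yet exposes the DNA to the McrBC-like enzyme.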

protective base in the vast majority of systems, although m4C has been observed recently.33 ATP hydrolysis accompanies translocation of DNA adjacent to the recognition site; cleavage occurs when translocating complexes collide, usually leading to undefined cleavage positions. Assembly of the cleavage-competent complex is highly regulated. The mechanism by which the M2S modifying assembly distinguishes modification states has been intensively studied for the classical EcoKI, EcoAI, and EcoR124I enzymes, each representing a subgroup of type I enzymes.

Type II systems comprise two components each recognizing the same DNA site: R and M. The specificity function is integral to each protein. For those systems that recognize asymmetric sites, two M proteins are usually present, one to modify each strand. The evolutionary trajectory that unites R proteins with suitable M proteins is not understood.

Type III RM systems comprise one component with both specificity and modification function (historically called Mod) and one with ATPase and DNA cleavage activity (historically called Res). As with type I enzymes, ATP is required for cleavage, but unlike them, cleavage occurs at a fixed position to one side of the site.93 These enzymes were recently shown to function as Res1Mod2 complexes.94

Type IV RE are modification-dependent enzymes (MDE): RNA-independent DNA-cleaving enzymes that require base modification to act. Although examples of this group were discovered genetically very early, few examples have been well-studied until very recently. At present, understanding of the diversity and distribution of MDE is expanding rapidly, so the discussion below is a snapshot of a rapidly changing field. Recent interest is driven in part by interest in epigenetic phenomena in animals and plants, which are closely entwined with DNA modification states.

5.1.2. Other Reviews: Perspectives on RM Systems. The biology, population biology, and enzymology of RM systems are all fascinating topics that cannot be adequately addressed here, so relevant reviews are summarized. The cleavage mechanisms of type II RE, which act in the absence of nucleotide (NTP) and cleave at a fixed position relative to the recognition site, have been reviewed.95−98 Nucleotide-dependent types I, III, and IV


have been reviewed.99−103 Type IV and type IIM modification-dependent enzymes (MDE) were reviewed,104−106 and are treated more extensively below (note that an early proposed definition of “type IV RE” and assignment of two enzymes107,108 was revised; these are now type IIG109). REBASE110 (http://rebase.neb.com) maintains assignment of enzyme names and types as well as extensive information on enzyme properties. With data from Pacific Biosciences SMRT sequencing,29,111,112 base-specific modification patterns can often be assigned to particular genes without enzyme purification, greatly expanding our understanding of the variety and distribution of types I and III among prokaryotes. The role of restriction in cells and populations has been addressed.113−117

5.1.3. Interaction of RM Types with Biological Base Modifications. Types I−III R enzymes are blocked by base modification within the recognition site, and will cleave DNA not so modified (lines 1 in Figure 16). A wide diversity of specific sequences are recognized by RE, with diverse mechanisms of recognition, catalysis, and regulation of catalysis.97,110,118 The catalytic mechanism most associated with RE is phosphodiester hydrolysis by the “restriction enzyme fold” (PD(D/E)XK motif; Pfam clan PDDEXK (CL0236)). Several catalytic mechanisms have been described, and new mechanisms continue to be added: most recently a type II enzyme displaying combined glycosylase-lyase action.119

Methylations (line 2 in Figure 16) that block R cleavage arise from the action of site-specific methyltransferase (M) activities. The M activities may be R cognates: activities recognizing the same site as the R, specified by M genes found near R genes and required to protect endogenous DNA from type I−III RE. They may also be expressed from so-called “orphan” M genes, which are not accompanied by cognate R genes120−122 (see section 5.5). In either case, these add a methyl group to a specific base within the M recognition site, as in lines 2 of Figure 16. M proteins fall into three main groups: two that methylate an exocyclic amino group, either of A or C (m6A or m4C), and those that methylate the 5 position of cytosine (m5C; see sections 2.2 and 2.3). The key features of these enzymes are specific sequence recognition and methyl transfer from S-adenosylmethionine (SAM) to a base flipped out of the helix. The type II M proteins that accompany type II RE display a two-domain structure, in which sequence recognition is largely determined in one domain (target recognition domain, TRD), while cofactor binding and catalytic activity reside in the other. Cofactor binding is via a Rossmann fold (NADP_Rossmann (CL0063)). Orphan M’s are generally similar to RE-associated M’s. Additional reviews of the bacterial M protein structures and mechanism can be found in refs 101, 123, and 124.

MDE will recognize site-specific modifications in particular sequence contexts, as described in detail in section 5.6. The required sequence context is normally degenerate when compared with the high specificity of type I−III R. Presumably this allows recognition of foreign DNA from many sources. Some sequence specificity is needed to allow coexistence with the M activities of the host’s other RM systems.

Base-substitution modification as an anti-RM strategy (lines 3 in Figure 16) results from biosynthetic incorporation of alternative dNTPs, such as hm5dCTP or hm5dUTP, during DNA replication. So far, this phenomenon is limited to virulent phages, some of which engineer the dNTP pools as described in section 3. The life history that enables this is described further in section 5.2. As a counter-defense, most MDE that will recognize

methylated cytosines will also recognize hydroxymethylated cytosines.

Hypermodification (lines 4 in Figure 16) results from postsynthetic decoration of the base-substituted DNA by virulent phage modifying enzymes as described in sections 3.2.3−3.2.6. Hypermodification protects against some but not all MDE. Those that do cleave hypermodified DNA often will also recognize and cleave when the hypermodification is missing, but less efficiently. These are addressed specifically in sections 5.6.3 and 5.6.8.

5.2. Role of Modifications in Virulent Phage Life Cycle

Other reviews cover general phage/host interactions.125−128 Here we focus on those dependent on the modified state of DNA bases. A recent review of T-even phage describes in detail one phage family of this kind.78 Base modifications affect the life cycle of virulent phages (those without a dormancy option) in two ways. As a first effect, phage-borne base substitution modifications may be targeted by MDE (Figure 17 restricting host, orange arrow in red circle of

Figure 17. Modifications in virulent phages. The restricting host (orange cell, orange arrow in red DNA circle) makes an MDE that will attack modified or hypermodified DNA, preventing infection. The nonrestricting host may express ineffective RE (green arrow) that do not prevent infection, resulting in phage development as described in the text.

host DNA; action as in Figure 16C,D), leading to death of the phage (orange cell). However, most type I−III RE (Figure 17 nonrestricting host, green arrow in the host DNA) do not cleave highly modified DNA. REBASE (http://rebase.neb.com/rebase/rebms.html) contains detailed information on modification sensitivity of many RE.


Figure 18. Temperate phage infection: restriction, development, and lysogeny.

dots) act on temperate phage (blue DNA) immigrating to a new host population. In some cases, the entering phage DNA may encode its own M (blue arrow, blue segments, blue dots). Temperate phages typically carry the imprint (modification pattern) of the previous host. Thus, most of the immigrant phages entering a new host with a new RE/M/MDE set will suffer DNA destruction. Cleavage (as in Figure 16A−C) by RE depends on the modification pattern given by the previous host. The cell goes on to multiply, expressing a normal complement of transcripts (red arcs). However, a fraction of the immigrants escape restriction (Figure 18, second and third rows). This may occur because there is too little R or MDE protein, because there are too few sites in the phage DNA, or because of R or MDE inhibition for a variety of reasons (section 5.8). In the second and third rows of Figure 18, the decision between lysis and lysogeny is made. Most of the phages that escape restriction take a lytic path similar to that of virulent phages (second row, Figure 18), except that host functions are not shut down as an early event. The host M (red arrow) continues to act, so that newly synthesized phage DNA carries the host modification pattern (red segments), although modification may not be complete.134 Some phages carry genes specifying their own M enzyme (blue arrow), so both phage and host modification may be present on phage DNA (blue and red segments). During subsequent infection cycles, the modification pattern on the phage DNA is the same as the host and thus protects against restriction by that cell lineage. The phage can now reinfect the surviving offspring of siblings very efficiently (shown at the far right, loop from the second to the first row). In the scenario depicted in the bottom row of Figure 18, a fraction of the escapees take a lysogenic path, becoming dormant. Usually the phage integrates into the host chromosome. 
Phage replication, structural genes, and lysis genes are not expressed. However, many temperate phages carry functions that are expressed in the lysogenic state, conferring new properties on the host, a phenomenon called “lysogenic conversion”135 (see further below in section 5.4.1). Lysogenic conversion can provide advantages to the prophage-carrying host. The most famous classical and recent examples are prophage-borne toxin

A second effect is phage attack on the unsubstituted host DNA: a phage-specified nuclease specific for unsubstituted DNA may degrade host DNA (Figure 17, blue-green cell, broken red circle), facilitating takeover of the cell’s metabolism. The only enzyme of this type that has been well-studied is T4 endonuclease II, the product of the denA gene (END2_BPT4 in Entrez105,129−131). This enzyme is a stripped-down version of a cleavage domain found in homing endonucleases and a few type II restriction enzymes. A companion 3′ → 5′ single-strand exonuclease (DexA) converts the fragments to nucleotides.78 Related proteins are found in many but not all phages related to T4, and in other enterobacteriophages. The degraded host DNA is used for phage DNA synthesis following engineering of nucleotide pools to substitute a modified base (e.g., hm5C) for a normal one (C) as described in sections 3.2.2 and 3.2.6. Phage DNA is packaged into particles, and cell lysis releases phage into the medium. Note, however, that the virulent lifestyle does not require a base-replacement strategy: three families of E. coli phages (T1, T5, and T3/T7), for example, have no such modifications but do degrade host DNA.132 T7 degrades host DNA with a structure-specific endonuclease (endonuclease I, product of gene 3) and an exonuclease (product of gene 6), both of which also process replicating T7 DNA (see, e.g., Tran et al.133). Host DNA degradation occurs before phage replication has begun, possibly allowing replication forks to serve as the marker for host degradation.

5.3. Role of Modifications in Temperate Phage Life Cycle

Temperate phages are also subject to restriction (Figure 18, top row), but can escape at low frequency, acquiring protection from further restriction in the process, as described above (section 5.1) and illustrated in the second scenario (Figure 18, second row). In addition, they have a dormancy option (Figure 18, third row). Dormancy allows formation of lysogens, in which phage protein expression is shut down and DNA is passively propagated. This less-aggressive lifestyle is more complex than that of virulent phages. Here, the focus is the role played by base modifications in determining the outcome of the phage−host interaction. In the top row of Figure 18, host R and MDE (genes not shown) that are compatible with the host M (red arrow, orange

DOI: 10.1021/acs.chemrev.6b00114 Chem. Rev. 2016, 116, 12655−12687

Chemical Reviews

Review

genes.136 Phages can even bring in their own RM systems. Here, we highlight both complete RM systems30,137−139 (section 5.4.1) and orphan M genes (section 5.4.2) that may be carried by prophages. Such prophages will reenter the lytic path at a low frequency (shown at the far right in Figure 18, loop from the third to the second row). The frequency of induction can also be regulated by the phage in response to the host’s physiology. For example, many prophages respond to the cell’s DNA damage response (called the SOS response) by increasing the frequency of induction to nearly one per cell. This allows the phage to escape bad times and migrate to a more-hospitable environment.

copied, and protected by the conventional DpnM Gm6ATC enzyme. Spn23FI, also called DpnIII, a recently described third member of this cassette set, specifies an m5C-specific M. It does not have an ss-DNA-directed M, resulting in genetic isolation of the lineage containing it.147

5.5. Orphan Modifying Enzymes

M genes not accompanied by RE genes are called orphan M’s, as described above (section 5.1.3). There is an emerging idea that these fall into three groups: those conserved in particular bacterial lineages, which serve important epigenetic roles in host metabolism; those that are eroded parts of RM systems, possibly in the process of acquiring epigenetic roles;148−150 and those expressed specifically in preparation for phage or plasmid transfer to a new cell,120,121,126,141,151−153 here called migratory orphans. In addition, some orphans are components of defense systems that do not rely on DNA cleavage of the invader.154−156 Details of these novel defense processes are still under investigation and are not described here.

5.5.1. Lineage-Conserved Orphan Methyltransferases. Lineage-conserved orphan M’s are those found in every isolate of a particular taxonomic group, in the absence of a cognate RE. A recent report157 surveyed 230 bacterial and archaeal genomes containing methyltransferase genes identifiable bioinformatically, and for which modification motifs were identifiable in the kinetic signatures observed by the Pacific Biosciences SMRT sequencing technology. About half of the genes were orphans, not located near candidate R functions. These orphans were frequently conserved among related taxa, unlike candidate RM system genes, which are sporadically distributed. Many new modification specificity assignments could be made, enabling future exploration of consequences for gene expression and genome function. Functional roles could not be inferred immediately from this mammoth effort, but the few lineage-conserved orphans that are well-characterized (see below) suggest that other DNA-binding proteins can read the modification pattern, often with far-reaching effects on cell physiology. Laboratory studies of mutant hosts illuminate the extensive roles the modification plays in vivo for two examples, Dam and CcrM. In other cases, the conserved role is still unclear. 
Long-term conservation suggests that these may be important in the natural environment. Their presence in the bacteria that are phage hosts constrains what sorts of MDE specificities can be maintained at the same time (see section 5.6).

5.5.1.1. Dam and Its Biological Effects on γ-Proteobacteria and Their Phages (See REBASE M.EcoKDam). Acquisition of the Dam M (Gm6ATC) was associated with reorganization of replication initiation and segregation, from SMC-directed to SeqA-MukBEF-directed.158 Løbner-Olesen et al. observed that the phylogenetic tree of dam genes in a subset of γ-proteobacteria was congruent with the ribosomal tree, and coincided with acquisition of the chromosome segregation genes mukBEF and loss of the segregation gene smc. The authors proposed descent of Dam from an RE-associated M such as M.DpnIIA, which is found together with the type II RE DpnII (GATC) in some Firmicutes. In addition to coordination of replication, Dam regulates mismatch DNA repair, some pathogenic processes, and transposase expression and activity.158,159 Enteric bacteriophages make use of Dam modification in their developmental programs. Expression of the Mu Mom protein (see sections 3.1.1 and 5.5.3.3) is Dam-regulated,160 a process

5.4. Modification Facilitating Migration of RM Systems

5.4.1. Lysogenic Conversion: EcoP1I and EcoGIII. Because phages escape restriction at biologically reasonable frequencies (1/10^6 to 1/10), these “second wave” lysogenic invaders can add new capabilities to the host population. Complete RM systems on prophages have been known from the early days, when EcoP1I, a type III RM system, was shown to restrict DNA entry, both of other phages and of conjugal plasmids.139,140 This situation is rather rare, however.141 When the RM genes are expressed during vegetative growth, as illustrated in the bottom row of Figure 18, gene expression can be extensively reprogrammed. In Figure 18, third row, new methylation (blue dots) is added to the existing pattern (orange dots) on the host DNA. The effect on host expression is shown as addition of blue transcripts to the orange ones. A recent example demonstrated that the prophage-specified RM system EcoGIII restricts entry of phages and conjugal plasmids during lysogeny, while reprogramming expression globally.30 Genes associated with motility and metal transport were most affected in this particular example. Orphan M genes carried by prophages could in principle have a similar reprogramming effect. However, most characterized examples are expressed only late in the lytic cycle, not during vegetative growth of the lysogen (see section 5.5.3).

5.4.2. Replacement Cassettes DpnI/DpnII/DpnIII. Some M’s are capable of protecting DNA transferred in single-stranded form, facilitating migration of the genes for them to new hosts, by natural transformation or conjugation. The earliest single-strand-specific M characterized was not an orphan, but an added component of a type IIP RM system. The site-specific M.DpnA (M2.DpnII) will modify GATC to Gm6ATC in single-stranded DNA. It is part of a variable cassette (genome island) found in Streptococcus pneumoniae. Members of this species usually have alternative cassettes: DpnII, cleaving unmodified GATC, or DpnI, an MDE, cleaving Gm6ATC. 
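The complementary specificities of the two cassettes can be made concrete with a small sketch. The Python below is purely illustrative: the function name and the boolean encoding of methylation state are assumptions introduced here, and it encodes only the two rules stated above (DpnII cleaves unmodified GATC; DpnI cleaves Gm6ATC).

```python
def is_restricted(host_cassette, gatc_methylated):
    """Return True if incoming double-stranded DNA with GATC sites would be cleaved.

    host_cassette: "DpnI" or "DpnII" (the alternative cassettes named in the text).
    gatc_methylated: True if the DNA carries Gm6ATC, False if GATC is unmodified.
    Hypothetical helper for illustration; not an API from any library.
    """
    if host_cassette == "DpnII":
        return not gatc_methylated   # DpnII cleaves unmodified GATC
    if host_cassette == "DpnI":
        return gatc_methylated       # DpnI is an MDE cleaving Gm6ATC
    raise ValueError(f"unknown cassette: {host_cassette!r}")


if __name__ == "__main__":
    # Enumerate every combination of DNA state and recipient cassette.
    for methylated in (False, True):
        for cassette in ("DpnI", "DpnII"):
            print(methylated, cassette, is_restricted(cassette, methylated))
```

Under these two rules, DNA in either methylation state is cleaved by exactly one of the two cassette types, so a population carrying both cassettes always contains cells able to restrict it.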
The DpnII cassette includes three genes: for DpnII; for M.DpnM (M1.DpnII), a standard M acting on double-stranded DNA; and for M.DpnA (M2.DpnII), preferentially modifying single-stranded DNA.142−147 With this cassette variation arrangement, progeny phage surviving restriction by R.DpnII in one cell in the population may still be restricted upon entry into another with R.DpnI (Figure 18, the loop from the second row to the first could often yield restriction). Modification of DNA in the single-stranded state facilitates acquisition of novel gene islands from related naturally competent strains by recombination. Transformation during natural competence involves an internalized single-stranded intermediate. The DpnA M is induced during development of natural competence, and thus protects entering single-stranded (but not double-stranded phage) DNA until it is integrated,


mediated by OxyR binding to hemimethylated but not fully methylated DNA.161 Bacteriophages of Dam-expressing lineages frequently carry orphan M genes with the same GATC specificity as Dam. Both virulent and temperate phages do this: virulent T1, T2, T4, RB69, and RB55 and temperate P1 and the phi993W prophage of E. coli strain O157:H7. In most cases the role these play is unclear. For P1, DNA packaging into the capsid depends on methylation of Dam sites near the packaging initiation site (pac).159 For T-even phages, the poor modification activity displayed by the host Dam M when hm5C is present in DNA162 suggests that the T2 and T4 enzymes were acquired to provide this function. What life cycle stage requires it is not clear, particularly since the related phage T6 does not have homology to the gene.162 It is known that EcoP1 (AGACC) can restrict hm5C- but not ghm5C-containing T2, and that this restriction is eliminated by mutations in T2 Dam (T2 Damh) that allow methylation of the related site, to AGm6ACC.

5.5.1.2. Enteric Dcm (M.EcoKDcm). The M gene dcm (Cm5CWGG) is highly conserved in E. coli163 and is found in Klebsiella, Salmonella, and Enterobacter cloacae.164 This suggests an important conserved function, but mutations that eliminate it have little effect in ordinary laboratory conditions.165 Its presence is associated with an increase in C → T transition mutations at Dcm sites, presumably due to deamination of m5C. This occurs only when the cell also lacks the associated mismatch-repair nuclease Vsr.164 Most recently, lack of active Dcm in Escherichia coli K-12 was found to result in increased transcription of ribosomal protein genes in stationary phase, as tested using quantitative PCR. The relevant promoters contain Dcm recognition sequences, suggesting that methylation suppresses transcription in this condition.163 Lack of methylation may then lead to unbalanced allocation of resources to translation at a time when little needs to be translated.

5.5.1.3. 
Enteric YhdJ (M.EcoKII). YhdJ modifies ATCGm6AT. It is nonessential in laboratory model organisms E. coli K-12 and Salmonella enterica serovar Typhimurium LT2. Although widely distributed in E. coli, it is not expressed in the laboratory.166 In contrast, it usually is expressed in Salmonella,33 although not in S. Typhimurium LT2.166

5.5.1.4. Caulobacter CcrM (M.CcrMI). CcrM (Gm6ANTC) plays a critical role in coordination of replication in most α-proteobacteria. It has been best studied in C. crescentus. Its expression is cell cycle regulated, such that most sites are hemimethylated (on only the parental strand) for most of the cell cycle, becoming fully methylated at the onset of cell division.167−169

5.5.1.5. Cyanobacterial DmtA (M.AvaVI, Gm6ATC) and DmtC (M.AvaVIII, m5CGATCG). The Nostoc−Anabaena−Synechocystis lineage carries DmtA, an M that modifies Gm6ATC, and another, DmtC, that modifies CGm6ATCG as deduced from restriction enzyme protection patterns, expression cloning, and limited sequence comparisons.170 Companion restriction enzymes were sought but not found. Nevertheless, the DmtA gene was reported to be essential. Similarly, a large internal deletion of the homologue of DmtC in Synechocystis sp. 6803 (SynMI; M.Ssp6803I) prevented growth under normal laboratory conditions. It grew slowly under modified conditions, at lower light intensity. Analysis of 141 cyanobacterial genomes found nearly universal distribution of predicted GATC-recognizing M’s, and also conservation of CGATCG-recognizing M’s in several genera.171
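As an illustrative aside, several recognition motifs quoted in this section (GATC for Dam and DmtA, CCWGG for Dcm, GANTC for CcrM, CGATCG for DmtC) can be located in a sequence by expanding IUPAC degenerate codes into a regular expression. The Python sketch below is not taken from the cited studies: the function names and the dictionary of motifs are hypothetical conveniences, the position of the modified base is not tracked, and only the strand supplied is scanned.

```python
import re

# IUPAC degenerate nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Recognition sequences as quoted in the text (modified base not marked).
MOTIFS = {"Dam": "GATC", "Dcm": "CCWGG", "CcrM": "GANTC", "DmtC": "CGATCG"}


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead reports overlapping matches too."""
    return re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")


def find_sites(seq, motif):
    """Return 0-based start positions of the motif on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


def scan(seq):
    """Map each named motif to the list of its start positions in seq."""
    return {name: find_sites(seq, motif) for name, motif in MOTIFS.items()}
```

Calling `scan` on a phage or host sequence gives a rough count of potential target sites per enzyme; because the palindromic motifs here read the same on both strands, a single-strand scan suffices for them, but a non-palindromic motif would also need the reverse complement.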

No attempt was made to separately evaluate the distribution of potential RE with these specificities. This is a much more difficult bioinformatic problem, because type II RE are highly divergent in sequence.97 5.5.2. Eroded RM Systems. It has been appreciated for many years that the RM systems are extremely variable among natural isolates of the same species (e.g., refs 14 and 172−174). This variability allows cell populations to defend against epidemic spread once the initial restriction barrier upon invasion has been surmounted by surviving phages (Figure 18, row 2). In order for the population to achieve variability, new systems must be acquired and others lost in some lines, by some mechanism. One mechanism of variation is acquisition of a new system, with loss of an old one by mutational inactivation, followed by deletion. Whole-genome sequences enable experimental tests of bioinformatic predictions of R and M gene presence. The results described below are compatible with such a pathway. A thorough analysis in one case relied on the extraordinary abundance of RM systems in Helicobacter pylori,173,175−177 and on the rich accumulation of experimentally validated RM activities and associated sequences in REBASE.110 Type II R and M genes were predicted bioinformatically, then cloned, expressed, and analyzed biochemically in E. coli. (Predicted types I and III were not tested.) Two sequenced H. pylori isolates each carried four active type II RM systems out of 14 (H. pylori 26695) or 16 (H. pylori J99) candidates. Genes for four of the active systems were unique to one or the other strain, suggesting active acquisition processes. Thirteen candidate systems were apparently allelic pairs, with homologous R or M genes or both. Four of the allelic pairs were active in one strain but not the other. 
Of these four with active and inactive alleles, one inactive allele was missing an R gene altogether, while two R genes were inactivated by multiple frame shifts or substitutions. The fourth inactive allele lacked an M gene while retaining a (presumably) mutant R. The net result was that each strain carried four type II R activities, but no R activity was shared with the other strain. In a later study, genetic deletion of the R genes from the four active systems of H. pylori 26695 raised the frequency of natural transformation from other H. pylori strains by a cumulative 100-fold.178 Active recruitment, inactivation, and loss are part of the RM story. Have the other shared systems that lack R activity come to serve epigenetic roles instead? The presence of so many M activities in H. pylori has made generalizable conclusions difficult, and will not be surveyed here. However, in a different case, M appears to be on the way to a dedicated epigenetic role, accompanied by loss of R function. In this case, programmed variable expression of M activities (as a “phasevarion”) for type III M enzymes in two species has been firmly associated with variable expression of genes involved in host recognition and virulence.149,150,179,180 The R gene (res) was not required for this: numerous independent inactivating deletions were found in isolates of H. influenzae. The res allele for one isolate, Rd (HindVI in REBASE), did not restrict transforming DNA in vivo.149 These studies illustrate how functions can evolve to provide a biological role for the M activity independent of the R activity.

5.5.3. Migratory Orphans: Prophage and Plasmid Orphan Modifying Enzymes. An emerging generalization is that phage-borne modifying enzymes tend to be degenerate or multispecific, presumably in order to provide defense against a broad range of host RE systems in the population. Most, but not all, modification enzymes borne by temperate phages are SAM-dependent “orphan” site-specific DNA methyltransferases, that is, not associated with a restriction function.141 The primary role of these appears to be protection from host type I−III restriction systems.

5.5.3.1. Multispecific Bacillus Prophage M’s. Multispecific Bacillus prophage M’s were key to early characterization of M function and domain organization. Elegant work from the Trautner laboratory identified the existence of discrete target recognition domains (TRDs) that specified DNA sites of action.117 These TRDs were mutationally separable from catalytic regions, and were coded for in contiguous regions. These specificity-determining variable regions of M’s of phages SPR, Phi3T, and Rho11s were characterized and compared with the monospecific M’s described above (section 2.2). Up to four independently mutable TRDs mediated action by one enzyme (see, e.g., ref 181). An N-terminal TRD was later identified, making the total five.182 The modifications protected against RE frequently encountered in Bacillus hosts. These M’s are expressed only upon induction, not during vegetative growth of the lysogen.183

5.5.3.2. Multiple Monospecific M’s. A different approach to acquiring multiple defenses is found in Lactococcus phages. During evolution associated with intensive cultivation in cheese production, three phages acquired multiple separate orphan M’s. At least three distinct M genes were represented in three phages, with each phage containing two or three genes. These bioinformatically identified M’s were experimentally shown to be functional by observing modified motifs in phage DNA using the SMRT sequencing platform.152

5.5.3.3. Mom Modification. A distinct family of M enzymes is the phage Mu Mom enzyme. Mom belongs to an acyltransferase family rather than the methyltransferase family.50,184 It modifies ∼15% of adenine residues postsynthetically, forming N6-carbamoyl-methyladenine185 (see section 3.1.1). 
There is weak sequence specificity: the adenine in 5′-SASNY-3′ is modified. This pattern protects against many restriction enzymes in vivo.160 Expression of Mom is lethal; thus it is absent during vegetative growth of lysogens. Most likely for this reason, in vitro characterization is lacking. Importantly, the enzyme is subject to complex regulation of transcription and translation late in phage development, presumably avoiding interference with needed cell functions while its biosynthetic apparatus is required.186 These regulatory components have been adopted by related phages for a similar purpose (see section 5.5.3.4). The degree of modification of virion DNA differs depending on mode of replication (low following lytic infection, high following induction of lysogens).

5.5.3.4. Weakly Specific A M’s. Mu-related prophages of Haemophilus and Neisseria were found to carry SAM-dependent m6A-forming M’s of weak specificity (e.g., M.HaeV, modifying (T/G/C)m6A; Hia5).187 These genes are inserted neatly to replace the Mom acyltransferase at the same position in the genome, under control of the complex regulatory system that keeps Mom silent during vegetative growth. Like Mom, these are expressed only late in infection. Protection is afforded in vitro for a large number of RE with A-containing recognition sites. The efficacy of modification in protecting from restriction in vivo was not evaluated.

5.5.3.5. Weakly Specific ssDNA M’s. Another ss-DNA-directed M, the Sm6AY-modifying M.EcoGIX, was recently identified on a conjugal plasmid. Modification accompanies the conjugation process, in the donor in this case. DNA transferred by conjugation is the single-strand displaced by plasmid

replication during transfer. This extremely nonspecific adenine M promotes conjugal transfer to cells containing EcoGIII or EcoRI.30,188 The level of modification during conjugation has not been determined. Activity is weak in model tests.188

5.6. Restricting Modified DNA: DNA Binding Domain Fusions

Characterized MDE are all fusions of cleavage domains to DNA recognition domains. At least two modes of DNA recognition are known for other proteins. Sequence-specific DNA recognition is often accomplished by binding to B-DNA in the major groove, with or without DNA distortion. This mode of interaction is frequent for regulatory proteins and type II R enzymes, for example.97,189 On the other hand, recognition by base flipping is a strategy commonly employed by enzymes that do chemistry: DNA repair enzymes and modification methyltransferases39,190 as well as a few R proteins.191,192 For MDE, specific inspection of the modified base in an extrahelical (flipped) conformation is a common principle in all but one MDE examined in sufficient detail. This strategy is realized in several distinct ways. Coupling of recognition to strand cleavage affords additional variability. Type IV MDE are extremely diverse, and probably much more widespread than is currently known. Homologues of known enzymes can be identified bioinformatically, but new categories are continually discovered as genome-enabled genetic research is coupled with new genetic methods allowing research beyond model organisms. Supporting Information Table 1 groups these into families, which are discussed below. MDE discovery usually requires a genetic system and suitable bacteriophage or plasmid challengers. By contrast, identification of type II RE relies primarily on use of cell extracts to digest DNA in vitro. Diagnostic banding patterns result when the digest is visualized by gel electrophoresis. In most cases, such an approach to MDE discovery (using highly modified substrates) does not yield characteristic patterns, because the DNA is heavily degraded by the usually weakly sequence-specific enzymes. The degradation pattern is thus difficult to distinguish from nonspecific action. 
Type I and type III RE are most readily identified from characteristic behavior of the modified base during Pacific Biosciences SMRT sequencing.110,111 Since type IV MDE are not associated with M’s, this is not effective for discovery. Even with these limitations to discovery, type IV MDE fall into at least 10 groups, distinguished by connectivity and identity of DNA binding domain, cleavage domain, and presence or absence of NTP hydrolysis. An 11th group has been characterized enzymatically but not dissected functionally or characterized bioinformatically.

5.6.1. McrBC: DUF3578 DBD-Translocase Fusion, PD(D/E)XK Separate Protein. EcoKMcrBC was the second MDE to be well-characterized (after DpnI), and is still the most complex.106,193−201 The basic findings were reviewed recently104 and will be summarized here. The N-terminal DNA binding domain of McrB (McrB-N; Pfam DUF3578, PF12102) has been crystallized and compared with the m5C-binding SRA domain (ref 193; PDB 3SSC, 3SSD, 3SSE). Like the SRA domain members and DNA methyltransferases, specific binding of McrB-N is mediated by recognition of an extrahelical base flipped into a recognition pocket. The McrB-N fold is distinct from both the SRA fold and the M fold. The DNA binding domain is fused to an AAA+ translocase domain, which mediates formation of a hexameric ring in the


consistent with bioinformatic analysis.207 Kazrani et al.215 reported mutational evidence supporting their identification of the active site. Shao et al. created similar catalytic mutants with similar functional results, but were unable to visualize important residues in their crystal structures.214 Shao et al. used their structures to create mutants that increased selectivity for hm5C over m5C while retaining cleavage capacity. Differences in detail between the structures may result from different approaches. Different space groups were obtained, possibly accounting for the lack of a dimer interface for the Kazrani et al. structure. The N-terminal His tag but not the C-terminal tag led to migration as a monomer. This gave better crystals and was used for structure determination. Shao et al. used an N-terminal tag but obtained dimers. Neither group reports the inactivation by imidazole described by Wang et al.211 AbaSI, a homologue with increased discrimination in favor of hm5C compared with m5C, was crystallized with substrate and product DNA and without DNA216 (PDB 4PAR, 4PBA, 4PBB). The enzyme was isolated as a dimer. Consistent with the PvuRts1I results, the N-terminal domain most resembled the PD(D/E)XK nuclease Vsr, while the C-terminal domain resembled SRA family members. A rigid linker connects the two domains. The complexes with DNA did not reveal a flipped base, and thus do not represent the base-inspection stage of the reaction.

5.6.4. Sco5333: SRA DBD-HNH Fusion. Sco5333 qualifies as an MDE in vivo, and comprises an N-terminal SRA binding domain fused with a C-terminal HNH cleavage domain. Han et al.217 found that Sco5333 can be expressed in E. coli, but only in the absence of Dcm (Cm5CWGG) methylation. Mutations in conserved catalytic residues of the HNH domain or in conserved residues in the SRA domain each relieved this incompatibility. Han et al. 
mention possible domain swapping among subgroups of enzymes, such that SRA-domain subgroups reassort association with cleavage domain groups and subgroups. They classified the Sco5333 binding domain as a member of the SRA group, and used gel-shift experiments to characterize binding activity of the full-length enzyme. The enzyme bound specifically to m5C-containing DNA, whether fully modified or hemimethylated, with little preference for particular 5′ or 3′ nucleotides adjacent to the m5C. Mutation of the catalytic domain did not affect the shift, but mutations of the SRA domain abolished it. Dissociation constants obtained using isothermal scanning calorimetry further suggest that it does not discriminate hemimethylated sites from fully methylated ones. Unfortunately, details of m5C recognition and cleavage position could not be determined in vitro. Weak nonspecific cleavage to relax and then linearize a supercoiled substrate depended on divalent cation (Mg or Mn preferred), but Sco5333 showed no preference for cleavage of modified DNA. Thus, coupling of cleavage to recognition is less stringent in vitro than in vivo. In vitro, preferential binding was observed but cleavage was nonspecific. In contrast, the enzyme discriminated effectively in vivo; it was toxic only when m5C was present. Homologue Tbis1 (from a high-GC Gram+, Thermobispora bispora) displayed this relaxed coupling in vivo as well as in vitro, since it was not tolerated when expression was induced in E. coli, even in the absence of Dcm methylation. To reconcile the apparent coupling of cleavage to binding preference in vivo but not in vitro, three possibilities occur: first, a folding error (in the foreign E. coli cytoplasm or during purification) could result in uncoupling of cleavage from binding; second, an interacting protein or cofactor might be required to

presence of GTP. This complex can bind specifically to DNA containing the motif Rm5C. hm5C is also accepted, but ghm5C is not. Local sequence surrounding the Rm5C motif affects binding. In the presence of McrC, which contains a PD(D/E)XK-family catalytic motif, and a translocation block, cleavage occurs 30−35 nt (nucleotides) from the Rm5C motif. The translocation block may be another McrBC:DNA complex or a protein bound to DNA, such as the LacI repressor bound to its site.

5.6.2. MspJI Family: SRA DBD-Mrr-Cat Fusion. The MspJI family has six characterized members.202−207 Each carries out site recognition and cleavage with one polypeptide species. The cleavage position is relatively well-defined, with a 4-base 5′ extension 12 nt from the m5C, and 16−17 nt on the opposite strand.205 hm5C is accepted, but not ghm5C. The MspJI N-terminal DNA binding domain is an SRA-family member (refs 202 and 204; structure without DNA, PDB 4F0P, 4F0Q; with DNA, PDB 4R28). Each enzyme displays distinct sequence preferences in a five-base stretch surrounding the modified site, which is flipped out of the duplex as commonly occurs for SRA domain proteins. In a study with three members of the family, the contacts between the protein and bases flanking the flipped-out base were examined. A subsequent study identified short protein loops contacting the nearby bases, which varied among the three proteins examined (LpnPI (ref 208; PDB 4RZL), AspBHI (ref 203; PDB 4OC8), and SgrTI). Structure-guided mutations were made that swapped these loops between family members. A significant reduction of selectivity for sequence flanking the modified position was achieved this way.208,209 The C-terminal cleavage domain is a variant of the PD(D/E)XK domain (DX20QAK, or Mrr-Cat). As is true for many RE, binding of two specific sites is needed to trigger cleavage. The cocrystal structure of MspJI with its substrate202 (PDB 4R28) showed that this is accomplished with a tetrameric assembly. 
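The MspJI cleavage geometry quoted above (one cut 12 nt from the m5C, the opposite-strand cut 16−17 nt away, leaving a 4-base 5′ extension) amounts to simple coordinate arithmetic. The Python sketch below is an illustration only. The function name, the 0-based indexing, the choice of 16 rather than 17 for the opposite strand, and the assumption that both cuts lie 3′ of the m5C on the strand that carries it are conventions adopted here, not statements from the cited studies.

```python
def mspji_like_cut(seq, m5c_index, top_offset=12, bottom_offset=16):
    """Locate the staggered cut implied by fixed offsets from a modified C.

    seq: top-strand sequence, written 5' to 3'.
    m5c_index: 0-based index of the modified cytosine in seq.
    Returns (top_cut, bottom_cut, overhang): each cut falls immediately 3'
    of the returned top-strand index, and overhang is the single-stranded
    5' extension left between the two cuts.
    Hypothetical helper under the assumptions stated in the lead-in.
    """
    if seq[m5c_index].upper() != "C":
        raise ValueError("m5c_index does not point at a cytosine")
    top_cut = m5c_index + top_offset
    bottom_cut = m5c_index + bottom_offset
    if bottom_cut >= len(seq) - 1:
        raise ValueError("sequence too short for the bottom-strand cut")
    # Bases between the two cuts stay single-stranded after cleavage.
    overhang = seq[top_cut + 1 : bottom_cut + 1]
    return top_cut, bottom_cut, overhang
```

With the default offsets the overhang is always four bases long, matching the 4-base extension described in the text; passing `bottom_offset=17` models the alternative cut position and gives a five-base stagger instead.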
Catalytic centers that carry out double-strand cleavage are assembled from different monomers, possibly set to make four scissions in two DNA targets out of four bound DNAs. The mechanism of integration of binding with cleavage is thus complex.

5.6.3. PvuRts1I Family: PD(D/E)XK-SRA DBD Fusion. PvuRts1I was reported to restrict glucosylated T-even phages in vivo, with apparent preference for particular glucosylation patterns.210 Recent work confirmed activity on glucosylated DNA in vitro, and characterized the nature of the modification recognized and position of cleavage for three homologues: PvuRts1I itself, AbaSI (originally AbaSDFI in Wang et al.211) and PpeHI.211,212 As a group, these recognize m5C slightly, and hm5C and ghm5C better, with differing preferences; AbaSI is highly specific for ghm5C. They also display weak and variable selectivity for the sequence surrounding the modified base. Sequence preferences were determined by cloning digested fragments of fully substituted DNA and sequencing the clone junctions. Twenty active homologues out of 28 candidates were studied in a follow-up analysis using cleavage of degenerate oligonucleotide substrates.213 Dimeric in solution, all 20 required two modified sites (half-sites) for double-strand cleavage, with spacing between half-sites of ∼22 nucleotides. Incision occurs ∼11−13 nt 3′ to the modified base on one strand and 9−10 nt 3′ on the other. Crystal structure analyses of PvuRts1I (PDB 4OQ2, 4OKY) in the absence of DNA214,215 agree that binding is mediated by a C-terminal SRA-like domain, and that the N-terminal cleavage domain is a divergent member of the PD(D/E)XK family,


ScoA3McrA (ScoMcrA) comprises a distinct N-terminal DNA binding domain (not similar to EcoKMcrA) fused to a C-terminal HNH cleavage domain somewhat similar to the EcoKMcrA HNH domain. It is active in vivo228 and in vitro230 on Dcm-modified DNA (Cm5CWGG), on DNA with phosphorothioate linkages, or on DNA carrying both. Cleavage required 100 μM Mn2+ or Co2+; other divalent cations were ineffective. Prolonged incubation leads to cleavage of unmodified DNA as well. Preferred target sites were cleaved at various positions up to 13 nt 5′ to the m5C, as determined by cloning digested DNA. ScoA3McrA will cleave in proximity to phosphorothioate linkages as well, replicating the pattern obtained by hydrolyzing the linkages with peracid treatment. Activity on hm5C- or ghm5C-containing sites has not been reported.

5.6.8. GmrSD Family: ParB/Srx DBD-HNH Fusion. EcoCTGmrSD was first characterized as a two-component enzyme specific for glucosylated T-even phage DNA.231 Later resequencing suggested that the clone characterized carried a stop codon and small deletion relative to the true genomic sequence. This resulted in the separation of components, which nevertheless reconstituted a functional enzyme in vitro.232 Homologues are distributed widely in bacteria and archaea, in most cases as fusion genes, but sometimes apparently unfused.233 A single-chain homologue, Eco94GmrSD, was characterized and compared with the earlier results for the two-chain enzyme. Detailed in vitro characterization has been hampered by toxicity.232 However, some conclusions are clear. Both GmrSD family members restrict glucosylated T4 in vivo only when the T4-encoded inhibitor IP1* is absent. IP1* presence in the cell rescues some of the toxic effects of expression, which presumably result from untargeted cleavage. In vitro, both enzymes prefer ghm5C DNA to unmodified C and are stimulated by nucleotides. They differ in action on hm5C (Eco94GmrSD cleaves it) and nucleotide preference.
Eco94GmrSD activity decays on storage at −20 °C. This is reversed by fresh dithiothreitol (DTT). Three of six alanine replacement mutations of residues aligned with the predicted HNH motif relieved toxicity and reduced or eliminated in vitro cleavage activity. Motif analysis and structural modeling233 assigned GmrS to DUF262 (PF03235), which was always N-terminal to GmrD, DUF1524 (PF07510). Key residues in the GmrD model appear to correspond to those shown to be required for cleavage.232 Similar analysis proposed assignment of a ParB/Srx fold to GmrS.233 ParB family proteins bind to specific DNA sites, are required for conjugal plasmid segregation, and have been known to have DNA cleavage activity for some time.234 A recent report found a ParB-related protein to exhibit both nuclease and ATPase activity,235 and proposed similarity to sulfiredoxin enzymes. Sulfiredoxins use ATP and thioredoxin or glutaredoxin to drive reduction of sulfinic acid (e.g., overoxidized cysteines) in proteins.236 The response of Eco94GmrSD activity to DTT after storage (see above) is compatible with the idea that oxidation state is important for function. 5.6.9. SauUSI PLDc-Helicase-DUF3427 DBD Fusion. SauUSI is structured with N-terminal DNA cleavage, C-terminal DNA binding, and (d)ATP-dependent helicase/translocase in the middle. The enzyme recognizes m5C and hm5C but not ghm5C or m4C,237 with a consensus recognition Sm5CNGS, based on modification patterns created by sensitizing M’s. Like a few other site-specific nucleases, the cleavage domain belongs to the phospholipase domain family PLDc-2 (Pfam13091). Requirement for four key amino acids was

carry out the coupling; third, compartmentalization in vivo in the native host might allow attack on incoming DNA but not the chromosome. 5.6.5. EcoKMrr: Mrr-N DBD-Mrr-Cat Fusion. Published evidence of EcoKMrr function also is limited to in vivo effects. Initial recognition of the activity relied on reduced transmission in Mrr-containing hosts of certain cloned M’s associated with type II RE, particularly m6A-forming M.PstI (CTGCm6AG) and M.HhaII (Gm6ANTC) but also a variety of m5C forming M’s.218,219 This reduction was shown to be directed specifically to the modified bases by tests using site-specifically modified lambda phage.220 The type III enzyme StyLTI (CAGm6AG) is specifically incompatible with EcoKMrr.221 Note that this incompatibility depends on the particular modified sequence, not the enzyme class in toto, since type III enzymes EcoP1 (AGm6ACC) and EcoP15I (CAGCm6AG) are compatible with EcoKMrr. Mrr almost certainly cleaves methylated DNA. Heitman and Model showed that sensitive M clones cause induction of the DNA repair response (SOS), using the dinD::lacZ reporter system to measure damage.218 This strongly suggests that the restriction involves DNA cleavage. Bioinformatic fold-recognition analysis yielded prediction of a modified PD(D/E)XK domain, dubbed Mrr-Cat (pfam04471),222 and later prediction of an N-terminal winged helix with presumptive DNA binding activity.223 This is the MrrN (pfam14338) domain. Mutations in conserved residues in these domains abrogated a high-pressure sensitivity phenotype associated with the presence of EcoKMrr. Other enzymes with Mrr-Cat have been characterized in vitro (see section 5.6.2, MspJI family). We have found no report of a Mrr-N domain characterized in this way. 5.6.6. EcoKMcrA: EcoMcrA-N DBD-HNH Fusion. McrA is responsible for one of the earliest described restriction phenomena. 
It is active in vivo on m5C in particular sequences, and phage substituted with hm5C but not ghm5C.104,114 Like Sco5333, McrA DNA binding (to Ym5C) was demonstrated in vitro, but modification-dependent cleavage was not reported.224,225 Extensive mutagenesis226 confirmed the presence of two domains: an N-terminal region expected to bind to DNA, and a C-terminal HNH (PF01844) presumptive cleavage domain, as predicted by bioinformatic analysis.227 In vivo evidence for cleavage activity was induction of the DNA repair response (SOS), using the dinD::lacZ reporter system to measure damage. A mutation in the N-terminal domain enabled in vivo discrimination between m5C (still recognized) and hm5C (not recognized).226 In vitro, gel shift experiments confirmed and clarified recognition: C-terminally tagged McrA bound specifically to (Y>R)m5CGR. A hemimethylated site was bound as effectively as a fully methylated site. Mismatches at the recognition site abrogated binding. An N-terminal affinity tag interfered with binding, further confirming the importance of the N-terminal segment for binding.224

5.6.7. ScoA3McrA: ScoA3McrA(N) DBD-HNH. Streptomyces coelicolor A3 restricts transformation of plasmids with modified bases.228 ScoA3McrA is one of at least four gene products responsible. It also targets DNA from bacteria that express the Dnd (DNA degradation) system,229 which adds phosphorothioate modification to a nonbridging oxygen of the phosphodiester backbone in favored degenerate DNA sequence contexts.
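The recognition sites quoted in this section (for example Cm5CWGG for Dcm and Gm6ATC for DpnI) use degenerate single-letter codes: R (purine), Y (pyrimidine), S (C or G), W (A or T), and N (any base). As a rough illustration only, the Python sketch below converts such a motif into a regular expression and lists the positions of the base that would carry the modification. The function names, the example sequences, and the choice to scan only the top strand are assumptions made for this sketch; it reports candidate positions and says nothing about whether a site is actually modified or cleaved.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "S": "[CG]",    # C or G
    "W": "[AT]",    # A or T
    "N": "[ACGT]",  # any base
}


def motif_to_regex(motif):
    """Compile a degenerate motif into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead keeps the match zero-width so overlapping sites are reported.
    return re.compile("(?=(" + body + "))")


def find_sites(seq, motif, mod_offset):
    """Return 0-based top-strand positions of the candidate modified base.

    mod_offset is the index of the modified base within the motif
    (hypothetical parameter, e.g. 1 for the second C of CCWGG).
    """
    pattern = motif_to_regex(motif)
    return [m.start() + mod_offset for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Dcm sites (Cm5CWGG): the inner C is the methylated base.
    print(find_sites("TTCCAGGAACCTGGTT", "CCWGG", 1))
    # DpnI sites (Gm6ATC): the A is the methylated base.
    print(find_sites("AAGATCAA", "GATC", 1))
```

Because the sites used here are palindromic, a top-strand scan finds every site, but the modified base on the complementary strand sits at a different coordinate; a fuller treatment would also report that position.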


Table 3. Restriction Inhibitors and Their Modes of Action

inhibitor | target | origin | mode of action | notes
Ocr | type I | phage T7 | DNA mimic257,259 | expressed before injection is complete258
Ral | type I | phage lambda | promotes de novo modification260 | molecular mechanism undefined
Lar | type I | Rac cryptic prophage | promotes de novo modification261,262 | molecular mechanism undefined
DarA, DarB | type I | phage P1 | site occlusion | injected with DNA263
Stp | type I | phage T4 | disassembly | releases another antiphage strategy
ArdA | type I | conjugal plasmid | DNA mimic; preferentially inhibits R264 | near transfer origin, promoter in a hairpin of entering ssDNA265
ClpXP | type I | host | degrades R protein266−268 | protects vulnerable sites during DNA repair
ArdB, KlcA | type I | conjugal plasmid | target unknown; globular protein269 | near transfer origin, expressed from a promoter in a hairpin of entering ssDNA
ArdC | type I | conjugal plasmid | protects single strands in vitro270 |
Arn | McrBC | T4 | DNA mimic; also displaces H-NS278 | only protects superinfecting phage275−277
IP1 | GmrSD | phage T4 | not a mimic281 | highly specific; injected with DNA282
StpA | type II, DNase I | host | condenses DNA271,272 | interference observed in vitro
UGI, p56 | UDG | phages PBS2;250 Phi29249 | DNA mimic; stoichiometric250,251 | multiple independent folds

confirmed with changes to alanine. This cleavage domain normally does not require a divalent cation, but EDTA-treated SauUSI depended on Mg2+, possibly because of the ATP requirement conferred by the helicase component. The C-terminal DUF3427 domain is presumed to mediate DNA binding, but this was not explored. Biological activity was verified by transformation and infection experiments in homologous and heterologous hosts. A plasmid bearing active SauUSI could not be introduced into a Dcm+ E. coli host; when established in active orientation in a Dcm-deficient host, restriction of Dcm-modified phage λ was observed. In Staphylococcus aureus, the original host, restriction of plasmid transformation was also observed to depend on the presence of m5C, again in Dcm sites.

5.6.10. DpnI: PD(D/E)XK-Winged Helix DBD. The first enzymatically characterized MDE,238 DpnI (Gm6ATC) is highly specific in comparison with the others, approaching the selectivity of ordinary type IIP RE. Biologically, it is a component of the alternative cassette system described above (section 5.4.2), effectively restricting phage but not transforming DNA. DpnI cleaves fully modified Gm6ATC sites 3′ to the adenine, leaving a blunt end. Under suitable conditions it will leave 80% of hemimethylated DNA uncut.239 Most work on this enzyme has compared unmodified with site-specifically modified substrates. Availability of nonspecific m6A-forming M.HaeV (M.Hia5) recently enabled detection of cleavage at relaxed sites, Sm6ATS240 (see also REBASE http://rebase.neb.com/cgi-bin/msget?DpnI for further details). DpnI is unlike many type IIP enzymes, which recognize DNA via half-site binding by components of a dimer. It is also unlike other type IV families, in that the modified bases (Gm6ATC) recognized by DpnI are not extrahelical, although the helix is locally distorted. Nevertheless, like members of other type IV families, DpnI is structurally a fusion.
In this case, an N-terminal PD(D/E)XK nuclease domain is joined with a winged helix (wH) variant of the helix-turn-helix fold family.240,241 Some of the modification specificity is present in the isolated cleavage domain; it preferentially cleaved modified DNA, and was blocked by competition from the isolated binding domain.240 The wH domain and the catalytic domain of a single polypeptide bind to different DNA sites. In the first structure240

(PDB 4ESJ), with 1:1 DNA:protein ratio, the cleavage domain was not in position to act. The second structure241 (PDB 4KYW), obtained at 2:1 DNA:protein ratio in the presence of cleavage-inhibiting Ca2+, showed details of recognition by both domains. Compared with the first structure, the two domains were rotated relative to each other ∼75° by means of unwinding of two helices, one in each domain; disordered loops of the catalytic domain became fixed, approaching the DNA from both major and minor grooves. A single DpnI catalytic domain contacts both methyl groups of the recognition site; these are wedged apart relative to unmethylated B-form DNA structures recorded for GATC sites. Independent binding by the two domains may increase activity of the intact protein by increasing the local concentration of sites at which the other domain can act. However, Mierzejewska et al.241 also present evidence that recognition by the two domains is integrated by more than such affinity effects. DNA binding by each domain shielded critical residues from deuterium exchange with solvent. Mutations affecting binding capacity of one domain affected shielding of the other, suggesting communication between domains.

5.6.11. GlaI Family, Unidentified Domains. Several restriction enzymes with reasonably specific m5C-containing sites are available commercially. GlaI, recognizing GCGC, is best characterized. Full activity requires m5C at four positions, two on each strand.242 However, changing either of the outer m5C:G positions to T:A yielded 60% activity, suggesting that the methyl group at C5 is more important than the base or its complement. This was a better substrate than the oligonucleotide with three m5C:G and C:G in an outer position. T:A in an inner position was a poor substrate even with the remaining three m5C:G (20% activity), as was C:G in either inner position. Thus, base and methylation specificity is high for m5CpG:Gm5C, and relaxed in the flanks.

5.7. Restricting Modified DNA with DNA Repair Enzymes

Virulent phages containing noncanonical nucleotides can be targets of enzymes that normally repair damaged DNA. These unusual bases can be recognized by repair enzymes, leading to cleavage. Two additional modification-dependent phage-restriction processes stem from such activity, one long known but the other very recently identified.


modifications. Finally, the phage may counter with inhibitors directed to the hypermodification-recognizing enzyme. Inhibitors can play a role both early and late in this evolutionary spiral, and may be as variable as the restriction systems themselves. Those characterized tend to be specific for one enzyme type or subtype. Table 3 briefly summarizes these with references to biological effects and structural information, where known.257−272 Restriction inhibitors active against type II and type III RE have not been described. Two sorts of reasons for this can be proposed. First, it may be that no one has looked in the right way, a very plausible scenario. Systematic study of diverse phages or plasmid transfer in a variety of restricting hosts would set the stage for discovery. The abundance of open reading frames (ORFs) of unidentified function in phage genomes273 leaves much room for discovery. Alternatively, the problem of evolving an inhibitor may be a high-investment, low-payoff situation for mobile elements. Though ubiquitous, type II RE are extremely diverse in domain structure and recognition site, possibly making creation of general-purpose inhibitors difficult. Type III RE are fairly infrequent, even after mining the accumulating genome sequences,110 so pressure to evolve inhibitors may be low also. In contrast, type I inhibitors are frequent. Many phages and mobile elements express inhibitory proteins that specifically defend against these ubiquitous systems. There are numerous points of leverage for this family, given complexity and commonality of cofactor requirement, assembly mechanism, and organization. Accordingly, the inhibitory mechanisms are themselves diverse (Table 3; see also refs 99 and 274). None of the inhibitors have been tested for activity against type IIG enzymes, which are related to type I but do not require ATP, and have R and M domains in a single polypeptide.109 Type IV inhibitors may be more abundant than is evident in Table 3. 
Again, they have not been systematically sought. Those that are known are heritage examples, legacies of the intensive study of phage genetics and biochemistry before the era of cloning. T4 expresses an McrBC-inhibiting protein, Arn, which can protect superinfecting hm5C-containing phage from restriction275−277 and could also play a role permitting (inefficient) phage growth in the absence of the glucosyl donor, UDP-glucose. Ho et al. carried out a structural study278 of this protein. Based on the resulting negatively charged surface and its disposition, they proposed that the protein acts as a DNA mimic. A docking approach enabled a model of Arn binding to the N-terminal DBD of McrB. Ho et al. also demonstrated in vitro interaction of Arn with the histone-like protein H-NS, and proposed that Arn could titrate the protein from DNA. This could help to counteract potential gene-silencing effects of H-NS in infected cells. Such effects of H-NS have been reported (e.g., for Salmonella pathogenicity islands279), but not for T4. Another role for Arn was sought because the antirestriction property in this case seems insufficient to account for its conservation. That is because homologues of McrBC are not universally distributed in E. coli, and in any event the glucosyl donor that affords protection from it is always present in nature.280 Neutralization of the proposed antiphage activity of H-NS binding could provide the necessary selective advantage. Another T4-encoded protein, IP1*, is packaged in the phage particle and injected into the new host to inhibit EcoCTGmrSD.231,281 The inhibition is not effective against all

Damaged bases resulting from chemical insults (deamination and alkylation especially) are targets of glycosylases and nucleases that result in removal of the damage and repair by base excision repair (BER) enzymes (see, e.g., van der Veen243). The novel enzyme R.PabI has adopted a glycosylase-lyase base-excision strategy to cleave a specific unmodified palindromic site as well.119,192

5.7.1. Repair Glycosylase UDG. Uracil DNA glycosylase (UDG) is a universal repair enzyme that removes uracil bases from double-stranded DNA. Base-excision repair enzymes proceed to act at the abasic site, reincorporating thymidine. Uracil may arise as a product of cytosine deamination in situ. If unrepaired, C → T transition mutations would result. Like other DBDs involved in type IV restriction, UDG flips the base out of the helix for recognition.244 In this case, enzyme action removes the modified base, and DNA incision results from the action of a separate enzyme.243 As described above (section 3.2.1), some bacteriophages, such as the Bacillus subtilis phage PBS2, are fully U-substituted62,245 with resulting resistance to some type II RE.246 These phages normally survive because UDG activity is inhibited by a phage protein (see Table 3); PBS2 DNA is extensively degraded when protein synthesis is inhibited early in infection. Other Bacillus phage polymerases, such as Phi29, discriminate T from dU poorly during polymerization, and hence these phages must eliminate the normal repair activity during replication.247,248 Interestingly, the PBS2 and Phi29 inhibitors have completely different folds.249−251

5.7.2. Repair Nuclease Nfi (Endonuclease V). The diversity of base modifications that can serve as “restriction-eliciting” moieties was recently broadened. DNA of the Pseudomonas phage PaP1 was “unclonable” for many years. Recently, endonuclease V of E. coli was shown to be responsible for this problem.252 The nature of the modified base responsible for this has not been clarified, but is likely a modified purine, based on the properties of endonuclease V. E. coli endonuclease V (pfam04493; cl00653) is a divalent cation-dependent repair enzyme of the endonuclease superfamily (CL0189 at EBI; cl00653 at NCBI). It cleaves the second phosphodiester 3′ to inosine in dsDNA, and is responsible for suppressing mutagenesis (presumably via mispairing) by deamination of purines (xanthine and hypoxanthine) following nitrosoylating insults.253,254 In vitro, it also recognizes structural alterations such as mismatches, flaps, and Y-junctions less efficiently, especially at higher pH in the presence of Mn2+. The crystal structure of the E. coli apoenzyme (missing a flexible 9 aa C-terminal tail)255 resembled that of the Thermotoga maritima enzyme. The Tma EndoV cocrystal structure256 revealed the damaged base flipped toward the minor groove, in contrast to DNA glycosylase action, which rotates the base toward the major groove. A key result of the Tma EndoV structure was identification of a strand-separating “wedge”, proposed to sense damaged bases from the minor groove and facilitate recognition of damaged bases and DNA structure aberrations.

5.8. Inhibition of RE Action

In the coevolving universe of phages and their hosts, RE attack on unmodified sites by hosts may be countered by novel base incorporation by phages, then rebutted with MDE by hosts, and then further countered with hypermodification, and back again with development of MDE that accommodate the additional


GmrSD homologues and is not a DNA mimic, so its mechanism of action is unclear. Other T-even phages may express similar but distinct inhibitors,282 suggesting a long and complex history of coevolution between diverse host MDE systems and their specific inhibitors.

undergraduate at the State University of New York at Albany, he worked in the laboratory of the late Peter M. Snow assisting in experiments aimed at understanding the role of cell adhesion molecules in axonal guidance during Drosophila nervous system development. A chance encounter with Buckminster Fuller’s Cosmography sparked a continuing fascination with geodesic domes and viral capsid structures, which led him to the laboratory of Sherwood Casjens at the University of Utah. There, Peter did his graduate work on the molecular biology of the Salmonella phage P22 scaffolding protein, a capsid assembly chaperone. After graduate school, he joined the lab of Prof. Jonathan King at the Massachusetts Institute of Technology, where he worked on diverse projects centered on phages, including the culture and genomics of marine cyanophages infecting Prochlorococcus and Synechococcus. Peter currently directs a small group at New England Biolabs focused on identifying DNA modification systems in bacteriophages and developing new enzymatic tools for the manipulation of DNA.

6. FUTURE DIRECTIONS

Bacteria and their viruses continue to be a rich source of discoveries in the chemistry and biology of DNA base modification. Looking ahead, we anticipate some fruitful avenues of inquiry. Clearly, DNA modification and hypermodification systems impart fitness advantages to bacteriophages and their hosts in defensive interactions exemplified by restriction activities. Just as clearly, even the census of modifications is incomplete, not to mention that of biosynthetic pathways involved. What additional DNA modifications exist in bacteria and their viruses? The Pacific Biosciences SMRT sequencing technology invites discovery and characterization of such modifications. Nanopore machines on the horizon may also prove useful for this. By what enzymatic mechanisms are they formed? Given that such modifications often protect DNA from restriction endonucleases, what further host countermeasures have evolved? In the simplest case, are there restriction activities that recognize hypermodification moieties other than glucosyl-hm5C? These questions may yield completely new protein domains, as shown by new protein folds found among the type IV restriction enzymes. What additional new protein folds contribute to DNA modification, or to the recognition of those modifications, or to otherwise managing the fate of modified DNA? Are there previously characterized protein folds that have been co-opted to participate in as yet undiscovered types of DNA modification or restriction processes? Once a modification is available, its presence may be drafted to serve other roles in cell physiology, as has happened in the case of Dam and CcrM methylation. We expect the additional universe of modifications described here (and yet to be discovered) will have been drafted in other cases for other purposes, such as modulation of pathogenicity, transposition, bacterial adaptive immunity (CRISPR), and horizontal exchange.

Elisabeth A. Raleigh is Emeritus Scientist in the Research Department of New England Biolabs, Inc. in Ipswich, MA. Her research focuses on the limits to horizontal exchange in bacteria. Her pioneering work on enzymes that attack the very modified bases needed to protect DNA from cleavage by newly introduced restriction systems also yielded useful tools for the study of epigenetic processes in eukaryotes. The theme of the lab has been use of genetic approaches to assist understanding the in vitro and in vivo behavior of these and other enzymes. At present, the dynamics and mechanism of acquisition and replacement of RM gene loci are the focus. She received her B.A. with Distinction from Swarthmore College, and Ph.D. from the Massachusetts Institute of Technology. A postdoctoral fellowship with Nancy Kleckner at Harvard University working with transposon Tn10 followed. She began her independent career at New England Biolabs, becoming successively senior scientist, Director of Prokaryotic Research, Director of Research and Development, and Head of Technology Assessment, before returning to her scientific passion. She has published more than 50 scientific articles and patents, in journals such as Proceedings of the National Academy of Sciences of the United States of America, Nucleic Acids Research, and Journal of Molecular Biology.

ACKNOWLEDGMENTS

This work was supported by Don Comb and New England Biolabs. We thank Brian Anton, Shuang-yong Xu, and Christopher Noren for critical commentary and Bill Jack and Rich Roberts for stimulating discussions. Two anonymous reviewers made important suggestions that were incorporated with gratitude.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.chemrev.6b00114. Families of type IV restriction enzymes (MDE); repair enzymes and MDE-related enzymes interfering with bacteriophage infection (PDF)

ABBREVIATIONS

hm5C: 5-hydroxymethyldeoxycytosine
hm5dCMP: 5-hydroxymethyldeoxycytidine monophosphate
hm5dCTP: 5-hydroxymethylcytosine triphosphate
hm5dU: 5-hydroxymethyldeoxyuracil
hm5dUDP: hydroxymethyldeoxyuridine diphosphate
hm5dUMP: hydroxymethyldeoxyuridine monophosphate
hm5dUTP: hydroxymethyldeoxyuridine triphosphate
ghm5C: glucosyl-5-hydroxymethylcytosine
m5C: 5-methyldeoxycytosine
m5dCTP: 5-methylcytosine triphosphate
A: adenine
α-GT: α-glucosyltransferase
AdoMet: S-adenosyl methionine
ATP: (ribo)adenosine triphosphate
β-GT: β-glucosyltransferase
BAP: bacterial alkaline phosphatase

AUTHOR INFORMATION

Corresponding Author

*Tel.: 978-380-7238. E-mail: [email protected].

Notes

The authors declare the following competing financial interests: The review discusses restriction enzymes, which are supplied commercially by their employer, New England Biolabs.

Biographies

Peter Weigele got his start in biology while working in the retail pet trade as a high school student in northern New Jersey. While an


Chemical Reviews BER C CE dAMP DBD dCMP dCTP DEAE dGMP DHF DNA DNA-MT Dnd dNMP dNTP dTMP dTTP DUF dUMP dUTP dYMP G GNAT GTP HCl hm’ase HNH HPLC M m’ase MDE Mg2+ m4 C m6 A NMR NTP PLDc R R RE RM S S SAM SRA T THF TRD TS UDG UGI Y

Review

(3) Hotchkiss, R. D. The Quantitative Separation of Purines, Pyrimidines, and Nucleosides by Paper Chromatography. J. Biol. Chem. 1948, 175 (1), 315−332. (4) Ledinko, N. Occurrence of 5-Methyldeoxycytidylate in the DNA of Phage Lambda. J. Mol. Biol. 1964, 9 (3), 834−835. (5) Gough, M.; Lederberg, S. Methylated Bases in the Host-Modified Deoxyribonucleic Acid of Escherichia coli and Bacteriophage Lambda. J. Bacteriol. 1966, 91 (4), 1460−1468. (6) Dunn, D. B.; Smith, J. D. Occurrence of a New Base in the Deoxyribonucleic Acid of a Strain of Bacterium Coli. Nature 1955, 175 (4451), 336−337. (7) Dunn, D. B.; Smith, J. D. The Occurrence of 6-Methylaminopurine in Deoxyribonucleic Acids. Biochem. J. 1958, 68 (4), 627−636. (8) Janulaitis, A.; Klimasauskas, S.; Petrusyte, M.; Butkus, V. Cytosine Modification in DNA by Bcni Methylase Yields N4-Methylcytosine. FEBS Lett. 1983, 161 (1), 131−134. (9) Ehrlich, M.; Gama-Sosa, M. A.; Carreira, L. H.; Ljungdahl, L. G.; Kuo, K. C.; Gehrke, C. W. DNA Methylation in Thermophilic Bacteria: N4-Methylcytosine, 5-Methylcytosine, and N5methyladenine. Nucleic Acids Res. 1985, 13 (4), 1399−1412. (10) Ehrlich, M.; Wilson, G. G.; Kuo, K. C.; Gehrke, C. W. N4Methylcytosine as a Minor Base in Bacterial DNA. J. Bacteriol. 1987, 169 (3), 939−943. (11) Wyatt, G. R.; Cohen, S. S. The Bases of the Nucleic Acids of Some Bacterial and Animal Viruses: The Occurrence of 5-Hydroxymethylcytosine. Biochem. J. 1953, 55 (5), 774−782. (12) Sinsheimer, R. L. Nucleotides from T2r+ Bacteriophage. Science 1954, 120 (3119), 551−553. (13) Arber, W.; Hattman, S.; Dussoix, D. On the Host-Controlled Modification of Bacteriophage Lambda. Virology 1963, 21, 30−35. (14) Arber, W.; Linn, S. DNA Modification and Restriction. Annu. Rev. Biochem. 1969, 38 (500), 467−500. (15) Wyatt, G. R. The Purine and Pyrimidine Composition of Deoxypentose Nucleic Acids. Biochem. J. 1951, 48 (5), 584−590. (16) Volkin, E.; Khym, J. X.; Cohn, W. E. The Preparation of Desoxynucleotides. J. Am. 
Chem. Soc. 1951, 73 (4), 1533−1536. (17) Lichtenstein, J.; Cohen, S. S. Nucleotides Derived from Enzymatic Digests of Nucleic Acids of T2, T4, and T6 Bacteriophages. J. Biol. Chem. 1960, 235 (4), 1134−1141. (18) Randerath, K. Thin-Layer Chromatography of Nucleotides on Layers of Cellulose Ion-Exchangers. Nature 1962, 194, 768−769. (19) Khym, J. X.; Cohn, W. E. The Ion-Exchange Separation of the 5′Ribonucleotides and Deoxyribonucleotides. Biochim. Biophys. Acta 1954, 15 (1), 139. (20) Gehrke, C. W.; Zumwalt, R. W.; McCune, R. A.; Kuo, K. C. Quantitative High-Performance Liquid Chromatography Analysis of Modified Nucleosides in Physiological Fluids, Trna, and DNA. Recent Results Cancer Res. 1983, 84, 344−359. (21) Gehrke, C. W.; McCune, R. A.; Gama-Sosa, M. A.; Ehrlich, M.; Kuo, K. C. Quantitative Reversed-Phase High-Performance Liquid Chromatography of Major and Modified Nucleosides in DNA. J. Chromatogr. 1984, 301 (1), 199−219. (22) McCloskey, J. A. Structural Characterization of Natural Nucleosides by Mass Spectrometry. Acc. Chem. Res. 1991, 24 (3), 81− 88. (23) Dudley, E.; Bond, L. Mass Spectrometry Analysis of Nucleosides and Nucleotides. Mass Spectrom. Rev. 2014, 33 (4), 302−331. (24) Caugant, D. A.; Levin, B. R.; Selander, R. K. Distribution of Multilocus Genotypes of Escherichia coli within and between Host Families. J. Hyg. 1984, 92 (3), 377−384. (25) Phillips, D. H.; Arlt, V. M. The 32p-Postlabeling Assay for DNA Adducts. Nat. Protoc. 2007, 2 (11), 2772−2781. (26) Phillips, D. H. Detection of DNA Modifications by the 32pPostlabelling Assay. Mutat. Res., Fundam. Mol. Mech. Mutagen. 1997, 378 (1−2), 1−12. (27) Stach, D.; Schmitz, O. J.; Stilgenbauer, S.; Benner, A.; Dohner, H.; Wiessler, M.; Lyko, F. Capillary Electrophoretic Analysis of Genomic DNA Methylation Levels. Nucleic Acids Res. 2003, 31 (2), e2.

base excision repair cytosine capillary electrophoresis deoxyadenosine monophosphate DNA binding domain deoxycytidine monophosphate deoxycytidine triphosphate diethylaminoethyl deoxyguanosine monophosphate dihydrofolate deoxyribonucleic acid DNA methyltransferase DNA degradation gene deoxynucleotide monophosphate deoxynucleotide triphosphate deoxythymidine monophosphate deoxythymidine triphosphate domain of unknown function deoxyuridine monophosphate deoxyuridine triphosphate deoxypyrimidine monophosphate guanine GCN5-related N-acetyltransferase (ribo)guanosine triphosphate hydrogen chloride hydroxymethylase “nuclease signature found in homing endonucleases, colicins, and MutS” high pressure liquid chromatography modification subunit or DNA methyltransferase methyltransferase modification dependent restriction enzyme magnesium cation N4-methylcytosine N6-methyladenine nuclear magnetic resonance (ribo)nucleotide triphosphate phospholipase D subgroup c motif purine restriction subunit or enzyme restriction enzyme restriction−modification C or G specificity subunit S-adenosylmethionine set and ring finger associated domain thymine tetrahydrofolate target recognition domain thymidylate synthase uracil DNA glycosylase uracil glycosylase inhibitor pyrimidine

REFERENCES

(1) Johnson, T. B.; Coghill, R. D. Researches on Pyrimidines. C111. The Discovery of 5-Methyl-Cytosine in Tuberculinic Acid, the Nucleic Acid of the Tubercle Bacillus. J. Am. Chem. Soc. 1925, 47 (11), 2838−2844. (2) Doskočil, J.; Šormová, Z. The Occurrence of 5-Methylcytosine in Bacterial Deoxyribonucleic Acids. Biochim. Biophys. Acta, Nucleic Acids Protein Synth. 1965, 95 (3), 513−515.

DOI: 10.1021/acs.chemrev.6b00114 Chem. Rev. 2016, 116, 12655−12687

Chemical Reviews

Review

(49) Swinton, D.; Hattman, S.; Crain, P. F.; Cheng, C. S.; Smith, D. L.; McCloskey, J. A. Purification and Characterization of the Unusual Deoxynucleoside, Alpha-N-(9-Beta-D-2′-Deoxyribofuranosylpurin-6-Yl)Glycinamide, Specified by the Phage Mu Modification Function. Proc. Natl. Acad. Sci. U. S. A. 1983, 80 (24), 7400−7404. (50) Kaminska, K. H.; Bujnicki, J. M. Bacteriophage Mu Mom Protein Responsible for DNA Modification Is a New Member of the Acyltransferase Superfamily. Cell Cycle 2008, 7 (1), 120−121. (51) Khudyakov, I. Y.; Kirnos, M. D.; Alexandrushkina, N. I.; Vanyushin, B. F. Cyanophage S-2L Contains DNA with 2,6-Diaminopurine Substituted for Adenine. Virology 1978, 88 (1), 8−18. (52) Kirnos, M. D.; Khudyakov, I. Y.; Alexandrushkina, N. I.; Vanyushin, B. F. 2-Aminoadenine Is an Adenine Substituting for a Base in S-2L Cyanophage DNA. Nature 1977, 270 (5635), 369−370. (53) Nikolskaya, I. I.; Lopatina, N. G.; Debov, S. S. Methylated Guanine Derivative as a Minor Base in the DNA of Phage DDVI Shigella Disenteriae. Biochim. Biophys. Acta, Nucleic Acids Protein Synth. 1976, 435 (2), 206−210. (54) Nikolskaya, I. I.; Tediashvili, M. I.; Lopatina, N. G.; Chanishvili, T. G.; Debov, S. S. Specificity and Functions of Guanine Methylase of Shigella sonnei DDVI Phage. Biochim. Biophys. Acta, Nucleic Acids Protein Synth. 1979, 561 (1), 232−239. (55) Thiaville, J. J.; Kellner, S. M.; Yuan, Y.; Hutinet, G.; Thiaville, P. C.; Jumpathong, W.; Mohapatra, S.; Brochier-Armanet, C.; Letarov, A. V.; Hillebrand, R. Novel Genomic Island Modifies DNA with 7-Deazaguanine Derivatives. Proc. Natl. Acad. Sci. U. S. A. 2016, 113 (11), E1452−E1459. (56) Kulikov, E. E.; Golomidova, A. K.; Letarova, M. A.; Kostryukova, E. S.; Zelenin, A. S.; Prokhorov, N. S.; Letarov, A. V. Genomic Sequencing and Biological Characteristics of a Novel Escherichia coli Bacteriophage 9g, a Putative Representative of a New Siphoviridae Genus. Viruses 2014, 6 (12), 5077−5092. (57) Takahashi, I.; Marmur, J. 
Replacement of Thymidylic Acid by Deoxyuridylic Acid in the Deoxyribonucleic Acid of a Transducing Phage for Bacillus subtilis. Nature 1963, 197, 794−795. (58) Price, A. R. Bacteriophage PBS2-Induced Deoxycytidine Triphosphate Deaminase in Bacillus subtilis. J. Virol. 1974, 14 (5), 1314−1317. (59) Price, A. R.; Frato, J. Bacillus subtilis Deoxyuridinetriphosphatase and Its Bacteriophage PBS2-Induced Inhibitor. J. Biol. Chem. 1975, 250 (22), 8804−8811. (60) Price, A. R.; Fogt, S. M. Deoxythymidylate Phosphohydrolase Induced by Bacteriophage PBS2 During Infection of Bacillus subtilis. J. Biol. Chem. 1973, 248 (4), 1372−1380. (61) Price, A. Synthesis of DNA Containing Uracil During Bacteriophage Infection of Bacillus subtilis; Technical Progress Report (8th Year), Nov 1, 1977−October 31, 1978; Department of Biological Chemistry, University of Michigan: Ann Arbor, MI, 1978. (62) Katz, G. E.; Price, A. R.; Pomerantz, M. J. Bacteriophage PBS2-Induced Inhibition of Uracil-Containing DNA Degradation. J. Virol. 1976, 20 (2), 535−538. (63) Maltman, K. L.; Neuhard, J.; Warren, R. A. 5-[(Hydroxymethyl)-O-Pyrophosphoryl]Uracil, an Intermediate in the Biosynthesis of Alpha-Putrescinylthymine in Deoxyribonucleic Acid of Bacteriophage Phi W14. Biochemistry 1981, 20 (12), 3586−3591. (64) Witmer, H. Synthesis of Deoxythymidylate and the Unusual Deoxynucleotide in Mature DNA of Bacillus subtilis Bacteriophage SP10 Occurs by Postreplicational Modification of 5-Hydroxymethyldeoxyuridylate. J. Virol. 1981, 39 (2), 536−547. (65) Casella, E.; Markewych, O.; Dosmar, M.; Witmer, H. Production and Expression of dTMP-Enriched DNA of Bacteriophage SP15. J. Virol. 1978, 28 (3), 753−766. (66) Ehrlich, M.; Ehrlich, K. C. A Novel, Highly Modified, Bacteriophage DNA in Which Thymine Is Partly Replaced by a Phosphoglucuronate Moiety Covalently Bound to 5-(4′,5′-Dihydroxypentyl)Uracil. J. Biol. Chem. 1981, 256 (19), 9966−9972. (67) Kuo, T. T.; Huang, T. C.; Teng, M. H. 
5-Methylcytosine Replacing Cytosine in the Deoxyribonucleic Acid of a Bacteriophage for Xanthomonas oryzae. J. Mol. Biol. 1968, 34 (2), 373−375.

(28) Uhrova, M.; Deyl, Z.; Suchanek, M. Separation of Common Nucleotides, Mono-, Di- and Triphosphates, by Capillary Electrophoresis. J. Chromatogr., Biomed. Appl. 1996, 681 (1), 99−105. (29) Flusberg, B. A.; Webster, D. R.; Lee, J. H.; Travers, K. J.; Olivares, E. C.; Clark, T. A.; Korlach, J.; Turner, S. W. Direct Detection of DNA Methylation During Single-Molecule, Real-Time Sequencing. Nat. Methods 2010, 7 (6), 461−465. (30) Fang, G.; Munera, D.; Friedman, D. I.; Mandlik, A.; Chao, M. C.; Banerjee, O.; Feng, Z.; Losic, B.; Mahajan, M. C.; Jabado, O. J.; et al. Genome-Wide Mapping of Methylated Adenine Residues in Pathogenic Escherichia coli Using Single-Molecule Real-Time Sequencing. Nat. Biotechnol. 2012, 30 (12), 1232−1239. (31) Lee, W. C.; Anton, B. P.; Wang, S.; Baybayan, P.; Singh, S.; Ashby, M.; Chua, E. G.; Tay, C. Y.; Thirriot, F.; Loke, M. F.; et al. The Complete Methylome of Helicobacter pylori UM032. BMC Genomics 2015, 16, 424. (32) Murray, I. A.; Clark, T. A.; Morgan, R. D.; Boitano, M.; Anton, B. P.; Luong, K.; Fomenkov, A.; Turner, S. W.; Korlach, J.; Roberts, R. J. The Methylomes of Six Bacteria. Nucleic Acids Res. 2012, 40 (22), 11450−11462. (33) Pirone-Davies, C.; Hoffmann, M.; Roberts, R. J.; Muruvanda, T.; Timme, R. E.; Strain, E.; Luo, Y.; Payne, J.; Luong, K.; Song, Y.; et al. Genome-Wide Methylation Patterns in Salmonella enterica Subsp. enterica Serovars. PLoS One 2015, 10 (4), e0123639. (34) Malone, T.; Blumenthal, R. M.; Cheng, X. Structure-Guided Analysis Reveals Nine Sequence Motifs Conserved among DNA Amino-Methyltransferases, and Suggests a Catalytic Mechanism for These Enzymes. J. Mol. Biol. 1995, 253 (4), 618−632. (35) Peisajovich, S. G.; Rockah, L.; Tawfik, D. S. Evolution of New Protein Topologies through Multistep Gene Rearrangements. Nat. Genet. 2006, 38 (2), 168−174. (36) Vilkaitis, G.; Lubys, A.; Merkiene, E.; Timinskas, A.; Janulaitis, A.; Klimasauskas, S. 
Circular Permutation of DNA Cytosine-N4-Methyltransferases: In Vivo Coexistence in the BcnI System and in Vitro Probing by Hybrid Formation. Nucleic Acids Res. 2002, 30 (7), 1547−1557. (37) Roberts, R. J. On Base Flipping. Cell 1995, 82 (1), 9−12. (38) Klimasauskas, S.; Kumar, S.; Roberts, R. J.; Cheng, X. HhaI Methyltransferase Flips Its Target Base out of the DNA Helix. Cell 1994, 76 (2), 357−369. (39) Cheng, X.; Roberts, R. J. AdoMet-Dependent Methylation, DNA Methyltransferases and Base Flipping. Nucleic Acids Res. 2001, 29 (18), 3784−3795. (40) Jeltsch, A. Beyond Watson and Crick: DNA Methylation and Molecular Enzymology of DNA Methyltransferases. ChemBioChem 2002, 3 (4), 274−293. (41) Gommers-Ampt, J. H.; Borst, P. Hypermodified Bases in DNA. FASEB J. 1995, 9 (11), 1034−1042. (42) Warren, R. A. Modified Bases in Bacteriophage DNAs. Annu. Rev. Microbiol. 1980, 34, 137−158. (43) Iyer, L. M.; Zhang, D.; Burroughs, A. M.; Aravind, L. Computational Identification of Novel Biochemical Systems Involved in Oxidation, Glycosylation and Other Complex Modifications of Bases in DNA. Nucleic Acids Res. 2013, 41 (16), 7635−7655. (44) Hall, R. H. The Modified Nucleosides in Nucleic Acids, illustrated ed.; Columbia University Press: 1971. (45) Sternberg, N.; Coulby, J. Cleavage of the Bacteriophage P1 Packaging Site (pac) Is Regulated by Adenine Methylation. Proc. Natl. Acad. Sci. U. S. A. 1990, 87 (20), 8070−8074. (46) Scraba, D. G.; Bradley, R. D.; Leyritz-Wills, M.; Warren, R. A. Bacteriophage Phi W-14: The Contribution of Covalently Bound Putrescine to DNA Packing in the Phage Head. Virology 1983, 124 (1), 152−160. (47) Allet, B.; Bukhari, A. I. Analysis of Bacteriophage Mu and Lambda-Mu Hybrid DNAs by Specific Endonucleases. J. Mol. Biol. 1975, 92 (4), 529−540. (48) Kahmann, R.; Kamp, D.; Zipser, D. In DNA Insertion Elements, Plasmids and Episomes; Cold Spring Harbor Laboratory: Cold Spring Harbor, NY, 1977.


(68) Kuo, T. T.; Tu, J. Enzymatic Synthesis of Deoxy-5-MethylCytidylic Acid Replacing Deoxycytidylic Acid in Xanthomonas oryzae Phage Xp12 DNA. Nature 1976, 263 (5578), 615. (69) Vogelsang-Wenke, H.; Oesterhelt, D. Isolation of a Halobacterial Phage with a Fully Cytosine-Methylated Genome. Mol. Gen. Genet. 1988, 211 (3), 407−414. (70) Kuo, T. T.; Chow, T. Y.; Lin, Y. T. A New Thymidylate Biosynthesis in Xanthomonas oryzae Infected by Phage Xp 12. Virology 1982, 118 (2), 293−300. (71) Feng, T. Y.; Tu, J.; Kuo, T. T. Characterization of Deoxycytidylate Methyltransferase in Xanthomonas oryzae Infected with Bacteriophage Xp12. Eur. J. Biochem. 1978, 87 (1), 29−36. (72) Wang, R. Y.; Ehrlich, M. 5-Methyl-dCTP Deaminase Induced by Bacteriophage Xp-12. J. Virol. 1982, 42 (1), 42−48. (73) Huang, L. H.; Farnet, C. M.; Ehrlich, K. C.; Ehrlich, M. Digestion of Highly Modified Bacteriophage DNA by Restriction Endonucleases. Nucleic Acids Res. 1982, 10 (5), 1579−1591. (74) Lamm, N.; Tomaschewski, J.; Ruger, W. Nucleotide Sequence of the Deoxycytidylate Hydroxymethylase Gene of Bacteriophage T4 (g42) and the Homology of Its Gene Product with Thymidylate Synthase of E. coli. Nucleic Acids Res. 1987, 15 (9), 3920. (75) Lamm, N.; Wang, Y.; Mathews, C. K.; Ruger, W. Deoxycytidylate Hydroxymethylase Gene of Bacteriophage T4. Nucleotide Sequence Determination and Over-expression of the Gene. Eur. J. Biochem. 1988, 172 (3), 553−563. (76) Mathews, C. K.; Wheeler, L. J.; Ungermann, C.; Young, J. P.; Ray, N. B. Enzyme Interactions Involving T4 Phage-Coded Thymidylate Synthase and Deoxycytidylate Hydroxymethylase. Adv. Exp. Med. Biol. 1993, 338, 563−570. (77) Desplats, C.; Krisch, H. M. The Diversity and Evolution of the T4Type Bacteriophages. Res. Microbiol. 2003, 154 (4), 259−267. (78) Petrov, V. M.; Ratnayaka, S.; Nolan, J. M.; Miller, E. S.; Karam, J. D. Genomes of the T4-Related Bacteriophages as Windows on Microbial Genome Evolution. Virol. J. 2010, 7, 292. 
(79) Swinton, D.; Hattman, S.; Benzinger, R.; Buchanan-Wollaston, V.; Beringer, J. Replacement of the Deoxycytidine Residues in Rhizobium Bacteriophage RL38JI DNA. FEBS Lett. 1985, 184 (2), 294−298. (80) Wilhelm, K.; Ruger, W. Deoxyuridylate-Hydroxymethylase of Bacteriophage SPO1. Virology 1992, 189 (2), 640−646. (81) Schellenberger, U.; Livi, L. L.; Santi, D. V. Cloning, Expression, Purification, and Characterization of 2′-Deoxyuridylate Hydroxymethylase from Phage SPO1. Protein Expression Purif. 1995, 6 (4), 423−430. (82) Hardy, L. W.; Nalivaika, E. Asn177 in Escherichia coli Thymidylate Synthase Is a Major Determinant of Pyrimidine Specificity. Proc. Natl. Acad. Sci. U. S. A. 1992, 89 (20), 9725−9729. (83) Liu, L.; Santi, D. V. Mutation of Asparagine 229 to Aspartate in Thymidylate Synthase Converts the Enzyme to a Deoxycytidylate Methylase. Biochemistry 1992, 31 (22), 5100−5104. (84) Graves, K. L.; Butler, M. M.; Hardy, L. W. Roles of Cys148 and Asp179 in Catalysis by Deoxycytidylate Hydroxymethylase from Bacteriophage T4 Examined by Site-Directed Mutagenesis. Biochemistry 1992, 31 (42), 10315−10321. (85) Fritz, T. A.; Liu, L.; Finer-Moore, J. S.; Stroud, R. M. Tryptophan 80 and Leucine 143 Are Critical for the Hydride Transfer Step of Thymidylate Synthase by Controlling Active Site Access. Biochemistry 2002, 41 (22), 7021−7029. (86) Song, H. K.; Sohn, S. H.; Suh, S. W. Crystal Structure of Deoxycytidylate Hydroxymethylase from Bacteriophage T4, a Component of the Deoxyribonucleoside Triphosphate-Synthesizing Complex. EMBO J. 1999, 18 (5), 1104−1113. (87) Chen, C.; Gao, T.; Zhao, G.; Deng, Z.; Hu, S.; Xu, H. H.; He, X. Evidence from 18O Feeding Studies for Hydroxyl Group Donor in the Reaction Catalyzed by Cytidylate Hydroxymethylase MilA. Chin. Sci. Bull. 2013, 58 (8), 864−868. (88) Katoh, K.; Misawa, K.; Kuma, K.; Miyata, T. MAFFT: A Novel Method for Rapid Multiple Sequence Alignment Based on Fast Fourier Transform. Nucleic Acids Res. 
2002, 30 (14), 3059−3066.

(89) Kearse, M.; Moir, R.; Wilson, A.; Stones-Havas, S.; Cheung, M.; Sturrock, S.; Buxton, S.; Cooper, A.; Markowitz, S.; Duran, C.; et al. Geneious Basic: An Integrated and Extendable Desktop Software Platform for the Organization and Analysis of Sequence Data. Bioinformatics 2012, 28 (12), 1647−1649. (90) Guindon, S.; Gascuel, O. A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Syst. Biol. 2003, 52 (5), 696−704. (91) Nolan, J. M.; Petrov, V.; Bertrand, C.; Krisch, H. M.; Karam, J. D. Genetic Diversity among Five T4-Like Bacteriophages. Virol. J. 2006, 3, 30. (92) Simons, M.; Diffin, F. M.; Szczelkun, M. D. ClpXP Protease Targets Long-Lived DNA Translocation States of a Helicase-Like Motor to Cause Restriction Alleviation. Nucleic Acids Res. 2014, 42 (19), 12082−12091. (93) Xu, S. Y.; Nugent, R. L.; Kasamkattil, J.; Fomenkov, A.; Gupta, Y.; Aggarwal, A.; Wang, X.; Li, Z.; Zheng, Y.; Morgan, R. Characterization of Type II and III Restriction-Modification Systems from Bacillus cereus Strains ATCC 10987 and ATCC 14579. J. Bacteriol. 2012, 194 (1), 49− 60. (94) Butterer, A.; Pernstich, C.; Smith, R. M.; Sobott, F.; Szczelkun, M. D.; Toth, J. Type III Restriction Endonucleases Are Heterotrimeric: Comprising One Helicase-Nuclease Subunit and a Dimeric Methyltransferase That Binds Only One Specific DNA. Nucleic Acids Res. 2014, 42 (8), 5139−5150. (95) Pingoud, A.; Fuxreiter, M.; Pingoud, V.; Wende, W. Type II Restriction Endonucleases: Structure and Mechanism. Cell. Mol. Life Sci. 2005, 62 (6), 685−707. (96) Pingoud, A.; Jeltsch, A. Structure and Function of Type II Restriction Endonucleases. Nucleic Acids Res. 2001, 29 (18), 3705− 3727. (97) Pingoud, A.; Wilson, G. G.; Wende, W. Type II Restriction Endonucleases–a Historical Perspective and More. Nucleic Acids Res. 2014, 42 (12), 7489−7527. (98) Roberts, R. J.; Halford, S. E. In Nucleases, 2nd ed.; Linn, S. M., Lloyd, R. S., Roberts, R. 
J., Eds.; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1993. (99) Loenen, W. A.; Dryden, D. T.; Raleigh, E. A.; Wilson, G. G. Type I Restriction Enzymes and Their Relatives. Nucleic Acids Res. 2014, 42 (1), 20−44. (100) Bourniquel, A. A.; Bickle, T. A. Complex Restriction Enzymes: NTP-Driven Molecular Motors. Biochimie 2002, 84 (11), 1047−1059. (101) Dryden, D. T.; Murray, N. E.; Rao, D. N. Nucleoside Triphosphate-Dependent Restriction Enzymes. Nucleic Acids Res. 2001, 29 (18), 3728−3741. (102) Murray, N. E. Type I Restriction Systems: Sophisticated Molecular Machines (a Legacy of Bertani and Weigle). Microbiol. Mol. Biol. Rev. 2000, 64 (2), 412−434. (103) Bickle, T. A. In Nucleases, 2nd ed.; Linn, S. M., Lloyd, R. S., Roberts, R. J., Eds.; Cold Spring Harbor Monograph Archive: Cold Spring Harbor Laboratory Press: Cold Spring Harbor, NY, 1993. (104) Loenen, W. A.; Raleigh, E. A. The Other Face of Restriction: Modification-Dependent Enzymes. Nucleic Acids Res. 2014, 42 (1), 56−69. (105) Carlson, K.; Raleigh, E. A.; Hattman, S. In Molecular Biology of Bacteriophage T4; Karam, J. D., Drake, J. W., Kreuzer, K. N., Mosig, G., Hall, D. H., Eiserling, F. A., Black, L. W., Spicer, E. K., Kutter, E., Carlson, K., et al., Eds.; American Society for Microbiology: Washington, DC, 1994. (106) Raleigh, E. A. Organization and Function of the mcrBC Genes of Escherichia coli K-12. Mol. Microbiol. 1992, 6 (9), 1079−1086. (107) Lepikhov, K.; Tchernov, A.; Zheleznaja, L.; Matvienko, N.; Walter, J.; Trautner, T. A. Characterization of the Type IV Restriction Modification System BspLU11III from Bacillus sp. LU11. Nucleic Acids Res. 2001, 29 (22), 4691−4698. (108) Janulaitis, A.; Petrusyte, M.; Maneliene, Z.; Klimasauskas, S.; Butkus, V. Purification and Properties of the Eco57I Restriction Endonuclease and Methylase - Prototypes of a New Class (Type IV). Nucleic Acids Res. 1992, 20 (22), 6043−6049.


(131) Carlson, K.; Lagerback, P.; Nystrom, A. C. Bacteriophage T4 Endonuclease II: Concerted Single-Strand Nicks Yield Double-Strand Cleavage. Mol. Microbiol. 2004, 52 (5), 1403−1411. (132) Webster, R. G.; Granoff, A. Encyclopedia of Virology; Academic Press: 1994. (133) Tran, N. Q.; Lee, S. J.; Richardson, C. C.; Tabor, S. A Novel Nucleotide Kinase Encoded by Gene 1.7 of Bacteriophage T7. Mol. Microbiol. 2010, 77 (2), 492−504. (134) Hughes, S. G.; Hattman, S. The Sensitivity of Bacteriophage Lambda DNA to Restriction Endonuclease RII. J. Mol. Biol. 1975, 98 (3), 645−647. (135) Brussow, H.; Canchaya, C.; Hardt, W. D. Phages and the Evolution of Bacterial Pathogens: From Genomic Rearrangements to Lysogenic Conversion. Microbiol Mol. Biol. Rev. 2004, 68 (3), 560−602. (136) Penades, J. R.; Chen, J.; Quiles-Puchalt, N.; Carpena, N.; Novick, R. P. Bacteriophage-Mediated Spread of Bacterial Virulence Genes. Curr. Opin. Microbiol. 2015, 23, 171−178. (137) Dempsey, R. M.; Carroll, D.; Kong, H.; Higgins, L.; Keane, C. T.; Coleman, D. C. Sau42i, a Bcgi-Like Restriction-Modification System Encoded by the Staphylococcus aureus Quadruple-Converting Phage Phi42. Microbiology 2005, 151 (4), 1301−1311. (138) Pullinger, G. D.; Bevir, T.; Lax, A. J. The Pasteurella Multocida Toxin Is Encoded within a Lysogenic Bacteriophage. Mol. Microbiol. 2004, 51 (1), 255−269. (139) Iida, S.; Meyer, J.; Bachi, B.; Stalhammar-Carlemalm, M.; Schrickel, S.; Bickle, T. A.; Arber, W. DNA Restriction–Modification Genes of Phage P1 and Plasmid P15b. Structure and in Vitro Transcription. J. Mol. Biol. 1983, 165 (1), 1−18. (140) Arber, W.; Morse, M. L. Host Specificity of DNA Produced by Escherichia coli. VI. Effects on Bacterial Conjugation. Genetics 1965, 51, 137−148. (141) Ershova, A. S.; Karyagina, A. S.; Vasiliev, M. O.; Lyashchuk, A. M.; Lunin, V. G.; Spirin, S. A.; Alexeevski, A. V. Solitary Restriction Endonucleases in Prokaryotic Genomes. Nucleic Acids Res. 2012, 40 (20), 10107−10115. 
(142) de la Campa, A. G.; Springhorn, S. S.; Kale, P.; Lacks, S. A. Proteins Encoded by the DpnI Restriction Gene Cassette. Hyperproduction and Characterization of the DpnI Endonuclease. J. Biol. Chem. 1988, 263 (29), 14696−14702. (143) Cerritelli, S.; Springhorn, S. S.; Lacks, S. A. DpnA, a Methylase for Single-Strand DNA in the DpnII Restriction System, and Its Biological Function. Proc. Natl. Acad. Sci. U. S. A. 1989, 86 (23), 9223−9227. (144) Lacks, S. A.; Mannarelli, B. M.; Springhorn, S. S.; Greenberg, B. Genetic Basis of the Complementary DpnI and DpnII Restriction Systems of S. pneumoniae: An Intercellular Cassette Mechanism. Cell 1986, 46 (7), 993−1000. (145) Johnston, C.; Polard, P.; Claverys, J. P. The DpnI/DpnII Pneumococcal System, Defense against Foreign Attack without Compromising Genetic Exchange. Mob. Genet. Elements 2013, 3 (4), e25582. (146) Johnston, C.; Martin, B.; Granadel, C.; Polard, P.; Claverys, J. P. Programmed Protection of Foreign DNA from Restriction Allows Pathogenicity Island Exchange During Pneumococcal Transformation. PLoS Pathog. 2013, 9 (2), e1003178. (147) Eutsey, R. A.; Powell, E.; Dordel, J.; Salter, S. J.; Clark, T. A.; Korlach, J.; Ehrlich, G. D.; Hiller, N. L. Genetic Stabilization of the Drug-Resistant PMEN1 Pneumococcus Lineage by Its Distinctive DpnIII Restriction-Modification System. mBio 2015, 6 (3), e00173-15. (148) Seib, K. L.; Jen, F. E.; Tan, A.; Scott, A. L.; Kumar, R.; Power, P. M.; Chen, L. T.; Wu, H. J.; Wang, A. H.; Hill, D. M.; et al. Specificity of the ModA11, ModA12 and ModD1 Epigenetic Regulator N(6)-Adenine DNA Methyltransferases of Neisseria meningitidis. Nucleic Acids Res. 2015, 43 (8), 4150−4162. (149) Fox, K. L.; Dowideit, S. J.; Erwin, A. L.; Srikhanta, Y. N.; Smith, A. L.; Jennings, M. P. Haemophilus influenzae Phasevarions Have Evolved from Type III DNA Restriction Systems into Epigenetic Regulators of Gene Expression. Nucleic Acids Res. 2007, 35 (15), 5242−5252.

(109) Roberts, J. R.; Belfort, M.; Bestor, T.; Bhagwat, A. S.; Bickle, T. A.; Bitinaite, J.; Blumenthal, R. M.; Degtyarev, S. K.; Dryden, D. T. F.; Dybvig, K.; et al. A Nomenclature for Restriction Enzymes, DNA Methyltransferases, Homing Endonucleases and Their Genes. Nucleic Acids Res. 2003, 31 (7), 1805−1812. (110) Roberts, R. J.; Vincze, T.; Posfai, J.; Macelis, D. REBASE - a Database for DNA Restriction and Modification: Enzymes, Genes and Genomes. Nucleic Acids Res. 2015, 43 (D1), D298−D299. (111) Korlach, J.; Turner, S. W. Going Beyond Five Bases in DNA Sequencing. Curr. Opin. Struct. Biol. 2012, 22 (3), 251−261. (112) Eid, J.; Fehr, A.; Gray, J.; Luong, K.; Lyle, J.; Otto, G.; Peluso, P.; Rank, D.; Baybayan, P.; Bettman, B.; et al. Real-Time DNA Sequencing from Single Polymerase Molecules. Science 2009, 323 (5910), 133−138. (113) Vasu, K.; Nagaraja, V. Diverse Functions of RestrictionModification Systems in Addition to Cellular Defense. Microbiol Mol. Biol. Rev. 2013, 77 (1), 53−72. (114) Loenen, W. A.; Dryden, D. T.; Raleigh, E. A.; Wilson, G. G.; Murray, N. E. Highlights of the DNA Cutters: A Short History of the Restriction Enzymes. Nucleic Acids Res. 2014, 42 (1), 3−19. (115) Furuta, Y.; Kobayashi, I. In Bacterial Integrative Mobile Genetic Elements; Roberts, A. P., Mullany, P., Eds.; Landes Bioscience: 2012. (116) Barcus, V. A.; Murray, N. E. Barriers to Recombination: Restriction. Soc. Gen. Microbiol. Symp. 1995, 52, 31−58. (117) Trautner, T. A.; Noyer-Weidner, M. In DNA Methylation: Molecular Biology and Biological Significance; Jost, J. P., Saluz, H., Eds.; Birkhäuser Verlag: Basel, 1993. (118) Orlowski, J.; Bujnicki, J. M. Structural and Evolutionary Classification of Type II Restriction Enzymes Based on Theoretical and Experimental Analyses. Nucleic Acids Res. 2008, 36 (11), 3552− 3569. (119) Fukuyo, M.; Nakano, T.; Zhang, Y.; Furuta, Y.; Ishikawa, K.; Watanabe-Matsui, M.; Yano, H.; Hamakawa, T.; Ide, H.; Kobayashi, I. 
Restriction-Modification System with Methyl-Inhibited Base Excision and Abasic-Site Cleavage Activities. Nucleic Acids Res. 2015, 43 (5), 2841−2852. (120) Murphy, J.; Mahony, J.; Ainsworth, S.; Nauta, A.; van Sinderen, D. Bacteriophage Orphan DNA Methyltransferases: Insights from Their Bacterial Origin, Function, and Occurrence. Appl. Environ. Microbiol. 2013, 79 (24), 7547−7555. (121) Seshasayee, A. S.; Singh, P.; Krishna, S. Context-Dependent Conservation of DNA Methyltransferases in Bacteria. Nucleic Acids Res. 2012, 40 (15), 7066−7073. (122) Furuta, Y.; Abe, K.; Kobayashi, I. Genome Comparison and Context Analysis Reveals Putative Mobile Forms of Restriction-Modification Systems and Related Rearrangements. Nucleic Acids Res. 2010, 38 (7), 2428−2443. (123) Bheemanaik, S.; Reddy, Y. V.; Rao, D. N. Structure, Function and Mechanism of Exocyclic DNA Methyltransferases. Biochem. J. 2006, 399 (2), 177−190. (124) Gong, W.; O’Gara, M.; Blumenthal, R. M.; Cheng, X. Structure of PvuII DNA-(Cytosine N4) Methyltransferase, an Example of Domain Permutation and Protein Fold Assignment. Nucleic Acids Res. 1997, 25 (14), 2702−2715. (125) Seed, K. D. Battling Phages: How Bacteria Defend against Viral Attack. PLoS Pathog. 2015, 11 (6), e1004847. (126) Stern, A.; Sorek, R. The Phage-Host Arms Race: Shaping the Evolution of Microbes. BioEssays 2011, 33 (1), 43−51. (127) Bickle, T. A.; Kruger, D. H. Biology of DNA Restriction. Microbiol. Rev. 1993, 57 (2), 434−450. (128) Krüger, D. H.; Bickle, T. A. Bacteriophage Survival: Multiple Mechanisms for Avoiding the Deoxyribonucleic Acid Restriction Systems of Their Hosts. Microbiol. Rev. 1983, 47 (3), 345−360. (129) Andersson, C. E.; Lagerback, P.; Carlson, K. Structure of Bacteriophage T4 Endonuclease II Mutant E118A, a Tetrameric GIY-YIG Enzyme. J. Mol. Biol. 2010, 397 (4), 1003−1016. (130) Lagerback, P.; Carlson, K. Amino Acid Residues in the GIY-YIG Endonuclease II of Phage T4 Affecting Sequence Recognition and Binding as Well as Catalysis. J. Bacteriol. 2008, 190 (16), 5533−5544.


(150) Srikhanta, Y. N.; Maguire, T. L.; Stacey, K. J.; Grimmond, S. M.; Jennings, M. P. The Phasevarion: A Genetic System Controlling Coordinated, Random Switching of Expression of Multiple Genes. Proc. Natl. Acad. Sci. U. S. A. 2005, 102 (15), 5547−5551. (151) Marinus, M. G.; Casadesus, J. Roles of DNA Adenine Methylation in Host-Pathogen Interactions: Mismatch Repair, Transcriptional Regulation, and More. FEMS Microbiol. Rev. 2009, 33 (3), 488−503. (152) Murphy, J.; Klumpp, J.; Mahony, J.; O’Connell-Motherway, M.; Nauta, A.; van Sinderen, D. Methyltransferases Acquired by Lactococcal 936-Type Phage Provide Protection against Restriction Endonuclease Activity. BMC Genomics 2014, 15, 831. (153) Forde, B. M.; Phan, M. D.; Gawthorne, J. A.; Ashcroft, M. M.; Stanton-Cook, M.; Sarkar, S.; Peters, K. M.; Chan, K. G.; Chong, T. M.; Yin, W. F.; et al. Lineage-Specific Methyltransferases Define the Methylome of the Globally Disseminated Escherichia coli ST131 Clone. mBio 2015, 6 (6), e01602-15. (154) Hoskisson, P. A.; Sumby, P.; Smith, M. C. The Phage Growth Limitation System in Streptomyces coelicolor a(3)2 Is a Toxin/Antitoxin System, Comprising Enzymes with DNA Methyltransferase, Protein Kinase and ATPase Activity. Virology 2015, 477, 100−109. (155) Sumby, P.; Smith, M. C. Genetics of the Phage Growth Limitation (Pgl) System of Streptomyces coelicolor A3(2). Mol. Microbiol. 2002, 44 (2), 489−500. (156) Goldfarb, T.; Sberro, H.; Weinstock, E.; Cohen, O.; Doron, S.; Charpak-Amikam, Y.; Afik, S.; Ofir, G.; Sorek, R. BREX Is a Novel Phage Resistance System Widespread in Microbial Genomes. EMBO J. 2015, 34 (2), 169−183. (157) Blow, M. J.; Clark, T. A.; Daum, C. G.; Deutschbauer, A. M.; Fomenkov, A.; Fries, R.; Froula, J.; Kang, D. D.; Malmstrom, R. R.; Morgan, R. D.; et al. The Epigenomic Landscape of Prokaryotes. PLoS Genet. 2016, 12 (2), e1005854. (158) Løbner-Olesen, A.; Skovgaard, O.; Marinus, M. G. Dam Methylation: Coordinating Cellular Processes. Curr. Opin. Microbiol. 
2005, 8 (2), 154−160. (159) Casadesus, J.; Low, D. Epigenetic Gene Regulation in the Bacterial World. Microbiol Mol. Biol. Rev. 2006, 70 (3), 830−856. (160) Kahmann, R.; Hattman, S. In Phage Mu; Symonds, N., Toussaint, A., van de Putte, P., Howe, M. M., Eds.; Cold Spring Harbor Laboratory: Cold Spring Harbor, NY, 1987. (161) Hattman, S.; Sun, W. Escherichia coli OxyR Modulation of Bacteriophage Mu mom Expression in dam+ Cells Can Be Attributed to Its Ability to Bind Hemimethylated Pmom Promoter DNA. Nucleic Acids Res. 1997, 25 (21), 4385−4388. (162) Hattman, S.; Malygin, E. G. Bacteriophage T2dam and T4dam DNA-[N6-Adenine]-Methyltransferases. Prog. Nucleic Acid Res. Mol. Biol. 2004, 77, 67−126. (163) Militello, K. T.; Simon, R. D.; Qureshi, M.; Maines, R.; Van Horne, M. L.; Hennick, S. M.; Jayakar, S. K.; Pounder, S. Conservation of Dcm-Mediated Cytosine DNA Methylation in Escherichia coli. FEMS Microbiol. Lett. 2012, 328 (1), 78−85. (164) Lieb, M.; Bhagwat, A. S. Very Short Patch Repair: Reducing the Cost of Cytosine Methylation. Mol. Microbiol. 1996, 20 (3), 467−473. (165) Marinus, M. G.; Lobner-Olesen, A. DNA Methylation. EcoSal Plus 2014, 6, 1. (166) Broadbent, S. E.; Balbontin, R.; Casadesus, J.; Marinus, M. G.; van der Woude, M. YhdJ, a Nonessential CcrM-Like DNA Methyltransferase of Escherichia coli and Salmonella enterica. J. Bacteriol. 2007, 189 (11), 4325−4327. (167) Reisenauer, A.; Shapiro, L. DNA Methylation Affects the Cell Cycle Transcription of the CtrA Global Regulator in Caulobacter. EMBO J. 2002, 21 (18), 4969−4977. (168) Collier, J.; McAdams, H. H.; Shapiro, L. A DNA Methylation Ratchet Governs Progression through a Bacterial Cell Cycle. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (43), 17111−17116. (169) Gonzalez, D.; Kozdon, J. B.; McAdams, H. H.; Shapiro, L.; Collier, J. The Functions of DNA Methylation by CcrM in Caulobacter crescentus: A Global Approach. Nucleic Acids Res. 2014, 42 (6), 3720− 3735.

(170) Matveyev, A. V.; Young, K. T.; Meng, A.; Elhai, J. DNA Methyltransferases of the Cyanobacterium Anabaena PCC 7120. Nucleic Acids Res. 2001, 29 (7), 1491−1506. (171) Stucken, K.; Koch, R.; Dagan, T. Cyanobacterial Defense Mechanisms against Foreign DNA Transfer and Their Impact on Genetic Engineering. Biol. Res. 2013, 46 (4), 373−382. (172) Barcus, V. A.; Titheradge, A. J. B.; Murray, N. E. The Diversity of Alleles at the hsd Locus in Natural Populations of Escherichia coli. Genetics 1995, 140 (4), 1187−1197. (173) Jeltsch, A.; Pingoud, A. Horizontal Gene Transfer Contributes to the Wide Distribution and Evolution of Type II RestrictionModification Systems. J. Mol. Evol. 1996, 42 (2), 91−96. (174) Brooks, J. E.; Raleigh, E. A. In Bacterial Genomes: Structure and Analysis; de Bruijn, F. J., Lupski, J. R., Weinstock, G., Eds.; Chapman and Hall: New York, 1998. (175) Lin, L. F.; Posfai, J.; Roberts, R. J.; Kong, H. Comparative Genomics of the Restriction-Modification Systems in Helicobacter Pylori. Proc. Natl. Acad. Sci. U. S. A. 2001, 98 (5), 2740−2745. (176) Kong, H.; Lin, L. F.; Porter, N.; Stickel, S.; Byrd, D.; Posfai, J.; Roberts, R. J. Functional Analysis of Putative Restriction-Modification System Genes in the Helicobacter pylori J99 Genome. Nucleic Acids Res. 2000, 28 (17), 3216−3223. (177) Krebes, J.; Morgan, R. D.; Bunk, B.; Sproer, C.; Luong, K.; Parusel, R.; Anton, B. P.; Konig, C.; Josenhans, C.; Overmann, J.; et al. The Complex Methylome of the Human Gastric Pathogen Helicobacter pylori. Nucleic Acids Res. 2014, 42 (4), 2415−2432. (178) Zhang, X. S.; Blaser, M. J. Natural Transformation of an Engineered Helicobacter pylori Strain Deficient in Type II Restriction Endonucleases. J. Bacteriol. 2012, 194 (13), 3407−3416. (179) Srikhanta, Y. N.; Fox, K. L.; Jennings, M. P. The Phasevarion: Phase Variation of Type III DNA Methyltransferases Controls Coordinated Switching in Multiple Genes. Nat. Rev. Microbiol. 2010, 8 (3), 196−206. 
(180) Tan, A.; Hill, D. M.; Harrison, O. B.; Srikhanta, Y. N.; Jennings, M. P.; Maiden, M. C.; Seib, K. L. Distribution of the Type II DNA Methyltransferases ModA, ModB and ModD among Neisseria meningitidis Genotypes: Implications for Gene Regulation and Virulence. Sci. Rep. 2016, 6, 21015. (181) Wilke, K.; Rauhut, E.; Noyer-Weidner, M.; Lauster, R.; Pawlek, B.; Behrens, B.; Trautner, T. A. Sequential Order of Target-Recognizing Domains in Multispecific DNA-Methyltransferases. EMBO J. 1988, 7 (8), 2601−2609. (182) Sethmann, S.; Ceglowski, P.; Willert, J.; Iwanicka-Nowicka, R.; Trautner, T. A.; Walter, J. M.(Phi)BssHII, a Novel Cytosine-C5-DNAMethyltransferase with Target-Recognizing Domains at Separated Locations of the Enzyme. EMBO J. 1999, 18 (12), 3502−3508. (183) Gunthert, U.; Pawlek, B.; Stutz, J.; Trautner, T. A. Restriction and Modification in Bacillus subtilis: Inducibility of a DNA Methylating Activity in Nonmodifying Cells. J. Virol. 1976, 20 (1), 188−195. (184) Iyer, L. M.; Tahiliani, M.; Rao, A.; Aravind, L. Prediction of Novel Families of Enzymes Involved in Oxidative and Other Complex Modifications of Bases in Nucleic Acids. Cell Cycle 2009, 8 (11), 1698− 1710. (185) Hattman, S. Specificity of the Bacteriophage Mu Mom +-Controlled DNA Modification. J. Virol. 1980, 34 (1), 277−279. (186) Hattman, S. Unusual Transcriptional and Translational Regulation of the Bacteriophage Mu mom Operon. Pharmacol. Ther. 1999, 84 (3), 367−388. (187) Drozdz, M.; Piekarowicz, A.; Bujnicki, J. M.; Radlinska, M. Novel Non-Specific DNA Adenine Methyltransferases. Nucleic Acids Res. 2012, 40 (5), 2119−2130. (188) Yamaichi, Y.; Chao, M. C.; Sasabe, J.; Clark, L.; Davis, B. M.; Yamamoto, N.; Mori, H.; Kurokawa, K.; Waldor, M. K. High-Resolution Genetic Analysis of the Requirements for Horizontal Transmission of the ESBL Plasmid from Escherichia coli O104:H4. Nucleic Acids Res. 2015, 43 (1), 348−360. (189) Rohs, R.; Jin, X.; West, S. M.; Joshi, R.; Honig, B.; Mann, R. S. 
Origins of Specificity in Protein-DNA Recognition. Annu. Rev. Biochem. 2010, 79, 233−269.

DOI: 10.1021/acs.chemrev.6b00114 Chem. Rev. 2016, 116, 12655−12687

Chemical Reviews

Review

(190) Roberts, R. J.; Cheng, X. Base Flipping. Annu. Rev. Biochem. 1998, 67, 181−198. (191) Horton, J. R.; Zhang, X.; Maunus, R.; Yang, Z.; Wilson, G. G.; Roberts, R. J.; Cheng, X. DNA Nicking by HinP1I Endonuclease: Bending, Base Flipping and Minor Groove Expansion. Nucleic Acids Res. 2006, 34 (3), 939−948. (192) Miyazono, K.; Furuta, Y.; Watanabe-Matsui, M.; Miyakawa, T.; Ito, T.; Kobayashi, I.; Tanokura, M. A Sequence-Specific DNA Glycosylase Mediates Restriction-Modification in Pyrococcus abyssi. Nat. Commun. 2014, 5, 3178. (193) Sukackaite, R.; Grazulis, S.; Tamulaitis, G.; Siksnys, V. The Recognition Domain of the Methyl-Specific Endonuclease McrBC Flips out 5-Methylcytosine. Nucleic Acids Res. 2012, 40 (15), 7552−7562. (194) Pieper, U.; Pingoud, A. A Mutational Analysis of the PD···D/EXK Motif Suggests That McrC Harbors the Catalytic Center for DNA Cleavage by the GTP-Dependent Restriction Enzyme McrBC from Escherichia coli. Biochemistry 2002, 41 (16), 5236−5244. (195) Pieper, U.; Groll, D. H.; Wunsch, S.; Gast, F. U.; Speck, C.; Mucke, N.; Pingoud, A. The GTP-Dependent Restriction Enzyme McrBC from Escherichia coli Forms High-Molecular Mass Complexes with DNA and Produces a Cleavage Pattern with a Characteristic 10-Base Pair Repeat. Biochemistry 2002, 41 (16), 5245−5254. (196) Panne, D.; Muller, S. A.; Wirtz, S.; Engel, A.; Bickle, T. A. The McrBC Restriction Endonuclease Assembles into a Ring Structure in the Presence of G Nucleotides. EMBO J. 2001, 20 (12), 3210−3217. (197) Stewart, F. J.; Panne, D.; Bickle, T. A.; Raleigh, E. A. Methyl-Specific DNA Binding by McrBC, a Modification-Dependent Restriction Enzyme. J. Mol. Biol. 2000, 298 (4), 611−622. (198) Stewart, F. J.; Raleigh, E. A. Dependence of McrBC Cleavage on Distance between Recognition Elements. Biol. Chem. 1998, 379 (4−5), 611−616. (199) Pieper, U.; Brinkmann, T.; Kruger, T.; Noyer-Weidner, M.; Pingoud, A. Characterization of the Interaction between the Restriction Endonuclease McrBC from E.
coli and Its Cofactor GTP. J. Mol. Biol. 1997, 272 (2), 190−199. (200) Gast, F. U.; Brinkmann, T.; Pieper, U.; Kruger, T.; Noyer-Weidner, M.; Pingoud, A. The Recognition of Methylated DNA by the GTP-Dependent Restriction Endonuclease McrBC Resides in the N-Terminal Domain of McrB. Biol. Chem. 1997, 378 (9), 975−982. (201) Sutherland, E.; Coe, L.; Raleigh, E. A. McrBC: A Multisubunit GTP-Dependent Restriction Endonuclease. J. Mol. Biol. 1992, 225 (2), 327−348. (202) Horton, J. R.; Wang, H.; Mabuchi, M. Y.; Zhang, X.; Roberts, R. J.; Zheng, Y.; Wilson, G. G.; Cheng, X. Modification-Dependent Restriction Endonuclease, MspJI, Flips 5-Methylcytosine out of the DNA Helix. Nucleic Acids Res. 2014, 42 (19), 12092−12101. (203) Horton, J. R.; Nugent, R. L.; Li, A.; Mabuchi, M. Y.; Fomenkov, A.; Cohen-Karni, D.; Griggs, R. M.; Zhang, X.; Wilson, G. G.; Zheng, Y.; et al. Structure and Mutagenesis of the DNA Modification-Dependent Restriction Endonuclease AspBHI. Sci. Rep. 2014, 4, 4246. (204) Horton, J. R.; Mabuchi, M. Y.; Cohen-Karni, D.; Zhang, X.; Griggs, R. M.; Samaranayake, M.; Roberts, R. J.; Zheng, Y.; Cheng, X. Structure and Cleavage Activity of the Tetrameric MspJI DNA Modification-Dependent Restriction Endonuclease. Nucleic Acids Res. 2012, 40 (19), 9763−9773. (205) Cohen-Karni, D.; Xu, D.; Apone, L.; Fomenkov, A.; Sun, Z.; Davis, P. J.; Kinney, S. R.; Yamada-Mabuchi, M.; Xu, S. Y.; Davis, T.; et al. The MspJI Family of Modification-Dependent Restriction Endonucleases for Epigenetic Studies. Proc. Natl. Acad. Sci. U. S. A. 2011, 108 (27), 11040−11045. (206) Zheng, Y.; Cohen-Karni, D.; Xu, D.; Chin, H. G.; Wilson, G.; Pradhan, S.; Roberts, R. J. A Unique Family of Mrr-Like Modification-Dependent Restriction Endonucleases. Nucleic Acids Res. 2010, 38 (16), 5527−5534. (207) Bujnicki, J. M.; Rychlewski, L. Grouping Together Highly Diverged PD-(D/E)XK Nucleases and Identification of Novel Superfamily Members Using Structure-Guided Alignment of Sequence Profiles. J. Mol.
Microbiol. Biotechnol. 2001, 3 (1), 69−72.

(208) Sasnauskas, G.; Zagorskaite, E.; Kauneckaite, K.; Tamulaitiene, G.; Siksnys, V. Structure-Guided Sequence Specificity Engineering of the Modification-Dependent Restriction Endonuclease LpnPI. Nucleic Acids Res. 2015, 43 (12), 6144−6155. (209) Sasnauskas, G.; Kostiuk, G.; Tamulaitis, G.; Siksnys, V. Target Site Cleavage by the Monomeric Restriction Enzyme BcnI Requires Translocation to a Random DNA Sequence and a Switch in Enzyme Orientation. Nucleic Acids Res. 2011, 39 (20), 8844−8856. (210) Janosi, L.; Yonemitsu, H.; Hong, H.; Kaji, A. Molecular Cloning and Expression of a Novel Hydroxymethylcytosine-Specific Restriction Enzyme (PvuRts1I) Modulated by Glucosylation of DNA. J. Mol. Biol. 1994, 242 (1), 45−61. (211) Wang, H.; Guan, S.; Quimby, A.; Cohen-Karni, D.; Pradhan, S.; Wilson, G.; Roberts, R. J.; Zhu, Z.; Zheng, Y. Comparative Characterization of the PvuRts1I Family of Restriction Enzymes and Their Application in Mapping Genomic 5-Hydroxymethylcytosine. Nucleic Acids Res. 2011, 39 (21), 9294−9305. (212) Szwagierczak, A.; Brachmann, A.; Schmidt, C. S.; Bultmann, S.; Leonhardt, H.; Spada, F. Characterization of PvuRts1I Endonuclease as a Tool to Investigate Genomic 5-Hydroxymethylcytosine. Nucleic Acids Res. 2011, 39 (12), 5149−5156. (213) Borgaro, J. G.; Zhu, Z. Characterization of the 5-Hydroxymethylcytosine-Specific DNA Restriction Endonucleases. Nucleic Acids Res. 2013, 41 (7), 4198−4206. (214) Shao, C.; Wang, C.; Zang, J. Structural Basis for the Substrate Selectivity of PvuRts1I, a 5-Hydroxymethylcytosine DNA Restriction Endonuclease. Acta Crystallogr., Sect. D: Biol. Crystallogr. 2014, 70 (9), 2477−2486. (215) Kazrani, A. A.; Kowalska, M.; Czapinska, H.; Bochtler, M. Crystal Structure of the 5hmC Specific Endonuclease PvuRts1I. Nucleic Acids Res. 2014, 42 (9), 5929−5936. (216) Horton, J. R.; Borgaro, J. G.; Griggs, R. M.; Quimby, A.; Guan, S.; Zhang, X.; Wilson, G. G.; Zheng, Y.; Zhu, Z.; Cheng, X.
Structure of 5-Hydroxymethylcytosine-Specific Restriction Enzyme, AbaSI, in Complex with DNA. Nucleic Acids Res. 2014, 42 (12), 7947−7959. (217) Han, T.; Yamada-Mabuchi, M.; Zhao, G.; Li, L.; Liu, G.; Ou, H. Y.; Deng, Z.; Zheng, Y.; He, X. Recognition and Cleavage of 5-Methylcytosine DNA by Bacterial SRA-HNH Proteins. Nucleic Acids Res. 2015, 43 (2), 1147−1159. (218) Heitman, J.; Model, P. Site-Specific Methylases Induce the SOS DNA Repair Response in Escherichia coli. J. Bacteriol. 1987, 169 (7), 3243−3250. (219) Waite-Rees, P. A.; Keating, C. J.; Moran, L. S.; Slatko, B. E.; Hornstra, L. J.; Benner, J. S. Characterization and Expression of the Escherichia coli Mrr Restriction System. J. Bacteriol. 1991, 173 (16), 5207−5219. (220) Kelleher, J. E.; Raleigh, E. A. A Novel Activity in Escherichia coli K-12 That Directs Restriction of DNA Modified at CG Dinucleotides. J. Bacteriol. 1991, 173 (16), 5220−5223. (221) Tesfazgi Mebrhatu, M.; Wywial, E.; Ghosh, A.; Michiels, C. W.; Lindner, A. B.; Taddei, F.; Bujnicki, J. M.; Van Melderen, L.; Aertsen, A. Evidence for an Evolutionary Antagonism between Mrr and Type III Modification Systems. Nucleic Acids Res. 2011, 39 (14), 5991−6001. (222) Bujnicki, J. M.; Rychlewski, L. Identification of a PD-(D/E)XK-Like Domain with a Novel Configuration of the Endonuclease Active Site in the Methyl-Directed Restriction Enzyme Mrr and Its Homologs. Gene 2001, 267 (2), 183−191. (223) Orlowski, J.; Mebrhatu, M. T.; Michiels, C. W.; Bujnicki, J. M.; Aertsen, A. Mutational Analysis and a Structural Model of Methyl-Directed Restriction Enzyme Mrr. Biochem. Biophys. Res. Commun. 2008, 377 (3), 862−866. (224) Mulligan, E. A.; Hatchwell, E.; McCorkle, S. R.; Dunn, J. J. Differential Binding of Escherichia coli McrA Protein to DNA Sequences That Contain the Dinucleotide m5CpG. Nucleic Acids Res. 2010, 38 (6), 1997−2005. (225) Mulligan, E. A.; Dunn, J. J. Cloning, Purification and Initial Characterization of E.
coli McrA, a Putative 5-Methylcytosine-Specific Nuclease. Protein Expression Purif. 2008, 62 (1), 98−103.


(226) Anton, B. P.; Raleigh, E. A. Transposon-Mediated Linker Insertion Scanning Mutagenesis of the Escherichia coli McrA Endonuclease. J. Bacteriol. 2004, 186 (17), 5699−5707. (227) Bujnicki, J. M.; Radlinska, M.; Rychlewski, L. Atomic Model of the 5-Methylcytosine-Specific Restriction Enzyme McrA Reveals an Atypical Zinc Finger and Structural Similarity to ββαMe Endonucleases. Mol. Microbiol. 2000, 37 (5), 1280−1281. (228) Gonzalez-Ceron, G.; Miranda-Olivares, O. J.; Servin-Gonzalez, L. Characterization of the Methyl-Specific Restriction System of Streptomyces coelicolor A3(2) and of the Role Played by Laterally Acquired Nucleases. FEMS Microbiol. Lett. 2009, 301 (1), 35−43. (229) Xu, T.; Liang, J.; Chen, S.; Wang, L.; He, X.; You, D.; Wang, Z.; Li, A.; Xu, Z.; Zhou, X.; et al. DNA Phosphorothioation in Streptomyces lividans: Mutational Analysis of the dnd Locus. BMC Microbiol. 2009, 9, 41. (230) Liu, G.; Ou, H. Y.; Wang, T.; Li, L.; Tan, H.; Zhou, X.; Rajakumar, K.; Deng, Z.; He, X. Cleavage of Phosphorothioated DNA and Methylated DNA by the Type IV Restriction Endonuclease ScoMcrA. PLoS Genet. 2010, 6 (12), e1001253. (231) Bair, C. L.; Black, L. W. A Type IV Modification Dependent Restriction Nuclease That Targets Glucosylated Hydroxymethyl Cytosine Modified DNAs. J. Mol. Biol. 2007, 366 (3), 768−778. (232) He, X.; Hull, V.; Thomas, J. A.; Fu, X.; Gidwani, S.; Gupta, Y. K.; Black, L. W.; Xu, S. Y. Expression and Purification of a Single-Chain Type IV Restriction Enzyme Eco94GmrSD and Determination of Its Substrate Preference. Sci. Rep. 2015, 5, 9747. (233) Machnicka, M. A.; Kaminska, K. H.; Dunin-Horkawicz, S.; Bujnicki, J. M. Phylogenomics and Sequence-Structure-Function Relationships in the GmrSD Family of Type IV Restriction Enzymes. BMC Bioinf. 2015, 16, 336. (234) Grohmann, E.; Stanzer, T.; Schwab, H. The ParB Protein Encoded by the RP4 Par Region Is a Ca(2+)-Dependent Nuclease Linearizing Circular DNA Substrates. Microbiology 1997, 143 (12), 3889−3898. 
(235) Maindola, P.; Raina, R.; Goyal, P.; Atmakuri, K.; Ojha, A.; Gupta, S.; Christie, P. J.; Iyer, L. M.; Aravind, L.; Arockiasamy, A. Multiple Enzymatic Activities of ParB/Srx Superfamily Mediate Sexual Conflict among Conjugative Plasmids. Nat. Commun. 2014, 5, 5322. (236) Chi, Y. H.; Kim, S. Y.; Jung, I. J.; Shin, M. R.; Jung, Y. J.; Park, J. H.; Lee, E. S.; Maibam, P.; Kim, K. S.; Park, J. H.; et al. Dual Functions of Arabidopsis Sulfiredoxin: Acting as a Redox-Dependent Sulfinic Acid Reductase and as a Redox-Independent Nuclease Enzyme. FEBS Lett. 2012, 586 (19), 3493−3499. (237) Xu, S. Y.; Corvaglia, A. R.; Chan, S. H.; Zheng, Y.; Linder, P. A Type IV Modification-Dependent Restriction Enzyme SauUSI from Staphylococcus aureus subsp. aureus USA300. Nucleic Acids Res. 2011, 39 (13), 5597−5610. (238) Lacks, S.; Greenberg, B. A Deoxyribonuclease of Diplococcus pneumoniae Specific for Methylated DNA. J. Biol. Chem. 1975, 250 (11), 4060−4066. (239) Lu, L.; Patel, H.; Bissler, J. J. Optimizing DpnI Digestion Conditions to Detect Replicated DNA. BioTechniques 2002, 33 (2), 316−318. (240) Siwek, W.; Czapinska, H.; Bochtler, M.; Bujnicki, J. M.; Skowronek, K. Crystal Structure and Mechanism of Action of the N6-Methyladenine-Dependent Type IIM Restriction Endonuclease R.DpnI. Nucleic Acids Res. 2012, 40 (15), 7563−7572. (241) Mierzejewska, K.; Siwek, W.; Czapinska, H.; Kaus-Drobek, M.; Radlinska, M.; Skowronek, K.; Bujnicki, J. M.; Dadlez, M.; Bochtler, M. Structural Basis of the Methylation Specificity of R.DpnI. Nucleic Acids Res. 2014, 42 (13), 8745−8754. (242) Tarasova, G. V.; Nayakshina, T. N.; Degtyarev, S. K. Substrate Specificity of New Methyl-Directed DNA Endonuclease GlaI. BMC Mol. Biol. 2008, 9 (1), 7. (243) van der Veen, S.; Tang, C. M. The BER Necessities: The Repair of DNA Damage in Human-Adapted Bacterial Pathogens. Nat. Rev. Microbiol. 2015, 13 (2), 83−94.

(244) Pearl, L. H. Structure and Function in the Uracil-DNA Glycosylase Superfamily. Mutat. Res., DNA Repair 2000, 460 (3−4), 165−181. (245) Duncan, B. K.; Warner, H. R. Metabolism of Uracil-Containing DNA: Degradation of Bacteriophage PBS2 DNA in Bacillus subtilis. J. Virol. 1977, 22 (3), 835−838. (246) Berkner, K. L.; Folk, W. R. The Effects of Substituted Pyrimidines in DNAs on Cleavage by Sequence-Specific Endonucleases. J. Biol. Chem. 1979, 254 (7), 2551−2560. (247) Hauser, R.; Blasche, S.; Dokland, T.; Haggard-Ljungquist, E.; von Brunn, A.; Salas, M.; Casjens, S.; Molineux, I.; Uetz, P. Bacteriophage Protein-Protein Interactions. Adv. Virus Res. 2012, 83, 219−298. (248) Serrano-Heras, G.; Salas, M.; Bravo, A. A Uracil-DNA Glycosylase Inhibitor Encoded by a Non-Uracil Containing Viral DNA. J. Biol. Chem. 2006, 281 (11), 7068−7074. (249) Cole, A. R.; Ofer, S.; Ryzhenkova, K.; Baltulionis, G.; Hornyak, P.; Savva, R. Architecturally Diverse Proteins Converge on an Analogous Mechanism to Inactivate Uracil-DNA Glycosylase. Nucleic Acids Res. 2013, 41 (18), 8760−8775. (250) Putnam, C. D.; Shroyer, M. J.; Lundquist, A. J.; Mol, C. D.; Arvai, A. S.; Mosbaugh, D. W.; Tainer, J. A. Protein Mimicry of DNA from Crystal Structures of the Uracil-DNA Glycosylase Inhibitor Protein and Its Complex with Escherichia coli Uracil-DNA Glycosylase. J. Mol. Biol. 1999, 287 (2), 331−346. (251) Mol, C. D.; Arvai, A. S.; Sanderson, R. J.; Slupphaug, G.; Kavli, B.; Krokan, H. E.; Mosbaugh, D. W.; Tainer, J. A. Crystal Structure of Human Uracil-DNA Glycosylase in Complex with a Protein Inhibitor: Protein Mimicry of DNA. Cell 1995, 82 (5), 701−708. (252) Lu, S.; Le, S.; Tan, Y.; Li, M.; Liu, C.; Zhang, K.; Huang, J.; Chen, H.; Rao, X.; Zhu, J.; et al. Unlocking the Mystery of the Hard-to-Sequence Phage Genome: PaP1 Methylome and Bacterial Immunity. BMC Genomics 2014, 15, 803. (253) Weiss, B.
Endonuclease V of Escherichia coli Prevents Mutations from Nitrosative Deamination During Nitrate/Nitrite Respiration. Mutat. Res., DNA Repair 2001, 461 (4), 301−309. (254) Yao, M.; Kow, Y. W. Further Characterization of Escherichia coli Endonuclease V. Mechanism of Recognition for Deoxyinosine, Deoxyuridine, and Base Mismatches in DNA. J. Biol. Chem. 1997, 272 (49), 30774−30779. (255) Zhang, Z.; Jia, Q.; Zhou, C.; Xie, W. Crystal Structure of E. coli Endonuclease V, an Essential Enzyme for Deamination Repair. Sci. Rep. 2015, 5, 12754. (256) Dalhus, B.; Arvai, A. S.; Rosnes, I.; Olsen, O. E.; Backe, P. H.; Alseth, I.; Gao, H.; Cao, W.; Tainer, J. A.; Bjoras, M. Structures of Endonuclease V with DNA Reveal Initiation of Deaminated Adenine Repair. Nat. Struct. Mol. Biol. 2009, 16 (2), 138−143. (257) Roberts, G. A.; Stephanou, A. S.; Kanwar, N.; Dawson, A.; Cooper, L. P.; Chen, K.; Nutley, M.; Cooper, A.; Blakely, G. W.; Dryden, D. T. Exploring the DNA Mimicry of the Ocr Protein of Phage T7. Nucleic Acids Res. 2012, 40 (16), 8129−8143. (258) Kruger, D. H.; Gola, G.; Weisshuhn, I.; Hansen, S. The Ocr Gene Function of Bacterial Viruses T3 and T7 Prevents Host-Controlled Modification. J. Gen. Virol. 1978, 41 (1), 189−192. (259) Stephanou, A. S.; Roberts, G. A.; Cooper, L. P.; Clarke, D. J.; Thomson, A. R.; MacKay, C. L.; Nutley, M.; Cooper, A.; Dryden, D. T. Dissection of the DNA Mimicry of the Bacteriophage T7 Ocr Protein Using Chemical Modification. J. Mol. Biol. 2009, 391 (3), 565−576. (260) Loenen, W. A.; Murray, N. E. Modification Enhancement by the Restriction Alleviation Protein (Ral) of Bacteriophage Lambda. J. Mol. Biol. 1986, 190 (1), 11−22. (261) King, G.; Murray, N. E. Modification Enhancement and Restriction Alleviation by Bacteriophage Lambda. Gene 1995, 157 (1−2), 225. (262) King, G.; Murray, N. E. Restriction Alleviation and Modification Enhancement by the Rac Prophage of Escherichia coli K-12. Mol. Microbiol. 1995, 16 (4), 769−777.


(263) Iida, S.; Streiff, M. B.; Bickle, T. A.; Arber, W. Two DNA Antirestriction Systems of Bacteriophage P1, DarA, and DarB: Characterization of DarA− Phages. Virology 1987, 157 (1), 156−166. (264) McMahon, S. A.; Roberts, G. A.; Johnson, K. A.; Cooper, L. P.; Liu, H.; White, J. H.; Carter, L. G.; Sanghvi, B.; Oke, M.; Walkinshaw, M. D.; et al. Extensive DNA Mimicry by the ArdA Anti-Restriction Protein and Its Role in the Spread of Antibiotic Resistance. Nucleic Acids Res. 2009, 37 (15), 4887−4897. (265) Thomas, A. T.; Brammar, W. J.; Wilkins, B. M. Plasmid R16 ArdA Protein Preferentially Targets Restriction Activity of the Type I Restriction-Modification System EcoKI. J. Bacteriol. 2003, 185 (6), 2022−2025. (266) Makovets, S.; Powell, L. M.; Titheradge, A. J. B.; Blakely, G. W.; Murray, N. E. Is Modification Sufficient to Protect a Bacterial Chromosome from a Resident Restriction Endonuclease? Mol. Microbiol. 2004, 51 (1), 135−147. (267) Makovets, S.; Doronina, V. A.; Murray, N. E. Regulation of Endonuclease Activity by Proteolysis Prevents Breakage of Unmodified Bacterial Chromosomes by Type I Restriction Enzymes. Proc. Natl. Acad. Sci. U. S. A. 1999, 96 (17), 9757−9762. (268) Makovets, S.; Titheradge, A. J.; Murray, N. E. ClpX and ClpP Are Essential for the Efficient Acquisition of Genes Specifying Type IA and IB Restriction Systems. Mol. Microbiol. 1998, 28 (1), 25−35. (269) Serfiotis-Mitsa, D.; Herbert, A. P.; Roberts, G. A.; Soares, D. C.; White, J. H.; Blakely, G. W.; Uhrin, D.; Dryden, D. T. The Structure of the KlcA and ArdB Proteins Reveals a Novel Fold and Antirestriction Activity against Type I DNA Restriction Systems in Vivo but Not in Vitro. Nucleic Acids Res. 2010, 38 (5), 1723−1737. (270) Belogurov, A. A.; Delver, E. P.; Agafonova, O. V.; Belogurova, N. G.; Lee, L. Y.; Kado, C. I. Antirestriction Protein Ard (Type C) Encoded by IncW Plasmid pSA Has a High Similarity to the “Protein Transport” Domain of TraC1 Primase of Promiscuous Plasmid RP4. J. 
Mol. Biol. 2000, 296 (4), 969−977. (271) Keatch, S. A.; Leonard, P. G.; Ladbury, J. E.; Dryden, D. T. StpA Protein from Escherichia coli Condenses Supercoiled DNA in Preference to Linear DNA and Protects It from Digestion by DNase I and EcoKI. Nucleic Acids Res. 2005, 33 (20), 6540−6546. (272) Flyvbjerg, H.; Keatch, S. A.; Dryden, D. T. Strong Physical Constraints on Sequence-Specific Target Location by Proteins on DNA Molecules. Nucleic Acids Res. 2006, 34 (9), 2550−2557. (273) Pope, W. H.; Bowman, C. A.; Russell, D. A.; Jacobs-Sera, D.; Asai, D. J.; Cresawn, S. G.; Jacobs, W. R.; Hendrix, R. W.; Lawrence, J. G.; Hatfull, G. F.; et al. Whole Genome Comparison of a Large Collection of Mycobacteriophages Reveals a Continuum of Phage Genetic Diversity. eLife 2015, 4, e06416. (274) Labrie, S. J.; Samson, J. E.; Moineau, S. Bacteriophage Resistance Mechanisms. Nat. Rev. Microbiol. 2010, 8 (5), 317−327. (275) Dharmalingam, K.; Goldberg, E. B. Restriction in Vivo. IV. Effect of Restriction of Parental DNA on the Expression of Restriction Alleviation Systems in Phage T4. Virology 1979, 96 (2), 404−411. (276) Dharmalingam, K.; Goldberg, E. B. Phage-Coded Protein Prevents Restriction of Unmodified Progeny T4 DNA. Nature 1976, 260 (5550), 454−456. (277) Dharmalingam, K.; Goldberg, E. B. Mechanism Localisation and Control of Restriction Cleavage of Phage T4 and Lambda Chromosomes in Vivo. Nature 1976, 260 (5550), 406−410. (278) Ho, C. H.; Wang, H. C.; Ko, T. P.; Chang, Y. C.; Wang, A. H. The T4 Phage DNA Mimic Protein Arn Inhibits the DNA Binding Activity of the Bacterial Histone-Like Protein H-NS. J. Biol. Chem. 2014, 289 (39), 27046−27054. (279) Fass, E.; Groisman, E. A. Control of Salmonella Pathogenicity Island-2 Gene Expression. Curr. Opin. Microbiol. 2009, 12 (2), 199−204. (280) Dharmalingam, K.; Revel, H. R.; Goldberg, E. B. Physical Mapping and Cloning of Bacteriophage T4 Anti-Restriction Endonuclease Gene. J. Bacteriol. 1982, 149 (2), 694−699.
(281) Rifat, D.; Wright, N. T.; Varney, K. M.; Weber, D. J.; Black, L. W. Restriction Endonuclease Inhibitor IPI* of Bacteriophage T4: A Novel Structure for a Dedicated Target. J. Mol. Biol. 2008, 375 (3), 720−734.

(282) Bair, C. L.; Rifat, D.; Black, L. W. Exclusion of Glucosyl-Hydroxymethylcytosine DNA Containing Bacteriophages Is Overcome by the Injected Protein Inhibitor IPI*. J. Mol. Biol. 2007, 366 (3), 779−789.
