Molecular Mechanisms of Transcription Elongation in Archaea

Sep 11, 2013 - Molecular Mechanisms of Transcription Elongation in Archaea. Finn Werner*. RNAP Laboratory, Institute for Structural and Molecular Biol...
Review pubs.acs.org/CR

Finn Werner*

RNAP Laboratory, Institute for Structural and Molecular Biology, Division of Biosciences, University College London, Darwin Building, Gower Street, London WC1E 6BT, U.K.

5. Summary: A Paradigm Shift
Author Information
Corresponding Author
Notes
Biography
Acknowledgments
References

1. INTRODUCTION TO RNA POLYMERASES

1.1. Transcription and the Central Dogma

The central dogma of molecular biology describes the flow of information in biological systems by transcription and translation, collectively often referred to as gene expression.1 The first step, transcription, is the synthesis of RNA in a DNA-template-dependent fashion. In the second step, translation, proteins are synthesized in a messenger (m)RNA-template-dependent manner. RNA, however, is not only an information-carrying molecule (e.g., mRNA and the genomes of RNA viruses2), it also can adopt distinct structures capable of interacting with ligands in a highly specific manner (e.g., riboswitches3) and carry out chemical catalysis on its own (i.e., ribozymes4) or in the context of RNP (ribonucleoprotein) particles containing both RNA and protein components (e.g., ribosomes, telomerase, RNase P5). In summary, RNA is an interesting and versatile molecule with a genotype and a phenotype, and its synthesis by transcription is a worthy subject of study. Moreover, defects in the transcription machineries lead to the deregulation of gene expression that can result in severe morbidity and mortality, including a broad spectrum of cancers. The principal enzymes of transcription are RNA polymerases (RNAPs). When studying the structure of RNAP active centers, it becomes apparent that template-dependent transcription has emerged at least six times independently in evolution.6 Two of the more important categories of RNAPs are (i) the single-subunit enzymes with the prototypical palm−fingers−thumb active site motifs7 (including bacteriophage and mitochondrial RNAPs and replicative DNAPs) and (ii) the multisubunit RNAPs6c that are essential for the transcription of cellular genomes in the three domains of life, without exception.
All RNAPs make repeated use of the DNA template by progressing through the “transcription cycle” multiple times, which in essence is a highly regulated amplification step of the genetic information by which many RNA molecules are transcribed using the same template DNA molecule. According

CONTENTS
1. Introduction to RNA Polymerases
1.1. Transcription and the Central Dogma
1.2. Origin and Evolution of RNA Polymerase
1.3. Archaea: The Third Domain of Life
2. Subunit Organization of RNAPs in the Three Domains
2.1. RNAP Subunit Structure and Function
2.2. Structure of RNAP and Transcription Complexes
2.2.1. Overall Structural Organization of RNAP
2.2.2. Rigid and Flexible Structures Facilitate RNAP Function
3. The Transcription Cycle
3.1. Transcription Initiation: RNAP Recruitment and Abortive Cycling
3.2. Factor Swapping during Promoter Escape: TFE and Spt4/5
4. Transcription Elongation
4.1. Parameters Affecting Transcription Elongation
4.2. Structural Elements of RNAP Involved in Elongation
4.3. The Archaeal Jaws Rpo5 and -13: Interactions with the Downstream DNA
4.4. The Rpo4/7 Stalk: Holding onto the RNA Transcript
4.5. The Switch 3 Region: Melting the RNA−DNA Hybrid
4.6. Transcription Elongation Factors
4.7. Transcript Cleavage Factor TFS
4.8. Transcription Processivity Factors Are Universally Conserved in Evolution
4.9. Structure and Function of Spt4/5 in Archaea
4.10. Putative Functions of the Archaeal Spt5 KOW Domain: Coupling of Transcription and Translation
4.11. NusA and Antitermination

© XXXX American Chemical Society


Special Issue: 2013 Gene Expression


Received: April 30, 2013


dx.doi.org/10.1021/cr4002325 | Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

to the RNA world hypothesis, prior to the emergence of DNA as the principal repository of genetic information, RNA fulfilled this function as well as being the template for protein synthesis.4b,8 In this scenario, translation output would be limited by the copy number of the genome. In extant cells belonging to all three domains of life, the genomes consist of DNA. By utilizing an RNA intermediate for protein expression, many copies of RNA can be synthesized from the DNA template, which results in a dramatic increase of the dynamic range of gene expression. The transcription cycle can be divided into three main phases, initiation, elongation, and termination, all of which are modulated by external transcription factors and subject to regulation. In the following, I will briefly (i) introduce the evolution of multisubunit RNAPs that has resulted in the transcription systems in extant (i.e., contemporary) bacteria, archaea, and eukarya; (ii) describe the archaeal transcription cycle; and (iii) focus on the RNA synthesis phase of the transcription cycle, transcription elongation, emphasizing (iv) the structural elements of RNAP involved in elongation, as well as (v) the modus operandi of transcript cleavage and transcription processivity factors.

1.2. Origin and Evolution of RNA Polymerase

All multisubunit RNAPs are evolutionarily related throughout the three domains of life;6c this implies that they are derived from a common ancestral enzyme in a hypothetical organism referred to as the last universal common ancestor (LUCA9) that existed prior to the split of the bacterial, archaeal, and eukaryal lineages6c (Figure 1). The very early origin of RNAPs at the dawn of life is uncertain, but it has been hypothesized that a noncatalytic RNA binding protein that bound to a catalytically active ribozyme RNAP gave rise to proteinaceous RNAPs.6b Much less speculative than the origin of RNAPs is the evolutionary relationship between extant enzymes.

1.3. Archaea: The Third Domain of Life

The groundbreaking work of Carl Woese in the late 1980s revealed that life on earth is organized into three domains: Bacteria, Archaea, and Eukarya10 (Figure 1). Archaea are, like bacteria, prokaryotes that lack internal membranes and organelles, largely live a unicellular microbial lifestyle, and have modestly sized genomes, a polycistronic gene organization, and relatively simple regulatory networks.11 However, a closer inspection of their genomic sequences (including rRNA sequences), and in particular the composition and architecture of the machines that carry out information processing, transcription and translation, reveals that archaea are much more closely related to eukaryotes than they are to bacteria.10 Members of the archaea achieved notoriety as extremophiles that can survive in extremes of temperature, salinity, and acidity, but over the past decade, it has become apparent that they inhabit every available niche on planet Earth, including the human body.12 Archaea have emerged as attractive model systems because they are homologous to, yet often more simple or streamlined than, their eukaryotic counterparts and because of their high biochemical tractability.13 Proteins from archaea, and in particular from hyperthermophilic archaea, are often expressed at high levels and in soluble form in heterologous expression systems such as Escherichia coli, which is a distinctive advantage for structure determination and structure−function analysis using recombinant systems. Finally, their genome size (typically 1.5−3 Mbp) and small number of genes (∼1500−3000) simplify global approaches including whole-genome occupancy profiling, transcriptomics, and proteomics, as compared to eukaryotic systems.14

2. SUBUNIT ORGANIZATION OF RNAPS IN THE THREE DOMAINS

The subunits of RNAPs have a different nomenclature in the three domains of life (Table 1). Archaeal RNAP subunits are named Rpo (RNA polymerase) followed by a number (1, 2, etc.); an older classification uses Rpo followed by a Roman letter (RpoA, B, etc.) or only the letter (A, B, etc.). The RNAPII system in eukaryotic organisms uses the RPB acronym (RNA polymerase B, i.e., II) followed by the same numbering as the archaeal subunits (i.e., RPB7 corresponds to Rpo7). Bacterial RNAP subunits are named using Greek letters (α, β, β′, and ω).

2.1. RNAP Subunit Structure and Function

The subunit organization of all multisubunit RNAPs reflects a common architecture based on five subunits that present the universally conserved core of the RNAP (Figure 1).15 This core roughly corresponds to the modern bacterial RNAP and encompasses about 75% of the protein mass of archaeal and eukaryotic RNAPs6c (Table 1). These include the assembly platform, the catalytic subunits, and a small subunit (ω, Rpo6 in archaea) that is involved in enzyme stability16 (Table 1). The assembly of RNAP is nucleated by the formation of the assembly platform, which consists of a homodimer of α subunits in most bacteria (including E. coli)17 and a heterotetramer in eukaryotes (RPB3/11/10/12)18 and archaea (Rpo3/11/10/12).19 Since both Rpo3 and 11 are homologous to α in bacteria, the former appear to have evolved by duplication and speciation of an ancestral form of the latter (Table 1). In this context it is noteworthy that bacteria from the Francisella genus use a heterodimer consisting of two

Figure 1. RNA polymerase structure in the three domains of life. The figure illustrates the structures of RNAPs in Bacteria (A, Thermus aquaticus, pdb 1I6V), Archaea (B, Sulfolobus shibatae, pdb 4B1O) and Eukarya (C, Saccharomyces cerevisiae RNAPII, pdb 1NT9). The last universal common ancestor (LUCA) is indicated at the branch of the bacterial and archaeo-eukaryotic lineages. Homologous subunits are color coded and clearly demonstrate the high degree of similarity between all three transcription engines and, in particular, how the archaeal RNAP mirrors eukaryotic RNAPII.


bacteria.6c Added RNAP subunits extend the assembly platform (Rpo10 and -12), add downstream DNA template binding sites (Rpo5 and -13), and provide RNA transcript and transcription factor binding sites (Rpo4 and -7). Studies that crucially depend on recombinant, in vitro reconstituted RNAPs provided unequivocal evidence that the combination of assembly platform and catalytic subunits is necessary and sufficient to carry out relatively complex RNAP functions, including promoter-directed and start-site-specific transcription initiation and transcription elongation, whereas the auxiliary subunits were not strictly required.13 The RNAP subunit composition varies between the two main phyla of Archaea, crenarchaea and euryarchaea.25 The crenarchaeal RNAP is more closely related to the eukaryotic RNAPII because it contains Rpo8, while the euryarchaeal RNAP does not.25b Archaeal species belonging to the order of Sulfolobales and Desulfurococcales and by inference Acidilobales and Korarchaea contain an RNAP subunit that is unique to the archaeal domain, Rpo1326 (section 4.3). Some bacterial RNAP subunits contain lineage-specific insertions15b and at least one (Gram-positive) bacteria-specific subunit, δ, has been discovered even though its function remains opaque.27 In eukaryotes several parallel transcription systems with distinct RNAPs have evolved for the expression of nonoverlapping subsets or classes of genes, three in metazoans and five in plants. In terms of subunit composition, the majority of eukaryotic RNAP subunits are derived from ancestral versions that had emerged before the split of the archaeo-eukaryotic lineages. The three canonical classes of metazoan (animal) RNAPs have five subunits in common (RPB5, -6, -8, -10, and -12).28 The significance of this conservation is unclear, but the other subunits underwent alteration by duplication and speciation, thus giving rise to novel paralogues and distinct transcription systems.
Since the catalytic mechanism of the RNAPs remained the same, it is likely that the new/emerging properties of the diverged RNAPs mostly concerned the nature of the gene repertoire and its regulation, and it very likely involved coevolution of RNAP subunits and basal transcription initiation factors. RNAPI and -III contain subunits that are paralogous to transcription factors of the RNAPII system, including TFIIE and -F (Table 1).29 This suggests that ancestral versions of once reversibly associated RNAPII transcription factors were stably incorporated into ancestors of RNAPI and -III, a mechanism referred to as transcription factor “capture”.30 Some capture events can be rationalized; e.g., RNAPIII transcribes predominantly short (10-fold), but the

4.4. The Rpo4/7 Stalk: Holding onto the RNA Transcript

The most prominent difference between bacterial and archaeo-eukaryotic RNAPs is the stalk domain. The stalk is comprised of two subunits, Rpo4 and -7; it is present in all classes of eukaryotic and archaeal RNAPs (Table 1), but no genuine bacterial homologues have been identified yet (Figure 1 and Table 1). Rpo7 has an elongated shape, contains an RNA-binding S1 domain (OB fold), and anchors the stalk to the RNAP core just below the clamp via Rpo1 and -6 (Figure 2D). RNAPs lacking Rpo6 cannot stably incorporate the stalk.35a,75 Rpo4 binds to Rpo7 opposite the RNA-binding surface and stabilizes its structure19,76 (Figure 6C; residues involved in RNA binding are highlighted in red). The nascent RNA transcript emerges from the RNAP through the RNA exit channel (highlighted as red dotted circle in Figure 6E) and is directed toward the stalk, where it interacts with Rpo7 via


Figure 6. Interactions between RNAP and RNA modulate transcription elongation. Transcription elongation complexes in eukaryotes (A, RNAPII−DNA−RNA, S. cerevisiae, pdb 1Y1W) and archaea (B, RNAP−DNA, S. shibatae, pdb 4B1O) are closely related on the structural level. The RNAP stalk that is important for processivity is highlighted in blue (Rpo7) and magenta (Rpo4), switch 3 is highlighted as a red surface mesh, the bridge and trigger helices are in green, and the DNA and RNA are shown as bright orange and red half-ladders, respectively. In addition to the large subunits, the interaction with downstream DNA also involves the RNAP jaws consisting of Rpo5 and -13, highlighted as a pink dotted circle. Panel C shows the archaeal RNAP stalk (M. jannaschii, pdb 1GO3) with amino acid residues involved in RNA binding highlighted as red spheres. Panels D and E show a close-up of the active site and DNA−RNA hybrid binding compartment of the eukaryotic elongation complex, highlighting the potential role of switch 3 in separating the RNA from the DNA template strand. The RNA exit channel is shown as a red dotted circle, and all elements are color coded according to the key.

electrostatic interactions.77 So far it has not been possible to resolve the structure of the RNA transcript bound to the stalk, which is likely due to its conformational flexibility; however, UV cross-linking experiments have unequivocally demonstrated the interaction between the transcript and the stalk of RNAPII.77 M. jannaschii RNAPs assembled from recombinant subunits lacking Rpo4 and -7 are capable of initiating transcription in a TBP/TFB- and promoter-dependent manner.13 However, they are defective in transcription elongation and show severely reduced processivity in vitro, in particular on DNA templates lacking the nontemplate DNA strand (NTS).78 Furthermore, Rpo7 mutations that compromise RNA binding show processivity defects comparable to the loss of the stalk, which suggests that RNA binding stimulates the elongation phase of transcription. This is congruent with the notion that interactions between the RNAP and the nucleic acid components of the TEC, in this case Rpo7 and RNA, lead to its stabilization and counteract dissociation, thus resulting in longer transcripts being synthesized in vitro.78 The average operon size in bacteria and archaea is similar, and notably smaller than that of eukaryotic genes. The enhanced processivity of stalk-containing RNAPs might have been a prerequisite for the expansion of the gene size during the evolution of eukaryotes, where genes due to the intron−exon


structure can reach lengths of several million bases.79 The results described above were obtained using purely recombinant components and in vitro transcription assays using synthetic elongation scaffolds; however, the outcome is corroborated in vivo by yeast genetics.80 The eukaryotic homologue of Rpo4 is not essential for cell viability in baker's yeast, S. cerevisiae. However, rpb4 deletion strains are temperature-sensitive and RNAP occupancy profiling at the permissive temperature (24 °C) shows that RNAPII is depleted from the 3′-ends of long genes, i.e., loss of rpb4 function at low temperatures results in a moderate processivity defect.80 At the nonpermissive temperature (37 °C), transcription is shut down globally, likely due to a catastrophic elongation defect, which emphasizes the strict dependence of transcription of all class II genes on the RNAP stalk.81 The temperature dependency of the rpb4 deletion could reflect that RPB7 is destabilized sufficiently to denature in the absence of RPB4 and/or that the interactions between the RNA transcript and RNAP are more important for the stability of the TEC at high temperatures. The former hypothesis is corroborated by the fact that recombinant M. jannaschii Rpo7 is completely insoluble and that coexpression of Rpo4 alleviates this problem by forming a soluble and well-folded Rpo4/7 complex.19,76 From a technical point of view, it is noteworthy that RNAPII purified from this strain contains neither RPB4 nor RPB7, indicating that RPB4 stabilizes RPB7 and its interaction with the core RNAP.82 In eukaryotes, the stalk also serves as a recruitment platform for RNA 3′-end processing factors, and the deletion of rpb4 leads to an altered polyadenylation site usage, which implies that the phenotype of the rpb4 deletion could be confounded by additional mechanisms other than processivity defects.80 A genetic study in the euryarchaeal hyperthermophile Thermococcus kodakaraensis confirms the overall trend observed in yeast. Whereas the Rpb7 homologue Rpo7 appears essential for cell viability (because it cannot be deleted), the deletion of the Rpb4 homologue Rpo4 leads to a temperature-sensitive phenotype displaying normal growth at 70 °C and slow growth and lower cell densities at 85 °C.83 Moreover, T. kodakaraensis RNAPs purified from the Rpo4 deletion strain lack, like the yeast enzyme, the RNAP stalk, which illustrates nicely how the archaeal RNAP serves as a valid model system for eukaryotic RNAPII. However, in T. kodakaraensis, RNAPs lacking the stalk did not show the processivity defects observed in S. cerevisiae or M. jannaschii, which is likely due to different assay conditions (in vivo versus in vitro, promoter-directed versus elongation scaffold assays, reaction temperatures, etc.) or due to species-specific differences.

4.5. The Switch 3 Region: Melting the RNA−DNA Hybrid

The handling of the nucleic acid strands is a major task during the transcription process, and RNAPs devote substantial efforts to this effect. The interactions among RNAP subunits, the template and nontemplate strand DNA, and the RNA transcript are schematically illustrated in Figure 5. In the TEC the DNA downstream of the active center is held between the jaws of RNAP by subunits Rpo1, -2, -5, and -13 and in the DNA binding channel (Figures 5 and 6). The DNA strands are separated just downstream of the catalytic center (register +3) and rewind (reanneal at +11) on the upstream face of RNAP, where it does not interact with the duplex DNA. Whereas the template strand (TS) is threaded through the active site, the nontemplate strand (NTS) runs over the RNAP clamp. The NTS, like the RNA transcript longer than 10 nucleotides, has not been resolved crystallographically in archaeal or eukaryotic RNAPs, but elegant single molecule FRET experiments have determined its path across the clamp.84 The RNA−DNA hybrid is at a right angle to the downstream DNA duplex (Figure 6E). NTP substrates are likely to enter the active site via the NTP entry pore and bind, one at a time, in the active site of RNAP complementary to the TS by Watson−Crick base-pairing. An alternative model has been proposed by which the NTPs enter the active site environment via the DNA binding channel and start to form base pairs with the TS downstream of the +1 register.85 At the upstream face of RNAP the strands of the RNA−DNA hybrid are separated prior to the reannealing of the TS and the NTS, making the overall process of making and breaking hydrogen bonds during transcription elongation energy neutral86 (Figure 5). The RNA−DNA hybrid has a defined length (9 or 10 bp87), and transcription elongation rates are affected by its maintenance, i.e., by the efficient separation of the RNA and DNA strands as well as the reannealing of the TS and NTS.88 Unless one base pair of the RNA−DNA hybrid has been separated, no additional NTP can be incorporated at the 3′-end of the nascent RNA transcript.89 Several flexible and structurally disordered protein loops protrude into the RNAP cleft to handle the TS, NTS, and RNA; they include the lid, rudder, switch, and fork loop motifs of RNAP.39,90 The switch 3 motif is of special interest in the context of elongation (Figure 6D,E, highlighted as red mesh). In structural models of eukaryotic RNAPII, switch 3 interacts with the first base of the RNA that has disengaged from the TS and is likely to facilitate the melting of the hybrid (Figure 6D,E). Santangelo and co-workers replaced switch 3 of the T. kodakaraensis RNAP with a glycine linker and investigated the effect on archaeal RNAP activity in in vitro transcription assays. In the bacterial RNAP the deletion of switch 3 has a dramatic effect on both elongation complex stability and translocation rates, which makes it difficult to discern the consequences of an impaired separation of the RNA−DNA hybrid.91 The loss of switch 3 in the archaeal RNAP also led to transcription elongation defects, yet the stability of the elongation complex was not compromised.90 This establishes a direct link between hybrid separation and the elongation rate of RNAP. In addition to the RNAP subunits, exogenous factors are important determinants for transcription elongation.

4.6. Transcription Elongation Factors

A range of transcription factors assist RNAP during the elongation phase of the transcription cycle; they alter the translocation rate and/or the processivity of the TEC. In archaea, similar to eukaryotes but in a less sophisticated manner, the DNA template for transcription is chromatinized by histones92 and other factors such as Alba.93 The proteins bound to the DNA template exert a global repressive effect on transcription because they deny access of initiation factors to the promoter and present a physical obstacle for the TEC.94 Modification systems that alter the DNA binding properties of chromatin proteins and thereby regulate the template accessibility are in effect also transcription elongation factors, but they are discussed elsewhere.95 Below I will elaborate on the two most important classes of transcription elongation factors in archaea, namely, the transcript cleavage and processivity factors.

4.7. Transcript Cleavage Factor TFS

All multisubunit RNAPs have the propensity to pause and stall and, unlike single-subunit enzymes, the ability to backtrack, which is a retrograde movement of RNAP along the DNA template.96


Figure 7. RNA polymerization and transcript cleavage are facilitated by the same RNAP active center. Panels A and B are schematic representations of the two principal chemical reactions RNAPs catalyze, RNA polymerization and cleavage, respectively. Both reactions are nucleophilic substitutions catalyzed by two magnesium ions in the RNAP active site (MgA and MgB, highlighted as pink spheres). In panel A, the 3′-hydroxyl moiety of the RNA attacks the α phosphate group of the incoming NTP substrate, and a new phosphodiester bond is formed while pyrophosphate serves as the leaving group. In panel B, a water molecule attacks an internal phosphodiester bond, and a short RNA transcript (RNA2) is released. The attacking lone pair is highlighted as two small red dots, and the scissile bond is highlighted in blue. In both reactions the pentavalent transition state is stabilized by the two magnesium ions, but whereas MgB coordinates all three phosphate groups (α, β, and γ) of the NTP in reaction A, it needs an additional ligand for reaction B to occur efficiently. The additional ligand can be provided by the Asp/Glu loop of a transcript cleavage factor (highlighted in yellow) that is inserted into the active site via the NTP entry pore. Panel C shows the active site environment of the eukaryotic RNAPII-TFIIS complex (S. cerevisiae, pdb 1Y1V) illustrating the close proximity of the bridge helix (green), TFIIS (red ribbon), and its Asp/Glu loop (yellow) with MgA (pink sphere).
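
The two reactions summarized in the caption of Figure 7 can be written as overall schemes. This is a minimal sketch rather than a reproduction of the figure: the chain-length indices n and m and the label RNA1 for the retained 5′ fragment are notational assumptions introduced here (the caption names only RNA2), and the two active-site magnesium ions are shown simply as a label over the arrow.

```latex
% Requires \usepackage{amsmath}. Indices n, m and the label RNA1 are
% illustrative assumptions; only RNA2 is named in the figure caption.
\begin{align*}
\text{(A) polymerization:}\quad
  & \mathrm{RNA}_{n}\text{-}3'\mathrm{OH} + \mathrm{NTP}
    \xrightarrow{\;\mathrm{Mg_A},\,\mathrm{Mg_B}\;}
    \mathrm{RNA}_{n+1}\text{-}3'\mathrm{OH} + \mathrm{PP_i} \\
\text{(B) cleavage:}\quad
  & \mathrm{RNA}_{n} + \mathrm{H_2O}
    \xrightarrow{\;\mathrm{Mg_A},\,\mathrm{Mg_B}\;}
    \mathrm{RNA1}_{\,n-m}\text{-}3'\mathrm{OH} + \mathrm{RNA2}_{\,m}
\end{align*}
```

In scheme A the nucleophile is the RNA 3′-hydroxyl and pyrophosphate leaves; in scheme B the nucleophile is water and the short 3′ fragment (RNA2) leaves, which regenerates a 3′-hydroxyl end in the active site.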

This is possible because the 3′-end of the RNA can exit the active center through the NTP entry pore of RNAP, and as a consequence, backtracked complexes do not contain an RNA 3′-end in the active site.97 RNAPs overcome this problem by turning to an alternative catalytic mechanism, transcript cleavage, an endonucleolytic reaction during which an activated water molecule carries out a nucleophilic attack on an internal phosphodiester bond of the RNA63 (Figure 7). At elevated pH values (pH 9.5−10) this reaction occurs spontaneously since hydroxyl ions are stronger nucleophiles than water molecules.98 At neutral pH values only a small residual transcript cleavage activity remains that is strictly dependent on the RNAPII subunit RPB999 (Figure 8, Table 1). Efficient cleavage is enabled by transcript cleavage factors that bind to RNAP and insert a domain into its catalytic center via the NTP entry pore63,100 (Figure 8, highlighted in red). Two carboxylate groups (DE residues) at the tip of this insertion domain coordinate and thereby stabilize the second active site magnesium ion (MgB) that is crucial for the stabilization of the penta-coordinated transition state of the cleavage reaction (Figure 7B). MgB bound to the short RNA cleavage product exits the RNAP active site through the NTP entry pore, while the newly generated RNA 3′-end is perfectly aligned in the active site for the next round of phosphodiester bond formation. Transcript cleavage factors are present in all three domains of life, the homologous TFS and TFIIS factors in archaea101 and eukaryotes,100 respectively, and the structurally and evolutionarily unrelated, but functionally analogous, Gre proteins in bacteria.102 Archaeal TFS only encompasses two of the three domains present in the RNAPII transcript cleavage factor TFIIS. Sequence alignments reveal that these domains, including the carboxylate residues critical for transcript cleavage, are conserved in RNAP subunits present in all eukaryotic transcription systems: A12 (DE residues) in RNAPI, RPB9 (DT) in RNAPII, and C11 (DE) in RNAPIII (Table 1). Despite their active centers being virtually identical, RNAPIII can undergo transcript cleavage at neutral pH without assistance from external factors but dependent on the C11 subunit, while RNAPII depends not only on RPB9 but also on TFIIS. Swapping the C-terminal domain of RPB9 with its counterpart from C11 renders RNAPII highly competent at


Figure 8. Rescue of backtracked complexes by transcript cleavage factors. Due to the lack of structural information of archaeal RNAP−transcript cleavage factor complexes, the homologous yeast structure is shown. This figure shows a reactivation intermediate of the RNAPII−TFIIS complex including a short RNA−DNA hybrid (S. cerevisiae, pdb 3PO3). In panel A, the locations of the paralogous RPB9 RNAP subunit and TFIIS transcript cleavage factor are highlighted in green and red, respectively. The N- and C-terminal Zn-ribbons of RPB9 are homologous to TFIIS domains 2 and 3, respectively (Zn ions are highlighted as olive spheres). Panel B is a side view of the complex from a perspective indicated by the dotted arrow in panel A. Domain 3 (d3) of TFIIS is inserted into the active site via the NTP entry pore, where the Asp/Glu loop is proximal to MgA, while MgB was not resolved in this structure. RNAPII variants containing a synthetic/chimeric RPB9−C11 fusion harbor the C11-derived C-ribbon in a location akin to domain 3 (d3) of TFIIS in the RNAPII−TFIIS complex (red circle). The “relocation” (indicated with a green−red block arrow in B) of the C11 C-ribbon likely provides the molecular basis for RNAPIII’s high endogenous transcript cleavage activity as the C11 RNAPIII subunit carries out the function of the TFIIS transcription factor (see the text for details). Note that the TFIIS domain 3 catalytic glutamate and aspartate residues were substituted for alanine residues (highlighted in cyan) in order to stabilize the complex, i.e., in order to prevent in situ transcript cleavage. Panel D shows the same features as C, but the RPB2 subunit has been removed to enable a clear view into the active site of RNAP.

transcript cleavage at levels comparable to RNAPIII and independently of TFIIS.99 A structure of RNAPII that contains the chimeric RPB9−C11 fusion protein reveals that the RNAP active center is unperturbed, and thus rules out a purely allosteric effect of the RPB9−C11 fusion on cleavage activity. Rather, the C11-derived C-terminal domain of the RPB9−C11 fusion protein is highly mobile, in contrast to the paralogous RPB9 domain, and alanine substitutions of the carboxylate residues abolish this activity. Altogether this provides strong evidence that the C11 domain of the RPB9−C11 fusion is flexible and can be inserted into the RNAP active site, where it stimulates cleavage in a fashion similar to the transcript cleavage factor TFIIS99 (Figure 8B). This suggests that C11 acts as an inbuilt cleavage factor rendering RNAPIII independent of exogenous cleavage factors. Where does that leave archaeal TFS? The archaeal RNAP is not prone to transcript cleavage in the absence of factors at neutral pH, similar to RNAPII, congruent with the absence of a bona fide RPB9 RNAP subunit. TFS from Methanococcus thermolithotrophicus is not stably incorporated into the RNAP (unlike RPB9) and it stimulates transcript cleavage like C11, the chimeric RPB9−C11 fusion, and TFIIS.99,101,103 Replacing the C-terminal domain of Pyrococcus furiosus TFS with the cognate domain of C11 does not further enhance its transcript cleavage activity, but mutating the DE catalytic carboxylate residues of either the TFS or the TFS−C11 fusion abolishes cleavage.99 In summary, despite its similarity to RPB9 on the sequence level, the C-terminal domain of archaeal TFS is functionally reminiscent of C11 and altogether has the function of a genuine transcript cleavage factor such as TFIIS. The phylogenetic distribution of these proteins suggests that an ancestral version of the cleavage factors emerged after LUCA, but before the split of the archaeo-eukaryotic lineages (Figure 9). This factor had a two-domain organization and in all likelihood reversibly associated with RNAP, both of which are traits akin to archaeal TFS. In eukaryotes, this ancestral gene underwent duplication and

dx.doi.org/10.1021/cr4002325 | Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

speciation, and while one paralogue was captured by its host RNAP and stably incorporated as a subunit (RPB9), the “free” form (TFIIS) still reversibly associated with RNAP and modulated its function.104 As part of the multiplication of transcription systems in eukaryotes, the gene encoding the RPB9-like subunit triplicated and speciated into the A12, RPB9, and C11 RNAP subunits in extant eukaryotes (Figure 9).

Why are cleavage factors so pervasive in many transcription systems? One possibility is that the active site architecture of multisubunit RNAPs (the double-psi beta barrel configuration) is prone to backtracking, extrusion of the RNA 3′-end through the entry pore, and arrest, and transcript cleavage factors are essential to overcome that problem. TFIIS has also been implicated in RNAPI and -III transcription complexes,105 and in addition to the release of stalled TECs by transcript cleavage, it may play additional roles during the transcription cycle, including improving fidelity during catalysis and stabilizing the PIC during initiation, all of which could have provided a selective advantage in evolution.103,106

Figure 9. Putative evolution of transcript cleavage factors and RPB9-like subunits. The ancestral two-domain version (d2−d3) of the transcription factor emerged in the archaeo-eukaryotic lineage (A-E) and in all likelihood associated reversibly with RNAP. After the split of the two lineages, the gene duplicated and speciated in eukaryotes (E) but remained largely unchanged in archaea (A). The eukaryotic pre-TFIIS factor expanded its structural repertoire to include a third domain (d1−d2−d3), thus forming the modern TFIIS while remaining a dissociable factor. Its paralogue, pre-RPB9, was stably incorporated into the cognate RNAP in a process referred to as transcription factor “capture”.30 As part of the multiplication process of RNAPs in eukaryotes, three distinct paralogues of pre-RPB9 evolved: A12 in RNAPI, RPB9 in RNAPII, and C11 in RNAPIII. While C11 ensures efficient transcript cleavage in RNAPIII, RPB9 only poorly supports transcript cleavage, rendering RNAPII dependent on TFIIS in order to release stalled elongation complexes. Interestingly, a fusion of RPB9 and C11 restores the transcript cleavage activity of RPB9, and structural information suggests that the C11-derived (C-ribbon) domain of the fusion protein is sufficiently mobile to reach into the active site similar to the d3 domain of TFIIS (see Figure 7B) and stimulate transcript cleavage (see the text for details).

4.8. Transcription Processivity Factors Are Universally Conserved in Evolution

The very core of multisubunit RNAPs is universally conserved in evolution, whereas the factors that enable transcription initiation in bacteria are fundamentally different from those in archaea and eukaryotes (Table 1). The only RNAP-associated transcription factor that is present in all three domains of life and therefore emerged before the time of LUCA is the transcription elongation factor Spt5 (also known as NusG in bacteria).59 The universal conservation demonstrates that Spt5-like factors are evolutionarily ancient and suggests that early RNAPs (during the time of LUCA) were assisted and possibly regulated during elongation, rather than initiation, of transcription, which provides the rationale for the “elongation first hypothesis”.6c It makes sense that RNA synthesis (elongation) emerged prior to its start-site (and termination-site) specificity. Transcription initiation in LUCA could have been relatively random or with a weak sequence bias toward T/A-rich elements that tend to distort and melt more readily than G/C-rich sequences. After LUCA, factors are likely to have evolved that assisted the recruitment of their cognate RNAPs to transcription start sites; in the archaeo-eukaryotic lineage TBP coevolved with the TATA box (consensus box in M. jannaschii, TATATATA107), while in the bacterial lineage the sigma factors coevolved with the Pribnow box (−10 consensus in E. coli TATAAT) and the −35 promoter element.108 The small and relatively simple genomes of LUCA were quite possibly transcribed into long polycistronic mRNAs, creating the need for factors that would ensure a high processivity of transcription such as Spt5 and NusG. What is the structure of Spt5, and how does it enhance elongation?

4.9. Structure and Function of Spt4/5 in Archaea

All Spt5 factors contain a universally conserved NGN (NusG N-terminus) domain that interacts with the RNAP clamp coiled coil109 and with its small interaction partner, Spt4, in archaea and eukaryotes46b,59 (Figure 10A). In addition, Spt5 has one (archaea and bacteria) or multiple (eukaryotes) KOW domains (Figure 10A, highlighted in light pink) that interact with exogenous factors (see section 4.10). Spt4 is built around a Zn-ribbon, and it stabilizes the Spt5 NGN domain in archaea and eukaryotes,46b whereas the NusG NGN domain is perfectly stable and in structural terms well-ordered in solution without the need for any binding partners.110 The location of Spt4 in the elongation complex is at the upstream face of RNAP and proximal to the site where the DNA strands reanneal, and it could make direct contacts with upstream DNA (Figure 10C, highlighted with red dashed circle). A recent study in mammalian cells indicates that Spt4 is critical for the transcription of long trinucleotide repeats; however, it is problematic to differentiate between direct effects attributable to Spt4 and indirect effects caused by the stabilization of the Spt5 NGN domain.111 The archaeal Spt4/5 complex associates readily with its cognate RNA polymerase and stimulates the processivity of RNAP46b,112 (Figure 10B). The structure of the RNAP−Spt4/5 complex reveals the molecular basis of this activity (Figure 11).46b,109,112 The Spt4/5 complex binds to the tip of the RNAP clamp coiled coil, and the bulk of the protein is located across the DNA binding channel. In transcription elongation complexes, this locks the DNA template (Figure 11, highlighted as bright orange surface) into the RNAP and prevents the dissociation of the RNAP−DNA−RNA elongation complex, thereby facilitating high transcription processivity (Figure 11C to D).

In the cytosol, Spt4/5 can reversibly associate with “free” RNAP (Figure 11B to D), and since Spt4/5 blocks access to the DNA binding channel, the RNAP−Spt4/5 complex cannot engage with the genome (Figure 11D to C).55 How is this obstacle overcome at promoters? The


solution to this conundrum is the initiation factor TFE (section 3.2); archaeal TFE is able to displace Spt4/5 from the RNAP at the promoter, since (i) the binding sites of the Spt5 NGN domain and the TFE WH domain are overlapping, (ii) the binding of the two factors is mutually exclusive, and (iii) RNAP in the context of the PIC has a higher affinity for TFE than for Spt4/5.55 The outcome of the entire ensemble of RNAP and initiation and elongation factors is an increase of the promoter specificity of archaeal RNAP.113 In theory, a similar mechanism is at work in bacteria, since the initiation factor sigma 70 and the bacterial Spt5 homologue NusG bind competitively to the bacterial RNAP (section 3.2).

Figure 10. Archaeal Spt4/5. Panel A shows the structure of the archaeal Spt4/5 processivity factor (P. furiosus, pdb 3P8B) with the NGN and KOW domains highlighted in firebrick red and pea green, respectively, and Spt4 colored in wheat. Panel B shows a model of the archaeal RNAP−DNA−Spt4/5 structure (based on crystal structures and modified from ref 59). Panel C shows a model of the eukaryotic (S. cerevisiae) RNAPII−DNA−Spt4/5 complex including the upstream portion of the template DNA whose location was determined using the Förster resonance energy transfer based nanopositioning system (NPS).84 The proximity of Spt4 and the upstream DNA is highlighted with a dashed circle. Domains and structural motifs are color coded according to the key in the figure.

In eukaryotes, in particular metazoans, Spt4/5 in conjunction with transcription factors NELF and P-TEFb facilitates another important mode of regulating transcription and possibly mRNA processing.114 Due to “promoter proximal pausing” the elongation complex stalls approximately 40 base pairs downstream of the transcription start site.115 A poorly characterized signaling event induces phosphorylation of RNAPII and transcription factors, which releases the stalled RNAPs and swiftly induces gene expression. The biological role of this mechanism is not entirely clear; it could facilitate mRNA processing (e.g., capping) or simply be a quick means for transcription activation circumventing the need to assemble PICs first.116 The details of this mechanism are discussed elsewhere in this issue of Chemical Reviews (Price et al.).117

Figure 11. Modus operandi of Spt4/5. Transcription elongation by the RNAP−DNA complex (A) is frequently interrupted by pausing and stalling that can cause the TEC to dissociate and transcription to terminate prematurely (B). The incorporation of the processivity factor Spt4/5 into the elongation complex (C) secures the template DNA in the RNAP and prevents dissociation (C to D, highlighted with a red cross). “Free” RNAP (B) can associate with Spt4/5 (D), which prevents binding to DNA sequences other than at promoters. At promoters, TFE displaces Spt4/5 and facilitates efficient transcription initiation; the outcome is an increase in promoter specificity of RNAP by Spt4/5.

4.10. Putative Functions of the Archaeal Spt5 KOW Domain: Coupling of Transcription and Translation

The NGN domain of Spt5 and NusG is necessary and sufficient for the processivity activity of the factor,46b,110 while the KOW domain interacts with accessory factors including the rho termination factor,110 the ribosomal protein S10 (also known as the antitermination factor NusE) in bacteria,118 and a plethora of RNA processing and chromatin remodelling factors in eukaryotes.119 The interplay among bacterial RNAP, the elongation factor NusG, the termination factor rho, and the ribosome is particularly elegant. In bacteria, the transcription and translation rates correlate over a broad range of growth conditions.120 While both elongation rates are variable, the ratio between them remains remarkably constant at three nucleotides polymerized for every amino acid synthesized.121 This in effect ensures that the yield of transcription (RNA) meets the need for translation, which implies a tight regulation of resources that is crucial for the competitiveness of microorganisms that live under chronic energy stress.122 The direct interaction between the RNAP-bound NusG KOW domain and the ribosomal protein S10 may provide the molecular rationale for this phenomenon.118 Moreover, S10 and the termination factor rho compete for binding to the NusG KOW domain;118 while RNAP is transcribing protein-encoding genes, ribosomes translate the mRNA cotranscriptionally and RNAP-bound NusG interacts with the ribosomal protein S10, which prevents recruitment of rho.121 Once the ribosome has encountered a stop codon and terminated translation, the NusG−S10 interaction is disrupted. This unmasks the NusG KOW domain and leads to the recruitment of rho, which promptly results in transcription termination. The coupling between transcription and translation also increases the processivity of transcription and reduces backtracking.121 This in turn plays an important role in the maintenance of genome integrity, since backtracked complexes can lead to the formation of R-loops and, on collision with the replication fork, to double-stranded DNA breaks.123 The mechanisms linking RNAP backtracking and genome damage have recently been reviewed.124 Since NusG/Spt5 and S10 are highly conserved between bacteria and archaea, and furthermore considering that archaea are true prokaryotes and therefore transcription and translation occur in the same cellular compartment, it is more than likely that the two processes are coupled by similar molecular mechanisms.125

4.11. NusA and Antitermination

The mechanisms discussed so far are global and not involved in gene-specific expression. Some bacterial species have functionally specialized NusG paralogues, the best characterized of which is RfaH, which is discussed elsewhere in this issue of Chemical Reviews (Artsimovitch et al.).126,127 Whole genome occupancy studies in bacteria have revealed that NusG is also associated with RNAP transcribing nontranslated genes including rRNA operons.62 Considering the role of the coupling between transcription and translation in preventing rho-dependent termination, the absence of ribosomes on transcription elongation complexes renders them vulnerable to premature termination by rho. In order to eliminate the threat of premature termination on structural or otherwise noncoding RNA genes, the so-called antitermination (AT) complex assembles cotranscriptionally and protects RNAP against termination. Despite many years of study, the details of the molecular mechanism of antitermination remain opaque. A plethora of proteins are involved in the AT complex, including NusG, NusE, NusB, and NusA, together with conserved sequence motifs in the RNA (box A, B, and C).128 NusA plays an additional role in rRNA maturation; it facilitates the annealing of the two half-sites of the RNase III recognition sequence, the first event in rRNA precursor processing.129 All AT factors except the least important one (NusB) are conserved in archaeal genomes, which suggests that transcription regulation by antitermination could also operate in the archaeal domain. Archaeal NusA is a small protein that only encompasses two KH domains and lacks the RNAP-interacting N-terminal and S1 domains as well as the regulatory C-terminal domains, all of which are widely conserved in bacterial NusA factors (Figure 12).
The solution structure of the Aeropyrum pernix (Aep) NusA has been solved by NMR spectroscopy.130 Both KH domains have intact GXXG loops that are prototypical for KH domains131 and known to mediate interactions with single-stranded RNA, which is in good agreement with the observed RNA binding of the Aep NusA in the mid-nanomolar range130 (Figure 12, highlighted in red). However, the biologically relevant RNA sequence targets of NusA in archaea have not yet been identified; whether NusA in archaea interacts with RNAP, whether this binding modulates the elongation and termination properties of RNAP, and whether its main function is rRNA maturation all remain to be investigated.
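The GXXG loop mentioned above is a simple sequence pattern (glycine, any two residues, glycine), so candidate loops can be located in a protein sequence with a short script. The following is a minimal sketch only: the function name `find_gxxg` and the toy sequence are hypothetical and are not taken from the A. pernix NusA entry, and a sequence match by itself does not show that a loop is structurally intact or binds RNA.

```python
import re

# GXXG: glycine, any two residues, glycine (one-letter amino acid codes).
# A lookahead is used so that overlapping occurrences are also reported.
GXXG = re.compile(r"(?=(G[A-Z]{2}G))")


def find_gxxg(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, tetrapeptide) for every GXXG occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in GXXG.finditer(seq)]


# Hypothetical toy sequence for illustration, NOT a real NusA sequence.
toy_sequence = "MKIGRKGAVLLGGSGEI"
for position, motif in find_gxxg(toy_sequence):
    print(position, motif)
```

In practice the sequence would be read from a database record, and hits would be checked against the KH domain boundaries before any interpretation.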

Figure 12. Archaeal NusA. The structure of the putative archaeal elongation and antitermination factor NusA from A. pernix (pdb 2CXC). The protein encompasses two KH domains with the prototypical RNA-binding GXXG loops highlighted in red.130 The table gives a short overview of bacterial antitermination factors and their archaeal homologues.

5. SUMMARY: A PARADIGM SHIFT

Until recently the initiation phase of transcription was considered the main target and mechanism of regulating gene expression, and elongation was thought to be simple and unregulated, continuing smoothly until transcription terminated. However, in the past decade it has become apparent that elongation is anything but simple, or continuous. Rather, a structural understanding of the catalytic mechanism and the interaction network between RNAP, DNA template, and RNA transcript has revealed that it is a remarkably demanding task to transcribe genes with the high processivity required to faithfully execute the genetic program of any organism. Furthermore, it seems that transcription elongation factors predate initiation factors in evolutionary terms. In reality, elongation is interrupted frequently and both RNAP subunits (e.g., the stalk) and exogenous transcription factors (e.g., TFS and Spt4/5) have evolved to improve processivity and to overcome impediments to elongation. Experimental systems that make use of recombinant, in vitro reconstituted RNAPs have made a substantial contribution to the functional dissection of the mechanisms of transcription elongation, and in particular, archaeal model systems have been exploited to deepen the understanding of eukaryotic RNAPs that are not amenable to these approaches. Naturally, considerable care has to be taken when analyzing any in vitro experiments, not only in terms of functional quality assessment of recombinant materials and rigorous control experiments, but also by constantly seeking to compare the results generated in that way with in vivo studies. The field has mainly focused on characterizing the molecular mechanisms of transcription elongation in vitro using a limited set of promoters and DNA templates.

Unfortunately, many archaea are insensitive to commonly used antibiotics, but with the advent of metabolic selection in archaea, simple genetic manipulation is now possible, and we can test how deletions and mutations in transcription factors and RNAP subunits affect the viability and phenotypes of recombinant archaeal strains and how the mutations affect transcription performance in vivo.83,132 Another emerging and very powerful approach to study transcription elongation is whole genome occupancy and expression profiling (ChIP-seq and RNA-seq, respectively133). These global approaches will elucidate the pace of the transcription machinery across the archaeal genome and correlate the occupancy of RNAPs with the transcription output on a systems level. Only by combining the insights obtained by the high precision and rigorous control of recombinant systems with studies carried out in the context of the whole organism can a coherent and comprehensive understanding of the subject be achieved. No doubt an exciting future lies ahead for transcription in the archaea!

What have we learned?
(1) High transcription processivity is reliant on interactions between the RNAP and the DNA and RNA components of the nucleic acid scaffold of the elongation complex.
(2) RNAPs not only pause but also backtrack in a retrograde movement along the DNA template, which can lead to arrest and termination of transcription.
(3) Transcript cleavage factors play an important role during elongation because they reactivate backtracked complexes by complementing the active center of RNAP and stimulating its intrinsic Mg²⁺-dependent cleavage activity.
(4) Processivity factors promote elongation by closing the DNA binding channel and thereby preventing the dissociation of the elongation complex.
(5) According to the “elongation first hypothesis” the phylogenetic distribution of processivity factors reflects that transcription elongation was assisted by regulatory factors before transcription initiation.
(6) The molecular machines that carry out information processing in cells, RNAPs and ribosomes, are physically and functionally coupled not only because the product of the former is the template of the latter but also via transcription elongation factors.
(7) We have yet much to learn to appreciate the intricacies of the molecular mechanisms of transcription elongation.
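Points (4) and (6) can be made concrete with a small back-of-the-envelope calculation. The sketch below rests on a deliberately crude assumption that is not made in the text: that the elongation complex has a constant, independent probability of dissociating at every nucleotide, so the chance of completing a transcript of N nucleotides is (1 − p)^N. The per-nucleotide probabilities and the translation rate are hypothetical values chosen only for illustration, and the function names are likewise invented here; the only number taken from the review is the ratio of three nucleotides per amino acid.

```python
def completion_probability(p_dissociate_per_nt: float, length_nt: int) -> float:
    """Probability of transcribing length_nt nucleotides without dissociation,
    assuming a constant, independent dissociation probability per nucleotide."""
    if not 0.0 <= p_dissociate_per_nt <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return (1.0 - p_dissociate_per_nt) ** length_nt


def coupled_transcription_rate(translation_rate_aa_per_s: float) -> float:
    """Transcription rate (nt/s) that keeps pace with translation,
    using the ratio of 3 nucleotides per amino acid."""
    return 3.0 * translation_rate_aa_per_s


# Hypothetical dissociation probabilities, for illustration only.
scenarios = [("without factor (assumed)", 1e-3), ("with factor (assumed)", 1e-4)]
for label, p in scenarios:
    for length in (1_000, 10_000):
        prob = completion_probability(p, length)
        print(f"{label:>26} | {length:>6} nt | P(complete) = {prob:.3f}")

# 15 amino acids per second is an assumed example value, not a measurement.
print(coupled_transcription_rate(15.0), "nt/s")
```

Because the completion probability falls off exponentially with transcript length, even a modest reduction in the per-nucleotide dissociation probability matters most for long transcripts, which is in line with the argument that long polycistronic mRNAs create a need for processivity factors. Real dissociation is sequence- and pause-dependent, so this uniform model is only a first approximation.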

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected]. Tel: +44 20 7679 0147

Notes
The authors declare no competing financial interest.

Biography
Finn Werner was born in Germany as a member of the Danish minority. He received his scientific training at the University of Copenhagen in Denmark (B.Sc. in 1993 and Cand. Scient. in Molecular Biology in 1996) and at Imperial College London in the United Kingdom (Ph.D. in Biochemistry in 2002). He took up a lectureship at University College London in 2005 and is currently a Wellcome Trust Investigator and full professor of molecular biophysics at the Institute of Structural and Molecular Biology (ISMB). His RNAP laboratory applies a dedicated and interdisciplinary research programme to investigate the molecular mechanisms of transcription using biochemistry, structural and molecular biology, biophysics, systems, and computational biology approaches. The real workhorses of the lab are RNAPs from the archaeal hyperthermophiles M. jannaschii and S. solfataricus.

ACKNOWLEDGMENTS
Research in my laboratory is currently supported by a Wellcome Trust Investigator Award (WT096553MA) and a BBSRC grant (BB/H019332/1). I’d like to thank all members of the UCL ISMB RNAP laboratory for their courage and hard work. And thanks to Alan Cheung for preparing the structural model of the archaeal RNAP−DNA−Spt4/5 complex in Figure 10B.

REFERENCES
(1) Watson, J. D.; Crick, F. H. Nature 1953, 171, 964. (2) tenOever, B. R. Nat. Rev. Microbiol. 2013, 11, 169. (3) Tucker, B. J.; Breaker, R. R. Curr. Opin. Struct. Biol. 2005, 15, 342. (4) (a) Guerrier-Takada, C.; Gardiner, K.; Marsh, T.; Pace, N.; Altman, S. Cell 1983, 35, 849. (b) Muller, U. F. Cell. Mol. Life Sci. 2006, 63, 1278. (5) Egan, E. D.; Collins, K. RNA 2012, 18, 1747. (6) (a) Iyer, L. M.; Aravind, L. J. Struct. Biol. 2012, 179, 299. (b) Iyer, L. M.; Koonin, E. V.; Aravind, L. BMC Struct. Biol. 2003, 3, 1. (c) Werner, F.; Grohmann, D. Nat. Rev. Microbiol. 2011, 9, 85. (7) Cheetham, G. M.; Steitz, T. A. Curr. Opin. Struct. Biol. 2000, 10, 117. (8) Joyce, G. F.; Orgel, L. E. The RNA World; Cold Spring Harbor Laboratory Press: Plainview, NY, 1999. (9) Goldman, A. D.; Bernhard, T. M.; Dolzhenko, E.; Landweber, L. F. Nucleic Acids Res. 2013, 41, D1079. (10) Woese, C. R.; Kandler, O.; Wheelis, M. L. Proc. Natl. Acad. Sci. U. S. A. 1990, 87, 4576. (11) O’Malley, M. A.; Koonin, E. V. Biol. Direct 2011, 6, 32. (12) Jarrell, K. F.; Walters, A. D.; Bochiwal, C.; Borgia, J. M.; Dickinson, T.; Chong, J. P. Microbiology 2011, 157, 919. (13) Werner, F.; Weinzierl, R. O. Mol. Cell 2002, 10, 635. (14) (a) Browne, P. D.; Cadillo-Quiroz, H. Archaea 2013, 2013, 586369. (b) Maupin-Furlow, J. A.; Humbard, M. A.; Kirkland, P. A. Curr. Opin. Microbiol. 2012, 15, 351. (15) (a) Abbondanzieri, E. A.; Greenleaf, W. J.; Shaevitz, J. W.; Landick, R.; Block, S. M. Nature 2005, 438, 460. (b) Lane, W. J.; Darst, S. A. J. Mol. Biol. 2010, 395, 671. (c) Lane, W. J.; Darst, S. A. J. Mol. Biol. 2010, 395, 686.
(16) Minakhin, L.; Bhagat, S.; Brunning, A.; Campbell, E. A.; Darst, S. A.; Ebright, R. H.; Severinov, K. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 892. (17) (a) Ebright, R. H.; Busby, S. Curr. Opin. Genet. Dev. 1995, 5, 197. (b) Ebright, R. H. J. Mol. Biol. 2000, 304, 687. (18) Cramer, P.; Bushnell, D. A.; Kornberg, R. D. Science 2001, 292, 1863. (19) Werner, F.; Eloranta, J. J.; Weinzierl, R. O. Nucleic Acids Res. 2000, 28, 4299. (20) Mukhamedyarov, D.; Makarova, K. S.; Severinov, K.; Kuznedelov, K. BMC Mol. Biol. 2011, 12, 50. (21) Ishihama, A. Mol. Microbiol. 1992, 6, 3283. (22) Tan, Q.; Linask, K. L.; Ebright, R. H.; Woychik, N. A. Genes Dev. 2000, 14, 339. (23) Thomm, M.; Reich, C.; Grunberg, S.; Naji, S. Biochem. Soc. Trans. 2009, 37, 18. (24) Severinov, K.; Mustaev, A.; Kukarin, A.; Muzzin, O.; Bass, I.; Darst, S. A.; Goldfarb, A. J. Biol. Chem. 1996, 271, 27969. (25) (a) Kampfer, P. Antonie van Leeuwenhoek 2012, 101, 3. (b) Koonin, E. V.; Makarova, K. S.; Elkins, J. G. Biol. Direct 2007, 2, 38.


(26) (a) Korkhin, Y.; Unligil, U. M.; Littlefield, O.; Nelson, P. J.; Stuart, D. I.; Sigler, P. B.; Bell, S. D.; Abrescia, N. G. PLoS Biol. 2009, 7, e1000102. (b) Wojtas, M. N.; Mogni, M.; Millet, O.; Bell, S. D.; Abrescia, N. G. Nucleic Acids Res. 2012, 40, 9941. (27) (a) Doherty, G. P.; Fogg, M. J.; Wilkinson, A. J.; Lewis, P. J. Microbiology 2010, 156, 3532. (b) Rabatinova, A.; Sanderova, H.; Matejckova, J. J.; Korelusova, J.; Sojka, L.; Barvik, I.; Papouskova, V.; Sklenar, V.; Zidek, L.; Krasny, L. J. Bacteriol. 2013, 195, 2603. (28) Cramer, P.; Armache, K. J.; Baumli, S.; Benkert, S.; Brueckner, F.; Buchen, C.; Damsma, G. E.; Dengl, S.; Geiger, S. R.; Jasiak, A. J.; Jawhari, A.; Jennebach, S.; Kamenski, T.; Kettenberger, H.; Kuhn, C. D.; Lehmann, E.; Leike, K.; Sydow, J. F.; Vannini, A. Annu. Rev. Biophys. 2008, 37, 337. (29) (a) Vannini, A. Biochim. Biophys. Acta 2013, 1829, 258. (b) Vannini, A.; Cramer, P. Mol. Cell 2012, 45, 439. (30) Carter, R.; Drouin, G. Mol. Biol. Evol. 2009, 26, 2515. (31) (a) Cheng, B.; Price, D. H. Nucleic Acids Res. 2008, 36, e135. (b) Sikorski, T. W.; Buratowski, S. Curr. Opin. Cell Biol. 2009, 21, 344. (32) Ruprich-Robert, G.; Thuriaux, P. Nucleic Acids Res. 2010, 28, 4559. (33) Steitz, T. A. Nature 1998, 391, 231. (34) Grohmann, D.; Werner, F. Res. Microbiol. 2011, 162, 10. (35) (a) Grohmann, D.; Hirtreiter, A.; Werner, F. Biochem. J. 2009, 421, 339. (b) Grohmann, D.; Klose, D.; Klare, J. P.; Kay, C. W.; Steinhoff, H. J.; Werner, F. J. Am. Chem. Soc. 2010, 132, 5954. (36) Weinzierl, R. O. J. Chem. Rev. 2013, DOI: 10.1021/cr400148k. (37) (a) Feig, M.; Burton, Z. F. Biophys. J. 2010, 99, 2577. (b) Feig, M.; Burton, Z. F. Proteins 2010, 78, 434. (38) Fouqueau, T.; Zeller, M. E.; Cheung, A. C.; Cramer, P.; Thomm, M. Nucleic Acids Res. 2013, DOI: 10.1093/nar/gkt433. (39) Naji, S.; Bertero, M. G.; Spitalny, P.; Cramer, P.; Thomm, M. Nucleic Acids Res. 2008, 36, 676. 
(40) Mukhopadhyay, J.; Das, K.; Ismail, S.; Koppstein, D.; Jang, M.; Hudson, B.; Sarafianos, S.; Tuske, S.; Patel, J.; Jansen, R.; Irschik, H.; Arnold, E.; Ebright, R. H. Cell 2008, 135, 295. (41) Cheung, A. C.; Cramer, P. Cell 2012, 149, 1431. (42) Chakraborty, A.; Wang, D.; Ebright, Y. W.; Korlann, Y.; Kortkhonjia, E.; Kim, T.; Chowdhury, S.; Wigneshweraraj, S.; Irschik, H.; Jansen, R.; Nixon, B. T.; Knight, J.; Weiss, S.; Ebright, R. H. Science 2012, 337, 591. (43) Weixlbaumer, A.; Leon, K.; Landick, R.; Darst, S. A. Cell 2013, 152, 431. (44) Bushnell, D. A.; Kornberg, R. D. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 6969. (45) Weinzierl, R. O. Archaea 2011, 2011, 608385. (46) (a) Grohmann, D.; Werner, F. RNA Biol. 2010, 7, 310. (b) Hirtreiter, A.; Damsma, G. E.; Cheung, A. C.; Klose, D.; Grohmann, D.; Vojnic, E.; Martin, A. C.; Cramer, P.; Werner, F. Nucleic Acids Res. 2010, 38, 4040. (47) Tagami, S.; Sekine, S.; Kumarevel, T.; Hino, N.; Murayama, Y.; Kamegamori, S.; Yamamoto, M.; Sakamoto, K.; Yokoyama, S. Nature 2010, 468, 978. (48) (a) Parvin, J. D.; Sharp, P. A. Cell 1993, 73, 533. (b) Qureshi, S. A.; Bell, S. D.; Jackson, S. P. EMBO J. 1997, 16, 2927. (49) Saxena, A.; Ma, B.; Schramm, L.; Hernandez, N. Mol. Cell. Biol. 2005, 25, 9406. (50) Knutson, B. A.; Hahn, S. Science 2011, 333, 1637. (51) Treutlein, B.; Muschielok, A.; Andrecka, J.; Jawhari, A.; Buchen, C.; Kostrewa, D.; Hog, F.; Cramer, P.; Michaelis, J. Mol. Cell 2012, 46, 136. (52) (a) Kostrewa, D.; Zeller, M. E.; Armache, K. J.; Seizl, M.; Leike, K.; Thomm, M.; Cramer, P. Nature 2009, 462, 323. (b) Werner, F.; Weinzierl, R. O. Mol. Cell. Biol. 2005, 25, 8344. (c) Wiesler, S. C.; Weinzierl, R. O. Nucleic Acids Res. 2011, 39, 464. (53) Renfrow, M. B.; Naryshkin, N.; Lewis, L. M.; Chen, H. T.; Ebright, R. H.; Scott, R. A. J. Biol. Chem. 2004, 279, 2825. (54) (a) Grunberg, S.; Bartlett, M. S.; Naji, S.; Thomm, M. J. Biol. Chem. 2007, 282, 35482. (b) Naji, S.; Grunberg, S.; Thomm, M. J. Biol. Chem. 
2007, 282, 11047.

(55) Grohmann, D.; Nagy, J.; Chakraborty, A.; Klose, D.; Fielden, D.; Ebright, R. H.; Michaelis, J.; Werner, F. Mol. Cell 2011, 43, 263. (56) Kapanidis, A. N.; Margeat, E.; Ho, S. O.; Kortkhonjia, E.; Weiss, S.; Ebright, R. H. Science 2006, 314, 1144. (57) He, Y.; Fang, J.; Taatjes, D. J.; Nogales, E. Nature 2013, 495, 481. (58) Holstege, F. C.; Tantin, D.; Carey, M.; van der Vliet, P. C.; Timmers, H. T. EMBO J. 1995, 14, 810. (59) Werner, F. J. Mol. Biol. 2012, 417, 13. (60) Larochelle, S.; Amat, R.; Glover-Cutter, K.; Sanso, M.; Zhang, C.; Allen, J. J.; Shokat, K. M.; Bentley, D. L.; Fisher, R. P. Nat. Struct. Mol. Biol. 2012, 19, 1108. (61) Sevostyanova, A.; Svetlov, V.; Vassylyev, D. G.; Artsimovitch, I. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 865. (62) Mooney, R. A.; Davis, S. E.; Peters, J. M.; Rowland, J. L.; Ansari, A. Z.; Landick, R. Mol. Cell 2009, 33, 97. (63) Kettenberger, H.; Armache, K. J.; Cramer, P. Mol. Cell 2004, 16, 955. (64) Landick, R. Biochem. Soc. Trans. 2006, 34, 1062. (65) Palangat, M.; Hittinger, C. T.; Landick, R. J. Mol. Biol. 2004, 341, 429. (66) Sevostyanova, A.; Belogurov, G. A.; Mooney, R. A.; Landick, R.; Artsimovitch, I. Mol. Cell 2011, 43, 253. (67) (a) Grohmann, D.; Klose, D.; Fielden, D.; Werner, F. Biochem. Soc. Trans. 2011, 39, 122. (b) Grohmann, D.; Werner, F.; Tinnefeld, P. Curr. Opin. Chem. Biol. 2013, 17, 691. (68) (a) Larson, M. H.; Zhou, J.; Kaplan, C. D.; Palangat, M.; Kornberg, R. D.; Landick, R.; Block, S. M. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 6555. (b) Herbert, K. M.; Zhou, J.; Mooney, R. A.; Porta, A. L.; Landick, R.; Block, S. M. J. Mol. Biol. 2010, 399, 17. (69) Zhou, J.; Schweikhard, V.; Block, S. M. Biochim. Biophys. Acta 2013, 1829, 29. (70) Weinzierl, R. O. BMC Biol. 2010, 8, 134. (71) (a) Cheong, J. H.; Yi, M.; Lin, Y.; Murakami, S. EMBO J. 1995, 14, 143. (b) Kim, T. K.; Ebright, R. H.; Reinberg, D. Science 2000, 288, 1418. (72) (a) Bartlett, M. S.; Thomm, M.; Geiduschek, E. P. J. Biol. Chem. 
2004, 279, 5894. (b) Grunberg, S.; Reich, C.; Zeller, M. E.; Bartlett, M. S.; Thomm, M. Nucleic Acids Res. 2010, 38, 1950. (73) Bartlett, M. S.; Thomm, M.; Geiduschek, E. P. Nat. Struct. Biol. 2000, 7, 782. (74) Wojtas, M. N.; Abrescia, N. G. Biochem. Soc. Trans. 2013, 41, 356. (75) Ouhammouch, M.; Werner, F.; Weinzierl, R. O.; Geiduschek, E. P. J. Biol. Chem. 2004, 279, 51719. (76) Todone, F.; Brick, P.; Werner, F.; Weinzierl, R. O.; Onesti, S. Mol. Cell 2001, 8, 1137. (77) Ujvari, A.; Luse, D. S. Nat. Struct. Mol. Biol. 2006, 13, 49. (78) Hirtreiter, A.; Grohmann, D.; Werner, F. Nucleic Acids Res. 2010, 38, 585. (79) Tennyson, C. N.; Klamut, H. J.; Worton, R. G. Nat. Genet. 1995, 9, 184. (80) Runner, V. M.; Podolny, V.; Buratowski, S. Mol. Cell. Biol. 2008, 28, 1883. (81) Miyao, T.; Barnett, J. D.; Woychik, N. A. J. Biol. Chem. 2001, 276, 46408. (82) Orlicky, S. M.; Tran, P. T.; Sayre, M. H.; Edwards, A. M. J. Biol. Chem. 2001, 276, 10097. (83) Hirata, A.; Kanai, T.; Santangelo, T. J.; Tajiri, M.; Manabe, K.; Reeve, J. N.; Imanaka, T.; Murakami, K. S. Mol. Microbiol. 2008, 70, 623. (84) Andrecka, J.; Treutlein, B.; Arcusa, M. A.; Muschielok, A.; Lewis, R.; Cheung, A. C.; Cramer, P.; Michaelis, J. Nucleic Acids Res. 2009, 37, 5803. (85) Xiong, Y.; Burton, Z. F. J. Biol. Chem. 2007, 282, 36582. (86) Wilson, K. S.; Conant, C. R.; von Hippel, P. H. J. Mol. Biol. 1999, 289, 1179. (87) Zhang, Y.; Feng, Y.; Chatterjee, S.; Tuske, S.; Ho, M. X.; Arnold, E.; Ebright, R. H. Science 2012, 338, 1076. R



