The Writers, Readers, and Functions of the RNA ... - ACS Publications

Jul 10, 2013
Review pubs.acs.org/CR

The Writers, Readers, and Functions of the RNA Polymerase II C‑Terminal Domain Code

Célia Jeronimo,† Alain R. Bataille,†,§ and François Robert*,†,‡

† Institut de recherches cliniques de Montréal, Montréal, Québec, Canada H2W 1R7
‡ Département de Médecine, Faculté de Médecine, Université de Montréal, Montréal, Québec, Canada H3T 1J4



CONTENTS
1. Introduction
2. Writing the CTD Code
2.1. Approaches To Characterize CTD Modifications
2.2. The Current CTD Phosphorylation Cycle Model
2.2.1. Overarching Principles
2.2.2. Walking through the Cycle
2.2.3. Unreachable Complexity
2.2.4. Mammalian versus Yeast Cells
2.2.5. Exceptions to the Generic CTD Cycle: Gene Specific CTD Phosphorylation Patterns
2.3. Adding Phosphorylation Marks: CTD Kinases
2.3.1. TFIIH Places Both P-Ser5 and P-Ser7 Marks Prior to Initiation
2.3.2. Modulating the Initial Marks Early during Elongation
2.3.3. Ctk1 (CTDK1) and Bur1 (P-TEFb) Share the Duty of Placing P-Ser2 during Elongation
2.3.4. Phosphorylation of Tyr1 Is Performed by a Nontraditional CTD Kinase
2.3.5. Phosphorylation of Thr4 by Plk3
2.3.6. CTD Kinases with Ambiguous Specificities and Poorly Understood Roles
2.3.7. Ordered Recruitment of CTD Kinases
2.3.8. Interplays between the CTD Kinases: Timing, Priming, and Cross-Phosphorylation
2.4. Removing Phosphorylation Marks: CTD Phosphatases
2.4.1. Dephosphorylation of P-Tyr1: The Chase Is On
2.4.2. Dephosphorylation of P-Ser5 and P-Ser7: Ssu72, a Phosphatase “A-cis-ted” by Ess1
2.4.3. Partial Dephosphorylation of P-Ser2: Fcp1 Winning over Ser2 Kinases
2.4.4. Complete Dephosphorylation of P-Ser2: Fcp1 with a Little Help from Its Friend
2.4.5. Other CTD Phosphatases
2.5. Beyond Phosphorylation
2.5.1. CTD O-GlcNAcylation
2.5.2. Modifications of Nonconsensus CTD Residues: Arginine Methylation and Beyond
3. Reading the CTD Code
3.1. Approaches To Identify CTD-Binding Proteins and Characterize Their Interactions with the CTD
3.1.1. CID Domain
3.1.2. GTase NT Domain
3.1.3. WW Domain
3.1.4. FF Domain
3.1.5. RRM Domain
3.1.6. SRI Domain
3.1.7. Tandem SH2 Domain
4. Functions of the CTD Code
4.1. Does the CTD Regulate Transcription?
4.1.1. Initiation
4.1.2. Pausing
4.1.3. Elongation
4.1.4. Termination
4.2. CTD Coordinates RNA Processing, Transcriptional Termination, and mRNA Export
4.2.1. mRNA Capping
4.2.2. Splicing
4.2.3. 3′ End Processing, Termination, and Export
4.3. The CTD Allows for a Feedback on Chromatin
4.3.1. Recruiting KATs and Chromatin Remodelers To Open Up Chromatin
4.3.2. Recruiting KDACs To Close Up Chromatin
4.3.3. Recruiting KMTs
4.3.4. Recruiting Histone Chaperones
4.3.5. A Complex Network of Activities Ensuring Genomic Fidelity
4.3.6. Conservation in Higher Eukaryotes
5. Perspectives and Conclusion
Author Information
Corresponding Author
Present Address
Notes
Biographies
Acknowledgments
References

Special Issue: 2013 Gene Expression

Received: March 3, 2013

© XXXX American Chemical Society

dx.doi.org/10.1021/cr4001397 | Chem. Rev. XXXX, XXX, XXX−XXX

1. INTRODUCTION

Transcription in the nucleus of animal cells is performed by three distinct RNA polymerases (RNAPs), namely RNAPI, RNAPII, and RNAPIII. Among them, RNAPII is responsible for the transcription of all protein-coding genes, as well as several noncoding RNAs, including small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs). It therefore has to deal with a huge number of different substrates (genes) compared to RNAPI and RNAPIII, which synthesize a limited number of different transcripts (mostly rRNAs and tRNAs). Accordingly, transcription by RNAPII is subject to a plethora of regulatory cues. Another distinctive feature of RNAPII is the presence of a peculiar C-terminal extension on its largest subunit (Rpb1), referred to as the C-terminal domain (CTD).1 This unstructured domain, not found on any other RNAP, is made of tandem repetitions of the heptapeptide Y1S2P3T4S5P6S7 with some degenerate repeats (Figure 1).2 The number of CTD repeats increases with organism complexity, ranging from 26 (yeast), to 42 (Drosophila), to 52 (mammals). Despite carrying very little information on its own (due to its poor sequence content), the CTD is nevertheless highly conserved and essential for viability in all organisms.3−7 This importance stems from the fact that each amino acid in the CTD heptapeptide can be posttranslationally modified.8 The serines at positions 2 (Ser2), 5 (Ser5), and 7 (Ser7), as well as the tyrosine at position 1 (Tyr1) and the threonine at position 4 (Thr4), can all be phosphorylated (Figure 1). In addition, the serines and the threonine can also be glycosylated. This combinatorial modification scheme is made even more complex by the fact that both prolines (Pro3 and Pro6) can be found in either the cis or the trans conformation. Finally, modifications can occur on nonconsensus residues. For example, an arginine replacing the serine at position 7 of repeat number 31 in the human CTD has been shown to be methylated, and six lysines in the distal (C-terminal) section of the murine CTD can be ubiquitylated.9 In theory, this combination of repetitions and modifications would allow the CTD to adopt an almost endless number of different states. CTD modifications occur dynamically as RNAPII travels along genes, and it is now very well established that this allows RNAPII to dynamically recruit regulatory factors to specific regions during transcription.1 Via this spatiotemporal recruitment of factors, the CTD allows for the coupling of transcription by RNAPII to RNA processing and chromatin reorganization. The expression “CTD code” was coined by Steve Buratowski to describe the idea that different combinations of CTD modifications create as many different specific interaction surfaces for various proteins to interact with the CTD.10 In this article, we will review the literature on the CTD code, with an emphasis on the recent genome-wide studies that have shed a great deal of light on how this complex and dynamic CTD modification cycle is established along genes. To differentiate this review from other excellent descriptive reviews in the CTD field, we have included speculative aspects and emphasized points of dispute. This is meant to be constructive rather than provocative. For other aspects of the CTD, we refer to the several excellent recent reviews on the subject,1,8,11,12 including one by Jeff Corden that, among other things, highlights evolutionary and structural aspects of the CTD.13

Figure 1. The RNAPII CTD, which is made of repetitions of the consensus YSPTSPS peptide. The sequences of the CTD from Saccharomyces cerevisiae and Homo sapiens are shown. Each repeat is numbered starting from the proximal (N-terminal) to the most distal (C-terminal). Repeats deviating from the consensus sequence are numbered in red, and nonconsensus residues are highlighted in yellow. The consensus sequence is displayed with its known modifications. Phosphorylations of Tyr1, Ser2, Thr4, Ser5, and Ser7 are represented in red, blue, orange, green, and purple, respectively, a convention that is used throughout the figures of this article. Isomerization of Pro3 and Pro6 is illustrated by a double-headed arrow, and O-GlcNAcylation is represented by a “G”.

Figure 2. Schematic representation of the different CTD phosphorylation marks along a gene. The various phosphorylation marks are color labeled as per Figure 1. Note that two different P-Ser2 signals are depicted: they are labeled as H5 and 3E10, after the names of the antibodies used to detect them. They represent different variants of P-Ser2 that are dephosphorylated with different dynamics. This figure is reminiscent of the way average genome-wide ChIP signal is often represented. The lines here, however, are meant to represent an interpretation of the data, rather than the data itself. This explains why P-Ser5 and P-Ser7 abruptly rise to 100% right at the transcription start site (TSS) rather than peaking a bit further downstream as observed in raw ChIP data. Similarly, P-Ser2 (H5), P-Ser5, and P-Ser7 are depicted as decreasing from the polyA (pA) signal while the raw ChIP signal starts to decline about 200 base pairs (bp) upstream of that point. This is based on the fact that ChIP experiments have a limited resolution so that a dephosphorylation triggered at the pA is expected to translate into a visible decrease about 200 bp upstream. The same reasoning applies to phosphorylation of Ser5 and Ser7 at the TSS. In that last case, the interpretation is supported by the fact that in vitro experiments have clearly demonstrated that phosphorylation occurs before initiation. See the main text for details.
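The resolution argument made in the Figure 2 caption can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' analysis: it assumes that limited ChIP resolution behaves like a centered moving average over a 400 bp window (a hypothetical value chosen so that the blur reaches 200 bp on each side), and the gene coordinates and function names are invented for illustration.

```python
def true_mark_level(length_bp, pa_position):
    """Idealized mark: fully present up to the pA, removed immediately after it."""
    return [1.0 if pos < pa_position else 0.0 for pos in range(length_bp)]


def chip_like_signal(levels, window_bp):
    """Blur a per-bp profile with a centered moving average (assumed ChIP resolution)."""
    half = window_bp // 2
    blurred = []
    for pos in range(len(levels)):
        lo = max(0, pos - half)
        hi = min(len(levels), pos + half + 1)
        blurred.append(sum(levels[lo:hi]) / (hi - lo))
    return blurred


def first_decline(signal):
    """Return the first position where the signal falls below its starting level."""
    for pos, value in enumerate(signal):
        if value < signal[0]:
            return pos
    return None


PA_POSITION = 1500  # hypothetical pA coordinate, in bp from the TSS
WINDOW_BP = 400     # assumed smoothing window, about 200 bp on each side

observed = chip_like_signal(true_mark_level(2000, PA_POSITION), WINDOW_BP)
print("pA position:", PA_POSITION)
print("observed decline starts at:", first_decline(observed))
```

Under these assumptions, the blurred signal begins to fall 200 bp upstream of the pA even though the underlying mark is removed exactly at the pA, which is the reasoning used to draw the idealized lines of Figure 2.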

2. WRITING THE CTD CODE

2.1. Approaches To Characterize CTD Modifications

As soon as it was discovered in 1985, the CTD was recognized to be phosphorylated.14,15 The hyperphosphorylated form of the largest RNAPII subunit Rpb1, easily recognized by its slower mobility on SDS-PAGE, was named IIo while the nonphosphorylated and fast migrating form was named IIa. Early on, elegant experiments using in vitro reconstituted transcription systems showed that RNAPII is recruited to promoters in its nonphosphorylated form, while elongation is carried out by RNAPII carrying a phosphorylated CTD.16−19 It was not until the development of monoclonal antibodies against specific phospho-epitopes, notably H5 and H14, recognizing P-Ser2 and P-Ser5 respectively,20,21 and the use of chromatin immunoprecipitation (ChIP),22 however, that the concept of a dynamic CTD phosphorylation cycle could be envisioned. In a milestone paper, the Buratowski group showed for the first time, using the H14 antibody, that P-Ser5 is generated early during the transcription cycle, while P-Ser2, as detected by the H5 antibody, accumulates further downstream, concomitantly with the loss of P-Ser5.22 This simple model ruled the field for almost a decade, until the development of new antibodies by the Eick group (notably 3E8, 3E10, 4E12)23 and the use of genome-wide ChIP approaches allowed for a reinvestigation of the CTD phosphorylation cycle dogma.24−27 These recent studies also allowed for a better understanding of how the CTD cycle is established. While a number of studies using biochemical and genetic approaches had allowed for the identification of several CTD kinases and phosphatases, how these enzymes coordinate the establishment of a precise CTD phosphorylation cycle along genes is only starting to be understood. Below, we will summarize the current view of the CTD phosphorylation cycle before discussing the role of the various CTD-modifying enzymes.

2.2. The Current CTD Phosphorylation Cycle Model

2.2.1. Overarching Principles. The current view of the CTD phosphorylation cycle is depicted in Figure 2. The cycle described here is based on data obtained from several laboratories using the budding yeast Saccharomyces cerevisiae, but other eukaryotes will be discussed when relevant. Essentially, the key feature described a decade ago still holds true today: P-Ser5 accumulates earlier than P-Ser2. However, because the current data covers the whole genome, one can now build a much more quantitative model. In addition, the current model includes data on P-Ser7, P-Tyr1, and P-Thr4, marks for which no antibodies were available until very recently. One outstanding conclusion drawn from these studies is that the CTD phosphorylation profiles do not scale with gene length.24,25,27 Indeed, the marks are deposited and removed as a function of the distance from the transcription start site (TSS) and transcription termination site (TTS), respectively. While this may seem like a trivial statement at first, it actually has far-reaching consequences, especially for small genes. For instance, the generally accepted statement that P-Ser5 and P-Ser2 are more abundant in the 5′ end and 3′ end of genes, respectively, has to be reconsidered. Indeed, this does not hold for short genes. Because their 3′ end is too close to their 5′ end, short genes (about 500 bp or less) have P-Ser5 all across and virtually no P-Ser2. We shall emphasize, however, that short and long genes obey the same rules, but because P-Ser5 is partially removed around 500 bp after the TSS, and P-Ser2 slowly accumulates over the first kilobase, a 500 bp long gene will necessarily have high P-Ser5 throughout the transcribed region and very little P-Ser2 at its 3′ end. One should therefore avoid thinking in relative terms and rather think about the CTD phosphorylation cycle as a “ruler” counting in base pairs. Another overarching conclusion from recent genome-wide studies in budding yeast is that the CTD cycle is a universal phenomenon obeying the same rules across virtually all genes.25,27 While there were originally disputes about this,24−26 it has now become clear that discrepancies mostly arose from differences in data analysis as well as from the fact that the yeast genome is very compact (the average intergenic region is 400 bp), leading to “bleeding-over” effects in ChIP experiments27 (also reviewed in ref 11). Whereas the CTD is modified according to the same rules for all genes in budding yeast, compelling evidence for exceptions has been reported in Schizosaccharomyces pombe and in mammalian cells.28−31 As discussed further below, more complex eukaryotes have evolved mechanisms to rewire the CTD cycle at specific genes.
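The “ruler” idea can be made concrete with a small numerical sketch. Everything quantitative here is an assumption made for illustration: the text gives only approximate landmarks (a partial drop of P-Ser5 to roughly half by about 500 bp, and accumulation of P-Ser2 over the first kilobase), so the piecewise-linear shapes, the function names, and the gene lengths are hypothetical, and termination-coupled dephosphorylation near the pA is ignored.

```python
def p_ser5_level(distance_from_tss_bp):
    """Relative P-Ser5: 1.0 at the TSS, declining to a 0.5 plateau by 500 bp (assumed linear)."""
    fraction = min(max(distance_from_tss_bp, 0), 500) / 500
    return 1.0 - 0.5 * fraction


def p_ser2_level(distance_from_tss_bp):
    """Relative P-Ser2: 0.0 at the TSS, rising to a 1.0 plateau by 1000 bp (assumed linear)."""
    return min(max(distance_from_tss_bp, 0), 1000) / 1000


def ruler_profile(gene_length_bp, step_bp=250):
    """Sample both marks at absolute distances from the TSS (the "ruler" view)."""
    return [
        (d, p_ser5_level(d), p_ser2_level(d))
        for d in range(0, gene_length_bp + 1, step_bp)
    ]


def scaled_p_ser2_level(position_bp, gene_length_bp):
    """Counter-model for comparison: P-Ser2 scales with relative position in the gene."""
    return position_bp / gene_length_bp


short_gene = ruler_profile(500)   # hypothetical 500 bp gene
long_gene = ruler_profile(3000)   # hypothetical 3 kb gene

for d, ser5, ser2 in long_gene:
    print(f"{d:>5} bp  P-Ser5={ser5:.2f}  P-Ser2={ser2:.2f}")
```

In the ruler view, the 3′ end of the short gene carries exactly the levels found at position 500 of the long gene, so it never reaches the P-Ser2 plateau. A length-scaled model would instead assign the plateau level to the 3′ end of every gene, which is the relative way of thinking the text argues against.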


Below, we describe the CTD phosphorylation patterns as they sequentially occur as the polymerase travels along a gene. We consider here a gene that is at least one kilobase long since such a gene fully experiences all known CTD phosphorylation/dephosphorylation events.

2.2.2. Walking through the Cycle. A plethora of in vitro evidence suggests that RNAPII is recruited to promoters in its nonphosphorylated form. This includes the observation that neither preinitiation complex (PIC) assembly nor transcription initiation can be achieved in vitro using hyperphosphorylated RNAPII preparations.18,19,32,33 Because phosphorylation of the CTD at Ser5 and Ser7 occurs very early during the transcription cycle, however, in vivo assays such as ChIP do not allow teasing apart the recruitment of RNAPII from its phosphorylation state at these sites. The recruitment of RNAPII to promoters is therefore modeled as being nonphosphorylated, but this part of the CTD phosphorylation cycle is solely based on in vitro data. In vitro data suggests that the CTD is phosphorylated, at least partially, before the formation of the first phosphodiester bond.19 This is consistent with the fact that total RNAPII ChIP profiles are indistinguishable from those of P-Ser5 and P-Ser7 at promoters, as mentioned above. Because evidence suggests that phosphorylation of Ser5 is a prerequisite for the phosphorylation of Ser7,34 we propose that P-Ser5 may precede P-Ser7 and that phosphorylation of Ser7 must always occur adjacent to P-Ser5. Phosphorylation of Ser5, however, can occur next to either phosphorylated or nonphosphorylated Ser7. Transcription is therefore initiated with an RNAPII phosphorylated on Ser5 and Ser7 (Figure 2). While the initial ChIP data using the H14 antibody suggested that P-Ser5 is mostly removed during the first few hundred base pairs of transcription,22 the most recent data (thanks to the availability of the 3E8 antibody) clearly shows that while P-Ser5 levels indeed drop within the first 500 bp after the TSS, nearly half of the signal remains detectable across the entire transcribed unit27 (Figure 2). This is consistent with the fact that two different P-Ser5 phosphatases, found in the 5′ end and 3′ end of genes, have been characterized. The activity of one of these enzymes, however, has been put into question, so it remains possible that the initial decrease in P-Ser5 observed by ChIP is due to the masking of the epitope rather than a bona fide decrease in phosphorylation (see section 2.3.2). Both P-Tyr1 and P-Ser2 gradually accumulate over the first kilobase of the transcription unit, after which the levels reach a maximum25,27,35 (Figure 2). At that point, the levels of all phosphorylation marks are stable until the end of the gene is reached, and subsequent events occur as a function of the distance from the polyadenylation site (pA). The signal that triggers the dephosphorylation of the CTD has never been formally identified, but it most likely involves the polyadenylation signal.36 The first CTD phosphorylation marks appear to decline about 200 bp before RNAPII disengages from the DNA when assayed by ChIP.27 Given the resolution of this assay, the observed pattern is consistent with dephosphorylation being triggered by the pA (as depicted in Figure 2). P-Ser2 dephosphorylation is a complex process. Indeed, when assayed using the H5 antibody (or an antibody from Bethyl Laboratories (BL)), P-Ser2 is among the first phospho-CTD marks to decline.27,37 When assayed using the 3E10 antibody, however, the signal actually extends as far downstream as does total RNAPII, suggesting that P-Ser2 is dephosphorylated only after termination27 (Figure 2). Because the H5 and 3E10 antibodies do not recognize exactly the same epitope, we proposed that functionally distinct variants of P-Ser2 are removed by distinct mechanisms with differing timing;27 see section 2.4.4 for more details. Along with the partial dephosphorylation of P-Ser2 (H5 and BL antibodies), both Ser5 and Ser7 are dephosphorylated with similar kinetics,27 preceded shortly by the dephosphorylation of Tyr1 (ref 35; Figure 2). In mammalian cells, the CTD is also phosphorylated on Thr4 concomitantly with the dephosphorylation of other marks after the passage over the pA.38 Because some level of P-Thr4 can also be detected along the gene, and since the increase of P-Thr4 levels after the pA is accompanied by a similar increase in P-Ser2 (3E10) levels, one cannot exclude the possibility that P-Thr4, like P-Ser2, accumulates over the body of the gene but is masked until the polymerase reaches the pA. This seems likely given that recognition by the anti-P-Thr4 antibody is blocked by the presence of P-Ser2 or P-Ser5.38 Nevertheless, RNAPII leaves the chromatin template a few hundred base pairs after the pA (a few kilobases in mammalian cells) with a CTD partially phosphorylated on Ser2 (the 3E10 variant) and on Thr4 (in mammalian cells) (Figure 2). These two marks therefore need to be dephosphorylated off-template prior to the recycling of RNAPII to the promoter.

2.2.3. Unreachable Complexity. Both the phosphorylation and dephosphorylation of the CTD therefore occur in a highly orchestrated manner, creating a very dynamic interaction surface that is constantly remodeled as the polymerase moves along a gene. The model described above shows that one needs to consider multiple repeats since most phospho marks cannot be modeled using a simple binary (phosphorylated vs nonphosphorylated) model. Indeed, the fact that different marks (notably P-Tyr1, P-Ser2, and P-Ser5) are detectable at various levels along genes implies that different repeats are differentially marked at any given position along a gene. Although P-Ser7 and P-Thr4 have patterns that would be consistent with a binary model, it appears most likely that, even for these marks, both the phosphorylated and nonphosphorylated states coexist on different repeats at any given position along the gene. Attempting to model the CTD phosphorylation cycle in more detail is currently out of reach since no tools exist that allow assessing the position of a mark along the CTD (that is, on which repeat it occurs) or the positional relationship between two marks along the CTD. For example, when considering the codetection of P-Ser2 and P-Ser5, one has to envision several possibilities: both marks present on distantly separated repeats (YS2PTSPS and YSPTS5PS epitopes), both marks present on the same repeat (YS2PTS5PS epitope), or both marks present on consecutive repeats (YSPTS5PSYS2PTSPS and YS2PTSPSYSPTS5PS epitopes). When considering all five possible phosphorylation marks and adding the possibility of having prolines in two different conformations, the number of possibilities is almost infinite (see ref 39). Although it is impossible at this point to determine to what extent the cell exploits this potential complexity, several examples suggest that it does so at least to a certain extent (see section 3). New antibodies or the development of alternative technologies, such as the use of mass spectrometry (MS), are required to dissect this level of complexity of the CTD cycle.

2.2.4. Mammalian versus Yeast Cells. Although most of the effort to dissect the CTD phosphorylation cycle was made in S. cerevisiae, genome-wide ChIP profiles are available for some of the marks in other organisms.28,38,40 While some of these profiles may appear to be drastically different between organisms at first glance, a more careful analysis of this data suggests otherwise. Indeed, the most apparent differences between organisms stem from the fact that RNAPII experiences strong promoter-proximal pausing in some organisms (Drosophila and human) while this is virtually absent in others (S. cerevisiae and S. pombe) (reviewed in ref 41). Because RNAPII density as assayed by ChIP is rather uniform along a gene in yeast, interpretation of P-marks is rather straightforward. In organisms such as Drosophila and humans, however, RNAPII is not easily detectable downstream from the transcription pause site, even at the most highly transcribed genes. This needs to be taken into account when interpreting the CTD phosphorylation profiles. P-Ser5, for example, is barely detectable beyond the pause site in mammalian cells, but this mirrors total RNAPII.38 This data therefore does not argue that Ser5 is completely dephosphorylated soon after release from pausing, but is consistent with yeast data showing that, despite P-Ser5 levels being somewhat decreased early after initiation, a significant level persists until RNAPII reaches the end of the gene. As in mammalian cells, RNAPII occupancy is not uniform in S. pombe. Here, however, the maximal occupancy is reached at the 3′ end,28,42 suggesting that a significant pause occurs during cleavage/polyadenylation/termination in that organism. A notable difference between organisms is that Thr4 seems not to be phosphorylated in S. cerevisiae. We would like to mention here that while P-Thr4 was originally claimed to be present in S. cerevisiae,38 our own ChIP data, as well as that shown in supplementary data in ref 35, strongly suggests that the weak signal observed by Western blot in a previous publication38 may be due to residual affinity of the anti-P-Thr4 antibody toward nonphosphorylated CTD repeats. It remains possible that budding yeast carries low levels of P-Thr4, but if that were the case, its distribution over genes would mirror that of total RNAPII and would therefore be strikingly different from that observed in mammalian cells. Budding yeast seems to be an exception here, however, since all the other organisms tested, including fission yeast, clearly harbor this mark.38

2.2.5. Exceptions to the Generic CTD Cycle: Gene Specific CTD Phosphorylation Patterns. Unlike in S. cerevisiae, where virtually no gene seems to escape the general CTD cycle described above, exceptional genes with a CTD phosphorylation cycle deviating from the canonical profile can be found in other organisms. In fission yeast, a few hundred genes have been shown to accumulate P-Ser2 earlier (more 5′) than typical genes.28 This is correlated with the early recruitment of the Ser2 kinase Lsk1 (the Ctk1 ortholog) to the 5′UTR. Intriguingly, while P-Ser2 is dispensable for transcription at the vast majority of genes, it appears to be required at these genes. These genes include STE11, a key regulator of sexual differentiation in fission yeast, resulting in lsk1 or S2A mutants being sterile.28,43 Interestingly, these genes tend to carry long 5′UTRs that, at least in the case of STE11, are important for the transcriptional function of Lsk1-mediated CTD phosphorylation. The interpretation of these experiments is further complicated by the fact that an S7A mutation can partially suppress the S2A mating phenotype.44 Schwer and Shuman also showed that the CTD is hyperphosphorylated on Ser7 in the S2A mutant, opening the possibility that the S2A sterile phenotype is caused by Ser7 hyperphosphorylation rather than, or in addition to, the absence of Ser2 phosphorylation.44 Regardless of the mechanism underlying the sterile phenotype, these experiments clearly demonstrate that S. pombe utilizes phosphorylation of Ser2 in a noncanonical manner in order to regulate transcription rather than RNA processing at certain genes. Recent studies in S. pombe also showed that Ser2 phosphorylation plays a role in the regulation of specific sets of genes upon different stimuli,45,46 but it has not yet been established whether this involves atypical P-Ser2 deposition or whether the role of P-Ser2 is transcriptional or rather linked to mRNA processing. Another example of a noncanonical CTD phosphorylation pattern was reported for a set of primary response genes in macrophages.31 Indeed, these genes are transcribed in the absence of Ser2 phosphorylation in noninduced conditions. These transcripts, however, remain unspliced and are quickly degraded. Upon macrophage stimulation, RNAPII adopts the canonical CTD phosphorylation pattern, leading to proper splicing and expression of the primary response genes. This data illustrates that transcription can occur without P-Ser2 and clearly demonstrates the importance of CTD phosphorylation in splicing, as discussed in section 4.2.2. More recently, Diamant et al. have identified genes that remain hypophosphorylated even under inducing conditions.29 Upon stimulation by TNFα, NF-κB stimulates the recruitment of DSIF (DRB sensitivity inducing factor) to some of its target genes, including A20 and IκBα. These genes are then fully transcribed in the absence of Ser2 phosphorylation, perhaps because NF-κB activation also leads to the release of P-TEFb. Contrary to primary response genes in noninduced conditions,31 A20 and IκBα transcripts are properly processed and translated. While CTD phosphorylation plays a key role in mRNA processing at most genes, as will be discussed in section 4.2.3, this study shows that some CTD phosphorylation marks may be dispensable for pre-mRNA processing at some genes. In the case of A20 and IκBα, RNA processing relies heavily on DSIF.

2.3. Adding Phosphorylation Marks: CTD Kinases

Table 1. Known CTD Kinases

S. cerevisiae | S. pombe | D. melanogaster | H. sapiens | protein complex | specificity (a)
Kin28 | Mcs6 | Cdk7 | Cdk7 | TFIIH | P-Ser5, P-Ser7
Ctk1 | Lsk1 | Cdk12 | Cdk12 | CTDK1 | P-Ser2, P-Ser5, P-Ser7
Bur1 | Cdk9 | Cdk9 | Cdk9 | P-TEFb | P-Ser2, P-Ser5, P-Ser7
Srb10 | Cdk8 | Cdk8 | Cdk8 | Mediator | P-Ser2, P-Ser5
− | − | − | Plk3 | − | P-Thr4
− | − | − | c-Abl | − | P-Tyr1
Cdc28 | Cdc2 | Cdc2 | Cdc2/Cdk1 | − | P-Ser5
− | − | − | Brd4 | − | P-Ser2
Cka2 | Cka1 | CK2alpha | CK2alpha1 | CK2 | P-Ser13 (b)
− | − | dERK/rolled | ERK2/MAPK1 | − | P-Ser5

(a) Phosphorylations for which the in vivo relevance has been established are shown in bold.
(b) Refers to the serine at position 13 of the last (52nd) CTD repeat on the human CTD.
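As a reading aid, Table 1 can be transcribed into a small lookup structure so that questions such as "which kinases are listed for a given mark in a given organism" can be answered directly. This is only a sketch: the class name, field names, and helper function are hypothetical, `None` stands for a "−" entry, and the bold highlighting that marks in vivo relevance in the original table is not encoded.

```python
from typing import List, NamedTuple, Optional, Tuple


class CtdKinase(NamedTuple):
    """One row of Table 1 (field names are illustrative, not from the source)."""
    s_cerevisiae: Optional[str]
    s_pombe: Optional[str]
    d_melanogaster: Optional[str]
    h_sapiens: Optional[str]
    protein_complex: Optional[str]
    specificity: Tuple[str, ...]


# Rows transcribed from Table 1, in the order given there.
TABLE_1 = [
    CtdKinase("Kin28", "Mcs6", "Cdk7", "Cdk7", "TFIIH", ("P-Ser5", "P-Ser7")),
    CtdKinase("Ctk1", "Lsk1", "Cdk12", "Cdk12", "CTDK1", ("P-Ser2", "P-Ser5", "P-Ser7")),
    CtdKinase("Bur1", "Cdk9", "Cdk9", "Cdk9", "P-TEFb", ("P-Ser2", "P-Ser5", "P-Ser7")),
    CtdKinase("Srb10", "Cdk8", "Cdk8", "Cdk8", "Mediator", ("P-Ser2", "P-Ser5")),
    CtdKinase(None, None, None, "Plk3", None, ("P-Thr4",)),
    CtdKinase(None, None, None, "c-Abl", None, ("P-Tyr1",)),
    CtdKinase("Cdc28", "Cdc2", "Cdc2", "Cdc2/Cdk1", None, ("P-Ser5",)),
    CtdKinase(None, None, None, "Brd4", None, ("P-Ser2",)),
    CtdKinase("Cka2", "Cka1", "CK2alpha", "CK2alpha1", "CK2", ("P-Ser13",)),
    CtdKinase(None, None, "dERK/rolled", "ERK2/MAPK1", None, ("P-Ser5",)),
]


def kinases_for_mark(mark: str, organism: str = "h_sapiens") -> List[str]:
    """List the kinases of one organism whose listed specificity includes the mark."""
    names = []
    for row in TABLE_1:
        name = getattr(row, organism)
        if name is not None and mark in row.specificity:
            names.append(name)
    return names


print(kinases_for_mark("P-Ser2"))
print(kinases_for_mark("P-Ser2", "s_cerevisiae"))
print(kinases_for_mark("P-Tyr1", "s_cerevisiae"))
```

The last query returns an empty list, reflecting that the table lists no budding yeast kinase for P-Tyr1.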

2.3.1. TFIIH Places Both P-Ser5 and P-Ser7 Marks Prior to Initiation. Several kinases have been shown to phosphorylate the CTD in vitro or in vivo, but only now we start to understand how they coordinately generate the complex and highly dynamic CTD phosphorylation cycle (Table 1). Very early on, in vitro assays have established that the general transcription factor TFIIH contains a CTD kinase activity (Kin28 in yeast, Cdk7 in human).47,48 Based on work by several laboratories, it is very well established that Kin28, as part of TFIIH and within the PIC, is responsible for the deposition of P-Ser5 and P-Ser7 prior to transcription initiation (Figure 3A).22,26,27,34,49−52 Elegant work by the Ansari group showed that phosphorylation of Ser5 by Kin28 is a prerequisite for the phosphorylation of Ser7 by the same enzyme,34 suggesting that P-Ser7 is generally added next to P-Ser5. Kin28 seems to be specific for these two serine residues since it has virtually no activity toward Ser2 in vitro27,34 and its inhibition has only neglectable effects on P-Ser2 profiles in vivo.27,34,53 This is all consistent with the fact that Kin28, as part of the TFIIH complex, is restricted to the promoter region where P-Ser5 and P-Ser7 (but not P-Ser2, P-Tyr1, and P-Thr4) levels are high (Figure 2). 2.3.2. Modulating the Initial Marks Early during Elongation. Soon after initiation, P-Ser5 levels decline partially, reaching a plateau at about 50% of its initial level about 500 bp after initiation (Figure 2). An atypical phosphatase, named Rtr1 (RPAP2 in human), was shown to be responsible for this activity54,55 (Figure 3B). The fact that only a fraction of Ser5 is dephosphorylated at this point argues for the existence of at least two functionally distinct variants of P-Ser5. It also suggests that Rtr1 can read that specificity. 
Since phosphorylation of Ser5 is a prerequisite for phosphorylation of Ser7,34 one can speculate that, after initiation, some repeats contain the dual P-Ser5/P-Ser7 marks while some may only be marked by P-Ser5. In such a scenario, one can imagine the presence of P-Ser7 next to P-Ser5 being responsible for restricting Rtr1’s phosphatase activity to either of these variants of P-Ser5, which would explain the partial dephosphorylation of P-Ser5 at this point in the phosphorylation cycle. Although speculative, such a model should be testable using current technologies. Perhaps in line with that model, the Murphy group recently showed that RPAP2 is recruited to snRNA genes in human cells via interaction with P-Ser7, providing a mechanism explaining the importance of Ser7 phosphorylation for the expression of these small RNAs.55 Quiet puzzlingly, however, they also provide evidence that this mechanism does not operate at protein-coding genes, implying that additional factors, present specifically on snRNA genes, must be involved. Also, our recent ChIP data showed that replacing Ser7 with alanines does not affect the P-Ser5 pattern, suggesting that phosphorylation of Ser7 is not involved in the modulation of PSer5 by Rtr1.27 The mechanisms allowing the recruitment of Rtr1 to the 5′ end of protein-coding genes therefore remains to be elucidated. The phosphatase activity of Rtr1 is rather controversial. While a few groups have reported compelling evidence for such an activity,54,55 others have not been able to reproduce such

Figure 3. Representation of the ordered recruitment of different CTD kinases and the phosphatase Rtr1 in the first kilobase from the TSS. (A) TFIIH and its associated kinase Kin28 (Cdk7 in human) is recruited at the TSS as part of the preinitiation complex (PIC) where it phosphorylates Ser5 and Ser7 before initiation. (B) Soon after initiation (within the first 500 bp), Bur1/Bur2 (also known as PTEFb) is recruited to the CTD through recognition of P-Ser5 and phosphorylates Ser2 on some repeats. Other mechanisms have also been proposed to explain the recruitment of P-TEFb as described in the main text. Concomitantly, the Rtr1 phosphatase partially dephosphorylates Ser5. In human cells, the ortholog of Rtr1 (RPAP2) is recruited via binding to P-Ser7 (not depicted here). (C) Further downstream, Ctk1 (as part of CTDK1) generates the bulk of P-Ser2. Ctk1 has to compete with the P-Ser2 phosphatase Fcp1. The mechanism that recruits Ctk1 and Fcp1 is not completely understood (see the text for details), but evidence suggests that deubiquitylation of H2B by the Ubp8 subunit of SAGA may be required for the recruitment of Ctk1. This complex sequential recruitment of CTD kinases is of prime importance for the implementation of proper crosstalk between the enzymes as depicted in Figure 4.

data.56 Furthermore, recent structural data argues that Rtr1 lacks a catalytic site.56 Although the occupancy of Rtr1 in the 5′ end of genes fits its role as a CTD phosphatase, other mechanisms can be envisioned to explain the decrease in the P-Ser5 signal in that region. For example, it is quite possible that the apparent decrease in the P-Ser5 signal is due to masking. Masking may occur if a protein binds to the CTD and sterically blocks the access of the antibody to its epitope. This may appear unlikely given that ChIP experiments are performed in semidenaturing conditions, but this type of masking has nevertheless been reported before.27 Alternatively, masking

dx.doi.org/10.1021/cr4001397 | Chem. Rev. XXXX, XXX, XXX−XXX


may occur if the CTD is modified in such a way that the P-Ser5 epitope is changed into a variant that is no longer recognized by the antibody. In this specific case, the addition of phosphate groups on Ser2 at some P-Ser5 repeats may generate masking. This possibility is consistent with the fact that the presence of a phosphate group on the Ser2 that precedes Ser5 reduces the affinity of anti-P-Ser5 antibodies (H14 and 3E8) for their epitopes.23 It is also consistent with our findings that P-Ser5 profiles are somewhat more uniform in ctk1 mutants (where virtually no P-Ser2 is detected on genes).27 Further experiments are therefore needed before we can determine what is causing the decrease of the P-Ser5 levels after initiation.

2.3.3. Ctk1 (CTDK1) and Bur1 (P-TEFb) Share the Duty of Placing P-Ser2 during Elongation. In yeast, at least two kinases, Ctk1 and Bur1, can efficiently phosphorylate the CTD on Ser2 in vitro57−60 (Figure 3B,C). Both kinases have therefore been proposed to be responsible for the phosphorylation of Ser2 during elongation (reviewed in ref 61). Current data suggests that, while both enzymes contribute to P-Ser2 levels in vivo, Ctk1 is the major Ser2 kinase.27,53,62 Indeed, in the absence of functional Ctk1, very little P-Ser2 can be detected on genes by ChIP.27 In bur1 mutants, or in mutants for its associated cyclin Bur2, however, P-Ser2 levels are slightly but reproducibly reduced (especially in the 5′ portion of genes), suggesting that this kinase is responsible for the phosphorylation of Ser2 on some CTD repeats.27,62 While the contribution of Bur1 to Ser2 phosphorylation is quantitatively modest, it may nevertheless be functionally critical. For example, it may create a P-Ser2 variant that cannot otherwise be generated by Ctk1 and that is specifically “read” by some proteins. The existence of functionally distinct P-Ser2 variants is supported by the complex mode of P-Ser2 dephosphorylation (see sections 2.4.3 and 2.4.4). In addition, our laboratory has generated proteomic evidence that Ctk1 and Bur1 are required for the association of RNAPII with distinct sets of proteins (unpublished observations). As will be discussed in sections 4.1.2 and 4.2.1.2, Bur1 and its human ortholog Cdk9 have functions that extend far beyond the CTD since they have a wide range of different substrates.

The Bur1/Ctk1 duality had long been thought to be a yeast-specific oddity, until the Greenleaf laboratory identified fly and human Cdk12 as the Ctk1 ortholog.63 They also provided evidence, using complementation assays in yeast, that Cdk9 is likely the functional Bur1 ortholog. This functional data is in agreement with previous phylogenetic work demonstrating that BUR1 is more closely related to CDK9 while CTK1’s closest homologue in human is CDK12.64,65

Both Bur1 and Ctk1 can be detected by ChIP all along active genes.37,53,66 Yet, phosphorylation of Ser2 increases steadily during the first kilobase of genes (see Figure 2). In addition, the slope of P-Ser2 accumulation is much less steep than those of P-Ser5 and P-Ser7. Several explanations may account for this difference, including the fact that phosphorylations of Ser5 and Ser7 occur on a nontranscribing polymerase while Ser2 is phosphorylated as RNAPII is moving down the gene. Alternatively, more repeats may have to be phosphorylated on Ser2 than on Ser5 and Ser7. These possibilities, however, are difficult to test with current technologies. One mechanism that has been demonstrated to slow down accumulation of P-Ser2 is the competition of Ser2 kinases with the phosphatase Fcp1 (Figure 3C).27,53 Fcp1, like the Ser2 kinases, occupies the body of active genes.37,53 The accumulation of P-Ser2 during elongation is therefore the result of a balance between the respective activities of the CTD kinases and phosphatase. Interfering with Fcp1 activity indeed leads to the hyperaccumulation of P-Ser2 over genes. Notably, the slope of P-Ser2 accumulation is steeper in fcp1 mutants, directly suggesting that the presence of Fcp1 indeed restricts the accumulation of P-Ser2.27,53 This data also suggests that the plateau of P-Ser2 reached about one kilobase downstream from the TSS is not the result of a saturation of repeats with P-Ser2 but rather the result of a programmed schedule of phosphorylation/dephosphorylation battle over genes. More recently, the Cech laboratory reported another protein that prevents premature Ser2 phosphorylation. They showed that in human cells, FUS, an RNA-binding protein involved in several diseases, binds the CTD and prevents phosphorylation of Ser2 near the TSS. In the absence of FUS, the CTD was found to be prematurely phosphorylated on Ser2, leading to premature termination at many genes.67 Here again, this raises the question as to which CTD repeats are phosphorylated and whether this occurs randomly.

2.3.4. Phosphorylation of Tyr1 Is Performed by a Nontraditional CTD Kinase. The pattern of P-Tyr1 is very similar to that of P-Ser2 (ref 35 and Figure 2), which immediately suggested that Bur1 and/or Ctk1 may be the responsible kinase(s). Neither mutants for those kinases nor mutants for Kin28 or Srb10, however, turned out to affect P-Tyr1 profiles by ChIP (ref 35 and our own unpublished results). Additional potential candidates were also tested in our laboratory, and so far, the identity of the Tyr1 kinase in yeast has been elusive. c-Abl was shown to phosphorylate the CTD in vitro and in vivo (see section 2.3.6.4),68 but since it has no homologue in yeast, it is unlikely to be the physiological Tyr1 kinase during transcription. It is therefore becoming clear that the Tyr1 kinase is most likely a protein not traditionally thought of as being involved in transcription.

2.3.5. Phosphorylation of Thr4 by Plk3. As mentioned above, the P-Thr4 signal is modest in the body of genes but tremendously increases downstream from the pA (ref 38 and Figure 2). This pattern is consistent with Thr4 being phosphorylated during termination, but because of the sensitivity of the antibody to neighboring phosphoserines, one cannot rule out the possibility that Thr4 is phosphorylated during elongation in a manner similar to Ser2 and Tyr1. Consistent with that last possibility, Thr4 was initially proposed to be phosphorylated by Cdk9 (the mammalian Bur1),69 but this conclusion was recently challenged.38 Another group rather showed, using both in vitro and in vivo assays, that Thr4 is likely phosphorylated by Plk3 (Polo-like kinase 3).38 To our knowledge, the genomic occupancy of Plk3 has never been profiled; such data would help in determining whether P-Thr4 is an elongation- or termination-associated mark.

2.3.6. CTD Kinases with Ambiguous Specificities and Poorly Understood Roles. In addition to Kin28, Ctk1, Bur1, and recently Plk3, several other kinases have been reported to phosphorylate the CTD, but their role in the CTD phosphorylation cycle remains uncertain. Here we will briefly review the literature about some of these kinases.

2.3.6.1. Srb10/Cdk8. Srb10 is a kinase associated with Mediator.70,71 Srb10 (and its mammalian ortholog Cdk8) has been shown to phosphorylate the CTD in vitro72 and in vivo,70 but evidence that it does so in the context of the CTD cycle during transcription is lacking. Srb10 was originally proposed to phosphorylate the CTD in solution, therefore preventing RNAPII recruitment to promoters.72 More recently, it was


proposed to play a role in transcriptional elongation,73 but its deletion does not have any detectable effects on CTD profiles in vivo, suggesting that it may regulate elongation in a manner that does not involve CTD phosphorylation. However, we cannot eliminate the possibility that it phosphorylates the CTD in a way not detectable by current antibodies. Finally, Srb10/Cdk8 has other substrates, including cyclin H (the partner of Kin28/Cdk7), whose phosphorylation has been proposed as a pathway used by Cdk8-containing Mediator to repress transcription.74,75

2.3.6.2. Cdc28/Cdc2. Although far from being the best characterized CTD kinase, Cdc2 (also known as Cdk1) was nevertheless the first kinase shown to phosphorylate the CTD.76 Phosphorylation of the CTD by Cdc2 is stimulated by the isomerase Pin1 and it has been proposed that this activity is involved in the hyperphosphorylation of the CTD during mitosis, perhaps contributing to the global shutdown of transcription in that phase of the cell cycle.77 More recently, the yeast ortholog of Cdc2, Cdc28, was shown to boost transcription and mRNA capping at a subset of genes during cell cycle entry (G1/S) via Ser5 phosphorylation.78 Interestingly, the role of Cdc28 in CTD phosphorylation seems to be gene-specific since the kinase binds to and regulates some highly transcribed genes such as PMA1 but not others such as ACT1 and SSE1. The regulation of PMA1 by Cdc28 occurs specifically at S phase entry, when Cdc28 is active. This led to the idea that Cdc28 works as a booster of transcription during cell cycle entry when the bud has to grow significantly.79 This model, however, is largely based on work done at the PMA1 gene. It will be interesting to see whether this can be generalized to other genes heavily transcribed during cell cycle entry. Interestingly, Cdc28 was also shown to play a role in the recruitment of the 19S proteasome to some promoters. This function, which is independent of its kinase activity, stimulates transcription via the eviction of promoter nucleosomes.80−82

2.3.6.3. Brd4. Brd4 is the most recently discovered CTD kinase. Brd4 is a bromodomain-containing protein of high clinical relevance since it is involved in several diseases including many cancers.83 It is recruited to chromatin via binding to acetylated histones through its two bromodomains.31,84−86 Recently, this binding was shown to be regulated by phosphorylation of Brd4 by CK2.87 Brd4 binds and recruits P-TEFb to chromatin.88,89 As a consequence, it contributes to the phosphorylation of the CTD. Last year, however, the Singer group showed that Brd4 is itself a CTD kinase phosphorylating Ser2 in vitro and in vivo.86 Interestingly, Brd4 and P-TEFb generate different variants of P-Ser2 since a CTD phosphorylated by Brd4 is recognized by the 3E10 but not the H5 antibody while P-TEFb generates a CTD better recognized by H5.86,90 Recent ChIP-Seq experiments have shown that Brd4 associates with active promoters as well as several enhancers.91 Brd4 does not appear to travel with RNAPII;91 thus its direct contribution to CTD phosphorylation is likely to be restricted to polymerases in the promoter-proximal region. The contribution of Brd4 to the CTD phosphorylation cycle has not been investigated, so it remains unclear what contribution it makes and how it relates to other Ser2 kinases such as P-TEFb and Cdk12.

2.3.6.4. c-Abl. c-Abl (Abl1) and the related Arg (Abl2) kinases were shown to phosphorylate the CTD on Tyr1 both in vitro and in vivo.68,92−95 Experiments performed in human cells suggest that c-Abl-dependent phosphorylation of the CTD may play a role in cell cycle regulation of transcription at specific genes, in HIV gene transcription, and upon DNA damage

(reviewed in ref 96). c-Abl contains an SH2 domain that binds P-Tyr1, allowing the kinase to processively phosphorylate all 52 repeats of the CTD.97 Phosphorylation of the CTD by c-Abl also requires the last (and nonconsensus) CTD repeat of the human CTD.94 Taken together, this data suggests that c-Abl may dock on the last CTD repeat and processively phosphorylate the CTD via a spreading mechanism involving the binding to the self-promoted P-Tyr1. The last CTD repeat is not conserved through evolution so that neither the yeast nor the Drosophila CTD can be phosphorylated by c-Abl.94 Combined with the fact that P-Tyr1 (but not c-Abl) is present in yeast, this data also argues that c-Abl is unlikely to be the kinase responsible for CTD tyrosine phosphorylation during the transcription cycle. Current data rather suggests that c-Abl phosphorylates the CTD in specific conditions, and perhaps at specific genes.

2.3.6.5. CK2. CK2 (also called CKII or casein kinase II) phosphorylates the CTD98 but only on the last (and nonconsensus) CTD repeat on Ser13.99,100 CK2 occupies transcribed genes and interacts with and phosphorylates many general transcription factors.101−109 Whether CTD phosphorylation by CK2 plays a role in transcription, though, is unclear.100,109 CK2 is certainly required for cell cycle progression and for the response to stress,110 but it remains to be established whether CTD phosphorylation is involved in these functions of CK2.

2.3.6.6. ERK1/2. ERK2 (MAPK1) phosphorylates Ser5 of both consensus and nonconsensus repeats in vitro.111,112 ERK1/2-dependent CTD phosphorylation has been proposed to be involved in response to several stresses113,114 and in the hyperphosphorylation of RNAPII during both mitosis and meiosis.115,116

2.3.7. Ordered Recruitment of CTD Kinases. Besides Kin28, which is recruited as part of TFIIH within the PIC (Figure 3A), the mechanisms that govern the recruitment of the other CTD kinases (as well as the CTD phosphatases) are incompletely defined. Yet this is one of the most crucial elements that would allow a better understanding of how the CTD phosphorylation cycle is orchestrated. Indeed, most CTD kinases are quite promiscuous in vitro27,117 and recent data suggested that the order in which they enter the cycle is instructive regarding their specificities.27 Several mechanisms have been proposed to explain the recruitment of P-TEFb. In yeast, phosphorylation of the CTD by TFIIH was proposed to trigger the recruitment and the activity of P-TEFb.62,118,119 The recruitment of P-TEFb may also occur via its interaction with the capping enzyme.120 In higher eukaryotes, several mechanisms have been proposed. Indeed, Brd4,89 Mediator,121 and the HIV transactivator Tat122,123 were all shown to recruit P-TEFb in different contexts. In addition, P-TEFb may be recruited as part of the super elongation complex (SEC), a huge protein complex containing several elongation factors.124 Regardless of the exact mechanism, in all models P-TEFb is recruited after TFIIH and often in a mechanism predicted to depend on TFIIH kinase activity (Figure 3B). The mechanisms governing the recruitment of Ctk1, the main Ser2 kinase, remain poorly understood, although it has recently been shown to interact with nucleosomes in a manner that is blocked by H2B ubiquitylation125 (Figure 3C). This study shows that the Ubp8 deubiquitinase present in the lysine acetyltransferase complex SAGA is required for the recruitment of Ctk1. While this does not provide a satisfying mechanism for


recruitment to active genes (nonubiquitylated nucleosomes are everywhere), it provides an interesting link between histone modifications and CTD phosphorylation. In addition, because H2B ubiquitylation is ultimately regulated by Kin28/TFIIH (P-Ser5 is involved in the recruitment of SAGA during elongation; ref 126), it further reinforces the pioneer role of Kin28/TFIIH in setting up the CTD cycle (Figure 3).

2.3.8. Interplays between the CTD Kinases: Timing, Priming, and Cross-Phosphorylation. As mentioned above, CTD kinases are recruited in a very well orchestrated manner and current knowledge suggests that Kin28/TFIIH is a master regulator of this process. Further supporting this idea, inhibiting the activity of Kin28 in vivo leads to aberrant phosphorylation of Ser5 and Ser7 by other kinases (including Bur1 and certainly others) downstream from the TSS.27 This means that, while Kin28 is likely the sole kinase to actually phosphorylate Ser5 and Ser7 in normal situations, other kinases have the intrinsic ability to phosphorylate these residues and have to be kept from doing so. Timing in the recruitment of CTD kinases along genes is therefore a key determinant of their actual specificity in vivo. In addition to restricting the activity of other kinases, Kin28 also achieves its pioneering role by priming the CTD for phosphorylation by other kinases. For example, phosphorylation of Ser5 stimulates the phosphorylation of Ser2 by Ctk1.127 In addition, Kin28, and its fission yeast ortholog Mcs6, mediates the recruitment of P-TEFb and stimulates deposition of P-Ser2 (refs 62, 118, and 128; Figure 4A). In vitro assays also suggest that phosphorylation of Ser7 stimulates the kinase activity of P-TEFb while P-Ser5 inhibits it.117 Interestingly, by phosphorylating the CTD on Ser5, Kin28 primes the CTD for phosphorylation of Ser7 by itself. Another example of CTD priming was proposed by Hinnebusch and colleagues, who showed that Bur1/P-TEFb primes the CTD for further phosphorylation by Ctk1/CTDK1 (ref 62; Figure 3C). Recently, interplay between CTD kinases was shown to operate more directly. Indeed, Devaiah and Singer showed that several of the CTD kinases can phosphorylate each other.90 This leads to either stimulation or inhibition of their kinase activities. Notably, they showed that P-TEFb and Brd4, two Ser2 kinases, phosphorylate each other, leading to their respective activation (and sometimes inhibition) (Figure 4B). Finally, they also showed that Cdk7/TFIIH phosphorylates and inhibits Brd4. In addition, TFIIH phosphorylates other kinases such as Cdc2 and Cdk2, leading to either their activation or inactivation.90,129,130 This further reinforces the idea that TFIIH (Kin28/Cdk7) is a master regulator of CTD phosphorylation.

Figure 4. Different interplays (cross-talks) between CTD kinases that contribute to the establishment of the CTD phosphorylation cycle. The two described cross-talk mechanisms are depicted. (A) CTD priming. Phosphorylation of the CTD by some kinases has been shown to “prime” the CTD for further phosphorylation. As depicted here, phosphorylation of Ser5 by TFIIH primes phosphorylation of Ser7 by the same kinase. Furthermore, phosphorylation of Ser5 and Ser7 by TFIIH primes the CTD for phosphorylation of Ser2 by P-TEFb as P-Ser5 was shown to recruit P-TEFb while P-Ser7 was shown to stimulate its activity (see the main text for details). (B) Cross-phosphorylation. In addition to phosphorylation of the CTD, some CTD kinases can phosphorylate other substrates including other CTD kinases. It was recently shown that TFIIH can inactivate Brd4 by phosphorylating it. In addition, P-TEFb and Brd4 phosphorylate each other, leading to modulation of their respective activities. Phosphorylation of Brd4 by P-TEFb leads to its activation, while Brd4 can either stimulate or inhibit the activity of P-TEFb depending on what amino acid it phosphorylates.

2.4. Removing Phosphorylation Marks: CTD Phosphatases

We have so far described how the CTD phosphorylation marks are established during the first kilobase by the concerted action of several kinases. Here we will discuss how different CTD phosphatases coordinately dephosphorylate the CTD upon passage over the polyA signal (Figure 5). As described in section 2.2, the dephosphorylation of the CTD is no less complex than its phosphorylation. It involves at least three distinct phosphatases, Fcp1, Ssu72, and Rtr1/RPAP2 (Table 2), which, like the CTD kinases, function together in a complex relationship. In addition, CTD dephosphorylation is also regulated by proline isomerization by the peptidyl-prolyl isomerase Ess1 (Pin1 in human). Rtr1/RPAP2 intervenes early in the CTD cycle and was discussed above together with the kinases (see Figure 3B), so the next sections will focus on the other two known CTD phosphatases: Ssu72 (section 2.4.2) and Fcp1 (sections 2.4.3 and 2.4.4).

2.4.1. Dephosphorylation of P-Tyr1: The Chase Is On. P-Tyr1 is the first mark to decline in the 3′ end of genes (Figure 2). The P-Tyr1 phosphatase, however, has yet to be identified (Figure 5A). Since there is a lot of interest in the identity of that enzyme, it can be expected that it will soon be identified.

2.4.2. Dephosphorylation of P-Ser5 and P-Ser7: Ssu72, a Phosphatase “A-cis-ted” by Ess1. Ssu72 was initially identified genetically as a phenotype enhancer of a TFIIB mutation131 and was then characterized as part of APT, a subcomplex of the yeast cleavage/polyadenylation factor (CPF) 3′ end processing complex.132−136 Several lines of evidence, however, also suggested a link with the CTD. Notably, some ssu72 alleles are synthetically lethal with mutants of the CTD kinases Kin28 and Ctk1 and suppressed by overexpression of the CTD phosphatase Fcp1.137 In addition, Ssu72 interacts with subunits of RNAPII both genetically and physically.134,138 In 2004, the Moore and Hampsey groups showed that Ssu72 has P-Ser5 phosphatase activity both in vitro and in vivo.139 As a component of the CPF complex, Ssu72 is localized at the 3′


end of genes,27 which made it a perfect candidate as the CTD phosphatase completing the work initiated by Rtr1 in dephosphorylating the residual P-Ser5 prior to termination. This was confirmed recently using ChIP assays.27 Dephosphorylation of P-Ser5 at the end of genes was also shown to require Ess1, the prolyl isomerase that interacts with P-Ser5 and isomerizes the CTD (Figure 5A).27,140−142 Conveniently, Ess1 was shown to stimulate the activity of Ssu72 toward P-Ser5 in vitro.143 Recently, we and others have shown that Ssu72 can also dephosphorylate P-Ser7 in vitro27,37 and that its activity is required for the dephosphorylation of P-Ser7 at the end of genes (Figure 5A).27 This places Ssu72 as a dual specificity phosphatase, removing phosphate groups from both Ser5 and Ser7 at the end of the transcription unit. Interestingly, CTD isomerization by Ess1 was shown to be required for the removal of P-Ser7 in vivo and to stimulate Ssu72’s activity toward P-Ser7, similarly to P-Ser5.27

While the evidence for Ssu72 as a P-Ser7 phosphatase is compelling, this activity was somewhat surprising since modeling the binding of a P-Ser7 peptide within the Ssu72 active site based on available data revealed incompatibility due to steric constraints.144 This issue was recently resolved by Xiang et al.,145 who showed that P-Ser7 peptides interact with the Ssu72 catalytic site in the reverse orientation relative to P-Ser5 peptides. This reverse orientation allows P-Ser7 to fit in the active site of the phosphatase albeit with much lower affinity. Consequently, dephosphorylation of P-Ser7 peptides is extremely inefficient in vitro compared with P-Ser5 peptides.145 This data clashes with the fact that both P-Ser5 and P-Ser7 are dephosphorylated with the same kinetics on genes in vivo.27 When a full-length CTD is used as substrate, however, Ssu72 appears to dephosphorylate both P-Ser5 and P-Ser7 with similar efficiency, and this is reproducible across several laboratories.27,37,145 While the reason for that is unclear, it suggests that Ssu72 makes additional contacts with the CTD in addition to the ones it makes via its catalytic site. It also suggests that these additional interactions somehow stimulate the activity toward P-Ser7. Further experiments are required before we understand how Ssu72 dephosphorylates both P-Ser5 and P-Ser7. Another interesting difference between the Ssu72/P-Ser5 and Ssu72/P-Ser7 structures concerns the isomerization state of Pro6. Indeed, Pro6 is in the trans conformation in the P-Ser7-bound peptide,145 whereas it is in cis in the P-Ser5-bound peptide.144 This suggests a model for how the prolyl isomerase Ess1 might stimulate Ssu72 and perhaps help the phosphatase dephosphorylate P-Ser7 with kinetics similar to those for P-Ser5 in vivo. By accelerating the switch between both isomerization states, Ess1 may allow Ssu72 to processively remove both P-Ser5 (in the cis conformation) and P-Ser7 (in the trans conformation). Additional biochemical experiments will be required in order to better understand the enzymology of Ess1 and Ssu72.

2.4.3. Partial Dephosphorylation of P-Ser2: Fcp1 Winning over Ser2 Kinases. At the same time as P-Ser5 and P-Ser7 are dephosphorylated by Ssu72, P-Ser2 is being dephosphorylated by Fcp1 (Figure 5A). As described in section 2.2.2, however, a fraction of P-Ser2, recognized by the 3E10 antibody but not by the H5 and BL antibodies, is resistant to this dephosphorylation activity and is removed after termination by a mechanism that is currently ill-defined but which requires the concerted action of both Ssu72 and Fcp1 phosphatases.27,37 Fcp1 was the first CTD phosphatase to be identified.146,147 It has been reported to dephosphorylate P-Ser2, P-Ser5, and even P-Ser7 in vitro,148−151 but it appears to specifically affect P-Ser2 levels in vivo.27,53 The discrepancy between the in vitro specificities observed between different laboratories suggests that it is regulated by some combination of external factors, including the following: protein partners or cofactors (notably

Figure 5. Dephosphorylation of the CTD at the 3′ end of genes. (A) As RNAPII reaches the polyA (pA) signal, Ssu72 is recruited as part of the CPF complex and dephosphorylates Ser5 and Ser7, stimulated by the isomerase Ess1. At the same time, Fcp1 partially dephosphorylates Ser2 and (at least in higher eukaryotes) Plk3 phosphorylates Thr4. All these events are preceded by dephosphorylation of P-Tyr1 by an unknown phosphatase. (B) RNAPII disengages chromatin a few hundred base pairs later with phosphate groups on Ser2 and Thr4. The completion of P-Ser2 dephosphorylation is achieved by Fcp1 in a manner that depends on prior dephosphorylation by Ssu72, while P-Thr4 is dephosphorylated by a yet to be identified phosphatase.

Table 2. Known CTD Phosphatases

S. cerevisiae | S. pombe | D. melanogaster | H. sapiens | protein complex | specificity (a)
Ssu72 | Ssu72 | Ssu72 | Ssu72 | CPF | P-Ser5, P-Ser7
Fcp1 | Fcp1 | Fcp1 | Fcp1 | − | P-Ser2, P-Ser5
Rtr1 | Rtr1 | CG34183 | RPAP2 | − | P-Ser5
− | − | CG5830 | Scp1/CTDSP1 | − | P-Ser5
Cdc14 | Clp1/Flp1 | Cdc14 | Cdc14b | − | P-Ser5

(a) Phosphorylations for which the in vivo relevance has been established are shown in bold.


MEP50, TFIIB, TFIIF, or the Rpb4/7 subunits of RNAPII; refs 103, 107, and 152−158), Fcp1’s phosphorylation state (Fcp1 is phosphorylated by CK2; refs 103, 106, and 159), the length of the substrate (full CTD versus short peptides), the combinatorial marks on the CTD substrate, and/or the timing of the recruitment of Fcp1 along the transcription cycle. While it has a clear stimulatory activity toward Ssu72, the effect of CTD proline isomerization on the phosphatase activity of Fcp1 is unclear. Using different in vitro and in vivo assays, different groups have reported either stimulation160 or inhibition77 of Fcp1’s activity by Ess1/Pin1. Here again, the exact nature of the substrate used, as well as the source of Fcp1 (native versus recombinant), may explain the discrepancy. When looking at CTD phosphorylation along genes, however, we were not able to detect P-Ser2 defects in an ess1 mutant,27 suggesting that if proline isomerization regulates Fcp1, it may do so only after RNAPII has terminated transcription or, as has been proposed before, during M phase, when the polymerase becomes hyperphosphorylated.77 Fcp1 occupies the entire transcribed unit but its occupancy, as measured by ChIP, is maximal in the 3′ end of genes.37 As discussed in section 2.3.3, Fcp1 opposes the activity of Ctk1 (and most likely also Bur1) during elongation, ensuring the proper accumulation rate of P-Ser2 throughout this stage of transcription (Figure 3C). While the kinases appear to win that battle during elongation, the balance is shifted toward Fcp1’s advantage after the pA signal (Figure 5). This is perhaps due to the fact that Fcp1 occupancy increases at that time, but most certainly it also involves the fact that the Ser2 kinases are leaving the elongation complex at that point. Neither the mechanism triggering the exit of the CTD kinases nor the one allowing for the recruitment of additional Fcp1 at the end of genes has been worked out.

Fcp1 interaction with RNAPII is not solely mediated by the CTD.161 Indeed, Fcp1 binding to RNAPII has been shown to be mediated by the general transcription factors TFIIB and TFIIF107,153,154,156−158 as well as Rpb4/7,155 two subunits of RNAPII known to be somewhat loosely associated with the core of the enzyme.162 It is therefore possible that changes in CTD modifications (for example, those mediated by Ssu72), together with modulation of these additional interaction surfaces, regulate the amount of Fcp1 reaching the CTD along transcribed genes.

2.4.4. Complete Dephosphorylation of P-Ser2: Fcp1 with a Little Help from Its Friend. In yeast, transcription terminates with the CTD being partially phosphorylated on Ser2 (the Fcp1-resistant/3E10-reactive variant of P-Ser2)27 (Figure 2). In human cells, however, P-Thr4 levels rise considerably concomitantly with the dephosphorylation of Ser2, Ser5, and Ser7, likely due to the action of Plk3 kinase.38 As a result, termination occurs with a CTD bivalently marked by P-Ser2 and P-Thr4 in human cells. These marks therefore need to be removed in solution prior to the recycling of RNAPII (Figure 5B). While the phosphatase for P-Thr4 is unknown, in vitro evidence suggests that Fcp1 removes the residual P-Ser2 after termination.27,146,163 Interestingly, Fcp1 is able to remove the 3E10 (P-Ser2) mark in vitro, but only when assisted by Ssu72.27 Because Ssu72 is not able to remove any P-Ser2 in this assay, it most likely acts by preparing the CTD substrate for Fcp1. According to that model, the action of Ssu72 prior to termination would set the stage for further dephosphorylation of P-Ser2 by Fcp1 after termination. While the mechanism for this cooperation is unclear, the simplest explanation may be that Ssu72 removes some inhibitory P-Ser5 and/or P-Ser7 around the resistant P-Ser2 epitopes. Structural data on Fcp1 bound to P-CTD substrates would be extremely valuable for the elucidation of these mechanisms, but regardless of the exact mechanism, the most significant implication of these findings is that there exist multiple functionally distinct variants of phosphorylation marks on the CTD. This evidence, coupled with a growing body of other evidence (see ref 39 for an example), reinforces the idea that combinations of marks are biologically relevant and that the notion of a combinatorial CTD code is more than just an elegant theoretical concept.

2.4.5. Other CTD Phosphatases. Other CTD phosphatases have been described, but their roles have not been clearly defined (Table 2). Perhaps the best characterized is Scp1, a phosphatase similar to Fcp1 but with specificity toward P-Ser5.164,165 Scp1 is specifically expressed in non-neuronal cells where it negatively regulates the expression of neuronal genes using unknown mechanisms.166 Cdc14b, one of the two mammalian orthologs of the yeast M phase specific phosphatase Cdc14, was recently shown to target P-Ser5 and regulate the expression of cell-cycle-specific genes.167 Analogously, Fcp1 was recently shown to be required for mitosis exit in human cells, although the relevant substrate for this activity is not the RNAPII CTD.168

2.5. Beyond Phosphorylation

2.5.1. CTD O-GlcNAcylation. While the discussion above focuses on phosphorylation, the CTD is not limited to this modification. Notably, O-GlcNAcylation, which consists of the addition of a monosaccharide N-acetylglucosamine in β-O-glycosidic linkage to the side chain hydroxyl groups of serines and threonines, has been known to decorate the CTD since 1993.169 The function of CTD O-GlcNAcylation has been elusive, but a recent study by Ranuncolo and colleagues has shed light on this CTD modification.170 O-GlcNAc is a very dynamic modification added and removed by O-GlcNAc transferase (OGT) and N-acetylglucosaminidase (OGA), respectively. It occurs on many proteins in addition to the RNAPII CTD. Interestingly, RNAPII O-GlcNAcylation is restricted to the CTD.169 While it has been shown to occur on all three serines and on Thr4, the rapid turnover of this modification makes it difficult to ascertain which CTD residue(s) is(are) the primary substrate(s) in vivo (see ref 170). Using an antibody directed against the O-GlcNAc-CTD, it was recently shown by ChIP that this CTD modification occurs on promoter-bound RNAPII,170 consistent with the fact that it is mutually exclusive with CTD phosphorylation.169,171 Using OGT and OGA inhibitors, Ranuncolo et al. were able to show that cycles of glycosylation/deglycosylation take place during PIC assembly and are required for preinitiation.170 Because O-GlcNAcylation prevents CTD phosphorylation, its presence in the PIC may prevent premature phosphorylation by TFIIH or other kinases.169,171 O-GlcNAcylation is nonexistent in S. cerevisiae, however, supporting the idea that this modification is not necessary for the basic CTD cycle. It may play cell-type-specific roles in higher eukaryotes or be necessary in these organisms due to their increased complexity in terms of CTD length and sequence, number of potential CTD interactors, and number of different types of promoter elements.

2.5.2. Modifications of Nonconsensus CTD Residues: Arginine Methylation and Beyond. While the yeast CTD is mainly made of consensus repeats, the CTD of higher eukaryotes contains several nonconsensus repeats, especially in the distal part of the

dx.doi.org/10.1021/cr4001397 | Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

Table 3. Characterized CTD Binding Proteinsa CTD binding protein complex

factor

species

protein function

direct/indirect

interaction domain

P-CTD preference

refs



P-Ser5

196, 411

KMT KMT

indirect (via Paf1C) direct n.d.

n.d. n.d.

450 453−455

S.c.

KMT

direct

SRI

hSet2 (HYPB; KMT3A)

H.s.

KMT

direct

SRI

Rpd3S

n.d.

S.c.

KDAC

n.d.

n.d.

Set3C NuA4 SAGA − −

n.d. n.d. n.d. Chd8 Spt6

KDAC KAT KAT chromatin remodeler histone chaperone

n.d. n.d. n.d. n.d. direct

n.d. n.d. n.d. n.d. SH2

FACT

n.d.

histone chaperone

n.d.

n.d.



HP1c

S.c. S.c. S.c. H.s. S.c.; H.s. S.c.; D.m. D.m.

P-Ser5 P-Ser5; P-Ser2 P-Ser2/ P-Ser5 P-Ser2/ P-Ser5 P-Ser2/ P-Ser5 P-CTD P-CTD P-CTD P-CTD P-Ser2; P-Tyr1 P-CTD

transcription elongation

direct

n.d.

PAF

Cdc73

S.c.

transcription elongation

direct

WW (likely)

PAF

Ctr9

S.c.

transcription elongation

direct

n.d.

PAF

Rtf1

S.c.

transcription elongation

direct

n.d.



Ess1; Pin1

prolyl isomerase

direct

WW

− − capping enzyme

Rtr1 RPAP2 CE

S.c.; H.s. S.c. H.s. Mam

CTD phosphatase CTD phosphatase mRNA capping

direct direct direct

n.d. n.d. GTase NT

capping enzyme

Ceg1

S.c.

direct

capping enzyme

Cgt1

C.a.

capping enzyme

Pce1

S.p.

capping enzyme

Pct1

S.p.

U1 snRNP

Prp40

S.c.

mRNA capping (RNA guanylyltransferase) mRNA capping (RNA guanylyltransferase) mRNA capping (RNA guanylyltransferase) mRNA capping (RNA triphosphatase) splicing

U2 snRNP −

U2AF65 FBP11 (HYPA)

H.s. H.s.

− − −

SCAF8 SR proteins CA150 (TCERG1)

CstF

COMPASS (Set1C) SET1A/B MLL1/2

n.d.

S.c.

KMT

Wdr82 Menin

H.s. H.s.



Set2 (KMT3)



198, 244, 394, 456 211, 452 200, 201 200 202 126 457 35, 197, 207, 208 203, 427, 428

P-Ser5; P-Ser2 P-Ser2/ P-Ser5 P-Ser2/ P-Ser5 P-Ser2/ P-Ser5 P-Ser5

428

54 55 188, 226, 288

n.d.

P-Ser5 P-Ser7 P-Ser2; P-Ser5 P-CTD

direct

GTase NT

P-Ser5

225

direct

n.d.

P-Ser5

291

direct

n.d.

P-Ser5

291

direct

WW; FF

206, 236

splicing splicing

direct direct

RRM FF

H.s. Mam H.s.

splicing splicing transcription/splicing

direct n.d. direct

CID n.d. FF

CstF50

H.s.

cleavage/polyadenylation

direct

n.d.

CPSF CFIA

n.d. Pcf11

H.s. S.c.

cleavage/polyadenylation cleavage/polyadenylation

n.d. direct

n.d. CID

P-Ser2/ P-Ser5 P-CTD P-Ser2/ P-Ser5 P-CTD P-CTD P-Ser2/ P-Ser5 CTD; P-CTD P-CTD P-Ser2

CFIA CFIA CPF CPF CPF Rat1-Rai1-Rtt103 Nrd1 complex Nrd1 complex

Rna14 Rna15 Pta1 Yhh1 Ydh1 Rtt103 Nrd1 Sen1

S.c. S.c. S.c. S.c. S.c. S.c. S.c. S.c.

cleavage/polyadenylation cleavage/polyadenylation cleavage/polyadenylation cleavage/polyadenylation cleavage/polyadenylation termination termination termination (DNA/RNA helicase)

n.d. indirect n.d. direct direct direct direct n.d.

n.d. n.d. n.d. n.d. n.d. CID CID n.d.

P-CTD P-CTD P-CTD P-CTD P-CTD P-Ser2 P-Ser5 P-Ser2


412 412 412 140, 192, 230−232

186, 289

243 235 221, 222 315 234 180, 346 180 181, 212, 214−216, 224 181 181 189 183 182 184, 220 219 376, 377, 458


Table 3. continued CTD binding protein complex

factor

species

protein function

direct/indirect

interaction domain

Integrator

n.d.

H.s.

snRNA 3′ end processing

n.d.

n.d.

TREX

Yra1

S.c.

mRNA export

direct

RRM

SAGA and TREX2 −

Sus1 PSF and p54nrb/NonO

S.c. H.s.

n.d. n.d.

n.d. n.d.



Npl3

S.c.

direct

n.d.



Hrr25

S.c.

transcription/mRNA export multifunctional (nuclear processes) multifunctional (nuclear processes) kinase

direct

n.d.

− − −

Asr1 Rsp5 RECQ5

S.c. S.c. H.s.

RNAPII E3 ligase RNAPII E3 ligase DNA helicase

n.d. direct direct

n.d. WW SRI

− −

BRCA1 RPRD1A; RPRD1B and RPRD2 PCIF1

Mam H.s.

tumor suppressor unknown

n.d. n.d.

H.s.

unknown

direct



P-CTD preference

refs

P-Ser2/ P-Ser7 P-Ser2/ P-Ser5 P-CTD CTD; P-CTD P-Ser2

356

177, 459

n.d. CID

P-Ser2/ P-Ser5 P-Ser5 CTD; P-Ser2 P-Ser2/ P-Ser5 P-CTD n.d.

460 223

WW

P-CTD

461, 462

177, 190 204 185, 324 199

205 193, 206, 228, 229 245−247

a

S.c., Saccharomyces cerevisiae; H.s., Homo sapiens; D.m., Drosophila melanogaster; Mam, mammalian; C.a., Candida albicans; S.p., Schizosaccharomyces pombe; n.d., nondetermined; KMT, lysine methyltransferases; KAT, lysine acetyltransferases; KDAC, lysine deacetylases; SRI, Set2 Rpb1 interaction; SH2, Src homology 2; WW, tryptophan tryptophan; GTase NT, guanylyltransferase nucleotidyltransferase; FF, phenylalanine phenylalanine; RRM, RNA recognition motif; CID, CTD-interacting domain; P-CTD, phosphorylated CTD (used when the specificity is not determined in more detail).

CTD. These nonconsensus repeats contain amino acids that increase the potential repertoire of CTD modifications in these organisms. One documented example is the methylation of an arginine at position 7 of a nonconsensus repeat in human cells (Arg1810) by the coactivator-associated arginine methyltransferase 1 (CARM1).9 Interestingly, CARM1-dependent Arg1810 methylation is inhibited by CTD phosphorylation in vitro, suggesting that it occurs prior to initiation. Also, this modification seems to affect the expression levels of snRNAs and snoRNAs specifically.9 Since these small RNAs are heavily modified, it is tempting to speculate that this noncanonical CTD modification couples the transcription of these noncoding RNAs with their processing in mammalian cells. In addition to Arg1810 methylation, Li et al. reported recently that the HECT domain E3 ubiquitin ligase Wwp2 interacts with the CTD in murine cells and ubiquitylates six lysine residues in nonconsensus CTD repeats.172 While this CTD ubiquitylation was shown to target RNAPII for ubiquitin-mediated degradation,172 it remains to be seen whether CTD ubiquitylation occurs during the CTD phosphorylation cycle. The human CTD contains several lysines, notably at position 7 of distal repeats (see Figure 1), which are potential sites for acetylation, methylation, and ubiquitylation. Except for lysine ubiquitylation, these modifications have not yet been described, but it would not be a very wild guess to imagine that they will be in the near future. It is worth mentioning that a fully consensus CTD in mammalian cells supports normal growth and cell viability, suggesting that nonconsensus repeats do not play an essential role, at least at the cellular level.173

3. READING THE CTD CODE

As mentioned in the Introduction, the complex and dynamic CTD phosphorylation pattern orchestrated along genes serves as a dynamic landing pad for CTD-associated proteins, allowing for the coupling of transcription with other nuclear processes. The biological function of the CTD code will be discussed in section 4. Here, we will briefly review the molecular mechanisms used by CTD-interacting factors to bind specific phospho-CTD epitopes. For more details on that topic, we refer the readers to recent reviews.1,174−176

3.1. Approaches To Identify CTD-Binding Proteins and Characterize Their Interactions with the CTD

Several approaches have been described to identify proteins that interact with the CTD or with specific phosphorylated forms of the CTD. These include the use of immobilized in vitro phosphorylated CTD substrates for the capture of interactors from cell extracts177−185 or purified components,186−190 yeast two-hybrid screens,183,191−193 protein−protein photo-cross-linking,194 competition binding with anti-CTD antibodies,195 coimmunoprecipitations,196−198 peptide pulldowns,193,199 ChIP assays in CTD modifying enzyme mutants,184,196,200−204 far Western blotting,205,206 reverse far Western blotting,198 and affinity purification.204,205 These interactions were sometimes quantitatively evaluated using chemical shift perturbation,207 nuclear magnetic resonance (NMR), fluorescence anisotropy,35,208,209 and surface plasmon resonance (SPR).177,210,211 Finally, several of these interactions have been characterized at the atomic level by X-ray crystallography or NMR (see section 3.1). In a recent review, Zhang et al. reported a list of 27 yeast factors characterized as CTD-binding proteins.176 Most of these proteins are involved in either transcription, mRNA processing, mRNA transport, or chromatin modification. This list could be extended to more than a hundred yeast proteins if one includes the noncharacterized interactors identified by Phatnani et al. in a systematic assay looking for phospho-CTD interactors.177 The number of CTD interactors increases even further when considering higher eukaryotes. An updated list of CTD interactors including all organisms is compiled in Table 3. The characterization of these CTD-binding proteins at the atomic level has revealed that a surprisingly diverse set of protein domains can mediate interactions with the CTD. This likely reflects the fact that the CTD is not an organized domain on its own but is rather able to adapt to its protein partners by adopting different local folding conformations.212,213 Quite strikingly, with the exception of the CID domain of Nrd1 (ref 175) and the catalytic domain of Ssu72,144 all CTD-interacting proteins bind the CTD with prolines in the trans conformation, with the prediction that the cis isomeric state would create steric clashes. This highlights the potential key importance of proline isomerization for the function of the CTD. Here, we shall review the different protein domains that have been characterized as able to mediate CTD or phospho-CTD interactions.

3.1.1. CID Domain. CID (CTD-interacting domain) is one of the most common and best-studied types of CTD-binding domains. CID domains are present on several CTD-interacting proteins including the yeast termination factors Pcf11, Nrd1, and Rtt103,184,212,214−220 the human RNA processing factor SCAF8,221,222 and RPRD proteins (RPRD1A, RPRD1B, and RPRD2).223 The structure of several CID domains has been solved using both crystallography and NMR.175,212,216,219,220,222 All structures revealed a common general organization for the CID domain. Notably, CIDs are made of eight α-helices in a right-handed superhelical arrangement stabilized by a large and conserved hydrophobic core in the center of the molecule. The presence of CTD peptides in some of these structures, as well as peptide affinity and NMR studies, collectively revealed common principles for CTD binding to CIDs. First, all CIDs seem to bind CTD peptides in a β-turn conformation. Second, all CTD prolines are in the trans conformation, with the exception of the structure of Nrd1 CID interacting with P-Ser5. Third, Tyr1 and Pro3 make important contacts with the CID-containing protein. Last, Tyr1 phosphorylation is incompatible with binding of the CTD with all CIDs analyzed.

Interestingly, different CIDs preferentially bind to CTD peptides with different phosphorylation states. Pcf11 binds nonphosphorylated CTD, but its affinity is slightly increased by Ser2 phosphorylation.220 Rtt103 shows higher affinity than Pcf11 for the P-Ser2 CTD, and this difference is due to the presence of one arginine residue at position 108 in Rtt103, which makes direct contacts to the Ser2 phosphate.220 On the other hand, Nrd1 strictly requires phosphorylation of Ser5 for binding.175,219 Finally, SCAF8 can accommodate different phosphorylated forms of the CTD with different affinities.222 There is therefore no common theme as to how different CIDs achieve specificity. Accordingly, the specificity involves residues that are not conserved among the different CIDs. While phosphate groups sometimes increase affinity (and therefore specificity) by allowing additional contacts with CID residues, at least one example suggests that serine phosphorylation participates in the binding indirectly. Indeed, while phosphorylation of Ser2 increases the affinity of Pcf11 for the CTD, it does not make any direct contact with the protein.216 It was originally proposed that P-Ser2 would stabilize the β-turn by interactions with CTD-Thr4, but further experiments have proved this to be unlikely.212 How P-Ser2 can stimulate binding to the Pcf11 CID therefore remains unclear. Interestingly, the CID of Pcf11 has recently been shown to bind RNA, an interaction that may be involved in transcriptional termination.210,224

3.1.2. GTase NT Domain. The crystal structure of the RNA guanylyltransferase (GTase) of the Candida albicans capping enzyme Cgt1 (the ortholog of the S. cerevisiae Ceg1) bound to a P-Ser5 CTD peptide was solved by Fabrega et al.225 The structure shows binding of the GTase nucleotidyltransferase (NT) domain to two nonconsecutive CTD repeats (a triheptad peptide repeat was used). Since the middle CTD repeat is not making substantial contacts with the enzyme, it has been suggested that the GTase NT domain may contact two remote repeats, therefore creating a CTD loop that may have functional consequences. This feature, however, does not seem to be conserved in mammals. The recent crystal structure of the GTase NT domain of the murine capping enzyme, Mce1, strikingly shows that the CTD interacts with the murine and yeast GTase NT domains in completely different configurations.226 In the murine structure, only one CTD repeat seems to interact with the enzyme. Moreover, the amino acids important for CTD binding in the C. albicans structure are conserved among fungi but not among mammalian GTases and vice versa. These two different modes of CTD binding are confirmed by complementation assays using site-specific mutants.226 Notwithstanding this spectacular difference in binding conformation, both GTase NT domains make critical contacts with the same CTD residues, namely P-Ser5 and the side chain of Tyr1, suggesting an intriguing combination of divergent evolution and convergent evolution.44

3.1.3. WW Domain. WW domains are short (40 amino acids long) protein modules that owe their name to the presence of two signature tryptophan (W) residues spaced apart by 20−22 amino acids. The structure of the WW domains is characterized by a compact antiparallel three-stranded β-sheet. Several WW domain containing proteins have been shown to interact with the CTD repeats. Importantly, the binding affinity of WW domains for their targets is often modulated by serine or threonine phosphorylation.227 For example, the RNAPII E3 ubiquitin ligase Rsp5 (Nedd4 in humans) preferentially binds nonphosphorylated CTD193,206,228 and in vitro functional assays suggest that its binding is occluded by phosphorylation of Ser5 (but not Ser2).229 On the other hand, studies have shown that the proline isomerase Ess1/Pin1 preferentially binds to the P-Ser5 CTD.140,192,230−232 Finally, the splicing factor Prp40 binds the phosphorylated CTD but its phospho-epitope was not determined in detail.206 Different WW domains can therefore interact with diverse phospho-CTD epitopes. Elucidation of the structure of Pin1 in complex with a CTD peptide doubly phosphorylated on Ser2 and Ser5 provided a structural explanation for the preferential binding of Pin1 to P-Ser5 CTD repeats.232 Indeed, the phosphate group on Ser5 makes several contacts with key residues in the Pin1 WW domains, while the one on Ser2 exhibits a greater flexibility. Interestingly, the Pin1 WW domain binds CTD peptides with both Pro3 and Pro6 in the trans conformation.

3.1.4. FF Domain. FF domains are 60 amino acid long peptides characterized by two conserved phenylalanines. They are present in several proteins, often accompanied by WW domains, and are known to bind phosphoproteins.233 FF domains are found in a few CTD binding proteins including the splicing factors CA150/TCERG1, FBP11/HYPA, and Prp40.206,234−236 The structure of several FF domains has been solved, revealing a three or four helix bundle structure.235−239 The FF domains of CA150/TCERG1 and FBP11/HYPA mediate the interaction of these proteins with the P-Ser2/P-Ser5 CTD, while the FF domain of Prp40 enhances the binding mediated by the companion WW domain. The interactions made by single FF domains are generally weak, but FF-containing proteins are thought to build significant affinity for their targets by accumulating several FF domains240 or by combination with other domains (often WW domains).206

3.1.5. RRM Domain. Yra1 (Aly in humans) is an essential protein involved in mRNA export (ref 241 and references therein). It contains an RNA recognition motif (RRM) for which the RNA-binding activity is controversial. MacKellar and Greenleaf recently showed that the Yra1 RRM allows for weak RNA binding and, more surprisingly, for binding to CTD phosphorylated on both Ser2 and Ser5. Importantly, they also showed that the RRM domain mediates the recruitment of Yra1 to genes in vivo.190 Strikingly, the Yra1 residues required for RNA binding and P-CTD binding are not the same. Consequently, RNA binding and P-CTD binding do not compete with each other. This strongly suggests that the RRM binds to both RNA and P-CTD using different mechanisms and most likely using distinct interaction surfaces. Other noncanonical RRMs have been shown to mediate protein−protein interactions.242 Notably, the RRM-containing splicing factor U2AF65 was shown to bind P-CTD in vitro, although the importance of the RRM in this activity was not determined.243 RRM therefore may represent yet another protein domain allowing some proteins to reach the CTD in vivo.

3.1.6. SRI Domain. SRI stands for Set2 Rpb1 interaction. This domain mediates the interaction of the histone methyltransferase Set2 with the Rpb1 CTD phosphorylated on both Ser2 and Ser5.177,198,244 An SRI domain also allows for the recruitment of RECQ5, a human DNA helicase involved in the maintenance of genome stability, to the elongating RNAPII.245−247 The solution structures of the SRI domain from the yeast and human Set2 proteins in complex with CTD peptides were solved using NMR.211,244 Both structures are remarkably similar, depicting the SRI domain as a left-handed three-helix bundle, a structure unique among the known different CTD-binding domains. A minimum of two consecutive CTD repeats, each doubly phosphorylated on Ser2 and Ser5, are required for optimal binding. The four phosphoserines make contacts with the two first helices of the SRI domain, creating a relatively long phospho-epitope. Interestingly, Li et al.211 made the surprising observation that diheptad peptides phosphorylated with the Ser2,Ser5,Ser2,Ser5 or Ser5,Ser2,Ser5,Ser2 conformations both bind the SRI domain with similar affinities, suggesting some flexibility in the mechanism by which the SRI binds the CTD. They further propose that three consecutive phosphates in the 2,5,2 conformation constitute a core SRI binding element that can be enhanced by binding an additional P-Ser5, either upstream or downstream of the core domain.

3.1.7. Tandem SH2 Domain. In higher eukaryotes, Src homology 2 (SH2) domains are known to bind to P-Tyr-containing peptides (see ref 248 for a review). While they are present in many signaling proteins in multicellular organisms, only one SH2 domain containing protein is known in yeast, the histone chaperone Spt6.249 Structural studies have shown that the P-CTD binding domain of Spt6 actually contains two noncanonical SH2 domains organized in tandem and intimately packed against each other, in contrast to other known tandem SH2 domains.207−209,250,251 The tandem SH2 domain of mammalian and yeast Spt6 is required for binding of Spt6 to the P-Ser2 CTD in vitro.197,208 Because it binds a phosphoserine as opposed to a phospho-tyrosine, the Spt6 tandem SH2 domain was initially considered to be atypical. Two recent studies, however, showed that it actually binds quite promiscuously, albeit poorly, to different P-CTD peptides and that P-Tyr1 (rather than P-Ser2) is the preferred CTD binding epitope.35,207 This low affinity/low specificity characteristic of the Spt6 tandem SH2 domain has led Liu et al. to speculate that it works as a sensor of CTD phosphorylation levels.207 Regardless of the mechanism, and given the weak affinity of its SH2 domain for the P-CTD, Spt6 almost certainly requires additional interactions for its recruitment to genes in vivo. Indeed, ChIP experiments have shown that a mutant spt6 lacking the tandem SH2 domain is still recruited to transcribed genes, although with less efficiency.25
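The SRI phospho-epitope logic summarized in section 3.1.6 (a 2,5,2 core that can be extended by an additional P-Ser5 on either side) can be illustrated with a minimal Python sketch. The list encoding and the function name `find_sri_core` are assumptions introduced here for illustration only. They are not taken from Li et al., they ignore the spacing between residues, and the function is a pattern check, not a predictor of binding affinity.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: phosphoserines listed in N- to C-terminal order,
# 2 for P-Ser2 and 5 for P-Ser5 (for example [2, 5, 2, 5] for a diheptad).
CORE = [2, 5, 2]  # proposed core element: P-Ser2, P-Ser5, P-Ser2


def find_sri_core(phosphoserines: List[int]) -> Optional[Tuple[int, bool, bool]]:
    """Locate the first 2,5,2 run in an ordered list of phosphoserines.

    Returns (start_index, upstream_pser5, downstream_pser5), or None when
    no 2,5,2 run is present. The two booleans report whether a P-Ser5
    directly flanks the core on the N-terminal or C-terminal side.
    """
    n = len(CORE)
    for i in range(len(phosphoserines) - n + 1):
        if phosphoserines[i:i + n] == CORE:
            upstream = i > 0 and phosphoserines[i - 1] == 5
            downstream = i + n < len(phosphoserines) and phosphoserines[i + n] == 5
            return i, upstream, downstream
    return None
```

Under this encoding, both diheptad arrangements mentioned in section 3.1.6, `[2, 5, 2, 5]` and `[5, 2, 5, 2]`, contain the core, flanked by a P-Ser5 downstream and upstream, respectively.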

4. FUNCTIONS OF THE CTD CODE

In this section, we will review the literature addressing how the dynamic CTD phosphorylation cycle allows for the coupling of transcription to other nuclear processes. For the sake of clarity, we will separate this section into three parts. First, we will review the role of the CTD in regulating transcription (section 4.1) before discussing its role in RNA processing, termination, and export (section 4.2). Then, we will review the emerging function of the CTD in chromatin biology and its central role in regulating cryptic transcription (section 4.3). Since this aspect was not extensively reviewed before, we will dedicate a significant portion of this section to it.

4.1. Does the CTD Regulate Transcription?

While CTD phosphorylation is intimately linked to transcription, its role in regulating the transcription reaction is rather controversial. In this section we will review the evidence in favor of and against a role for the CTD in regulating the different transcriptional steps.

4.1.1. Initiation. The CTD is required for RNAPII to interact with the Mediator complex, a conserved multiprotein coregulator bridging transcriptional activators to RNAPII.252 Mediator is essential for viability in yeast253 and the CTD is required for its function.254−256 Linking Mediator to RNAPII is therefore likely one of the key functions of the CTD. Genetic evidence strongly supports this model. Indeed, phenotypes associated with partial truncation of the CTD in yeast can be suppressed by mutations in several Mediator components.255,257 While essential in vivo, the CTD and Mediator are not required for transcription in some in vitro systems. For example, CTD-less RNAPII can be efficiently incorporated into the PIC and sustain transcription in vitro.258,259 Moreover, PIC can assemble and transcription can be supported by highly purified general transcription factors in vitro in the absence of Mediator (refs 18 and 260, to cite only two). Taken together, this data supports the idea that the CTD and Mediator are not fundamentally required for early steps of transcription but rather play an essential regulatory role in vivo.

As mentioned above, RNAPII is recruited to promoters in its nonphosphorylated form. Several lines of evidence suggest that dephosphorylation of the CTD prior to PIC assembly is a key event that may even be regulated in vivo. For example, phosphorylated RNAPII does not support PIC formation and transcription in vitro.18,19,32,33 Also, excess free and hyperphosphorylated RNAPII can be detected in Fcp1 knockdown cells.163 Taken together, this data suggests that CTD phosphorylation/dephosphorylation prior to PIC formation may play a regulatory role in vivo. Interestingly, Mediator contains a kinase module, composed of the CTD kinase Srb10 (Cdk8 in mammals), its associated cyclin, and two other proteins.261 Phosphorylation of the CTD by Srb10 prior to PIC formation has indeed been shown to prevent CTD−Mediator interactions, and it was proposed as a mechanism to negatively regulate PIC formation at certain genes.72

While the CTD needs to be dephosphorylated prior to PIC formation,18,19,32,33 it is quickly phosphorylated on Ser5 and Ser7 before the formation of the first phosphodiester bond.19 Because Mediator binds the nonphosphorylated CTD,195,254,256 phosphorylation of Ser5 and Ser7 by TFIIH within the PIC was proposed to allow for RNAPII to break contact with Mediator and other PIC components. Indeed, in vitro experiments have shown that phosphorylation of RNAPII by TFIIH triggers the dissociation of Mediator from RNAPII both in the context of free RNAPII262 and within a PIC.263 Also, phosphorylated RNAPII is free of Mediator in vivo.195 Finally, the recent structure of RNAPII in complex with the Mediator provided some rationale for how CTD phosphorylation on Ser5 may prevent RNAPII−Mediator association.264

Coupled with the fact that steady-state mRNA levels are extremely low in kin28 mutants,265 these in vitro studies led to the broadly accepted interpretation that TFIIH-dependent CTD phosphorylation is required for promoter escape and productive transcription. As it was later shown that low mRNA levels in kin28 mutants were the consequence of defects in mRNA capping,43,266,267 the relative importance of TFIIH-dependent CTD phosphorylation for transcription has been put into question. Indeed, in vivo ChIP assays looking at RNAPII occupancy in kin28 mutants have led to quite variable results, ranging from no to substantial effect on RNAPII occupancy, depending on the genes being investigated and the actual mutant used.22,49,196,263 The most recent genome-wide ChIP data, using a very specific ATP analogue sensitive kin28 mutant, however, did not reveal any strong change in RNAPII occupancy along genes, although a slight pileup in the 5′ end was sometimes observed.24,26,27 This data suggests that if CTD phosphorylation is required for breaking contacts with the PIC in vivo, this phenomenon is either not strictly required for promoter escape and productive elongation or redundant activities are present. In agreement with such a possibility, phosphorylation by CTDK1 (the major Ser2 kinase) was also shown to dissociate Mediator from RNAPII in some but not all in vitro assays.262,263

4.1.2. Pausing. In higher eukaryotes, RNAPII often experiences a strong pause 20−60 bp after initiation. While this was originally described at heat shock genes in Drosophila,268−272 it has now become clear that this phenomenon is quite generalized to perhaps most genes and conserved in mammalian cells. Several lines of evidence that will not be extensively reviewed here have led to the generally accepted view that the multisubunit transcription elongation factor NELF (for negative elongation factor) is the central player in promoting pausing (see refs 41 and 273 for recent reviews on pausing). It is also very well documented that the positive transcription elongation factor, P-TEFb, is critical for relieving pausing. Because P-TEFb is a CTD kinase, it was originally proposed to mediate the transition from the paused to the productive elongation state via phosphorylation of the CTD.274 In addition to the CTD, however, P-TEFb has several other substrates including both DSIF (DRB sensitivity inducing factor) and NELF. Phosphorylation of DSIF turns it from a repressor to an activator of elongation,275,276 while phosphorylation of NELF causes its dissociation from RNAPII.277 While this data strongly demonstrates the key role of P-TEFb in releasing RNAPII from the pause, it provides no support for a role of CTD phosphorylation in that phenomenon. Separation-of-function mutations would help elucidate the contribution of P-TEFb-dependent CTD phosphorylation in pausing and elongation. Recently, Cdk7 (TFIIH kinase) was shown to regulate pausing by stimulating the recruitment of DSIF (and the eviction of TFIIE) and by activating P-TEFb.128 Therefore, while the role of Ser2 phosphorylation by P-TEFb in the regulation of pausing is questionable, a role for Ser5 phosphorylation in this process is beginning to be appreciated.

4.1.3. Elongation. Because phosphorylation of Ser2 occurs during elongation, it is often assumed to play a role in this phase of the transcription cycle. Early evidence for such a role included the fact that ctk1 and bur1 mutants are sensitive to the nucleotide-depleting drug 6-azauracil (6AU).60 Because the sensitivity to 6AU is not always indicative of a defect in transcription elongation, however, these experiments have to be interpreted carefully.278 Additional evidence included the fact that ctk1 and bur1 mutants have genetic interactions with elongation factors.279,280 Here again, this data has to be interpreted with caution. These genetic interactions may highlight the fact that elongation factors and CTD kinases share participation in common processes such as mRNA processing. Finally, the yeast CTDK1 was shown to stimulate elongation in in vitro assays using crude extracts.281 Several reports have shown that RNAPII occupancy or distribution, as measured by ChIP, is not affected by mutations in CTK1 or BUR1/BUR2, suggesting that phosphorylation by these kinases (as discussed above for Kin28) does not play an essential function for transcriptional elongation in vivo.27,282,283 Others, however, have reported that bur1 mutations do have a severe effect on RNAPII distribution along genes.66 The reason for that discrepancy is unclear but may involve the actual mutants used or the genes being investigated. In addition, it remains possible that functional redundancy between CTD kinases masked their role in elongation in some studies. Measuring the elongation rate in vivo using CTD mutations would allow a more direct test of the role of CTD phosphorylation in elongation, but the inviability of several of these CTD mutants makes these experiments technically challenging. Using an α-amanitin-resistant Rpb1 strategy, the Bensaude group recently showed that RNAPII carrying a CTD where all Ser2’s are replaced by alanines was able to transcribe across an artificial gene.284 Although the data suggested that elongation rate was slightly diminished in the mutant, the system used did not allow for a very robust measure of elongation kinetics.

4.1.4. Termination. Since this step of the transcription cycle is intimately coupled to RNA 3′ end processing, the role of the CTD in transcriptional termination will be discussed in section 4.2.3.

4.2. CTD Coordinates RNA Processing, Transcriptional Termination, and mRNA Export

4.2.1. mRNA Capping. The 5′ end of mRNA is “capped” by the addition of an m7GpppN structure via the sequential action of three enzymatic activities. First, the 5′ triphosphate terminus of the pre-mRNA is cleaved to a diphosphate by RNA triphosphatase; second, RNA guanylyltransferase adds a GMP cap to the diphosphate end; last, the GpppN cap is methylated by RNA (guanine-N7) methyltransferase (reviewed in refs 285 and 286). It has long been known that capping is coupled to transcription via the phosphorylation of the CTD287 (see ref 1 for a review). In the past few years, however, the network of

interaction of the guanylyltransferase domain with P-Ser2 and P-Ser5188 (Figure 6A). In S. cerevisiae, where the triphosphatase and guanylyltransferase activities are part of the Cet1 and Ceg1 proteins, respectively, recruitment to RNAPII requires the interaction of Ceg1 with P-Ser5 (Figure 6B). Cet1, which does not bind the CTD on its own,186,187,288 is recruited via an interaction with Ceg1,289 although other mechanisms are likely involved in the recruitment of Cet1.290 Finally, in S. pombe, the two proteins carrying these activities (Pct1 and Pce1) do not interact with each other, but both independently interact with P-Ser5291 (Figure 6C). This data illustrates the key role played by phosphorylation of Ser5 in linking capping to transcription. This point was further demonstrated in a very elegant experiment by Schwer and Shuman, who showed that fusing the mammalian capping enzyme (carrying both the triphosphatase and guanylyltransferase activities) to an otherwise lethal rpb1 allele in which Ser5 is mutated to alanine in all repeats restores viability.43 This experiment not only highlights the importance of P-Ser5 for capping but also shows that the recruitment of the capping enzyme is the sole essential function of P-Ser5. In addition to its prime role in recruitment, Ser5 phosphorylation also stimulates the guanylyltransferase activity.187,188,292,293

4.2.1.2. DSIF and P-TEFb. In addition to the importance of P-Ser5, recent data, mostly obtained in S. pombe, has highlighted the contribution of other factors, notably DSIF and P-TEFb, in coupling capping to transcription. The Spt5 subunit of DSIF harbors a C-terminal region (CTR) containing repetitions of the TPAWNSGSK nonapeptide. Not unlike the RNAPII CTD, the Spt5 CTR is phosphorylated by P-TEFb (Cdk9 in S. pombe and Bur1 in S. 
cerevisiae).275,294−299 Quite interestingly, the CTR interacts with both Pct1 and Pce1, the proteins respectively carrying the triphosphatase and guanylyltransferase activities in fission yeast,300,301 suggesting that Spt5 cooperates with the CTD in the recruitment of capping enzymes to RNAPII302 (Figure 6C). While most of this network was worked out in S. pombe, evidence suggests that similar mechanisms exist in budding yeast and mammalian cells (Figure 6A,B). For example, Spt5 in both human and budding yeast physically interacts with capping enzymes.293,303 Also, DSIF can stimulate capping in vitro304 and S. cerevisiae spt5 alleles show genetic interactions with capping enzyme mutants.303 More recently, the first evidence for the mechanism by which the third capping enzyme (the methyltransferase) interacts with RNAPII also emerged from work in fission yeast. The Fisher group indeed showed that Cdk9 interacts with305 and recruits118 the RNA (guanine-N7) methyltransferase Pcm1 to genes in S. pombe (Figure 6C). Interestingly, Cdk9 mediates this recruitment via a C-terminal extension (CTE) distinct from its catalytic domain.119 Moreover, it is worth mentioning that Cdk9 also interacts with the triphosphatase Pct1,301 further reinforcing the idea that P-TEFb works together with DSIF and the RNAPII CTD to allow for a tight coupling between transcription and pre-mRNA capping. The first enzyme involved in capping is therefore recruited to RNAPII via at least three different mechanisms (RNAPII-CTD-P-Ser5, Spt5-CTR, and Cdk9-CTE) (Figure 6C). As with CTD phosphorylation, Spt5 stimulates the capping reaction allosterically.293

4.2.1.3. Taking a Pause To “Cap”. Capping was proposed to be functionally linked to promoter-proximal pausing. Indeed, pausing probably contributes to capping by allowing more time for this reaction to take place304,306,307 or by restricting the

interactions leading to the coupling of capping with transcription has grown considerably and some notable differences between species have emerged (see Figure 6).

4.2.1.1. Key Role of CTD Phosphorylation in Capping. The recruitment of the triphosphatase and guanylyltransferase activities of the capping enzyme to RNAPII is well characterized. In mammalian cells, where these two activities are part of the same protein (CE), the recruitment involves

Figure 6. Recruitment and regulation of the capping enzyme by the CTD and DSIF. Interactions known to mediate the recruitment and regulation of the capping enzyme in different organisms are shown. Physical interactions are depicted as thick double-headed gray arrows, while activations and repressions are shown as thin black arrows. (A) Mammalian cells. The NT domain of the guanylyltransferase activity of the capping enzyme interacts with both P-Ser2 and P-Ser5. The capping enzyme also interacts with hSpt5, a subunit of DSIF. Both P-Ser5 and hSpt5 stimulate the guanylyltransferase activity. (B) S. cerevisiae. In budding yeast, Ceg1 (the protein carrying the guanylyltransferase activity) binds P-Ser5 but its activity is inhibited by this mark. Cet1 (the protein carrying the triphosphatase activity) is recruited by virtue of its binding to Ceg1. Evidence suggests that DSIF also interacts with the capping enzyme in budding yeast, but the exact interactions have not been worked out. (C) S. pombe. In fission yeast, both Pct1 and Pce1 (the triphosphatase and guanylyltransferase, respectively) are independently recruited to P-Ser5 and the guanylyltransferase activity is activated by this mark. In addition, several interactions with DSIF and P-TEFb have been shown to participate in the recruitment of the capping enzyme in this organism (see the text for details).
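
The recruitment routes summarized in Figure 6 can be held in a small lookup structure, which may make the cross-species comparison easier to follow. The Python sketch below is illustrative only: the class name `CappingActivity`, its field names, and the helper `direct_ctd_binders` are hypothetical, and the entries encode just the recruitment partners named in the caption and text (P-Ser2/P-Ser5, Spt5, Cdk9, Ceg1), not a complete interaction map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CappingActivity:
    organism: str
    protein: str
    activity: str
    partners: tuple  # recruitment partners named in the review (assumed complete only for this sketch)


ACTIVITIES = (
    # Mammalian CE carries both activities in a single protein.
    CappingActivity("mammalian", "CE", "triphosphatase + guanylyltransferase",
                    ("P-Ser2", "P-Ser5", "hSpt5")),
    CappingActivity("S. cerevisiae", "Ceg1", "guanylyltransferase", ("P-Ser5",)),
    # Cet1 does not bind the CTD on its own; it is recruited through Ceg1.
    CappingActivity("S. cerevisiae", "Cet1", "triphosphatase", ("Ceg1",)),
    CappingActivity("S. pombe", "Pct1", "triphosphatase",
                    ("P-Ser5", "Spt5 CTR", "Cdk9")),
    CappingActivity("S. pombe", "Pce1", "guanylyltransferase",
                    ("P-Ser5", "Spt5 CTR")),
)


def direct_ctd_binders(organism):
    """List proteins of one organism that have at least one phospho-CTD partner."""
    return [a.protein for a in ACTIVITIES
            if a.organism == organism
            and any(p.startswith("P-Ser") for p in a.partners)]


if __name__ == "__main__":
    for org in ("mammalian", "S. cerevisiae", "S. pombe"):
        print(org, direct_ctd_binders(org))
```

Keeping the partners as plain strings is a deliberate simplification; it is enough to show that only Cet1 lacks a direct phospho-CTD contact in this encoding.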

however, is best established for U2AF, the factor that recognizes the 3′ splice site and the nearby polypyrimidine tract via its U2AF35 and U2AF65 subunits, respectively. Recent studies by the Manley and Bensaude groups indeed showed that U2AF65 binds the phosphorylated CTD directly.243,284 Gu et al. further dissected this interaction to show that it requires phosphorylation on Ser2.284 Importantly, each group respectively provided evidence that this interaction stimulates splicing in vitro and in vivo.243,284 Because Prp40, a member of U1 (the snRNP responsible for the recognition of the 5′ splice site), also binds the CTD, Manley and colleagues proposed that the CTD promotes splicing by tethering the 5′ splice site near the RNA exit channel of RNAPII. This would allow the newly transcribed 3′ splice site, freshly loaded with U2AF by the phosphorylated CTD, to be in close proximity to the 5′ site, leading to efficient splicing.243,325 Although very elegant, this model still needs to be formally tested and was recently challenged by the findings that the splicing function of Prp40 does not require CTD binding322 and that the recruitment of the U1 snRNP to transcribed sites is P-Ser2-independent.284 The U2AF65-dependent coupling involves the recruitment of Prp19,243 a factor involved in downstream events in the splicing reaction,326 suggesting that alternative mechanisms may be involved. Regardless of the mechanism involved, this study clearly demonstrates that a CTD-splicing factor interaction can stimulate transcription-coupled splicing. The key role played by P-Ser2 in transcription-coupled splicing is echoed in a remarkable study by Hargreaves et al.,31 who identified a set of immune response genes in macrophages that are regulated by splicing via the phosphorylation of Ser2 by P-TEFb. Quite remarkably, these genes are transcribed in the absence of stimuli but the transcripts are unspliced and quickly degraded. 
This “basal” transcription is carried out by RNAPII phosphorylated on Ser5 but not on Ser2. Upon stimulation, P-TEFb is recruited to these genes, Ser2 becomes phosphorylated, and splicing is enabled. This study not only elegantly supports the importance of P-Ser2 in splicing, but also highlights the recruitment of CTD kinases as a mechanism to regulate gene expression via splicing rather than transcription.

4.2.2.2. Kinetic Model. Above we have discussed how recruitment of splicing factors to the CTD may stimulate splicing in a transcription-coupled manner. In addition to this “recruitment model”, a “kinetic model” has been proposed to account for the effect of transcription on splicing. In the kinetic model, alternative splicing decisions are guided by elongation rate. The idea of “kinetic coupling” was first proposed in 1988,327 but initial evidence for this model was provided 10 years later by Roberts et al.,328 who showed that inducing a transcriptional pause can favor the inclusion of an otherwise skipped exon. In their system, the pause was induced after the skipped exon but ahead of a cis regulatory element involved in exon skipping. They demonstrated that, by delaying the synthesis of the negative element, the pause allowed for a window of opportunity for the spliceosome to assemble, leading to exon inclusion. The kinetic model has since received a lot of support, notably using the fibronectin gene as a model, as recently reviewed.329,330 We will not be extensively covering this topic here since there is limited evidence for a role of the CTD. Certainly, changes in CTD phosphorylation were often reported along with changes in elongation rate causing alternative splicing, but in the absence of clear evidence that elongation rate is affected by CTD phosphorylation, these correlations may not reflect causal relationships. In

distance between the 5′ end of the nascent transcript and the capping enzymes.308 In addition, the joint action of DSIF and P-TEFb was recently proposed to provide an mRNA capping checkpoint. In this model, by coordinating the recruitment and the allosteric activation of the first two steps of the capping reaction, DSIF would therefore take advantage of its pausing activity to initiate capping. Next, by coupling pausing release to the recruitment of the last step of the capping reaction (methyltransferase activity), P-TEFb would ensure that RNAPII does not engage into elongation without completing mRNA capping.301,304 Interestingly, the capping machinery was shown to stimulate early elongation, therefore contributing to the coordination between capping and transcription.309

4.2.2. Splicing. Although splicing and transcription can occur separately, a wealth of evidence argues for a coupling between transcription and splicing. The first hint perhaps came 25 years ago when Beyer and Osheim provided spectacular electron microscopy images of splicing occurring cotranscriptionally in Drosophila embryos.310 Later, transcription and splicing factors were shown to colocalize into discrete nuclear foci.311 While these pioneering experiments showed physical correlation between the two processes, evidence has now accumulated that strongly argues for a functional coupling between splicing and transcription. We refer readers to several excellent reviews for more complete descriptions of this literature.1,308,312,313 Here we will focus on evidence suggesting a role of the CTD in the regulation of splicing and/or the coupling between splicing and transcription.

4.2.2.1. Role of the RNAPII CTD in Splicing. Splicing is physically and functionally coupled to transcription, but does it involve the CTD? Again, a plethora of papers have provided evidence that the CTD plays a role in this coupling. 
Notably, (i) splicing in vivo is compromised by truncation or specific mutations of the CTD,180,284 (ii) exogenous RNAPII stimulates splicing in vitro in a CTD-dependent manner,314 (iii) CTD truncation or exogenous CTD leads to the loss of colocalization between splicing factors and transcription foci in vivo,315,316 and (iv) anti-CTD antibodies and CTD peptides can inhibit splicing in vitro.191 Until recently, the mechanism by which the CTD contributes to splicing remained uncertain because of the lack of well-characterized interactions between splicing factors and the CTD. Very early on, several studies in mammalian cells showed that serine/arginine-rich (SR) proteins associated with hyperphosphorylated RNAPII191,317,318 and, for some of them, regulated alternative splicing in a CTD-dependent manner,319,320 but evidence for direct CTD binding is still lacking. Therefore, the contribution of SR proteins to the coupling of splicing with transcription may still not involve the CTD directly.321 Prp40, a component of the yeast U1 snRNP, was shown to interact with the phosphorylated CTD via its WW domain (see section 3), but a recent study showed that this domain (and therefore CTD interaction) was dispensable for its splicing function in vivo.322 Other splicing factors such as TCERG1 (CA150), FBP11 (HYPA), SCAF4, and SCAF8 are known to bind the CTD,221,222,234,235 but the significance of their interaction with the CTD was either never reported or still remains ambiguous.323 The first splicing factors for which the significance of CTD binding was reported were PSF and p54nrb. Indeed, Rosonina et al.324 showed that the binding of PSF/p54nrb to the CTD is important for their ability to regulate alternative splicing in vivo. The connection to the CTD,

and subsequent proper processing has been shown in yeast using mutants for the Ser2 kinase Ctk1,283,347 in Drosophila using flavopiridol, a highly specific P-TEFb kinase inhibitor,348 and in human cells using an S2A Rpb1 α-amanitin-resistant mutant.284 These observations are consistent with the fact that P-Ser2 increases the affinity of the essential cleavage/polyadenylation factor Pcf11 for the CTD in in vitro experiments, as mentioned in section 3.1.1.181,214,216 Interestingly, recent genome-wide ChIP studies showed that Pcf11 distribution does not directly correlate with P-Ser2 levels but rather peaks downstream at the poly(A) site.24,25 This late recruitment of Pcf11 to the 3′ region of genes suggests, among other possibilities, that the CTD could be masked within the transcribed region and that P-Ser2 becomes accessible to Pcf11 only past the poly(A) site (see section 4.2.3.2 and ref 11 for further discussion on this matter). Moreover, these results point to a function for the poly(A) signal consensus elements in the recruitment of Pcf11, in addition to P-Ser2. In addition to polyadenylated genes, the CTD also contributes to the 3′ end processing of RNAPII transcripts that are not polyadenylated, such as the histone mRNAs and snRNAs. To be matured, the 3′ end of metazoan replication-dependent histone mRNAs requires the recognition of its conserved stem loop by SLBP (stem-loop binding protein) and of its downstream element by the U7 snRNP. These events then allow for the recruitment of the processing machinery for 3′ endonucleolytic cleavage (reviewed in refs 1, 342, and 349). 
Recently, using chicken cells, Hsin and Manley showed that P-Thr4 is necessary for the recruitment of processing factors and subsequent histone mRNA 3′ end formation.69 Moreover, it has been shown that inhibition of the activity of the Ser2 kinase Cdk9 leads to impaired histone mRNA 3′ end processing and reduced phosphorylation of Thr4 in both chicken and human cells.38,69,350 Further analyses will be necessary to address exactly how P-Thr4 facilitates 3′ end processing of histone mRNAs. snRNAs consist of a small group of highly abundant, nonpolyadenylated and noncoding transcripts containing a 3′ box sequence element (reviewed in refs 334 and 351). The CTD has been known for some years to be required for snRNA 3′ end processing in higher eukaryotes.352,353 Furthermore, Baillat et al. identified a multisubunit complex responsible for specific snRNA 3′ end processing, termed the Integrator, which associates with the CTD,354 providing a molecular link between transcription and snRNA 3′ end processing. However, exactly how the CTD mediates snRNA processing remained vague for a while. Advances came from recent research by the Murphy laboratory, which showed that phosphorylation of Ser7 is crucial for Integrator recruitment near the 3′ box sequence and subsequent snRNA 3′ end formation.355 Further studies revealed that a double P-Ser2 and P-Ser7 CTD mark is necessary and sufficient for efficient binding to the Integrator.356 In addition, the recruitment of the Integrator to the CTD has been shown to be assisted by the binding of the Ser5 phosphatase RPAP2 to P-Ser7.55 These results involving RPAP2 in the processing of snRNAs are consistent with an early study showing RPAP2 purified in association with the Integrator and RNAPII.357 Altogether, these studies provide insights into the molecular mechanisms of how the CTD is involved in 3′ end processing of snRNAs in mammalian cells.

4.2.3.2. Transcription Termination. 
RNAPII termination, the least understood process of the transcription reaction, has been shown to be tightly connected to 3′ end formation (see

fact, it appears likely that changes in CTD phosphorylation occur as a consequence of alteration in elongation rate (our unpublished data). One recent study, however, has provided evidence for a role of the CTD in regulation of alternative splicing via a change in elongation rate. Indeed, the Kornblihtt group showed that UV treatment causes changes in alternative splicing of several genes, correlating with CTD hyperphosphorylation-dependent change in transcription rate.331 This represents a rarely reported example where changes in CTD phosphorylation affect transcription rate. This CTD phosphorylation, however, is beyond physiological levels and occurs only in UV-treated cells. Other factors that have been shown to mediate alternative splicing by regulating RNAPII elongation include TCERG1323 and DBIRD.332 In both cases, although the effect involves a physical interaction with RNAPII, the role of the CTD was not clearly demonstrated. 4.2.3. 3′ End Processing, Termination, and Export. It has been known for some time that the RNAPII CTD plays a central role in coupling RNA 3′ end processing to transcription termination. Recent discoveries, though, suggest that additional complexities are involved in this cotranscriptional event. While the most extensively studied are polyadenylated mRNAs, over the past few years several laboratories have shown that the CTD also functions in the 3′ end processing/termination of nonpolyadenylated RNAs such as snRNAs, snoRNAs, and cryptic unstable transcripts (CUTs) (reviewed in refs 1, 333, and 334). In addition, multiple lines of evidence argue that RNAPII transcripts follow distinct termination pathways, depending on, among other cues, the presence of specific RNA recognition sequences and the CTD phosphorylation status. 
Although termination remains one of the least understood aspects of the transcription cycle, significant advances have been made and we refer readers to a number of comprehensive reviews on this topic.335−339 More recently, studies have also shed some light on the involvement of the CTD in mRNA export (see refs 308, 312, and 340 for reviews). Here we shall review the role played by the CTD in these final life events of RNAPII-produced transcripts.

4.2.3.1. RNA 3′ End Processing. The formation of the 3′ end of nascent RNAPII transcripts is decisive for allowing the release of the polymerase from its template and for guaranteeing the accurate functionality of the mature RNA. 3′ end processing of nearly all long RNAPII transcripts consists of a two-step reaction: specific endonucleolytic cleavage downstream of the poly(A) site, followed by the addition of a polyadenosine tail (reviewed in refs 341−343). The recruitment and stabilization of the large cleavage/polyadenylation machinery to the elongating RNAPII is well characterized, and numerous studies have provided evidence that the CTD plays a key role in this coupling. Remarkably, (i) truncated CTD impairs the 3′ end processing of mRNA in both mammalian and yeast cells,180,214 (ii) reconstituted in vitro cleavage reactions require the CTD in the absence of transcription,344,345 and (iii) the CTD acts as a direct physical link between transcription and nascent mRNA processing by binding to several 3′ end processing factors. Indeed, the phospho-CTD has been shown to bind the mammalian cleavage/polyadenylation specificity factor (CPSF) and the cleavage stimulation factor (CstF)180,346 as well as the yeast cleavage factor IA (CFIA) components Pcf11/Rna14/Rna15181 and members of the cleavage and polyadenylation factor (CPF) Pta1, Yhh1, and Ydh1182,183,189 (see Table 3). The key role played by CTD Ser2 phosphorylation in the recruitment of mRNA processing factors at the 3′ end of genes

proposed to terminate transcription by unwinding the DNA/RNA hybrid.218,371 The current model for the Nrd1-dependent termination process is that an arrangement of specific RNA sequences in the nascent transcript and high P-Ser5 but low P-Ser2 density within the CTD recruits the Nrd1 complex during early elongation.219,375 Sen1, like Nrd1, interacts with the CTD, although it prefers Ser2 phosphorylation, which could likely permit its recruitment to both coding and noncoding transcripts.376,377 Indeed, Sen1 has been shown to also participate in polyadenylation-dependent termination by Rat1.378−380 Nevertheless, in cells where the function of Sen1 is impaired, only short mRNA encoding genes seem defective in termination.381 Much remains to be uncovered about the mechanistic details of how Sen1 might mediate transcription termination. One possibility is that Sen1 may function as Rat1 exonuclease analogously to the Rho helicase in prokaryotes (see ref 335 for further discussion).

4.2.3.3. mRNA Export. In the past few years, studies have shown that the CTD is implicated not only in mRNA processing, as discussed above, but also in the formation of export-competent mRNA nucleoprotein particles (mRNPs) (for reviews see refs 11, 382, and 383). It has been well documented that a number of mRNA export factors associate cotranscriptionally with the nascent transcript, including the conserved Transcription-Export (TREX) complex, known to couple transcription to mRNP export (reviewed in refs 383 and 384). TREX is principally composed of the THO subcomplex, the RNA helicase Sub2 (UAP56/HEL in mammals), and the export adaptor protein Yra1 (REF/Aly in mammals).385,386 Yra1 plays a central role by establishing a physical bridge between the mRNA and its export receptor, the Mex67-Mtr2 (TAP-p15/NXF1-NXT1) heterodimer responsible for escorting the mRNP to the nuclear pore. 
Emerging evidence now sheds some light on how the CTD may play an important role in coordinating the recruitment of this export machinery to the nascent transcript. Early reports suggested that Yra1 was cotranscriptionally recruited to mRNPs through Sub2, as part of the TREX complex.387,388 Since then, however, alternative recruitment mechanisms have been suggested for Yra1. Indeed, recent findings from the laboratories of Greenleaf and Bentley reveal how Yra1 may be loaded onto elongating RNAPII via the CTD. First, experiments by Johnson et al. showed that Yra1 interacts with the 3′ end processing factor Pcf11, known, as mentioned before, to be recruited to the elongating polymerase by binding to P-Ser2.389 Consistent with this result, the Yra1 homologue REF/Aly was shown to be recruited to elongating RNAPII by interacting with Iws1, a binding partner of the elongation factor Spt6 that, like Pcf11, binds to the P-Ser2 CTD.197 In yeast, this mechanism was suggested to ensure the recruitment of this export factor only if the 3′ end processing machinery was correctly in place,308 although Yra1 occupies the whole transcribed region.389,390 Second, by searching for proteins that bind to the phosphorylated CTD using yeast extracts, Phatnani and colleagues provided the first hint for the direct association of Yra1 with the phosphorylated CTD.177 Subsequent experiments showed that Yra1 indeed binds directly to the CTD doubly phosphorylated on Ser2 and Ser5.190 In addition, using ChIP, this study also showed that the CTD-binding domain of Yra1, an RRM domain (see section 3.1.5), is required for its recruitment to active genes. More experiments are needed to elucidate the functional relevance of these alternative recruitment mechanisms of Yra1 and whether

refs 1, 36, 335, 336, and 339). As for 3′ end processing, the CTD is required for this step of the transcription cycle.180 In addition, several factors involved in cleavage/polyadenylation known to bind directly to the CTD, notably Pcf11, Rna14, Rna15, and Yhh1, also appear to participate in termination.183,358 Depending on the type of RNAPII transcript, distinct termination pathways have been suggested.339 For long protein-coding mRNA or the polyadenylation-dependent pathway, two mechanisms have been proposed for how RNAPII transcription terminates upon cleavage of the nascent mRNA at the polyA site. In the “allosteric” model, the transcriptional elongation complex is destabilized by an exchange of factors binding the CTD, which likely reduces its processivity.359 In the “torpedo” model, the newly formed 5′ end of the cleaved pre-mRNA is rapidly degraded by a 5′−3′ RNA exonuclease (yeast Rat1/human Xrn2), triggering the release of RNAPII.184,360 It is noteworthy that an emerging view suggests that the polyA-dependent termination mechanism more likely reflects a combination of both models.361−363 The CTD is directly involved in the recruitment of Rat1/Xrn2. 
Indeed, the multifunctional protein dimer p54nrb/PSF binds the CTD185,324 and recruits Xrn2, leading to transcription termination.361 In yeast, Rat1 associates with Rtt103 which, as mentioned in section 3.1.1, contains a CID domain and binds to P-Ser2.184,220 In addition, Rat1 recruitment to the CTD also requires Pcf11, a cleavage/polyadenylation factor interacting directly with P-Ser2.363 This finding is consistent with genome-wide results showing that occupancies of Rat1 and Pcf11 correlate well.24 However, genome-wide profiles of Rtt103 and Pcf11 do not overlap with Ser2 phosphorylation levels over the coding region, but these factors rather peak at the end of genes.25,35 In line with this, it has been suggested that the newly discovered P-Tyr1 mark of the CTD blocks the recruitment of Rtt103 and Pcf11 upstream of the polyA site.35 P-Tyr1 signal then drops near the polyA site, while P-Ser2 levels remain high, allowing for the recruitment of Rtt103 and Pcf11 for subsequent transcription termination. Lunde et al. suggested that a high density of P-Ser2 facilitates cooperative binding of Rtt103 and Pcf11 to neighboring P-Ser2 residues.220 Altogether, these studies provide a mechanism for tight regulation of transcription termination by CTD phosphorylation. Over the past few years, a second termination process has been uncovered in yeast: the Nrd1-dependent pathway. This pathway predominantly operates at short distances from the TSS of nonpolyadenylated small noncoding genes including stable snRNAs and snoRNAs,218,364 unstable transcripts or CUTs,365,366 and aberrant mRNAs.367 Interestingly, this termination pathway is coupled to the recruitment of the nuclear exosome.368 While the stable nuclear transcripts are trimmed by the exosome and protected by specific RNA binding proteins to become a mature stable RNA, CUTs are completely degraded (for reviews, see refs 333 and 369). 
Termination at these genes requires a protein complex composed of the Nrd1, Nab3, and Sen1 proteins.217,218,370−372 Both Nrd1 and Nab3 are RNA-binding proteins that recognize specific RNA sequences, helping to recruit the Nrd1 complex to target genes.373,374 As discussed above (section 3.1.1), Nrd1 also possesses a CID domain that binds the CTD with P-Ser5, enabling the recruitment of the Nrd1 complex to the elongating RNAPII.219,375 Sen1 (Senataxin in humans), a conserved DNA/RNA helicase shown to function in conjunction with Nrd1 and Nab3, is

Figure 7. Recruitment of chromatin regulators by the CTD. (A) Recruitment of KATs by P-Ser5. The histone H3 KAT SAGA and the histone H4 KAT NuA4 are both recruited to transcribed regions in a Kin28-dependent manner in yeast. (B) P-Ser5 recruits the KDAC Set3C as well as the PAF complex to transcribed regions. PAF then recruits the KMT complex COMPASS. Methylation of H3K4 by COMPASS allows for the anchoring/activation of Set3C, which deacetylates histones. (C) The P-Ser2/P-Ser5 CTD recruits the KMT Set2 and the KDAC Rpd3S to transcribed genes. Methylation of H3K36 by Set2 allows for the anchoring/activation of Rpd3S, which deacetylates histones. (D) Several proteins are recruited to transcribed regions to act as chaperones, preventing histone loss and trans-incorporation of acetylated histones by the Asf1 pathway. FACT and Chd1 are recruited via PAF, whereas Spt6 binds a whole variety of phospho-CTD epitopes, P-Tyr1 and P-Ser2 perhaps being the most relevant ones. See the text for details.
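
The recruitment relationships in Figure 7 form a small directed graph, and a traversal makes explicit which regulators depend, directly or indirectly, on a given CTD mark. The sketch below is a hypothetical encoding in Python: the dictionary `RECRUITS` and the function `downstream` are illustrative names, and the edges cover only the recruitment steps stated in the caption (the histone-mark anchoring of Set3C and Rpd3S and the Spt6 interactions are left out for brevity).

```python
from collections import deque

# Recruitment edges taken from the Figure 7 caption: key recruits each listed factor.
RECRUITS = {
    "P-Ser5": ["SAGA", "NuA4", "Set3C", "PAF"],
    "PAF": ["COMPASS", "FACT", "Chd1"],
    "P-Ser2/P-Ser5": ["Set2", "Rpd3S"],
}


def downstream(start):
    """Return every factor reachable from `start` by following recruitment edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in RECRUITS.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


if __name__ == "__main__":
    # Factors whose recruitment depends on P-Ser5 in this simplified graph.
    print(sorted(downstream("P-Ser5")))
```

In this encoding, COMPASS, FACT, and Chd1 appear downstream of P-Ser5 only because PAF does, which mirrors the indirect recruitment route described in the caption.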

Yra1 is recruited to genes via the CTD alone or in the context of the TREX complex (see ref 11 for further discussion). Interestingly, how the TREX complex is itself recruited to genes remained enigmatic until recently. Indeed, Chanarat et al. showed that the conserved splicing Prp19 complex is required for TREX occupancy at transcribed genes in S. cerevisiae.391 Notably, in mammalian cells, the Prp19 complex interacts with the splicing factor U2AF, which itself binds the phosphorylated CTD through its subunit U2AF65.243 It remains to be shown whether yeast Prp19 interacts with the transcriptional machinery via binding to U2AF.

ylation of the CTD by Kin28 therefore sets the stage for the recruitment of chromatin regulators that help RNAPII progress through nucleosomes.

4.3.2. Recruiting KDACs To Close Up Chromatin. By recruiting the above-mentioned activities, RNAPII improves its ability to deal with nucleosomes. The flip side to this coin, however, is that transcription would leave behind a chromatin structure that is quite disorganized (hyperacetylated histones embedded in fewer nucleosomes) unless additional activities are involved. In order to prevent such chaos, RNAPII also recruits opposing activities, notably the lysine deacetylases (KDACs, also called histone deacetylases, HDACs) Rpd3, Hos2, and Hda1.200,201 Once again, the CTD plays a critical role here. Phosphorylation of Ser5 by Kin28 is required for the recruitment of the Rpd3S complex, a KDAC made of Rpd3, Sin3, Rco1, and Eaf3, to genes in vivo200,201 (Figure 7C). Phosphorylation of Ser2 by Ctk1 is also likely involved since Rpd3S binds to CTD peptides carrying a phosphate group on both Ser2 and Ser5 better than it does to P-Ser5-only peptides.200 This biochemical data is in agreement with our genome-wide ChIP data using CTD kinase mutants.201 Quite puzzlingly, the occupancy of Rpd3S is also regulated by DSIF and Bur1/P-TEFb, but the mechanisms involved remain ill-defined.201 In addition to Rpd3S, Hos2 (as part of the Set3C KDAC) and Hda1 are both recruited to active genes in a Kin28-dependent manner, although the mechanisms were not fully elucidated200 (Figure 7B). 
In addition, Rpd3S and Set3C interact with methylated H3K36 and H3K4, respectively.408−410 These interactions also contribute to the occupancy of the KDACs on genes, most likely by anchoring Rpd3S and Set3C on their substrate after their initial recruitment via the phosphorylated CTD.200,201 Nevertheless, these KDACs collectively contribute to oppose the disruptive effect of transcription on chromatin, and ensure that RNAPII does not leave a mess behind (see section 4.3.5). 4.3.3. Recruiting KMTs. In addition to the activities mentioned above, phosphorylation of the CTD by Kin28 triggers the recruitment of Set1, as part of the COMPASS complex (Figure 7B). Indeed, Set1 occupancy, as measured by ChIP, is abolished in a kin28 mutant.196 Moreover, RNAPII phosphorylated on Ser5 but not on Ser2 can be coimmunoprecipitated with Set1.196 Phosphorylation by Kin28, however, is not sufficient for Set1 to be recruited to genes. Indeed, subunits of the PAF complex are also required for Set1 recruitment and H3K4 trimethylation in vivo.196,411 Since PAF recruitment also requires phosphorylation of Ser5 (via DSIF297,298 and P-Ser5412) it is not clear, as of today, whether COMPASS is tethered to RNAPII via PAF only, or if direct interactions with both PAF and the phosphorylated CTD are involved. Regardless of the exact details, these interactions lead to the recruitment of COMPASS to the very 5′ end of active genes, leading to trimethylation of H3K4. Interestingly, H3K4 trimethylation also requires monoubiquitylation of H2B on lysine 123 (H2Bub, lysine 120 in metazoans) by the Rad6/Bre1 E2/E3 ubiquitin ligases.413−415 H2Bub is not required for Set1 recruitment or for monomethylation of H3K4 by Set1, however. Rather, ubiquitylation of H2B stimulates the addition of the third (and perhaps also second) methyl group on the methylated lysine. 
This was proposed to involve the recognition of H2Bub by the Cps35 subunit of COMPASS,416 although this was recently challenged.417 Interestingly, this H2Bub stimulation also operates on H3K79 by a different KMT, Dot1.418−420

4.3. The CTD Allows for a Feedback on Chromatin

Because chromatin creates a barrier to transcription, a lot of research had focused on determining how various chromatin regulators mediate the function of sequence-specific transcription factors at gene promoters. In this paradigm, chromatin regulation occurs prior toand is a prerequisite for transcription; relegating RNAPII to a rather passive player in that gene regulation game. Ten years ago, however, seminal work from several laboratories demonstrated that RNAPII plays an active role in chromatin regulation during elongation. Indeed, joint work by the Young and Struhl laboratories has shown that the early elongating RNAPII recruits the lysine methyltransferase (KMT) Set1 to the 5′ end of active genes, leading to the iconic peaks of H3 lysine 4 (K4) trimethylation near promoters.196 During the same period, several laboratories have shown that another KMT, Set2, is recruited by RNAPII during elongation, leading to methylation of H3 on lysine 36 (K36) over active genes.392−395 In both cases, the RNAPII CTD was quickly recognized as playing a key role (see section 4.3.3). These papers provided a shift in paradigm in showing that RNAPII is not only affected by chromatin structure but can also feed back on it. During the past decade, several chromatin regulators have been shown to be recruited to RNAPII, leading to a complex and dynamic network of interactions where the CTD acts as a hub (Figure 7). In the next sections, we will review the mechanisms leading to the recruitment of various chromatin regulators to transcribing RNAPII (sections 4.3.1−4.3.4), before discussing our current understanding of the biological outcomes (or functions) of this phenomenon (section 4.3.5). 4.3.1. Recruiting KATs and Chromatin Remodelers To Open Up Chromatin. Chromatin is known to be an impediment to elongation by RNAPII (reviewed in refs 396 and 397). To elongate through chromatin in vivo, RNAPII therefore requires the help of exogenous activities (reviewed in refs 398 and 399). 
One way to reduce chromatin compaction, and therefore favor transcription through nucleosomes, is via acetylation of histone tails. Two lysine acetyltransferases (KATs, also called HATs for histone acetyltransferases), namely Gcn5 and Esa1, have been shown to be recruited to active genes.126,202 Gcn5 is part of the SAGA complex, while Esa1 is a member of the NuA4 complex. SAGA and NuA4 preferably acetylate H3 and H4 tails respectively (reviewed in ref 400). Work by the Hinnebusch group has shown that both SAGA and NuA4 are recruited to active genes in a manner that requires the kinase activity of Kin28, suggesting that these KATs are recruited through the Ser5 phosphorylated CTD126,202 (Figure 7A). Interestingly, the presence of SAGA and NuA4 on ORFs was shown to stimulate nucleosome eviction, providing a mechanism by which it facilitates elongation.126,202 In addition to its KAT activity, SAGA also harbors a histone H2B deubiquitinase activity, carried over by its Ubp8 subunit.401−404 Since H2B ubiquitylation stabilizes nucleosomes,405 deubiquitylation by Ubp8 may be another mechanism via which SAGA stimulates elongation. NuA4 also has additional ways of enhancing elongation. Indeed, it has been shown to stimulate the association of the chromatin remodelers Swi/Snf and RSC to the transcribed region,202 both of which are known to stimulate elongation.406,407 PhosphorV


The recruitment of Set2, like that of COMPASS, requires both CTD phosphorylation392−394 and the PAF complex.393 In this case, however, the importance of a direct interaction with the CTD has been established. Indeed, Set2 interacts with the phosphorylated CTD via its SRI domain198,244 (see section 3.1.6). Set2 binds to CTD repeats phosphorylated on both Ser2 and Ser5 (P-Ser5/P-Ser2), and its recruitment has been shown to strictly require Ctk1393,394 (Figure 7C). The recruitment of Set2 therefore occurs further downstream (3′) relative to COMPASS. Interestingly, only trimethylation of H3K36 requires the recruitment of Set2 to the CTD. Instead, H3K36 monomethylation and dimethylation appear to be achieved by a nontargeted form of Set2, as they do not require Ctk1 and can be achieved by the Set2 catalytic domain on its own.421 Accordingly, while H3K36 trimethylation correlates with the transcription rate, H3K36 dimethylation does not, although it still requires transcription.422,423

4.3.4. Recruiting Histone Chaperones. In addition to recruiting COMPASS and Set2, the PAF complex coordinates the recruitment of several other factors including the ATP-dependent remodeler Chd1424−426 and the histone chaperone FACT427 (Figure 7D). Indeed, in Drosophila the recruitment of FACT to transcribed genes was shown to depend on PAF427 and on HP1c,428 whereas others have suggested (using in vitro systems, however) that it is FACT that recruits PAF to RNAPII.429 Most likely, all these interactions collectively contribute to the coordinated recruitment of these factors to active genes. Importantly, and regardless of the details, P-Ser5 sits at the top of this network since Kin28 was shown to be required for the recruitment of PAF412 and FACT,203 and because HP1c and PAF (Cdc73, Ctr9, and Rtf1) have been shown to interact directly with P-Ser CTD peptides.412,428 We should also emphasize that the phosphorylated CTD is assisted in this recruitment by the CTR of Spt5 (DSIF) since deletion of the CTR or mutation of Bur1 (P-TEFb) both diminish the occupancy of PAF over active genes in vivo.297,298 Interestingly, in addition to acting as a recruiting platform, PAF was also shown to stimulate the H2B ubiquitin ligases Rad6 and Bre1, which in turn stimulate the activity of FACT429−431 (Figure 7D). While FACT has not been shown to directly interact with the CTD, Spt6 (which, like FACT, is a transcription-associated histone chaperone) contains a tandem SH2 domain (section 3.1.7) that binds to phosphorylated CTD peptides.35,197,207,208 These interactions, however, are of quite low affinity207 and ChIP assays have shown that other factors, yet to be identified, must be involved in Spt6 recruitment in vivo since this histone chaperone still occupies active genes when its SH2 domain is deleted.25 Spt6, like all the other factors described above, is likely to be recruited via a complex network of interactions involving, but not limited to, the phosphorylated CTD.

4.3.5. A Complex Network of Activities Ensuring Genomic Fidelity. Why does RNAPII orchestrate the recruitment of so many chromatin regulators during elongation? As mentioned above, some of these regulators contribute to elongation by helping to alleviate the barrier imposed by nucleosomes. Others, on the contrary, promote nucleosome retention and deacetylation, two activities that are known to impede RNAPII elongation. While this may seem counterproductive, a large body of evidence has now established that this is required in order to prevent aberrant transcription initiating from within genes. This so-called “cryptic” transcription, if not repressed properly, may lead to transcripts (or transcription) “interfering” with the expression of the bona fide genes. In addition, some of these truncated transcripts may regulate gene expression in trans or be translated into peptides or proteins that may be deleterious to the cell. Here we will review the recent literature, trying to emphasize how the different pathways contributing to the repression of cryptic transcription are actually integrated within a complex network with the RNAPII CTD at its base (Figure 8). This topic was also recently reviewed, although with an emphasis on histone modifications rather than on the CTD, in an excellent article from Smolle and Workman.432

Figure 8. Complex network of interactions involving the CTD that mediates the suppression of transcription from cryptic promoters. Physical interactions are depicted as thick double-headed gray arrows, while activation and repression are shown as thin black arrows.
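As a reading aid, the network summarized in Figure 8 can be sketched as a small directed graph. The Python sketch below is illustrative only and is not part of the original analysis: the names `NETWORK` and `downstream` are hypothetical, and each edge is a simplified encoding of a recruitment, stimulation, or mark-deposition relationship described in section 4.3 (for example, Ctk1 placing P-Ser2, which recruits Set2, which in turn methylates H3K36 and thereby anchors Rpd3S). It ignores the strength, direction of regulation, and genomic position of each interaction.

```python
from collections import deque

# Hypothetical, simplified edge list: "A": [B, ...] means the text describes A
# as placing, recruiting, stimulating, or generating B. Reading aid only,
# not a curated dataset.
NETWORK = {
    "Kin28": ["P-Ser5"],
    "Ctk1": ["P-Ser2"],
    "P-Ser5": ["PAF", "COMPASS", "Set3C", "Rpd3S"],
    "P-Ser2": ["Set2", "Rpd3S"],
    "PAF": ["COMPASS", "Set2", "Chd1", "FACT", "H2Bub"],
    "H2Bub": ["COMPASS"],
    "COMPASS": ["H3K4me"],
    "H3K4me": ["Set3C"],
    "Set2": ["H3K36me"],
    "H3K36me": ["Rpd3S", "Isw1b"],
}


def downstream(start):
    """Return every node reachable from `start` (breadth-first search)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in NETWORK.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen
```

Under these assumptions, `downstream("Ctk1")` contains Set2, H3K36me, Rpd3S, and Isw1b but not Set3C, mirroring the separation between the Set2/Rpd3S and PAF/COMPASS/Set3C pathways, whereas `downstream("Kin28")` reaches both branches because P-Ser5 sits upstream of PAF.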

4.3.5.1. Preserving Nucleosomes via Histone Chaperones. The first evidence for cryptic transcription came from the Winston and Struhl groups, who showed that mutations in either of the two best characterized transcription-associated histone chaperones, Spt6 and FACT, lead to transcription initiated from cryptic promoters located within genes.203,433 Because these same mutants are known to cause histone loss,434−436 this suggests that proper nucleosome levels need to be maintained in order to prevent the exposure of promoter-like sequences within genes. The central role of chromatin in this phenomenon was confirmed when cryptic transcription was systematically assessed in mutants for most yeast genes. In these studies, the most severe cryptic phenotypes were found in mutants in histones or in transcription factors required for the expression of histone genes.437,438 While initial work was done on a handful of genes, the use of microarrays later showed that cryptic transcription occurs on hundreds if not thousands of genes in yeast,438 highlighting the importance of this phenomenon. To date, cryptic transcription has been reported in mutants for histone chaperones (FACT, Spt6, Rtt106, Asf1, Spt2), KMTs (Set2, COMPASS), KDACs (Rpd3S, Set3C), histone ubiquitylation enzymes (Rad6/Bre1/Lge1), chromatin remodelers (Isw1b, Chd1), elongation factors (PAF, P-TEFb, Ctk1), and more.437,438 Although the functions of these proteins appear quite diverse, recent studies do suggest that they repress cryptic transcription via either of two mechanisms, namely, histone deacetylation and histone retention (or shielding).432,439

4.3.5.2. Promoting Histone Deacetylation. Cryptic transcription can arise in mutants for Ctk1, Set2, and Rpd3S,408 which, together with the fact that Set2 interacts with RNAPII phosphorylated on Ser2,408,410 leads to the model where phosphorylation of Ser2 by Ctk1 triggers the recruitment of Set2, which in turn mediates the recruitment of Rpd3S to chromatin via methylation of H3K36.408 Rpd3S then suppresses cryptic transcription by deacetylating nucleosomes in the wake of RNAPII.408,409,440 The central role of the CTD in that pathway was further reinforced by us and others when the anchoring of Rpd3S to H3K36me was shown to require prior recruitment to the P-Ser5/2 CTD.200,201 The CTD is therefore directly involved in the recruitment of both key enzymes in this pathway, namely Set2 and Rpd3S. As discussed above, P-Ser5 also leads to the recruitment of COMPASS, which recruits the Set3C KDAC via methylation of H3K4.441 Here again, the role of the CTD was further supported by the finding that Set3C is first recruited to P-Ser5 before being handed over to the methylated nucleosome.200 Phosphorylation of Ser5 therefore recruits all enzymes involved in this pathway (COMPASS via PAF, and Set3C). In addition, the CTD leads to the stimulation of COMPASS via the ubiquitylation of H2B. COMPASS and Set3C mutants have much weaker cryptic phenotypes than mutants of the Set2/Rpd3S pathway, suggesting that the role of Set3C in the repression of cryptic transcription is not as important as that of Rpd3S.437,442 Interestingly, however, the PAF/COMPASS/Set3C and Set2/Rpd3S pathways tend to repress cryptic transcription from the 5′ and 3′ regions of genes, respectively, which is consistent with the regions where these factors are recruited and anchored. Interestingly, a recent study showed that the recruitment of Set3C, often via transcription initiated from noncoding RNA promoters, allows for fine-tuning the kinetics of gene expression.442

4.3.5.3. Preserving Nucleosomes via Chromatin Remodelers. Recently, the Workman group showed that H3K36me represses cryptic transcription not only by anchoring Rpd3S to chromatin but also by preventing histone loss and incorporation of acetylated histones.443 This effect is mediated in part by the fact that H3K36me blocks the interaction of Asf1, a chaperone promoting the incorporation of newly synthesized (and acetylated) histones into chromatin. In addition, H3K36me recruits the Isw1b complex to transcribed genes.444,445 This recruitment, mediated by the PWWP domain of its Ioc4 subunit, favors the recycling of histones in cis during transcription, therefore preventing histone loss and incorporation of acetylated histones.444 By preventing histone loss, Isw1b contributes to the retention of the H3K36me mark during transcription, which reinforces the repression of cryptic transcription. Smolle et al. also showed that another chromatin remodeler, Chd1, has similar effects, although its recruitment is unlikely to occur via H3K36me.444 Instead, Chd1 is known to interact with both PAF and Spt5.424−426 This provides an interesting connection between both pathways as PAF mediates the recruitment of a factor (Chd1) that reinforces the Set2 pathway. This is in line with the fact that bur1 and PAF mutants cause cryptic transcription defects that are more severe than those of COMPASS mutants.446 Interestingly, FACT and H2Bub, both connected to PAF, also promote histone retention together with Isw1b, Chd1, and Spt6 (see section 4.3.5.1). Further reinforcing the idea that histone retention (and prevention of the incorporation of acetylated histones) is the major path toward preventing cryptic transcription, the Rpd3 core complex was recently shown to harbor a histone chaperone activity.447 This suggests that even components of the network previously thought to mediate repression of cryptic transcription through histone deacetylation may rather function, at least in part, by preventing histone loss and incorporation of acetylated histones.

4.3.6. Conservation in Higher Eukaryotes. In this section we have focused our discussion on yeast because it is by far the species where we know the most about interactions between the CTD and chromatin regulators. All the chromatin regulators discussed above, however, have orthologs in higher eukaryotes. In fact, in most cases, for each yeast complex, one can find several similar complexes in metazoans. For example, the yeast COMPASS has three related complexes in flies, each of which has two in humans (reviewed in ref 416). Among the six human COMPASS-like complexes, however, only two appear to carry a function analogous to COMPASS. SET1A/B share a subunit composition similar to that of COMPASS. Notably, (i) they carry a Cps35 ortholog (Wdr82) allowing them to generate H3K4 dimethylation and trimethylation respectively,448 (ii) they rely on PAF and H2B ubiquitylation for their activities,449 (iii) they interact with phosphorylated RNAPII,450 and (iv) they occupy the 5′ end of genes.450 Other COMPASS-like complexes (MLL1/4) can only mediate monomethylation and are unlikely to be recruited via RNAPII. In fact, they rather work as coactivators of specific sets of genes. For a recent review on COMPASS-like complexes, we refer the readers to ref 416. Similarly, human cells contain at least eight different H3K36 methyltransferases, but only one of them (KMT3A/SETD2) interacts with RNAPII and mediates H3K36 trimethylation.211,451,452 For a more complete review of the metazoan homologues of the chromatin regulators described here and their role in cryptic transcription, please refer to ref 432.

5. PERSPECTIVES AND CONCLUSION

Deciphering the combinatorial complexity of the CTD is perhaps the next biggest challenge. Current understanding of how the CTD phosphorylation cycle is established relies heavily on access to phospho-specific CTD antibodies. Although essential to today’s research, these antibodies have a limited capacity for deciphering this complexity, as they essentially inform us on the presence of a mark without providing much insight about the state of the surrounding amino acids. Nor do antibodies inform on the location, along the linear CTD, of the marks they detect. While access to an increasing number of well-characterized antibodies will continue to drive our understanding of the CTD forward, the next significant leap ahead will most likely require a new generation of tools. The development of mass spectrometry (MS) based assays to read out the CTD state, for example, would be of tremendous help. Identifying more CTD-modifying enzymes and understanding their relationships with one another is also a priority. It will be important to study the role of the CTD outside of transcription, for example when massive shutdowns of transcription occur, as during mitosis. Many of the CTD kinases known to date belong to the cyclin-dependent kinase (CDK) family, yet little is known about the coupling between the CTD phosphorylation cycle and the cell cycle. Another task will be determining the complete set of CTD-binding proteins, including those that may be cell type specific or cell cycle phase specific, and understanding how they interact with the CTD and other components of the transcription machinery. The recent development of in vivo cross-linking technologies, combined with clever MS approaches, is likely to be useful in that arena as it may help decipher in more detail how so many different proteins coordinately interact with the CTD. Despite being discovered more than 25 years ago,14,15 the CTD still remains quite mysterious and future research will likely uncover even more of its complexity.

AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

Present Address
§A.R.B.: Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA.

Notes
The authors declare no competing financial interest.

Biographies

Alain R. Bataille received his Ph.D. in Molecular Biology from Université de Montréal in 2012 for his work on chromatin and gene expression regulation mechanisms, in the laboratory of Dr. François Robert at the Institut de recherches cliniques de Montréal (IRCM). He is currently a postdoctoral fellow in the laboratory of Dr. Frank Pugh in the Center for Eukaryotic Gene Regulation at Penn State University. During his Ph.D., he received studentships from the joint IRCM and Canadian Institutes of Health Research (CIHR) Cancer Research Program and the Molecular Biology programs at Université de Montréal.

François Robert obtained his Ph.D. from Université de Sherbrooke in 1999 for his work on the structure of transcriptional preinitiation complexes under the supervision of Dr. Benoit Coulombe. He then moved to the Boston area for four years of postdoctoral training in the laboratory of Dr. Richard Young at the Whitehead Institute in Cambridge, MA. During his postdoc, he developed, with two of his colleagues, the ChIP-chip technology. He joined the Institut de recherches cliniques de Montréal (IRCM) in 2003 as the Director of the Laboratory of Chromatin and Genomic Expression. He is now Associate Research Professor at the IRCM and Université de Montréal. The Robert laboratory is interested in the interplay between chromatin and the RNA polymerase II transcription machinery using functional genomic and proteomic approaches in different model systems such as Saccharomyces cerevisiae and mouse. He has published over 30 peer-reviewed articles and has obtained several fellowships and awards from the National Cancer Institute of Canada (NCIC), the Canadian Institutes of Health Research (CIHR), and the Fonds de recherche du Québec-Santé (FRQS).

Célia Jeronimo graduated with a Ph.D. in Biochemistry from Université de Montréal in 2008 for her work in proteomics and gene transcription with Dr. Benoit Coulombe. She then joined the Centre for Genomic Regulation (CRG) in Barcelona, Spain, for a first postdoctoral training with Dr. Luciano Di Croce in epigenetics. In 2010, she joined the group of Dr. François Robert at the Institut de recherches cliniques de Montréal (IRCM), to study gene regulation using functional genomics. During her postdoctoral training, she was awarded several fellowships including the European Molecular Biology Organization (EMBO), the Canadian Institutes of Health Research (CIHR), and the L’Oréal Canada−UNESCO for Women in Science Research Excellence.

ACKNOWLEDGMENTS

We would like to thank Daniel Zenklusen for his critical reading of the manuscript. This work was funded by a CIHR grant to F.R. (MOP-82891). C.J. is a recipient of CIHR and L’Oréal Canada−UNESCO for Women in Science Research Excellence fellowships. F.R. holds a FRQS Chercheur boursier-senior salary award.

REFERENCES

(1) Hsin, J. P.; Manley, J. L. Genes Dev. 2012, 26, 2119. (2) Chapman, R. D.; Heidemann, M.; Hintermair, C.; Eick, D. Trends Genet. 2008, 24, 289. (3) Nonet, M.; Sweetser, D.; Young, R. A. Cell 1987, 50, 909. (4) West, M. L.; Corden, J. L. Genetics 1995, 140, 1223. (5) Zehring, W. A.; Lee, J. M.; Weeks, J. R.; Jokerst, R. S.; Greenleaf, A. L. Proc. Natl. Acad. Sci. U. S. A. 1988, 85, 3698. (6) Bartolomei, M. S.; Halden, N. F.; Cullen, C. R.; Corden, J. L. Mol. Cell. Biol. 1988, 8, 330. (7) Meininghaus, M.; Chapman, R. D.; Horndasch, M.; Eick, D. J. Biol. Chem. 2000, 275, 24375. (8) Heidemann, M.; Hintermair, C.; Voss, K.; Eick, D. Biochim. Biophys. Acta 2013, 1829, 55. (9) Sims, R. J., 3rd; Rojas, L. A.; Beck, D.; Bonasio, R.; Schuller, R.; Drury, W. J., 3rd; Eick, D.; Reinberg, D. Science 2011, 332, 99. (10) Buratowski, S. Nat. Struct. Biol. 2003, 10, 679. (11) Bartkowiak, B.; Mackellar, A. L.; Greenleaf, A. L. Genet. Res. Int. 2011, 2011, 623718. (12) Buratowski, S. Mol. Cell 2009, 36, 541. (13) Corden, J. Chem. Rev. 2013, DOI: 10.1021/cr400158h. (14) Allison, L. A.; Moyle, M.; Shales, M.; Ingles, C. J. Cell 1985, 42, 599. (15) Corden, J. L.; Cadena, D. L.; Ahearn, J. M., Jr.; Dahmus, M. E. Proc. Natl. Acad. Sci. U. S. A. 1985, 82, 7934. (16) Cadena, D. L.; Dahmus, M. E. J. Biol. Chem. 1987, 262, 12468. (17) Bartholomew, B.; Dahmus, M. E.; Meares, C. F. J. Biol. Chem. 1986, 261, 14226. (18) Lu, H.; Flores, O.; Weinmann, R.; Reinberg, D. Proc. Natl. Acad. Sci. U. S. A. 1991, 88, 10004. (19) Laybourn, P. J.; Dahmus, M. E. J. Biol. Chem. 1990, 265, 13165. (20) Warren, S. L.; Landolfi, A. S.; Curtis, C.; Morrow, J. S. J. Cell Sci. 1992, 103 (2), 381. (21) Bregman, D. B.; Du, L.; Li, Y.; Ribisi, S.; Warren, S. L. J. Cell Sci. 1994, 107 (3), 387. (22) Komarnitsky, P.; Cho, E. J.; Buratowski, S. Genes Dev. 2000, 14, 2452. (23) Chapman, R. D.; Heidemann, M.; Albert, T. K.; Mailhammer, R.; Flatley, A.; Meisterernst, M.; Kremmer, E.; Eick, D. Science 2007, 318, 1780. (24) Kim, H.; Erickson, B.; Luo, W.; Seward, D.; Graber, J. H.; Pollock, D. D.; Megee, P. C.; Bentley, D. L. Nat. Struct. Mol. Biol. 2010, 17, 1279. (25) Mayer, A.; Lidschreiber, M.; Siebert, M.; Leike, K.; Soding, J.; Cramer, P. Nat. Struct. Mol. Biol. 2010, 17, 1272. (26) Tietjen, J. R.; Zhang, D. W.; Rodriguez-Molina, J. B.; White, B. E.; Akhtar, M. S.; Heidemann, M.; Li, X.; Chapman, R. D.; Shokat, K.; Keles, S.; Eick, D.; Ansari, A. Z. Nat. Struct. Mol. Biol. 2010, 17, 1154. (27) Bataille, A. R.; Jeronimo, C.; Jacques, P. E.; Laramee, L.; Fortin, M. E.; Forest, A.; Bergeron, M.; Hanes, S. D.; Robert, F. Mol. Cell 2012, 45, 158. (28) Coudreuse, D.; van Bakel, H.; Dewez, M.; Soutourina, J.; Parnell, T.; Vandenhaute, J.; Cairns, B.; Werner, M.; Hermand, D. Curr. Biol. 2010, 20, 1053. (29) Diamant, G.; Amir-Zilberstein, L.; Yamaguchi, Y.; Handa, H.; Dikstein, R. Cell Rep. 2012, 2, 722. (30) Amir-Zilberstein, L.; Ainbinder, E.; Toube, L.; Yamaguchi, Y.; Handa, H.; Dikstein, R. Mol. Cell. Biol. 2007, 27, 5246. (31) Hargreaves, D. C.; Horng, T.; Medzhitov, R. Cell 2009, 138, 129. (32) Chesnut, J. D.; Stephens, J. H.; Dahmus, M. E. J. Biol. Chem. 1992, 267, 10500. (33) Kang, M. E.; Dahmus, M. E. J. Biol. Chem. 1993, 268, 25033. (34) Akhtar, M. S.; Heidemann, M.; Tietjen, J. R.; Zhang, D. W.; Chapman, R. D.; Eick, D.; Ansari, A. Z. Mol. Cell 2009, 34, 387. (35) Mayer, A.; Heidemann, M.; Lidschreiber, M.; Schreieck, A.; Sun, M.; Hintermair, C.; Kremmer, E.; Eick, D.; Cramer, P. Science 2012, 336, 1723. (36) Buratowski, S. Curr. Opin. Cell Biol. 2005, 17, 257. (37) Zhang, D. W.; Mosley, A. L.; Ramisetty, S. R.; Rodriguez-Molina, J. B.; Washburn, M. P.; Ansari, A. Z. J. Biol. Chem. 2012, 287, 8541. (38) Hintermair, C.; Heidemann, M.; Koch, F.; Descostes, N.; Gut, M.; Gut, I.; Fenouil, R.; Ferrier, P.; Flatley, A.; Kremmer, E.; Chapman, R. D.; Andrau, J. C.; Eick, D. EMBO J. 2012, 31, 2784. (39) Schwer, B.; Sanchez, A. M.; Shuman, S. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 18024. (40) Gilchrist, D. A.; Dos Santos, G.; Fargo, D. C.; Xie, B.; Gao, Y.; Li, L.; Adelman, K. Cell 2010, 143, 540. (41) Adelman, K.; Lis, J. T. Nat. Rev. Genet. 2012, 13, 720. (42) Sanso, M.; Lee, K. M.; Viladevall, L.; Jacques, P. E.; Page, V.; Nagy, S.; Racine, A.; St. Amour, C. V.; Zhang, C.; Shokat, K. M.; Schwer, B.; Robert, F.; Fisher, R. P.; Tanny, J. C. PLoS Genet. 2012, 8, e1002822. (43) Schwer, B.; Shuman, S. Mol. Cell 2011, 43, 311. (44) Burley, S. K.; Sonenberg, N. Mol. Cell 2011, 43, 163. (45) Saberianfar, R.; Cunningham-Dunlop, S.; Karagiannis, J. PLoS One 2011, 6, e24694. (46) Sukegawa, Y.; Yamashita, A.; Yamamoto, M. PLoS Genet. 2011, 7, e1002387. (47) Feaver, W. J.; Gileadi, O.; Li, Y.; Kornberg, R. D. Cell 1991, 67, 1223. (48) Lu, H.; Zawel, L.; Fisher, L.; Egly, J. M.; Reinberg, D. Nature 1992, 358, 641. (49) Schroeder, S. C.; Schwer, B.; Shuman, S.; Bentley, D. Genes Dev. 2000, 14, 2435. (50) Boeing, S.; Rigault, C.; Heidemann, M.; Eick, D.; Meisterernst, M. J. Biol. Chem. 2010, 285, 188. (51) Kim, M.; Suh, H.; Cho, E. J.; Buratowski, S. J. Biol. Chem. 2009, 284, 26421. (52) Glover-Cutter, K.; Larochelle, S.; Erickson, B.; Zhang, C.; Shokat, K.; Fisher, R. P.; Bentley, D. L. Mol. Cell. Biol. 2009, 29, 5455. (53) Cho, E. J.; Kobor, M. S.; Kim, M.; Greenblatt, J.; Buratowski, S. Genes Dev. 2001, 15, 3319. (54) Mosley, A. L.; Pattenden, S. G.; Carey, M.; Venkatesh, S.; Gilmore, J. M.; Florens, L.; Workman, J. L.; Washburn, M. P. Mol. Cell 2009, 34, 168. (55) Egloff, S.; Zaborowska, J.; Laitem, C.; Kiss, T.; Murphy, S. Mol. Cell 2012, 45, 111. (56) Xiang, K.; Manley, J. L.; Tong, L. Nat. Commun. 2012, 3, 946. (57) Lee, J. M.; Greenleaf, A. L. Proc. Natl. Acad. Sci. U. S. A. 1989, 86, 3624. (58) Lee, J. M.; Greenleaf, A. L. Gene Expression 1991, 1, 149. (59) Sterner, D. E.; Lee, J. M.; Hardin, S. E.; Greenleaf, A. L. Mol. Cell. Biol. 1995, 15, 5716. (60) Murray, S.; Udupa, R.; Yao, S.; Hartzog, G.; Prelich, G. Mol. Cell. Biol. 2001, 21, 4089. (61) Bartkowiak, B.; Greenleaf, A. L. Transcription 2011, 2, 115. (62) Qiu, H.; Hu, C.; Hinnebusch, A. G. Mol. Cell 2009, 33, 752. (63) Bartkowiak, B.; Liu, P.; Phatnani, H. P.; Fuda, N. J.; Cooper, J. J.; Price, D. H.; Adelman, K.; Lis, J. T.; Greenleaf, A. L. Genes Dev. 2010, 24, 2303. (64) Liu, J.; Kipreos, E. T. Mol. Biol. Evol. 2000, 17, 1061. (65) Guo, Z.; Stiller, J. W. BMC Genomics 2004, 5, 69. (66) Keogh, M. C.; Podolny, V.; Buratowski, S. Mol. Cell. Biol. 2003, 23, 7005. (67) Schwartz, J. C.; Ebmeier, C. C.; Podell, E. R.; Heimiller, J.; Taatjes, D. J.; Cech, T. R. Genes Dev. 2012, 26, 2690. (68) Baskaran, R.; Dahmus, M. E.; Wang, J. Y. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 11167. (69) Hsin, J. P.; Sheth, A.; Manley, J. L. Science 2011, 334, 683.


(70) Liao, S. M.; Zhang, J.; Jeffery, D. A.; Koleske, A. J.; Thompson, C. M.; Chao, D. M.; Viljoen, M.; van Vuuren, H. J.; Young, R. A. Nature 1995, 374, 193. (71) Maldonado, E.; Shiekhattar, R.; Sheldon, M.; Cho, H.; Drapkin, R.; Rickert, P.; Lees, E.; Anderson, C. W.; Linn, S.; Reinberg, D. Nature 1996, 381, 86. (72) Hengartner, C. J.; Myer, V. E.; Liao, S. M.; Wilson, C. J.; Koh, S. S.; Young, R. A. Mol. Cell 1998, 2, 43. (73) Donner, A. J.; Ebmeier, C. C.; Taatjes, D. J.; Espinosa, J. M. Nat. Struct. Mol. Biol. 2010, 17, 194. (74) Akoulitchev, S.; Chuikov, S.; Reinberg, D. Nature 2000, 407, 102. (75) Knuesel, M. T.; Meyer, K. D.; Donner, A. J.; Espinosa, J. M.; Taatjes, D. J. Mol. Cell. Biol. 2009, 29, 650. (76) Cisek, L. J.; Corden, J. L. Nature 1989, 339, 679. (77) Xu, Y. X.; Hirose, Y.; Zhou, X. Z.; Lu, K. P.; Manley, J. L. Genes Dev. 2003, 17, 2765. (78) Chymkowitch, P.; Eldholm, V.; Lorenz, S.; Zimmermann, C.; Lindvall, J. M.; Bjoras, M.; Meza-Zepeda, L. A.; Enserink, J. M. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 10450. (79) Chymkowitch, P.; Enserink, J. M. Transcription 2013, 4, 3. (80) Morris, M. C.; Kaiser, P.; Rudyak, S.; Baskerville, C.; Watson, M. H.; Reed, S. I. Nature 2003, 423, 1009. (81) Yu, V. P.; Baskerville, C.; Grunenfelder, B.; Reed, S. I. Mol. Cell 2005, 17, 145. (82) Chaves, S.; Baskerville, C.; Yu, V.; Reed, S. I. Mol. Cell. Biol. 2010, 30, 5284. (83) Nicodeme, E.; Jeffrey, K. L.; Schaefer, U.; Beinke, S.; Dewell, S.; Chung, C. W.; Chandwani, R.; Marazzi, I.; Wilson, P.; Coste, H.; White, J.; Kirilovsky, J.; Rice, C. M.; Lora, J. M.; Prinjha, R. K.; Lee, K.; Tarakhovsky, A. Nature 2010, 468, 1119. (84) Dey, A.; Chitsaz, F.; Abbasi, A.; Misteli, T.; Ozato, K. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 8758. (85) Zippo, A.; Serafini, R.; Rocchigiani, M.; Pennacchini, S.; Krepelova, A.; Oliviero, S. Cell 2009, 138, 1122. (86) Devaiah, B. N.; Lewis, B. A.; Cherman, N.; Hewitt, M. C.; Albrecht, B. K.; Robey, P. G.; Ozato, K.; Sims, R. 
J., 3rd; Singer, D. S. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 6927. (87) Wu, S. Y.; Lee, A. Y.; Lai, H. T.; Zhang, H.; Chiang, C. M. Mol. Cell 2013, 49, 843. (88) Yang, Z.; Yik, J. H.; Chen, R.; He, N.; Jang, M. K.; Ozato, K.; Zhou, Q. Mol. Cell 2005, 19, 535. (89) Jang, M. K.; Mochizuki, K.; Zhou, M.; Jeong, H. S.; Brady, J. N.; Ozato, K. Mol. Cell 2005, 19, 523. (90) Devaiah, B. N.; Singer, D. S. J. Biol. Chem. 2012, 287, 38755. (91) Zhang, W.; Prakash, C.; Sum, C.; Gong, Y.; Li, Y.; Kwok, J. J.; Thiessen, N.; Pettersson, S.; Jones, S. J.; Knapp, S.; Yang, H.; Chin, K. C. J. Biol. Chem. 2012, 287, 43137. (92) Baskaran, R.; Wood, L. D.; Whitaker, L. L.; Canman, C. E.; Morgan, S. E.; Xu, Y.; Barlow, C.; Baltimore, D.; Wynshaw-Boris, A.; Kastan, M. B.; Wang, J. Y. Nature 1997, 387, 516. (93) Baskaran, R.; Chiang, G. G.; Wang, J. Y. Mol. Cell. Biol. 1996, 16, 3361. (94) Baskaran, R.; Escobar, S. R.; Wang, J. Y. Cell Growth Differ. 1999, 10, 387. (95) Baskaran, R.; Chiang, G. G.; Mysliwiec, T.; Kruh, G. D.; Wang, J. Y. J. Biol. Chem. 1997, 272, 18905. (96) Bregman, D. B.; Pestell, R. G.; Kidd, V. J. Front. Biosci., Landmark Ed. 2000, 5, D244. (97) Duyster, J.; Baskaran, R.; Wang, J. Y. Proc. Natl. Acad. Sci. U. S. A. 1995, 92, 1555. (98) Dahmus, M. E. J. Biol. Chem. 1981, 256, 11239. (99) Payne, J. M.; Laybourn, P. J.; Dahmus, M. E. J. Biol. Chem. 1989, 264, 19621. (100) Chapman, R. D.; Palancade, B.; Lang, A.; Bensaude, O.; Eick, D. Nucleic Acids Res. 2004, 32, 35. (101) Sawa, C.; Nedea, E.; Krogan, N.; Wada, T.; Handa, H.; Greenblatt, J.; Buratowski, S. Mol. Cell. Biol. 2004, 24, 4734.

(102) Schneider, E.; Kartarius, S.; Schuster, N.; Montenarh, M. Oncogene 2002, 21, 5031. (103) Palancade, B.; Dubois, M. F.; Bensaude, O. J. Biol. Chem. 2002, 277, 36061. (104) Egyhazi, E.; Ossoinak, A.; Filhol-Cochet, O.; Cochet, C.; Pigon, A. Mol. Cell. Biochem. 1999, 191, 149. (105) Ujvari, A.; Pal, M.; Luse, D. S. J. Biol. Chem. 2011, 286, 23160. (106) Abbott, K. L.; Renfrow, M. B.; Chalmers, M. J.; Nguyen, B. D.; Marshall, A. G.; Legault, P.; Omichinski, J. G. Biochemistry 2005, 44, 2732. (107) Abbott, K. L.; Archambault, J.; Xiao, H.; Nguyen, B. D.; Roeder, R. G.; Greenblatt, J.; Omichinski, J. G.; Legault, P. Biochemistry 2005, 44, 2716. (108) Maldonado, E.; Allende, J. E. FEBS Lett. 1999, 443, 256. (109) Cabrejos, M. E.; Allende, C. C.; Maldonado, E. J. Cell. Biochem. 2004, 93, 2. (110) Allende, J. E.; Allende, C. C. FASEB J. 1995, 9, 313. (111) Trigon, S.; Serizawa, H.; Conaway, J. W.; Conaway, R. C.; Jackson, S. P.; Morange, M. J. Biol. Chem. 1998, 273, 6769. (112) Markowitz, R. B.; Hermann, A. S.; Taylor, D. F.; He, L.; Anthony-Cahill, S.; Ahn, N. G.; Dynan, W. S. Biochem. Biophys. Res. Commun. 1995, 207, 1051. (113) Bonnet, F.; Vigneron, M.; Bensaude, O.; Dubois, M. F. Nucleic Acids Res. 1999, 27, 4399. (114) Venetianer, A.; Dubois, M. F.; Nguyen, V. T.; Bellier, S.; Seo, S. J.; Bensaude, O. Eur. J. Biochem. 1995, 233, 83. (115) Bellier, S.; Chastant, S.; Adenot, P.; Vincent, M.; Renard, J. P.; Bensaude, O. EMBO J. 1997, 16, 6250. (116) Bellier, S.; Dubois, M. F.; Nishida, E.; Almouzni, G.; Bensaude, O. Mol. Cell. Biol. 1997, 17, 1434. (117) Czudnochowski, N.; Bosken, C. A.; Geyer, M. Nat. Commun. 2012, 3, 842. (118) Viladevall, L.; St. Amour, C. V.; Rosebrock, A.; Schneider, S.; Zhang, C.; Allen, J. J.; Shokat, K. M.; Schwer, B.; Leatherwood, J. K.; Fisher, R. P. Mol. Cell 2009, 33, 738. (119) St. Amour, C. V.; Sanso, M.; Bosken, C. A.; Lee, K. M.; Larochelle, S.; Zhang, C.; Shokat, K. M.; Geyer, M.; Fisher, R. P. Mol. Cell. Biol. 2012, 32, 2372. 
(120) Guiguen, A.; Soutourina, J.; Dewez, M.; Tafforeau, L.; Dieu, M.; Raes, M.; Vandenhaute, J.; Werner, M.; Hermand, D. EMBO J. 2007, 26, 1552. (121) Takahashi, H.; Parmely, T. J.; Sato, S.; Tomomori-Sato, C.; Banks, C. A.; Kong, S. E.; Szutorisz, H.; Swanson, S. K.; Martin-Brown, S.; Washburn, M. P.; Florens, L.; Seidel, C. W.; Lin, C.; Smith, E. R.; Shilatifard, A.; Conaway, R. C.; Conaway, J. W. Cell 2011, 146, 92. (122) He, N.; Liu, M.; Hsu, J.; Xue, Y.; Chou, S.; Burlingame, A.; Krogan, N. J.; Alber, T.; Zhou, Q. Mol. Cell 2010, 38, 428. (123) Sobhian, B.; Laguette, N.; Yatim, A.; Nakamura, M.; Levy, Y.; Kiernan, R.; Benkirane, M. Mol. Cell 2010, 38, 439. (124) Smith, E.; Lin, C.; Shilatifard, A. Genes Dev. 2011, 25, 661. (125) Wyce, A.; Xiao, T.; Whelan, K. A.; Kosman, C.; Walter, W.; Eick, D.; Hughes, T. R.; Krogan, N. J.; Strahl, B. D.; Berger, S. L. Mol. Cell 2007, 27, 275. (126) Govind, C. K.; Zhang, F.; Qiu, H.; Hofmeyer, K.; Hinnebusch, A. G. Mol. Cell 2007, 25, 31. (127) Jones, J. C.; Phatnani, H. P.; Haystead, T. A.; MacDonald, J. A.; Alam, S. M.; Greenleaf, A. L. J. Biol. Chem. 2004, 279, 24957. (128) Larochelle, S.; Amat, R.; Glover-Cutter, K.; Sanso, M.; Zhang, C.; Allen, J. J.; Shokat, K. M.; Bentley, D. L.; Fisher, R. P. Nat. Struct. Mol. Biol. 2012, 19, 1108. (129) Serizawa, H.; Makela, T. P.; Conaway, J. W.; Conaway, R. C.; Weinberg, R. A.; Young, R. A. Nature 1995, 374, 280. (130) Larochelle, S.; Merrick, K. A.; Terret, M. E.; Wohlbold, L.; Barboza, N. M.; Zhang, C.; Shokat, K. M.; Jallepalli, P. V.; Fisher, R. P. Mol. Cell 2007, 25, 839. (131) Sun, Z. W.; Hampsey, M. Mol. Cell. Biol. 1996, 16, 1557. (132) He, X.; Khan, A. U.; Cheng, H.; Pappas, D. L., Jr.; Hampsey, M.; Moore, C. L. Genes Dev. 2003, 17, 1030.

dx.doi.org/10.1021/cr4001397 | Chem. Rev. XXXX, XXX, XXX−XXX

(163) Fuda, N. J.; Buckley, M. S.; Wei, W.; Core, L. J.; Waters, C. T.; Reinberg, D.; Lis, J. T. Mol. Cell. Biol. 2012, 32, 3428. (164) Yeo, M.; Lin, P. S.; Dahmus, M. E.; Gill, G. N. J. Biol. Chem. 2003, 278, 26078. (165) Zhang, Y.; Kim, Y.; Genoud, N.; Gao, J.; Kelly, J. W.; Pfaff, S. L.; Gill, G. N.; Dixon, J. E.; Noel, J. P. Mol. Cell 2006, 24, 759. (166) Yeo, M.; Lee, S. K.; Lee, B.; Ruiz, E. C.; Pfaff, S. L.; Gill, G. N. Science 2005, 307, 596. (167) Guillamot, M.; Manchado, E.; Chiesa, M.; Gomez-Lopez, G.; Pisano, D. G.; Sacristan, M. P.; Malumbres, M. Sci. Rep. 2011, 1, 189. (168) Visconti, R.; Palazzo, L.; Della Monica, R.; Grieco, D. Nat. Commun. 2012, 3, 894. (169) Kelly, W. G.; Dahmus, M. E.; Hart, G. W. J. Biol. Chem. 1993, 268, 10416. (170) Ranuncolo, S. M.; Ghosh, S.; Hanover, J. A.; Hart, G. W.; Lewis, B. A. J. Biol. Chem. 2012, 287, 23549. (171) Comer, F. I.; Hart, G. W. Biochemistry 2001, 40, 7845. (172) Li, H.; Zhang, Z.; Wang, B.; Zhang, J.; Zhao, Y.; Jin, Y. Mol. Cell. Biol. 2007, 27, 5296. (173) Chapman, R. D.; Conrad, M.; Eick, D. Mol. Cell. Biol. 2005, 25, 7665. (174) Meinhart, A.; Kamenski, T.; Hoeppner, S.; Baumli, S.; Cramer, P. Genes Dev. 2005, 19, 1401. (175) Kubicek, K.; Cerna, H.; Holub, P.; Pasulka, J.; Hrossova, D.; Loehr, F.; Hofr, C.; Vanacova, S.; Stefl, R. Genes Dev. 2012, 26, 1891. (176) Zhang, D. W.; Rodriguez-Molina, J. B.; Tietjen, J. R.; Nemec, C. M.; Ansari, A. Z. Genet. Res. Int. 2012, 2012, 347214. (177) Phatnani, H. P.; Jones, J. C.; Greenleaf, A. L. Biochemistry 2004, 43, 15702. (178) Phatnani, H. P.; Greenleaf, A. L. Methods Mol. Biol. 2004, 257, 17. (179) Usheva, A.; Maldonado, E.; Goldring, A.; Lu, H.; Houbavi, C.; Reinberg, D.; Aloni, Y. Cell 1992, 69, 871. (180) McCracken, S.; Fong, N.; Yankulov, K.; Ballantyne, S.; Pan, G.; Greenblatt, J.; Patterson, S. D.; Wickens, M.; Bentley, D. L. Nature 1997, 385, 357. (181) Barilla, D.; Lee, B. A.; Proudfoot, N. J. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 445. 
(182) Kyburz, A.; Sadowski, M.; Dichtl, B.; Keller, W. Nucleic Acids Res. 2003, 31, 3936. (183) Dichtl, B.; Blank, D.; Sadowski, M.; Hubner, W.; Weiser, S.; Keller, W. EMBO J. 2002, 21, 4125. (184) Kim, M.; Krogan, N. J.; Vasiljeva, L.; Rando, O. J.; Nedea, E.; Greenblatt, J. F.; Buratowski, S. Nature 2004, 432, 517. (185) Emili, A.; Shales, M.; McCracken, S.; Xie, W.; Tucker, P. W.; Kobayashi, R.; Blencowe, B. J.; Ingles, C. J. RNA 2002, 8, 1102. (186) Cho, E. J.; Takagi, T.; Moore, C. R.; Buratowski, S. Genes Dev. 1997, 11, 3319. (187) Cho, E. J.; Rodriguez, C. R.; Takagi, T.; Buratowski, S. Genes Dev. 1998, 12, 3482. (188) Ho, C. K.; Shuman, S. Mol. Cell 1999, 3, 405. (189) Rodriguez, C. R.; Cho, E. J.; Keogh, M. C.; Moore, C. L.; Greenleaf, A. L.; Buratowski, S. Mol. Cell. Biol. 2000, 20, 104. (190) MacKellar, A. L.; Greenleaf, A. L. J. Biol. Chem. 2011, 286, 36385. (191) Yuryev, A.; Patturajan, M.; Litingtung, Y.; Joshi, R. V.; Gentile, C.; Gebara, M.; Corden, J. L. Proc. Natl. Acad. Sci. U. S. A. 1996, 93, 6975. (192) Wu, X.; Wilcox, C. B.; Devasahayam, G.; Hackett, R. L.; Arevalo-Rodriguez, M.; Cardenas, M. E.; Heitman, J.; Hanes, S. D. EMBO J. 2000, 19, 3727. (193) Chang, A.; Cheang, S.; Espanel, X.; Sudol, M. J. Biol. Chem. 2000, 275, 20562. (194) Kang, M. E.; Dahmus, M. E. J. Biol. Chem. 1995, 270, 23390. (195) Svejstrup, J. Q.; Li, Y.; Fellows, J.; Gnatt, A.; Bjorklund, S.; Kornberg, R. D. Proc. Natl. Acad. Sci. U. S. A. 1997, 94, 6075. (196) Ng, H. H.; Robert, F.; Young, R. A.; Struhl, K. Mol. Cell 2003, 11, 709.

(133) Gavin, A. C.; Bosche, M.; Krause, R.; Grandi, P.; Marzioch, M.; Bauer, A.; Schultz, J.; Rick, J. M.; Michon, A. M.; Cruciat, C. M.; Remor, M.; Hofert, C.; Schelder, M.; Brajenovic, M.; Ruffner, H.; Merino, A.; Klein, K.; Hudak, M.; Dickson, D.; Rudi, T.; Gnau, V.; Bauch, A.; Bastuck, S.; Huhse, B.; Leutwein, C.; Heurtier, M. A.; Copley, R. R.; Edelmann, A.; Querfurth, E.; Rybin, V.; Drewes, G.; Raida, M.; Bouwmeester, T.; Bork, P.; Seraphin, B.; Kuster, B.; Neubauer, G.; Superti-Furga, G. Nature 2002, 415, 141. (134) Dichtl, B.; Blank, D.; Ohnacker, M.; Friedlein, A.; Roeder, D.; Langen, H.; Keller, W. Mol. Cell 2002, 10, 1139. (135) Steinmetz, E. J.; Brow, D. A. Mol. Cell. Biol. 2003, 23, 6339. (136) Nedea, E.; He, X.; Kim, M.; Pootoolal, J.; Zhong, G.; Canadien, V.; Hughes, T.; Buratowski, S.; Moore, C. L.; Greenblatt, J. J. Biol. Chem. 2003, 278, 33000. (137) Ganem, C.; Devaux, F.; Torchet, C.; Jacq, C.; Quevillon-Cheruel, S.; Labesse, G.; Facca, C.; Faye, G. EMBO J. 2003, 22, 1588. (138) Pappas, D. L., Jr.; Hampsey, M. Mol. Cell. Biol. 2000, 20, 8343. (139) Krishnamurthy, S.; He, X.; Reyes-Reyes, M.; Moore, C.; Hampsey, M. Mol. Cell 2004, 14, 387. (140) Gemmill, T. R.; Wu, X.; Hanes, S. D. J. Biol. Chem. 2005, 280, 15510. (141) Krishnamurthy, S.; Ghazy, M. A.; Moore, C.; Hampsey, M. Mol. Cell. Biol. 2009, 29, 2925. (142) Singh, N.; Ma, Z.; Gemmill, T.; Wu, X.; Defiglio, H.; Rossettini, A.; Rabeler, C.; Beane, O.; Morse, R. H.; Palumbo, M. J.; Hanes, S. D. Mol. Cell 2009, 36, 255. (143) Werner-Allen, J. W.; Lee, C. J.; Liu, P.; Nicely, N. I.; Wang, S.; Greenleaf, A. L.; Zhou, P. J. Biol. Chem. 2011, 286, 5717. (144) Xiang, K.; Nagaike, T.; Xiang, S.; Kilic, T.; Beh, M. M.; Manley, J. L.; Tong, L. Nature 2010, 467, 729. (145) Xiang, K.; Manley, J. L.; Tong, L. Genes Dev. 2012, 26, 2265. (146) Cho, H.; Kim, T. K.; Mancebo, H.; Lane, W. S.; Flores, O.; Reinberg, D. Genes Dev. 1999, 13, 1540.
(147) Kobor, M. S.; Archambault, J.; Lester, W.; Holstege, F. C.; Gileadi, O.; Jansma, D. B.; Jennings, E. G.; Kouyoumdjian, F.; Davidson, A. R.; Young, R. A.; Greenblatt, J. Mol. Cell 1999, 4, 55. (148) Kong, S. E.; Kobor, M. S.; Krogan, N. J.; Somesh, B. P.; Sogaard, T. M.; Greenblatt, J. F.; Svejstrup, J. Q. J. Biol. Chem. 2005, 280, 4299. (149) Hausmann, S.; Erdjument-Bromage, H.; Shuman, S. J. Biol. Chem. 2004, 279, 10892. (150) Hausmann, S.; Shuman, S. J. Biol. Chem. 2002, 277, 21213. (151) Lin, P. S.; Dubois, M. F.; Dahmus, M. E. J. Biol. Chem. 2002, 277, 45949. (152) Licciardo, P.; Amente, S.; Ruggiero, L.; Monti, M.; Pucci, P.; Lania, L.; Majello, B. Nucleic Acids Res. 2003, 31, 999. (153) Kobor, M. S.; Simon, L. D.; Omichinski, J.; Zhong, G.; Archambault, J.; Greenblatt, J. Mol. Cell. Biol. 2000, 20, 7438. (154) Kamada, K.; Roeder, R. G.; Burley, S. K. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 2296. (155) Kimura, M.; Suzuki, H.; Ishihama, A. Mol. Cell. Biol. 2002, 22, 1577. (156) Archambault, J.; Chambers, R. S.; Kobor, M. S.; Ho, Y.; Cartier, M.; Bolotin, D.; Andrews, B.; Kane, C. M.; Greenblatt, J. Proc. Natl. Acad. Sci. U. S. A. 1997, 94, 14300. (157) Archambault, J.; Pan, G.; Dahmus, G. K.; Cartier, M.; Marshall, N.; Zhang, S.; Dahmus, M. E.; Greenblatt, J. J. Biol. Chem. 1998, 273, 27593. (158) Nguyen, B. D.; Abbott, K. L.; Potempa, K.; Kobor, M. S.; Archambault, J.; Greenblatt, J.; Legault, P.; Omichinski, J. G. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 5688. (159) Friedl, E. M.; Lane, W. S.; Erdjument-Bromage, H.; Tempst, P.; Reinberg, D. Proc. Natl. Acad. Sci. U. S. A. 2003, 100, 2328. (160) Kops, O.; Zhou, X. Z.; Lu, K. P. FEBS Lett. 2002, 513, 305. (161) Suh, M. H.; Ye, P.; Zhang, M.; Hausmann, S.; Shuman, S.; Gnatt, A. L.; Fu, J. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 17314. (162) Sharma, N.; Kumari, R. Crit. Rev. Microbiol. 2012, DOI: 10.3109/1040841X.2012.711742.

(197) Yoh, S. M.; Cho, H.; Pickle, L.; Evans, R. M.; Jones, K. A. Genes Dev. 2007, 21, 160. (198) Kizer, K. O.; Phatnani, H. P.; Shibata, Y.; Hall, H.; Greenleaf, A. L.; Strahl, B. D. Mol. Cell. Biol. 2005, 25, 3305. (199) Dermody, J. L.; Dreyfuss, J. M.; Villen, J.; Ogundipe, B.; Gygi, S. P.; Park, P. J.; Ponticelli, A. S.; Moore, C. L.; Buratowski, S.; Bucheli, M. E. PLoS One 2008, 3, e3273. (200) Govind, C. K.; Qiu, H.; Ginsburg, D. S.; Ruan, C.; Hofmeyer, K.; Hu, C.; Swaminathan, V.; Workman, J. L.; Li, B.; Hinnebusch, A. G. Mol. Cell 2010, 39, 234. (201) Drouin, S.; Laramee, L.; Jacques, P. E.; Forest, A.; Bergeron, M.; Robert, F. PLoS Genet. 2010, 6, e1001173. (202) Ginsburg, D. S.; Govind, C. K.; Hinnebusch, A. G. Mol. Cell. Biol. 2009, 29, 6473. (203) Mason, P. B.; Struhl, K. Mol. Cell. Biol. 2003, 23, 8323. (204) Pascual-Garcia, P.; Govind, C. K.; Queralt, E.; Cuenca-Bono, B.; Llopis, A.; Chavez, S.; Hinnebusch, A. G.; Rodriguez-Navarro, S. Genes Dev. 2008, 22, 2811. (205) Daulny, A.; Geng, F.; Muratani, M.; Geisinger, J. M.; Salghetti, S. E.; Tansey, W. P. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 19649. (206) Morris, D. P.; Greenleaf, A. L. J. Biol. Chem. 2000, 275, 39935. (207) Liu, J.; Zhang, J.; Gong, Q.; Xiong, P.; Huang, H.; Wu, B.; Lu, G.; Wu, J.; Shi, Y. J. Biol. Chem. 2011, 286, 29218. (208) Sun, M.; Lariviere, L.; Dengl, S.; Mayer, A.; Cramer, P. J. Biol. Chem. 2010, 285, 41597. (209) Close, D.; Johnson, S. J.; Sdano, M. A.; McDonald, S. M.; Robinson, H.; Formosa, T.; Hill, C. P. J. Mol. Biol. 2011, 408, 697. (210) Hollingworth, D.; Noble, C. G.; Taylor, I. A.; Ramos, A. RNA 2006, 12, 555. (211) Li, M.; Phatnani, H. P.; Guan, Z.; Sage, H.; Greenleaf, A. L.; Zhou, P. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 17636. (212) Noble, C. G.; Hollingworth, D.; Martin, S. R.; Ennis-Adeniran, V.; Smerdon, S. J.; Kelly, G.; Taylor, I. A.; Ramos, A. Nat. Struct. Mol. Biol. 2005, 12, 144. (213) Jasnovidova, O.; Stefl, R. Wiley Interdiscip. 
Rev.: RNA 2012, 4, 1. (214) Licatalosi, D. D.; Geiger, G.; Minet, M.; Schroeder, S.; Cilli, K.; McNeil, J. B.; Bentley, D. L. Mol. Cell 2002, 9, 1101. (215) Sadowski, M.; Dichtl, B.; Hubner, W.; Keller, W. EMBO J. 2003, 22, 2167. (216) Meinhart, A.; Cramer, P. Nature 2004, 430, 223. (217) Conrad, N. K.; Wilson, S. M.; Steinmetz, E. J.; Patturajan, M.; Brow, D. A.; Swanson, M. S.; Corden, J. L. Genetics 2000, 154, 557. (218) Steinmetz, E. J.; Conrad, N. K.; Brow, D. A.; Corden, J. L. Nature 2001, 413, 327. (219) Vasiljeva, L.; Kim, M.; Mutschler, H.; Buratowski, S.; Meinhart, A. Nat. Struct. Mol. Biol. 2008, 15, 795. (220) Lunde, B. M.; Reichow, S. L.; Kim, M.; Suh, H.; Leeper, T. C.; Yang, F.; Mutschler, H.; Buratowski, S.; Meinhart, A.; Varani, G. Nat. Struct. Mol. Biol. 2010, 17, 1195. (221) Patturajan, M.; Wei, X.; Berezney, R.; Corden, J. L. Mol. Cell. Biol. 1998, 18, 2406. (222) Becker, R.; Loll, B.; Meinhart, A. J. Biol. Chem. 2008, 283, 22659. (223) Ni, Z.; Olsen, J. B.; Guo, X.; Zhong, G.; Ruan, E. D.; Marcon, E.; Young, P.; Guo, H.; Li, J.; Moffat, J.; Emili, A.; Greenblatt, J. F. Transcription 2011, 2, 237. (224) Zhang, Z.; Fu, J.; Gilmour, D. S. Genes Dev. 2005, 19, 1572. (225) Fabrega, C.; Shen, V.; Shuman, S.; Lima, C. D. Mol. Cell 2003, 11, 1549. (226) Ghosh, A.; Shuman, S.; Lima, C. D. Mol. Cell 2011, 43, 299. (227) Sudol, M.; Hunter, T. Cell 2000, 103, 1001. (228) Wang, G.; Yang, J.; Huibregtse, J. M. Mol. Cell. Biol. 1999, 19, 342. (229) Somesh, B. P.; Reid, J.; Liu, W. F.; Sogaard, T. M.; Erdjument-Bromage, H.; Tempst, P.; Svejstrup, J. Q. Cell 2005, 121, 913. (230) Morris, D. P.; Phatnani, H. P.; Greenleaf, A. L. J. Biol. Chem. 1999, 274, 31583.

(231) Ma, Z.; Atencio, D.; Barnes, C.; DeFiglio, H.; Hanes, S. D. Mol. Cell. Biol. 2012, 32, 3594. (232) Verdecia, M. A.; Bowman, M. E.; Lu, K. P.; Hunter, T.; Noel, J. P. Nat. Struct. Biol. 2000, 7, 639. (233) Bedford, M. T.; Leder, P. Trends Biochem. Sci. 1999, 24, 264. (234) Carty, S. M.; Goldstrohm, A. C.; Sune, C.; Garcia-Blanco, M. A.; Greenleaf, A. L. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 9015. (235) Allen, M.; Friedler, A.; Schon, O.; Bycroft, M. J. Mol. Biol. 2002, 323, 411. (236) Gasch, A.; Wiesner, S.; Martin-Malpartida, P.; Ramirez-Espain, X.; Ruiz, L.; Macias, M. J. J. Biol. Chem. 2006, 281, 356. (237) Lu, M.; Yang, J.; Ren, Z.; Sabui, S.; Espejo, A.; Bedford, M. T.; Jacobson, R. H.; Jeruzalmi, D.; McMurray, J. S.; Chen, X. J. Mol. Biol. 2009, 393, 397. (238) Murphy, J. M.; Hansen, D. F.; Wiesner, S.; Muhandiram, D. R.; Borg, M.; Smith, M. J.; Sicheri, F.; Kay, L. E.; Forman-Kay, J. D.; Pawson, T. J. Mol. Biol. 2009, 393, 409. (239) Bonet, R.; Ruiz, L.; Morales, B.; Macias, M. J. Proteins 2009, 77, 1000. (240) Smith, M. J.; Kulkarni, S.; Pawson, T. Mol. Cell. Biol. 2004, 24, 9274. (241) Johnson, S. A.; Kim, H.; Erickson, B.; Bentley, D. L. Nat. Struct. Mol. Biol. 2011, 18, 1164. (242) Clery, A.; Blatter, M.; Allain, F. H. Curr. Opin. Struct. Biol. 2008, 18, 290. (243) David, C. J.; Boyne, A. R.; Millhouse, S. R.; Manley, J. L. Genes Dev. 2011, 25, 972. (244) Vojnic, E.; Simon, B.; Strahl, B. D.; Sattler, M.; Cramer, P. J. Biol. Chem. 2006, 281, 13. (245) Kanagaraj, R.; Huehn, D.; MacKellar, A.; Menigatti, M.; Zheng, L.; Urban, V.; Shevelev, I.; Greenleaf, A. L.; Janscak, P. Nucleic Acids Res. 2010, 38, 8131. (246) Islam, M. N.; Fox, D., 3rd; Guo, R.; Enomoto, T.; Wang, W. Mol. Cell. Biol. 2010, 30, 2460. (247) Li, M.; Xu, X.; Liu, Y. Mol. Cell. Biol. 2011, 31, 2090. (248) Pawson, T. Cell 2004, 116, 191. (249) Maclennan, A. J.; Shaw, G. Trends Biochem. Sci. 1993, 18, 464. (250) Dengl, S.; Mayer, A.; Sun, M.; Cramer, P. J. Mol. Biol. 2009, 389, 211. 
(251) Diebold, M. L.; Loeliger, E.; Koch, M.; Winston, F.; Cavarelli, J.; Romier, C. J. Biol. Chem. 2010, 285, 38389. (252) Ansari, S. A.; Morse, R. H. Cell. Mol. Life Sci. 2013, DOI: 10.1007/s00018-013-1265-9. (253) Thompson, C. M.; Young, R. A. Proc. Natl. Acad. Sci. U. S. A. 1995, 92, 4587. (254) Kim, Y. J.; Bjorklund, S.; Li, Y.; Sayre, M. H.; Kornberg, R. D. Cell 1994, 77, 599. (255) Thompson, C. M.; Koleske, A. J.; Chao, D. M.; Young, R. A. Cell 1993, 73, 1361. (256) Myers, L. C.; Gustafsson, C. M.; Bushnell, D. A.; Lui, M.; Erdjument-Bromage, H.; Tempst, P.; Kornberg, R. D. Genes Dev. 1998, 12, 45. (257) Nonet, M. L.; Young, R. A. Genetics 1989, 123, 715. (258) Zehring, W. A.; Greenleaf, A. L. J. Biol. Chem. 1990, 265, 8351. (259) Kim, W. Y.; Dahmus, M. E. J. Biol. Chem. 1989, 264, 3169. (260) Buratowski, S.; Hahn, S.; Guarente, L.; Sharp, P. A. Cell 1989, 56, 549. (261) Conaway, R. C.; Conaway, J. W. Curr. Opin. Genet. Dev. 2011, 21, 225. (262) Sogaard, T. M.; Svejstrup, J. Q. J. Biol. Chem. 2007, 282, 14113. (263) Liu, Y.; Kung, C.; Fishburn, J.; Ansari, A. Z.; Shokat, K. M.; Hahn, S. Mol. Cell. Biol. 2004, 24, 1721. (264) Robinson, P. J.; Bushnell, D. A.; Trnka, M. J.; Burlingame, A. L.; Kornberg, R. D. Proc. Natl. Acad. Sci. U. S. A. 2012, 109, 17931. (265) Holstege, F. C.; Jennings, E. G.; Wyrick, J. J.; Lee, T. I.; Hengartner, C. J.; Green, M. R.; Golub, T. R.; Lander, E. S.; Young, R. A. Cell 1998, 95, 717.

(266) Kanin, E. I.; Kipp, R. T.; Kung, C.; Slattery, M.; Viale, A.; Hahn, S.; Shokat, K. M.; Ansari, A. Z. Proc. Natl. Acad. Sci. U. S. A. 2007, 104, 5812. (267) Hong, S. W.; Hong, S. M.; Yoo, J. W.; Lee, Y. C.; Kim, S.; Lis, J. T.; Lee, D. K. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 14276. (268) Rasmussen, E. B.; Lis, J. T. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 7923. (269) Rasmussen, E. B.; Lis, J. T. J. Mol. Biol. 1995, 252, 522. (270) Gilmour, D. S.; Lis, J. T. Mol. Cell. Biol. 1986, 6, 3984. (271) Rougvie, A. E.; Lis, J. T. Cell 1988, 54, 795. (272) Giardina, C.; Perez-Riba, M.; Lis, J. T. Genes Dev. 1992, 6, 2190. (273) Zhou, Q.; Li, T.; Price, D. H. Annu. Rev. Biochem. 2012, 81, 119. (274) Marshall, N. F.; Peng, J.; Xie, Z.; Price, D. H. J. Biol. Chem. 1996, 271, 27176. (275) Yamada, T.; Yamaguchi, Y.; Inukai, N.; Okamoto, S.; Mura, T.; Handa, H. Mol. Cell 2006, 21, 227. (276) Chen, H.; Contreras, X.; Yamaguchi, Y.; Handa, H.; Peterlin, B. M.; Guo, S. PLoS One 2009, 4, e6918. (277) Fujinaga, K.; Irwin, D.; Huang, Y.; Taube, R.; Kurosu, T.; Peterlin, B. M. Mol. Cell. Biol. 2004, 24, 787. (278) Kaplan, C. D.; Jin, H.; Zhang, I. L.; Belyanin, A. PLoS Genet. 2012, 8, e1002627. (279) Lindstrom, D. L.; Hartzog, G. A. Genetics 2001, 159, 487. (280) Jona, G.; Wittschieben, B. O.; Svejstrup, J. Q.; Gileadi, O. Gene 2001, 267, 31. (281) Lee, J. M.; Greenleaf, A. L. J. Biol. Chem. 1997, 272, 10990. (282) Ahn, S. H.; Keogh, M. C.; Buratowski, S. EMBO J. 2009, 28, 205. (283) Ahn, S. H.; Kim, M.; Buratowski, S. Mol. Cell 2004, 13, 67. (284) Gu, B.; Eick, D.; Bensaude, O. Nucleic Acids Res. 2013, 41, 1591. (285) Shuman, S. Prog. Nucleic Acid Res. Mol. Biol. 2001, 66, 1. (286) Ghosh, A.; Lima, C. D. Wiley Interdiscip. Rev.: RNA 2010, 1, 152. (287) Bentley, D. L. Curr. Opin. Cell Biol. 2005, 17, 251. (288) McCracken, S.; Fong, N.; Rosonina, E.; Yankulov, K.; Brothers, G.; Siderovski, D.; Hessel, A.; Foster, S.; Shuman, S.; Bentley, D. L. Genes Dev. 1997, 11, 3306. 
(289) Ho, C. K.; Lehman, K.; Shuman, S. Nucleic Acids Res. 1999, 27, 4671. (290) Takase, Y.; Takagi, T.; Komarnitsky, P. B.; Buratowski, S. Mol. Cell. Biol. 2000, 20, 9307. (291) Pei, Y.; Hausmann, S.; Ho, C. K.; Schwer, B.; Shuman, S. J. Biol. Chem. 2001, 276, 28075. (292) Moteki, S.; Price, D. Mol. Cell 2002, 10, 599. (293) Wen, Y.; Shatkin, A. J. Genes Dev. 1999, 13, 1774. (294) Ivanov, D.; Kwak, Y. T.; Guo, J.; Gaynor, R. B. Mol. Cell. Biol. 2000, 20, 2970. (295) Ping, Y. H.; Rana, T. M. J. Biol. Chem. 2001, 276, 12951. (296) Lavoie, S. B.; Albert, A. L.; Handa, H.; Vincent, M.; Bensaude, O. J. Mol. Biol. 2001, 312, 675. (297) Zhou, K.; Kuo, W. H.; Fillingham, J.; Greenblatt, J. F. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 6956. (298) Liu, Y.; Warfield, L.; Zhang, C.; Luo, J.; Allen, J.; Lang, W. H.; Ranish, J.; Shokat, K. M.; Hahn, S. Mol. Cell. Biol. 2009, 29, 4852. (299) Pei, Y.; Shuman, S. J. Biol. Chem. 2003, 278, 43346. (300) Pei, Y.; Shuman, S. J. Biol. Chem. 2002, 277, 19639. (301) Pei, Y.; Schwer, B.; Shuman, S. J. Biol. Chem. 2003, 278, 7180. (302) Schneider, S.; Pei, Y.; Shuman, S.; Schwer, B. Mol. Cell. Biol. 2010, 30, 2353. (303) Lindstrom, D. L.; Squazzo, S. L.; Muster, N.; Burckin, T. A.; Wachter, K. C.; Emigh, C. A.; McCleery, J. A.; Yates, J. R., 3rd; Hartzog, G. A. Mol. Cell. Biol. 2003, 23, 1368. (304) Mandal, S. S.; Chu, C.; Wada, T.; Handa, H.; Shatkin, A. J.; Reinberg, D. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 7572. (305) Pei, Y.; Du, H.; Singer, J.; Stamour, C.; Granitto, S.; Shuman, S.; Fisher, R. P. Mol. Cell. Biol. 2006, 26, 777.

(306) Myers, L. C.; Lacomis, L.; Erdjument-Bromage, H.; Tempst, P. Mol. Cell 2002, 10, 883. (307) Schroeder, S. C.; Zorio, D. A.; Schwer, B.; Shuman, S.; Bentley, D. Mol. Cell 2004, 13, 377. (308) Perales, R.; Bentley, D. Mol. Cell 2009, 36, 178. (309) Kim, H. J.; Jeong, S. H.; Heo, J. H.; Jeong, S. J.; Kim, S. T.; Youn, H. D.; Han, J. W.; Lee, H. W.; Cho, E. J. Mol. Cell. Biol. 2004, 24, 6184. (310) Beyer, A. L.; Osheim, Y. N. Genes Dev. 1988, 2, 754. (311) Misteli, T.; Caceres, J. F.; Spector, D. L. Nature 1997, 387, 523. (312) Moore, M. J.; Proudfoot, N. J. Cell 2009, 136, 688. (313) Montes, M.; Becerra, S.; Sanchez-Alvarez, M.; Sune, C. Gene 2012, 501, 104. (314) Hirose, Y.; Tacke, R.; Manley, J. L. Genes Dev. 1999, 13, 1234. (315) Misteli, T.; Spector, D. L. Mol. Cell 1999, 3, 697. (316) Du, L.; Warren, S. L. J. Cell Biol. 1997, 136, 5. (317) Kim, E.; Du, L.; Bregman, D. B.; Warren, S. L. J. Cell Biol. 1997, 136, 19. (318) Mortillaro, M. J.; Blencowe, B. J.; Wei, X.; Nakayasu, H.; Du, L.; Warren, S. L.; Sharp, P. A.; Berezney, R. Proc. Natl. Acad. Sci. U. S. A. 1996, 93, 8253. (319) de la Mata, M.; Kornblihtt, A. R. Nat. Struct. Mol. Biol. 2006, 13, 973. (320) Das, R.; Yu, J.; Zhang, Z.; Gygi, M. P.; Krainer, A. R.; Gygi, S. P.; Reed, R. Mol. Cell 2007, 26, 867. (321) Sapra, A. K.; Anko, M. L.; Grishina, I.; Lorenz, M.; Pabis, M.; Poser, I.; Rollins, J.; Weiland, E. M.; Neugebauer, K. M. Mol. Cell 2009, 34, 179. (322) Gornemann, J.; Barrandon, C.; Hujer, K.; Rutz, B.; Rigaut, G.; Kotovic, K. M.; Faux, C.; Neugebauer, K. M.; Seraphin, B. RNA 2011, 17, 2119. (323) Montes, M.; Cloutier, A.; Sanchez-Hernandez, N.; Michelle, L.; Lemieux, B.; Blanchette, M.; Hernandez-Munain, C.; Chabot, B.; Sune, C. Mol. Cell. Biol. 2012, 32, 751. (324) Rosonina, E.; Ip, J. Y.; Calarco, J. A.; Bakowski, M. A.; Emili, A.; McCracken, S.; Tucker, P.; Ingles, C. J.; Blencowe, B. J. Mol. Cell. Biol. 2005, 25, 6734. (325) David, C. J.; Manley, J. L. Transcription 2011, 2, 221. 
(326) Wahl, M. C.; Will, C. L.; Luhrmann, R. Cell 2009, 136, 701. (327) Eperon, L. P.; Graham, I. R.; Griffiths, A. D.; Eperon, I. C. Cell 1988, 54, 393. (328) Roberts, G. C.; Gooding, C.; Mak, H. Y.; Proudfoot, N. J.; Smith, C. W. Nucleic Acids Res. 1998, 26, 5568. (329) Dujardin, G.; Lafaille, C.; Petrillo, E.; Buggiano, V.; Gomez Acuna, L. I.; Fiszbein, A.; Godoy Herz, M. A.; Nieto Moreno, N.; Munoz, M. J.; Allo, M.; Schor, I. E.; Kornblihtt, A. R. Biochim. Biophys. Acta 2013, 1829, 134. (330) Carrillo Oesterreich, F.; Bieberstein, N.; Neugebauer, K. M. Trends Cell Biol. 2011, 21, 328. (331) Munoz, M. J.; Perez Santangelo, M. S.; Paronetto, M. P.; de la Mata, M.; Pelisch, F.; Boireau, S.; Glover-Cutter, K.; Ben-Dov, C.; Blaustein, M.; Lozano, J. J.; Bird, G.; Bentley, D.; Bertrand, E.; Kornblihtt, A. R. Cell 2009, 137, 708. (332) Close, P.; East, P.; Dirac-Svejstrup, A. B.; Hartmann, H.; Heron, M.; Maslen, S.; Chariot, A.; Soding, J.; Skehel, M.; Svejstrup, J. Q. Nature 2012, 484, 386. (333) Colin, J.; Libri, D.; Porrua, O. Genet. Res. Int. 2011, 2011, 653494. (334) Egloff, S.; O’Reilly, D.; Murphy, S. Biochem. Soc. Trans. 2008, 36, 590. (335) Mischo, H. E.; Proudfoot, N. J. Biochim. Biophys. Acta 2013, 1829, 174. (336) Kuehner, J. N.; Pearson, E. L.; Moore, C. Nat. Rev. Mol. Cell Biol. 2011, 12, 283. (337) Richard, P.; Manley, J. L. Genes Dev. 2009, 23, 1247. (338) Rosonina, E.; Kaneko, S.; Manley, J. L. Genes Dev. 2006, 20, 1050. (339) Rondon, A. G.; Mischo, H. E.; Proudfoot, N. J. Nat. Struct. Mol. Biol. 2008, 15, 775.

(340) Hocine, S.; Singer, R. H.; Grunwald, D. Cold Spring Harbor Perspect. Biol. 2010, 2, a000752. (341) Chan, S.; Choi, E. A.; Shi, Y. Wiley Interdiscip. Rev.: RNA 2011, 2, 321. (342) Proudfoot, N. J. Genes Dev. 2011, 25, 1770. (343) Millevoi, S.; Vagner, S. Nucleic Acids Res. 2010, 38, 2757. (344) Hirose, Y.; Manley, J. L. Nature 1998, 395, 93. (345) Ryan, K.; Murthy, K. G.; Kaneko, S.; Manley, J. L. Mol. Cell. Biol. 2002, 22, 1684. (346) Fong, N.; Bentley, D. L. Genes Dev. 2001, 15, 1783. (347) Skaar, D. A.; Greenleaf, A. L. Mol. Cell 2002, 10, 1429. (348) Ni, Z.; Schwartz, B. E.; Werner, J.; Suarez, J. R.; Lis, J. T. Mol. Cell 2004, 13, 55. (349) Marzluff, W. F.; Wagner, E. J.; Duronio, R. J. Nat. Rev. Genet. 2008, 9, 843. (350) Pirngruber, J.; Shchebet, A.; Schreiber, L.; Shema, E.; Minsky, N.; Chapman, R. D.; Eick, D.; Aylon, Y.; Oren, M.; Johnsen, S. A. EMBO Rep. 2009, 10, 894. (351) Fischer, U.; Englbrecht, C.; Chari, A. Wiley Interdiscip. Rev.: RNA 2011, 2, 718. (352) Jacobs, E. Y.; Ogiwara, I.; Weiner, A. M. Mol. Cell. Biol. 2004, 24, 846. (353) Medlin, J. E.; Uguen, P.; Taylor, A.; Bentley, D. L.; Murphy, S. EMBO J. 2003, 22, 925. (354) Baillat, D.; Hakimi, M. A.; Naar, A. M.; Shilatifard, A.; Cooch, N.; Shiekhattar, R. Cell 2005, 123, 265. (355) Egloff, S.; O’Reilly, D.; Chapman, R. D.; Taylor, A.; Tanzhaus, K.; Pitts, L.; Eick, D.; Murphy, S. Science 2007, 318, 1777. (356) Egloff, S.; Szczepaniak, S. A.; Dienstbier, M.; Taylor, A.; Knight, S.; Murphy, S. J. Biol. Chem. 2010, 285, 20564. (357) Jeronimo, C.; Forget, D.; Bouchard, A.; Li, Q.; Chua, G.; Poitras, C.; Therien, C.; Bergeron, D.; Bourassa, S.; Greenblatt, J.; Chabot, B.; Poirier, G. G.; Hughes, T. R.; Blanchette, M.; Price, D. H.; Coulombe, B. Mol. Cell 2007, 27, 262. (358) Birse, C. E.; Minvielle-Sebastia, L.; Lee, B. A.; Keller, W.; Proudfoot, N. J. Science 1998, 280, 298. (359) Logan, J.; Falck-Pedersen, E.; Darnell, J. E., Jr.; Shenk, T. Proc. Natl. Acad. Sci. U. S. A. 1987, 84, 8306. 
(360) West, S.; Gromak, N.; Proudfoot, N. J. Nature 2004, 432, 522. (361) Kaneko, S.; Rozenblatt-Rosen, O.; Meyerson, M.; Manley, J. L. Genes Dev. 2007, 21, 1779. (362) West, S.; Proudfoot, N. J.; Dye, M. J. Mol. Cell 2008, 29, 600. (363) Luo, W.; Johnson, A. W.; Bentley, D. L. Genes Dev. 2006, 20, 954. (364) Ursic, D.; Himmel, K. L.; Gurley, K. A.; Webb, F.; Culbertson, M. R. Nucleic Acids Res. 1997, 25, 4778. (365) Arigo, J. T.; Eyler, D. E.; Carroll, K. L.; Corden, J. L. Mol. Cell 2006, 23, 841. (366) Thiebaut, M.; Kisseleva-Romanova, E.; Rougemaille, M.; Boulay, J.; Libri, D. Mol. Cell 2006, 23, 853. (367) Rondon, A. G.; Mischo, H. E.; Kawauchi, J.; Proudfoot, N. J. Mol. Cell 2009, 36, 88. (368) Vasiljeva, L.; Buratowski, S. Mol. Cell 2006, 21, 239. (369) Lykke-Andersen, S.; Jensen, T. H. Biochimie 2007, 89, 1177. (370) Steinmetz, E. J.; Brow, D. A. Mol. Cell. Biol. 1996, 16, 6993. (371) Rasmussen, T. P.; Culbertson, M. R. Mol. Cell. Biol. 1998, 18, 6885. (372) Kim, M.; Vasiljeva, L.; Rando, O. J.; Zhelkovsky, A.; Moore, C.; Buratowski, S. Mol. Cell 2006, 24, 723. (373) Carroll, K. L.; Pradhan, D. A.; Granek, J. A.; Clarke, N. D.; Corden, J. L. Mol. Cell. Biol. 2004, 24, 6241. (374) Steinmetz, E. J.; Ng, S. B.; Cloute, J. P.; Brow, D. A. Mol. Cell. Biol. 2006, 26, 2688. (375) Gudipati, R. K.; Villa, T.; Boulay, J.; Libri, D. Nat. Struct. Mol. Biol. 2008, 15, 786. (376) Ursic, D.; Chinchilla, K.; Finkel, J. S.; Culbertson, M. R. Nucleic Acids Res. 2004, 32, 2441. (377) Chinchilla, K.; Rodriguez-Molina, J. B.; Ursic, D.; Finkel, J. S.; Ansari, A. Z.; Culbertson, M. R. Eukaryotic Cell 2012, 11, 417.

(378) Kawauchi, J.; Mischo, H.; Braglia, P.; Rondon, A.; Proudfoot, N. J. Genes Dev. 2008, 22, 1082. (379) Banerjee, A.; Sammarco, M. C.; Ditch, S.; Wang, J.; Grabczyk, E. PLoS One 2009, 4, e6193. (380) Suraweera, A.; Lim, Y.; Woods, R.; Birrell, G. W.; Nasim, T.; Becherel, O. J.; Lavin, M. F. Hum. Mol. Genet. 2009, 18, 3384. (381) Steinmetz, E. J.; Warren, C. L.; Kuehner, J. N.; Panbehi, B.; Ansari, A. Z.; Brow, D. A. Mol. Cell 2006, 24, 735. (382) Oeffinger, M.; Zenklusen, D. Biochim. Biophys. Acta 2012, 1819, 494. (383) Tutucci, E.; Stutz, F. Nat. Rev. Mol. Cell Biol. 2011, 12, 377. (384) Rodriguez-Navarro, S.; Hurt, E. Curr. Opin. Cell Biol. 2011, 23, 302. (385) Rondon, A. G.; Jimeno, S.; Aguilera, A. Biochim. Biophys. Acta 2010, 1799, 533. (386) Katahira, J.; Yoneda, Y. RNA Biol. 2009, 6, 149. (387) Zenklusen, D.; Vinciguerra, P.; Wyss, J. C.; Stutz, F. Mol. Cell. Biol. 2002, 22, 8241. (388) Strasser, K.; Masuda, S.; Mason, P.; Pfannstiel, J.; Oppizzi, M.; Rodriguez-Navarro, S.; Rondon, A. G.; Aguilera, A.; Struhl, K.; Reed, R.; Hurt, E. Nature 2002, 417, 304. (389) Johnson, S. A.; Cubberley, G.; Bentley, D. L. Mol. Cell 2009, 33, 215. (390) Kim, M.; Ahn, S. H.; Krogan, N. J.; Greenblatt, J. F.; Buratowski, S. EMBO J. 2004, 23, 354. (391) Chanarat, S.; Seizl, M.; Strasser, K. Genes Dev. 2011, 25, 1147. (392) Li, J.; Moazed, D.; Gygi, S. P. J. Biol. Chem. 2002, 277, 49383. (393) Krogan, N. J.; Kim, M.; Tong, A.; Golshani, A.; Cagney, G.; Canadien, V.; Richards, D. P.; Beattie, B. K.; Emili, A.; Boone, C.; Shilatifard, A.; Buratowski, S.; Greenblatt, J. Mol. Cell. Biol. 2003, 23, 4207. (394) Li, B.; Howe, L.; Anderson, S.; Yates, J. R., 3rd; Workman, J. L. J. Biol. Chem. 2003, 278, 8897. (395) Schaft, D.; Roguev, A.; Kotovic, K. M.; Shevchenko, A.; Sarov, M.; Shevchenko, A.; Neugebauer, K. M.; Stewart, A. F. Nucleic Acids Res. 2003, 31, 2475. (396) Li, B.; Carey, M.; Workman, J. L. Cell 2007, 128, 707. (397) Rando, O. J.; Winston, F. Genetics 2012, 190, 351. 
(398) Petesch, S. J.; Lis, J. T. Trends Genet. 2012, 28, 285. (399) Selth, L. A.; Sigurdsson, S.; Svejstrup, J. Q. Annu. Rev. Biochem. 2010, 79, 271. (400) Lee, K. K.; Workman, J. L. Nat. Rev. Mol. Cell Biol. 2007, 8, 284. (401) Lee, K. K.; Florens, L.; Swanson, S. K.; Washburn, M. P.; Workman, J. L. Mol. Cell. Biol. 2005, 25, 1173. (402) Ingvarsdottir, K.; Krogan, N. J.; Emre, N. C.; Wyce, A.; Thompson, N. J.; Emili, A.; Hughes, T. R.; Greenblatt, J. F.; Berger, S. L. Mol. Cell. Biol. 2005, 25, 1162. (403) Daniel, J. A.; Torok, M. S.; Sun, Z. W.; Schieltz, D.; Allis, C. D.; Yates, J. R., 3rd; Grant, P. A. J. Biol. Chem. 2004, 279, 1867. (404) Henry, K. W.; Wyce, A.; Lo, W. S.; Duggan, L. J.; Emre, N. C.; Kao, C. F.; Pillus, L.; Shilatifard, A.; Osley, M. A.; Berger, S. L. Genes Dev. 2003, 17, 2648. (405) Batta, K.; Zhang, Z.; Yen, K.; Goffman, D. B.; Pugh, B. F. Genes Dev. 2011, 25, 2254. (406) Schwabish, M. A.; Struhl, K. Mol. Cell. Biol. 2007, 27, 6987. (407) Carey, M.; Li, B.; Workman, J. L. Mol. Cell 2006, 24, 481. (408) Carrozza, M. J.; Li, B.; Florens, L.; Suganuma, T.; Swanson, S. K.; Lee, K. K.; Shia, W. J.; Anderson, S.; Yates, J.; Washburn, M. P.; Workman, J. L. Cell 2005, 123, 581. (409) Joshi, A. A.; Struhl, K. Mol. Cell 2005, 20, 971. (410) Keogh, M. C.; Kurdistani, S. K.; Morris, S. A.; Ahn, S. H.; Podolny, V.; Collins, S. R.; Schuldiner, M.; Chin, K.; Punna, T.; Thompson, N. J.; Boone, C.; Emili, A.; Weissman, J. S.; Hughes, T. R.; Strahl, B. D.; Grunstein, M.; Greenblatt, J. F.; Buratowski, S.; Krogan, N. J. Cell 2005, 123, 593. (411) Krogan, N. J.; Dover, J.; Wood, A.; Schneider, J.; Heidt, J.; Boateng, M. A.; Dean, K.; Ryan, O. W.; Golshani, A.; Johnston, M.; Greenblatt, J. F.; Shilatifard, A. Mol. Cell 2003, 11, 721.

dx.doi.org/10.1021/cr4001397 | Chem. Rev. XXXX, XXX, XXX−XXX


(412) Qiu, H.; Hu, C.; Gaur, N. A.; Hinnebusch, A. G. EMBO J. 2012, 31, 3494. (413) Sun, Z. W.; Allis, C. D. Nature 2002, 418, 104. (414) Dover, J.; Schneider, J.; Tawiah-Boateng, M. A.; Wood, A.; Dean, K.; Johnston, M.; Shilatifard, A. J. Biol. Chem. 2002, 277, 28368. (415) Nakanishi, S.; Lee, J. S.; Gardner, K. E.; Gardner, J. M.; Takahashi, Y. H.; Chandrasekharan, M. B.; Sun, Z. W.; Osley, M. A.; Strahl, B. D.; Jaspersen, S. L.; Shilatifard, A. J. Cell Biol. 2009, 186, 371. (416) Shilatifard, A. Annu. Rev. Biochem. 2012, 81, 65. (417) Kim, J.; Kim, J. A.; McGinty, R. K.; Nguyen, U. T.; Muir, T. W.; Allis, C. D.; Roeder, R. G. Mol. Cell 2013, 49, 1121. (418) Wood, A.; Krogan, N. J.; Dover, J.; Schneider, J.; Heidt, J.; Boateng, M. A.; Dean, K.; Golshani, A.; Zhang, Y.; Greenblatt, J. F.; Johnston, M.; Shilatifard, A. Mol. Cell 2003, 11, 267. (419) Ng, H. H.; Xu, R. M.; Zhang, Y.; Struhl, K. J. Biol. Chem. 2002, 277, 34655. (420) Briggs, S. D.; Xiao, T.; Sun, Z. W.; Caldwell, J. A.; Shabanowitz, J.; Hunt, D. F.; Allis, C. D.; Strahl, B. D. Nature 2002, 418, 498. (421) Youdell, M. L.; Kizer, K. O.; Kisseleva-Romanova, E.; Fuchs, S. M.; Duro, E.; Strahl, B. D.; Mellor, J. Mol. Cell. Biol. 2008, 28, 4915. (422) Pokholok, D. K.; Harbison, C. T.; Levine, S.; Cole, M.; Hannett, N. M.; Lee, T. I.; Bell, G. W.; Walker, K.; Rolfe, P. A.; Herbolsheimer, E.; Zeitlinger, J.; Lewitter, F.; Gifford, D. K.; Young, R. A. Cell 2005, 122, 517. (423) Rao, B.; Shibata, Y.; Strahl, B. D.; Lieb, J. D. Mol. Cell. Biol. 2005, 25, 9447. (424) Simic, R.; Lindstrom, D. L.; Tran, H. G.; Roinick, K. L.; Costa, P. J.; Johnson, A. D.; Hartzog, G. A.; Arndt, K. M. EMBO J. 2003, 22, 1846. (425) Krogan, N. J.; Kim, M.; Ahn, S. H.; Zhong, G.; Kobor, M. S.; Cagney, G.; Emili, A.; Shilatifard, A.; Buratowski, S.; Greenblatt, J. F. Mol. Cell. Biol. 2002, 22, 6979. (426) Warner, M. H.; Roinick, K. L.; Arndt, K. M. Mol. Cell. Biol. 2007, 27, 6103. (427) Adelman, K.; Wei, W.; Ardehali, M. B.; Werner, J.; Zhu, B.; Reinberg, D.; Lis, J. T. Mol. Cell. Biol. 2006, 26, 250. (428) Kwon, S. H.; Florens, L.; Swanson, S. K.; Washburn, M. P.; Abmayr, S. M.; Workman, J. L. Genes Dev. 2010, 24, 2133. (429) Pavri, R.; Zhu, B.; Li, G.; Trojer, P.; Mandal, S.; Shilatifard, A.; Reinberg, D. Cell 2006, 125, 703. (430) Wood, A.; Schneider, J.; Dover, J.; Johnston, M.; Shilatifard, A. J. Biol. Chem. 2003, 278, 34739. (431) Osley, M. A.; Fleming, A. B.; Kao, C. F. Results Probl. Cell Differ. 2006, 41, 47. (432) Smolle, M.; Workman, J. L. Biochim. Biophys. Acta 2013, 1829, 84. (433) Kaplan, C. D.; Laprade, L.; Winston, F. Science 2003, 301, 1096. (434) Jamai, A.; Puglisi, A.; Strubin, M. Mol. Cell 2009, 35, 377. (435) Ivanovska, I.; Jacques, P. E.; Rando, O. J.; Robert, F.; Winston, F. Mol. Cell. Biol. 2011, 31, 531. (436) Jensen, M. M.; Christensen, M. S.; Bonven, B.; Jensen, T. H. FEBS J. 2008, 275, 2956. (437) Silva, A. C.; Xu, X.; Kim, H. S.; Fillingham, J.; Kislinger, T.; Mennella, T. A.; Keogh, M. C. J. Biol. Chem. 2012, 287, 1709. (438) Cheung, V.; Chua, G.; Batada, N. N.; Landry, C. R.; Michnick, S. W.; Hughes, T. R.; Winston, F. PLoS Biol. 2008, 6, e277. (439) Smolle, M.; Workman, J. L.; Venkatesh, S. Epigenetics 2013, 8, 10. (440) Lickwar, C. R.; Rao, B.; Shabalin, A. A.; Nobel, A. B.; Strahl, B. D.; Lieb, J. D. PLoS One 2009, 4, e4886. (441) Kim, T.; Buratowski, S. Cell 2009, 137, 259. (442) Kim, T.; Xu, Z.; Clauder-Munster, S.; Steinmetz, L. M.; Buratowski, S. Cell 2012, 150, 1158. (443) Venkatesh, S.; Smolle, M.; Li, H.; Gogol, M. M.; Saint, M.; Kumar, S.; Natarajan, K.; Workman, J. L. Nature 2012, 489, 452. (444) Smolle, M.; Venkatesh, S.; Gogol, M. M.; Li, H.; Zhang, Y.; Florens, L.; Washburn, M. P.; Workman, J. L. Nat. Struct. Mol. Biol. 2012, 19, 884.

(445) Maltby, V. E.; Martin, B. J.; Schulze, J. M.; Johnson, I.; Hentrich, T.; Sharma, A.; Kobor, M. S.; Howe, L. Mol. Cell. Biol. 2012, 32, 3479. (446) Chu, Y.; Simic, R.; Warner, M. H.; Arndt, K. M.; Prelich, G. EMBO J. 2007, 26, 4646. (447) Chen, X. F.; Kuryan, B.; Kitada, T.; Tran, N.; Li, J. Y.; Kurdistani, S.; Grunstein, M.; Li, B.; Carey, M. Curr. Biol. 2012, 22, 56. (448) Wu, M.; Wang, P. F.; Lee, J. S.; Martin-Brown, S.; Florens, L.; Washburn, M.; Shilatifard, A. Mol. Cell. Biol. 2008, 28, 7337. (449) Kim, J.; Guermah, M.; McGinty, R. K.; Lee, J. S.; Tang, Z.; Milne, T. A.; Shilatifard, A.; Muir, T. W.; Roeder, R. G. Cell 2009, 137, 459. (450) Lee, J. H.; Skalnik, D. G. Mol. Cell. Biol. 2008, 28, 609. (451) Edmunds, J. W.; Mahadevan, L. C.; Clayton, A. L. EMBO J. 2008, 27, 406. (452) Sun, X. J.; Wei, J.; Wu, X. Y.; Hu, M.; Wang, L.; Wang, H. H.; Zhang, Q. H.; Chen, S. J.; Huang, Q. H.; Chen, Z. J. Biol. Chem. 2005, 280, 35261. (453) Hughes, C. M.; Rozenblatt-Rosen, O.; Milne, T. A.; Copeland, T. D.; Levine, S. S.; Lee, J. C.; Hayes, D. N.; Shanmugam, K. S.; Bhattacharjee, A.; Biondi, C. A.; Kay, G. F.; Hayward, N. K.; Hess, J. L.; Meyerson, M. Mol. Cell 2004, 13, 587. (454) Milne, T. A.; Dou, Y.; Martin, M. E.; Brock, H. W.; Roeder, R. G.; Hess, J. L. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 14765. (455) Francis, J.; Lin, W.; Rozenblatt-Rosen, O.; Meyerson, M. PLoS One 2011, 6, e16119. (456) Xiao, T.; Hall, H.; Kizer, K. O.; Shibata, Y.; Hall, M. C.; Borchers, C. H.; Strahl, B. D. Genes Dev. 2003, 17, 654. (457) Rodriguez-Paredes, M.; Ceballos-Chavez, M.; Esteller, M.; Garcia-Dominguez, M.; Reyes, J. C. Nucleic Acids Res. 2009, 37, 2449. (458) Finkel, J. S.; Chinchilla, K.; Ursic, D.; Culbertson, M. R. Genetics 2010, 184, 107. (459) Phatnani, H. P.; Greenleaf, A. L. Genes Dev. 2006, 20, 2922. (460) Bennett, C. B.; Westmoreland, T. J.; Verrier, C. S.; Blanchette, C. A.; Sabin, T. L.; Phatnani, H. P.; Mishina, Y. V.; Huper, G.; Selim, A. L.; Madison, E. R.; Bailey, D. D.; Falae, A. I.; Galli, A.; Olson, J. A.; Greenleaf, A. L.; Marks, J. R. PLoS One 2008, 3, e1448. (461) Fan, H.; Sakuraba, K.; Komuro, A.; Kato, S.; Harada, F.; Hirose, Y. Biochem. Biophys. Res. Commun. 2003, 301, 378. (462) Hirose, Y.; Iwamoto, Y.; Sakuraba, K.; Yunokuchi, I.; Harada, F.; Ohkuma, Y. Biochem. Biophys. Res. Commun. 2008, 369, 449.
