Review Cite This: Chem. Rev. XXXX, XXX, XXX−XXX

pubs.acs.org/CR

Molecular Mechanism and Evolution of Nuclear Pre-mRNA and Group II Intron Splicing: Insights from Cryo-Electron Microscopy Structures

Wojciech P. Galej,† Navtej Toor,‡ Andrew J. Newman,§ and Kiyoshi Nagai*,§

† EMBL Grenoble, 71 Avenue des Martyrs, 38042 Grenoble Cedex 09, France
‡ Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California 92093, United States
§ MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, U.K.

ABSTRACT: Nuclear pre-mRNA splicing and group II intron self-splicing both proceed by two-step transesterification reactions via a lariat intron intermediate. Recently determined cryoelectron microscopy (cryo-EM) structures of catalytically active spliceosomes revealed the RNA-based catalytic core and showed how pre-mRNA substrates and reaction products are positioned in the active site. These findings highlight a strong structural similarity to the group II intron active site, strengthening the notion that group II introns and spliceosomes evolved from a common ancestor. Prp8, the largest and most conserved protein in the spliceosome, cradles the active site RNA. Prp8 and group II intron maturase have a similar domain architecture, suggesting that they also share a common evolutionary origin. The interactions between maturase and key group II intron RNA elements, such as the exon-binding loop and domains V and VI, are recapitulated in the interactions between Prp8 and key elements in the spliceosome’s catalytic RNA core. Structural comparisons suggest that the extensive RNA scaffold of the group II intron was gradually replaced by proteins as the spliceosome evolved. A plausible model of spliceosome evolution is discussed.

CONTENTS
1. Introduction
2. Folding and Active Site of the Group II Intron
2.1. Phylogenetic Classes of Group II Introns
2.2. Secondary Structure
2.3. Mechanistic Similarities between Nuclear and Group II Intron Splicing
2.4. Active Site Components of the Group II Intron
2.5. Crystal Structure of a Bacterial Group IIC Intron
2.6. Conformational Rearrangements during First Step Hydrolysis
2.7. Crystal Structure of a Eukaryotic Group II Intron Lariat
2.8. Future Directions
3. Structure of the Active Site in the Spliceosome
3.1. Biochemistry and Genetics Identify the Key Components of the Active Site
3.2. High Resolution Structure of the Active Site from Cryo-EM Studies
3.3. Future Directions
4. Comparison between the Active Sites of the Group II Intron and the Spliceosome
5. Conformational Rearrangements of the Spliceosome Active Site
6. Conformational Rearrangements during Group II Intron Splicing
7. Prp8: A Key Protein in the Spliceosome
7.1. A Brief History of Prp8
7.2. Domain Structure of Prp8
7.3. Prp8 Cradles the Catalytic RNA Elements
7.4. Dynamic Rearrangements of Prp8 during the Splicing Cycle
8. Maturase and Group II Intron as Mobile Element
8.1. Maturase Assists Splicing and Retrotransposition
8.2. Coevolution of the Maturase and the Group II Intron RNA
8.3. Interaction between Group II Intron RNA and Maturase
8.4. Implications of the Maturase for the Dispersal of Spliceosomal Introns
9. Evolution of the Spliceosome
Author Information
Corresponding Author
ORCID
Notes
Biographies
Acknowledgments
References

© XXXX American Chemical Society

Special Issue: RNA: From Single Molecules to Medicine


Received: August 16, 2017


DOI: 10.1021/acs.chemrev.7b00499 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

Figure 1. Nuclear pre-mRNA and group II intron splicing. (a) Diagram illustrating two-step transesterification splicing pathway proceeding via lariat intron intermediate, common for nuclear pre-mRNA and group II introns. (b) Secondary structure diagram for the catalytic (U5 and U2/U6 snRNAs) spliceosome. U6 ISL, U6 snRNA internal stem-loop; U6 5′-SL, U6 snRNA 5′ stem-loop. (c) Secondary structure diagram of a self-splicing group IIB intron.20 Functionally equivalent elements between the two systems are highlighted with appropriate colors. (d) Structure of the spliceosomal complex C with a highlighted RNA catalytic core.21 The colors correspond to the secondary structure elements indicated in (b). (e) Structure of the group IIB intron lariat with highlighted catalytic core.

1. INTRODUCTION

The discovery of introns by Philip Sharp and Richard Roberts in 1977 (refs 1,2) founded the field of nuclear pre-mRNA splicing and led to four decades of intense investigation into the mechanism of intron excision from mRNA precursors. In 1984 it was shown that pre-mRNA splicing proceeds through a lariat intermediate analogous to group II intron self-splicing3−6 (Figure 1a). The mechanistic similarity between nuclear pre-mRNA splicing and group II intron splicing prompted Sharp and Cech to propose that they may have evolved from a common ancestor.7,8 A group II intron secondary structure had been determined by Francois Michel,9 but little was known at the time about how the RNA components of the spliceosome interact with each other and with the pre-mRNA substrate.10 Therefore, the notion of a common ancestor for the two splicing systems was a bold and prescient proposal. In 1981 it was shown that IgG from patients with systemic lupus erythematosus, which precipitated U1 snRNP, inhibited splicing of adenoviral RNA, suggesting that U1 snRNA is an integral component of splicing activity.11 Following the discovery of the interaction between U1 snRNA and the pre-mRNA 5′ splice site (5′SS), the interactions between the U2 and U6 snRNAs, as well as between pre-mRNA and U2, U6, and U5 snRNAs during the two catalytic steps, were established in the 1980s and early 1990s using both genetic and biochemical tools.12−16 Some similarities between the catalytic RNA core of the spliceosome and key RNA elements of group II introns emerged from these studies, prompting Sharp17 to propose that the spliceosomal snRNAs may be fragments of group II introns. When Steitz and Steitz proposed a general two-metal-ion mechanism for transesterification reactions, it was predicted to be shared between group II introns and spliceosomes.18 In 2008, the first crystal structure of a group II intron was reported at 3.1 Å resolution, which provided the first glimpse of its two-metal-ion active site.19 The phosphate oxygens involved in binding the catalytic magnesium ions in the spliceosomal active site were elucidated by phosphorothioate substitution and metal rescue experiments that highlighted the similarity of the catalytic RNA centers of the spliceosome and group II intron,22,23 reinforcing the notion that they may have evolved from a common ancestor. A series of high-resolution (2.7−4.0 Å) crystal structures of a group IIC intron in the presence of different combinations of mono- and divalent metal ions provided a comprehensive view of the intron active site in the hydrolytic pathway.24 In 2014, the structure of the group II intron lariat was determined at 3.7 Å resolution,20 which allowed visualization of the branch-site helix domain VI (DVI) that is analogous to the pre-mRNA branch-site sequence pairing with U2 snRNA. This structure further extended the parallels between the spliceosome and the group II intron.

Prp8 is the largest protein in the spliceosome and was shown to interact intimately with the catalytic RNA core.25−32 Hence Prp8 was subjected to intense bioinformatics and structural analysis. Similarity between a short stretch of Prp8 sequence and group II intron maturase was first noted by a bioinformatics analysis.33 A high-resolution (2.0 Å) crystal structure of Prp8 revealed that it contains reverse transcriptase-like and endonuclease-like domains,34 with an overall domain architecture similar to that seen in the group II intron maturase. The crystal structure of Prp8 raised the intriguing possibility that a core protein component of the spliceosome, which interacts intimately with the catalytic RNA core, may also share an evolutionary origin with the group II intron maturase. The structure of the RT domain of maturase was subsequently determined in isolation35 and in complex with group II intron RNA,36 revealing a structural similarity between the RT and thumb/X domains of Prp8 and group II intron maturase.

Over the past two years, several structures of the spliceosome at different catalytic stages have been determined by cryo-electron microscopy (cryo-EM).21,37−43 This work revealed the structure of the catalytic RNA core and how it interacts with the substrate pre-mRNA, providing a detailed mechanistic understanding of pre-mRNA splicing. The structure of a group II intron in complex with the maturase36 allows us to make a direct comparison of the group II intron and pre-mRNA splicing systems to see if the interaction between group II intron RNA and maturase is maintained in the spliceosome. In this review we compare the structure and catalytic mechanism of the two systems and speculate freely about how ancestral self-splicing group II introns could have evolved into a trans-acting spliceosome that processes pre-mRNAs.

2. FOLDING AND ACTIVE SITE OF THE GROUP II INTRON

Shortly after the discovery of catalytic RNAs and self-splicing group I introns,44 another novel class of noncoding sequences was found to interrupt genes in yeast and fungal mitochondria.1 This new class of introns was not found to share any sequence homology with group I introns, and they were significantly larger. As a result, these introns were initially described as belonging to “class 2” and were later redubbed the “group II introns”.9 Cloning and biochemical characterization of these newly discovered group II introns revealed that they exhibit autocatalytic activity in the absence of proteins.45,46 The production of a branched RNA product (later known as the lariat) during the group II intron self-splicing reaction45,46 attracted great interest because nuclear spliceosomal introns are also known to form a similar product. This was the first indication of a possible evolutionary relationship between group II introns and the spliceosome.

2.1. Phylogenetic Classes of Group II Introns

Group II introns are found in bacteria,47 in archaea,48 and in the organellar genomes of plants,49 fungi,9 and some primitive animals.50 Group II introns were originally classified into three main groupings on the basis of secondary structure, IIA, IIB, and IIC,51 but recently this classification has been extended to include classes D−F.52,53 The IIC introns are the smallest (∼400 nucleotides), with phylogenetic analysis indicating that they are primitive and therefore likely to be most similar to the primordial group II intron ancestor.54 In addition, while IIA and IIB introns are found in both prokaryotes and eukaryotes, IIC introns are exclusively found in bacteria.48,51 There are also significant differences in the mechanism of splicing in vitro, with the IIC introns using the idiosyncratic hydrolytic pathway.55 In contrast to IIA and IIB introns, group IIC introns require the presence of the cognate maturase protein in order to splice via the canonical splicing pathway to form lariat.56 IIA and IIB introns are similar in size (600−800 nucleotides); however, they differ in their structural recognition of the 5′ and 3′ splice sites.

2.2. Secondary Structure

Group II intron RNAs range in size from 400 to 800 nucleotides and have a characteristic six-domain architecture. The secondary structure of group II introns is typically drawn in an arrangement in which the six domains emanate from a central hub (Figure 1c). Domain I (DI) is the largest of the domains and serves two main functions: (1) it serves as a structural scaffold for formation of the active site;57 (2) it contains the exon-binding sequences (EBS) 1−3, which are responsible for the recognition of the 5′ and 3′ exons for splice site positioning.58,59 This is followed by domain II (DII), which contains two tertiary interactions that are crucial for positioning the branch site helix during catalysis.20,60 Domain III (DIII) is responsible for reinforcing the overall fold of the group II intron through multiple tetraloop-receptor interactions with other domains.61 In some group II introns, domain IV (DIV) contains an open reading frame (ORF) for a protein having reverse transcriptase activity.62 The most highly conserved substructure in group II introns is DV, which was long thought to contribute the essential catalytic moieties for splicing.63 The adenosine nucleophile for the first step of splicing is provided by DVI, which is analogous to the branch helix in the spliceosome.64
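As a compact summary of the six-domain architecture of group II introns (Section 2.2), the roles assigned to each domain can be collected in a small lookup table. The following Python sketch is purely illustrative: the names `Domain`, `GROUP_II_DOMAINS`, and `role_of` are hypothetical, and the role strings paraphrase the text rather than follow any database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str  # short label used in the text (DI ... DVI)
    role: str  # role paraphrased from Section 2.2


# Hypothetical summary table; entries paraphrase the review text.
GROUP_II_DOMAINS = (
    Domain("DI", "scaffold for active site formation; carries EBS1-3 for exon recognition"),
    Domain("DII", "tertiary interactions that position the branch-site helix"),
    Domain("DIII", "reinforces the overall fold via tetraloop-receptor interactions"),
    Domain("DIV", "in some introns, contains an ORF for a protein with reverse transcriptase activity"),
    Domain("DV", "most conserved substructure; long thought to contribute the catalytic moieties"),
    Domain("DVI", "provides the adenosine nucleophile for the first step"),
)


def role_of(name: str) -> str:
    """Return the role recorded for a domain label such as 'DV'."""
    for domain in GROUP_II_DOMAINS:
        if domain.name == name:
            return domain.role
    raise KeyError(name)


if __name__ == "__main__":
    for d in GROUP_II_DOMAINS:
        print(f"{d.name}: {d.role}")
```

A flat tuple of records keeps the 5′-to-3′ domain order explicit, which a dictionary keyed by name would only preserve implicitly.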

2.3. Mechanistic Similarities between Nuclear and Group II Intron Splicing

Group II introns engage in a two-step splicing reaction that results in ligated exons and the excised intron in lariat form. This group II intron lariat contains the 2′−5′ phosphodiester bond that is also formed during the splicing of nuclear introns by the spliceosome. In addition, the stereochemistry of the splicing reaction in both systems is identical. This was shown through the incorporation of Rp and Sp diastereomers at the 5′ and 3′ splice sites in the aI5γ group II intron from Saccharomyces cerevisiae65 (Figure 2e). The data revealed that both steps of group II intron splicing could only progress in the presence of an Sp phosphorothioate and not the Rp isomer. This indicated that an essential metal ion was bound to the Rp oxygen. In addition, an inversion of stereochemistry was seen with the Sp form converting into the Rp during splicing. These stereochemical preferences are identical to those seen in the splicing of nuclear introns (Figure 2f).66 Therefore, this suggested that both systems utilize similar active site architecture in terms of the position of the nucleophile relative to the splice sites. This is in contrast to the phylogenetically unrelated group I introns, which exhibit different stereochemical preferences.67,68

Figure 2. Active site of the spliceosome and group II intron. (a) Organization of the RNA catalytic core in group IIB intron lariat (poststep 2 configuration).20 (b) Organization of the RNA catalytic core and pre-mRNA substrate in the spliceosomal complex C* (prestep 2 configuration).41 (c) Catalytic triplex in self-splicing group IIC intron.24 (d) Catalytic triplex in the spliceosome.21 (e) Diagram of the RNA secondary structure and stereochemistry of metal ion coordination in group IIC intron. M1 and M2 indicate catalytic magnesium ions. For the first step reaction, R1 represents 2′ hydroxyl of branch point adenosine, R2 the intron, and R3 the pro-Sp oxygen. For the second step (exon ligation) reaction, R1 represents 3′ oxygen leaving group, R2 the pro-Sp oxygen, and R3 the 3′ exon. (f) Diagram of the RNA secondary structure and stereochemistry of metal ion coordination at the active site of the spliceosome. Labels as in (e).

2.4. Active Site Components of the Group II Intron

From early alignments of group II intron sequences, it was clear that DV displayed the highest level of conservation. This led to the initial hypothesis that it formed the active site for group II intron splicing. This was shown through both in vitro and in vivo mutational analysis of domain V.63,69 Domain V contains two motifs that are especially sensitive to mutagenesis (Figure 2): (1) a three-nucleotide AGC (or CGC) “catalytic triad” at the base of the DV stem-loop and (2) a two-nucleotide bulge in the middle of DV. In addition, the distance between these two motifs is absolutely conserved with a five base pair separation.51 The first evidence suggesting the mechanism of DV catalysis came from extensive phosphorothioate substitution showing that this domain was responsible for binding metal ions essential for splicing.70,71

In both the group II intron and the spliceosome12,72,73 the nucleophile for the first step is a bulged adenosine residue that is located within an RNA duplex near the 3′ end of the intron. In group II introns, the bulged adenosine residue within DVI exists as a single-nucleotide bulge at the base of the DVI stem-loop (Figure 1c). There are no identified base pairing interactions with this bulged adenosine from biochemical and structural studies. Furthermore, the lack of covariation of this residue with other nucleotides in the intron suggests a complex binding pocket composed of multiple residues possibly involving base triples or quartets. The conservation of this residue between group II introns and the spliceosome strongly suggests that it may be involved in base triple interactions as observed in the spliceosome.
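The two-step pathway outlined in Section 2.3, with the bulged adenosine acting as the first-step nucleophile, can be sketched as a toy string model. This is a minimal illustration under stated assumptions and not a chemical simulation: the sequences are invented, the names `LariatIntermediate`, `first_step`, and `second_step` are hypothetical, and the 2′−5′ linkage is recorded only as the index of the branch adenosine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LariatIntermediate:
    five_prime_exon: str   # released 5' exon
    intron: str            # intron whose 5' end is joined 2'-5' to the branch adenosine
    three_prime_exon: str  # 3' exon, still attached to the intron
    branch_index: int      # 0-based position of the branch adenosine in the intron


def first_step(exon5: str, intron: str, exon3: str, branch_index: int) -> LariatIntermediate:
    """Branching: the branch adenosine attacks the 5' splice site."""
    if intron[branch_index] != "A":
        raise ValueError("branch nucleophile must be an adenosine")
    return LariatIntermediate(exon5, intron, exon3, branch_index)


def second_step(intermediate: LariatIntermediate) -> tuple[str, str]:
    """Exon ligation: the 5' exon attacks the 3' splice site, releasing the lariat."""
    ligated = intermediate.five_prime_exon + intermediate.three_prime_exon
    return ligated, intermediate.intron


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any real intron.
    inter = first_step("GGAUCC", "GUGCGACUAACUU", "CAUGGC", branch_index=9)
    exons, lariat = second_step(inter)
    print("ligated exons:", exons)
    print("excised lariat intron:", lariat, "branch at index", inter.branch_index)
```

Keeping the intermediate as an explicit record mirrors the text: the first step yields a free 5′ exon plus a lariat intron that still carries the 3′ exon, and only the second step produces ligated exons.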

2.5. Crystal Structure of a Bacterial Group IIC Intron

The first crystal structure of a group II intron from Oceanobacillus iheyensis was solved in 2008 at 3.1 Å and revealed the active site architecture required for the catalysis of RNA splicing.19 This structure is of a bacterial group IIC intron in the postcatalytic state after the completion of splicing. The active site consists of the bulge and catalytic triad of DV forming a pocket that binds two metal ions with high affinity (Figure 2c,e). The overall architecture of DV within the intact intron is drastically different from that seen in the crystal and NMR structures of DV in isolation.74 In the previous structures of the DV stem-loop, it adopts an arrangement in which the stems above and below the two-nucleotide bulge are coaxially stacked in the absence of supporting tertiary interactions. In contrast, in the group IIC intron structure, tertiary interactions with DI force DV to adopt a ∼45° bend at the two-nucleotide bulge region. This brings the backbones of the bulge and catalytic triad into close proximity to form a highly negatively charged pocket that binds the two catalytic magnesium ions. This is typical of a two-metal-ion mechanism of catalysis, which was first proposed for RNA splicing by Steitz and Steitz.18 Furthermore, the junction between the 5′ and 3′ exons of the bound ligated exon product directly coordinates to the two catalytic metal ions.75 In addition, the group IIC intron crystal structure allowed the first visualization of the catalytic triplex, which consists of three consecutive base triples involving the catalytic triad, the two-nucleotide bulge, and the junction sequence between domains II and III called J2/3 (Figure 2c). This catalytic triplex is thought to reinforce the binding site for the catalytic metal ions and also plays a role in mediating the transition between the two steps of splicing (vide infra).

Another long-standing question in group II intron biochemistry was whether there is one or multiple active sites corresponding to the two different steps of splicing. Biochemical analysis suggested a single active site,76 which was later confirmed by the structure of the IIC intron in the precatalytic state at 3.7 Å.77 The intron was trapped in the precatalytic state through mutagenesis of the central guanosine of the catalytic triad, which abrogated binding of the catalytic metal ions while still retaining the overall fold of DV. The intact 5′ splice site was found to be directly placed over the active site cleft of DV where the two catalytic metal ions are found in the postcatalytic structure. Therefore, this suggested that the group II intron could accommodate both steps of splicing using a single active site.

2.6. Conformational Rearrangements during First Step Hydrolysis

The conformational dynamics during first step hydrolysis in the group IIC intron were first captured by Marcia and Pyle in 2012.24 In this work, the intron was crystallized in Ca2+, which does not support catalysis. As a result, the precatalytic structure could be captured without mutagenesis, thus allowing visualization of catalytic metal ion positions relative to the intact 5′ splice site, as well as the incoming water nucleophile. In this structure, Ca2+ ions are bound in the catalytic metal ion positions typically occupied by Mg2+. This structure also revealed conformational dynamics in the J2/3 linker region during the first step of hydrolysis. Specifically, J2/3 is found to be disengaged from the catalytic triad in crystallization conditions that do not support catalysis. Based on the fact that two distinct conformational states are observed for J2/3, along with mutational data on splicing kinetics, it was proposed that toggling between these two states assists the exit of first step substrates from the active site in preparation for the second step. This work also represents the first data showing that the catalytic triplex is dynamic during the progression of the splicing reaction. Although the IIC intron system was useful to visualize the active site architecture, there are some disadvantages to using the O. iheyensis intron to gain insight into the entire splicing pathway. IIC introns require the presence of the cognate maturase protein in order to form lariat, and in its absence, they splice predominantly through the hydrolysis pathway.56 Biochemical evidence suggests that the maturase plays an important role in positioning DVI and the bulged adenosine for nucleophilic attack in IIC introns.56 All of the IIC intron structures solved to date have been in the absence of the maturase. Therefore, this only allows limited interrogation of the first step in a wild-type context due to the absence of the bulged adenosine nucleophile.

IIC introns also engage in promiscuous catalytic activity with nonspecific cleavage reactions at the 3′ end of the intron RNA in the absence of the maturase.55,78 This results in the O. iheyensis intron randomly cleaving off DVI and the entire 3′ splice site,78 thereby precluding visualization of the branching reaction and the second step. A recent structure was published of the O. iheyensis intron with an attached DVI from a different IIC intron species.79 This structure contains bound 5′ exon, but the 3′ splice site is absent. In this work, multiple chimeric IIC intron constructs were tested to find a combination that formed lariat in the presence of Mn2+. The resulting crystal structure has a truncated DVI with no tertiary interactions involving the DVI stem. DVI tertiary interactions are especially important as they play an essential role in conformational rearrangements during the transition from the first to the second step of splicing.20 These tertiary interactions with DVI in IIC introns are expected to consist of RNA−protein contacts with the maturase, which is not present in this latest structure.


Figure 3. Dynamics of the branch helix during catalytic stages of the splicing cycle. (a) Organization of the active site in spliceosomal complex Bact.39,40 Catalytic magnesium ions and branch point are labeled with yellow and red dots, respectively. U6 ISL, U6 snRNA internal stem-loop; EBS1, exon binding loop 1; Prp8RH, RNaseH-like domain of Prp8. (b) Organization of the active site in complex C.21,38 (c) Organization of the active site in complex C*.41,42 The active site of the postcatalytic P complex is similar to C* but the 3′ splice site is docked into the active site and the 5′ and 3′ exons are ligated together41 (see Figure 6). (d) Organization of the active site in group IIB intron lariat.20 (e−h) Interactions stabilizing position of the branch helix (or domain VI) in the corresponding complexes.

2.7. Crystal Structure of a Eukaryotic Group II Intron Lariat

Many group IIA and IIB introns are able to form large amounts of lariat in vitro in the absence of the maturase protein. Also, some IIA/IIB introns do not encode a maturase ORF; therefore they exhibit less dependence upon this protein cofactor compared to the IIC introns.51 In addition, the more evolutionarily advanced IIA/IIB introns also exhibit a higher fidelity of splicing compared with the nonspecific cleavage events seen in IIC introns. However, the larger size of IIA and IIB introns makes them a more difficult target for structure determination via X-ray crystallography. In 2014, the crystal structure of a 622-nucleotide eukaryotic group IIB intron lariat was solved at 3.7 Å resolution.20 This mitochondrial intron is located in the large rRNA gene of the brown alga Pylaiella littoralis. This postcatalytic structure revealed the location and orientation of the intact DVI within the intron structure. DVI was found to participate in two tertiary interactions with DII, known as η−η′ and π−π′ (Figures 1c and 3h), which are both tetraloop-receptor interactions. Mutagenesis of either η−η′ or π−π′ resulted in a strong inhibition of the second step. Mutating both interactions results in an almost complete block of the second step. Therefore, these interactions are required for the second step and exon ligation. The 7 Å structure of the precatalytic form of this intron revealed that the η−η′ interaction remained engaged; however, density for the region of DVI forming the π−π′ interaction was not visible. This suggests that the π−π′ interaction is dynamic and is only engaged for the second step. In the crystal structure, the branch point is displaced 20 Å from the active site, with the π−π′ interaction involving the two base pairs directly adjacent to the bulge at the base of DVI. Therefore, in the first step of splicing, the π−π′ interaction would have to be disengaged to allow the bulged adenosine to reach the 5′ splice site to form lariat. The engagement of the π−π′ interaction then serves to remove the lariat from the active site in preparation for the 3′ splice site for the second step. In the recent structure of the spliceosome C* complex, the lariat is also displaced 20 Å from the active site41−43,80 in preparation for the 3′ splice site docking, which shows that the magnitude of the displacement seen for the branched adenosine is conserved between the two systems (Figure 3).

In the structure of the chimeric IIC intron construct, a transition from a one- to a two-nucleotide bulge in DVI was observed and postulated as a conformational change associated with catalysis.79 The biological relevance of this observation is unclear since the truncated DVI in this structure lacks any tertiary interactions that would be expected to affect its position and conformational dynamics. In contrast, only a single-nucleotide bulge is observed for the branched adenosine in the currently available structures of both the spliceosome and the group IIB intron. It is possible that conformational dynamics in IIC introns may differ from those observed in IIB introns and the spliceosome. Future structures of the IIB intron at different stages of splicing may resolve these outstanding questions.

Active site metal ions in the group IIB intron were probed using Yb3+ soaks. Yb3+ has a large anomalous signal and octahedral coordination geometry similar to that of Mg2+. Based on the group I, IIC, and IIB crystal structures, Yb3+ seems to specifically bind to highly coordinated Mg2+ sites where the majority of coordination positions are occupied with RNA ligands, rather than water molecules.19,20,81 In the group IIB intron, there are four large Yb3+ peaks clustered together in the catalytic core (Figure 4). Two of these metals, M1 and M2, form the two-metal-ion catalytic center seen in the group IIC intron. As in the group IIC intron,82 additional metal ions have been found near the active site. M3 and M4 coordinate to the highly conserved GUGYG 5′ end of the intron and may play a role in positioning of the 5′ splice site in the first step. A comparison of the IIB and IIC structures reveals the existence of a conserved monovalent ion directly coordinated to the guanosine nucleobase of the J2/3 linker. This monovalent ion is located in close proximity to the two-metal-ion catalytic center in both structures. The conservation of this metal ion across both prokaryotic and eukaryotic group II introns is striking and indicates an essential role in catalyzing splicing. However, the precise role of this monovalent ion is unclear and requires further study.

Figure 4. Active site and metal ion coordination in group IIB intron lariat.20 M1 and M2, catalytic magnesium ions.

2.8. Future Directions

Additional structural intermediates of the group IIB intron are required in order to capture the full spectrum of conformational rearrangements during the splicing reaction. For example, a high-resolution precatalytic group IIB intron structure would reveal the structural basis for the conservation of the bulged adenosine residue. It is likely that there is a defined binding pocket for the nucleobase of this residue that is responsible for the proper positioning of the 2′-OH nucleophile relative to the 5′ splice site for lariat formation. The precise nature of the conformational dynamics involving DVI also remains unknown at the moment. In addition, the biochemical basis for the patterns of nucleotide conservation observed in residues directly surrounding the active site remains a mystery. It is possible that there are transient interactions occurring between conserved nucleotides during catalysis that are not observed in the available crystal structures. Characterization of these transient contacts would require the development of new techniques to capture putative dynamics in group II introns.

3. STRUCTURE OF THE ACTIVE SITE IN THE SPLICEOSOME

3.1. Biochemistry and Genetics Identify the Key Components of the Active Site

Once in vitro splicing systems were established from both yeast and HeLa cells, the interactions between the snRNAs and between pre-mRNA and snRNAs were investigated extensively. Initially, the 5′ splice site is recognized by the U1 snRNP through base pairing between the 5′ end of the U1 snRNA and the 5′ splice site (SS).83−92 The sequence around the branch point adenosine was shown to base pair with a conserved sequence in U2 snRNA to form the branch-site helix with the crucial adenosine bulged out from the double helical region12,72,73 in a manner analogous to the bulged adenosine in DVI of group II introns (Figure 1b,c). After U4/U6.U5 tri-snRNP is recruited to the spliceosome, U1 interactions with the 5′SS are disrupted and replaced by interactions with U6 snRNA,93−97 forming base pairs between the 5′ end of the intron (GUAUGU) and an invariant ACAGAG motif in U6 snRNA.98,14 Genetic and biochemical analysis15,16 showed that an invariant loop sequence in U5 snRNA (loop 1) contacts exon sequences at both 5′ and 3′ splice sites, reminiscent of EBS (exon binding sequence)−IBS (intron binding sequence) interactions in group II introns.58,15,16 Functional reconstitution of yeast U5 snRNPs in vitro revealed that whereas loop 1 of U5 snRNA is dispensable for catalytic step 1 (branching) it is strictly required for step 2 (exon ligation).99 U6 snRNA is extensively base paired with U4 snRNA prior to catalytic activation.100 The sequence of U6 snRNA is highly conserved, and mutagenesis experiments101 identified invariant nucleotides whose mutations lead to a lethal phenotype. Most of these mutations are not rescued by compensatory mutations in U4 snRNA which restore base pairing, suggesting that these nucleotides are involved in an essential function such as

DOI: 10.1021/acs.chemrev.7b00499 Chem. Rev. XXXX, XXX, XXX−XXX


Table 1. Structures of Group II Intron, snRNPs, and Spliceosomes Solved to Date (November 2017) in Chronological Order

Group II Introns

complex          source         state           res. [Å]  subunits  method   PDB   year  authors (ref)
IIC intron       O. iheyensis   poststep 2      3.1       1         X-ray    3IGI  2008  Toor/Pyle (19)
IIC intron       O. iheyensis   prestep 1       3.7       1         X-ray    4DS6  2012  Toor (77)
IIC intron       O. iheyensis   prestep 1       3.1       1         X-ray    4FAQ  2012  Pyle (24)
IIC intron       O. iheyensis   poststep 1      2.9       1         X-ray    4FAR  2012  Pyle (24)
IIB intron       P. littoralis  poststep 2      3.7       1         X-ray    4R0D  2014  Toor (20)
IIA−maturase     L. lactis      poststep 2      3.8       2         cryo-EM  5G2X  2016  Belfort/Wang (36)
IIC−hybrid       O. iheyensis   prestep 2       3.4       1         X-ray    5J01  2016  Costa (79)

snRNPs and Spliceosomes

complex          source         state           res. [Å]  subunits  method   PDB   year  authors (ref)
U1 snRNP         H. sapiens     subunit         5.5       10        X-ray    3CW1  2009  Nagai (108)
U1 snRNP         H. sapiens     subunit         4.4       10        X-ray    3PGW  2010  Luhrmann/Wahl (109)
U4 Sm complex    H. sapiens     subunit         3.6       8         X-ray    4WZJ  2011  Nagai (110)
U1 snRNP         H. sapiens     subunit         3.3       10        X-ray    4PJO  2014  Nagai (111)
U6 snRNP         S. cerevisiae  subunit         1.7       2         X-ray    4N0T  2014  Brow/Butcher (112)
U6 Lsm complex   S. cerevisiae  subunit         2.8       8         X-ray    4M7A  2014  Shi (176)
tri-snRNP        S. cerevisiae  subunit         5.9       33        cryo-EM  N/A   2015  Nagai (113)
ILS complex      S. pombe       predisassembly  3.6       38        cryo-EM  3JB9  2015  Shi (37)
tri-snRNP        S. cerevisiae  subunit         3.7       34        cryo-EM  5GAN  2016  Nagai (114)
tri-snRNP        S. cerevisiae  subunit         3.8       34        cryo-EM  3JCM  2016  Shi (115)
tri-snRNP        H. sapiens     subunit         7.0       33        cryo-EM  3JCR  2016  Luhrmann/Stark (116)
complex C        S. cerevisiae  poststep 1      3.8       44        cryo-EM  5LJ5  2016  Nagai (21)
complex C        S. cerevisiae  poststep 1      3.4       44        cryo-EM  5GMK  2016  Shi (38)
complex Bact     S. cerevisiae  activated       3.5       45        cryo-EM  5GM6  2016  Shi (39)
complex Bact     S. cerevisiae  activated       5.8       31        cryo-EM  5LQW  2016  Luhrmann/Stark (40)
complex C*       S. cerevisiae  prestep 2       4.0       44        cryo-EM  5WSG  2017  Shi (42)
complex C*       S. cerevisiae  prestep 2       3.8       46        cryo-EM  5MQ0  2017  Nagai (41)
complex C*       H. sapiens     prestep 2       5.9       46        cryo-EM  5MQF  2017  Luhrmann/Stark (43)
complex C*       H. sapiens     prestep 2       3.8       50        cryo-EM  5XJC  2017  Shi (80)
complex B        S. cerevisiae  precatalytic    3.7       58        cryo-EM  5NRL  2017  Nagai (105)
complex B        H. sapiens     precatalytic    4.5       57        cryo-EM  5O9Z  2017  Luhrmann/Stark (43)
ILS complex      S. cerevisiae  predisassembly  3.5       40        cryo-EM  5Y88  2017  Shi (117)
U1 snRNP         S. cerevisiae  subunit         3.6       16        cryo-EM  5UZ5  2017  Zhao/Zhou (118)
complex P        S. cerevisiae  poststep 2      3.7       45        cryo-EM  6EXN  2017  Nagai (119)
complex P        S. cerevisiae  poststep 2      3.3       41        cryo-EM  6BK8  2017  Zhao/Zhou (120)
complex P        S. cerevisiae  poststep 2      3.6       41        cryo-EM  5YLZ  2017  Shi (121)

catalysis, independent of their ability to base pair with U4 snRNA.101 U6 snRNA forms an intramolecular stem-loop, and an elegant in vivo mutational analysis by Madhani and Guthrie13 revealed a pairing between U2 and U6 snRNAs to form helices Ia and Ib which are separated by a two-nucleotide bulge. These findings provided the first details of the RNA interaction network in the active spliceosome (Figure 2). The intramolecular stem-loop (ISL) of U6 snRNA bears remarkable similarity to domain V (DV) in group II introns, suggesting that U80 and the AGC trinucleotide in U6 snRNA (which forms helix Ib with U2 snRNA) may be involved in the binding of catalytic magnesium ions. Mutational analysis and phosphorothioate interference assays102,103 were carried out to test this idea. Phosphorothioate substitution and metal rescue experiments22 showed that splicing is inhibited when pro-Rp or pro-Sp phosphoryl oxygen atoms are substituted with sulfur. Furthermore, this inhibition could be overcome through the addition of the thiophilic metal ion Mn2+, providing convincing evidence that the pro-Sp phosphate oxygen of U80 is involved in binding of the catalytic magnesium ions. Fica et al.23 greatly expanded the scope of these metal rescue experiments and used the active site configuration from the group II intron crystal structure19 as a guide to engineer phosphorothioate substitutions at analogous residues in the catalytic core of the spliceosome. This resulted in the identification of all the phosphate oxygens of nucleotides involved in catalytic metal ion binding (Figure 2f). The crystal structure of the O. iheyensis group II intron19 showed that the catalytic triad nucleotides form three consecutive base triples with the two-nucleotide bulge of DV and the junction linker between domains 2 and 3 (J2/3). Strong evidence to support the existence of a similar catalytic triplex in the U6 snRNA was provided through in vivo genetic experiments in the spliceosome104 (Figure 2c,d).

3.2. High Resolution Structure of the Active Site from Cryo-EM Studies

The cryo-EM structure of the intron lariat spliceosome (ILS) complex revealed the active site in a postsplicing state, primed for disassembly and recycling.37 The structure of the C complex stalled immediately after the first transesterification reaction allowed visualization of the active site of the spliceosome with the products of the first transesterification reaction (branching), providing insights into the mechanism of this reaction.21,38 The intramolecular stem-loop (ISL) of U6 snRNA adopts a highly folded structure similar to DV in the group II intron (Figure 2a,b). U6 snRNA upstream of helix Ia folds back with two


4. COMPARISON BETWEEN THE ACTIVE SITES OF THE GROUP II INTRON AND THE SPLICEOSOME

The availability of high-resolution structures of the spliceosome and group IIB/IIC introns enables a detailed comparison of the active site architecture and metal ion configuration required for RNA splicing. Figure 2 shows a comparison of the active sites for these two splicing systems. The catalytic triplex in the group IIC intron is identical to that found in the spliceosome; however, the base triple configuration in the group IIB intron lariat differs slightly in that one of the bases of J2/3 has shifted down by one base pair relative to the IIC intron.20 Given that the IIC intron structure lacks the second step substrate (3′ splice site), it is likely that this configuration represents the first step state. In contrast, the IIB structure contains the entire intron (including DVI and an intact 3′ splice site) in a postcatalytic state and may therefore represent the J2/3 conformation required for exon ligation. It is hypothesized that these conformational rearrangements serve to transition between the different splice sites during catalysis (see section 6 for details). The observation of different configurations of the catalytic triplex is consistent with a dynamic role for J2/3 first proposed by Marcia and Pyle.24 The spliceosome has been solved at many different stages of catalysis; however, the catalytic triplex configuration is the same as the active state of the IIC intron. It is possible that the spliceosome may have evolved a reliance upon protein cofactors to serve the same dynamic function during splicing. The structure of the group IIB intron lariat allows a direct comparison with the spliceosome in terms of the structural requirements for lariat formation. In the postcatalytic group IIB intron, domains V and VI are oriented almost 180° from each other, but are not coaxially stacked. In the spliceosome, the branch helix adopts different positions stabilized by step-specific proteins at the various stages of assembly and catalytic activation (Figure 3). The C* and postsplicing intron lariat spliceosome (ILS) structures of the spliceosome exhibit the greatest similarity in the placement of the branch-site helix with respect to DVI in the group II intron. It should also be noted that the precatalytic state of the branch-site helix in group II introns has not yet been visualized; therefore the magnitude of the angular movement during the transition from the first to the second step is not known.

nucleotides, G52 and A53, forming base triples with G60(U6)− C22(U2) and A59(U6)−U23(U2) base pairs and followed by U80 bulged out from the ISL to form a base triple with the C61(U6)−G21(U2) base pair. These three stacked base triples (the catalytic triplex) stabilize a highly twisted backbone conformation of the two nucleotides, which, together with the backbone of the catalytic triad, form the binding site for two catalytic magnesium ions (M1 and M2) consistent with the two-metal-ion mechanism for catalysis by the spliceosome18 (Figure 2c,d). Five of the 20 phosphate oxygens caused branching defects when substituted with sulfur: G60-PS(Rp), U80-PS(Sp), U80-PS(Rp), G78-PS(Sp), and A59-PS(Sp). These residues are located in close proximity to form a binding pocket for magnesium ions M1 and M2 in the active site. The C complex contains the products of the first transesterification (branching): the cleaved 5′ exon and a lariat intron intermediate in which the 5′-phosphate of the first intron nucleotide G(+1) is linked to the 2′-OH group of the branch point adenosine (A70 in UBC4 pre-mRNA substrate).21 The 3′-OH group of the 5′ exon and the 5′-phosphate of G(+1) remain close to each other and the magnesium ions. This suggests that the structure of the active site before branching (B*) is likely to be very similar. The pre-mRNA branch helix is highly distorted from the canonical A form and is docked into the active site by step 1 factors Yju2, Cwc25, and Isy1. Several other cryo-EM structures of the spliceosome have been recently reported (Table 1). These include C* complex from yeast41,42 and human,80,43 which provided insights into the active site remodeling between two catalytic steps of splicing (Figure 3) as well as precatalytic complex B105,106 and activated complex Bact,39,40 which shed light on catalytic activation and formation of the active site. 
Most recently, a structure of complex P has been reported,119−121 which was trapped immediately after the second step of splicing. This structure provided crucial insights into the mechanism of 3′SS selection and docking.

3.3. Future Directions

In the past two years, tremendous progress has been made in understanding basic splicing mechanisms via high-resolution cryo-EM structures.107 Nevertheless, deep mechanistic understanding of pre-mRNA splicing will require many more structures to be solved in the future. Complex B* poised just before the first step would complement the existing structure of the C complex and allow visualization of subtle rearrangements occurring during the first transesterification. All step 2 complexes (C*) solved to date are either missing or have very weak density for the 3′ splice site. Understanding the role of step 2 specific protein factors will require trapping and structurally analyzing this transient intermediate, where the 3′SS is aligned on the U5 snRNA loop 1 for the second transesterification. Early splicing events, including formation of the complexes E and A as well as catalytic activation, are still structurally poorly understood. Future single particle studies will require development of more sophisticated biochemical approaches to trap transient intermediate states of the machinery as well as new algorithms to track continuous domain movements within the system. Electron cryo-tomography combined with focused ion beam (FIB) milling has recently shown its great potential in the analysis of biological specimens in their native cellular environment.122 Further developments in this area may allow visualization of the pre-mRNA splicing machinery in situ, as it assembles on the nascent pre-mRNA transcripts.

5. CONFORMATIONAL REARRANGEMENTS OF THE SPLICEOSOME ACTIVE SITE

During the first catalytic step, the branch point adenosine attacks the 5′ splice site producing the lariat intron intermediate and the cleaved 5′ exon. In the second catalytic step, the 3′-OH group of the 5′ exon acts as a nucleophile to attack the 3′ splice site producing the ligated exon (mRNA) and excised intron in lariat form. Given that two different substrates were selected, there was much debate as to whether one or two active sites were responsible for catalyzing RNA splicing. In the spliceosome, phosphorothioate substitution and metal rescue experiments showed that identical phosphate atoms coordinate two catalytic metal ions for both steps, thus providing strong evidence for a single active site.23 The three-dimensional structures of the spliceosome and group II intron have provided conclusive evidence for a single active site for both steps. Both group II intron and nuclear pre-mRNA splicing systems therefore require a mechanism to dock a substrate for the first step into the active site, remove its products from the


active site, then dock a substrate for the second step, and finally release its products. In the transition from the B to the Bact complex, the functional active site is formed and the 5′ splice site is positioned in the catalytic core; however, the branch helix harboring the adenosine nucleophile is held 50 Å away from the active site by SF3b, a large protein complex that is part of U2 snRNP (Figure 3a,e). Interestingly, structures of the B and Bact complexes105,39,40 show that the nucleotide base of the branch point adenosine is flipped out of the helix such that its 2′-OH group which acts as a nucleophile for the first step does not project outward. By the action of the DEAH box helicase Prp2, the branch helix is released from SF3b and is free to rotate. This step involves a rotation of the branch helix around the bond between nucleotides 29 and 30, so that it can be docked into the active site by step 1 factors Cwc25, Yju2, and Isy1.21,38 The structure of C complex immediately after the first catalytic step shows that the branch helix is considerably distorted from its canonical A form, likely induced by binding step 1 factors21,38 (Figure 3b,f). This distortion enables the branch point adenosine to penetrate deeply into the active site so that its 2′-hydroxyl group can attack the 5′ splice site. Following the chemical reaction the DEAH box helicase Prp16 is activated and dissociates from the active site together with step 1 factors. The branch helix released from step 1 factors is free to rotate and is stabilized into a new orientation. The Prp8 RNaseH domain interacting with U2 snRNP in complex C now rotates so that its β-hairpin protrudes toward Cef1, and this interaction is stabilized by the WD40 domain of Prp17.41,42 This movement of the branch helix removes the branch point adenosine from the active site together with the GUAUGU sequence at the 5′ end of the intron. 
This creates a space for the 3′ exon to dock into the active site and rearranges the RNA−RNA interactions (Figure 3c,g). U5 snRNA loop 1 aligns the 5′ and 3′ exons such that the 3′-OH group of the 5′ exon can attack the 3′ splice site.15,16 The structure of P complex shows the ligated 5′ and 3′ exons and the 3′ splice site still docked at the active site.119−121 The ligated mRNA is then released by Prp22. Hence in the spliceosome the DEAH helicases and their associated proteins play a crucial role in docking of the substrate into the active site and removal of the product from the active site.
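The sequence of complexes described in this section can be summarized as an ordered list of transitions. The following Python sketch is only an illustrative bookkeeping aid, not a model of the chemistry: the names `Transition`, `CYCLE`, `ordered_states`, and `helicase_steps` are hypothetical, the placement of B* between Bact and C follows the description of B* as the state poised just before branching, and the step from P to ILS assumes that the intron lariat spliceosome is what remains after Prp22-mediated mRNA release.

```python
from dataclasses import dataclass

# DEAH-box helicases named in the text as drivers of remodeling steps.
HELICASES = {"Prp2", "Prp16", "Prp22"}


@dataclass(frozen=True)
class Transition:
    source: str   # complex before the step
    target: str   # complex after the step
    driver: str   # helicase or chemical step responsible (as read from the text)
    outcome: str  # short description of what changes


# Assumed linear ordering of complexes; labels are illustrative summaries.
CYCLE = [
    Transition("B", "Bact", "activation",
               "active site formed; 5' splice site positioned in the core"),
    Transition("Bact", "B*", "Prp2",
               "branch helix released from SF3b and free to rotate"),
    Transition("B*", "C", "branching",
               "lariat intron intermediate and cleaved 5' exon produced"),
    Transition("C", "C*", "Prp16",
               "branch helix and step 1 factors leave the active site"),
    Transition("C*", "P", "exon ligation",
               "5' and 3' exons ligated"),
    Transition("P", "ILS", "Prp22",
               "ligated mRNA released"),
]


def ordered_states(cycle):
    """Return the complexes in order, checking that the steps form one chain."""
    states = [cycle[0].source]
    for step in cycle:
        if step.source != states[-1]:
            raise ValueError(f"{step.source} does not follow {states[-1]}")
        states.append(step.target)
    return states


def helicase_steps(cycle):
    """Return only the transitions driven by a DEAH-box helicase."""
    return [step for step in cycle if step.driver in HELICASES]


if __name__ == "__main__":
    print(" -> ".join(ordered_states(CYCLE)))
    for step in helicase_steps(CYCLE):
        print(f"{step.driver}: {step.source} to {step.target} ({step.outcome})")
```

Listing the steps this way makes the alternation explicit: each chemical step (branching, exon ligation) is preceded and followed by a helicase-driven remodeling step that docks a substrate or removes a product.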

In addition, more subtle conformational dynamics within the catalytic triplex may also play a role in assisting the intricate swap of the 5′ splice site for the 3′ splice site. J2/3 participates in the γ−γ′ interaction, which consists of a single Watson−Crick pair between a J2/3 nucleotide and the very last residue of the intron. As a result, γ−γ′ is critical for 3′ splice site placement for the second step. The fact that J2/3 can exist in different configurations in the group II intron suggests that these conformational dynamics may assist in the formation of the γ−γ′ interaction for the second step. It should be added that the precise nature of the large-scale movement of DVI currently remains unresolved pending the structure determination of the precatalytic state of a lariat-forming group II intron. In addition, the mechanism for removal of ligated exons from the active site also remains unknown.

7. PRP8: A KEY PROTEIN IN THE SPLICEOSOME

7.1. A Brief History of Prp8

PRP8 was first identified in the yeast S. cerevisiae via a genetic screen for temperature-sensitive mutations that affected RNA metabolism.123,124 PRP8 was cloned via complementation of the temperature-sensitive growth defect in a prp8-1 mutant strain and shown to encode a 280 kDa protein comprising about 2400 residues.125 Immunodepletion of Prp8 from yeast extracts inactivated pre-mRNA splicing, while genetic depletion of the protein via a GAL1-regulated copy of PRP8 inhibited splicing in vivo and led to accumulation of complex A (prespliceosomes containing U1 and U2 snRNPs) in vitro.125,126 Further experiments with antibodies against Prp8 showed that it is present in U5 snRNP, in U4/U6.U5 tri-snRNP, and in spliceosomes during both catalytic steps of splicing.125−128 Prp8 orthologues were identified in various eukaryotes; invariably these were proteins of about 2400 residues, present in nuclear extracts and cross-reactive with antibodies raised against yeast Prp8.31 Alignments of Prp8 sequences from diverse eukaryotes displayed remarkable conservation from yeast to human to trypanosome to plant (for example, 61% identity between the yeast and human proteins), but initially revealed little in terms of specific protein motifs: Prp8 contains a putative nuclear localization signal (NLS) near the N-terminus, which has been shown to be required during U5 snRNP biogenesis in yeast for efficient nuclear import of a cytoplasmic precursor U5 snRNP particle containing the U5-specific assembly factor Aar2.129−131 Sequence analysis of Prp8 orthologues also revealed a C-terminal Jab1/MPN domain which subsequently became the first domain of the protein to be crystallized and analyzed structurally.132,133 Various genetic approaches in yeast identified prp8 mutations that could suppress pre-mRNA splicing defects, for example, those caused by 5′ splice site (5′SS) or 3′ splice site (3′SS) mutations.25,29,134−137 Some of these mutations (so-called first step alleles)
improved first catalytic step activity while inhibiting the second step, whereas others (second step alleles) improved second catalytic step activity while inhibiting the first step.137 It was striking that many of these suppressors mapped to two clusters: one between Prp8 residues 1547−1660, and the other around 200 residues downstream. Sequence comparisons showed that the upstream region of Prp8 has been extraordinarily conserved during evolution (72% identity between yeast and human), suggesting a crucial function in pre-mRNA splicing. Only some years later

6. CONFORMATIONAL REARRANGEMENTS DURING GROUP II INTRON SPLICING

The transition between the two steps of splicing in group II introns requires significant conformational rearrangements in order to empty the active site of first step substrates in preparation for exon ligation in the second step. It was first proposed by Chanfreau and Jacquier that DVI was involved in conformational rearrangements between the two steps of splicing to accomplish this goal.60 This work led to the first identification of the η−η′ interaction, which was thought to be a transient interaction required only for the second step. This model involved the η−η′ interaction being disengaged in the first step followed by DVI moving toward DII for the second step. In this model, the second step state is favored by the engagement of the η−η′ interaction. The structure of the IIB intron revealed the existence of a second tertiary interaction (π−π′) involving DVI. This work suggested that π−π′ was the dynamic tertiary interaction responsible for removing first step substrates out of the active site as it moves the branch site adenosine 20 Å away from the active site, after the completion of the first step.


did it become clear from structural analysis that the upstream region of Prp8 forms part of the active site cavity of the spliceosome34 while the downstream region corresponds to the RNaseH-like domain of Prp8,138−140 which appears to play multiple roles in spliceosome dynamics.141 Protein−RNA cross-linking experiments were crucial in establishing that Prp8 interacts extensively with the catalytic RNA core of the spliceosome, including all three sites of splicing chemistry in the substrate (5′SS, BP, and 3′SS), reviewed in ref 31. Cross-links between Prp8 and the 5′SS in human were mapped to a short peptide, later shown by structural analysis to be part of the RNaseH-like domain of Prp8.30,138−140,142 Together with genetic experiments this led to an initial suggestion that the RNaseH-like domain might form part of the spliceosome active site,134,135,142 but more recently the cryo-EM structures of Bact, C, and C* spliceosomes have shown this is not the case (see below). The development of functional spliceosomal snRNP reconstitution systems in yeast pre-mRNA splicing extracts99,143,144 provided the opportunity to analyze protein−snRNA contacts in functional snRNP particles via site-specific cross-linking (reviewed in ref 145). These studies and parallel biochemical experiments in the human system revealed extensive contacts between Prp8 and U5 snRNA, notably with the invariant U5 snRNA loop I sequence in both U5 snRNP and U4/U6.U5 tri-snRNP.146,147 Prp8 was also shown to contact nucleotide U54 in U6 snRNA,148 which would later be shown to reside at the center of the spliceosome’s catalytic RNA core.13,104 These findings, together with other genetic and biochemical evidence for specific contacts between U5 snRNA loop I and exon sequences at the 5′SS and 3′SS,15,16,149 made it clear that Prp8 must be closely involved in events at the spliceosome’s catalytic RNA core. 
The observation that the U5 snRNA loop I is dispensable for branching but essential for exon ligation in yeast splicing extracts99,150 supports the hypothesis that Prp8 could act in conjunction with U5 snRNA loop 1 and step 2 specific factors, such as Slu7 and Prp18, to facilitate alignment of the 5′ and 3′ exons and docking of the 3′SS at the spliceosome’s RNA core during the second step of splicing.146,150−157 Hence a wealth of experimental data collected over the years has provided compelling evidence that Prp8 is located at the very center of the spliceosome.31

Figure 5. Comparison of Prp8 and group II intron maturase. (a) Domain organization of Prp8 and group II intron maturase. NTD, N-terminal domain; HB, helix bundle domain; RT1−7, reverse transcriptase-like domain, motifs 1−7; Th/X, thumb/X domain of the RT; LN, Linker region; EN, endonuclease domain; RH, RNaseH-like domain; JM, Jab1/MPN domain; DBD, DNA binding domain. (b) Structure of Prp8RT domain and its interactions with the catalytic core.21 (c) Structure of the RT domain of LtrA group II intron maturase and its interactions with catalytic core.36 EBS1, exon binding loop 1. (d) Prp8RT and RNA catalytic core in the context of full-length Prp8. (e) RT domain of LtrA group II intron maturase and catalytic core in the context of entire group II intron. (f) Accessory domains in Prp8 form multiple contacts with catalytic core providing a scaffold for its assembly. BP, branch point.

7.2. Domain Structure of Prp8

The crystal structure34 revealed that Prp8 can be divided into four structurally distinct domains (N-terminal, Large, RNaseH-like, and Jab1/MPN), which change their relative positions and interaction networks during the splicing cycle (Figure 5a,d). The N-terminal domain is predominantly responsible for holding U5 snRNA, and it forms tight contacts with the spliceosomal GTPase Snu114.37,113 The Large domain is composed of the helix bundle (HB), the reverse transcriptase-like domain (RT), thumb/X, Linker, and type II restriction endonuclease-like domain (EN) which are linked through a flexible but highly conserved linker region.34 The RNaseH-like138−140 and Jab1/MPN132,133 domains are loosely attached to the main body of Prp8, and they exhibit dramatic movements during spliceosome assembly. Three Asp residues in the active site of all polymerases coordinate two catalytic magnesium ions,158 but in the RT domain of Prp8 two of the three Asp residues are mutated to Thr and Arg residues and thus it has lost its polymerase activity during evolution. The maturase contains an HNH-type nuclease but it is replaced by a

type II restriction endonuclease-like domain in Prp8.34 The Mg2+ coordinating residues in the endonuclease domain of Prp8 are conserved, but this domain is unlikely to function as an endonuclease as the active site is occluded by polypeptide loops. Indeed mutations of these Asp residues have no effect on yeast viability or growth, confirming that its endonuclease domain is not active or not involved in essential functions such as splicing. The RNaseH-like domain may have been added to maturase (ancestral Prp8) to facilitate degradation of group II intron RNA after its integration into the genome and use as a


Figure 6. Conserved loop 1570−1615 in Prp8 (α-finger region) and its interactions at the active site. (a) α-Finger contacts Cwc24 and Prp11 in complex Bact.39 (b) Stabilization of the step 1 factor, Cwc25, by the α-finger in complex C.21 (c) The α-finger forms an extended α-helix and stabilizes 3′ exon and 3′ splice site in complex P.119

Figure 7. Structural dynamics of Prp8 in various splicing complexes. (a) Movement of the N-terminal domain shapes the active site cavity to allow binding of the RNA catalytic core. tri, U4/U6.U5 tri-snRNP; B, Bact; C and C*, corresponding splicing complexes; ILS, intron lariat spliceosome. (b) Highly mobile RNaseH-like domain adopts different conformations in various spliceosomal complexes. (c) Switch-loop toggles between two states in response to 5′ exon binding.

HB domains of Prp8 (Figure 5f). U5 snRNA loop I is firmly gripped by the N-terminal domain of Prp8 and base pairs with the 5′ exon. The closure of the N-terminal and RT/En domains of Prp8 places loop I in close proximity to the intramolecular stem-loop of the U6 snRNA (ISL), which coordinates catalytic magnesium ions. The 5′ exon is stabilized by Cwc21 and the so-called switch-loop of Prp8, which toggles between two states that correlate with the presence of the 5′ exon. During the catalytic stages the 5′ single-stranded region of U6 snRNA is secured onto the N-terminal domain of Prp8 by Cwc2, Bud31, and Ecm2. These interactions remain unchanged in all activated complexes (Bact, C, C*, and P).

template for cDNA synthesis, but again the RNaseH-like domain in Prp8 has no catalytic activity. The Prp8 RNaseH-like domain changes its position during the assembly and activation of the spliceosome (Figure 7). It plays a particularly important role in stabilizing the branch helix into different positions.

7.3. Prp8 Cradles the Catalytic RNA Elements

The active site of the group II intron is stabilized by multiple RNA interactions, while in the spliceosome protein components play a crucial role in creating and stabilizing the active site. The substrates are docked and moved out of the active site by step-specific factors during nuclear pre-mRNA splicing. The first transesterification reaction (branching) produces a free 5′ exon and a lariat intron intermediate leading to the formation of C complex which, after remodeling, becomes capable of catalyzing the second transesterification reaction (exon ligation) using the products of branching as substrates. Hence C complex reveals the active site of the spliceosome in the catalytically active form and how it interacts with the surrounding proteins. The entire RNA catalytic core is cradled deeply in the active site cavity mainly formed by the N-terminal, Linker, and RT domains of Prp8. U2/U6 snRNA helices Ia and Ib are packed against the helix bundle (HB) domain: an additional domain not present in maturase and appended to the N-terminus of the RT domain.21,38 The folded intramolecular stem-loop region of U6 snRNA (ISL) is wedged between the N-terminal and the

7.4. Dynamic Rearrangements of Prp8 during the Splicing Cycle

Numerous interactions between Prp8 and spliceosomal RNAs are transient and specific to a particular complex along the splicing pathway. In particular the branch helix remains highly mobile as a result of the action of three stage-specific DEAH-box helicases (Prp2, Prp16, and Prp22) (Figure 3, discussed in section 5). In complex C it is docked onto a highly conserved, electropositive surface of the Large domain21,38 positioning the branch point adenosine for the nucleophilic attack at the 5′SS. A highly conserved loop 1570−1615 (α-finger region) projects out of this surface. This region was previously identified to encompass cross-linking sites for BP and 3′SS at


the second step of chemistry.119 The α-finger adopts multiple conformations in different splicing complexes. In tri-snRNP and complex B105,106 it interacts with the U4 snRNA phosphate backbone around a three-way junction; in complex Bact its two helices partially unfold and form close contacts with U2/U6 helix I, Cwc24, and Prp11 (Figure 6a). After displacement of the SF3a and SF3b complexes, the α-finger folds on top of the long α helix of Cwc25 (in complex C) and likely contributes to stabilization of the docked branch helix configuration (Figures 3b and 6b). After dissociation of Cwc25 for exon ligation, the α-finger becomes disordered in complex C*.41 The postsplicing P complex represents the spliceosome immediately after exon ligation (mRNA formation) but prior to mRNA release and harbors the ligated exon and 3′ splice site in the active site.119−121 The α-finger region folds into an α-helix to extend helix 8 in the Linker domain (Figure 6c). The extended helix interacts with both the 3′ exon and 3′ splice site, showing that prior to exon ligation the 3′ splice site−3′ exon wraps around the extended helix to present the scissile phosphate for nucleophilic attack by the 3′-hydroxyl group of the 5′ exon for exon ligation. An interesting structural toggling has also been observed for the so-called switch loop of Prp8, which likely stabilizes the 5′ exon at the active site (Figure 7c).
Discovery of the RNaseH-like domain in Prp8 triggered a discussion about putative involvement of the protein components in coordinating catalytic metal ions at the active site.138−141,159 This was mainly suggested by an accumulation of splice site suppressor mutations in this domain25,134 and a cross-link to the 5′SS detected in early human complexes.142 The hypothesis of a composite RNA−protein active site has been put forward,138 but to date there is no evidence for a direct involvement of the RNaseH-like domain in splicing catalysis,141 in agreement with biochemical23 and structural data.21,38 Nevertheless, the RNaseH-like domain appears to play an important role in guiding precise movements of the branch helix, acting together with step 1 and step 2 specific factors. In complex Bact, the RNaseH-like domain is held in place by its interactions with the SF3b factor Hsh155, Prp45, and Cwc22 and it remains away from the branch helix.39,40 Displacement of the SF3b proteins allows rotation of the RNaseH-like domain and its movement closer to the active site (Figure 3e). In the resulting complex C, the RNaseH-like domain guides the extended branch helix and provides a platform for docking of the step 1 factor Cwc2521,38 (Figure 3f). Upon Prp16-mediated remodeling, the branch helix is displaced from the active site, vacating the space necessary for docking of the 3′ splice site for exon ligation. This new configuration is stabilized by the RNaseH-like domain together with step 2 specific factors41−43,80 (Figure 3g).

The maturase facilitates folding of the intron RNA to form a catalytically competent RNA structure under the low ionic strength conditions typically encountered in vivo. In contrast, most group II introns require high concentrations of monovalent ions and Mg2+, combined with high temperatures, to splice in vitro in the absence of the maturase. The maturase contains four domains that are essential for retrotransposition via a mechanism known as target-primed reverse transcription163 (Figure 5a,c). In this mechanism, the spliced intron RNA and the maturase first form a high affinity complex that has a Kd in the picomolar range.161 This RNA−protein interaction largely involves the X domain within the maturase. The resulting complex recognizes a double-stranded DNA target through Watson−Crick pairing interactions between the exon binding sequences (EBS) 1 and 2 found within the intron and the target DNA sequence. In addition, the DNA-binding domain of the maturase assists in recognition of the DNA. The first step of integration is RNA-catalyzed, with the intron reverse splicing into the top strand of the dsDNA.162 This results in the intron being covalently attached at both the 5′ and 3′ ends to the target DNA. This is followed by the endonuclease domain cutting the bottom strand of the target DNA to generate a free 3′-OH. This 3′-OH is then used as a primer for cDNA synthesis by the reverse transcriptase domain using the integrated intron RNA as a template. This results in a cDNA copy of the intron being synthesized in the bottom strand. The integrated intron RNA in the upper strand is thought to be removed and filled in with DNA nucleotides through host DNA repair mechanisms. The end result is that the group II intron has multiplied and retrotransposed to a different genomic location, with the original source copy still being intact.
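As a rough illustration of the base-pairing recognition step described above, the following Python sketch scans one strand of a DNA target for a site that is fully complementary to an intron exon binding sequence. The function names and the example sequences are hypothetical and chosen only for illustration; the sketch considers strict Watson−Crick pairs only and ignores G·U wobble pairs, the contribution of the maturase DNA-binding domain, and any other recognition elements.

```python
# Watson-Crick partner in DNA for each RNA base (assumption: no wobble pairs).
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_site_for_ebs(ebs_rna: str) -> str:
    """Return the DNA sequence (5'->3') that pairs with an RNA EBS (5'->3').

    Pairing is antiparallel, so the partner is the reverse complement.
    """
    return "".join(RNA_TO_DNA_PARTNER[base] for base in reversed(ebs_rna.upper()))


def find_target_sites(ebs_rna: str, target_dna: str) -> list[int]:
    """Return 0-based start positions in target_dna that fully pair with the EBS."""
    site = dna_site_for_ebs(ebs_rna)
    target = target_dna.upper()
    return [
        start
        for start in range(len(target) - len(site) + 1)
        if target[start:start + len(site)] == site
    ]


# Hypothetical sequences, not taken from any real intron or DNA target.
example_ebs = "GUGCAA"
example_target = "ACGTTTGCACGGA"
sites = find_target_sites(example_ebs, example_target)
```

For these made-up sequences the complementary DNA site is `TTGCAC`, and `sites` is `[4]`. A real target search would need to combine EBS1 and EBS2 and account for protein-mediated recognition, which this sketch deliberately leaves out.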
The target-primed reverse transcription employed by group II introns is similar to that seen in the LINE elements found in humans and other eukaryotes. Due to this mechanistic similarity and some homology between maturases and LINE RTs, it has been hypothesized that group II introns are also ancestral to LINEs and other non-LTR retroelements.164 Non-LTR retroelements have had an especially large impact upon the evolution of humans, with more than 40% of the human genome being derived from these selfish elements.165

8.2. Coevolution of the Maturase and the Group II Intron RNA

The percentage of group II introns containing maturase ORFs has decreased over the course of evolution. For example, >99% of bacterial group II introns contain ORFs, compared with fungal group II introns of which ∼50% contain maturases. In higher plants, group II introns are almost all ORF-less; therefore, the general pattern is the loss/separation of the maturase from the splicing RNA component proceeding from bacteria to higher eukaryotes.51 In addition, group II intron distribution relative to protein-coding genes also exhibits a discernible pattern. Bacterial group II introns are almost exclusively found in intergenic locations and therefore function more like selfish retroelements, rather than as introns.166 Fungal group II introns are located in both conserved housekeeping genes and intergenic regions. Introns in higher eukaryotes are almost exclusively located in genes, suggesting a gradual change in insertion behavior from that of a retroelement to an intron. The cumulative data suggests that the primordial group II intron was a bacterial retroelement

8. MATURASE AND GROUP II INTRON AS MOBILE ELEMENT

8.1. Maturase Assists Splicing and Retrotransposition

Many group II introns encode a maturase protein in the DIV region. The maturase is a multifunctional protein with multiple domains that assist in splicing and retrotransposition. The maturase plays two major roles in group II introns: (1) to assist in the folding of the RNA component to form a catalytic structure that can self-splice;160,161 (2) to allow group II introns to insert into dsDNA through a copy-and-paste mechanism.162,163

DOI: 10.1021/acs.chemrev.7b00499 Chem. Rev. XXXX, XXX, XXX−XXX

Chemical Reviews

Review

followed by reverse transcription of the newly integrated sequence into a cDNA. Homologous recombination then occurs to incorporate the intron at a new position in the genome. The resulting intron was found to be fully functional and did not disrupt the function of the target gene. Therefore, the cumulative data regarding both group II and spliceosomal introns supports the “introns late” hypothesis in which introns invaded genes relatively late in the evolution of life on Earth.170

containing a maturase ORF, thus leading to the formulation of the “retroelement ancestor hypothesis”.51

8.3. Interaction between Group II Intron RNA and Maturase

The cryo-EM structure of a group II intron−maturase complex from the bacterium Lactococcus lactis was recently determined.36 This also represents the first structure of a group IIA intron. This work revealed extensive RNA−protein contacts between the maturase and domains I and IV of the intron RNA, which is consistent with previous biochemistry.167 Especially significant is the fact that the active sites of the intron and the reverse transcriptase are in close proximity to each other (Figure 5c), which is consistent with the expectation that they must engage the same DNA substrate during retrotransposition. However, there is no observed density for the endonuclease domain of the maturase, which may be due to the absence of DNA substrate in the structure. In other words, the endonuclease domain may only be ordered and engaged in the presence of DNA substrate. Interestingly, the thumb domain of the RT interacts with both exon-binding sequence and the mRNA. Therefore, it was hypothesized that the thumb domain may play a role in ligated exon release. The high-resolution crystal structures of the RT domain from IIC introns have been solved to better than 2 Å resolution.35 These RTs from two different IIC introns both crystallize as a dimer through the same interface. Based on the extensive nature of this dimer interface, as well as ultracentrifugation and light scattering experiments that include the DIV RNA, it was hypothesized that the fully intact RNP forms a dimer with 2:2 stoichiometry. Therefore, the RT protein is thought to nucleate the formation of an RNP dimer. However, a dimer was not seen in the cryo-EM structure36 of the group II intron RNP and it is not yet clear as to the reason for this discrepancy. One possibility is that the dimer may have been disrupted during the purification of the group II intron RNP from L. lactis. 
This could be due to the fact that the dimerized RNP may be labile and exist in both monomeric and dimeric forms during different stages of catalysis.

9. EVOLUTION OF THE SPLICEOSOME

As discussed in section 8, group II introns can be integrated into the host genome by a reverse splicing mechanism after excision from the transcript. The endonuclease activity of maturase then nicks the DNA strand opposite the integrated group II intron RNA and the reverse transcriptase activity of maturase produces a complementary DNA copy of the integrated RNA. The integrated group II intron RNA is then degraded by host RNase H activity and the gap produced by RNA degradation can be filled by DNA polymerase. In this way group II introns can invade the genome. In addition to this retrohoming mechanism, group II introns could be inserted into the genome by retrotransposition.171 An excised group II intron is reverse-spliced into an ectopic site in mRNA, and following cDNA and second-strand synthesis the group II intron sequence could be integrated in sense orientation into the gene by homologous recombination. Koonin argues that group II introns in α-proteobacteria (ancestors of mitochondria) invaded the genome of emerging proto-eukaryotic cells.172 However, in principle group II introns could have spread at any stage of eukaryotic evolution. This mechanism relies on the RNA-catalyzed self-splicing and reverse splicing activities of the group II intron, and hence the sequence and folding of the intron RNA had to be conserved. Trans-acting group II introns can form from independently transcribed RNAs that bind together to support self-splicing activity.173 This discovery prompted Sharp to propose that snRNAs may be fragments (“five easy pieces”) of group II introns.17 Such fragments could be generated by DNA rearrangements within one or more cis-acting group II introns.
When Sharp’s perspective article was written the similarity between group II intron domain VI and the branch helix had already been recognized,12 but the structural similarities between group II intron domain V and the active site RNA of the spliceosome were not yet known.13,23,104 As discussed in the previous sections the active site triplex in group II intron domain V is structurally very similar to the spliceosome active site triplex comprising U6 and U2 snRNAs. Furthermore, U5 snRNA loop I in the spliceosome fulfills the same function as the exon binding loop in group II introns.15 When we determined the crystal structure of Prp8, the structure immediately suggested that the Large domain of Prp8 comprising the RT, thumb/X, Linker, and endonuclease domains may have evolved from the group II intron maturase while retaining the same domain architecture.34 Furthermore, structural comparison between the maturase−group II intron complex and the spliceosome shows that the positions of the catalytic triplex and the exon binding loop with respect to the maturase/Prp8 are also conserved, strengthening the notion that they evolved from a common ancestor. How could the fragmented group II intron have recruited protein components to become a trans-acting spliceosome? Folding studies of group II introns have shown that domain I folds first and serves as an assembly template for rapid and

8.4. Implications of the Maturase for the Dispersal of Spliceosomal Introns

Group II introns are currently thought to share ancestry with the spliceosome and non-LTR retroelements based on both structural and sequence homology.168 The group II introns are likely to have first evolved in bacteria billions of years ago. These group II introns then became incorporated into eukaryotes during the endosymbiont event in which bacteria evolved into the mitochondria and chloroplasts. This hypothesis is supported by the fact that group II introns are only found in these organelles in eukaryotes. It is thought that the evolution of the maturase and intron RNA diverged in the migration of group II introns into the nucleus. This resulted in the RNA component of group II introns forming the spliceosomal catalytic core, while the group II intron maturase evolved into the non-LTR retroelements. Together, both of these genetic elements comprise ∼70% of the human genome (including derived sequence from nonfunctional retroelements). This evolutionary hypothesis leads to the question of how spliceosomal introns have dispersed and replicated throughout eukaryotic genomes. Recent work by Lee and Stevens has captured the act of spliceosomal intronogenesis using an in vivo reporter assay in yeast.169 This mechanism involves the lariat first reverse splicing into a noncognate mRNA sequence; this is


Figure 8. Plausible steps of the evolution of the spliceosome. (a) Secondary structure of group II introns which form a complex with maturase as shown in Figure 5e. (inset) Domain I of group II intron.33 (b) Group II intron in which domain I is replaced by a protein domain (Prp8 NTD) which is attached to the RT domain of maturase and harbors the exon binding loop 1. (c) Insertion in domain V provides extra anchoring points in the absence of domain I. (d) Domains V and VI undergo fragmentation giving rise to U6 and U2 snRNAs, respectively. Domains II, III, and IV are replaced by protein domains to cradle the active site RNA (U2, U6, and pre-mRNA) instead of the scaffold of RNA. The ancestral trans-acting spliceosome can catalyze excision of introns.

faithful assembly of the other five domains (DII−DVI).174,57 DI is the largest domain and harbors the exon binding loops (Figure 8). The conserved loop I of U5 snRNA is a functional equivalent of the exon binding loop of group II introns: it tethers the 5′ exon during the first step and aligns the 5′ and 3′ exons during exon ligation.15,16 In addition to its function as a folding platform, the group II intron domain I interacts extensively with the maturase and positions the exon binding loop and the catalytic Mg2+ binding site of domain V in the active site cavity. It is interesting to note that stem-loop I of U5 snRNA binds in the cleft between the two halves of the N-terminal domain of Prp8 and is firmly held by a polypeptide chain which binds across the minor groove of its stem and places its exon binding loop within the active site cavity, fulfilling the function of domain I. Another important function of domain I is positioning the apical loop of domain V through the ζ−ζ′ interaction so that the catalytic triplex can be placed into the active site cavity of the maturase. In the spliceosome the Prp8 N-terminal domain firmly grips U5 snRNA stem-loop I to position its exon binding loop I in the active site of the spliceosome, replacing the function of group II intron domain I (Figure 8). The apical loop of intramolecular stem-loop of U6

snRNA is not involved in RNA tertiary interaction, but the catalytic triplex makes extensive interaction with the Linker domain and the helix bundle (HB) domain of Prp8. In the group II intron the catalytic triplex is positioned by an RNA tertiary interaction (ζ−ζ′) with domain I, but this interaction is replaced by RNA−protein interactions in the spliceosome. In the absence of domain I the rest of the group II intron, i.e., domains II−VI, may still be able to maintain its structure by tertiary interactions such as π−π′, η−η′, and τ−τ′ (Figure 1c) and interact with maturase mainly through domains IV, V, and VI, allowing all the catalytic elements to be placed into the active site cavity. Gradual replacement of domain I with a protein component (i.e., Prp8-Nterm) might have been followed by the insertion of a new sequence into domain V which later became U2/U6 helix II. It is very plausible that such insertion occurred by a reverse splicing mechanism as proposed for the U6 snRNA introns in Schizosaccharomyces pombe, Rhodotorula hasegawae, and Rhodosporidium dacryoidum.175 In the spliceosome the U2/U6 snRNA helix II region is secured onto the HB domain of Prp8 by Cef1 and Syf2. Hence with this insertion the group II intron RNA can be secured to maturase/Prp8 even in the absence of domains II−IV. By opening the


chances of retention and propagation of the trans-acting spliceosome. The structural homology observed between the group II intron and the spliceosome hints at many tantalizing possibilities in the field of molecular evolution. It is possible that these splicing machines led to the evolution of eukaryotic genomic complexity through intron dispersal and alternative splicing. This is likely to have led to the first appearance of complex animals and diverse body plans seen in the Cambrian explosion. The structural conservation of the core also has significant implications for the field of RNA biochemistry. During the course of evolution the spliceosome has lost large scaffolding regions of the ancestral group II intron RNA and has replaced them with protein domains. However, it is remarkable that the spliceosome remains a ribozyme and still uses a single active site to catalyze the two steps of pre-mRNA splicing reactions.

apical loops of domain VI and the U6 snRNA insertion, U2 snRNA could be separated from U6 snRNA. With this arrangement U2 snRNA can form helices Ia and Ib,13 as well as the branch helix. The catalytic components forming the ancestral trans-acting spliceosome could finally be separated from pre-mRNA substrates. Formation of the trans-acting spliceosome removed the evolutionary pressure from introns to maintain their tertiary structure and allowed them to freely evolve into the current form. In group II introns the 5′ exon sequence is bound to the exon binding loop in a sequence-specific manner, whereas in the spliceosome the 5′ end of the intron is bound to the ACAGAGA sequence in U6 snRNA and the 5′ exon sequence is bound to the conserved U-rich loop I of U5 snRNA so that there is little sequence restriction in the 5′ exon sequence. It is not clear if the ancestral group II intron fragmentation preceded gradual replacement of the scaffolding RNA with proteins. In principle, both processes could have happened independently. Presumably U4 snRNA evolved later to keep the spliceosome catalytically inactive until the pre-mRNA branch sequence was correctly recognized by U2 snRNA and the 5′ splice site by the ACAGAGA region of U6 snRNA. In the yeast spliceosome the interaction between the 5′ intron sequence and the ACAGAGA sequence is not sufficient to select the 5′ splice site unambiguously. U1 snRNA evolved to select a pre-mRNA 5′ splice site initially and then hand it over to U6 snRNA to introduce it into the active site of the spliceosome. The interactions between snRNAs and between pre-mRNA and snRNAs involve base pairing of relatively short sequences. The snRNAs became complexed with proteins to facilitate the recognition of these sequences and to assist spliceosome assembly. There are many examples of RNA tertiary interactions in the group II intron that have been replaced by analogous RNA−RNA or RNA−protein contacts in the spliceosome.
For example, there are similar pairing interactions at the 5′ ends of both group II and spliceosomal introns that serve to properly position the 5′ splice site. The GUGYG sequence (bold indicates interacting residues) at the 5′ end of group II introns pairs to the ε′ motif found in DI. This interaction is replaced in the spliceosome by a pairing between the intron GUAUGU sequence and the U6 snRNA ACAGAGA sequence. The helix bundle domain added to the N-terminus of the RT domain provides a binding platform for the U2 snRNA helix Ia−Ib region in a manner reminiscent of the κ−κ′ interaction between DI and the catalytic DV in the group II intron. The group II intron and maturase coevolved and tertiary interactions between different regions of the group II intron RNA are replaced by protein-mediated interactions in the spliceosome. As group II intron domains V and VI evolved into U6 and U2 snRNAs, more protein domains were added to Prp8 and more proteins were recruited to the spliceosome to facilitate the assembly of snRNAs into the catalytic RNA core and their stable binding to Prp8. Trans- and cis-acting spliceosomes are likely to have coexisted at the beginning, but as the trans-acting spliceosome became more efficient, Prp8 lost its RT and endonuclease activities and thus lost its homing activity to become an RNA assembly platform. The positive selection for the trans-acting version may have arisen from its ability to bind and splice introns of highly variable size and its ability to engage in the act of alternative splicing, which generates protein diversity in eukaryotes. This diversity would be hypothesized to have a positive effect on the host organism, therefore increasing the

AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

ORCID

Kiyoshi Nagai: 0000-0003-1785-6510

Notes

The authors declare no competing financial interest.

Biographies

Wojciech Galej was born in Rzeszow, Poland, in 1985. He completed his primary education in Sanok, Poland, after which he moved to Warsaw to pursue his undergraduate studies in mathematics and natural sciences. He obtained a B.Sc. degree in molecular biology in 2007 from the University of Warsaw. In 2009 he completed two M.Sc. degrees: in molecular biology working with Joanna Kufel and Andrzej Dziembowski and in Chemistry (small molecule crystallography) working with Paulina Dominiak and Krzysztof Wozniak. He moved to Cambridge in 2009 to pursue Ph.D. studies with Kiyoshi Nagai, where together with Andy Newman they determined a crystal structure of Prp8, revealing the active site cavity of the spliceosome. He obtained his Ph.D. from the University of Cambridge in 2013 and continued working as a postdoc at the MRC Laboratory of Molecular Biology, applying newly emerging cryo-EM methodologies to study splicing complexes. During the past three years he was involved in the structure determination of several splicing complexes including U4/U6.U5 tri-snRNP and complexes C, C*, and P. Since 2017 he has been a group leader at the EMBL in Grenoble, France.

Navtej Toor was born in the Rocky Mountains of Sparwood, BC, Canada, in 1973. He graduated from Sparwood Secondary School in 1991 and moved to Calgary, AB, to pursue his undergraduate studies. He obtained his B.Sc. degree in biochemistry from the University of Calgary in 1996. He conducted graduate research under Steven Zimmerly also at the University of Calgary, where he studied the evolution of group II introns. After earning his Ph.D. in 2004, he began his postdoctoral studies at Yale University under Anna Marie Pyle. During this period, he worked to solve the first crystal structure of a group II intron. He started his independent faculty position in 2009 as an assistant professor at the University of California, San Diego, in the Department of Chemistry and Biochemistry. In 2014, his group solved the first structure of a group II intron lariat. He was promoted to associate professor in 2015. He currently lives in San Diego with his wife Parmi and three children.


(6) Domdey, H.; Apostol, B.; Lin, R. J.; Newman, A.; Brody, E.; Abelson, J. Lariat Structures Are in Vivo Intermediates in Yeast Pre-mRNA Splicing. Cell 1984, 39 (3), 611−621.
(7) Sharp, P. A. On the Origin of RNA Splicing and Introns. Cell 1985, 42 (2), 397−400.
(8) Cech, T. R. The Generality of Self-Splicing RNA: Relationship to Nuclear mRNA Splicing. Cell 1986, 44 (2), 207−210.
(9) Michel, F.; Jacquier, A.; Dujon, B. Comparison of Fungal Mitochondrial Introns Reveals Extensive Homologies in RNA Secondary Structure. Biochimie 1982, 64 (10), 867−881.
(10) Steitz, J. A.; Black, D. L.; Gerke, V.; Parker, K. A.; Krämer, A.; Frendewey, D.; Keller, W. Functions of the Abundant U-snRNPs. In Structure and Function of Major and Minor Small Nuclear Ribonucleoprotein Particles; Springer: Berlin, 1988; pp 115−154.
(11) Yang, V. W.; Lerner, M. R.; Steitz, J. A.; Flint, S. J. A Small Nuclear Ribonucleoprotein Is Required for Splicing of Adenoviral Early RNA Sequences. Proc. Natl. Acad. Sci. U. S. A. 1981, 78 (3), 1371−1375.
(12) Parker, R.; Siliciano, P. G.; Guthrie, C. Recognition of the TACTAAC Box During mRNA Splicing in Yeast Involves Base Pairing to the U2-Like snRNA. Cell 1987, 49 (2), 229−239.
(13) Madhani, H. D.; Guthrie, C. A Novel Base-Pairing Interaction Between U2 and U6 snRNAs Suggests a Mechanism for the Catalytic Activation of the Spliceosome. Cell 1992, 71 (5), 803−817.
(14) Lesser, C. F.; Guthrie, C. Mutations in U6 snRNA That Alter Splice Site Specificity: Implications for the Active Site. Science 1993, 262 (5142), 1982−1988.
(15) Newman, A. J.; Norman, C. U5 snRNA Interacts with Exon Sequences at 5′ and 3′ Splice Sites. Cell 1992, 68 (4), 743−754.
(16) Sontheimer, E. J.; Steitz, J. A. The U5 and U6 Small Nuclear RNAs as Active Site Components of the Spliceosome. Science 1993, 262 (5142), 1989−1996.
(17) Sharp, P. A. Five Easy Pieces. Science 1991, 254 (5032), 663.
(18) Steitz, T. A.; Steitz, J. A. A General Two-Metal-Ion Mechanism for Catalytic RNA. Proc. Natl. Acad. Sci. U. S. A. 1993, 90 (14), 6498−6502.
(19) Toor, N.; Keating, K. S.; Taylor, S. D.; Pyle, A. M. Crystal Structure of a Self-Spliced Group II Intron. Science 2008, 320 (5872), 77−82.
(20) Robart, A. R.; Chan, R. T.; Peters, J. K.; Rajashankar, K. R.; Toor, N. Crystal Structure of a Eukaryotic Group II Intron Lariat. Nature 2014, 514 (7521), 193−197.
(21) Galej, W. P.; Wilkinson, M. E.; Fica, S. M.; Oubridge, C.; Newman, A. J.; Nagai, K. Cryo-EM Structure of the Spliceosome Immediately After Branching. Nature 2016, 537 (7619), 197−201.
(22) Yean, S. L.; Wuenschell, G.; Termini, J.; Lin, R. J. Metal-Ion Coordination by U6 Small Nuclear RNA Contributes to Catalysis in the Spliceosome. Nature 2000, 408 (6814), 881−884.
(23) Fica, S. M.; Tuttle, N.; Novak, T.; Li, N.-S.; Lu, J.; Koodathingal, P.; Dai, Q.; Staley, J. P.; Piccirilli, J. A. RNA Catalyses Nuclear Pre-mRNA Splicing. Nature 2013, 503 (7475), 229−234.
(24) Marcia, M.; Pyle, A. M. Visualizing Group II Intron Catalysis Through the Stages of Splicing. Cell 2012, 151 (3), 497−507.
(25) Query, C. C.; Konarska, M. M. Suppression of Multiple Substrate Mutations by Spliceosomal Prp8 Alleles Suggests Functional Correlations with Ribosomal Ambiguity Mutants. Mol. Cell 2004, 14 (3), 343−354.
(26) Teigelkamp, S.; Whittaker, E.; Beggs, J. D. Interaction of the Yeast Splicing Factor PRP8 with Substrate RNA During Both Steps of Splicing. Nucleic Acids Res. 1995, 23 (3), 320−326.
(27) Wyatt, J. R.; Sontheimer, E. J.; Steitz, J. A. Site-Specific Cross-Linking of Mammalian U5 snRNP to the 5′ Splice Site Before the First Step of Pre-mRNA Splicing. Genes Dev. 1992, 6 (12B), 2542−2553.
(28) Umen, J. G.; Guthrie, C. A Novel Role for a U5 snRNP Protein in 3′ Splice Site Selection. Genes Dev. 1995, 9 (7), 855−868.
(29) Umen, J. G.; Guthrie, C. Mutagenesis of the Yeast Gene PRP8 Reveals Domains Governing the Specificity and Fidelity of 3′ Splice Site Selection. Genetics 1996, 143 (2), 723−739.

Andrew James Newman was born in 1955 in Totnes, U.K. He studied natural sciences at the University of Cambridge and graduated in 1977. In 1981 he received his Ph.D. in molecular biology from the University of Edinburgh. He carried out postdoctoral research on pre-mRNA splicing under the supervision of John Abelson at the University of California, San Diego, CA, USA (1981−1983), and the California Institute of Technology, Pasadena, CA, USA (1983−1984). Since 1985 he has been working on pre-mRNA splicing at the MRC Laboratory of Molecular Biology, Cambridge, U.K. In 1992 he showed that U5 snRNA interacts with exon sequences in the catalytic core of the spliceosome. He was elected a member of EMBO in 1995. In 2013, together with Wojtek Galej and Kiyoshi Nagai, he showed that the Prp8 protein forms the active site cavity of the spliceosome. He continues to work closely with Kiyoshi Nagai on spliceosome structure and function.

Kiyoshi Nagai was born in Osaka, Japan, in 1949. He obtained his B.Sc. and M.Sc. degrees from Osaka University and began working toward his Ph.D. in 1974 on the allosteric effect in hemoglobin under the supervision of Hideki Morimoto. During his Ph.D. studies he spent 18 months as a visiting student at the MRC Laboratory of Molecular Biology, Cambridge, and worked with John Kilmartin and Max Perutz. After completing his Ph.D. in Japan, he returned to MRC LMB to work on overproduction of eukaryotic proteins in E. coli and studied hemoglobin evolution by protein engineering. In 1987 he became a tenured group leader and began his structural work on DNA and RNA binding proteins. His group determined the crystal structures of many components of the spliceosome as protein or RNA−protein complexes, including human U1 snRNP. In 2000 he was elected a fellow of the Royal Society and a member of EMBO. In 2013 with Wojtek Galej and Andy Newman he determined the structure of Prp8, which interacts intimately with the catalytic RNA core of the spliceosome. Since 2014 his group has applied cryo-EM to capture the structures of the spliceosome in different assembly and catalytic states and provided crucial insight into the catalytic mechanism of pre-mRNA splicing. He is a keen amateur cellist and lives in Cambridge with his wife, Yoshiko.

ACKNOWLEDGMENTS

The authors thank Marco Marcia, Sebastian Fica, Pei-Chun Lin, Clemens Plaschka, Clément Charenton, Lisa Strittmatter, and Max Wilkinson for critical reading of the manuscript. This work was supported by NIH Grant 5R01GM102216 awarded to N.T. and the Medical Research Council (MC_U105184330) and European Research Council Advanced Grant (693087SPLICE3D) awarded to K.N.

REFERENCES

(1) Chow, L. T.; Gelinas, R. E.; Broker, T. R.; Roberts, R. J. An Amazing Sequence Arrangement at the 5′ Ends of Adenovirus 2 Messenger RNA. Cell 1977, 12 (1), 1−8.
(2) Berget, S. M.; Moore, C.; Sharp, P. A. Spliced Segments at the 5′ Terminus of Adenovirus 2 Late mRNA. Proc. Natl. Acad. Sci. U. S. A. 1977, 74 (8), 3171−3175.
(3) Padgett, R. A.; Konarska, M. M.; Grabowski, P. J.; Hardy, S. F.; Sharp, P. A. Lariat RNA’s as Intermediates and Products in the Splicing of Messenger RNA Precursors. Science 1984, 225 (4665), 898−903.
(4) Ruskin, B.; Krainer, A. R.; Maniatis, T.; Green, M. R. Excision of an Intact Intron as a Novel Lariat Structure During Pre-mRNA Splicing in Vitro. Cell 1984, 38 (1), 317−331.
(5) Rodriguez, J. R.; Pikielny, C. W.; Rosbash, M. In Vivo Characterization of Yeast mRNA Processing Intermediates. Cell 1984, 39 (3), 603−610.


(30) Reyes, J. L.; Kois, P.; Konforti, B. B.; Konarska, M. M. The Canonical GU Dinucleotide at the 5′ Splice Site Is Recognized by P220 of the U5 snRNP Within the Spliceosome. RNA 1996, 2 (3), 213−225.
(31) Grainger, R. J.; Beggs, J. D. Prp8 Protein: at the Heart of the Spliceosome. RNA 2005, 11 (5), 533−557.
(32) Turner, I. A.; Norman, C. M.; Churcher, M. J.; Newman, A. J. Dissection of Prp8 Protein Defines Multiple Interactions with Crucial RNA Sequences in the Catalytic Core of the Spliceosome. RNA 2006, 12 (3), 375−386.
(33) Dlakić, M.; Mushegian, A. Prp8, the Pivotal Protein of the Spliceosomal Catalytic Center, Evolved From a Retroelement-Encoded Reverse Transcriptase. RNA 2011, 17 (5), 799−808.
(34) Galej, W. P.; Oubridge, C.; Newman, A. J.; Nagai, K. Crystal Structure of Prp8 Reveals Active Site Cavity of the Spliceosome. Nature 2013, 493 (7434), 638−643.
(35) Zhao, C.; Pyle, A. M. Crystal Structures of a Group II Intron Maturase Reveal a Missing Link in Spliceosome Evolution. Nat. Struct. Mol. Biol. 2016, 23 (6), 558−565.
(36) Qu, G.; Kaushal, P. S.; Wang, J.; Shigematsu, H.; Piazza, C. L.; Agrawal, R. K.; Belfort, M.; Wang, H.-W. Structure of a Group II Intron in Complex with Its Reverse Transcriptase. Nat. Struct. Mol. Biol. 2016, 23 (6), 549−557.
(37) Yan, C.; Hang, J.; Wan, R.; Huang, M.; Wong, C. C. L.; Shi, Y. Structure of a Yeast Spliceosome at 3.6-Angstrom Resolution. Science 2015, 349 (6253), 1182−1191.
(38) Wan, R.; Yan, C.; Bai, R.; Huang, G.; Shi, Y. Structure of a Yeast Catalytic Step I Spliceosome at 3.4 Å Resolution. Science 2016, 353 (6302), 895−904.
(39) Yan, C.; Wan, R.; Bai, R.; Huang, G.; Shi, Y. Structure of a Yeast Activated Spliceosome at 3.5 Å Resolution. Science 2016, 353 (6302), 904−911.
(40) Rauhut, R.; Fabrizio, P.; Dybkov, O.; Hartmuth, K.; Pena, V.; Chari, A.; Kumar, V.; Lee, C.-T.; Urlaub, H.; Kastner, B.; et al. Molecular Architecture of the Saccharomyces Cerevisiae Activated Spliceosome. Science 2016, 353 (6306), 1399−1405.
(41) Fica, S. M.; Oubridge, C.; Galej, W. P.; Wilkinson, M. E.; Bai, X.-C.; Newman, A. J.; Nagai, K. Structure of a Spliceosome Remodelled for Exon Ligation. Nature 2017, 542 (7641), 377−380.
(42) Yan, C.; Wan, R.; Bai, R.; Huang, G.; Shi, Y. Structure of a Yeast Step II Catalytically Activated Spliceosome. Science 2017, 355 (6321), 149−155.
(43) Bertram, K.; Agafonov, D. E.; Liu, W.-T.; Dybkov, O.; Will, C. L.; Hartmuth, K.; Urlaub, H.; Kastner, B.; Stark, H.; Lührmann, R. Cryo-EM Structure of a Human Spliceosome Activated for Step 2 of Splicing. Nature 2017, 542 (7641), 318−323.
(44) Kruger, K.; Grabowski, P. J.; Zaug, A. J.; Sands, J.; Gottschling, D. E.; Cech, T. R. Self-Splicing RNA: Autoexcision and Autocyclization of the Ribosomal RNA Intervening Sequence of Tetrahymena. Cell 1982, 31 (1), 147−157.
(45) Peebles, C. L.; Perlman, P. S.; Mecklenburg, K. L.; Petrillo, M. L.; Tabor, J. H.; Jarrell, K. A.; Cheng, H. L. A Self-Splicing RNA Excises an Intron Lariat. Cell 1986, 44 (2), 213−223.
(46) van der Veen, R.; Arnberg, A. C.; van der Horst, G.; Bonen, L.; Tabak, H. F.; Grivell, L. A. Excised Group II Introns in Yeast Mitochondria Are Lariats and Can Be Formed by Self-Splicing in Vitro. Cell 1986, 44 (2), 225−234.
(47) Ferat, J. L.; Michel, F. Group II Self-Splicing Introns in Bacteria. Nature 1993, 364 (6435), 358−361.
(48) Dai, L.; Toor, N.; Olson, R.; Keeping, A.; Zimmerly, S. Database for Mobile Group II Introns. Nucleic Acids Res. 2003, 31 (1), 424−426.
(49) Hausner, G.; Olson, R.; Simon, D.; Johnson, I.; Sanders, E. R.; Karol, K. G.; McCourt, R. M.; Zimmerly, S. Origin and Evolution of the Chloroplast trnK (matK) Intron: a Model for Evolution of Group II Intron RNA Structures. Mol. Biol. Evol. 2006, 23 (2), 380−391.
(50) Vallès, Y.; Halanych, K. M.; Boore, J. L. Group II Introns Break New Boundaries: Presence in a Bilaterian’s Genome. PLoS One 2008, 3 (1), e1488.

(51) Toor, N.; Hausner, G.; Zimmerly, S. Coevolution of Group II Intron RNA Structures with Their Intron-Encoded Reverse Transcriptases. RNA 2001, 7 (8), 1142−1152.
(52) Toro, N.; Martínez-Abarca, F. Comprehensive Phylogenetic Analysis of Bacterial Group II Intron-Encoded ORFs Lacking the DNA Endonuclease Domain Reveals New Varieties. PLoS One 2013, 8 (1), e55102.
(53) Nagy, V.; Pirakitikulr, N.; Zhou, K. I.; Chillón, I.; Luo, J.; Pyle, A. M. Predicted Group II Intron Lineages E and F Comprise Catalytically Active Ribozymes. RNA 2013, 19 (9), 1266−1278.
(54) Rest, J. S.; Mindell, D. P. Retroids in Archaea: Phylogeny and Lateral Origins. Mol. Biol. Evol. 2003, 20 (7), 1134−1142.
(55) Toor, N.; Robart, A. R.; Christianson, J.; Zimmerly, S. Self-Splicing of a Group IIC Intron: 5′ Exon Recognition and Alternative 5′ Splicing Events Implicate the Stem-Loop Motif of a Transcriptional Terminator. Nucleic Acids Res. 2006, 34 (22), 6461−6471.
(56) Robart, A. R.; Seo, W.; Zimmerly, S. Insertion of Group II Intron Retroelements After Intrinsic Transcriptional Terminators. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (16), 6620−6625.
(57) Zhao, C.; Rajashankar, K. R.; Marcia, M.; Pyle, A. M. Crystal Structure of Group II Intron Domain 1 Reveals a Template for RNA Assembly. Nat. Chem. Biol. 2015, 11 (12), 967−972.
(58) Jacquier, A.; Michel, F. Multiple Exon-Binding Sites in Class II Self-Splicing Introns. Cell 1987, 50 (1), 17−29.
(59) Costa, M.; Michel, F.; Westhof, E. A Three-Dimensional Perspective on Exon Binding by a Group II Self-Splicing Intron. EMBO J. 2000, 19 (18), 5007−5018.
(60) Chanfreau, G.; Jacquier, A. An RNA Conformational Change Between the Two Chemical Steps of Group II Self-Splicing. EMBO J. 1996, 15 (13), 3466−3476.
(61) Fedorova, O.; Pyle, A. M. Linking the Group II Intron Catalytic Domains: Tertiary Contacts and Structural Features of Domain 3. EMBO J. 2005, 24 (22), 3906−3916.
(62) Michel, F.; Lang, B. F. Mitochondrial Class II Introns Encode Proteins Related to the Reverse Transcriptases of Retroviruses. Nature 1985, 316 (6029), 641−643.
(63) Boulanger, S. C.; Belcher, S. M.; Schmidt, U.; Dib-Hajj, S. D.; Schmidt, T.; Perlman, P. S. Studies of Point Mutants Define Three Essential Paired Nucleotides in the Domain 5 Substructure of a Group II Intron. Mol. Cell. Biol. 1995, 15 (8), 4479−4488.
(64) Lambowitz, A. M.; Zimmerly, S. Group II Introns: Mobile Ribozymes That Invade DNA. Cold Spring Harbor Perspect. Biol. 2011, 3 (8), a003616.
(65) Padgett, R. A.; Podar, M.; Boulanger, S. C.; Perlman, P. S. The Stereochemical Course of Group II Intron Self-Splicing. Science 1994, 266 (5191), 1685−1688.
(66) Moore, M. J.; Sharp, P. A. Evidence for Two Active Sites in the Spliceosome Provided by Stereochemistry of Pre-mRNA Splicing. Nature 1993, 365 (6444), 364−368.
(67) McSwiggen, J. A.; Cech, T. R. Stereochemistry of RNA Cleavage by the Tetrahymena Ribozyme and Evidence That the Chemical Step Is Not Rate-Limiting. Science 1989, 244 (4905), 679−683.
(68) Rajagopal, J.; Doudna, J. A.; Szostak, J. W. Stereochemical Course of Catalysis by the Tetrahymena Ribozyme. Science 1989, 244 (4905), 692−694.
(69) Schmidt, U.; Podar, M.; Stahl, U.; Perlman, P. S. Mutations of the Two-Nucleotide Bulge of D5 of a Group II Intron Block Splicing in Vitro and in Vivo: Phenotypes and Suppressor Mutations. RNA 1996, 2 (11), 1161−1172.
(70) Gordon, P. M.; Fong, R.; Piccirilli, J. A. A Second Divalent Metal Ion in the Group II Intron Reaction Center. Chem. Biol. 2007, 14 (6), 607−612.
(71) Gordon, P. M.; Piccirilli, J. A. Metal Ion Coordination by the AGC Triad in Domain 5 Contributes to Group II Intron Catalysis. Nat. Struct. Biol. 2001, 8 (10), 893−898.
(72) Zhuang, Y.; Weiner, A. M. A Compensatory Base Change in Human U2 snRNA Can Suppress a Branch Site Mutation. Genes Dev. 1989, 3 (10), 1545−1552.

DOI: 10.1021/acs.chemrev.7b00499 Chem. Rev. XXXX, XXX, XXX−XXX

(73) Wu, J. A.; Manley, J. L. Base Pairing Between U2 and U6 snRNAs Is Necessary for Splicing of a Mammalian Pre-mRNA. Nature 1991, 352 (6338), 818−821.
(74) Zhang, L.; Doudna, J. A. Structural Insights Into Group II Intron Catalysis and Branch-Site Selection. Science 2002, 295 (5562), 2084−2088.
(75) Toor, N.; Rajashankar, K.; Keating, K. S.; Pyle, A. M. Structural Basis for Exon Recognition by a Group II Intron. Nat. Struct. Mol. Biol. 2008, 15 (11), 1221−1222.
(76) de Lencastre, A.; Hamill, S.; Pyle, A. M. A Single Active-Site Region for a Group II Intron. Nat. Struct. Mol. Biol. 2005, 12 (7), 626−627.
(77) Chan, R. T.; Robart, A. R.; Rajashankar, K. R.; Pyle, A. M.; Toor, N. Crystal Structure of a Group II Intron in the Pre-Catalytic State. Nat. Struct. Mol. Biol. 2012, 19 (5), 555−557.
(78) Toor, N.; Keating, K. S.; Fedorova, O.; Rajashankar, K.; Wang, J.; Pyle, A. M. Tertiary Architecture of the Oceanobacillus Iheyensis Group II Intron. RNA 2010, 16 (1), 57−69.
(79) Costa, M.; Walbott, H.; Monachello, D.; Westhof, E.; Michel, F. Crystal Structures of a Group II Intron Lariat Primed for Reverse Splicing. Science 2016, 354 (6316), aaf9258.
(80) Zhang, X.; Yan, C.; Hang, J.; Finci, L. I.; Lei, J.; Shi, Y. An Atomic Structure of the Human Spliceosome. Cell 2017, 169 (5), 918−929.e14.
(81) Adams, P. L.; Stahley, M. R.; Kosek, A. B.; Wang, J.; Strobel, S. A. Crystal Structure of a Self-Splicing Group I Intron with Both Exons. Nature 2004, 430 (6995), 45−50.
(82) Marcia, M.; Pyle, A. M. Principles of Ion Recognition in RNA: Insights From the Group II Intron Structures. RNA 2014, 20 (4), 516−527.
(83) Lerner, M. R.; Boyle, J. A.; Mount, S. M.; Wolin, S. L.; Steitz, J. A. Are snRNPs Involved in Splicing? Nature 1980, 283 (5743), 220−224.
(84) Rogers, J.; Wall, R. A Mechanism for RNA Splicing. Proc. Natl. Acad. Sci. U. S. A. 1980, 77 (4), 1877−1879.
(85) Mount, S. M.; Pettersson, I.; Hinterberger, M.; Karmas, A.; Steitz, J. A. The U1 Small Nuclear RNA-Protein Complex Selectively Binds a 5′ Splice Site in Vitro. Cell 1983, 33 (2), 509−518.
(86) Zhuang, Y.; Weiner, A. M. A Compensatory Base Change in U1 snRNA Suppresses a 5′ Splice Site Mutation. Cell 1986, 46 (6), 827−835.
(87) Krämer, A.; Keller, W.; Appel, B.; Lührmann, R. The 5′ Terminus of the RNA Moiety of U1 Small Nuclear Ribonucleoprotein Particles Is Required for the Splicing of Messenger RNA Precursors. Cell 1984, 38 (1), 299−307.
(88) Chabot, B.; Steitz, J. A. Recognition of Mutant and Cryptic 5′ Splice Sites by the U1 Small Nuclear Ribonucleoprotein in Vitro. Mol. Cell. Biol. 1987, 7 (2), 698−707.
(89) Bindereif, A.; Green, M. R. An Ordered Pathway of snRNP Binding During Mammalian Pre-mRNA Splicing Complex Assembly. EMBO J. 1987, 6 (8), 2415−2424.
(90) Ruby, S. W.; Abelson, J. An Early Hierarchic Role of U1 Small Nuclear Ribonucleoprotein in Spliceosome Assembly. Science 1988, 242 (4881), 1028−1035.
(91) Séraphin, B.; Kretzner, L.; Rosbash, M. A U1 snRNA:Pre-mRNA Base Pairing Interaction Is Required Early in Yeast Spliceosome Assembly but Does Not Uniquely Define the 5′ Cleavage Site. EMBO J. 1988, 7 (8), 2533−2538.
(92) Séraphin, B.; Rosbash, M. Identification of Functional U1 snRNA-Pre-mRNA Complexes Committed to Spliceosome Assembly and Splicing. Cell 1989, 59 (2), 349−358.
(93) Staley, J. P.; Guthrie, C. An RNA Switch at the 5′ Splice Site Requires ATP and the DEAD Box Protein Prp28p. Mol. Cell 1999, 3 (1), 55−64.
(94) Sawa, H.; Abelson, J. Evidence for a Base-Pairing Interaction Between U6 Small Nuclear RNA and 5′ Splice Site During the Splicing Reaction in Yeast. Proc. Natl. Acad. Sci. U. S. A. 1992, 89 (23), 11269−11273.

(95) Sawa, H.; Shimura, Y. Association of U6 snRNA with the 5′-Splice Site Region of Pre-mRNA in the Spliceosome. Genes Dev. 1992, 6 (2), 244−254.
(96) Johnson, T. L.; Abelson, J. Characterization of U4 and U6 Interactions with the 5′ Splice Site Using a S. Cerevisiae in Vitro Trans-Splicing System. Genes Dev. 2001, 15 (15), 1957−1970.
(97) Chan, S.-P.; Kao, D.-I.; Tsai, W.-Y.; Cheng, S.-C. The Prp19p-Associated Complex in Spliceosome Activation. Science 2003, 302 (5643), 279−282.
(98) Kandels-Lewis, S.; Séraphin, B. Involvement of U6 snRNA in 5′ Splice Site Selection. Science 1993, 262 (5142), 2035−2039.
(99) O’Keefe, R. T.; Norman, C.; Newman, A. J. The Invariant U5 snRNA Loop 1 Sequence Is Dispensable for the First Catalytic Step of Pre-mRNA Splicing in Yeast. Cell 1996, 86 (4), 679−689.
(100) Brow, D. A.; Guthrie, C. Spliceosomal RNA U6 Is Remarkably Conserved From Yeast to Mammals. Nature 1988, 334 (6179), 213−218.
(101) Madhani, H. D.; Bordonné, R.; Guthrie, C. Multiple Roles for U6 snRNA in the Splicing Pathway. Genes Dev. 1990, 4 (12B), 2264−2277.
(102) Fabrizio, P.; Abelson, J. Thiophosphates in Yeast U6 snRNA Specifically Affect Pre-mRNA Splicing in Vitro. Nucleic Acids Res. 1992, 20 (14), 3659−3664.
(103) Yu, Y. T.; Maroney, P. A.; Darzynkiwicz, E.; Nilsen, T. W. U6 snRNA Function in Nuclear Pre-mRNA Splicing: a Phosphorothioate Interference Analysis of the U6 Phosphate Backbone. RNA 1995, 1 (1), 46−54.
(104) Fica, S. M.; Mefford, M. A.; Piccirilli, J. A.; Staley, J. P. Evidence for a Group II Intron-Like Catalytic Triplex in the Spliceosome. Nat. Struct. Mol. Biol. 2014, 21 (5), 464−471.
(105) Plaschka, C.; Lin, P.-C.; Nagai, K. Structure of a Pre-Catalytic Spliceosome. Nature 2017, 546 (7660), 617−621.
(106) Bertram, K.; Agafonov, D. E.; Dybkov, O.; Haselbach, D.; Leelaram, M. N.; Will, C. L.; Urlaub, H.; Kastner, B.; Lührmann, R.; Stark, H. Cryo-EM Structure of a Pre-Catalytic Human Spliceosome Primed for Activation. Cell 2017, 170 (4), 701−713.e711.
(107) Scheres, S. H.; Nagai, K. CryoEM Structures of Spliceosomal Complexes Reveal the Molecular Mechanism of Pre-mRNA Splicing. Curr. Opin. Struct. Biol. 2017, 46, 130−139.
(108) Pomeranz Krummel, D. A.; Oubridge, C.; Leung, A. K. W.; Li, J.; Nagai, K. Crystal Structure of Human Spliceosomal U1 snRNP at 5.5 Å Resolution. Nature 2009, 458 (7237), 475−480.
(109) Weber, G.; Trowitzsch, S.; Kastner, B.; Lührmann, R.; Wahl, M. C. Functional Organization of the Sm Core in the Crystal Structure of Human U1 snRNP. EMBO J. 2010, 29 (24), 4172−4184.
(110) Leung, A. K. W.; Nagai, K.; Li, J. Structure of the Spliceosomal U4 snRNP Core Domain and Its Implication for snRNP Biogenesis. Nature 2011, 473 (7348), 536−539.
(111) Kondo, Y.; Oubridge, C.; van Roon, A.-M. M.; Nagai, K. Crystal Structure of Human U1 snRNP, a Small Nuclear Ribonucleoprotein Particle, Reveals the Mechanism of 5′ Splice Site Recognition. eLife 2015, 4, e04986.
(112) Montemayor, E. J.; Curran, E. C.; Liao, H. H.; Andrews, K. L.; Treba, C. N.; Butcher, S. E.; Brow, D. A. Core Structure of the U6 Small Nuclear Ribonucleoprotein at 1.7-Å Resolution. Nat. Struct. Mol. Biol. 2014, 21 (6), 544−551.
(113) Nguyen, T. H. D.; Galej, W. P.; Bai, X.-C.; Savva, C. G.; Newman, A. J.; Scheres, S. H. W.; Nagai, K. The Architecture of the Spliceosomal U4/U6.U5 Tri-snRNP. Nature 2015, 523 (7558), 47−52.
(114) Nguyen, T. H. D.; Galej, W. P.; Bai, X.-C.; Oubridge, C.; Newman, A. J.; Scheres, S. H. W.; Nagai, K. Cryo-EM Structure of the Yeast U4/U6.U5 Tri-snRNP at 3.7 Å Resolution. Nature 2016, 530 (7590), 298−302.
(115) Wan, R.; Yan, C.; Bai, R.; Wang, L.; Huang, M.; Wong, C. C. L.; Shi, Y. The 3.8 Å Structure of the U4/U6.U5 Tri-snRNP: Insights Into Spliceosome Assembly and Catalysis. Science 2016, 351 (6272), 466−475.

(116) Agafonov, D. E.; Kastner, B.; Dybkov, O.; Hofele, R. V.; Liu, W.-T.; Urlaub, H.; Lührmann, R.; Stark, H. Molecular Architecture of the Human U4/U6.U5 Tri-snRNP. Science 2016, 351 (6280), 1416−1420.
(117) Wan, R.; Yan, C.; Bai, R.; Lei, J.; Shi, Y. Structure of an Intron Lariat Spliceosome From Saccharomyces Cerevisiae. Cell 2017, 171 (1), 120−132.e12.
(118) Li, X.; Liu, S.; Jiang, J.; Zhang, L.; Espinosa, S.; Hill, R. C.; Hansen, K. C.; Zhou, Z. H.; Zhao, R. CryoEM Structure of Saccharomyces Cerevisiae U1 snRNP Offers Insight Into Alternative Splicing. Nat. Commun. 2017, 8 (1), 1035.
(119) Wilkinson, M. E.; Fica, S. M.; Galej, W. P.; Norman, C. M.; Newman, A. J.; Nagai, K. Postcatalytic Spliceosome Structure Reveals Mechanism of 3′-Splice Site Selection. Science 2017, 358 (6368), 1283−1288.
(120) Liu, S.; Li, X.; Zhang, L.; Jiang, J.; Hill, R. C.; Cui, Y.; Hansen, K. C.; Zhou, Z. H.; Zhao, R. Structure of the Yeast Spliceosomal Postcatalytic P Complex. Science 2017, 358 (6368), 1278−1283.
(121) Bai, R.; Yan, C.; Wan, R.; Lei, J.; Shi, Y. Structure of the Post-Catalytic Spliceosome From Saccharomyces Cerevisiae. Cell 2017, 171 (7), 1589−1598.e8.
(122) Mahamid, J.; Pfeffer, S.; Schaffer, M.; Villa, E.; Danev, R.; Cuellar, L. K.; Förster, F.; Hyman, A. A.; Plitzko, J. M.; Baumeister, W. Visualizing the Molecular Sociology at the HeLa Cell Nuclear Periphery. Science 2016, 351 (6276), 969−972.
(123) Hartwell, L. H. Macromolecule Synthesis in Temperature-Sensitive Mutants of Yeast. J. Bacteriol. 1967, 93 (5), 1662−1670.
(124) Hartwell, L. H.; McLaughlin, C. S.; Warner, J. R. Identification of Ten Genes That Control Ribosome Formation in Yeast. Mol. Gen. Genet. 1970, 109 (1), 42−56.
(125) Jackson, S. P.; Lossky, M.; Beggs, J. D. Cloning of the RNA8 Gene of Saccharomyces Cerevisiae, Detection of the RNA8 Protein, and Demonstration That It Is Essential for Nuclear Pre-mRNA Splicing. Mol. Cell. Biol. 1988, 8 (3), 1067−1075.
(126) Brown, J. D.; Beggs, J. D. Roles of PRP8 Protein in the Assembly of Splicing Complexes. EMBO J. 1992, 11 (10), 3721−3729.
(127) Lossky, M.; Anderson, G. J.; Jackson, S. P.; Beggs, J. Identification of a Yeast snRNP Protein and Detection of snRNP-snRNP Interactions. Cell 1987, 51 (6), 1019−1026.
(128) Whittaker, E.; Lossky, M.; Beggs, J. D. Affinity Purification of Spliceosomes Reveals That the Precursor RNA Processing Protein PRP8, a Protein in the U5 Small Nuclear Ribonucleoprotein Particle, Is a Component of Yeast Spliceosomes. Proc. Natl. Acad. Sci. U. S. A. 1990, 87 (6), 2216−2219.
(129) Gottschalk, A.; Kastner, B.; Lührmann, R.; Fabrizio, P. The Yeast U5 snRNP Coisolated with the U1 snRNP Has an Unexpected Protein Composition and Includes the Splicing Factor Aar2p. RNA 2001, 7 (11), 1554−1565.
(130) Boon, K.-L.; Grainger, R. J.; Ehsani, P.; Barrass, J. D.; Auchynnikava, T.; Inglehearn, C. F.; Beggs, J. D. Prp8 Mutations That Cause Human Retinitis Pigmentosa Lead to a U5 snRNP Maturation Defect in Yeast. Nat. Struct. Mol. Biol. 2007, 14 (11), 1077−1083.
(131) Weber, G.; Cristão, V. F.; Santos, K. F.; Jovin, S. M.; Heroven, A. C.; Holton, N.; Lührmann, R.; Beggs, J. D.; Wahl, M. C. Structural Basis for Dual Roles of Aar2p in U5 snRNP Assembly. Genes Dev. 2013, 27 (5), 525−540.
(132) Pena, V.; Liu, S.; Bujnicki, J. M.; Lührmann, R.; Wahl, M. C. Structure of a Multipartite Protein-Protein Interaction Domain in Splicing Factor Prp8 and Its Link to Retinitis Pigmentosa. Mol. Cell 2007, 25 (4), 615−624.
(133) Zhang, L.; Shen, J.; Guarnieri, M. T.; Heroux, A.; Yang, K.; Zhao, R. Crystal Structure of the C-Terminal Domain of Splicing Factor Prp8 Carrying Retinitis Pigmentosa Mutants. Protein Sci. 2007, 16, 1024−1031.
(134) Collins, C. A.; Guthrie, C. Allele-Specific Genetic Interactions Between Prp8 and RNA Active Site Residues Suggest a Function for Prp8 at the Catalytic Core of the Spliceosome. Genes Dev. 1999, 13 (15), 1970−1982.

(135) Siatecka, M.; Reyes, J. L.; Konarska, M. M. Functional Interactions of Prp8 with Both Splice Sites at the Spliceosomal Catalytic Center. Genes Dev. 1999, 13 (15), 1983−1993.
(136) Ben-Yehuda, S.; Russell, C. S.; Dix, I.; Beggs, J. D.; Kupiec, M. Extensive Genetic Interactions Between PRP8 and PRP17/CDC40, Two Yeast Genes Involved in Pre-mRNA Splicing and Cell Cycle Progression. Genetics 2000, 154 (1), 61−71.
(137) Liu, L.; Query, C. C.; Konarska, M. M. Opposing Classes of Prp8 Alleles Modulate the Transition Between the Catalytic Steps of Pre-mRNA Splicing. Nat. Struct. Mol. Biol. 2007, 14 (6), 519−526.
(138) Pena, V.; Rozov, A.; Fabrizio, P.; Lührmann, R.; Wahl, M. C. Structure and Function of an RNase H Domain at the Heart of the Spliceosome. EMBO J. 2008, 27 (21), 2929−2940.
(139) Yang, K.; Zhang, L.; Xu, T.; Heroux, A.; Zhao, R. Crystal Structure of the Beta-Finger Domain of Prp8 Reveals Analogy to Ribosomal Proteins. Proc. Natl. Acad. Sci. U. S. A. 2008, 105 (37), 13817−13822.
(140) Ritchie, D. B.; Schellenberg, M. J.; Gesner, E. M.; Raithatha, S. A.; Stuart, D. T.; MacMillan, A. M. Structural Elucidation of a PRP8 Core Domain From the Heart of the Spliceosome. Nat. Struct. Mol. Biol. 2008, 15 (11), 1199−1205.
(141) Abelson, J. A Close-Up Look at the Spliceosome, at Last. Proc. Natl. Acad. Sci. U. S. A. 2017, 114 (17), 4288−4293.
(142) Reyes, J. L.; Gustafson, E. H.; Luo, H. R.; Moore, M. J.; Konarska, M. M. The C-Terminal Region of hPrp8 Interacts with the Conserved GU Dinucleotide at the 5′ Splice Site. RNA 1999, 5 (2), 167−179.
(143) Fabrizio, P.; McPheeters, D. S.; Abelson, J. In Vitro Assembly of Yeast U6 snRNP: a Functional Assay. Genes Dev. 1989, 3 (12B), 2137−2150.
(144) McPheeters, D. S.; Fabrizio, P.; Abelson, J. In Vitro Reconstitution of Functional Yeast U2 snRNPs. Genes Dev. 1989, 3 (12B), 2124−2136.
(145) Favre, A.; Saintomé, C.; Fourrey, J. L.; Clivio, P.; Laugâa, P. Thionucleobases as Intrinsic Photoaffinity Probes of Nucleic Acid Structure and Nucleic Acid-Protein Interactions. J. Photochem. Photobiol., B 1998, 42 (2), 109−124.
(146) Dix, I.; Russell, C. S.; O’Keefe, R. T.; Newman, A. J.; Beggs, J. D. Protein-RNA Interactions in the U5 snRNP of Saccharomyces Cerevisiae. RNA 1998, 4 (12), 1674−1686.
(147) Urlaub, H.; Hartmuth, K.; Kostka, S.; Grelle, G.; Lührmann, R. A General Approach for Identification of RNA-Protein Cross-Linking Sites Within Native Human Spliceosomal Small Nuclear Ribonucleoproteins (snRNPs). Analysis of RNA-Protein Contacts in Native U1 and U4/U6.U5 snRNPs. J. Biol. Chem. 2000, 275 (52), 41458−41468.
(148) Vidal, V. P.; Verdone, L.; Mayes, A. E.; Beggs, J. D. Characterization of U6 snRNA-Protein Interactions. RNA 1999, 5 (11), 1470−1481.
(149) Newman, A. J.; Teigelkamp, S.; Beggs, J. D. snRNA Interactions at 5′ and 3′ Splice Sites Monitored by Photoactivated Crosslinking in Yeast Spliceosomes. RNA 1995, 1 (9), 968−980.
(150) O’Keefe, R. T.; Newman, A. J. Functional Analysis of the U5 snRNA Loop 1 in the Second Catalytic Step of Yeast Pre-mRNA Splicing. EMBO J. 1998, 17 (2), 565−574.
(151) Frank, D.; Patterson, B.; Guthrie, C. Synthetic Lethal Mutations Suggest Interactions Between U5 Small Nuclear RNA and Four Proteins Required for the Second Step of Splicing. Mol. Cell. Biol. 1992, 12 (11), 5197−5205.
(152) Frank, D.; Guthrie, C. An Essential Splicing Factor, SLU7, Mediates 3′ Splice Site Choice in Yeast. Genes Dev. 1992, 6 (11), 2112−2124.
(153) Umen, J. G.; Guthrie, C. Prp16p, Slu7p, and Prp8p Interact with the 3′ Splice Site in Two Distinct Stages During the Second Catalytic Step of Pre-mRNA Splicing. RNA 1995, 1 (6), 584−597.
(154) Teigelkamp, S.; Newman, A. J.; Beggs, J. D. Extensive Interactions of PRP8 Protein with the 5′ and 3′ Splice Sites During Splicing Suggest a Role in Stabilization of Exon Alignment by U5 snRNA. EMBO J. 1995, 14 (11), 2602−2612.

(155) Zhang, X.; Schwer, B. Functional and Physical Interaction Between the Yeast Splicing Factors Slu7 and Prp18. Nucleic Acids Res. 1997, 25 (11), 2146−2152.
(156) Aronova, A.; Bacíková, D.; Crotti, L. B.; Horowitz, D. S.; Schwer, B. Functional Interactions Between Prp8, Prp18, Slu7, and U5 snRNA During the Second Step of Pre-mRNA Splicing. RNA 2007, 13 (9), 1437−1444.
(157) Horowitz, D. S. The Mechanism of the Second Step of Pre-mRNA Splicing. Wiley Interdiscip. Rev. RNA 2012, 3 (3), 331−350.
(158) Joyce, C. M.; Steitz, T. A. Function and Structure Relationships in DNA Polymerases. Annu. Rev. Biochem. 1994, 63 (1), 777−822.
(159) Schellenberg, M. J.; Wu, T.; Ritchie, D. B.; Fica, S.; Staley, J. P.; Atta, K. A.; LaPointe, P.; MacMillan, A. M. A Conformational Switch in PRP8 Mediates Metal Ion Coordination That Promotes Pre-mRNA Exon Ligation. Nat. Struct. Mol. Biol. 2013, 20 (6), 728−734.
(160) Saldanha, R.; Chen, B.; Wank, H.; Matsuura, M.; Edwards, J.; Lambowitz, A. M. RNA and Protein Catalysis in Group II Intron Splicing and Mobility Reactions Using Purified Components. Biochemistry 1999, 38 (28), 9069−9083.
(161) Wank, H.; SanFilippo, J.; Singh, R. N.; Matsuura, M.; Lambowitz, A. M. A Reverse Transcriptase/Maturase Promotes Splicing by Binding at Its Own Coding Segment in a Group II Intron RNA. Mol. Cell 1999, 4 (2), 239−250.
(162) Zimmerly, S.; Guo, H.; Eskes, R.; Yang, J.; Perlman, P. S.; Lambowitz, A. M. A Group II Intron RNA Is a Catalytic Component of a DNA Endonuclease Involved in Intron Mobility. Cell 1995, 83 (4), 529−538.
(163) Zimmerly, S.; Guo, H.; Perlman, P. S.; Lambowitz, A. M. Group II Intron Mobility Occurs by Target DNA-Primed Reverse Transcription. Cell 1995, 82 (4), 545−554.
(164) Lambowitz, A. M.; Belfort, M. Mobile Bacterial Group II Introns at the Crux of Eukaryotic Evolution. Microbiol. Spectrum 2015.
(165) Deininger, P. L.; Batzer, M. A. Mammalian Retroelements. Genome Res. 2002, 12 (10), 1455−1465.
(166) Dai, L.; Zimmerly, S. Compilation and Analysis of Group II Intron Insertions in Bacterial Genomes: Evidence for Retroelement Behavior. Nucleic Acids Res. 2002, 30 (5), 1091−1102.
(167) Singh, R. N.; Saldanha, R. J.; D’Souza, L. M.; Lambowitz, A. M. Binding of a Group II Intron-Encoded Reverse Transcriptase/Maturase to Its High Affinity Intron RNA Binding Site Involves Sequence-Specific Recognition and Autoregulates Translation. J. Mol. Biol. 2002, 318 (2), 287−303.
(168) Zhao, C.; Pyle, A. M. The Group II Intron Maturase: a Reverse Transcriptase and Splicing Factor Go Hand in Hand. Curr. Opin. Struct. Biol. 2017, 47, 30−39.
(169) Lee, S.; Stevens, S. W. Spliceosomal Intronogenesis. Proc. Natl. Acad. Sci. U. S. A. 2016, 113 (23), 6514−6519.
(170) Logsdon, J. M.; Tyshenko, M. G.; Dixon, C.; D-Jafari, J.; Walker, V. K.; Palmer, J. D. Seven Newly Discovered Intron Positions in the Triose-Phosphate Isomerase Gene: Evidence for the Introns-Late Theory. Proc. Natl. Acad. Sci. U. S. A. 1995, 92 (18), 8507−8511.
(171) Cousineau, B.; Lawrence, S.; Smith, D.; Belfort, M. Retrotransposition of a Bacterial Group II Intron. Nature 2000, 404 (6781), 1018−1021.
(172) Martin, W.; Koonin, E. V. Introns and the Origin of Nucleus-Cytosol Compartmentalization. Nature 2006, 440 (7080), 41−45.
(173) Goldschmidt-Clermont, M.; Choquet, Y.; Girard-Bascou, J.; Michel, F.; Schirmer-Rahire, M.; Rochaix, J. D. A Small Chloroplast RNA May Be Required for Trans-Splicing in Chlamydomonas Reinhardtii. Cell 1991, 65 (1), 135−143.
(174) Su, L. J.; Waldsich, C.; Pyle, A. M. An Obligate Intermediate Along the Slow Folding Pathway of a Group II Intron Ribozyme. Nucleic Acids Res. 2005, 33 (21), 6674−6687.
(175) Tani, T.; Ohshima, Y. mRNA-Type Introns in U6 Small Nuclear RNA Genes: Implications for the Catalysis in Pre-mRNA Splicing. Genes Dev. 1991, 5 (6), 1022−1031.
(176) Zhou, L.; Hang, J.; Zhou, Y.; Wan, R.; Lu, G.; Yin, P.; Yan, C.; Shi, Y. Nature 2014, 506 (7486), 116−120.
