Viewpoint Cite This: Biochemistry XXXX, XXX, XXX-XXX

pubs.acs.org/biochemistry

Defining Functional Structured RNA inside Living Cells

Dalen Chan† and Robert C. Spitale*,†,‡

†Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
‡Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States



There is a growing wealth of knowledge detailing functional RNA folding in all aspects of biology. Most functional RNA structures can easily be explored in vitro, but it is often difficult to predict functional RNA structures when they are studied in their native environment (1). Furthermore, the structural relevance of less folded regions of RNA is more difficult to interpret, because dynamic RNA conformations, along with potential protein interactions, may lead to multiple functions. Recent studies that have profiled the structures of cellular RNAs have seen the merger of traditional chemical probing with the advent of deep sequencing (1). Modification sites are identified as stop sites in conventional reverse transcription (RT). RT stops can then be mapped to a reference transcriptome to obtain RNA structure profiles of whole intact RNAs inside living cells. Such approaches have shed light on structured regions that are important for regulating translation, RNA−protein interactions, and even RNA modifications (1). Nevertheless, it has been a challenge to identify causal RNA structure regions that play critical roles in regulating RNA biology. One critical limitation that all sequencing protocols have, by virtue of the library construction steps, is the inability to sample structures near the very 3′-end of transcripts. This has left a major hole in our understanding of not only the structural content of 3′-ends but also how such structures could control important processes such as RNA metabolism and degradation. Creating a complete picture of RNA structures near the 3′-end of RNAs and potentially discovering functional structures that are important for RNA processing are critical challenges for the field, both experimentally and biologically. Publishing recently in Cell (2), the Bartel lab explored whether less structured regions of mRNA 3′-ends can still be structurally relevant in polyadenylation processing and stability.

To gain insight into RNA structures near the 3′-end of RNAs, the Bartel lab began by making novel advances in structure probing and 3′-end profiling. First, they took advantage of the 3′-poly(A) tail by designing all of their RT experiments to prime off the 3′-end of the RNA through poly(T) priming (3). Second, they adopted a recently validated mutational mapping approach from a previously reported protocol, RNA interaction groups by mutational profiling (RING-MaP) (4), which relies on the ability of reverse transcriptase enzymes to misincorporate nucleotides opposite a modified nucleotide (5). With dimethyl sulfate (DMS) as the chemical probe and a highly processive reverse transcriptase, sites that would otherwise cause RT stops can now be read through and recorded as single-point mutations. This DMS mutational mapping approach (DMS-MAP) is advantageous over conventional RT stops because of its ability to read through multiple modifications on a single RNA and generate longer reads from the 3′-end of the RNA (5), enabling the exploration of RNA structures controlling 3′-end maintenance and degradation.

Classifying the site of polyadenylation of each transcript is difficult because many nucleotides can separate the polyadenylation signal (PAS) from the polyadenylation site, and the position of that site is variable. To first identify optimal sequence distances, they screened an artificial library of sequences between the PAS and the poly(A) site in HEK293T cells for in vivo polyadenylation processing (2). An optimal distance of 15−18 nucleotides between the upstream PAS and the site of polyadenylation was observed. However, an interesting finding is that sequences with distances much larger than this optimal window were predicted to fold more stably than sequences with optimal distances. One hypothesis from these findings was that RNA structure elements were positioning the PAS and the site of polyadenylation for optimal regulation of decay. To test this hypothesis, they first folded all the sequences from the screen in silico and noticed that sequences that were cleaved and processed had lower predicted free energy scores (that is, more stable predicted structures) when the distance between the PAS and the site of polyadenylation exceeded 27 nucleotides. In sequences between 17 and 27 nucleotides in length, the opposite was observed, indicating that these RNAs are likely to be unfolded. Distances of fewer than 14 nucleotides were too short to have substantial structure predicted, which is consistent with a lack of cleavage at those sites. Overall, these observations suggested that there is an optimal length between the PAS and the site of polyadenylation (15−18 nucleotides), and longer distances may utilize structure to bring the PAS and the site of polyadenylation into proximity to enhance cleavage (Figure 1).

With the addition of the new methods, the lab explored the potential of structured (or unstructured) regions for polyadenylation processing (2). Applying DMS-MAP and their sequencing technique (DIM-2P-seq) to native RNA in HEK293T cells provided evidence of small stem−loop substructures, along with unstructured regions upstream of the poly(A) sequence. A remarkable demarcation of structure bordered the PAS: upstream residues were unfolded, and downstream residues were highly structured. Approximately 50% of all A and C residues were base paired between the PAS and the poly(A) site. On the basis of this finding, the authors proposed that the structured regions of distal PASs bring the PAS and the poly(A) site into proximity for optimal processing. Further investigation of the role of these structured regions was performed with mutagenesis experiments. Mutations and compensatory mutations on the stem−loop structure decreased and restored polyadenylation efficiency, respectively. Experi-

Received: August 22, 2017


DOI: 10.1021/acs.biochem.7b00816 Biochemistry XXXX, XXX, XXX−XXX


3′-end analyses, and other methods focused on protein binding such as CLIP. Novel structure reagents, and clever development of protocols to utilize such methods, can complement current approaches in the performance of a more thorough investigation of RNA structure in cells. The combined analyses from these tools will shape our future perspective on functional RNA structure.
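As a small illustration of the read-out that mutational profiling methods such as DMS-MAP rely on, the sketch below tallies per-position mismatch rates from reads aligned to a reference. It is a simplified example in Python under stated assumptions, not the protocol's actual analysis: the function name `mismatch_rates` and the toy reads are hypothetical, and reads are assumed to be gapless, already aligned, and supplied as `(start, sequence)` pairs. Real pipelines would also handle insertions, deletions, quality filtering, and background correction.

```python
from typing import List, Optional, Tuple


def mismatch_rates(reference: str,
                   reads: List[Tuple[int, str]]) -> List[Optional[float]]:
    """Return mismatches / coverage for each reference position.

    Positions with no coverage are reported as None.
    """
    coverage = [0] * len(reference)
    mismatches = [0] * len(reference)
    for start, read in reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos < 0 or pos >= len(reference):
                continue  # ignore bases that fall outside the reference
            coverage[pos] += 1
            if base != reference[pos]:
                mismatches[pos] += 1
    return [m / c if c else None for m, c in zip(mismatches, coverage)]


# Toy data (invented): mismatches fall on A or C, the bases DMS reports on.
reference = "GACUAC"
reads = [(0, "GACUAC"), (0, "GGCUAC"), (2, "CUAU"), (2, "CUGC")]
rates = mismatch_rates(reference, reads)
print(rates)
```

Positions with elevated mismatch rates would be read as candidate unpaired A or C residues, which is the kind of signal that can then be combined with 3′-end analyses and protein-binding data.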



AUTHOR INFORMATION

Corresponding Author
*E-mail: [email protected].

ORCID
Robert C. Spitale: 0000-0002-3511-8098

Funding
The Spitale lab is supported by start-up funds from the University of California, Irvine, and the National Institutes of Health (Grants 1DP2GM119164 and 1RO1MH109588 to R.C.S.). R.C.S. is a Pew Biomedical Scholar.

Notes
The authors declare no competing financial interest.

Figure 1. Cartoon detailing the role of RNA structure in 3′-end processing. The optimal distance between the PAS and the 3′-end cleavage site is 15−18 nucleotides in primary sequence length. Primary sequences that are >27 nucleotides in length and are unfolded will result in rapid RNA degradation. For sequences that are >27 nucleotides in length and have stable structures that bring the PAS closer to the cleavage site, RNAs are efficiently cleaved and polyadenylated.
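The spacing rules summarized in the cartoon can be written down as a small lookup. The following is only an illustrative sketch in Python: the function name `expected_outcome`, the `is_folded` flag, and the outcome labels are hypothetical, and distances that the caption does not describe are returned as unspecified rather than guessed.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the qualitative outcomes drawn in Figure 1."""
    OPTIMAL_SPACING = "cleaved and polyadenylated (optimal spacing)"
    STRUCTURE_ASSISTED = "cleaved and polyadenylated (structure shortens the distance)"
    DEGRADED = "rapidly degraded"
    UNSPECIFIED = "not described in the cartoon"


def expected_outcome(distance_nt: int, is_folded: bool) -> Outcome:
    """Map a PAS-to-cleavage-site distance (in nucleotides) to an outcome.

    Only the cases stated in the caption are encoded: 15-18 nt is optimal,
    and for >27 nt the outcome depends on whether the region is folded.
    """
    if distance_nt < 0:
        raise ValueError("distance must be non-negative")
    if 15 <= distance_nt <= 18:
        return Outcome.OPTIMAL_SPACING
    if distance_nt > 27:
        return Outcome.STRUCTURE_ASSISTED if is_folded else Outcome.DEGRADED
    return Outcome.UNSPECIFIED


print(expected_outcome(16, is_folded=False).value)
print(expected_outcome(40, is_folded=True).value)
print(expected_outcome(40, is_folded=False).value)
```

Returning an explicit unspecified value for intermediate distances keeps the sketch from asserting more than the cartoon shows.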

ACKNOWLEDGMENTS

The authors thank members of the Spitale lab for their careful reading and critique of the manuscript.

REFERENCES

(1) Spitale, R. C., Flynn, R. A., Zhang, Q. C., Crisalli, P., Lee, B., Jung, J. W., Kuchelmeister, H. Y., Batista, P. J., Torre, E. A., Kool, E. T., and Chang, H. Y. (2015) Structural imprints in vivo decode RNA regulatory mechanisms. Nature 519, 486−490.
(2) Wu, X., and Bartel, D. P. (2017) Widespread Influence of 3′-End Structures on Mammalian mRNA Processing and Stability. Cell 169, 905−917.e911.
(3) Spies, N., Burge, C. B., and Bartel, D. P. (2013) 3′ UTR-isoform choice has limited influence on the stability and translational efficiency of most mRNAs in mouse fibroblasts. Genome Res. 23, 2078−2090.
(4) Homan, P. J., Favorov, O. V., Lavender, C. A., Kursun, O., Ge, X., Busan, S., Dokholyan, N. V., and Weeks, K. M. (2014) Single-molecule correlated chemical probing of RNA. Proc. Natl. Acad. Sci. U. S. A. 111, 13858−13863.
(5) Zubradt, M., Gupta, P., Persad, S., Lambowitz, A. M., Weissman, J. S., and Rouskin, S. (2017) DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods 14, 75−82.

ments in which the proposed stem−loop structure was deleted revealed higher activity, demonstrating that the effective distance is a determinant for RNA processing. Only destabilizing or elongating the stem−loop structure reduced the polyadenylation efficiency. A surprising result from this investigation is that the structures enriched near the PAS are predicted to fold no more stably (with similar energy values) than random sequences of the same length and nucleotide content. One explanation for this result is that the sequence and spatial context of the structured regions are what govern function. To test this, random structures were compared to the native form, and the results demonstrated that random structures (and sequences) could serve just as well to control poly(A) site usage, as long as they are folded. One way to reconcile this finding is to inspect more closely the distribution of structure. In the native RNAs, there are segments upstream of the PAS that are highly unstructured and downstream regions that are very structured. This distribution would even out the predicted free energy to the level of a random sequence. As such, the way in which the energy of the structures is distributed along the RNA sequence is critical.

What do these observations mean? The role that structure could play in bringing the PAS and the site of polyadenylation together still has to be mechanistically fleshed out. One likely role is the promotion of optimal binding and positioning of proteins that would promote decay. Poly(A) binding proteins (PABPs) recognize the primary sequences near the 3′-end to control decay and regulation of mRNA levels. The structured regions of the RNAs (of distal PASs) could control how PABPs bind to the 3′-end, either through controlling the orientation of different domains within PABPs or by helping with oligomerization of the PABPs on the 3′-end.
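The contrast between an unstructured region upstream of the PAS and a structured region downstream can be made concrete with a toy calculation. The sketch below, in Python, assumes a structure is already available in dot-bracket notation (for example from a folding program constrained by probing data). The sequence, the structure, and the function name `paired_fraction_ac` are invented for illustration and are not data from the study. Only A and C residues are counted, since those are the bases DMS reports on.

```python
from typing import Optional

PAS_HEXAMER = "AAUAAA"  # canonical polyadenylation signal


def paired_fraction_ac(seq: str, dot_bracket: str,
                       start: int, end: int) -> Optional[float]:
    """Fraction of A/C residues in seq[start:end] that are base paired.

    Returns None if the window contains no A or C residues.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    ac_positions = [i for i in range(start, end) if seq[i] in "AC"]
    if not ac_positions:
        return None
    paired = sum(1 for i in ac_positions if dot_bracket[i] != ".")
    return paired / len(ac_positions)


# Toy transcript (invented): open upstream region, PAS, then a short hairpin.
seq = "ACCAUCAC" + PAS_HEXAMER + "GGCACGAAAGUGCC"
structure = "........" + "......" + "(((((....)))))"

pas_start = seq.find(PAS_HEXAMER)
pas_end = pas_start + len(PAS_HEXAMER)

upstream = paired_fraction_ac(seq, structure, 0, pas_start)
downstream = paired_fraction_ac(seq, structure, pas_end, len(seq))
print(f"upstream paired A/C fraction:   {upstream}")
print(f"downstream paired A/C fraction: {downstream}")
```

Reporting the two windows separately, rather than one value for the whole 3′-end, is the point of the sketch: a single averaged score could hide exactly the uneven distribution of structure described above.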
Such possibilities could be worked out with a combination of structure probing,
