Rational design of mini-Cas9 for transcriptional activation - ACS

Mar 21, 2018 - Nuclease dead Cas9 (dCas9) has been widely used for modulating gene expression by fusing with different activation or repression domain...

Letter

Rational design of mini-Cas9 for transcriptional activation Dacheng Ma, Shuguang Peng, Weiren Huang, Zhiming Cai, and Zhen Xie ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.7b00404 • Publication Date (Web): 21 Mar 2018 Downloaded from http://pubs.acs.org on March 22, 2018


ACS Synthetic Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


ACS Synthetic Biology

Rational design of mini-Cas9 for transcriptional activation

Dacheng Ma1, Shuguang Peng1, Weiren Huang2, Zhiming Cai2 and Zhen Xie1*

1 MOE Key Laboratory of Bioinformatics and Bioinformatics Division, Center for Synthetic and System Biology, Department of Automation, Tsinghua National Lab for Information Science and Technology, Tsinghua University, Beijing 100084, China

2 State Engineering Laboratory of Medical Key Technologies Application of Synthetic Biology, Shenzhen Second People’s Hospital, the First Affiliated Hospital of Shenzhen University, Shenzhen, China

* To whom correspondence should be addressed: [email protected] (Z.X.).

ABSTRACT

Nuclease-dead Cas9 (dCas9) has been widely used for modulating gene expression by fusing it with different activation or repression domains. However, delivery of the CRISPR/Cas system fused with various effector domains in a single adeno-associated virus (AAV) remains challenging due to the payload limit. Here, we engineered a set of downsized variants of Cas9, including Staphylococcus aureus Cas9 (SaCas9), that retained DNA binding activity, by deleting conserved functional domains. We demonstrated that fusing the FokI nuclease domain to the N-terminus of the minimal SaCas9 (mini-SaCas9) or to the middle of the split mini-SaCas9 can trigger efficient DNA cleavage. In addition, we constructed a set of compact transactivation domains based on the tripartite VPR activation domain and self-assembled arrays of split SpyTag:SpyCatcher peptides, which are suitable for fusing to the mini-SaCas9. Lastly, we produced a single AAV containing the mini-SaCas9 fused with a downsized transactivation domain along with an optimized gRNA expression cassette, which showed efficient transactivation activity. Our results highlight a practical approach to generate downsized CRISPR/Cas9 and gene activation systems for in vivo applications.

KEYWORDS: CRISPR/Cas9, gene activation, AAV, VPR

1 ACS Paragon Plus Environment


INTRODUCTION

The CRISPR/Cas9 system enables programmable and precise genetic manipulations, providing viable opportunities for personalized therapies (1–4). By mutating the active residues required for nuclease function, the Cas9 protein can be engineered into a nuclease-null or ‘dead’ Cas9 variant (dCas9) that retains the DNA binding ability but lacks detectable nuclease activity (5). dCas9 can be fused to different domains for efficient transactivation, such as four tandem copies of herpes simplex viral protein 16 (VP64), the repeating SunTag peptide array and the tripartite activator VP64-p65-Rta (VPR) (6–8). dCas9-based transactivators have been used to activate the expression of endogenous genes to ameliorate disease phenotypes and induce cell differentiation (9,10).

Although the AAV vector has a limited payload capacity, with packaging becoming inefficient above 5 kb, it is still an attractive delivery vehicle for the CRISPR/Cas9 system because of its low pathogenic risk, reduced immunogenicity and wide range of tissue tropism (11–13). To bypass the AAV payload limit, the 4.2-kb Cas9 from Streptococcus pyogenes (SpCas9) has been split and packaged into two separate AAVs along with the guide RNA (gRNA) expression unit, which allows functional reconstitution of full-length SpCas9 in vivo (13,14). Nevertheless, this dual-AAV system may reduce the delivery efficiency. Another strategy is to search for natural class 2 CRISPR effectors of smaller size, such as the 3.2-kb SaCas9, ~3-kb CasX and CjCas9 identified in uncultivated organisms by using metagenomic datasets (15–19). To further reduce the transgene size, the ~70-bp glutamine tRNA can be used to replace the ~250-bp RNA polymerase III promoter to drive expression of a tRNA:gRNA fusion transcript that is cleaved by endogenous tRNase Z to produce the active gRNA (20). These efforts facilitate the construction of an all-in-one AAV delivery vector for in vivo applications of the CRISPR/Cas technology (15,17,21).

Recent structural studies of SpCas9, SaCas9 and AsCpf1 have elucidated the functions of conserved domains among these class 2 CRISPR effectors, including a recognition (REC) domain, a protospacer adjacent motif (PAM)-interacting (PI) domain, and nuclease domains (22–25). The HNH/NUC and RuvC nuclease domains cleave the complementary and the non-complementary DNA strands, respectively (22–25). Interestingly, a truncated SpCas9 mutant without the HNH domain displays nearly intact DNA binding activity, while a truncated SpCas9 mutant without the REC2 domain retains half of the wild-type cleavage activity (23,24). To improve genome editing specificity, the FokI nuclease domain has been fused to the N-terminus of dCas9 in the PAM-out orientation; it dimerizes under the guidance of two gRNAs and causes DNA cleavage (26). However, the length of these dCas9 fusion genes usually impedes direct loading into a single AAV vector because of the restrictive cargo size.

In this study, we constructed a set of downsized variants of SaCas9 by deleting conserved functional domains, which retained the DNA binding activity but displayed no DNA cleavage activity. We also constructed a set of compact transactivation domains based on the tripartite VPR activation domain and self-assembled arrays of split SpyTag:SpyCatcher peptides, which allowed us to produce a single AAV containing the mini-SaCas9 fused with a downsized transactivation domain along with an optimized gRNA expression cassette.
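The payload argument in this section is simple arithmetic, and a minimal sketch may make it concrete. The part sizes below are the approximate values quoted in the text; the function names are illustrative only, and other cassette elements (protein promoter, polyA signal, ITRs, effector domains) are deliberately omitted because their sizes are not given here, so the numbers are upper bounds on the space left over.

```python
# Approximate sizes quoted in the text, in base pairs.
AAV_LIMIT_BP = 5000  # packaging is described as inefficient above ~5 kb

PART_SIZES_BP = {
    "SpCas9": 4200,
    "SaCas9": 3200,
    "pol_III_promoter": 250,  # ~250-bp RNA polymerase III promoter
    "gln_tRNA": 70,           # ~70-bp glutamine tRNA
}


def cassette_size(parts, sizes=PART_SIZES_BP):
    """Return the summed size (bp) of the named parts."""
    return sum(sizes[name] for name in parts)


def remaining_capacity(parts, limit=AAV_LIMIT_BP, sizes=PART_SIZES_BP):
    """Return how many bp are left under the packaging limit (negative if over)."""
    return limit - cassette_size(parts, sizes)


# Space left for everything else in the vector under each design choice.
sp_with_promoter = remaining_capacity(["SpCas9", "pol_III_promoter"])  # 550
sa_with_promoter = remaining_capacity(["SaCas9", "pol_III_promoter"])  # 1550
sa_with_trna = remaining_capacity(["SaCas9", "gln_tRNA"])              # 1730
```

Under these figures, swapping SpCas9 for SaCas9 frees about 1 kb, and replacing the polymerase III promoter with the tRNA frees roughly a further 180 bp.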

RESULTS AND DISCUSSION

We first constructed two mini-dSpCas9 genes by deleting either the C-terminal region of the REC1 domain (REC-C, Δ501-710) or the HNH domain (Δ777-891), both of which may be dispensable for the DNA binding activity of dCas9, and fused each to the VPR transactivation domain. By using our previously established reporter system in cultured human embryonic kidney 293 (HEK293) cells (Figure 1A) (27), we demonstrated that the two mini-dSpCas9:VPR variants retained more than 50% of the transactivation capacity of dSpCas9:VPR (Figure 1B and Supplementary Figure 2A).

Recently, the wild-type SaCas9 has been engineered to recognize an altered PAM (“NNNRRT”) (28). It has been shown that introducing an A-U flip or a U-G conversion to disrupt the putative RNA Pol III terminator sequence in the first stem loop of the gRNA scaffold enhances the efficiency of dCas9-mediated DNA labeling (29,30). We demonstrated that, when both point mutations were introduced into the putative RNA Pol III terminator sequence, the transactivation efficiency of dSaCas9:VPR at the endogenous IL1RN gene with the optimized guide RNA-2 was ~8-fold higher than with the wild-type guide RNA (Supplementary Figure 1A). Based on this mutant SaCas9 retargeted by the optimized guide RNA, we constructed mini-SaCas9-1:VPR by replacing the conserved REC-C domain (Δ234-444) with a “GGGGSGGGG” linker (GS-linker), which retained only ~21% of the transactivation activity of dSaCas9:VPR (Figure 1C and Supplementary Figure 2B). Although the GS-linker is widely used as a flexible linker, it may still distort the SaCas9 structure. Inspired by a recent computational protocol called SEWING (31), we developed an adjacent residue searching (ARS) protocol to search for existing structures that can bridge discontinuous Cas9 fragments (details in Methods). Replacing the GS-linker with a “KRRRRHR” linker (R-linker) from the SaCas9 BH domain, which the ARS protocol identified as appropriately filling the REC-C deletion gap, resulted in a 4-fold increase in the transactivation capacity of mini-SaCas9-2 (Figure 1C and Supplementary Figure 2B). Interestingly, mini-SaCas9-3, which was generated by deleting the REC-C domain without any linker, displayed a transactivation efficiency similar to that of mini-SaCas9-2 (Figure 1C and Supplementary Figure 2B). By using the ARS protocol, we also found that a “GSK” linker, derived from a putative gene (accession number O67859) in Aquifex aeolicus, fit the deletion gap of the HNH domain (Δ479-649). The resulting mini-SaCas9-4 exerted a stronger transactivation efficiency than dSaCas9:VPR (Figure 1C and Supplementary Figure 2B).
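The deletion constructs described above amount to removing a residue range and optionally bridging the gap with a short linker. The following is a minimal sketch of that bookkeeping, not the cloning procedure used in this work. The helper name is hypothetical, and the sequence is a placeholder of 1053 residues (the commonly cited length of SaCas9, which the text does not state), so only the resulting lengths are meaningful.

```python
def delete_and_bridge(protein, start, end, linker=""):
    """Delete residues start..end (1-based, inclusive) and put `linker` in the gap."""
    if not (1 <= start <= end <= len(protein)):
        raise ValueError("deletion range lies outside the protein")
    return protein[:start - 1] + linker + protein[end:]


# Placeholder sequence only: 1053 residues is an assumed length for SaCas9.
PLACEHOLDER_SACAS9 = "M" + "A" * 1052

# REC-C deletion (residues 234-444) with the three bridging choices in the text.
mini_1 = delete_and_bridge(PLACEHOLDER_SACAS9, 234, 444, "GGGGSGGGG")  # GS-linker
mini_2 = delete_and_bridge(PLACEHOLDER_SACAS9, 234, 444, "KRRRRHR")    # R-linker
mini_3 = delete_and_bridge(PLACEHOLDER_SACAS9, 234, 444)               # no linker

# HNH deletion (residues 479-649) bridged with the "GSK" linker.
mini_4 = delete_and_bridge(PLACEHOLDER_SACAS9, 479, 649, "GSK")

lengths = {name: len(seq) for name, seq in
           [("mini_1", mini_1), ("mini_2", mini_2),
            ("mini_3", mini_3), ("mini_4", mini_4)]}
```

When two deletions are combined in one construct, applying the one closer to the C-terminus first keeps the residue numbering of the other deletion valid.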
However, the 2-kb mini-SaCas9-5, generated by deleting both the REC-C domain and the HNH domain, retained only 41% of the transactivation activity of the wild-type control (Figure 1C and Supplementary Figure 2B). Interestingly, the transactivation efficiency of mini-SaCas9-5 was ~30-fold higher with the optimized guide RNA than with the wild-type guide RNA (Supplementary Figure 3A). In addition, we observed ~50% transactivation activity when we used the glutamine tRNA instead of the U6 promoter to drive gRNA expression (Supplementary Figure 3B).

To evaluate the DNA cleavage efficiency of the mini-SaCas9 variants, we used a reporter reconstitution assay as described in our previous study (27), in which DNA cleavage triggers the reconstitution of an active enhanced yellow fluorescent protein (EYFP) reporter gene from an inactive form (Supplementary Figure 2C). We demonstrated that deleting either the REC-C domain or both the REC-C and HNH domains resulted in background-level EYFP expression, suggesting that the domain deletions abolished the DNA cleavage activity of the mini-SaCas9 variants (Supplementary Figure 2D). In addition, we showed that deleting the REC2 domain (Δ324-525) retained 46% of the transactivation capacity of AsCpf1:VPR but abolished the DNA cleavage activity, suggesting that this deletion strategy is applicable to distinct class 2 CRISPR effectors (Figure 1D and Supplementary Figure 2E-G).

To assay whether the compact SaCas9 variants can efficiently activate the expression of an endogenous gene, we retargeted the mini-SaCas9:VPR variants and dSaCas9:VPR to the IL1RN promoter by coexpressing the optimized gRNA. The result showed that the IL1RN mRNA level was increased at least 1000-fold by coexpressing the optimized guide RNA-2 and dSaCas9:VPR (Supplementary Figure 4). All compact SaCas9:VPR variants coexpressed with the optimized guide RNA-2 activated IL1RN gene expression, and mini-SaCas9-4:VPR generated an IL1RN activation level comparable to that of dSaCas9:VPR (Supplementary Figure 4).

As shown in Figure 2A, we used a similar EYFP reconstitution assay to evaluate the DNA cleavage efficiency of dSaCas9 derivatives fused with FokI, along with two truncated gRNAs, each containing an 18-nt sequence complementary to the target (27). We demonstrated that FokI:mini-dSaCas9-4, made by fusing the FokI domain to the N-terminus of mini-dSaCas9-4, displayed DNA cleavage activity similar to that of the FokI:dSaCas9 control, with a spacer between the two gRNA complementary regions ranging from 12 bp to 24 bp in the PAM-out orientation (Figure 2B-C). We then searched for another appropriate FokI insertion position in mini-dSaCas9-4. However, the predicted distance between the N-terminus and the C-terminus of the FokI nuclease domain was 35 Å (Supplementary Figure 5), which makes it challenging to find an appropriate insertion position in the middle of dSaCas9. As an alternative, we split mini-dSaCas9-4 at residue 733 and fused FokI after the split point with a triplicate G-linker. We further removed the first four residues of the C-terminal fragment (Δ734-737), which might interfere with the reconstitution of the two split fragments. The split dSaCas9 or split mini-dSaCas9-4 without the HNH domain resulted in ~3% to 33% of the EYFP expression level induced by the wild-type SaCas9, which was lower than the ~20% to 62% achieved by FokI:dSaCas9 or FokI:mini-dSaCas9-4, with a spacer ranging from 12 bp to 24 bp in the PAM-out orientation (Figure 2B and 2C). In addition, we evaluated the DNA cleavage activity of these FokI fusions when targeting the human CCR5 gene in HEK293 cells by using the T7 Endonuclease I (T7E1) assay (32). The results showed that all fusions of FokI and dSaCas9, with or without the HNH domain, triggered DNA cleavage at the CCR5 gene, although deleting the HNH domain reduced the DNA cleavage efficiency (Figure 2D).

Next, we sought to engineer compact transcription activators based on dCas9-VPR (7). The entire p65 protein contains a DNA binding domain at the N-terminus and two transactivation domains (TA1 and TA2) at the C-terminus (33). However, only TA2 and part of TA1 are included in the tripartite VPR domain (7). To reduce the size of dCas9-VPR, we constructed mini-SaCas9-5:VTR1 by replacing the p65 domain in VPR with the TA1 and TA2 domains (Figure 3A). We further substituted the p65 domain in VPR with two repeats of the TA1 domain, termed VTR2 (Figure 3A). We demonstrated that the VTR1 and VTR2 domains retained 47% and 55% of the transactivation efficiency of the VPR domain, respectively (Figure 3A). We further removed part of the Rta domain, resulting in a 0.8-kb transactivation domain (VTR3), which retained 53% of the transactivation efficiency of the VPR domain (Figure 3A). We also found that mini-SaCas9-5:VTR3 paired with the optimized gRNA showed 80% of the transactivation efficiency of the split SaCas9:VPR paired with the original gRNA (Supplementary Figure 6A and 6B). In addition, we compared the transactivation efficiency of the VTRs with that of the VP64 domain alone fused to mini-SaCas9-5. The result showed that fusing the additional transactivation domains behind the VP64 domain enhanced the activation efficiency (Figure 3A).

Recently, a 13-residue SpyTag derived from FbaB has been shown to form a covalent bond with its 116-residue binding partner, called SpyCatcher (Supplementary Figure 6C) (34). To construct a repeating peptide array smaller than the SunTag system (8), we fused four tandem repeats of SpyTag to the C-terminus of mini-SaCas9-5 and fused SpyCatcher to the VPR domain, allowing spontaneous assembly of a VPR transactivation scaffold in cells (Figure 3B). We showed that the SpyTag system induced the expression of the enhanced blue fluorescent protein 2 (EBFP2) reporter gene by 100-fold compared to the negative control (Figure 3B). In a search for homologues of SpyTag and SpyCatcher, we found that a putative protein (accession No. WP_054278706) from Streptococcus phocae shared 60% sequence similarity with FbaB (Supplementary Figure 6C). We hypothesized that this protein can be split in the same way as SpyTag and SpyCatcher. We developed a similar scaffold system, called the MoonTag system, by fusing four tandem repeats of the 13-residue MoonTag to mini-SaCas9-5 and fusing the MoonCatcher to the VPR domain (Figure 3B). Although the MoonTag system was not orthogonal to the SpyTag system, it was 5-fold more efficient in activating EBFP2 expression (Figure 3B).

The AAV payload size limit can be overcome by splitting Cas9 into an N-terminal and a C-terminal fragment that are expressed separately from two AAV vectors. We previously demonstrated that intein-mediated split Cas9 reconstitution displayed enhanced activity compared with counterparts without intein fusion (27). Next, we sought to compare the efficiency of the previous double-AAV-vector system, which coexpresses split Cas9 fragments, with that of the all-in-one AAV system using the compact Cas9 variants. First, we split dSaCas9 at residue 739 and fused the N- and C-terminal fragments with intein-N and intein-C, respectively. The dSaCas9N:Intein-N driven by the constitutive CMV promoter and the U6-driven original gRNA targeting the TRE promoter were loaded into one AAV vector (AAV-splitN), while the CMV-driven dSaCas9C:VPR was loaded into the second AAV vector (AAV-splitC) (Figure 4A). In the all-in-one AAV system, a single AAV (AAV-single-1) encoded a constitutively expressed mini-SaCas9-5:VTR1 and an optimized gRNA recognizing the TRE promoter (Figure 4A). We introduced a TRE-driven EBFP2 reporter gene into HEK293 cells by transient transfection (Figure 4A), followed by infection with the indicated AAVs. An AAV containing mini-SaCas9-5:VTR1 and the optimized gRNA-2, which recognizes an off-target site instead of the TRE promoter, was added as the negative control. After four days, we observed that ~40% of transfected cells were activated by AAV-single-1. Different stoichiometric ratios of the double-AAV system were tested, and the 70:30 ratio yielded a maximum cell activation of ~30% (Figure 4B and 4C). However, the overall EBFP2 level induced by AAV-single-1 was about 3-fold lower than that obtained by using AAV-splitN and AAV-splitC at a 90:10 ratio (Supplementary Figure 7A), likely because the transactivation activity of the VTR1 domain was lower than that of the VPR domain (Figure 3). We also made three all-in-one AAV variants (AAV-single-2, -3 and -4) by replacing the VTR1 domain with either VTR2 or VTR3, using either the CMV or EFS promoter (Supplementary Figure 7B and 7C). However, the fraction of activated cells decreased, suggesting that AAV-single-1 is more suitable for transactivation with a high infection efficiency.
In this study, we engineered a set of compact Cas9 derivatives that retained efficient DNA binding activity but no DNA cleavage activity by deleting the conserved HNH and/or REC-C domains based on structural information (Figure 1). Recent single-molecule studies showed that both the HNH nuclease domain and the REC-C domain control the conformation of the RuvC nuclease domain, which may explain why this set of compact Cas9 derivatives has no DNA cleavage activity (23,35). In addition, we provided a novel strategy to engineer a dimeric gRNA-guided nuclease by splitting mini-dSaCas9 and fusing the FokI domain right after the split point (Figure 2). We also observed that the DNA cleavage efficiency of each FokI-Cas9 fusion varied with the spacer distance and when targeting an endogenous gene (Figure 2). Further experiments are needed to quantify these differences and to optimize the DNA cleavage efficiency in different situations.

The VPR domain was made by fusing p65, Rta and VP64. However, Rta and p65 contain DNA binding and transcription activation domains that can initiate promiscuous genome interactions and gene activation (33,36). We developed a series of compact transactivation domains (VTR1, VTR2 and VTR3) by deleting the DNA binding domains in Rta and p65 (Figure 3), which may allow more specific transactivation than the VPR domain.

Our study highlighted a practical approach to load all essential elements of the CRISPR/Cas transactivation system into one AAV (Figure 4). Due to the AAV cargo restriction, only one gRNA was used in the all-in-one system. For endogenous gene activation, multiple gRNAs targeting the same promoter region can induce higher gene expression. New strategies are needed for expressing multiple gRNAs in the all-in-one system, such as tRNA-driven and self-splicing systems. In summary, our downsized CRISPR/Cas9 and gene activation systems will be particularly appealing in biomedical applications that require safe and efficient delivery in vivo.

MATERIALS AND METHODS

Reagents and enzymes

Restriction endonucleases, polynucleotide kinase (PNK), T4 DNA ligase, Quick DNA ligase, and Q5 High-Fidelity DNA Polymerase were purchased from New England Biolabs. Oligonucleotides were synthesized by Genewiz and Sangon Biotech.

Plasmid DNA constructs


The gRNA sequences are listed in Supplementary Table 1. The protein sequences of the Cas9 derivatives and domains used in this study are summarized in Supplementary Table 2.

Cell culture and transfection

The HEK293 cell line was purchased from Life Technologies. HEK293 cells were cultured in high-glucose DMEM complete medium (Dulbecco’s modified Eagle’s medium (DMEM), 4.5 g/L glucose, 0.045 unit/mL of penicillin, 0.045 g/mL streptomycin, and 10% FBS (Life Technologies)) at 37 °C, 100% humidity, and 5% CO₂. One day before transfection, ~1.2 × 10⁵ HEK293 cells in 0.5 mL of high-glucose DMEM complete medium were seeded into each well of 24-well plastic plates (Falcon). Shortly before transfection, the medium was replaced with fresh DMEM complete medium. Transfections were performed with Attractene transfection reagent (Qiagen) following the manufacturer’s protocol. The amounts of plasmid DNA used in the transfection experiments are listed in Supplementary Table 3. Cells were cultured for 2 days before flow cytometry and imaging analysis.

AAV packaging, purification and infection

For AAV production, ~1.2 × 10⁶ HEK293 cells were seeded into a 10 cm petri dish. The pAAV-DJ serotype packaging DNA construct, pHelper construct and pAAV construct were purchased from Cell Biolabs. Cells were cotransfected with 3 µg of pAAV-DJ, 3 µg of pHelper and 3 µg of pAAV plasmid DNA carrying the gene of interest by using Attractene transfection reagent (Qiagen). After 3 days, all media including trypsinized cells were harvested in a 50 mL sterile tube. About 0.1 volume of chloroform was added to the tube, which was then shaken at 250 rpm at 37 °C for 1 h. NaCl was then added to the mixture to a final concentration of 1 M. After centrifugation at 12000 rpm for 15 min, the upper aqueous phase was transferred to a new 50 mL sterile tube and mixed with PEG-8000 (10% final w/v), followed by incubation on ice for 1 h. The mixture was centrifuged at 11000 rpm for 15 min. The pellet was resuspended in 1.2 mL of D-PBS (with Ca²⁺ and Mg²⁺) purchased from Beyotime. DNase and RNase (Solarbio) were each added to a final concentration of 1 µg/mL, and the mixture was incubated for 30 min at room temperature. An equal volume of chloroform was added to the mixture, followed by centrifugation at 12000 rpm for 5 min. The upper aqueous phase was collected for the subsequent AAV infection experiments.

To extract the viral genome, 1 µL of purified AAV was incubated with 1 U of DNase I (NEB) in a 50-µL reaction containing 25 mM Tris-HCl (pH 7.4) and 10 mM MgCl₂ at 37 °C for 30 min, and then incubated at 75 °C for 10 min. The reaction was then treated with 200 µg of proteinase K in the presence of 5 mM Tris-HCl (pH 8.0), 10 mM Na₂EDTA and 20 mM NaCl at 37 °C for 1 h. A 10-fold serial dilution of the corresponding plasmid DNA used for AAV packaging was made between 0.2 ng/µL and 0.02 pg/µL for virus titer measurement. The AAV titer was determined by quantitative PCR using 2× EvaGreen Master Mix (Syngentech).

HEK293 cells were seeded in 24-well plates at a density of 1 × 10⁵ cells/well. After 1 day, cells were transfected with the plasmid DNAs that encode the EBFP2 reporter gene and the mKate2 gene, which served as an internal control (Supplementary Table 3). Purified AAV was added to each well 1 day after transfection. Cells were cultured for an additional 4 days before flow cytometry and imaging analysis.
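The titer measurement described above relies on a plasmid standard curve. The sketch below shows one common way such a curve might be used: fit Ct against log10 of the standard concentration, then invert the fit for an unknown sample. The 10-fold series from 0.2 ng/µL to 0.02 pg/µL comes from the text; the Ct values and function names are hypothetical, and the actual analysis used in this work is not described.

```python
import math


def dilution_series(top, fold, n_points):
    """Return n_points concentrations starting at `top`, each `fold` times lower."""
    return [top / fold ** i for i in range(n_points)]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve: Ct = slope * log10(conc) + intercept."""
    return 10 ** ((ct - intercept) / slope)


# Standards in pg/uL: 200, 20, 2, 0.2, 0.02 (0.2 ng/uL down to 0.02 pg/uL).
standards = dilution_series(200.0, 10, 5)
# Hypothetical Ct readings for the standards, for illustration only.
standard_cts = [12.0, 15.3, 18.6, 21.9, 25.2]

slope, intercept = fit_line([math.log10(c) for c in standards], standard_cts)
unknown = concentration_from_ct(18.6, slope, intercept)  # in pg/uL
```

Converting the interpolated mass concentration into genome copies would additionally require the plasmid length, which is not given in the text.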

Flow cytometry

Cells were trypsinized 48 h after transfection and centrifuged at 300 g for 7 min at 4 °C. The supernatant was removed, and the cells were resuspended in 1× PBS without calcium or magnesium. A Fortessa flow analyzer (BD Biosciences) was used for fluorescence-activated cell sorting (FACS) analysis with the following settings. EBFP2 was measured using a 405 nm laser and a 450/50 filter with a photomultiplier tube (PMT) set at 275 V. EYFP was measured with a 488 nm laser and a 530/30 filter using a PMT set at 270 V. mKate2 was measured with a 561 nm laser and a 670/30 filter using a PMT set at 350 V. iRFP was measured with a 640 nm laser and a 780/60 filter using a PMT set at 480 V. For each sample, ~1 × 10⁴ to ~5 × 10⁴ cell events were collected. The relative fluorescence intensity of EBFP2 is defined as the average fluorescence intensity of EBFP2 divided by the average fluorescence intensity of the internal control (such as mKate2). "Fold change (EBFP2)" is defined as the average relative fluorescence intensity of EBFP2 of the samples divided by that of the control samples. "%EBFP2+" is calculated as the fraction of EBFP2-positive cells divided by the fraction of mKate2-positive cells to normalize for differences in transfection.
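The three reporter metrics defined above can be written down directly. The following is a minimal sketch operating on per-cell intensity lists; the function names and the gate thresholds are assumptions for illustration, since the gating strategy is not described in the text.

```python
from statistics import mean


def relative_intensity(reporter, control):
    """Mean reporter intensity divided by mean internal-control intensity."""
    return mean(reporter) / mean(control)


def fold_change(sample_relative, control_relative):
    """Mean relative intensity of samples divided by that of control samples."""
    return mean(sample_relative) / mean(control_relative)


def percent_positive(reporter, control, reporter_gate, control_gate):
    """Fraction of reporter-positive cells normalized to control-positive cells.

    A cell counts as positive when its intensity exceeds the given gate;
    both gates are hypothetical thresholds chosen by the analyst.
    """
    reporter_fraction = sum(v > reporter_gate for v in reporter) / len(reporter)
    control_fraction = sum(v > control_gate for v in control) / len(control)
    return 100 * reporter_fraction / control_fraction


# Toy per-cell intensities (arbitrary units), for illustration only.
ebfp2 = [10, 20, 30, 40]
mkate2 = [100, 100, 100, 5]

rel = relative_intensity(ebfp2, [100, 100, 100, 100])
pct = percent_positive(ebfp2, mkate2, reporter_gate=25, control_gate=50)
```

Dividing by the mKate2-positive fraction expresses activation relative to the cells that actually received plasmid, which is the normalization the definition of "%EBFP2+" calls for.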

Fluorescence microscopy

Approximately 48 h after transfection, fluorescent images of cultured HEK293 cells were captured by using a Leica DMi8 microscope with a 10× objective lens. EBFP2 fluorescence was observed by using a DAPI filter cube with excitation at 350/50 nm, dichroic at 400 nm and emission at 460/50 nm. mKate2 fluorescence was observed with the TXR filter cube with excitation at 560/40 nm, dichroic at 585 nm and emission at 630/75 nm. Image acquisition and post-acquisition analysis were performed using the LASX software suite (Leica).

Adjacent residue searching

The BLASTP program was used to search the NCBI PDB database for suitable linker sequences to close the domain deletion gap between the N-terminal and C-terminal anchors (37). The query sequences were composed of the last three amino acid residues upstream of the N-terminal anchor, a run of degenerate residues of an estimated length, and the first three amino acid residues downstream of the C-terminal anchor. A candidate linker sequence was selected if the following three criteria were satisfied: 1) its N-terminal region shared high structural similarity with the N-terminal anchor region; 2) its C-terminal region shared high structural similarity with the C-terminal anchor region; 3) superposition of the candidate linker sequence into the deletion gap did not create any steric clashes.
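The query composition described above can be illustrated with a short sketch. It reads the two three-residue anchors as the residues immediately flanking the deleted range, which is one interpretation of the text, and it encodes the degenerate positions as "X"; both choices, the function names and the toy sequence are assumptions. The structural criteria (similarity to the anchors and absence of steric clashes) are not modeled here.

```python
def ars_query(protein, del_start, del_end, gap_len, flank=3, wildcard="X"):
    """Build an ARS-style query around a deletion (1-based, inclusive range).

    The query is: `flank` residues before the deletion, `gap_len` wildcard
    positions for the unknown linker, then `flank` residues after the deletion.
    """
    if del_start - 1 < flank or del_end + flank > len(protein):
        raise ValueError("not enough flanking residues around the deletion")
    n_flank = protein[del_start - 1 - flank:del_start - 1]
    c_flank = protein[del_end:del_end + flank]
    return n_flank + wildcard * gap_len + c_flank


def matches_query(candidate, query, wildcard="X"):
    """True if candidate agrees with the query at every non-wildcard position."""
    return len(candidate) == len(query) and all(
        q == wildcard or q == c for q, c in zip(query, candidate)
    )


# Toy 20-residue sequence; residues 8-12 ("IKLMN") are deleted.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
query = ars_query(toy_protein, del_start=8, del_end=12, gap_len=3)
```

In this toy case the query is "FGHXXXPQR", so a fragment such as "FGHKRRPQR" would pass the sequence-level filter and could then be checked against the structural criteria.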

RNA purification and quantitative PCR

Total RNA from HEK293 cells was extracted with Trizol reagent (Life Tech). 500 ng of RNA was reverse transcribed with ReverTra Ace qPCR RT Master Mix with gDNA Remover Kit (TOYOBO), and 1 µL of cDNA was used for each qPCR reaction, using the 2× EvaGreen Master Mix (Syngentech). qPCR primers are listed in Supplementary Table 1. qRT-PCR was run and analyzed on the LightCycler 480 II (Roche), with all target gene expression levels normalized to β-actin mRNA levels.
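The text states only that target gene expression was normalized to β-actin. As an illustration, the sketch below assumes the widely used 2^(-ΔΔCt) calculation, which may differ from the analysis actually performed; the function name and the Ct values in the example are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt is the target Ct minus the reference-gene Ct (here, beta-actin);
    ddCt is the sample dCt minus the control dCt.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 10 cycles earlier in
# the activated sample, with the reference gene unchanged.
example_fold = fold_change_ddct(20, 15, 30, 15)
```

With these made-up values the result is 2^10 = 1024, showing that a shift of about 10 cycles corresponds to a roughly 1000-fold change when amplification efficiency is assumed to be perfect.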

T7E1 DNA cleavage assay

Genomic DNA was harvested 72 h post-transfection using a genomic DNA extraction kit (TianGen) according to the manufacturer’s protocol. PCR was performed to amplify the fragment of the CCR5 gene by using two primers (5’-CGTGTCACAAGCCCACAGATATTT-3’ and 5’-GCACAGGGTGGAACAAGATGG-3’). All reactions were performed with a high-fidelity DNA polymerase (TsingKe), and the resulting products were purified by using the Gel Extraction Kit (GenStar). T7E1 digestion was then performed in NEB Buffer 2 according to the manufacturer’s instructions. The electrophoresis image of the cleavage products was analyzed by using the ImageJ software.
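The text does not say how band intensities from ImageJ were converted into a cleavage efficiency. A formula commonly used for mismatch-cleavage assays is sketched below as an assumption: with f_cut the cleaved fraction of total band intensity, the estimated modification frequency is 100 × (1 − sqrt(1 − f_cut)). The function name and example intensities are hypothetical.

```python
import math


def indel_percent(uncut, cut_1, cut_2):
    """Estimate % modification from T7E1 band intensities.

    `uncut` is the intensity of the full-length band; `cut_1` and `cut_2`
    are the intensities of the two cleavage products.
    """
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    cut_fraction = (cut_1 + cut_2) / total
    return 100 * (1 - math.sqrt(1 - cut_fraction))


# Hypothetical band intensities in arbitrary units.
example_indel = indel_percent(uncut=64, cut_1=18, cut_2=18)
```

The square root accounts for the fact that cleavage occurs at heteroduplexes formed by random reannealing, so the cleaved fraction is not a linear readout of the mutant allele frequency.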

SUPPORTING INFORMATION
The Supporting Information is available free of charge on the ACS Publications website at DOI: XXX.
Supplementary Figures 1-7 and Supplementary Tables 1-3.

AUTHOR INFORMATION

Corresponding Author
Zhen Xie. Email: [email protected]

ORCID
Zhen Xie: 0000-0001-8798-9592

Author Contributions
Z.X. and D.M. conceived of the ideas implemented in this work. D.M. performed experiments. Z.X., D.M., S.P., W.H. and Z.C. analysed the data. Z.X. supervised the project. Z.X., D.M. and S.P. wrote the paper.

ACKNOWLEDGEMENTS
This research was supported by the National Key Basic Research Program of China (2014CB745200), the National Natural Science Foundation of China (31771483, 81772737), the Shenzhen Municipal Government of China (JCYJ20170413161749433) and the Basic Research Program of Tsinghua National Lab for Information Science and Technology. We thank members of the Xie Lab for helpful discussions, Fei Sun and Ting Zhu for insightful discussions, and Yuxi Ke for proofreading our manuscript.

CONFLICT OF INTEREST
Z.X., D.M. and S.P. have filed a patent application with the State Intellectual Property Office of China based on the findings in this work. The remaining authors declare no competing financial interests.


Figure legends

Figure 1. Rational Design of the Compact CRISPR/Cas9 System. (A) Diagram of the EBFP2 transcription activation assay for the compact Cas9 derivatives fused with the VPR domain. The constitutively expressed mKate2 was used as a transfection control. (B) Diagram of dSpCas9, mini-dSpCas9-1 and mini-dSpCas9-2 domain organization (left) and their corresponding gene activation efficiency (right). (C) Domain organization of dSaCas9 and SaCas9 derivatives (left) and their corresponding gene activation efficiency (right). (D) Diagram of dAsCpf1 and mini-AsCpf1-1 domain organization (left) and the corresponding gene activation efficiency (right). (B-D) Data are shown as the mean ± SEM fold change of EBFP2 fluorescence from three independent replicates, measured by flow cytometry 48 h after transfection into HEK293 cells.

Figure 2. Effect of the Compact SaCas9 Derivatives on DNA Cleavage. (A) Diagram of the EYFP reconstitution assay to evaluate the DNA cleavage efficiency. (B) Domain organization of dSaCas9 and mini-dSaCas9-4 fused with the FokI domain at the N terminus is shown in the upper panel. Domain organization of split dSaCas9 or split mini-dSaCas9-4 fused with the FokI domain in the middle is shown in the lower panel. (C) DNA cleavage efficiency of FokI:dSaCas9, FokI:mini-dSaCas9-4, and the split fusions of FokI and dSaCas9 with or without the HNH domain, with a spacer length ranging from 12 bp to 24 bp. Each bar shows the mean fold change (mean ± SEM; n = 3) of EYFP fluorescence measured by flow cytometry 48 h after transfection into HEK293 cells. (D) The DNA cleavage assay targeting the human CCR5 gene by SaCas9, SaCas9-Nick, FokI:dSaCas9, FokI:mini-dSaCas9-4, and the split fusions of FokI and dSaCas9 with or without the HNH domain.

Figure 3. Construction of the Compact VTR and MoonTag Systems for Transcription Activation. (A) Schematic representation of the VPR, VTR1, VTR2, VTR3 and VP64 transcription activation domains and their corresponding gene activation efficiency, evaluated by using the EBFP2 reporter system shown in Figure 1A. An unpaired t-test was performed for comparison between the indicated samples. “****”, p