Straightforward Delivery of Linearized Double-stranded DNA


Letter

Straightforward Delivery of Linearized Double-stranded DNA Encoding sgRNA and Donor DNA for the Generation of Single Nucleotide Variants Based on the CRISPR/Cas9 System Soyeong Jun, Hyeonseob Lim, Hoon Jang, Wookjae Lee, Jinwoo Ahn, Ji Hyun Lee, and Duhee Bang ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.7b00345 • Publication Date (Web): 20 Jun 2018 Downloaded from http://pubs.acs.org on June 21, 2018



ACS Synthetic Biology

Straightforward Delivery of Linearized Double-stranded DNA Encoding sgRNA and Donor DNA for the Generation of Single Nucleotide Variants Based on the CRISPR/Cas9 System

Soyeong Jun¹,³, Hyeonseob Lim¹,³, Hoon Jang¹, Wookjae Lee¹, Jinwoo Ahn¹, Ji Hyun Lee²,*, Duhee Bang¹,*

¹ Department of Chemistry, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, Seoul 03722, Republic of Korea
² Department of Clinical Pharmacology and Therapeutics, College of Medicine, Kyung Hee University, 26 Kyungheedae-ro, Dongdaemun-gu, Seoul 02447, Republic of Korea
³ The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint first authors.
* Correspondence should be addressed to D.B. ([email protected]) or J.H.L. ([email protected])

ABSTRACT
CRISPR/Cas9 for genome editing requires delivery of a guide RNA sequence and donor DNA for targeted homologous recombination. Typically, a single-stranded oligodeoxynucleotide, serving as the donor template, and a plasmid encoding guide RNA are delivered as two separate components. However, in the multiplexed generation of single nucleotide variants, this two-component delivery system is limited by the difficulty of delivering a matched pair of sgRNA and donor DNA to the target cell. Here, we describe a novel co-delivery system called “sgR-DNA” that uses a linearized double-stranded DNA consisting of a donor DNA component and a component encoding sgRNA. Our sgR-DNA–based method is simple to implement because it does not require cloning steps. We also report the potential of our delivery system to generate multiplex genomic substitutions in Escherichia coli and human cells.

KEYWORDS: mutagenesis, genome editing, CRISPR, multiplex, gene technology

ACS Paragon Plus Environment

INTRODUCTION
Discerning the functional and clinical effects of genetic mutations is one of the most competitive fields in biological research. Functional screening of a population of cells modified to carry one mutation per cell is becoming a routine approach to rapidly identify a myriad of genetic mutations. A recently developed genome editing method based on the RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR)-associated nuclease Cas9¹ is an innovative approach for genetic modifications such as knockout² and transcriptional regulation³. This method requires a single-guide RNA (sgRNA) that consists of a CRISPR RNA (crRNA) fused with a trans-activating crRNA (tracrRNA). The technology is scalable, with the ability to target tens of thousands of loci. However, unlike knockout approaches or transcriptional regulation, CRISPR-based generation of single nucleotide variants (SNVs) is primarily limited to a single locus per experiment⁴. We reasoned that development of a system that simultaneously delivers sgRNA and donor DNA would allow high-throughput generation of substitution mutations.

Unlike gene knockout, which requires only a guide RNA vector to induce indels at a target sequence by non-homologous end joining (NHEJ) in mammalian cells, substitution of a nucleotide requires an additional component (i.e., donor DNA); the substitution mutation can be generated by homologous recombination (HR) when sgRNA and donor DNA that target the same gene are co-delivered. However, the need for scalable production of both sgRNA and donor DNA, which are usually prepared as a plasmid and as single-stranded DNA, respectively⁴⁻⁶, limits the extension of the technology to multiplex experiments.
Moreover, even when libraries of sgRNA and donor DNA are constructed, only a limited number of molecules⁷ can be transformed into a cell, which reduces the probability that a guide RNA and a donor DNA targeting the same gene are delivered to the same cell (Supplementary Note, Figure 1a). Therefore, a new delivery system that delivers the sgRNA and donor DNA pair as a conjugated unit for each target is necessary to efficiently generate substitution mutations.

Recently, new conjugated delivery methods showed increased HR efficiency and precision compared to unconjugated delivery. These included “gDonor”⁸, which used a chemically ligated sgRNA-donor DNA pair, and “CRISPR-gold”⁹, which used a nanoparticle encapsulating the sgRNA-Cas9 complex and donor DNA. However, these delivery systems were not appropriate for multiplex generation of substitution mutations because the preparation of gDonor or CRISPR-gold libraries was too cost- and labor-intensive to be scalable: for these methods, the reaction conjugating each targeted sgRNA and donor pair must be performed individually.

Here, we describe the development of a matched sgRNA-donor DNA pair (sgR-DNA) by assembly of sgRNA-encoding DNA and donor DNA (Figure 1b and Supplementary Figure S1) to facilitate scalable production of matched sgRNA-donor DNA conjugates by a standard polymerase chain reaction (PCR) protocol. We used our system to engineer Escherichia coli (E. coli), which lacks the NHEJ pathway, and then extended our system to mammalian cell engineering. We observed that a pool of sgR-DNAs could be used to generate substitution mutations in both E. coli (11-plex) and human cells (10-plex).
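The pairing problem that motivates sgR-DNA can be illustrated with a toy calculation. The sketch below is not the model of the Supplementary Note. It assumes, purely for illustration, that a cell receives one sgRNA and a small number of donor molecules drawn independently and uniformly from an equimolar pool of N targets. The function name `p_matched_pair` and the parameter `donors_per_cell` are hypothetical.

```python
def p_matched_pair(n_targets: int, donors_per_cell: int = 1) -> float:
    """Probability that at least one donor matches the sgRNA a cell received.

    Toy assumptions (not from the paper): equimolar pool, one sgRNA per cell,
    donors drawn independently and uniformly from n_targets species.
    """
    if n_targets < 1 or donors_per_cell < 0:
        raise ValueError("n_targets must be >= 1 and donors_per_cell >= 0")
    p_single_mismatch = 1.0 - 1.0 / n_targets
    return 1.0 - p_single_mismatch ** donors_per_cell


if __name__ == "__main__":
    for n in (1, 3, 11, 4081):
        print(f"{n:>5} targets: two-component p = {p_matched_pair(n):.4f}; "
              "conjugated sgR-DNA p = 1 by construction")
```

Under these assumptions the chance of a matched pair in the two-component system falls roughly as 1/N, whereas a conjugated construct carries its matched donor by design.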

RESULTS AND DISCUSSION
We first targeted the endogenous galK gene of EcHB3 (a derivative of EcNR2⁵,¹⁰) in a singleplex reaction. This strain has an inactivated endogenous mutS gene, which could enhance the substitution efficiency. Donor DNA of various lengths (63, 93, or 123 nt) was designed with a protospacer adjacent motif (PAM) modification to revert the premature stop codon (I239*) of galK in the galKOFF EcHB3 strain and to prevent recutting after engineering (Figure 2a). The sgRNA was designed with a 20-nt spacer complementary to the target genomic locus and a 77-nt scaffold region. The entire sgR-DNA was then constructed by PCR (Supplementary Figure S2). Expression from sgR-DNA was driven by the T7 promoter because it gives strong expression before the sgR-DNA is removed from cells and because its short length causes less amplification bias than constructs carrying longer prokaryotic promoters. Promoters that do not require additional plasmids but are longer than T7, such as the Lac and Tac promoters, can also be used. A construct with the Lac promoter showed 2.7% efficiency, estimated by counting colonies on MacConkey plates (Supplementary Table S1), which was lower than that with the T7 promoter (Figure 2b).
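The donor lengths of 63, 93, and 123 nt follow from the layout given in the Methods: a 3-nt variant codon flanked by homology arms of 30, 45, or 60 nt, with the variant codon differing from the target codon at two or more positions. A minimal sketch of that layout is shown below. The function names are hypothetical, the arm sequences and the codons `TAA` and `ATC` are placeholders rather than the actual galK sequences, and the PAM modification is not modeled.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_donor(left_arm: str, target_codon: str, variant_codon: str,
                right_arm: str, min_mismatch: int = 2) -> str:
    """Concatenate homology arms around a variant codon.

    Rejects variant codons that differ from the target codon at fewer than
    min_mismatch positions, following the >=2-nt rule stated in the Methods.
    """
    if len(target_codon) != 3 or len(variant_codon) != 3:
        raise ValueError("codons must be 3 nt long")
    if len(left_arm) != len(right_arm):
        raise ValueError("homology arms are assumed to be equal in length")
    if hamming(target_codon, variant_codon) < min_mismatch:
        raise ValueError("variant codon too similar to the target codon")
    return left_arm + variant_codon + right_arm


if __name__ == "__main__":
    # Placeholder arms; real arms would be copied from the genome.
    for arm in (30, 45, 60):
        donor = build_donor("A" * arm, "TAA", "ATC", "C" * arm)
        print(f"{arm}-nt arms -> {len(donor)}-nt donor")
```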

For sgR-DNA–based engineering, E. coli cells were sequentially transformed by electroporation with plasmids expressing T7 polymerase (pN249) and Cas9 (pET-Cas9), followed by the sgR-DNA construct. As a control, we tested another method using linear double-stranded DNA expressing sgRNA (i.e., linear sgRNA) and single-stranded oligodeoxynucleotides (ssODNs), which are more efficient than double-stranded DNA¹¹ and are usually used as donor DNA in methods such as Multiplex Automated Genomic Engineering (MAGE)¹⁰ and CRISPR/Cas9 and λ Red Recombineering-based MAGE (CRMAGE)⁶ (Supplementary Figure S2). The engineering efficiency of the two methods was compared by next generation sequencing (NGS). Our results showed that linear dsDNA constructs could express sgRNA and modify the target locus with ssODN. The efficiency using linear sgRNA with 123-nt ssODN (3rd bar in Figure 2b, Supplementary Figure S3a) was 4%, and a maximum efficiency of 11% was observed using 1 µM sgR-DNA composed of 123-nt donor DNA (last bar in Figure 2b, Supplementary Figure S3b). The efficiency was increased when using sgR-DNA with Cas9 compared to the efficiency of lambda-red recombination when used with conventional ssODN (2nd bar in Figure 2b) and sgR-DNA (4th bar in Figure 2b). We confirmed that sgRNA was expressed from sgR-DNA, which had an extra donor sequence, under the control of the T7 promoter and that it functioned with Cas9. To do this, we detected the presence of transcribed sgR-DNA by reverse transcription (RT)-PCR and Cas9-mediated cleavage of a protospacer-containing DNA fragment (Supplementary Figure S3c and d). In summary, we showed that sgR-DNA with Cas9 promoted genomic substitution.

We next tested the sgR-DNA construct in multiplex genome engineering. Genome-wide distributed loci at 11 aromatic amino acid biosynthesis genes (aroA, aroB, aroC, aroD, aroE, aroF, aroG, aroK, aroL, aroM, and aroP) were selected. A pool of sgR-DNA sequences targeting these genes was delivered to the cells, and efficiency was determined with the same procedure used to evaluate the singleplex reactions (Figure 2c). Control cells were transformed with a pool of aro gene-targeting sequences consisting of linear sgRNA with ssODNs. We observed homologous recombination (HR) at all targeted loci (Figure 2d, e). Yields were 2.5% with sgR-DNA (3rd bar in Figure 2d) and 0.19% with linear sgRNA/ssODN (2nd bar in Figure 2d). The lower efficiency compared with the singleplex experiment may be because, even after one locus was engineered, the sgRNAs for other genes could still cleave the DNA, inducing cell death. Consistent with this, we observed fewer colony-forming units (CFU) 2 hours after electroporation in another experiment that targeted six genes (aroA, aroB, aroC, aroK, aroM, and aroP) than in an experiment targeting only aroC (Supplementary Figure S3e). Nonetheless, our method provided better results than using sgRNA/ssODN, because sgR-DNA delivered sgRNA and the matched donor DNA together, whereas sgRNA and ssODN were often unpaired in the target cells.
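The relative advantage of sgR-DNA over the two-component control in E. coli can be read directly from the efficiencies quoted above. The short sketch below only restates that arithmetic; the dictionary layout and the helper `fold_change` are illustrative names, and the percentages are the values reported in the text.

```python
# Substitution efficiencies (%) as reported in the text for E. coli.
REPORTED = {
    "singleplex galK": {"sgR-DNA": 11.0, "linear sgRNA + ssODN": 4.0},
    "11-plex aro genes": {"sgR-DNA": 2.5, "linear sgRNA + ssODN": 0.19},
}


def fold_change(treatment: float, control: float) -> float:
    """Ratio of two efficiencies; the control must be positive."""
    if control <= 0:
        raise ValueError("control efficiency must be positive")
    return treatment / control


if __name__ == "__main__":
    for experiment, eff in REPORTED.items():
        ratio = fold_change(eff["sgR-DNA"], eff["linear sgRNA + ssODN"])
        print(f"{experiment}: sgR-DNA is {ratio:.1f}-fold more efficient")
```

The gap between the two methods widens from the singleplex to the 11-plex setting, which is in line with the argument that unpaired sgRNA and ssODN become more frequent as the pool grows.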
The viability of cells electroporated with the sgRNA-expressing vector with donor DNA (CRMAGE-like conditions) was also less than that of cells transformed with sgR-DNA.

We next tested sgR-DNA for SNV generation in mammalian cells (HEK 293T). We targeted the nucleotide of EGFR that corresponds to an amino acid change from wild-type Leu858 to Arg858, a driver mutation in lung cancer.¹² We used the human U6 promoter to drive expression of sgR-DNA (Figure 3a), and the donor DNA contained four mismatches: three to introduce the desired amino acid change (p.L858R) and one to introduce the PAM modification. The sgR-DNA was assembled by PCR (Supplementary Figure S4) and delivered to the cells with pCas9-GFP, and efficiency was assessed by NGS. We also examined other delivery systems (Supplementary Figure S5) and used FACS to enrich for Cas9-transduced cells using GFP. We found that sgR-DNA could mediate HR in mammalian cells, with a yield of 0.6%, which is less than when linear sgRNA with ssODN is used (1.4%) (Figure 3b). This result is consistent with the knowledge that ssODN is generally preferred for HR in mammalian cells¹³. Additionally, we believe that the non-homologous sequence of sgR-DNA, which is not present in ssODN, also decreased the efficiency. However, considering the difficulty of large-scale production of ssODN, we think that the 0.8-percentage-point advantage in efficiency of linear sgRNA with ssODN over sgR-DNA is less important than the ability to produce sgR-DNA on a large scale. We found that, unlike in E. coli engineering, indels via NHEJ were predominant, with an efficiency as high as 23.9% (Supplementary Figure S6a). The proportion of FACS-enriched Cas9-transduced cells engineered by sgR-DNA (substitution: 2.0%, indels: 84%) was higher than that of unsorted cells (Figure 3b and Supplementary Figure S6b). We validated the genotype of the target locus in FACS-isolated single clones by Sanger sequencing (Supplementary Figure S6c). Two of the 43 clones had a heterozygous substitution mutation (SNV generation of 2.3%): mutant clone #1 (carrying an allele with p.L858R and a wild-type allele) and mutant clone #2 (carrying an allele with p.L858R and an allele with a 12-bp deletion).

Next, we tested the potential of multiplex (3-plex) genome editing in human cells by pooling three sgR-DNAs targeting EGFR, BRAF, and KRAS. The overall efficiency of SNV generation was 0.35% (EGFR, 0.19%; BRAF, 0.05%; and KRAS, 0.11%) (Supplementary Figure S7). This result was comparable to the average of 0.38% for singleplex genome editing (0.6%, 0.19%, and 0.34%, respectively) (Supplementary Figures S8 and S9). A comparable level of indels was also observed (30.7%). We then extended the sgR-DNA–based method to 10 loci [CTNNB1, DNMT3A, GNAQ, GNAS, HRAS, IDH2, NOTCH1, NRAS, PIK3CA, and TP53] (Figure 3c and Supplementary Table S2), selected based on their high frequency in the Catalogue of Somatic Mutations in Cancer database.⁶ We constructed and pooled individual sgR-DNAs to generate a sgR-DNA library.
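The summary figures quoted for the human-cell experiments can be reproduced from the per-gene values in the text. The sketch below does only that arithmetic. It assumes that the singleplex values are listed in the order EGFR, BRAF, KRAS, and that the 2.3% figure for the clones counts edited alleles out of two scored alleles per clone; both are interpretations rather than statements from the source, and the helper `allele_fraction` is a hypothetical name.

```python
from statistics import mean

# Per-gene SNV efficiencies (%) as quoted in the text.
SINGLEPLEX = {"EGFR": 0.60, "BRAF": 0.19, "KRAS": 0.34}
TRIPLEX = {"EGFR": 0.19, "BRAF": 0.05, "KRAS": 0.11}


def allele_fraction(edited_alleles: int, clones: int,
                    alleles_per_clone: int = 2) -> float:
    """Fraction of scored alleles carrying the substitution.

    alleles_per_clone=2 is an assumption used to interpret the 2.3% figure.
    """
    if clones <= 0 or alleles_per_clone <= 0:
        raise ValueError("clones and alleles_per_clone must be positive")
    return edited_alleles / (clones * alleles_per_clone)


if __name__ == "__main__":
    print(f"3-plex overall: {sum(TRIPLEX.values()):.2f}%")
    print(f"singleplex mean: {mean(SINGLEPLEX.values()):.2f}%")
    print(f"clonal allele fraction: {100 * allele_fraction(2, 43):.1f}%")
```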
We compared sgR-DNA–based genome editing to several controls. The results of NGS showed a total of 0.28% substitutions using the sgR-DNA-based method and 0.97% substitutions in the FACS-sorted cell population (Figure 3d, 3e, and Supplementary Figure S10). Although the efficiency was very low, both the sgRNAs/ssODNs and sgR-DNA methods were capable of multiplexing.

sgR-DNA can also be used with microarray-derived oligonucleotides, which can provide thousands of constructs. For example, if the guide RNA sequence of AsCpf1¹⁴ is used, the sgR-DNA is as short as 168 bp (Supplementary Figure S11a), a favorable length for microarray synthesis. Therefore, Cpf1-based sgR-DNA was examined to test whether thousands of loci could be substituted. First, the galK gene was targeted to revert the premature stop codon in the EcHB3 strain, as above, and we observed an efficiency of 1-4% (Supplementary Figure S11b). The relatively low efficiency compared to that of Cas9 can be explained by recent reports that AsCpf1 has lower activity at the temperatures at which EcHB3 grows.¹⁵ Finally, we constructed a library of 4,081 sgR-DNAs that could substitute 98% of all E. coli ORFs and examined the library. We observed that 369 loci were substituted as designed, although the efficiency was very low (Supplementary Figure S11c). Due to this low efficiency, it is not yet feasible to perform functional screening using our method. However, we believe that sgR-DNA will enable thousands of genetic substitutions once its efficiency is increased.

In mammalian cells, our sgR-DNA method is limited by the large number of deletions due to NHEJ events. Other studies have shown that key components of the NHEJ pathway can be inhibited using short hairpin RNA, SCR7, or adenovirus 4 proteins.¹⁶ Cas9 nickase can also be used to lower the NHEJ rate.¹⁷ As the number of target loci increases, the allelic mutation frequency at each locus decreases, so an efficient method to examine target loci must be devised. We believe that target-capture sequencing¹⁸⁻¹⁹ can be used to detect low-frequency substitutions at target loci.²⁰

In summary, we have demonstrated a new delivery method, which has the potential for multiplex HR-mediated genome modification through delivery of linearized double-stranded sgRNA-encoding DNA and donor DNA (i.e., sgR-DNA). Construction of sgR-DNA is straightforward, involving the production of small linear DNA fragments by PCR and avoiding the laborious cloning step. If the number of targets is less than several hundred, an sgR-DNA library could be prepared using gene fragments obtained from a synthesis company (Supplementary Table S3). We used pooled oligonucleotides cleaved from a programmable microarray²¹ to construct a sgR-DNA library for E. coli with the Cpf1 system and proposed a similar strategy for mammalian cells (Supplementary Figure S12).
Our cost to prepare a pool of sgR-DNA by programmable microarray ($0.696 per target) is lower than that of preparing a pool of ssODN above the picomole scale (>$40 per target) for use in lambda-red recombination, MAGE, or CRMAGE. In general, exogenous linear DNA fragments are known to be relatively unstable in host cells²². However, we successfully achieved recombineering with linear DNA and observed moderate viability compared to the plasmid form, indicating continuous expression of markers and effective selection pressure (Supplementary Figure S3e). Although the efficiency was lower than that of competing methods (CRMAGE reached over 70%), we think the sgR-DNA method reported here has the advantage of scalability due to the cost-efficient and simplified library preparation steps, which can be performed within a day. We believe that sgR-DNA–based genome editing will be useful for multiplex SNV generation for interrogating related functions in E. coli and human cells.
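To give a sense of scale, the per-target costs quoted above can be multiplied out for the 4,081-member E. coli library. The sketch below assumes that the per-target cost applies linearly to a library of that size, which the text does not state, and it treats $40 as a lower bound because the source gives ">$40". The constant and function names are illustrative.

```python
COST_SGR_DNA = 0.696    # USD per target, microarray-derived sgR-DNA (from the text)
COST_SSODN_MIN = 40.0   # USD per target, lower bound for ssODN (text says ">$40")
LIBRARY_SIZE = 4081     # sgR-DNAs in the E. coli Cpf1 library (from the text)


def pool_cost(n_targets: int, cost_per_target: float) -> float:
    """Total cost under the assumption of a constant per-target cost."""
    if n_targets < 0 or cost_per_target < 0:
        raise ValueError("inputs must be non-negative")
    return n_targets * cost_per_target


if __name__ == "__main__":
    sgr = pool_cost(LIBRARY_SIZE, COST_SGR_DNA)
    ssodn = pool_cost(LIBRARY_SIZE, COST_SSODN_MIN)
    print(f"sgR-DNA pool: ${sgr:,.2f}")
    print(f"ssODN pool (at least): ${ssodn:,.2f}")
    print(f"cost ratio (at least): {ssodn / sgr:.1f}x")
```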

METHODS
Plasmids expressing T7 polymerase, Cas9, single-guide RNA (sgRNA), and single-stranded oligodeoxynucleotides (ssODNs). The T7 polymerase-encoding plasmid pN249 was a gift from George Church (Harvard Medical School) and the Cas9-encoding plasmid pET-Cas9 was generated by cloning the Cas9 sequence from pMJ806 (Addgene plasmid 39312) into pET-28a. A Cas9-encoding plasmid under the control of the Lac promoter was generated by inserting the Lac promoter fragment into pET-Cas9. Plasmids expressing green fluorescent protein (GFP)-linked human codon-optimized Cas9 nuclease (Addgene plasmid 44719) and guide RNA (gRNA) under control of the U6 promoter (Addgene plasmid 41824) were purchased from Addgene (USA). For controls, we cloned double-stranded DNA containing the target site into a TOPO cloning vector (Enzynomics, Korea) for E. coli and an AflII-digested gRNA cloning vector for human cells. The ssODNs for targeted loci were synthesized by Integrated DNA Technologies (USA). The ssODN sequences for E. coli and human cells are shown in Supplementary Table S2 and S4, respectively.

sgR-DNA library design. The sgR-DNA construct consists of the T7 promoter (24 bp) or U6 promoter (264 bp), sgRNA-encoding DNA (97 bp), termination sequence (6 Ts), and donor DNA (63 bp, 93 bp, or 123 bp) (Figure 1b). In the first step of designing sgRNA for E. coli genome engineering, the protospacer adjacent motifs (PAMs) nearest to the target loci were selected, and the sequence adjacent to the selected PAM was used as the spacer sequence of the sgRNA (Supplementary Table S5). In designing sgR-DNA for human cells, an extra 5' G was added to the 19-bp sequence adjacent to the PAM to form the spacer sequence of the sgRNA. The 20-bp spacer was fused to the 77-bp scaffold to generate sgRNA-encoding DNA (Supplementary Table S6). Donor DNA was constructed to contain a 3-nt variant codon and 30-nt, 45-nt, or 60-nt homology arms (Supplementary Table S2 and S4). Donor DNA was designed to introduce variant codons with a ≥2-nt mismatch with the target codon so that they could be distinguished from next generation sequencing errors.
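The spacer selection rule described in this section (take the PAM nearest to the target locus, then use the adjacent sequence as the spacer) can be sketched as follows. This is a simplified illustration, not the authors' design script: it assumes an NGG PAM for Cas9, scans only the strand it is given, and measures distance from the PAM start to a 0-based target position. The function names `nearest_pam_spacer` and `human_spacer` are hypothetical.

```python
import re
from typing import Optional, Tuple


def nearest_pam_spacer(seq: str, target_pos: int,
                       spacer_len: int = 20) -> Optional[Tuple[int, str]]:
    """Return (pam_start, spacer) for the NGG PAM closest to target_pos.

    Only the given strand is scanned; the spacer is the spacer_len
    nucleotides immediately 5' of the PAM. Returns None if no PAM has
    enough upstream sequence.
    """
    seq = seq.upper()
    best = None
    # Zero-width lookahead so that overlapping PAMs (e.g. in "GGGG") are found.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue
        distance = abs(pam_start - target_pos)
        if best is None or distance < best[0]:
            best = (distance, pam_start, seq[pam_start - spacer_len:pam_start])
    return None if best is None else (best[1], best[2])


def human_spacer(spacer_20: str) -> str:
    """Prepend a 5' G to the 19 nt adjacent to the PAM (human-cell design)."""
    return "G" + spacer_20[-19:]


if __name__ == "__main__":
    toy_locus = "A" * 25 + "TGG" + "C" * 10   # placeholder sequence
    hit = nearest_pam_spacer(toy_locus, target_pos=30)
    if hit is not None:
        pam_start, spacer = hit
        print(pam_start, spacer, human_spacer(spacer))
```

A fuller version would also scan the reverse complement and check that the donor can disrupt the chosen PAM with a synonymous change, as the Methods describe.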
The donor DNA also carried a synonymous change in the PAM to prevent further PAM recognition after the single nucleotide variant was generated. Subsequently, sgR-DNA was constructed by assembly of sgRNA-encoding DNA and donor DNA.

sgR-DNA library construction for E. coli. We synthesized the sgR-DNA constructs (190, 220, or 250 bp containing homology arms of 30, 45, or 60 nt, respectively) in three parts: i) right portion of T7 promoter (20 nt), spacer (20 nt), and sequence encoding the region homologous to the sgRNA scaffold (20 nt); ii) sequence encoding sgRNA (77 nt) and termination sequence (6 Ts); and iii) donor DNA (63 nt, 93 nt, or 123 nt) (Supplementary Figure S2). Each part was made independently by solid-phase oligonucleotide synthesis (Integrated DNA Technologies, USA). The parts were then assembled by PCR with the T7 forward primer and target-specific middle oligo (sequences shown in Supplementary Table S2, S7, and S8). We generated the sgR-DNA construct by PCR using i-pfu 2× PCR Master Mix Solution (Intron, Korea) and the following thermal cycling conditions: 95°C for 5 minutes; followed by 20 cycles of 95°C for 30 seconds, 50°C for 30 seconds, and 72°C for 30 seconds; and a final elongation step at 72°C for 5 minutes. The PCR products were separated on a 2% agarose gel and then purified using the MinElute Gel Extraction Kit (Qiagen, Germany).

Construction of sgR-DNA targeting galK under the control of the Lac promoter. We synthesized the construct in two parts: i) Lac promoter and sequence encoding the region homologous to the sgRNA spacer targeting galK; ii) sequence encoding sgRNA spacer, scaffold, and donor DNA (123 nt). Part i was assembled by PCR, using oligonucleotides pLacO 1, 2, and 3. Part ii was amplified from sgR-DNA with the T7 promoter using the spacer forward and galK reverse primers. Oligonucleotides used in this step are described in Supplementary Table S13. Parts i and ii were assembled by PCR with the i-pfu 2× PCR Master Mix Solution. PCR cycling was carried out under the following conditions: initial denaturation at 95°C for 5 minutes; 20 cycles of 95°C for 30 seconds, 57°C for 30 seconds, and 72°C for 30 seconds; and final elongation at 72°C for 5 minutes. Assembled constructs were separated on a 2% agarose gel and then purified using the MinElute Gel Extraction Kit.

sgR-DNA library construction for human cells. We synthesized the sgR-DNA constructs (490 bp) in three parts: i) U6 promoter (264 bp); ii) sequence encoding the region homologous to the U6 promoter (19 bp), sequence encoding the sgRNA scaffold (77 bp), termination sequence (6 Ts), and region homologous to the donor DNA (18–35 bp); and iii) donor DNA (123 bp) (Supplementary Figure S4). Each part was synthesized independently. Part i was amplified from the gRNA cloning vector using the U6 forward and reverse primers (Supplementary Table S9). Parts ii and iii were made by solid-phase oligonucleotide synthesis (Integrated DNA Technologies, USA) (Supplementary Table S9 and S4) and assembled by PCR. We generated the sgR-DNA construct by PCR with the i-pfu 2× PCR Master Mix Solution.
PCR cycling was carried out under the following conditions: initial denaturation at 95°C for 5 minutes; 20 cycles of 95°C for 30 seconds, 58°C for 30 seconds, and 72°C for 30 seconds; and a final elongation step at 72°C for 5 minutes. Assembled constructs were separated on a 2% agarose gel and then purified using the MinElute Gel Extraction Kit.

E. coli cell culture and electroporation. We used the E. coli strain EcHB3 (a gift from HB Kim), an EcNR2¹⁰ derivative in which cat, bla, galK, and malK were inactivated by multiplex automated genome engineering, with targeting oligonucleotide sequences described by Wang et al. The cells were grown at 30°C in LB or on LB agar plates supplemented with 30 µg/ml kanamycin or 100 µg/ml spectinomycin. Isopropyl β-D-1-thiogalactopyranoside (IPTG; 1 mM) was used to induce the lac promoter. E. coli cells were electroporated with a pN249 or pET-Cas9 plasmid using a Bio-Rad Gene Pulser as follows. The cells were grown at 30°C. At mid-log phase (OD600 = ~0.8) the cells (1 ml) were harvested by centrifugation at 15,000 rpm for 1 minute. The cell pellet was washed twice with 1 ml chilled ddH2O and resuspended with 50 µL of the appropriate plasmid. Cell suspensions were electroporated at 1.8 kV in 1-mm gap cuvettes, after which LB medium without antibiotics was added. The cells were allowed to recover for 3 hours and transferred to LB with the appropriate antibiotic. For electroporation of sgR-DNA or linear DNA fragments, the cells were grown in the presence of 1 mM IPTG to express Cas9 or T7 polymerase, controlled by the IPTG-inducible pT7lacO promoter. At mid-log phase (OD600 = ~0.8) the cells were heat-shocked at 42°C for 15 minutes to express the λ Red recombination system controlled by the heat-inducible pL promoter. Then, 1 ml of cells was harvested, washed twice with ddH2O, and pulsed with linear DNA fragments (0.1 µM or 1 µM). The cells were then cultured overnight before target region amplification to confirm engineering efficiency. In some conditions, we analyzed cell viability 2 hours after electroporation. We spread 300 µl of cells onto plates containing appropriate antibiotics and IPTG and counted the number of colonies after 14 hours of incubation at 30°C.

Functional evaluation of sgR-DNA with Cas9. To confirm the existence of sgRNA transcribed from sgR-DNA, we extracted RNA from sgR-DNA-treated cells using the Easy-blue Total RNA extraction kit (Intron, Korea), and added Turbo DNase I (Ambion, USA) to the extracted product for 1 hour at 37°C. We then performed RT-PCR. Reverse transcription was performed using the galK_sgR-DNA_rev primer “AAACTGGTTTTCTGCTTCCT” with Maxima H Minus Reverse Transcriptase (Thermo Scientific, USA), according to the manufacturer’s protocol. The PCR reaction was performed using KAPA HotStart ReadyMix with scaffold_fwd primer “GTTTTAGAGCTAGAAATAGCAAG” and the sgR-DNA_rev primer. We confirmed the identity of the products by gel electrophoresis. To test whether sgRNA expressed from sgR-DNA could function with Cas9, we obtained sgRNA expressed from sgR-DNA by in vitro transcription using the MEGAscript T7 Transcription kit (Thermo Fisher Scientific, USA). Then, sgRNA and Cas9 were pre-assembled at 37°C for 5 min, and 100 ng of PCR-amplified target DNA was incubated with this ribonucleoprotein complex at a 1:100 molecular ratio. The reaction took place in NEB 3 buffer at 37°C for 1 hour. After the incubation, we resolved the cleavage of target DNA by 2.5% agarose gel electrophoresis.

Human cell culture and transfection. HEK 293T cells were cultured in Dulbecco’s modified Eagle medium (Gibco/Life Technologies, USA) supplemented with 10% heat-inactivated fetal bovine serum (Gibco/Life Technologies) and 1% penicillin/streptomycin (Gibco/Life Technologies) at 37°C in a 5% CO2 humidified atmosphere. On the day before transfection, cells were seeded in a 6-well plate or 100-mm dish. For singleplex experiments, cells were transfected with 1.6 µg sgR-DNA and/or 0.8 µg pCas9-GFP using Lipofectamine 3000 (Invitrogen, USA) according to the manufacturer’s protocol. In multiplex experiments, cells were transfected with 10 µg sgR-DNA and 5 µg pCas9-GFP.

Fluorescence-activated cell sorting. HEK 293T cells were sorted using a BD FACSAria II or III instrument (BD Biosciences, USA) 48 hours after transfection with the pCas9-GFP and sgR-DNA library. In brief, cultured cells were trypsinized and pelleted by centrifugation and then washed in phosphate-buffered saline (PBS). The pellet was resuspended in sorting buffer (PBS containing 2 mM EDTA, 25 mM HEPES, and 1% bovine serum albumin) at a final density of 3–4 × 10⁶ cells/ml. Finally, the cells were filtered through a cell strainer to prepare a single-cell suspension.

Clonal expansion and genotyping. To generate clonal cell lines with defined mutations, 48 hours after transfection the GFP-positive cells (transfected with pCas9-GFP and sgR-DNA) were plated as single cells in 96-well plates containing complete medium. Cells were grown for 2–3 weeks, and then genomic DNA was extracted. Target regions were amplified by PCR and genotyped by Sanger sequencing.

Genomic DNA isolation. Genomic DNA was isolated 48 hours after transfection, after fluorescence-activated cell sorting (FACS) analysis, or after clonal expansion using the DNeasy Blood and Tissue Kit (Qiagen, Germany). Cells were harvested, and the cell pellet was resuspended in 200 µl PBS and lysed with proteinase K. The lysate was loaded onto a spin column, washed twice, and eluted in nuclease-free water. Isolated genomic DNA was quantified using the Qubit dsDNA BR Assay Kit (Life Technologies, USA).

Confirmation of target modification by sequencing in the E. coli experiment. PCR primers were designed to amplify the target region (amplicon size approximately 200 bp) (Supplementary Table S10). To prevent capture of donor DNA, primers were designed to capture the genomic region sufficiently far from the codon intended for modification.
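The primer placement rule just stated can be expressed as a simple coordinate check: if a primer binds outside the region covered by the donor, residual donor DNA cannot serve as a template for the full amplicon. The sketch below uses a stricter condition than the text requires, namely that both primer binding sites lie entirely outside the donor span; that choice, the 0-based leftmost coordinates, and the function names are all assumptions made for illustration.

```python
def donor_span(codon_start: int, arm_len: int) -> range:
    """Genomic positions covered by a donor: arm + 3-nt codon + arm."""
    return range(codon_start - arm_len, codon_start + 3 + arm_len)


def overlaps(a: range, b: range) -> bool:
    """True if two half-open coordinate ranges share at least one position."""
    return a.start < b.stop and b.start < a.stop


def primers_outside_donor(fwd_start: int, fwd_len: int,
                          rev_start: int, rev_len: int,
                          codon_start: int, arm_len: int) -> bool:
    """Check that neither primer binding site overlaps the donor span.

    fwd_start and rev_start are the leftmost genomic coordinates of the
    forward and reverse primer binding sites (0-based).
    """
    span = donor_span(codon_start, arm_len)
    fwd_site = range(fwd_start, fwd_start + fwd_len)
    rev_site = range(rev_start, rev_start + rev_len)
    return not overlaps(fwd_site, span) and not overlaps(rev_site, span)


if __name__ == "__main__":
    # Hypothetical coordinates: codon at 500, 60-nt arms, ~200-bp amplicon.
    print(primers_outside_donor(400, 20, 580, 20, codon_start=500, arm_len=60))
```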
The target region was amplified with KAPA HiFi HotStart ReadyMix (Kapa Biosystems, USA) under the following cycling conditions: initial denaturation at 95°C for 3 minutes; 27 cycles of denaturation at 95°C for 20 seconds, annealing at 60°C for 15 seconds, and elongation at 72°C for 15 seconds; and a final elongation step at 72°C for 1 minute. The PCR products were pooled, prepared with the SPARK DNA Sample Prep Kit (Enzymatics, USA), and sequenced on a HiSeq 2500 system (Illumina, USA).

Confirmation of target modification by sequencing in the human cell experiment. For singleplex experiments, PCR primers were designed to amplify the target region (amplicon size approximately 400 bp; Supplementary Table S11). In multiplex experiments, the captured region started 100–134 bp upstream of the substitution position and extended 100–129 bp downstream of it (Supplementary Table S11). To avoid amplifying donor DNA instead of the genome, primers were designed to anneal to genomic sequence sufficiently far from the codon targeted for modification. The target region was amplified with KAPA HiFi HotStart ReadyMix under the same cycling conditions as in the E. coli experiment (initial denaturation at 95°C for 3 minutes; 27 cycles of 95°C for 20 seconds, 60°C for 15 seconds, and 72°C for 15 seconds; final elongation at 72°C for 1 minute). The PCR products were pooled, prepared with the SPARK DNA Sample Prep Kit, and sequenced on a HiSeq 2500 system.

Cpf1 experiments. The sgR-DNA targeting galK consists of the T7 promoter (25 bp), guide RNA-encoding DNA (42 bp), a termination sequence (6 Ts), and donor DNA (95 bp) (Supplementary Figure S11a). Apart from the change of the PAM sequence to 5′-TTN-3′, the spacer sequence and donor DNA were designed as described for the Cas9 experiment. In the singleplex experiment, sgR-DNA was constructed by assembling the guide RNA-encoding DNA and the donor DNA (Supplementary Table S12). In the multiplex experiment, microarray-derived oligonucleotides were designed as 175-nt sequences comprising part of the T7 promoter (5 nt), guide RNA-encoding DNA (42 nt), a termination sequence (6 Ts), donor DNA (93 nt), the enzymatic cut site (7 nt), and an amplification arm sequence (20 nt). Oligonucleotides were first amplified by 27 cycles of PCR with common flanking primers, the full T7 promoter was then added by 6 cycles of PCR, and the amplification arm sequence was removed by overnight EarI digestion. Each PCR was performed with KAPA HiFi HotStart ReadyMix under the following conditions: initial denaturation at 95°C for 3 minutes, followed by the appropriate number of cycles of denaturation at 95°C for 20 seconds, annealing at 56°C for 15 seconds, and elongation at 72°C for 15 seconds. PCR primers and designed oligonucleotides are listed in Supplementary Tables S8 and S13. Most subsequent experiments and data analyses were performed as described for the Cas9 experiment.
To confirm target modification in multiplex experiments, a whole-genome library was prepared with the SPARK DNA Sample Prep Kit and subjected to NGS.

Data processing and analysis. After sequencing, we trimmed low-quality read ends (Phred quality score < 25) and removed reads with an average Phred quality score < 20. To determine recombineering efficiency in the E. coli experiment, we counted reads carrying the desired modification. In the human cell experiment, reads were aligned to the human reference genome (hg19) using Novoalign V2.07.18 (Novocraft Technologies, Malaysia), and we counted reads with insertions or deletions within 10 bp of the predicted cleavage site (i.e., 3 bp upstream of the PAM sequence). To determine substitution efficiency, we counted reads containing the modification sequence.
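
The read-filtering and counting rules in this paragraph can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names, the choice to trim both read ends, the 0-based forward-strand coordinate convention, and the use of CIGAR strings to locate insertions and deletions are all assumptions made for illustration. Only the thresholds (25, 20, 10 bp, 3 bp) come from the text.

```python
import re
from statistics import mean

TRIM_Q = 25      # end-trimming threshold (from the text)
MIN_MEAN_Q = 20  # minimum average Phred score per read (from the text)
WINDOW = 10      # bp around the predicted cleavage site (from the text)
CUT_OFFSET = 3   # cleavage 3 bp upstream of the PAM (from the text)

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def trim_low_quality_ends(seq, quals, threshold=TRIM_Q):
    """Drop bases from both ends while their Phred score is below threshold.

    Assumption: both ends are trimmed; the text does not specify which.
    """
    start, end = 0, len(quals)
    while start < end and quals[start] < threshold:
        start += 1
    while end > start and quals[end - 1] < threshold:
        end -= 1
    return seq[start:end], quals[start:end]


def passes_mean_quality(quals, minimum=MIN_MEAN_Q):
    """Keep a read only if it is non-empty and its mean Phred score >= minimum."""
    return len(quals) > 0 and mean(quals) >= minimum


def predicted_cut_site(pam_start):
    """Cleavage coordinate for a forward-strand target.

    Assumption: pam_start is the 0-based position of the first PAM base.
    """
    return pam_start - CUT_OFFSET


def has_indel_near_cut(ref_start, cigar, cut, window=WINDOW):
    """True if the alignment has an insertion or deletion within window bp of cut."""
    pos = ref_start  # 0-based reference coordinate of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            if abs(pos - cut) <= window:
                return True
        elif op == "D":
            # The deletion removes reference bases [pos, pos + length).
            if pos - window <= cut <= pos + length + window:
                return True
            pos += length
        elif op in "MN=X":
            pos += length
        # S, H and P do not consume reference bases.
    return False


def fraction_containing(reads, modified_seq):
    """Fraction of reads that contain the expected modified sequence."""
    if not reads:
        return 0.0
    return sum(modified_seq in read for read in reads) / len(reads)
```

In this sketch a read would be trimmed first, then dropped if it fails `passes_mean_quality`. Surviving reads would be tested with `has_indel_near_cut` to count indels, or pooled and passed to `fraction_containing` to estimate substitution efficiency. Targets on the reverse strand would need a mirrored version of `predicted_cut_site`.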

ASSOCIATED CONTENT
Supporting Information

Additional figures and tables. (PDF)

AUTHOR INFORMATION
Corresponding Authors
Correspondence to Duhee Bang ([email protected]) or Ji Hyun Lee ([email protected]).
Author Contributions
D.B. conceived the study. S.J. and H.L. designed and performed the experiments and analyzed the sequencing data. J.H., W.L., and J.A. performed the experiments. S.J., H.L., J.H.L., and D.B. wrote the manuscript. D.B. and J.H.L. managed the study. All authors read and approved the manuscript.
Notes
D.B., J.H.L., S.J., and H.L. are authors of a patent application for the method described in this paper (Targeted genome editing based on CRISPR/Cas9 system using short linearized double-stranded DNA, 10-1785847). The remaining authors declare no competing financial interest.

ACKNOWLEDGMENTS
This work was supported by the Mid-career Researcher Program (NRF-2015R1A2A1A10055972), the Pioneer Research Center Program (NRF-2012-0009557), the Basic Science Research Program (NRF-2015R1A2A2A03006577), and the Bio & Medical Technology Development Program (NRF-2016M3A9B6948494 and NRF-2018M3A9H3024850), funded by the Ministry of Science & ICT through the National Research Foundation of Korea.

REFERENCES
Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., Charpentier, E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science.
Shalem, O., Sanjana, N. E., Hartenian, E., Shi, X., Scott, D. A., Mikkelson, T., Heckl, D., Ebert, B. L., Root, D. E., Doench, J. G., Zhang, F. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science.

Gilbert, L. A., Horlbeck, M. A., Adamson, B., Villalta, J. E., Chen, Y., Whitehead, E. H., Guimaraes, C., Panning, B., Ploegh, H. L., Bassik, M. C., Qi, L. S., Kampmann, M., Weissman, J. S. Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell. Mali, P.,