
Article

A highly characterized synthetic landing pad system for precise multi-copy gene integration in yeast Vincent J. J. Martin, Leanne Bourgeois, and Michael E Pyne ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.8b00339 • Publication Date (Web): 29 Oct 2018 Downloaded from http://pubs.acs.org on October 29, 2018

ACS Synthetic Biology

A highly characterized synthetic landing pad system for precise multi-copy gene integration in yeast

Leanne Bourgeois,a,b,*,† Michael E. Pyne,a,b,* Vincent J. J. Martin,a,b,#

aDepartment of Biology, Concordia University, Montréal, Québec, Canada;
bCentre for Applied Synthetic Biology, Concordia University, Montréal, Québec, Canada

#Address correspondence to Vincent J. J. Martin, [email protected]

†Present address: Hyasynth Bio, Montréal, Québec, Canada

*Contributed equally to this work.


ABSTRACT

A fundamental undertaking of metabolic engineering involves identifying and troubleshooting metabolic bottlenecks that arise from imbalances in pathway flux. To expedite the systematic screening of enzyme orthologs in conjunction with DNA copy number tuning, here we develop a simple and highly characterized CRISPR-Cas9 integration system in Saccharomyces cerevisiae. Our engineering strategy introduces a series of synthetic DNA Landing Pads (LP) into the S. cerevisiae genome to act as sites for high-level gene integration. LPs facilitate multi-copy gene integration of one, two, three, or four DNA copies in a single transformation, thus providing precise control of DNA copy number. We applied our LP system to norcoclaurine synthase (NCS), an enzyme with poor kinetic properties involved in the first committed step of the production of high-value benzylisoquinoline alkaloids. The platform enabled rapid construction of a 40-strain NCS library by integrating ten NCS orthologs in four gene copies each. Six active NCS variants were identified, whereby production of (S)-norcoclaurine could be further enhanced by increasing NCS copy number. We anticipate the LP system will aid in metabolic engineering efforts by providing strict control of gene copy number and expediting strain and pathway engineering campaigns.

Keywords: Saccharomyces cerevisiae, landing pad, CRISPR, metabolic engineering, alkaloids, norcoclaurine


Engineering microorganisms to synthesize non-native compounds involves both the reconstruction of heterologous metabolic pathways and the modification of host activities. Following validation and reconstitution of a target metabolic pathway, extensive pathway optimization is required to reach commercially viable productivities. This process can last 5-10 years and cost more than $50 million1. Metabolic bottlenecks arise from an array of confounding issues, such as poor gene expression, formation of misfolded or unstable proteins, improper protein localization, loss of intermediates to off-target reactions, product toxicity, or poor enzyme kinetics2-4. Owing to the central role of debottlenecking in metabolic engineering, substantial focus has been directed at troubleshooting poor pathway performance. These strategies include screening orthologous enzymes, enhancing protein stability, boosting gene copy number, improving enzyme activity through rational or random approaches, and sequestering full or partial pathways to a cellular compartment5. Bioprospecting is a common approach to overcome enzyme inefficiencies by screening for improved or novel catalytic activities from a library of natural variants6. Genetic tuning involves the application of fundamental genetic tools to modulate gene expression and increase overall metabolic flux through a target pathway7. Regulating gene transcription is the primary mode of flux control and is generally performed by swapping genetic regulatory elements8-11. Because single copy expression strategies are limited by promoter strength, titrating the copy number of biosynthetic or regulatory genes is a complementary strategy to further boost mRNA and thus enzyme levels12-14. The type II CRISPR-Cas9 system from Streptococcus pyogenes has become the preferred genome-editing tool in yeast by enabling scarless, marker-free gene integration15-17. 
By expressing multiple gRNAs to simultaneously target different regions of the genome, the
CRISPR-Cas9 system is amenable to multiplexed engineering17-20. Multiplexing has greatly increased the speed and efficiency of pathway engineering by reducing the number of successive manipulations required to build a desired production strain. Since the initial demonstration16, many CRISPR-Cas9-based strategies have emerged and expanded the genome-editing toolbox. Ronda et al.19 reported the highest integration efficiencies for multiplexed integration of three large DNA constructs, while Jakočiūnas et al.17 demonstrated the in vivo assembly of three five-part assemblies, each integrated at a distinct locus. Strategies employing more than one gRNA are only as efficient as the least effective gRNA18,19,21, thus placing major constraints on the practicality of multiplexing workflows. Bao et al.15 reported 100% efficiency for a single gene disruption, which declined to only 27% efficiency for a triple-gene disruption. Even with rapidly advancing tools for gRNA design and efficacy prediction22,23, current protocols recommend pre-screening three gRNAs per target site24. Hence, genomic integration at four loci requires designing, assembling, and assaying 12 gRNAs prior to attempting the intended integration.

At least two groups have instead investigated multi-loci gene integration by targeting repeat regions within the genome using a single gRNA and donor DNA. By designing and integrating a synthetic DNA “wicket”, Hou et al.25 targeted 3, 6, 9, or 12 genomic sites with a single gRNA and donor. Tandem duplication of the DNA donor was found to be an unintended outcome of the wicket system, whereby integration into the three-copy wicket generated copy numbers well above three. Similarly, by targeting yeast delta sequences of the Ty1 and Ty2 transposons, of which more than 100 copies exist in the S. cerevisiae genome26, Shi et al.13 integrated up to 18 copies of a 24 kb donor construct in a single reaction, a technique termed Di-CRISPR.
In this case, the target integration must be linked to a screenable or selectable phenotype to identify high copy number integrants. While the wicket and Di-CRISPR techniques
are powerful tools to attain higher-order DNA copy numbers (up to 22 copies), optimal output of a target product is not always achieved by maximizing gene dosage27. For instance, Xie et al.28 observed increased lycopene formation upon integration of two copies of crtI relative to a strain expressing one copy, yet lycopene and biomass synthesis decreased upon implementation of a third copy. On the other hand, increasing gene dosage may not influence enzyme levels at all and so expressing additional copies could prove energetically wasteful29.

Benzylisoquinoline alkaloids (BIAs) are a diverse class of plant secondary metabolites that includes the analgesics morphine and codeine, the antitussive and anti-cancer agent noscapine, the vasodilator papaverine, and the antimicrobial agents berberine and sanguinarine5,30. With roughly 2,500 BIAs identified, only a small fraction have been investigated for pharmacological activity, and an even smaller fraction are produced commercially5. The recent elucidation of enzymes involved in BIA synthesis has led to the reconstitution of several BIA pathways in S. cerevisiae for production of (S)-reticuline31,32, sanguinarine33, noscapine34, as well as morphine and codeine35,36. Entry into BIA biosynthesis begins with the formation of (S)-norcoclaurine, the central scaffold from which all BIAs originate37-39. Synthesis of (S)-norcoclaurine in yeast requires heterologous expression of norcoclaurine synthase (NCS), which catalyzes the enantioselective Pictet-Spengler condensation of dopamine and 4-hydroxyphenylacetaldehyde (4-HPAA)38,39. Presently, production of BIAs in S. cerevisiae is inefficient, due in part to the low catalytic activity of NCS40,41. This represents the major bottleneck in the BIA pathway, as NCS converts 24 mg/L dopamine to only 104.6 µg/L (S)-norcoclaurine in S. cerevisiae31.

To expand the multiplexed CRISPR-Cas9 toolbox, here we describe an integration system in S. cerevisiae based on synthetic DNA landing pads (LPs). Our envisioned LP system
was designed to simplify enzyme library screens and provide strict control of DNA copy number. We comprehensively characterized and optimized the LP system for multi-copy gene integration such that gene expression is directly proportionate to gene copy number. We applied the LP system to build and screen a 40-strain enzyme variant and copy number library of NCS, the major rate-limiting enzyme in the upstream BIA biosynthetic network.

RESULTS AND DISCUSSION

Design of a Synthetic Landing Pad System for Multi-Copy Gene Integration in S. cerevisiae. We designed a multiplexed CRISPR-Cas9 integration platform to simplify the screening of enzyme libraries and facilitate precise DNA copy number tuning in S. cerevisiae. The envisioned system leverages synthetic DNA parts, called Landing Pads (LPs), to facilitate multi-copy gene integration (Figure 1A). The LP platform comprises four distinct LP constructs (LP1, LP2, LP3, LP4; LP number denotes genomic copies) dispersed throughout the yeast genome at different copy numbers (Figure 1B). Each LP contains a unique gRNA target sequence flanked by two recombinogenic regions that are used for CRISPR-Cas9-mediated gene integration. Since each LP within the same copy number motif contains identical target sites and recombinogenic regions, a single gRNA and donor DNA construct are used for integration of genes in one, two, three, or four copies in a single transformation. Although the LP system is analogous in design and objective to the so-called wicket integration system25, which utilizes four distinct hosts to modulate copy number, only one master strain is necessary to vary copy number using the LP system. The wicket system is also characterized by uncontrolled tandem duplication of donor DNA, thus preventing precise control of DNA copy number, which is a central objective of the LP system.

Unlike most prior multiplexed CRISPR-Cas9 engineering strategies17-20, the LP system is composed entirely of synthetic parts, allowing us to systematically scrutinize and select optimal DNA parts, including LPs and Cas9 targeting sequences, prior to building the platform. We first designed the LP constructs by randomly generating four different 560 bp sequences with ~50% GC (arbitrarily designated LP1 to LP4). We also generated synthetic gRNAs (20 nt) that when placed directly upstream of a PAM (5'-CGG-3') would serve as synthetic Cas9 target sites within the LP host. LP target sites were queried against the S. cerevisiae genome to eliminate those with potential off-target homology. Ten such target sites were selected for Cas9-gRNA targeting assays and were used to assemble ten LP1 constructs (LP1.T1 to LP1.T10). Because the envisioned LP system requires only four synthetic gRNAs, we first tested the efficacy of each synthetic gRNA in vivo. The ten LP1 constructs were integrated into the same FgF20 locus42 in S. cerevisiae CEN.PK2-1D, yielding ten strains that vary only in the 23-nt Cas9 target site. The corresponding gRNA expression cassettes were then introduced with pCAS into the corresponding LP1 strains to first evaluate gRNA targeting efficiency in the absence of a repair donor (Figure 2A). Targeting efficiency was calculated by comparing the total number of colony-forming units (CFU) resulting from introduction of a synthetic gRNA relative to the same strains transformed with a control non-targeting gRNA cassette (see Materials and Methods). Targeting efficiency of each synthetic gRNA assayed in this manner ranged from 92-100% (Figure 2B). gRNA8 and gRNA10 produced nearly 100% targeting efficiency, closely followed by gRNA5, gRNA7, gRNA9, and gRNA1 (98% targeting efficiency). 
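As a minimal sketch of the two calculations described above, the Python snippet below assembles a 23-nt Cas9 target site from a 20-nt protospacer followed by the 5'-CGG-3' PAM, and computes a targeting efficiency from colony counts. The exact efficiency formula is given in the Materials and Methods, which are not reproduced here, so the expression used (percentage of colonies lost relative to the non-targeting control) is an assumption. The function names, the example protospacer, and the CFU counts are likewise hypothetical.

```python
PAM = "CGG"  # PAM placed directly downstream of each 20-nt synthetic gRNA sequence


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def make_target_site(protospacer: str) -> str:
    """Join a 20-nt protospacer and the PAM into a 23-nt Cas9 target site."""
    protospacer = protospacer.upper()
    if len(protospacer) != 20 or set(protospacer) - set("ACGT"):
        raise ValueError("protospacer must be exactly 20 nt of A/C/G/T")
    return protospacer + PAM


def targeting_efficiency(cfu_targeting: int, cfu_control: int) -> float:
    """Percent of colonies lost with a targeting gRNA versus a non-targeting control.

    Assumed formula: 100 * (1 - CFU_targeting / CFU_control).
    """
    if cfu_control <= 0:
        raise ValueError("control transformation must yield at least one colony")
    return 100.0 * (1.0 - cfu_targeting / cfu_control)


# Hypothetical protospacer and colony counts, for illustration only.
site = make_target_site("ACGTTGCAAGCTTGCATGCA")
print(len(site), gc_fraction(site[:20]))
print(round(targeting_efficiency(10, 500), 1))
```

Under this assumed definition, a gRNA that leaves few survivors relative to the control scores close to 100%, which matches the sense in which the efficiencies are reported here.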
To rule out possible off-target effects and ensure the observed targeting efficiencies represent correct targeting between the synthetic gRNA and LP1, we also performed the same gRNA-targeting assay in the wild-type strain lacking a corresponding LP1 target site (Figure 2B). While potential off-target
activity was observed for gRNA4 (12.2 ± 14.1%), gRNA5 (17.0 ± 25.0%), and gRNA6 (15.9 ± 13.9%) in the wild-type strain, these outcomes were not statistically different from zero based on a one-sample t-test (P > 0.1 for all).

Since all ten synthetic gRNAs yielded high targeting efficiencies, we next assayed integration efficiency using the selected gRNAs and a repair donor. To quantify the efficiency of integration into the ten LP1 strains, we again introduced the linear gRNA cassettes and pCAS backbone, this time including a GFP expression cassette (PTDH3-GFP-TCYC1) flanked by homology to LP1 (Figure 2A). GFP integration efficiency assayed in this manner ranged from 58–92% across all ten synthetic gRNAs (Figure 2C). The highest integration efficiencies were observed for gRNA9 (91.7 ± 4.2%) and gRNA10 (91.7 ± 4.2%), whereas integration efficiency using gRNA2 was only 58.3 ± 18.1%. To eliminate the possibility that integration occurred in the absence of a Cas9-induced double-strand DNA break (DSB), we transformed the GFP repair template without a targeting gRNA (–gRNA). Out of 72 colonies screened in this manner, cells from only one colony contained the GFP cassette integrated into LP1. Based on integration efficiency data obtained using GFP as donor, we selected gRNA7, gRNA8, gRNA9, and gRNA10 to be implemented in the LP system. All four gRNAs demonstrated near-perfect targeting efficiency and offered among the highest targeted integration efficiencies (83-92%) of the ten synthetic gRNAs.

High-level integration and expression of genetic constructs are critical criteria of the envisioned LP platform, both of which are dictated by the genomic locus where integration occurs43. In this context, we selected 16 genomic loci based on previous work42,44 in which gene expression was quantified using a gene reporter. Twelve of the sites chosen for this work contain long terminal repeats, which are removed upon LP integration, while the remaining three consist
of an inactive URA3 locus, the PDC6 locus, and the intergenic region between housekeeping genes SPB1 and PBN1. We first assayed and compared integration efficiency at the 16 loci. To perform these tests, we integrated the LP1 construct harboring target site 3 (LP1.T3) into all 16 loci, yielding 16 LP1.T3 strains. Integration efficiency at each genomic locus containing LP1.T3 was measured by introducing the LP1-GFP donor cassette with linearized gRNA3 and pCAS plasmid into each LP1.T3 strain. Efficiency of targeted GFP integration assessed in this manner varied from 64–92% across all 16 loci (Figure 3A). An integration efficiency of ≥80% was observed for 12 of the 16 loci (FgF7, 8, 11, 12, 14, 16, 18, 19, 20, 21, 24, and USERXII-1). For at least one experimental replicate, targeted integration efficiency reached 100% for sites FgF7, FgF16, FgF18, and FgF24. We next assayed gene expression from each of the 16 candidate loci by measuring GFP fluorescence from GFP integrant colonies resulting from each locus. The results showed a nearly seven-fold difference in GFP fluorescence across the 16 loci (Figure 3B). Integration of GFP at sites FgF1 and FgF11 resulted in the highest levels of fluorescence. Thirteen loci exhibited similar expression profiles and were considered good candidates for use in the LP platform. A similar lack of variation in gene expression between genomic loci has been reported using the oleaginous yeast Yarrowia lipolytica45. Consequently, we selected sites FgF7, 12, 16, 18, 19, 20, 21, 22, 24, and USERXII-1 to harbor LPs within the LP system. Despite high levels of gene expression at sites FgF1 and FgF11, these loci were not utilized in the LP system since sites with roughly comparable levels of gene expression were desired such that gene expression would be directly correlated with gene copy number. Construction and Performance of the LP System. 
Based on the results of our landing pad characterization, we chose 10 of the top performing genomic loci and four of the best synthetic Cas9 target sites to construct the LP system in yeast. The composition of the final LP platform
includes four LP motifs that were integrated into the S. cerevisiae genome at different copy numbers (LP1, LP2, LP3, and LP4) (Figure 4A). Each of the four LPs was assigned a unique synthetic Cas9 target site: LP1 contains the gRNA8 target site (LP1.T8), LP2 contains the gRNA10 target site (LP2.T10), LP3 contains the gRNA7 target site (LP3.T7), and LP4 contains the gRNA9 target site (LP4.T9). LPs were then inserted into quadruple auxotroph CEN.PK2-1D at the genomic loci assigned to each LP motif. Because some reports have indicated that introducing multiple DSBs to a single chromosome can cause genome instability and decrease cell viability46, identical LPs were integrated into sites on separate chromosomes. LP1 was integrated at FgF20; LP2 was integrated at FgF18 and FgF24; LP3 was integrated at USERXII-1, FgF7, and FgF19; and LP4 was integrated at FgF12, FgF16, FgF21 and FgF22. The resulting LP host strain was designated CEN.LP. Following integration of all four LP motifs at the assigned genomic loci, we again assessed targeting specificity of the four synthetic gRNAs to ensure that high targeting efficiency was preserved. The LP host was transformed with pCAS harboring one of the four gRNA constructs assigned to LP1, LP2, LP3, and LP4 (henceforth referred to as LPX.gRNA). All four LP.gRNA constructs displayed targeting efficiencies >95% (Figure 4B), demonstrating that efficacy of the selected gRNAs was maintained within the LP platform. Next we evaluated the utility of the system by measuring the efficiency of CRISPR-Cas9-mediated integration into LP1, LP2, LP3, and LP4, again using GFP as DNA donor. The GFP expression cassette was integrated into the chromosome in 1-4 copies by targeting each of the LP motifs in four parallel transformations. Overall, the rate of integration decreased as the number of targeted LPs increased (Figure 4C). 
Single-copy integration into LP1 showed the highest efficiency of 97% compared to 81% into LP2, 53% into LP3, and 39% into LP4. In contrast to the aforementioned
multi-copy wicket integration system25, it is noteworthy that we have not observed tandem duplication of donor DNAs using the LP system.
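The layout of the CEN.LP host described above can be written down as a small lookup table, which makes the relationship between LP motif, gRNA target site, and copy number explicit. The following Python sketch is illustrative only: the `LandingPad` class and the helper names are hypothetical, while the motif names, target-site assignments, and loci are those stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LandingPad:
    """One LP motif: its synthetic Cas9 target site and the loci that carry it."""
    name: str
    target_site: str
    loci: tuple


# Assignments as reported for the final LP platform (strain CEN.LP).
LP_LAYOUT = {
    "LP1": LandingPad("LP1", "T8", ("FgF20",)),
    "LP2": LandingPad("LP2", "T10", ("FgF18", "FgF24")),
    "LP3": LandingPad("LP3", "T7", ("USERXII-1", "FgF7", "FgF19")),
    "LP4": LandingPad("LP4", "T9", ("FgF12", "FgF16", "FgF21", "FgF22")),
}


def copy_number(lp_name: str) -> int:
    """Copies delivered when a donor is integrated at every site of one LP motif."""
    return len(LP_LAYOUT[lp_name].loci)


def all_loci() -> list:
    """Flat list of every genomic locus occupied by a landing pad."""
    return [locus for lp in LP_LAYOUT.values() for locus in lp.loci]


for name in sorted(LP_LAYOUT):
    lp = LP_LAYOUT[name]
    print(f"{name}.{lp.target_site}: {copy_number(name)} copy/copies at {', '.join(lp.loci)}")
```

A table of this kind also makes it easy to confirm that no locus is assigned to more than one motif and that the ten selected loci are all in use.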

Improving Efficiency of Multi-Copy Gene Integration. While integration into LP1 was highly efficient, multi-copy integration efficiency declined as the number of targeted LP integration sites increased. Genotyping showed that targeting of the LP2, LP3, or LP4 motifs frequently yielded partial integrant colonies in which at least one of the targeted LPs did not contain the GFP donor cassette. Sequencing of all LP3 and LP4 loci from representative partial integrant colonies revealed unmodified LP target sites, suggesting that a Cas9-mediated DSB had not been introduced to these sites, thus allowing cells to survive despite the presence of Cas9 and gRNA. To determine if gRNA or cas9 mutations were responsible for partial integration events, the pCAS plasmid was recovered and sequenced from three partial integrant colonies per LP motif. None of the recovered pCAS plasmids from partial integrant colonies contained mutations within the gRNA expression cassette or cas9 coding sequence, suggesting that the CRISPR-Cas9 system is active within these cells. Accordingly, we next investigated whether incubating cell suspensions in liquid selective media following transformation outgrowth would improve integration efficiency by providing more time for the Cas9-gRNA complex to generate DSBs at each of the targeted LP integration sites. This approach was inspired by previous reports13,15, in which continued selective pressure extends exposure to Cas9 and provides more opportunities to generate necessary DSBs. Following introduction of pCAS, synthetic gRNA, and GFP donor DNA into the LP platform, transformation cultures were first recovered overnight in nonselective medium and subsequently transferred into fresh liquid medium containing G418 selection. These cultures were then incubated for two days in liquid selection medium to select
for expression of the pCAS plasmid prior to plating. Twelve resultant colonies were then screened for integration of the GFP donor into each of the LP motifs. The additional two-day growth period in liquid selection resulted in nearly 100% integration into all LP motifs (Figure 4D). Integration into LP3 and LP4 showed dramatic improvements, achieving efficiencies of 94% for LP3 and 97% for LP4 integrations. These efficiencies of multi-loci integration are comparable or superior to other multiplexed integration strategies using CRISPR-Cas9. Integration efficiencies of LP1 and LP2 also improved, with LP2 reaching 97% efficiency for two-copy integration and LP1 achieving 100% efficiency for all experimental replicates. Prolonged expression of the Cas9-gRNA complex enriches for the targeted integration event by providing selection against unmodified cells and triggering chromosomal repair in partial integrant cells, in which newly formed DSBs are repaired by chromosomal copies of integrated DNA donor rather than the original, exogenously-supplied template47. In this manner, prolonging interrogation by the CRISPR-Cas9 system converts partial integrant cells to ones in which all targeted loci harbor a copy of the DNA donor. Although we did not establish the upper limit of copy-number integration into the LP platform, the system has the potential to facilitate integration of up to ten copies of a gene in a single transformation by employing different combinations of LPs and associated gRNAs. To further increase the efficiency of our integration strategy, we could apply other CRISPR-Cas9 modifications, such as the HI-CRISPR system15 that co-expresses an improved Cas9 variant (iCas9). It would also be advantageous to generate new recombinogenic regions and target sites upon integration into the LP system such that LP motifs can be recycled for additional use. This would extend the utility of the LP platform by tailoring expression levels of multiple genes within a pathway.
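The remark that up to ten copies could be reached by combining LP motifs follows from simple arithmetic on the motif sizes (1 + 2 + 3 + 4 = 10). The Python sketch below enumerates which motif combinations give a requested copy number. It is a hypothetical planning aid, not part of the reported workflow: combined-motif integrations were not tested here, and each motif in a combination would presumably need its own gRNA and a donor carrying the matching homology.

```python
from itertools import combinations

# Copies contributed by each LP motif in the CEN.LP host.
LP_COPIES = {"LP1": 1, "LP2": 2, "LP3": 3, "LP4": 4}


def motif_combinations(target_copies: int) -> list:
    """Return every set of LP motifs whose copy numbers sum to target_copies."""
    names = sorted(LP_COPIES)
    hits = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if sum(LP_COPIES[name] for name in combo) == target_copies:
                hits.append(combo)
    return hits


for copies in range(1, 11):
    print(copies, motif_combinations(copies))
```

Every copy number from one to ten is reachable with at least one combination, and some (for example five or six copies) can be reached in two ways, which leaves room to prefer the combination that targets fewer motifs.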

Following optimization of multi-copy integration into the LP platform, we next measured fluorescence from strains expressing GFP from each LP copy number motif. Since GFP mRNA abundance is directly proportional to the level of GFP fluorescence48, fluorescence intensity could be used as a proxy to determine whether gene expression is proportional to copy number using the LP platform. Fluorescence intensity of the four LP-GFP strains was observed to correlate with gene dosage, as higher levels of fluorescence were observed in strains containing more copies of GFP (Figure 5A). A roughly 50% increase in fluorescence was observed for each additional copy of GFP. To investigate whether multi-copy GFP expression levels are equivalent to the sum of expression levels from single-copy integrations at each locus, we summed fluorescence output obtained from single-copy GFP integrations (Figure 5B). Overall, the fluorescence generated by the sum of individual LP integrants mirrored the trend observed for strains harboring multi-copy GFP integrations. Taken together, these results demonstrate that the LP platform provides strict control of DNA copy number and can be exploited to precisely titrate gene expression in S. cerevisiae.

Application to Synthesis of BIAs. Having established multi-copy integration of GFP using the LP platform, we next sought to apply our methodology to a target metabolic pathway bottleneck. In this regard, we targeted production of (S)-norcoclaurine, an early precursor to all BIAs49. (S)-Norcoclaurine is formed by condensation of 4-HPAA and dopamine, a reaction catalyzed by the enzyme NCS (Figure 6A). NCS is one of the least efficient enzymes in the BIA biosynthetic pathway5,31. To increase titers of (S)-norcoclaurine, we focused on improving conversion efficiency of the NCS-catalyzed reaction.
Using the LP platform, we sought to: 1) identify NCS variants showing improved enzymatic properties, and 2) test whether increasing NCS gene copy number produces higher titers of (S)-norcoclaurine in yeast. 4-HPAA is formed
endogenously in S. cerevisiae from 4-hydroxyphenylpyruvate (4-HPP), an intermediate in tyrosine catabolism50, whereas synthesis of dopamine is dependent on introduction of heterologous enzymes into the cell (Figure 6A). Formation of dopamine from tyrosine is a two-step process involving hydroxylation and decarboxylation, with L-DOPA as the intermediate31. To impart dopamine biosynthetic capabilities to our LP strain and provide a background for (S)-norcoclaurine formation, we first introduced a dopamine production pathway to the LP host strain31. The resulting dopamine-producing LP strain was designated CEN.LP.D. After 48 h growth in SC medium containing 4% glucose, millimolar concentrations of dopamine were detected from the supernatant of CEN.LP.D growing cultures (data not shown).

The selected NCS candidates were part of a library of enzymes identified by querying the PhytoMetaSyn Project database51. The NCS collection includes 10 orthologs from the plant families Papaveraceae, Ranunculaceae, Berberidaceae, and Menispermaceae. Half of the NCS candidates have been expressed in yeast, including PsNCS1 and PsNCS3 from Papaver somniferum, NdNCS (Nandina domestica), TcNCS (Tinospora cordifolia), and TfNCS (Thalictrum flavum)31,40. To our knowledge the remaining NCS candidates have not been tested for activity in yeast and include AmNCS (Argemone mexicana), EcNCS (Eschscholzia californica), PsNCS2 (P. somniferum), ScNCS (Sanguinaria canadensis), and SdNCS (Stylophorum diphyllum). To enhance enzyme expression levels, the NCS sequences were codon optimized for expression in yeast and cloned into the pBOT-HIS vector system52 between the PTEF1 and TPGI1 regulatory elements. One to four copies of each NCS variant were integrated into the dopamine-producing LP host by introducing the LPX-NCS donors with pCAS and the associated LPX.gRNA. Transformants were plated on selective medium and screened for complete integration into each LP motif.
The assay produced a total of 40 NCS strains
representing each variant integrated into S. cerevisiae genome in 1-4 copies. The resulting NCS variant and copy number library was assayed for de novo synthesis of (S)-norcoclaurine. Since racemic norcoclaurine has been known to form spontaneously32, the background strain lacking NCS was included to determine the rate of spontaneous production. After 96 h cultivation, LC/MS was used to measure the concentration of (S)-norcoclaurine in culture supernatants, as nearly all (S)-norcoclaurine synthesized by S. cerevisiae is secreted by the cell31. In the absence of an NCS enzyme, the CEN.LP.D strain did not produce detectable levels of (S)-norcoclaurine, suggesting that all (S)-norcoclaurine detected in cultures containing an NCS was produced enzymatically. Furthermore, enzymatic condensation of 4-HPAA and dopamine is enantioselective, therefore norcoclaurine produced by strains expressing NCS is expected to represent (S)-norcoclaurine53. Of the ten NCS variants, six produced detectable levels of (S)norcoclaurine (EcNCS, NdNCS, PsNCS3, ScNCS, SdNCS, and TfNCS) (Figure 6B). The best producers were strains containing ScNCS and NdNCS, which yielded 60 μg/L and 51 μg/L, respectively, for single-copy integrations. Generally, increasing NCS copy number improved production of (S)-norcoclaurine in strains harboring an active NCS variant, though the degree of improvement varied between the NCS candidates. Strains expressing four copies of EcNCS, NdNCS, PsNCS3, or ScNCS produced two-fold more (S)-norcoclaurine compared to the respective single-copy strains. Interestingly, not all functional NCS variants responded to increasing gene copies. For instance, modulating gene dosage of SdNSC or TfNCS did not affect production of (S)-norcoclaurine, which remained stable between copy number variants (Figure 6B). These variants contain highly hydrophobic C termini not found in the other variants, which may lead to insertion of these variants into a cellular membrane. 
It is possible that such a membrane becomes saturated upon expression of only one copy of TfNCS or SdNCS. Differences in protein translation or folding may also place constraints on enzyme levels27,54. In contrast, expressing additional copies of the most active NdNCS and ScNCS orthologs further increased production of (S)-norcoclaurine to roughly 130 µg/L. This titer exceeds early reports of (S)-norcoclaurine synthesis in yeast and demonstrates that increasing NCS copy number is an effective strategy for enhancing product formation and entry into the BIA biosynthetic pathway.
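As a rough illustration of the numbers quoted above, the sketch below enumerates the variant-by-copy-number library and computes the fold change implied by the reported titers. The strain labels (for example `ScNCS_x1`) and the helper `fold_change` are hypothetical names introduced here, and the multi-copy titer is only the approximate "roughly 130 µg/L" figure from the text, not a per-strain measurement.

```python
from itertools import product

# The ten NCS orthologs named in the text and the copy numbers tested.
variants = ["AmNCS", "EcNCS", "NdNCS", "PsNCS1", "PsNCS2",
            "PsNCS3", "ScNCS", "SdNCS", "TcNCS", "TfNCS"]
copy_numbers = [1, 2, 3, 4]

# Hypothetical strain labels: one strain per (variant, copy number) pair.
library = [f"{v}_x{c}" for v, c in product(variants, copy_numbers)]
print(len(library))  # 40


def fold_change(multi_copy_titer, single_copy_titer):
    """Ratio of a multi-copy titer to the single-copy titer."""
    return multi_copy_titer / single_copy_titer


# Single-copy titers (ug/L) quoted in the text; 130 ug/L is approximate.
single_copy = {"ScNCS": 60.0, "NdNCS": 51.0}
for name, titer in single_copy.items():
    print(f"{name}: {fold_change(130.0, titer):.1f}-fold")
# ScNCS: 2.2-fold
# NdNCS: 2.5-fold
```

Both ratios are in line with the roughly two-fold improvement described for the most active orthologs.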

MATERIALS AND METHODS

Strains, Cultivation, and DNA Manipulation. Strains employed in this study are listed in Table S1. S. cerevisiae CEN.PK2-1D (MATα; his3-Δ1; leu2-3,112; ura3-52; trp1-289; MAL2-8c; SUC2) was used as the basis for construction of the LP system. Yeast cultures were grown in YPD (10 g/L yeast extract, 20 g/L tryptone, 20 g/L dextrose), YPS (10 g/L yeast extract, 20 g/L peptone, 40 g/L sucrose), or synthetic complete (SC) medium (6.8 g/L Yeast Nitrogen Base without amino acids, 1.92 g/L Synthetic Drop-out Medium Supplement without histidine, 0.76 mg/L L-histidine, 40 g/L glucose). When appropriate, 200 µg/ml of geneticin (G418) was added to media for selection of the pCAS plasmid. Escherichia coli cultures were cultivated in Lysogeny Broth (LB) supplemented with 50 µg/ml kanamycin or 100 μg/ml ampicillin where applicable and grown overnight at 37 °C with shaking at 200 rpm. Plasmids used in this study are listed in Table S2 and were maintained in E. coli DH5α. The pCAS plasmid was purchased from Addgene (Plasmid #60847)20. Selected constructs were cloned into pJet1.2 using the CloneJet PCR cloning kit (Thermo Fisher Scientific). The GFP expression cassette was amplified from pGREG503 (ref 55). Genes required for dopamine synthesis were amplified from the yeast integrative plasmid pWCD2249 (Genbank KR232306.1)31. NCS variants were synthesized and cloned into pBOT-HIS52. Plasmids were purified from E. coli
using the GeneJET plasmid mini prep kit (Thermo Fisher Scientific). Oligonucleotides used in this study are listed in Table S3. Synthetic LPs and genes are listed in Table S4 and were assembled into expression cassettes using the promoters and terminators listed in Table S5. Synthetic LPs were designed using the random DNA sequence generator FaBox56 with 50% GC content and were queried against the S. cerevisiae genome to ensure that no sequence similarity was identified. DNA constructs were amplified by PCR using Phusion High-Fidelity DNA polymerase (Thermo Fisher Scientific). DNA fragments were purified using the GeneJET Gel Extraction Kit (Thermo Fisher Scientific). Synthetic gRNA targeting sequences are listed in Table S6. The pCAS vector backbone and gRNA expression cassettes were amplified with overlapping homology regions and assembled in yeast18. To program new gRNA targeting sequences, the gRNA expression cassette was amplified in two universal parts, Left (tRNATyr3'HDV) and Right (scaffold-TSNR52), using primers that insert the new overlapping gRNA target sequence. The control –gRNA expression cassette was generated by fusing the left and right gRNA fragments together without a 20-nt targeting region.

S. cerevisiae was transformed using a modified protocol18,57. For transformations requiring an extended outgrowth in selective medium, recovered cells were diluted 1:100 into fresh YPD+G418 medium and grown for 48 h at 30 °C with shaking before plating on YPD+G418 medium. All genomic integrations were performed using the CRISPR-Cas9 delivery system developed by Ryan et al.20. The pCAS plasmid encodes the kanMX marker for selection in media containing G418. Between 1 and 4 µg of donor DNA was used for CRISPR-Cas9-mediated integration.

The FgF20-LP1.TX constructs (×10) were integrated separately into CEN.PK2-1D at the FgF20 locus by introducing pCAS and FgF20 gRNA along with the corresponding LP1.TX donor. Targeting efficiency was calculated according to the equation below, where +gRNA represents a synthetic targeting gRNA and –gRNA represents a control non-targeting gRNA cassette:

$$\text{Targeting efficiency (\%)} = \left(1 - \frac{\mathrm{CFU}_{(+)\mathrm{gRNA}}}{\mathrm{CFU}_{(-)\mathrm{gRNA}}}\right) \times 100$$
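The targeting efficiency calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function name `targeting_efficiency` and the colony counts are hypothetical, chosen only to show how the CFU ratio maps to a percentage.

```python
def targeting_efficiency(cfu_plus_grna, cfu_minus_grna):
    """Percent targeting efficiency: (1 - CFU(+gRNA) / CFU(-gRNA)) * 100.

    cfu_plus_grna  -- colonies recovered with the targeting (+gRNA) cassette
    cfu_minus_grna -- colonies recovered with the non-targeting (-gRNA) control
    """
    if cfu_minus_grna <= 0:
        raise ValueError("the -gRNA control must yield at least one colony")
    return (1.0 - cfu_plus_grna / cfu_minus_grna) * 100.0


# Hypothetical colony counts, for illustration only.
print(targeting_efficiency(cfu_plus_grna=25, cfu_minus_grna=100))  # 75.0
```

Few surviving colonies with the targeting gRNA relative to the control give a value close to 100%, which indicates efficient Cas9 cleavage at the target site.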

Integration efficiency was measured by introducing linearized gRNAX and pCAS vector backbone with the LP1.GFP donor. To compare integration efficiency and gene expression at various genomic loci, we introduced the LP1.T3 construct at 16 genomic loci42,44. Sixteen LP1.T3 donors were generated by fusing ~500 bp homology arms for each genomic locus. LP1.T3 constructs were introduced into the CEN.PK2-1D strain at the corresponding locus by co-transforming pCAS with a locus-specific gRNA. Integration efficiency into LP1.T3 at each genomic locus was measured by introducing linearized gRNA3 and pCAS vector backbone with the LP1.GFP donor into each LP1.T3 strain. Full-length LPX.TX constructs were generated by PCR using primers to introduce target sites to the LPX.A and LPX.Z recombinogenic regions. The LPX donor constructs were generated by PCR using primers to attach ~60 bp homology to the assigned genomic loci. Donor constructs were: LP1.T8 with homology to site FgF20; LP2.T10 donors with homology to sites FgF18 and FgF24; LP3.T7 donors with homology to sites FgF7, FgF19, and USERXII-1; LP4.T9 donors with homology to sites FgF12, FgF16, FgF21, and FgF22. The LP platform was constructed by integrating one or two LP donor constructs in successive transformations. Integration efficiency into each LP motif was measured by co-transforming linearized LPX.gRNA (400-800 ng) and pCAS (250 ng) vector backbone with LPX.GFP donor constructs (1-4 µg): LP1.GFP donor was co-transformed with gRNA8; LP2.GFP donor was co-transformed with gRNA10; LP3.GFP donor was co-transformed with gRNA7; and LP4.GFP donor was co-transformed with gRNA9. Integration efficiency was calculated by genotyping a minimum of 12 colonies per transformation.

To assay NCS enzyme variants, a three-gene dopamine production cassette (PTDH3-CYP76AD1W13L/F309L-TTDH1-PCCW12-DODC-TADH1-PPGK1-ARO4FBR-TPGK1) was introduced into the LP strain to supply dopamine for de novo synthesis of (S)-norcoclaurine31. The cassette was amplified from pWCD2249 (ref 31) in two overlapping fragments along with flanking homology to the ARO4 locus. The four donor fragments were co-transformed with pCAS vector backbone and a gRNA targeting ARO4. The NCS enzyme library used in this study was curated by querying the PhytoMetaSyn transcriptome database (http://www.phytometasyn.net)51. NCS nucleotide sequences were codon-optimized for expression in S. cerevisiae. NCS genes were synthesized by Gen9 and cloned into the pBOT-HIS expression vector between the TEF1 promoter and the GFP fusion tag52. The GFP tag was subsequently removed by restriction enzyme digestion with KasI.
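The genotyping-based integration efficiency described in this section can be illustrated with a short Python sketch. It assumes that efficiency is the percentage of genotyped colonies carrying the expected integration, summarized as mean ± s.d. over three separate assays as in the figure captions. The function name `integration_efficiency` and the colony counts are hypothetical.

```python
from statistics import mean, stdev


def integration_efficiency(correct, genotyped, minimum=12):
    """Percent of genotyped colonies that carry the expected integration."""
    if genotyped < minimum:
        raise ValueError(f"genotype at least {minimum} colonies per transformation")
    if not 0 <= correct <= genotyped:
        raise ValueError("correct must lie between 0 and genotyped")
    return 100.0 * correct / genotyped


# Hypothetical results from three separate assays: (correct, genotyped).
assays = [(12, 12), (9, 12), (6, 12)]
efficiencies = [integration_efficiency(c, n) for c, n in assays]

print(efficiencies)                             # [100.0, 75.0, 50.0]
print(mean(efficiencies), stdev(efficiencies))  # 75.0 25.0
```

The guard on `minimum` mirrors the stated requirement of at least 12 genotyped colonies per transformation.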

Metabolite Analysis and Analytical Methods. Fluorescence levels were measured for strains expressing GFP at different genomic loci and copy numbers by cultivating strains overnight in SC with 2% glucose. Overnight cultures were then diluted 10-fold into fresh medium and incubated for an additional 4 h to obtain log-phase cells. Fluorescence was measured using the M200 plate reader (Tecan) with an excitation wavelength of 485 nm and an emission wavelength of 525 nm. Gain was adjusted manually for each sample. The background strain lacking GFP was used to correct for autofluorescence.

Dopamine- and (S)-norcoclaurine-producing colonies were picked in triplicate and inoculated into 400 μl of YPS in 2 ml deep-well microtiter plates. Cultures were incubated at 30 °C with shaking at 350 rpm for 16 h. Overnight cultures were then diluted to an initial OD600 of 0.3 in 900 μl of YPS. After 96 h, cultures were centrifuged at 4,000 rpm for 5 minutes, and the supernatant was collected and suspended 1:1 in 60% acetonitrile + 0.2% formic acid (final concentration 30% acetonitrile + 0.1% formic acid) for LC/MS analysis. Dopamine and (S)-norcoclaurine were analyzed using the 1290 Infinity II LC system (Agilent Technologies) with a Zorbax Eclipse Plus C18 50 × 4.6 mm column (Agilent Technologies). Solvent A (100% water, 0.1% formic acid) and solvent B (100% acetonitrile, 0.1% formic acid) were used in a gradient elution to separate metabolites. Samples were separated using a linear gradient: 0-5 min 98% A/2% B, 5-7 min 90% A/10% B, 7-7.1 min 15% A/85% B at a flow rate of 0.3 ml/min, followed by a 3 min equilibration at 100% A at a flow rate of 0.4 ml/min31. Following LC separation, eluent was injected into a 6560 Ion Mobility Q-TOF LC/MS (Agilent Technologies). The system was operated in positive electrospray (ESI+) mode using the following parameters: capillary voltage 4000 V; fragmentor voltage 400 V; source temperature 325 °C; nebulizer pressure 55 psig; gas flow 10 L/min. Dopamine and (S)-norcoclaurine standards (Toronto Research Chemicals Inc.) were used to determine retention times and generate calibration curves. LC/MS spectra were analyzed using MassHunter quantitative analysis software (Agilent Technologies).
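Quantification against a calibration curve, followed by correction for the 1:1 dilution in acetonitrile, can be sketched as below. This is a simplified stand-in for the MassHunter workflow, not a description of it: it assumes a linear detector response and standards prepared undiluted, and the helper names `fit_line` and `supernatant_conc` as well as all standard and sample values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def supernatant_conc(peak_area, slope, intercept, dilution_factor=2.0):
    """Concentration in the culture supernatant.

    The injected sample is the supernatant mixed 1:1 with solvent, so the
    concentration read off the curve is multiplied by a dilution factor of 2.
    """
    return (peak_area - intercept) / slope * dilution_factor


# Hypothetical standards: concentration (ug/L) vs. peak area (arbitrary units).
std_conc = [0.0, 25.0, 50.0, 100.0]
std_area = [0.0, 500.0, 1000.0, 2000.0]
slope, intercept = fit_line(std_conc, std_area)

print(slope, intercept)                           # 20.0 0.0
print(supernatant_conc(600.0, slope, intercept))  # 60.0
```

In this example a peak area of 600 corresponds to 30 µg/L in the injected sample, and therefore 60 µg/L in the original supernatant.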

ASSOCIATED CONTENT Supporting Information Supporting Tables (PDF)

ABBREVIATIONS

BIA, benzylisoquinoline alkaloid; DSB, double strand DNA break; 4-HPAA, 4-hydroxyphenylacetaldehyde; 4-HPP, 4-hydroxyphenylpyruvate; LP, landing pad; NCS, norcoclaurine synthase; PAM, protospacer-adjacent motif

AUTHOR INFORMATION

a Department of Biology, Concordia University, Montréal, Québec, Canada;

b Centre for Applied Synthetic Biology, Concordia University, Montréal, Québec, Canada

*E-mail: [email protected]

Author Contributions V.J.J.M. conceived of the study and participated in its design and coordination. L.B. and M.E.P. participated in the design of the study and performed all experiments. All authors helped draft the manuscript and have read and approved the final version.

Notes The authors declare no competing financial interest.

ACKNOWLEDGMENTS

This work was supported financially by an NSERC-Industrial Biocatalysis Network (IBN) grant, an NSERC Discovery grant, and an NSERC Strategic grant. M.E.P. was supported by an NSERC postdoctoral fellowship and L.B. was supported by an NSER Concordia graduate scholarship. V.J.J.M. is supported by a Concordia University Research Chair.

FIGURE CAPTIONS

Figure 1. Design of the LP system in S. cerevisiae. (A) LPs consist of modular parts: a Cas9 target site and LP.A and LP.Z recombinogenic regions for gene integration via HDR. (B) Four distinct LPs (LP1, LP2, LP3, LP4) are integrated into ten different genomic loci. LP number corresponds to the number of copies of each LP in the genome.

Figure 2. Targeting specificity of ten synthetic gRNA candidates. (A) Schematic overview of synthetic gRNA efficacy tests. i) Synthetic gRNA targeting efficiency tests were performed by introducing each synthetic gRNA into the complementary LP1.TX target strain. ii) Targeted integration efficiency was evaluated by introducing each synthetic gRNA with a linear LP1-GFP donor cassette into the corresponding LP1.TX strain. (B) Synthetic gRNA targeting efficiency. Comparison of targeting efficiency for each synthetic gRNA in wt (–) versus the appropriate LP1.TX strain (+). (C) Synthetic gRNA targeted integration efficiency. Integration efficiency from each synthetic gRNA was assayed using an LP1-GFP donor cassette and compared against a non-targeting –gRNA control (–). Integration efficiency was determined by genotyping at least 12 selected colonies. Error bars represent the mean ± s.d. of three separate assays. wt: CEN.PK2-1D.

Figure 3. Evaluation of selected genomic loci. (A) Integration efficiency at candidate genomic loci. (B) Gene expression at candidate loci. Fluorescence measurements were corrected against autofluorescence of the wt control (–) and normalized against OD600. Error bars represent the mean ± s.d. of three biological replicates.


Figure 4. Overview and evaluation of the yeast LP system. (A) Design of the LP system. (B) Targeting efficiency of synthetic gRNAs in the LP system. Targeting efficiency was evaluated by comparing the number of transformants following introduction of the appropriate synthetic gRNA against a non-targeting -gRNA control. (C) Integration efficiency into the LP motifs using the unmodified protocol. A GFP expression cassette flanked by LP homology arms was used as donor and was introduced with each LPX.gRNA into the LP strain. Transformation cultures were plated on selective medium directly following recovery for 16 h. (D) Integration efficiency into LP motifs after an additional two-day incubation in selective liquid medium prior to plating. Error bars represent the mean ± s.d. of three separate assays.

Figure 5. GFP expression analysis. (A) Fluorescence of LP platform strains expressing 1-4 copies of GFP. Error bars represent the mean ± s.d. of three biological replicates. (B) Sum of fluorescence of single-copy GFP integrations at each LP motif locus. Total fluorescence of LP2, LP3, and LP4 motifs was quantified by summing fluorescence of individual strains containing single-site GFP insertions at loci 18+24 (LP2), XII+7+19 (LP3), and 12+16+21+22 (LP4). All fluorescence measurements were corrected against the mean fluorescence of the wt (–) control.

Figure 6. Application to NCS. (A) Heterologous de novo pathway of BIA biosynthesis in S. cerevisiae. Feedback-resistant Aro4 (Aro4FBR; grey text) increases production of tyrosine. Heterologous enzymes are shown in colored text. Blue and pink text indicate enzymes involved in the synthesis of dopamine and (S)-norcoclaurine, respectively. Examples of major downstream BIA families are shown in the grey box. Hatched arrows indicate multiple enzymatic steps. Abbreviations: CYP76AD1, tyrosine hydroxylase; DODC, L-DOPA decarboxylase; NCS, norcoclaurine synthase; 4-HPAA, 4-hydroxyphenylacetaldehyde; 4-HPP, 4-hydroxyphenylpyruvate. (B) LC/MS analysis of (S)-norcoclaurine in the supernatant of cultures expressing an NCS variant in 1–4 copies. Error bars represent the mean ± s.d. of three biological replicates.

REFERENCES

1. Nielsen, J., and Keasling, J. D. (2016) Engineering cellular metabolism, Cell 164, 1185-1197.

2. Jensen, M. K., and Keasling, J. D. (2014) Recent applications of synthetic biology tools for yeast metabolic engineering, FEMS Yeast Res 15, 1-11.

3. Jones, J. A., Toparlak, Ö. D., and Koffas, M. A. (2015) Metabolic pathway balancing and its role in the production of biofuels and chemicals, Curr. Opin. Biotechnol. 33, 52-59.

4. Lian, J., Mishra, S., and Zhao, H. (In press) Recent advances in metabolic engineering of Saccharomyces cerevisiae: New tools and their applications, Metab. Eng.

5. Narcross, L., Fossati, E., Bourgeois, L., Dueber, J. E., and Martin, V. J. (2016) Microbial factories for the production of benzylisoquinoline alkaloids, Trends Biotechnol. 34, 228-241.

6. Narcross, L., Bourgeois, L., Fossati, E., Burton, E., and Martin, V. J. (2016) Mining enzyme diversity of transcriptome libraries through DNA synthesis for benzylisoquinoline alkaloid pathway optimization in yeast, ACS Synth Biol. 5, 1505-1518.

7. Redden, H., Morse, N., and Alper, H. S. (2015) The synthetic biology toolbox for tuning gene expression in yeast, FEMS Yeast Res 15, 1-10.

8. Lee, M. E., DeLoache, W. C., Cervantes, B., and Dueber, J. E. (2015) A highly characterized yeast toolkit for modular, multipart assembly, ACS Synth Biol. 4, 975-986.

9. Reider Apel, A., d'Espaux, L., Wehrs, M., Sachs, D., Li, R. A., Tong, G. J., Garber, M., Nnadi, O., Zhuang, W., and Hillson, N. J. (2016) A Cas9-based toolkit to program gene expression in Saccharomyces cerevisiae, Nucleic Acids Res. 45, 496-508.

10. Sun, J., Shao, Z., Zhao, H., Nair, N., Wen, F., Xu, J. H., and Zhao, H. (2012) Cloning and characterization of a panel of constitutive promoters for applications in pathway engineering in Saccharomyces cerevisiae, Biotechnol. Bioeng. 109, 2082-2092.

11. Yamanishi, M., Ito, Y., Kintaka, R., Imamura, C., Katahira, S., Ikeuchi, A., Moriya, H., and Matsuyama, T. (2013) A genome-wide activity assessment of terminator regions in Saccharomyces cerevisiae provides a ″terminatome″ toolbox, ACS Synth Biol. 2, 337-347.

12. Clare, J., Rayment, F., Ballantine, S., Sreekrishna, K., and Romanos, M. (1991) High-level expression of tetanus toxin fragment C in Pichia pastoris strains containing multiple tandem integrations of the gene, Nat. Biotechnol. 9, 455-460.

13. Shi, S., Liang, Y., Zhang, M. M., Ang, E. L., and Zhao, H. (2016) A highly efficient single-step, markerless strategy for multi-copy chromosomal integration of large biochemical pathways in Saccharomyces cerevisiae, Metab. Eng. 33, 19-27.

14. Tyo, K. E., Ajikumar, P. K., and Stephanopoulos, G. (2009) Stabilized gene duplication enables long-term selection-free heterologous pathway expression, Nat. Biotechnol. 27, 760-765.

15. Bao, Z., Xiao, H., Liang, J., Zhang, L., Xiong, X., Sun, N., Si, T., and Zhao, H. (2014) Homology-integrated CRISPR–Cas (HI-CRISPR) system for one-step multigene disruption in Saccharomyces cerevisiae, ACS Synth Biol. 4, 585-594.

16. DiCarlo, J. E., Norville, J. E., Mali, P., Rios, X., Aach, J., and Church, G. M. (2013) Genome engineering in Saccharomyces cerevisiae using CRISPR-Cas systems, Nucleic Acids Res. 41, 4336-4343.

17. Jakočiūnas, T., Bonde, I., Herrgård, M., Harrison, S. J., Kristensen, M., Pedersen, L. E., Jensen, M. K., and Keasling, J. D. (2015) Multiplex metabolic pathway engineering using CRISPR/Cas9 in Saccharomyces cerevisiae, Metab. Eng. 28, 213-222.

18. Horwitz, A. A., Walter, J. M., Schubert, M. G., Kung, S. H., Hawkins, K., Platt, D. M., Hernday, A. D., Mahatdejkul-Meadows, T., Szeto, W., and Chandran, S. S. (2015) Efficient multiplexed integration of synergistic alleles and metabolic pathways in yeasts via CRISPR-Cas, Cell Systems 1, 88-96.

19. Ronda, C., Maury, J., Jakočiūnas, T., Jacobsen, S. A. B., Germann, S. M., Harrison, S. J., Borodina, I., Keasling, J. D., Jensen, M. K., and Nielsen, A. T. (2015) CrEdit: CRISPR mediated multi-loci gene integration in Saccharomyces cerevisiae, Microb. Cell Fact. 14, 97.

20. Ryan, O. W., Skerker, J. M., Maurer, M. J., Li, X., Tsai, J. C., Poddar, S., Lee, M. E., DeLoache, W., Dueber, J. E., and Arkin, A. P. (2014) Selection of chromosomal DNA libraries using a multiplex CRISPR system, Elife 3, e03703.

21. Jakočiūnas, T., Rajkumar, A. S., Zhang, J., Arsovska, D., Rodriguez, A., Jendresen, C. B., Skjødt, M. L., Nielsen, A. T., Borodina, I., and Jensen, M. K. (2015) CasEMBLR: Cas9-facilitated multiloci genomic integration of in vivo assembled DNA parts in Saccharomyces cerevisiae, ACS Synth Biol. 4, 1226-1234.

22. Doench, J. G., Fusi, N., Sullender, M., Hegde, M., Vaimberg, E. W., Donovan, K. F., Smith, I., Tothova, Z., Wilen, C., and Orchard, R. (2016) Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9, Nat. Biotechnol. 34, 184-191.

23. Labuhn, M., Adams, F. F., Ng, M., Knoess, S., Schambach, A., Charpentier, E. M., Schwarzer, A., Mateo, J. L., Klusmann, J.-H., and Heckl, D. (2017) Refined sgRNA efficacy prediction improves large- and small-scale CRISPR–Cas9 applications, Nucleic Acids Res. 46, 1375-1385.

24. Shen, J. P., Zhao, D., Sasik, R., Luebeck, J., Birmingham, A., Bojorquez-Gomez, A., Licon, K., Klepper, K., Pekin, D., and Beckett, A. N. (2017) Combinatorial CRISPR–Cas9 screens for de novo mapping of genetic interactions, Nat. Methods 14, 573-576.

25. Hou, S., Qin, Q., and Dai, J. (2018) Wicket: A versatile tool for the integration and optimization of exogenous pathways in Saccharomyces cerevisiae, ACS Synth Biol. 7, 782-788.

26. Cameron, J. R., Loh, E. Y., and Davis, R. W. (1979) Evidence for transposition of dispersed repetitive DNA families in yeast, Cell 16, 739-751.

27. Hohenblum, H., Gasser, B., Maurer, M., Borth, N., and Mattanovich, D. (2004) Effects of gene dosage, promoters, and substrates on unfolded protein stress of recombinant Pichia pastoris, Biotechnol. Bioeng. 85, 367-375.

28. Xie, W., Lv, X., Ye, L., Zhou, P., and Yu, H. (2015) Construction of lycopene-overproducing Saccharomyces cerevisiae by combining directed evolution and metabolic engineering, Metab. Eng. 30, 69-78.

29. Lee, W., and DaSilva, N. A. (2006) Application of sequential integration for metabolic engineering of 1,2-propanediol production in yeast, Metab. Eng. 8, 58-65.

30. Hagel, J. M., and Facchini, P. J. (2013) Benzylisoquinoline alkaloid metabolism–a century of discovery and a brave new world, Plant Cell Physiol. 54, 647-672.

31. DeLoache, W. C., Russ, Z. N., Narcross, L., Gonzales, A. M., Martin, V. J., and Dueber, J. E. (2015) An enzyme-coupled biosensor enables (S)-reticuline production in yeast from glucose, Nat. Chem. Biol. 11, 465-471.

32. Trenchard, I. J., Siddiqui, M. S., Thodey, K., and Smolke, C. D. (2015) De novo production of the key branch point benzylisoquinoline alkaloid reticuline in yeast, Metab. Eng. 31, 74-83.

33. Fossati, E., Ekins, A., Narcross, L., Zhu, Y., Falgueyret, J.-P., Beaudoin, G. A., Facchini, P. J., and Martin, V. J. (2014) Reconstitution of a 10-gene pathway for synthesis of the plant alkaloid dihydrosanguinarine in Saccharomyces cerevisiae, Nat. Commun. 5, 3283.

34. Li, Y., and Smolke, C. D. (2016) Engineering biosynthesis of the anticancer alkaloid noscapine in yeast, Nat. Commun. 7, 12137.

35. Fossati, E., Narcross, L., Ekins, A., Falgueyret, J.-P., and Martin, V. J. (2015) Synthesis of morphinan alkaloids in Saccharomyces cerevisiae, PloS One 10, e0124459.

36. Galanie, S., Thodey, K., Trenchard, I. J., Interrante, M. F., and Smolke, C. D. (2015) Complete biosynthesis of opioids in yeast, Science 349, 1095-1100.

37. Samanani, N., and Facchini, P. J. (2001) Isolation and partial characterization of norcoclaurine synthase, the first committed step in benzylisoquinoline alkaloid biosynthesis, from opium poppy, Planta 213, 898-906.

38. Samanani, N., and Facchini, P. J. (2002) Purification and characterization of norcoclaurine synthase. The first committed enzyme in benzylisoquinoline alkaloid biosynthesis in plants, J. Biol. Chem. 277, 33878-33883.

39. Samanani, N., Liscombe, D. K., and Facchini, P. J. (2004) Molecular cloning and characterization of norcoclaurine synthase, an enzyme catalyzing the first committed step in benzylisoquinoline alkaloid biosynthesis, Plant J. 40, 302-313.

40. Li, J., Lee, E.-J., Chang, L., and Facchini, P. J. (2016) Genes encoding norcoclaurine synthase occur as tandem fusions in the Papaveraceae, Sci. Rep. 6, 39256.

41. Lichman, B. R., Gershater, M. C., Lamming, E. D., Pesnot, T., Sula, A., Keep, N. H., Hailes, H. C., and Ward, J. M. (2015) 'Dopamine-first' mechanism enables the rational engineering of the norcoclaurine synthase aldehyde activity profile, The FEBS journal 282, 1137-1151.

42. Bai Flagfeldt, D., Siewers, V., Huang, L., and Nielsen, J. (2009) Characterization of chromosomal integration sites for heterologous gene expression in Saccharomyces cerevisiae, Yeast 26, 545-551.

43. Thompson, A., and Gasson, M. J. (2001) Location effects of a reporter gene on expression levels and on native protein synthesis in Lactococcus lactis and Saccharomyces cerevisiae, Appl. Environ. Microbiol. 67, 3434-3439.

44. Mikkelsen, M. D., Buron, L. D., Salomonsen, B., Olsen, C. E., Hansen, B. G., Mortensen, U. H., and Halkier, B. A. (2012) Microbial production of indolylglucosinolate through engineering of a multi-gene pathway in a versatile yeast expression platform, Metab. Eng. 14, 104-111.

45. Schwartz, C., Shabbir-Hussain, M., Frogue, K., Blenner, M., and Wheeldon, I. (2016) Standardized markerless gene integration for pathway engineering in Yarrowia lipolytica, ACS Synth Biol. 6, 402-409.

46. Jessop-Fabre, M. M., Jakočiūnas, T., Stovicek, V., Dai, Z., Jensen, M. K., Keasling, J. D., and Borodina, I. (2016) EasyClone-MarkerFree: A vector toolkit for marker-less integration of genes into Saccharomyces cerevisiae via CRISPR-Cas9, Biotechnology journal 11, 1110-1117.

47. Tsaponina, O., and Haber, J. E. (2014) Frequent interchromosomal template switches during gene conversion in S. cerevisiae, Mol. Cell 55, 615-625.

48. Soboleski, M. R., Oaks, J., and Halford, W. P. (2005) Green fluorescent protein is a quantitative reporter of gene expression in individual eukaryotic cells, The FASEB journal 19, 440-442.

49.

Stadler, R., Kutchan, T. M., and Zenk, M. H. (1989) (S)-Norcoclaurine is the central intermediate in benzylisoquinoline alkaloid biosynthesis, Phytochemistry 28, 1083-1086.

50.

Hazelwood, L. A., Daran, J.-M., van Maris, A. J., Pronk, J. T., and Dickinson, J. R. (2008) The Ehrlich pathway for fusel alcohol production: a century of research on Saccharomyces cerevisiae metabolism, Appl. Environ. Microbiol. 74, 2259-2266.

51.

Xiao, M., Zhang, Y., Chen, X., Lee, E.-J., Barber, C. J., Chakrabarty, R., DesgagnéPenix, I., Haslam, T. M., Kim, Y.-B., and Liu, E. (2013) Transcriptome analysis based on next-generation sequencing of non-model plants producing specialized metabolites of biotechnological interest, J. Biotechnol. 166, 122-134.

52.

Pyne, M., Narcross, L., Fossati, E., Bourgeois, L., Burton, E., Gold, N., and Martin, V. (2016) Reconstituting plant secondary metabolism in Saccharomyces cerevisiae for production of high-value benzylisoquinoline alkaloids, Methods Enzymol. 575, 195-224.

53.

Minami, H., Dubouzet, E., Iwasa, K., and Sato, F. (2007) Functional analysis of norcoclaurine synthase in Coptis japonica, J. Biol. Chem. 282, 6274-6282.

54.

Shi, S., Valle-Rodríguez, J. O., Siewers, V., and Nielsen, J. (2014) Engineering of chromosomal wax ester synthase integrated Saccharomyces cerevisiae mutants for improved biosynthesis of fatty acid ethyl esters, Biotechnol. Bioeng. 111, 1740-1747.

55.

Jansen, G., Wu, C., Schade, B., Thomas, D. Y., and Whiteway, M. (2005) Drag&Drop cloning in yeast, Gene 344, 43-51.

56.

Villesen, P. (2007) FaBox: an online toolbox for FASTA sequences, Mol. Ecol. Notes 7, 965-968.


57.


Gietz, R. D., and Schiestl, R. H. (2007) High-efficiency yeast transformation using the LiAc/SS carrier DNA/PEG method, Nat. Protoc. 2, 31-34.


[Graphical abstract: Landing Pad Platform. CRISPR-Cas9-mediated gene copy number integration. Labels shown: landing pads LP1 to LP4 (1 to 4 loci), target sites T7 to T10, guide RNAs gRNA7 to gRNA10, and resulting integrations of 1 to 4 gene copies.]