OptSSeq: High-Throughput Sequencing Readout of Growth


Article

OptSSeq: High-throughput sequencing readout of growth enrichment defines optimal gene expression elements for homoethanologenesis Indro Neil Ghosh, and Robert Landick ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.6b00121 • Publication Date (Web): 12 Jul 2016 Downloaded from http://pubs.acs.org on July 16, 2016


ACS Synthetic Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


ACS Synthetic Biology

OptSSeq: High-throughput sequencing readout of growth enrichment defines optimal gene expression elements for homoethanologenesis Indro Neil Ghosh†,‡,§ and Robert Landick*,†,‡,§,||



†DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53726, United States

‡Cell and Molecular Biology Graduate Training Program, §Department of Biochemistry, and ||Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States

*To whom correspondence should be addressed; Email: [email protected]

Ghosh et al

Page 1

ACS Paragon Plus Environment

7/11/2016


ABSTRACT The optimization of synthetic pathways is a central challenge in metabolic engineering. OptSSeq (Optimization by Selection and Sequencing) is one approach to this challenge. OptSSeq couples selection of optimal enzyme expression levels linked to cell growth rate with high-throughput sequencing to track enrichment of gene expression elements (promoters and ribosome-binding sites) from a combinatorial library. OptSSeq yields information on both optimal and suboptimal enzyme levels, and helps identify constraints that limit maximal product formation. Here we report a proof-of-concept implementation of OptSSeq using homoethanologenesis, a two-step pathway consisting of pyruvate decarboxylase (Pdc) and alcohol dehydrogenase (Adh) that converts pyruvate to ethanol and is naturally optimized in the bacterium Zymomonas mobilis. We used OptSSeq to determine optimal gene expression elements and enzyme levels for Z. mobilis Pdc, AdhA, and AdhB expressed in Escherichia coli. By varying both expression signals and gene order, we identified an optimal solution using only Pdc and AdhB. We resolved current uncertainty about the functions of the Fe2+-dependent AdhB and Zn2+-dependent AdhA by showing that AdhB is preferred over AdhA for rapid growth in both E. coli and Z. mobilis. Finally, by comparing predictions of growth-linked metabolic flux to enzyme synthesis costs, we established that optimal E. coli homoethanologenesis was achieved by our best pdc-adhB expression cassette and that the remaining constraints lie in the E. coli metabolic network or inefficient Pdc or AdhB function in E. coli. OptSSeq is a general tool for synthetic biology to tune enzyme levels in any pathway whose optimal function can be linked to cell growth or survival.

KEYWORDS (6): Ethanol, Combinatorial Optimization, Metabolic Engineering, Ribosome Binding Site, Promoter, Synthetic Biology


■ INTRODUCTION Engineering metabolic pathways to maximize intracellular metabolic flux towards products of interest is a central design objective for synthetic biology.1 A key challenge in achieving this goal is identifying optimal expression levels for relevant enzymes in a pathway so as to remove bottlenecks in metabolite flux,2 limit diversion of metabolites away from essential cellular processes,3 and avoid accumulation of toxic intermediates4 while minimizing the energetic cost of synthesizing necessary enzyme levels.5 Widely different catalytic efficiencies, allosteric interactions, substrate channeling, multiple substrate dependencies, enzyme localization, and other effects often complicate simple prediction of optimal intracellular enzyme levels based on in vitro kinetic parameters.6,7 Additionally, expression of exogenous enzymes for a synthetic pathway competes for resources needed for cell growth and viability, including transcriptional and translational capacities, energy supplies, and molecular building blocks.8-10 Excess expression of synthetic pathway enzymes can reduce cell growth and viability;5,11 therefore, optimal enzyme levels for a synthetic pathway must balance flux through the targeted pathway with requirements for these other cellular processes. 
Most strategies to optimize enzyme levels for these tradeoffs have either varied gene expression elements to achieve rationally predicted enzyme expression levels4,12,13 or have screened combinations of enzyme expression levels to identify optima through individual testing.14-19 These approaches have varied gene expression using libraries of ribosome-binding sites (RBSs),15,18,19 promoters,13 intergenic untranslated elements,17 or combinations of these elements.16 Another approach to enzyme-level optimization is to tie expression to a key metabolite or product using feedback control with a ligand-responsive transcription factor, which enables dynamic control.20,21 The dynamic control approach is especially well suited to varying operon expression in response to varying conditions, whereas combinatorial optimization is well suited to finding optimal ratios of enzymes in a multistep pathway.

A common issue with all combinatorial optimization approaches is the exponential increase in numbers of variants that must be evaluated to find the optimum, sometimes referred to as a ‘combinatorial explosion’.14 For example, an optimization problem involving 5 gene expression elements (e.g., 4 RBSs and a promoter), each with 10 possible expression levels, has 100,000 possible cases to test – more than an order of magnitude higher than screening approaches using individual assays have been able to evaluate (98% of the enriched population (Figures 4D and S5A). The same pdc RBS#4 was also significantly enriched in the PBA and PB cassettes, whereas pdc RBS#3 was enriched in PA (Figures 5C and S5A). Values of EpdcRBS were lower for the PBA, PB, and PA cassette configurations, and there was minor variability between replicates of growth selection. RBS#3 and RBS#4 were predicted to be weaker by factors of ~8 and ~20, respectively, compared to the strongest predicted pdc RBS#1 (Table S3).

To determine the actual strengths of the selected pdc RBSs, we identified a set of plasmids (Table S1) that encoded the same promoter (Promoter#37) with pdc RBSs#1-4, and then measured the Pdc levels produced by these plasmids using anti-Pdc western blots (Table S7; Methods). Interestingly, pdc RBS#3 and RBS#4 drove expression of significantly higher Pdc levels than RBS#1 and RBS#2 (210,000-260,000 Pdc/cell vs. 130,000-160,000 Pdc/cell), reflecting modest inaccuracy of the RBS TIR predictions.30 These Pdc levels were validated by measuring σ70 levels in the same cells (1800-3500 σ70/cell in anaerobic MMG-1 medium, compared to 4700 ± 2400 σ70/cell reported for MG1655 grown aerobically in MMG-1,61 and 2900 ± 700 σ70/cell measured for aerobically grown RL3000; Table S8). Thus, growth rate selection enriched for the strongest pdc RBSs (#3 and #4; Figures 4D and S5A), but for less than maximal Pdc levels because of the submaximal promoter strengths co-selected with the RBSs (see previous section).
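The gap between predicted and measured RBS strengths can be made explicit with a little arithmetic. The following Python sketch uses only the Pdc ranges and the approximate prediction factors quoted above; the variable and function names are illustrative and do not come from the study.

```python
# Measured Pdc levels (molecules per cell) quoted in the text, as (low, high).
rbs_3_4_measured = (210_000, 260_000)  # pdc RBS#3 and RBS#4
rbs_1_2_measured = (130_000, 160_000)  # pdc RBS#1 and RBS#2


def fold_range(numerator, denominator):
    """Smallest and largest ratio between two (low, high) ranges."""
    return numerator[0] / denominator[1], numerator[1] / denominator[0]


measured_low, measured_high = fold_range(rbs_3_4_measured, rbs_1_2_measured)

# Predicted strengths relative to RBS#1 (RBS#3 ~8-fold and RBS#4 ~20-fold weaker).
predicted_relative_to_rbs1 = {"RBS#3": 1 / 8, "RBS#4": 1 / 20}

print(f"Measured: RBS#3/#4 give {measured_low:.2f}- to {measured_high:.2f}-fold more Pdc")
for name, ratio in predicted_relative_to_rbs1.items():
    print(f"Predicted: {name} at {ratio:.3f} of the RBS#1 initiation rate")
```

On these numbers the measured ranking is inverted relative to the prediction: RBS#3 and RBS#4 yield roughly 1.3- to 2-fold more Pdc than RBS#1 and RBS#2, although they were predicted to be about 8- and 20-fold weaker than RBS#1.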


Fe2+-dependent AdhB is preferred over Zn2+-dependent AdhA for ethanologenic growth

To determine whether AdhA, AdhB, or both would best support growth-linked ethanologenesis in E. coli, we investigated predicted strengths of adhB and adhA RBSs enriched in cassettes PBA and ABP during growth selection. Strong predicted adhB RBSs and weak predicted adhA RBSs (Figure 4A, D, and E) were present in populations of optimal strains. To verify that this result reflected selection of high levels of AdhB and low levels of AdhA, we measured AdhA and AdhB levels in optimal strain isolates from enriched libraries using anti-AdhA and anti-AdhB western blots (Figure S9B and C, Table 3; Methods). Representative strain isolates containing optimal PBA and ABP cassettes consistently expressed high levels of AdhB (36,000-47,000 AdhB/cell), whereas levels of AdhA varied from 170,000 AdhA/cell (pPBA2) to undetectable (pABP plasmids; Table 6). More generally, the PBA and ABP growth enrichments selected the strongest predicted adhB RBS and diverse (PBA) or weak (ABP) predicted adhA RBSs (Figures 5D and E). Thus, rapid growth selected for AdhB expression rather than AdhA expression regardless of gene order.

We hypothesized that AdhB is preferred in E. coli due to a superior catalytic rate (Fe2+-dependent AdhB kcat 310 s-1 vs. Zn2+-dependent AdhA kcat 59 s-1 reported in Z. mobilis).6,38 To confirm that this difference indeed drove selection of strong adhB RBSs and weak adhA RBSs, we compared RBS selections in cassettes PA and PB, which contained only one Adh variant. PA and PB would be expected to undergo selection for only the encoded Adh, unlike PBA and ABP, in which ratios of AdhA and AdhB would be selected. Strains containing optimized PB cassettes grew and synthesized ethanol at rates on par with or better than optimized ABP and PBA cassettes (e.g., pPB1, 0.24 ± 0.01 h-1 and 13.3 ± 1.3 pmol EtOH s-1 µg-1 TCP versus pPBA2, 0.16 ± 0.01 h-1 and 8.7 ± 1.0 pmol EtOH s-1 µg-1 TCP; Table 3), indicating that co-expression of AdhA and AdhB is not required to ensure maximal ethanologenesis. This result contrasts with the suggestion of Yomano et al that both AdhB and AdhA may be needed for maximal rates of ethanologenesis in E. coli.36 Our data also establish that a ratio of one AdhB to five Pdc enzyme molecules (29,000 AdhB to 170,000 Pdc/cell for pPB1; Table 3) permits maximal ethanologenic growth and that this ratio must be sufficient to avoid accumulation of acetaldehyde to toxic levels.62,63

In contrast to ABP or PBA, the PA library yielded strong predicted adhA RBSs upon growth rate enrichment (Figure 5A, D, and E). The resulting plasmids supported growth and ethanologenesis but not at the maximal rates (e.g., pPA2, 0.22 ± 0.01 h-1 and 8.9 ± 1.2 pmol EtOH s-1 µg-1 TCP; Table 2). Although these rates were close to those enabled by pPB1, the levels of AdhA from pPA plasmids were significantly higher than the levels of AdhB from pPB plasmids (307,000-344,000 AdhA vs. 29,000-69,000 AdhB/cell; Table 3). These observations establish that AdhA can support ethanologenic growth, but requires greater protein synthesis cost to achieve reasonable growth rates.

To verify that the selective pressures on AdhB or AdhA expression reflected a role in ethanol synthesis and not an unanticipated activity, we also enriched the PA and PB libraries in an adhE+ strain (RL3018; Table S1). E. coli AdhE catalyzes the same NADH-dependent acetaldehyde-to-ethanol conversion as AdhA and AdhB, and is thus expected to add to the activity of AdhA and AdhB in RL3018.64,65 Consistent with a pressure to minimize protein synthesis burden, we observed selection of weaker adhB and adhA RBSs in the adhE+ RL3018 strain than in the adhE- RL3019 strain (Figure S7). We conclude that the selections for high levels of AdhB and low levels of AdhA were indeed driven by the opposing requirements of catalytic properties vs. costs of protein synthesis.

The preferential enrichment of AdhB vs. AdhA in our experiments contrasts with some previous proposals that AdhB in Z. mobilis is primarily responsible for conversion of accumulated ethanol back into acetaldehyde.44 To investigate which enzyme was present at highest levels during anaerobic growth of Z. mobilis, we performed anti-AdhB and anti-AdhA western blot measurements of Z. mobilis lysates collected from mid-exponential phase cells growing in optimal conditions. Interestingly, AdhB also is the most prevalent enzyme expressed during Z. mobilis ethanologenic growth, at ~47,000 AdhB/cell vs. ~18,000 AdhA/cell (Table 3, Figure S9B and C). Especially considering the higher catalytic rate of AdhB, our results establish that, as for E. coli and in contrast to some previous conclusions,7 Fe2+-dependent AdhB rather than Zn2+-dependent AdhA is primarily responsible for ethanologenesis during anaerobic growth of Z. mobilis.

Translational coupling and operon polarity effects likely resulted in stringent selection of elements in ABP cassettes relative to PBA

We next investigated whether the observed enrichment of strong pdc RBSs in the PBA, PB, and PA libraries and of weak adhA RBSs in the ABP library was influenced by gene order effects. Upstream genes in operons can influence the expression of downstream genes either by translational coupling or by operon polarity, in which weak or no expression of 5′-proximal genes stimulates Rho termination or ribonuclease degradation that reduces expression of 3′-proximal genes.32,66,67 To investigate the influence of gene order in our experiment, we compared pdc and adhA RBSs enriched in the PBA and ABP cassette orientations. Strong pdc RBSs and weak adhA RBSs were selected independently of gene order, although the degrees of selection varied (Figure 5C, E; S5A, C). Thus, we inferred that selective pressure was dominated by the consequences and synthesis costs of enzyme activity rather than a need to avoid operon polarity.

However, we noticed that Ex values for promoters, pdc RBSs, and adhA RBSs were much higher in the ABP library than in the PBA library (Figures 5A-C and E). We postulated that selective pressures differed for these two gene orders due to translational coupling of adhA to the highly expressed adhB in PBA, which could cause high AdhA levels irrespective of the adhA RBS (Figure 5E; Table 3). In contrast, placing adhA first in the ABP operon avoided its expression by translational coupling, so that selection of a weak adhA RBS now led to little or no AdhA expression. High pdc expression was observed despite possible polar effects (undetectable AdhA vs. 220,000 Pdc/cell; Table 3), possibly because the small size of adhA may have limited the potential for polarity to reduce downstream adhB and pdc expression in ABP.
Our results also demonstrate that highly expressed genes need not be placed early in synthetic bacterial operons to ensure optimum levels of expression and illustrate the importance of testing multiple operon configurations in OptSSeq to facilitate effective exploration of a maximal range of enzyme expression levels.16


pPB1 enables optimal E. coli ethanologenesis with Pdc and AdhB levels comparable to Z. mobilis

One objective of OptSSeq is to identify optimal expression systems for enzymes in an engineered metabolic pathway. In our test case of Pdc and Adh for homoethanologenesis in E. coli, OptSSeq yielded this result in the form of the plasmid pPB1. Although it is possible that further enrichment steps, libraries with even stronger expression signals, or characterization of greater numbers of individual isolates from our enriched populations might yield an even better plasmid, it is clear that pPB1 greatly outperforms previous rationally designed plasmids like pPBwt or pPBAsyn and is superior to plasmids that include adhA (Figure 6A). Further, pPB1 achieves these maximal ethanol synthesis and growth rates with much less enzyme than the pPB2 plasmid, which supports significantly lower rates of ethanologenesis and growth (~170,000 vs. ~270,000 Pdc/cell, ~29,000 vs. ~69,000 AdhB/cell, ~13.3 vs. ~6.7 pmol EtOH s-1 µg-1 TCP, and ~0.24 vs. ~0.21 h-1; Tables 2 and 3; Figure 6). The only difference between pPB1 and pPB2 is the presence of a consensus promoter (Promoter#1, Table S6) in pPB2 that gave 15-fold higher expression of an RFP reporter compared to the near-consensus promoter in pPB1 (Promoter#25, Table S6; both plasmids contain the same Pdc and AdhB RBSs). Thus, the poorer performance of pPB2 is directly attributable to stronger transcription of the pdc-adhB cassette. That pPB2 gave only ~2-fold higher levels of Pdc and AdhB than pPB1 probably reflects competition for translation with cellular mRNAs.11 This result suggests that Pdc and AdhB levels produced by pPB1 are near optimal. pPB1, with a stable, low-copy replicon and near-optimal levels of Pdc and AdhB, may be a generally useful reagent for future studies of E. coli homoethanologenesis.

Interestingly, these optimal levels of Pdc and AdhB are relatively close to those observed in rapidly growing Z. mobilis. Although Z. mobilis grew ~59% faster than RL3019 pPB1 and converted glucose to ethanol at more than five times the rate despite being cultured at a lower temperature (30 °C vs. 37 °C for E. coli; Table 2), it contained ~53% the amount of Pdc (~1 vs. ~1.9 pmol Pdc µg-1 TCP or ~110,000 vs. ~170,000 Pdc/cell because the Z. mobilis cells were ~32% larger than the E. coli cells based on TCP) and ~160% the amount of AdhB (~420 vs. ~320 fmol AdhB µg-1 TCP or ~47,000 vs. ~29,000 AdhB/cell; Table 3 and Figure 6A). Thus, the trade-offs between protein synthesis costs, ethanologenesis, and growth rate appear to differ significantly in these two bacteria. This difference can be explained in part by exclusive use of Entner-Doudoroff glycolysis in Z. mobilis,68 which is more thermodynamically favorable and uses lower amounts of total enzyme than the Embden-Meyerhof-Parnas glycolysis used by E. coli.69 Additionally, ATP production is uncoupled from growth in Z. mobilis.70

E. coli homoethanologenic growth rate is limited by protein synthesis cost

Levels of Pdc and AdhB higher than those generated by pPB1 led to slower rather than faster rates of homoethanologenic E. coli growth (e.g., pPB2), suggesting that the cost of protein overexpression may become limiting at levels higher than those of pPB1. This effect was also true for pPBA1, pPBA2, pABP1, and pABP2 (Figure 6A). To investigate this observation further, we used the metabolic cost-benefit model of Scott et al36 to explore the competition between Pdc and Adh overexpression and the growth rate-linked synthesis of ribosomal proteins. This model predicts the impact of protein overexpression on bacterial growth rate, and gave an upper bound to the rate of strain RL3019 growth as a function of total overexpressed enzyme (red line, Fig. 6B; Methods). Notably, plasmids yielding increasing amounts of Pdc and Adh below the upper bound exhibited increased growth rates (pPBwt, pPBAsyn, and pPB1), whereas the decreased growth rates of plasmids expressing higher levels of enzymes were consistent with the predicted limit based on protein synthesis cost (Fig. 6B). Thus, the energetic cost of enzyme overexpression appears to limit the growth rate of E. coli at ~0.25 h-1, near that achieved by pPB1.

The inability of E. coli to exceed this level or to approach that observed for Z. mobilis may reflect the differences in metabolic networks described above, but could also be explained in part by expression of the Z. mobilis enzymes in the nonnative environment of the E. coli cytoplasm. We cannot exclude the possibility that the heterologous environment causes suboptimal catalytic activity or folding of the Z. mobilis enzymes, for instance by disrupting proper co-translational folding. Thus, future efforts to improve E. coli ethanologenesis rates could focus both on ensuring maximal catalytic efficiency of recombinant enzymes and on retailoring E. coli metabolism to increase glucose-to-pyruvate flux.
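The cost argument can be illustrated with a toy calculation. The sketch below is not the model as implemented in this study, whose equations and parameters are given in the Methods rather than here. It assumes the simple linear burden relation commonly associated with bacterial growth laws, in which growth rate falls in proportion to the proteome fraction occupied by overexpressed protein. The function name, the zero-burden growth rate `lam0`, and the default `phi_c` are illustrative assumptions; only the per-cell enzyme counts are taken from the text.

```python
def growth_rate(phi_u, lam0, phi_c=0.48):
    """Assumed linear burden model (illustrative, not the study's fitted model).

    phi_u : proteome fraction taken up by overexpressed enzyme
    lam0  : growth rate with no overexpression burden (h^-1), hypothetical input
    phi_c : fraction at which growth stops; the default is an assumed value
    """
    if not 0.0 <= phi_u <= phi_c:
        raise ValueError("phi_u must lie between 0 and phi_c")
    return lam0 * (1.0 - phi_u / phi_c)


# Approximate enzyme molecules per cell quoted in the text for the two plasmids.
enzymes_per_cell = {
    "pPB1": {"Pdc": 170_000, "AdhB": 29_000},
    "pPB2": {"Pdc": 270_000, "AdhB": 69_000},
}
totals = {plasmid: sum(levels.values()) for plasmid, levels in enzymes_per_cell.items()}
burden_ratio = totals["pPB2"] / totals["pPB1"]

print(totals)
print(f"pPB2 carries about {burden_ratio:.1f}-fold more Pdc + AdhB molecules than pPB1")
print(growth_rate(0.10, lam0=0.30), growth_rate(0.20, lam0=0.30))
```

Under this assumed form, any increase in overexpressed protein that does not raise pathway flux can only lower the growth rate, which is consistent with the qualitative ordering of pPB1 and pPB2. Converting molecules per cell into a proteome fraction would additionally require protein masses and total cell protein, which this section does not provide.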


Conclusions and Prospects

OptSSeq enabled the identification of an optimal cassette for the expression of the homoethanologenesis pathway in E. coli and provided answers to several key questions about homoethanologenic growth that had not previously been resolved. Z. mobilis AdhB is preferred over Z. mobilis AdhA for homoethanologenic growth of E. coli. This preference also is true for Z. mobilis, in which higher levels of AdhB than AdhA are present during growth. For E. coli, optimal homoethanologenic growth was obtained at ~170,000 Pdc/cell and a ratio of ~1:5 AdhB:Pdc, with AdhA being dispensable for maximal growth rates. Although pPB1 achieved a growth rate near the limit imposed by the cost of enzyme overexpression, it is possible that higher AdhB:Pdc ratios (not accessed in our experiments due to the limited strength of our strongest adhB RBS) would enable marginally greater fluxes. Industrially relevant fermentation mostly occurs in stationary phase cells,71 and it is possible that different levels of Pdc and AdhB, or the inclusion of AdhA, would support maximal glucose-to-ethanol flux in non-growing cells relevant to industrial production. One strategy could be to drive expression of higher total enzyme levels at the OptSSeq-defined Pdc-AdhB or Pdc-AdhA ratios as a switch to arrest cell growth at optimal cell density for industrial production.

OptSSeq has proven to be an exceptionally powerful tool in this proof-of-principle study and can be expanded in several ways. The method is readily applicable to a wide range of end products whose production can be linked to cell growth. With appropriate genetic modifications, these end products may include important biofuel molecules like butanol,72 isobutanol,73 isoprenes,74 and potentially many other coproducts.75,76 The range of metabolites that can be optimized may be further expanded by linking metabolite-sensing transcription factors to transcription of antibiotic resistance genes.27-29 Other applications could include ameliorating stress responses77 and optimizing nutrient utilization pathways (e.g., nitrogen fixation or alternative carbon sources).16,78,79 Additionally, multiplex emulsion PCR methods like TRACE,80 barcoding strategies such as CombiGEM,81 or long-read strategies such as SMRT82 and Nanopore sequencing83 would enable extension of OptSSeq to determine the co-selection of interdependent elements, which would be especially useful when multiple optima exist. Coupling OptSSeq with dynamic metabolic pathway control20,21 would enable variation in the total levels of enzymes at optimized ratios in response to diverse conditions in which changes in cellular state might dictate increases or decreases in expression cassette transcription. Our study provides an initial demonstration of the OptSSeq method and useful insights into ethanologenesis; the method’s greatest utility will be realized by these future applications, adaptations, and extensions.


■ METHODS Materials DNA oligonucleotides were obtained from Integrated DNA Technologies (Coralville, IA). Synthetic double-stranded DNAs were obtained from GeneArt, Thermo-Fisher Scientific (Pittsburg, PA). Enzymes for genetic manipulations were obtained from New England Biolabs (Ipswich, MA). Polyclonal rabbit antibodies specific to Pdc, AdhA, and AdhB were generated by ProteinTech (Rosemont, IL). Monoclonal mouse-anti-σ70 antibody 2G10 was obtained from Neoclone (Madison, WI). Goat-anti-rabbit antibody coupled to Cy5 and goat-anti-mouse antibody coupled to Cy3 were obtained from GE Healthcare (Marlborough, MA). Kanamycin (Kn), gentamycin (Gn), spectinomycin (Sp) and other reagents were obtained from SigmaAldrich (St Louis, MO), Gold Biotechnology (St Louis, MO), or Thermo-Fisher Scientific, unless otherwise specified. Bacterial strains, plasmids and genetic manipulations All strains used in this study were derivatives of E. coli K12 RL3000 (Table S1). RL3000 is an ilvG+ rph+ pyrE+ derivative of an MG165584,85 lineage (MG1655*) naturally lacking insA-5 and insB-5 upstream from flhDC, and therefore is not hyperflagellated. MG1655* rph+ pyrE+ correcting the pyrE bradytroph phenotype of MG165586 was selected as a larger colony on plates containing MOPS minimal medium with 0.2% glucose (MMG-0.2)87 after 3 rounds of growth enrichment in MMG-0.2, 25 µg Kn/ml of MG1655* transformed with a ColE1 KnR plasmid (pMK-T) containing 408 bp of synthetic rph+ pyrE+ DNA centered on the rph– frameshift mutation (obtained from GeneArt). MG1655* rph+ pyrE+ lost pMK-Trph+ after growth in MMG-0.2 without Kn (RL2730; confirmed by sequencing rph-pyrE). RL2730 was then P1 transduced to ilvG+(ValR) using a lysate grown on MG1655* ilvG+ pMK-TilvG+ (RL2732) and plated on MMG-0.2 plates containing 60 µg valine/ml to yield MG1655* rph+ pyrE+ ilvG+ (RL3000). 
RL2732 was obtained by selection on the same medium of an ilvG+(ValR) derivative of MG1655* transformed with pMK-T containing 468 bp of synthetic ilvG+ DNA centered on the ilvG– mutation in MG165588 (GeneArt). Complete genome sequence of RL3000 (Genbank accession XXXXXX) confirmed fhlDC+ rph+ pyrE+ ilvG+ and additionally detected wbbL::insHGhosh et al

Page 20

ACS Paragon Plus Environment

7/11/2016

Page 21 of 51

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

7 (rfb-50; O antigen– characteristic of E. coli K-12), nudF(G102V), ybhJ(L54I), yebN(G25D) ycfK:97bp, yciI::82bp, yecE::79bp, glcB::IS5. RL3000 grew 25% faster than MG1655 in MMG-0.2 aerobically (0.83 ± 0.02 h-1 vs. 0.67 ± 0.02 hr-1) and 16% faster anaerobically (0.52 ± 0.02 h-1 vs. 0.46 ± 0.01 h-1). To construct RL3018 and RL3019, P1 transductions of relevant Keio collection46 strain lysates were used to replace relevant genes with a FLP site flanked aphA(KnR) as described previously.89 Yeast Flp recombinase was used to excise aphA through methods previously described.90 Strains RL3000, RL3018, and RL3019 will be deposited with ATCC or the E. coli Genetic Stock Center (Yale Univ.). pPBwt (also called pJGG2)91 was previously constructed by cloning a section of pLOI29535 containing pdc and adhB into pBBR1MCS-5(GnR).48 To construct pPBAsyn, we assembled the pBBR1MCS-5 replicon and mob region with synthetic DNAs (GeneArt) encoding the anaerobically induced ydfZ promoter50,51 and the stationary-phase induced dps100 promoter,49 codon-optimized (with E. coli codon frequencies)55 Z. mobilis pdc, adhB, and adhA ORFs with native RBSs, an EvoGlow92 fluorescence reporter, and aphA(KnR) . The plasmid pRH52 was derived from pBBR1MCS-5,48 by replacing aacC1(GnR) with aadA1(SpR) from Tn2193 and inserting synthetic DNA encoding an E. coli-codon-optimized lacIq followed by the strong B. subtilis glnA terminator94 and a designed lacOid-Ptrc-lacO- lacZ′α promoter segment95,96 upstream from “superfolder” gfp97 and a phage P22 terminator.98 We constructed pRS002 (Table S1) by Gibson assembling a DNA fragment containing a MCS (which contains an EcoRV site), P22 terminator, pBBR ori and aadA1(SpR) from pRH52 with a second fragment containing the B. subtilis glnA terminator. pRS002 has the Bsu glnA terminator upstream and the P22 terminator downstream of the EcoRV site. 
To generate the combinatorial plasmid libraries, we first digested pRS002 with EcoRV to linearize the plasmid, then Gibson assembled this linearized plasmid with DNA fragments containing the promoter and RBS libraries and the codon-optimized pdc, adhB, and adhA from pPBAsyn (see ‘design and construction of ethanologenic gene cassette libraries’ below). To construct pING1001 (Table S1), we Gibson assembled six fragments. Fragment one contained the pBBR ori and aadA1(SpR) from pRH52, fragment two was a synthetic DNA segment encoding the promoter PT7A1,99 fragment three contained ‘superfolder’ gfp and the P22 terminator from pRH52, fragment four contained the Bsu glnA terminator from pRH52, fragment five contained a synthetic DNA segment encoding promoter #1 (Table S4), and fragment six contained rfp100 and the rrnBT1 terminator101 from pGR-BBA_B0010.102 Complete sequences of pPBwt, pPBAsyn, pRH52, pRS0002, pING1001, pPB1, pPA1, pPBA1, and pABP1 are available from GenBank (accession IDs pending), and the plasmids will be deposited with Addgene.

Design and construction of ethanologenic gene cassette libraries

We designed RBS libraries encoding a wide range of predicted translational initiation rates (TIRs; Table S3) using the ‘RBS Calculator’ (www.denovodna.com/software).30 To generate RBS-ORF DNA fragments, we PCR-amplified pdc, adhB or adhA from pPBAsyn using primers containing 5′ overhangs with degenerate sequences encoding the RBS libraries. We designed a library of promoters with degeneracies at key positions of a consensus promoter 5′-gctggacctc-YTKAYA-attaatcatccggctcg-BATAAT-GBG-tggAattg (upper case: -35, -10, discriminator, and transcription start site; Table S4)31,52,53,60,103 and PCR amplified it from Gibson assembled overlapping 60mers generated using ‘DNAWorks’ (http://helixweb.nih.gov/dnaworks/).54 These fragments with appropriate 40-bp overlaps were designed using ‘NEBuilder’ (http://nebuilder.neb.com; Table S2) and were Gibson-assembled with the pRS0002 backbone [low-copy pBBR1 ori and aadA1(SpR); Table S1] to generate the pPB, pPA, pPBA, and pABP libraries. PCR was performed using NEB Q5 polymerase and annealing temperatures were calculated using the NEB ‘Tm Calculator’ (http://tmcalculator.neb.com/). PCR-generated fragments were electrophoresed on agarose gels (Lonza; 1%, 1.5%, or 3% agarose in 90 mM Tris-Borate 2.5 mM EDTA), excised, and purified using ‘QIAquick’ gel extraction reagents (Qiagen).
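As a consistency check on the promoter design, the degenerate consensus given above can be expanded with the standard IUPAC nucleotide codes (Y = C/T, K = G/T, B = C/G/T). The Python sketch below is illustrative only and is not the procedure used to design the library; the function name `expand_degenerate` is hypothetical. The expansion gives 2 × 2 × 2 × 3 × 3 = 72 sequences, which agrees with the 72 promoter variants counted in the HT-Seq analysis.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(seq):
    """Return every concrete sequence encoded by an IUPAC-degenerate sequence.

    Hypothetical helper for illustration; input case is ignored.
    """
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(bases) for bases in product(*choices)]

# Consensus promoter from the text with the hyphens removed:
# upstream, -35, spacer, -10, discriminator, transcription start region.
PROMOTER = (
    "gctggacctc" "YTKAYA" "attaatcatccggctcg" "BATAAT" "GBG" "tggAattg"
)

variants = expand_degenerate(PROMOTER)
print(len(variants))  # 72
```

The same helper could be applied to the degenerate RBS primer sequences, which are listed in the supplementary tables rather than in this section.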
Gibson assembly reactions were performed using ‘Hi-Fi’ reagents (NEB) in 20 µl final reaction volume for 4 hours at 50 °C using backbone (0.25 pmol) and inserts (1.25 pmol total DNA). Prior to Gibson assembly, DNAs were co-precipitated with spermine,104 equilibrated with Mg2+, washed, dried, and dissolved in 10 µl H2O. Assembled DNAs were phenol-chloroform extracted, ethanol-precipitated, and dissolved in 5 µl H2O prior to electroporation into ‘ElectroMAX’ DH10B electrocompetent cells (50 µl; Thermo-Fisher). After recovery for 1 hour in 1 ml SOC medium (2% tryptone, 0.5% yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, and 20 mM glucose), a sample (1 µl) was removed and used to assay transformation efficiency and to test whether individual colonies contained correctly assembled plasmids. The remaining cells were diluted into 1 L LB+100 µg Sp/ml (LB-Sp) and grown at 37 °C for 6 hours before recovery of the plasmid DNA libraries using ‘Midiprep’ reagents (Promega). We verified the plasmid libraries by restriction digestion and by sequencing the promoter and RBS regions, including junctions to adjacent DNA, of individual isolates. We also verified the same regions in the entire library populations by HTS.

Growth enrichments

RL3018 and RL3019 were transformed with saturating quantities of plasmid library DNA (5 µg DNA per 50 µl of concentrated cells) by electroporation.105,106 To verify that the number of transformants exceeded the theoretical diversity of the plasmid libraries, we plated 1 µl out of the 1 ml of recovered cells on LB-Sp plates, and then grew the remaining cells aerobically by diluting into 1 L LB-Sp for 12 hours at 37 °C. We then recovered the cells by centrifugation (5 min, 4000 × g, 4 °C), washed the cells twice with 50 ml MMG-1, resuspended in 50 ml MMG containing 16% v/v glycerol, and stored the cell suspensions at -80 °C in 1 ml aliquots. To enrich for superior strains, we first recovered cells from 1 ml of thawed suspension by centrifugation (5 min, 4000 × g, 4 °C), resuspended the cells in 1 ml MMG-2% glucose, and incubated anaerobically for 6 hours at 37 °C with no shaking.
The cells were then diluted to an apparent OD600 of 0.05 in 50 ml MMG-1% glucose and grown at 37 °C anaerobically with gentle stirring to an apparent OD600 of ~0.5. Cells were then re-diluted to an apparent OD600 of 0.05 and the growth enrichment was repeated two more times. At the end of each enrichment, we assayed culture media for glucose consumed and ethanol produced, and collected cells (25 ml) by centrifugation (4,000 × g, 4 °C, 5 min) for plasmid extraction and HTS.
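The arithmetic implied by this protocol can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the published analysis: it assumes ideal exponential growth with constant rates and no lag, it takes the theoretical diversity of a three-gene library to be the product of the 72 promoter, 12 pdc RBS, 16 adhB RBS, and 12 adhA RBS variants reported in the HT-Seq analysis, and the growth rates in the usage lines are hypothetical values chosen only to show the calculation. All function names are hypothetical.

```python
import math

def estimated_transformants(colonies, plated_ul=1.0, recovery_ul=1000.0):
    """Scale a colony count from the plated sample to the whole recovery culture."""
    return colonies * recovery_ul / plated_ul

def doublings(od_start, od_end):
    """Population doublings needed to grow from od_start to od_end."""
    return math.log2(od_end / od_start)

def fold_enrichment(mu_variant, mu_reference, mu_bulk, n_doublings):
    """Fold change in the variant:reference ratio while the bulk culture
    (growth rate mu_bulk, in h^-1) doubles n_doublings times."""
    elapsed_h = n_doublings * math.log(2) / mu_bulk
    return math.exp((mu_variant - mu_reference) * elapsed_h)

# Theoretical diversity of a three-gene library (assumed composition):
# promoter x pdc RBS x adhB RBS x adhA RBS variants.
diversity = 72 * 12 * 16 * 12          # 165888

# Three rounds of growth from an apparent OD600 of 0.05 to ~0.5.
per_round = doublings(0.05, 0.5)       # log2(10), about 3.3 doublings
total = 3 * per_round                  # about 10 doublings overall

print(diversity)
print(round(total, 2))
# Hypothetical growth rates (h^-1): variant 0.30, reference 0.20, bulk 0.25.
print(round(fold_enrichment(0.30, 0.20, 0.25, total), 1))
```

Because 1 µl out of 1 ml was plated, each colony corresponds to roughly 1,000 transformants in the recovery culture, so `estimated_transformants` can be compared directly with `diversity`.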

HT-Seq of libraries and bioinformatic analyses

We amplified sections of plasmids possessing promoters and pdc, adhB and adhA RBSs using primers designed with 5′ adapter sequences (Table S2; Figure S3) and PCR with Q5 polymerase (NEB). Barcode indexes and stem sections that would permit fragments to bind to flow cells during HT-sequencing were added through a second round of PCR (Table S2; Figure S3). Each plasmid library was amplified with a unique index sequence to enable de-multiplexing after sequencing. To report rates of read contamination, we designed plasmids with unique sequences at pdc, adhB and adhA RBS loci (Table S1) and assigned a unique index sequence to amplify segments of control plasmids. Prior to sequencing, we pooled fragments equimolarly and then purified and concentrated them using ‘QIAquick’ PCR purification reagents (Qiagen). DNAs were sequenced at the University of Wisconsin Biotechnology Center DNA sequencing facility using the Illumina ‘MiSeq’ instrument to yield 8.1 × 10⁶ usable ~250 bp reads across all samples (on average 1.6 × 10⁵ reads per sample; 51 samples). We sorted reads from different samples by their indexes and then filtered the reads based on Phred scores (>19) and homology of constant regions flanking variable regions (>90% homology with expected sequences). We then counted read occurrences of each of the 72 promoter variants, 12 pdc RBS variants, 16 adhB RBS variants, and 12 adhA RBS variants within individual samples. To estimate contaminant read frequencies, we counted control/unique sequences in sample libraries and RBS sequences tagged with the control index and found the frequencies to be acceptably low ([…]

[…] 50% fractional representation. Asterisks (*) indicate expression elements present in isolates chosen for further characterization.

Figure 5. Extent of enrichment of promoters and RBSs for PBA, ABP, PB, PA libraries. (A) Color-coded representation of promoter and RBS enrichments for strong or weak signals. Less saturated colors represent a lower degree of enrichment. (B-E) Heat maps of final enrichments for promoter variants (B), pdc RBS variants (C), adhB RBS variants (D), and adhA RBS variants (E) arranged from highest (top) to lowest (bottom) predicted strengths. The calculated degrees of enrichment (Ex) are shown below each column. Color coding and asterisks are as described in the legend to Figure 4.

Figure 6. Performance of optimized ethanologenic expression cassettes. (A) Growth rates and ethanologenesis rates of expression cassettes compared with predictions from the iJR904 metabolic model for growth rate linked to ethanologenesis rate (dotted line). Error bars are standard deviations from triplicate measurements. The histogram below the plot represents the total Pdc, AdhB, and AdhA enzyme molecules per cell for each expression cassette. (B) Relationship between growth rates and ethanologenic enzyme levels. The predicted maximum for possible growth rate based on the cost of protein overproduction (red line) was derived from the model of Scott et al.5 The unburdened ethanologenic growth rate of 0.38 h-1 was predicted by the modified iJR904 for RL3019 (Methods). (C) Identities of expression elements in PB1 and PB2.
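The read filtering and counting described under ‘HT-Seq of libraries and bioinformatic analyses’, and the fractional representations plotted in the heat maps, can be sketched in Python as follows. This is a minimal illustration and not the authors' pipeline. It assumes that reads are already de-multiplexed and supplied as (sequence, per-base Phred list) pairs, that the variable region sits at a fixed offset between two constant flanks, that the Phred threshold applies to every base of the region examined, and that "homology" means positional identity. None of these details are specified in the text, and all function names and toy sequences are hypothetical.

```python
from collections import Counter

def identity(observed, expected):
    """Fraction of positions at which two equal-length sequences match."""
    if len(observed) != len(expected) or not expected:
        return 0.0
    return sum(o == e for o, e in zip(observed, expected)) / len(expected)

def count_variants(reads, left, right, var_len, known,
                   min_phred=19, min_identity=0.90):
    """Count known variants in reads laid out as left + variable + right."""
    start, stop = len(left), len(left) + var_len
    end = stop + len(right)
    counts = Counter()
    for seq, quals in reads:
        if len(seq) < end or len(quals) < end:
            continue                      # read too short to cover the region
        if min(quals[:end]) <= min_phred:
            continue                      # every base must have Phred > 19
        if identity(seq[:start], left) <= min_identity:
            continue                      # flank identity must exceed 90%
        if identity(seq[stop:end], right) <= min_identity:
            continue
        variant = seq[start:stop]
        if variant in known:
            counts[variant] += 1
    return counts

def fractional_representation(counts):
    """Convert raw counts to fractions of all counted reads."""
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()} if total else {}

# Toy data (hypothetical sequences, not taken from Tables S1-S4).
LEFT, RIGHT = "ACGTACGTAC", "TTGACCTGAA"
KNOWN = {"AGGAGG", "AGGAGA"}
good = [30] * 26
reads = [
    (LEFT + "AGGAGG" + RIGHT, good),
    (LEFT + "AGGAGG" + RIGHT, good),
    (LEFT + "AGGAGA" + RIGHT, good),
    (LEFT + "AGGAGA" + RIGHT, [30] * 25 + [10]),   # fails the Phred filter
    ("TTTTTTTTTT" + "AGGAGG" + RIGHT, good),       # fails the flank filter
    (LEFT + "CCCCCC" + RIGHT, good),               # not a designed variant
]
counts = count_variants(reads, LEFT, RIGHT, 6, KNOWN)
fractions = fractional_representation(counts)
print(dict(counts))
```

In this toy example only the first three reads are counted, so the two variants end up with fractional representations of 2/3 and 1/3.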

■ REFERENCES

(1) Na, D., Kim, T. Y., and Lee, S. Y. (2010) Construction and optimization of synthetic pathways in metabolic engineering, Curr Opin Microbiol 13, 363-370. (2) Lutke-Eversloh, T., and Stephanopoulos, G. (2008) Combinatorial pathway analysis for improved L-tyrosine production in Escherichia coli: identification of enzymatic bottlenecks by systematic gene overexpression, Metab Eng 10, 69-77. (3) He, L., Xiao, Y., Gebreselassie, N., Zhang, F., Antoniewicz, M. R., Tang, Y. J., and Peng, L. (2014) Central metabolic responses to the overproduction of fatty acids in Escherichia coli based on 13C-metabolic flux analysis, Biotechnol Bioeng 111, 575-585. (4) Pitera, D. J., Paddon, C. J., Newman, J. D., and Keasling, J. D. (2007) Balancing a heterologous mevalonate pathway for improved isoprenoid production in Escherichia coli, Metab Eng 9, 193-207. (5) Scott, M., Gunderson, C. W., Mateescu, E. M., Zhang, Z., and Hwa, T. (2010) Interdependence of cell growth and gene expression: origins and consequences, Science 330, 1099-1102. (6) Kinoshita, S., Kakizono, T., Kadota, K., Das, K., and Taguchi, H. (1985) Purification of two alcohol dehydrogenases from Zymomonas mobilis and their properties, Appl Microbiol Biotechnol 22, 249-254. (7) Kalnenieks, U., Galinina, N., Toma, M. M., Pickford, J. L., Rutkis, R., and Poole, R. K. (2006) Respiratory behaviour of a Zymomonas mobilis adhB::kan(r) mutant supports the hypothesis of two alcohol dehydrogenase isoenzymes catalysing opposite reactions, FEBS Lett 580, 5084-5088. (8) Weisse, A. Y., Oyarzun, D. A., Danos, V., and Swain, P. S. (2015) Mechanistic links between cellular trade-offs, gene expression, and growth, Proc Natl Acad Sci U S A 112, E1038-1047. (9) Gyorgy, A., Jimenez, J. I., Yazbek, J., Huang, H. H., Chung, H., Weiss, R., and Del Vecchio, D. (2015) Isocost Lines Describe the Cellular Economy of Genetic Circuits, Biophys J 109, 639-646.

(10) Gorochowski, T. E., Avcilar-Kucukgoze, I., Bovenberg, R. A., Roubos, J. A., and Ignatova, Z. (2016) A Minimal Model of Ribosome Allocation Dynamics Captures Trade-offs in Expression between Endogenous and Synthetic Genes, ACS Synth Biol. (11) Ceroni, F., Algar, R., Stan, G. B., and Ellis, T. (2015) Quantifying cellular capacity identifies gene expression designs with reduced burden, Nat Methods 12, 415-418. (12) Juminaga, D., Baidoo, E. E., Redding-Johanson, A. M., Batth, T. S., Burd, H., Mukhopadhyay, A., Petzold, C. J., and Keasling, J. D. (2012) Modular engineering of L-tyrosine production in Escherichia coli, Appl Environ Microbiol 78, 89-98. (13) Temme, K., Zhao, D., and Voigt, C. A. (2012) Refactoring the nitrogen fixation gene cluster from Klebsiella oxytoca, Proc Natl Acad Sci U S A 109, 7085-7090. (14) Farasat, I., Kushwaha, M., Collens, J., Easterbrook, M., Guido, M., and Salis, H. M. (2014) Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria, Mol Syst Biol 10, 731. (15) Zelcbuch, L., Antonovsky, N., Bar-Even, A., Levin-Karp, A., Barenholz, U., Dayagi, M., Liebermeister, W., Flamholz, A., Noor, E., Amram, S., Brandis, A., Bareia, T., Yofe, I., Jubran, H., and Milo, R. (2013) Spanning high-dimensional expression space using ribosome-binding site combinatorics, Nucleic Acids Res 41, e98. (16) Smanski, M. J., Bhatia, S., Zhao, D., Park, Y., Woodruff, L. B. A., Giannoukos, G., Ciulla, D., Busby, M., Calderon, J., Nicol, R., Gordon, D. B., Densmore, D., and Voigt, C. A. (2014) Functional optimization of gene clusters by combinatorial design and assembly, Nat Biotechnol 32, 1241-1249. (17) Pfleger, B. F., Pitera, D. J., Smolke, C. D., and Keasling, J. D. (2006) Combinatorial engineering of intergenic regions in operons tunes expression of multiple genes, Nat Biotechnol 24, 1027-1032. (18) Ng, C. Y., Farasat, I., Maranas, C. D., and Salis, H. M. 
(2015) Rational design of a synthetic Entner-Doudoroff pathway for improved and controllable NADPH regeneration, Metab Eng 29, 86-96. (19) Oliver, J. W., Machado, I. M., Yoneda, H., and Atsumi, S. (2014) Combinatorial optimization of cyanobacterial 2,3-butanediol production, Metab Eng 22, 76-82.

(20) Holtz, W. J., and Keasling, J. D. (2010) Engineering static and dynamic control of synthetic pathways, Cell 140, 19-23. (21) Zhang, F., Carothers, J. M., and Keasling, J. D. (2012) Design of a dynamic sensor-regulator system for production of chemicals and fuels derived from fatty acids, Nat Biotechnol 30, 354-359. (22) Ajikumar, P. K., Xiao, W. H., Tyo, K. E., Wang, Y., Simeon, F., Leonard, E., Mucha, O., Phon, T. H., Pfeifer, B., and Stephanopoulos, G. (2010) Isoprenoid pathway optimization for Taxol precursor overproduction in Escherichia coli, Science 330, 70-74. (23) Mandenius, C. F., and Brundin, A. (2008) Bioprocess optimization using design-of-experiments methodology, Biotechnol Prog 24, 1191-1203. (24) Romero, P. A., Tran, T. M., and Abate, A. R. (2015) Dissecting enzyme function with microfluidic-based deep mutational scanning, Proc Natl Acad Sci U S A 112, 7159-7164. (25) Wang, H. H., Isaacs, F. J., Carr, P. A., Sun, Z. Z., Xu, G., Forest, C. R., and Church, G. M. (2009) Programming cells by multiplex genome engineering and accelerated evolution, Nature 460, 894-898. (26) Carr, P. A., Wang, H. H., Sterling, B., Isaacs, F. J., Lajoie, M. J., Xu, G., Church, G. M., and Jacobson, J. M. (2012) Enhanced multiplex genome engineering through cooperative oligonucleotide co-selection, Nucleic Acids Res 40, e132. (27) Dietrich, J. A., Shis, D. L., Alikhani, A., and Keasling, J. D. (2013) Transcription factor-based screens and synthetic selections for microbial small-molecule biosynthesis, ACS Synth Biol 2, 47-58. (28) Raman, S., Rogers, J. K., Taylor, N. D., and Church, G. M. (2014) Evolution-guided optimization of biosynthetic pathways, Proc Natl Acad Sci U S A 111, 17803-17808. (29) Eckdahl, T. T., Campbell, A. M., Heyer, L. J., Poet, J. L., Blauch, D. N., Snyder, N. L., Atchley, D. T., Baker, E. J., Brown, M., Brunner, E. C., Callen, S. A., Campbell, J. S., Carr, C. J., Carr, D. R., Chadinha, S. A., Chester, G. I., Chester, J., Clarkson, B. R., Cochran, K. E., Doherty, S. E., Doyle, C., Dwyer, S., Edlin, L. M., Evans, R. A., Fluharty, T., Frederick, J., Galeota-Sprung, J., Gammon, B. L., Grieshaber, B., Gronniger, J., Gutteridge, K., Henningsen, J., Isom, B., Itell, H. L., Keffeler, E. C., Lantz, A. J., Lim, J. N., McGuire, E. P., Moore, A. K., Morton, J., Nakano, M., Pearson, S. A., Perkins, V., Parrish, P., Pierson, C. E., Polpityaarachchige, S., Quaney, M. J., Slattery, A., Smith, K. E., Spell, J., Spencer, M., Taye, T., Trueblood, K., Vrana, C. J., and Whitesides, E. T. (2015) Programmed evolution for optimization of orthogonal metabolic output in bacteria, PLoS ONE 10, e0118322. (30) Salis, H. M. (2011) The ribosome binding site calculator, Methods Enzymol 498, 19-42. (31) Brewster, R. C., Jones, D. L., and Phillips, R. (2012) Tuning promoter strength through RNA polymerase binding site design in Escherichia coli, PLoS Comput Biol 8, e1002811. (32) Levin-Karp, A., Barenholz, U., Bareia, T., Dayagi, M., Zelcbuch, L., Antonovsky, N., Noor, E., and Milo, R. (2013) Quantifying translational coupling in E. coli synthetic operons using RBS modulation and fluorescent reporters, ACS Synth Biol 2, 327-336. (33) Li, G. W., Burkhardt, D., Gross, C., and Weissman, J. S. (2014) Quantifying absolute protein synthesis rates reveals principles underlying allocation of cellular resources, Cell 157, 624-635. (34) Davis, J. H., Rubin, A. J., and Sauer, R. T. (2011) Design, construction and characterization of a set of insulated bacterial promoters, Nucleic Acids Res 39, 1131-1141. (35) Ingram, L. O., Conway, T., Clark, D. P., Sewell, G. W., and Preston, J. F. (1987) Genetic engineering of ethanol production in Escherichia coli, Appl Environ Microbiol 53, 2420-2425. (36) Yomano, L. P., York, S. W., Zhou, S., Shanmugam, K. T., and Ingram, L. O. (2008) Re-engineering Escherichia coli for ethanol production, Biotechnol Lett 30, 2097-2103. (37) Neale, A. D., Scopes, R. K., Wettenhall, R. E., and Hoogenraad, N. J. (1987) Pyruvate decarboxylase of Zymomonas mobilis: isolation, properties, and genetic expression in Escherichia coli, J Bacteriol 169, 1024-1028. (38) Neale, A. D., Scopes, R. K., Kelly, J. M., and Wettenhall, R. E. (1986) The two alcohol dehydrogenases of Zymomonas mobilis. 
Purification by differential dye ligand chromatography, molecular characterisation and physiological roles, Eur J Biochem 154, 119-124. (39) Vazquez-Limon, C., Vega-Badillo, J., Martinez, A., Espinosa-Molina, G., Gosset, G., Soberon, X., Lopez-Munguia, A., and Osuna, J. (2007) Growth rate of a non-fermentative Escherichia coli strain is influenced by NAD+ regeneration, Biotechnol Lett 29, 1857-1863. (40) Martinez, A., York, S. W., Yomano, L. P., Pineda, V. L., Davis, F. C., Shelton, J. C., and Ingram, L. O. (1999) Biosynthetic burden and plasmid burden limit expression of chromosomally integrated heterologous genes (pdc, adhB) in Escherichia coli, Biotechnol Prog 15, 891-897. (41) Yang, Y. T., San, K. Y., and Bennett, G. N. (1999) Redistribution of metabolic fluxes in Escherichia coli with fermentative lactate dehydrogenase overexpression and deletion, Metab Eng 1, 141-152. (42) Stanley, G. A., and Pamment, N. B. (1993) Transport and intracellular accumulation of acetaldehyde in Saccharomyces cerevisiae, Biotechnol Bioeng 42, 24-29. (43) Glick, B. R. (1995) Metabolic load and heterologous gene expression, Biotechnol Adv 13, 247-261. (44) O'Mullan, P. J., Buchholz, S. E., Chase, T., Jr., and Eveleigh, D. E. (1995) Roles of alcohol dehydrogenases of Zymomonas mobilis (ZADH): characterization of a ZADH-2-negative mutant, Appl Microbiol Biotechnol 43, 675-678. (45) Reed, J. L., Vo, T. D., Schilling, C. H., and Palsson, B. O. (2003) An expanded genome-scale model of Escherichia coli K-12 (iJR904 GSM/GPR), Genome Biol 4, R54. (46) Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., Datsenko, K. A., Tomita, M., Wanner, B. L., and Mori, H. (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection, Mol Syst Biol 2, 2006.0008. (47) Ohta, K., Beall, D. S., Mejia, J. P., Shanmugam, K. T., and Ingram, L. O. (1991) Genetic improvement of Escherichia coli for ethanol production: chromosomal integration of Zymomonas mobilis genes encoding pyruvate decarboxylase and alcohol dehydrogenase II, Appl Environ Microbiol 57, 893-900. (48) Kovach, M. E., Elzer, P. H., Hill, D. S., Robertson, G. T., Farris, M. A., Roop, R. M., 2nd, and Peterson, K. M. 
(1995) Four new derivatives of the broad-host-range cloning vector pBBR1MCS, carrying different antibiotic-resistance cassettes, Gene 166, 175-176. (49) Grainger, D. C., Goldberg, M. D., Lee, D. J., and Busby, S. J. (2008) Selective repression by Fis and H-NS at the Escherichia coli dps promoter, Mol Microbiol 68, 1366-1377.

(50) Shan, Y., Pan, Q., Liu, J., Huang, F., Sun, H., Nishino, K., and Yan, A. (2012) Covalently linking the Escherichia coli global anaerobic regulator FNR in tandem allows it to function as an oxygen stable dimer, Biochem Biophys Res Commun 419, 43-48. (51) Lazazzera, B. A., Bates, D. M., and Kiley, P. J. (1993) The activity of the Escherichia coli transcription factor FNR is regulated by a change in oligomeric state, Genes Dev 7, 1993-2005. (52) Harley, C. B., and Reynolds, R. P. (1987) Analysis of E. coli promoter sequences, Nucleic Acids Res 15, 2343-2361. (53) Oliphant, A. R., and Struhl, K. (1988) Defining the consensus sequences of E. coli promoter elements by random selection, Nucleic Acids Res 16, 7673-7683. (54) Hoover, D. M., and Lubkowski, J. (2002) DNAWorks: an automated method for designing oligonucleotides for PCR-based gene synthesis, Nucleic Acids Res 30, e43. (55) Fath, S., Bauer, A. P., Liss, M., Spriestersbach, A., Maertens, B., Hahn, P., Ludwig, C., Schafer, F., Graf, M., and Wagner, R. (2011) Multiparameter RNA and codon optimization: a standardized tool to assess and enhance autologous mammalian gene expression, PLoS ONE 6, e17596. (56) Gibson, D. G., Young, L., Chuang, R. Y., Venter, J. C., Hutchison, C. A., 3rd, and Smith, H. O. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases, Nat Methods 6, 343-345. (57) Zhang, X., and Reed, J. L. (2014) Adaptive evolution of synthetic cooperating communities improves growth performance, PLoS ONE 9, e108297. (58) Hoppner, T. C., and Doelle, H. W. (1983) Purification and kinetic characteristics of pyruvate decarboxylase and ethanol dehydrogenase from Zymomonas mobilis in relation to ethanol production, European J Appl Microbiol Biotechnol 17, 152-157. (59) Woodruff, L. B., May, B. L., Warner, J. R., and Gill, R. T. (2013) Towards a metabolic engineering strain "commons": an Escherichia coli platform strain for ethanol production, Biotechnol Bioeng 110, 1520-1526. (60) Haugen, S. 
P., Berkmen, M. B., Ross, W., Gaal, T., Ward, C., and Gourse, R. L. (2006) rRNA promoter regulation by nonoptimal binding of sigma region 1.2: an additional recognition element for RNA polymerase, Cell 125, 1069-1082.

(61) Grigorova, I. L., Phleger, N. J., Mutalik, V. K., and Gross, C. A. (2006) Insights into transcriptional regulation and sigma competition from an equilibrium model of RNA polymerase binding to DNA, Proc Natl Acad Sci U S A 103, 5332-5337. (62) Wecker, M. S., and Zall, R. R. (1987) Production of Acetaldehyde by Zymomonas mobilis, Appl Environ Microbiol 53, 2815-2820. (63) Zhu, H., Gonzalez, R., and Bobik, T. A. (2011) Coproduction of acetaldehyde and hydrogen during glucose fermentation by Escherichia coli, Appl Environ Microbiol 77, 6441-6450. (64) Membrillo-Hernandez, J., Echave, P., Cabiscol, E., Tamarit, J., Ros, J., and Lin, E. C. (2000) Evolution of the adhE gene product of Escherichia coli from a functional reductase to a dehydrogenase. Genetic and biochemical studies of the mutant proteins, J Biol Chem 275, 33869-33875. (65) Leonardo, M. R., Cunningham, P. R., and Clark, D. P. (1993) Anaerobic regulation of the adhE gene, encoding the fermentative alcohol dehydrogenase of Escherichia coli, J Bacteriol 175, 870-878. (66) Hussein, R., Lee, T. Y., and Lim, H. N. (2015) Quantitative characterization of gene regulation by Rho dependent transcription termination, Biochim Biophys Acta 1849, 940-954. (67) Lim, L. W., and Kennel, D. (1974) Evidence against transcription termination within the E. coli lac operon, Mol Gen Genet 133, 367-371. (68) Viikari, L., and Berry, D. R. (1988) Carbohydrate Metabolism in Zymomonas, Crit Rev Biotechnol 7, 237-261. (69) Flamholz, A., Noor, E., Bar-Even, A., Liebermeister, W., and Milo, R. (2013) Glycolytic strategy as a tradeoff between energy yield and protein cost, Proc Natl Acad Sci U S A 110, 10039-10044. (70) Kalnenieks, U. (2006) Physiology of Zymomonas mobilis: some unanswered questions, Adv Microb Physiol 51, 73-117. (71) Basso, L. C., Basso, T. O., and Rocha, L. N. (2011) Ethanol production in Brazil: the industrial process and its impact on yeast fermentation, Vol. 271, InTech Europe, Rijeka, Croatia.

(72) Shen, C. R., Lan, E. I., Dekishima, Y., Baez, A., Cho, K. M., and Liao, J. C. (2011) Driving forces enable high-titer anaerobic 1-butanol synthesis in Escherichia coli, Appl Environ Microbiol 77, 2905-2915. (73) Trinh, C. T. (2012) Elucidating and reprogramming Escherichia coli metabolisms for obligate anaerobic n-butanol and isobutanol production, Appl Microbiol Biotechnol 95, 1083-1094. (74) Gruchattka, E., Hadicke, O., Klamt, S., Schutz, V., and Kayser, O. (2013) In silico profiling of Escherichia coli and Saccharomyces cerevisiae as terpenoid factories, Microb Cell Fact 12, 84. (75) Burgard, A. P., Pharkya, P., and Maranas, C. D. (2003) Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization, Biotechnol Bioeng 84, 647-657. (76) Kim, J., and Reed, J. L. (2010) OptORF: Optimal metabolic and regulatory perturbations for metabolic engineering of microbial strains, BMC Syst Biol 4, 53. (77) Minty, J. J., Lesnefsky, A. A., Lin, F., Chen, Y., Zaroff, T. A., Veloso, A. B., Xie, B., McConnell, C. A., Ward, R. J., Schwartz, D. R., Rouillard, J. M., Gao, Y., Gulari, E., and Lin, X. N. (2011) Evolution combined with genomic study elucidates genetic bases of isobutanol tolerance in Escherichia coli, Microb Cell Fact 10, 18. (78) Feldmann, S. D., Sahm, H., and Sprenger, G. A. (1992) Pentose metabolism in Zymomonas mobilis wild-type and recombinant strains, Appl Microbiol Biotechnol 38, 354-361. (79) Rutter, C., and Chen, R. (2014) Improved cellobiose utilization in E. coli by including both hydrolysis and phosphorolysis mechanisms, Biotechnol Lett 36, 301-307. (80) Zeitoun, R. I., Garst, A. D., Degen, G. D., Pines, G., Mansell, T. J., Glebes, T. Y., Boyle, N. R., and Gill, R. T. (2015) Multiplexed tracking of combinatorial genomic mutations in engineered cell populations, Nat Biotechnol 33, 631-637. (81) Wong, A. S., Choi, G. C., Cui, C. H., Pregernig, G., Milani, P., Adam, M., Perli, S. 
D., Kazer, S. W., Gaillard, A., Hermann, M., Shalek, A. K., Fraenkel, E., and Lu, T. K. (2016) Multiplexed barcoded CRISPR-Cas9 screening enabled by CombiGEM, Proc Natl Acad Sci U S A 113, 2544-2549.

(82) Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., Otto, G., Peluso, P., Rank, D., Baybayan, P., Bettman, B., Bibillo, A., Bjornson, K., Chaudhuri, B., Christians, F., Cicero, R., Clark, S., Dalal, R., Dewinter, A., Dixon, J., Foquet, M., Gaertner, A., Hardenbol, P., Heiner, C., Hester, K., Holden, D., Kearns, G., Kong, X., Kuse, R., Lacroix, Y., Lin, S., Lundquist, P., Ma, C., Marks, P., Maxham, M., Murphy, D., Park, I., Pham, T., Phillips, M., Roy, J., Sebra, R., Shen, G., Sorenson, J., Tomaney, A., Travers, K., Trulson, M., Vieceli, J., Wegener, J., Wu, D., Yang, A., Zaccarin, D., Zhao, P., Zhong, F., Korlach, J., and Turner, S. (2009) Real-time DNA sequencing from single polymerase molecules, Science 323, 133-138. (83) Kasianowicz, J. J., Brandin, E., Branton, D., and Deamer, D. W. (1996) Characterization of individual polynucleotide molecules using a membrane channel, Proc Natl Acad Sci U S A 93, 13770-13773. (84) Blattner, F. R., Plunkett, G., 3rd, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997) The complete genome sequence of Escherichia coli K-12, Science 277, 1453-1462. (85) Freddolino, P. L., Amini, S., and Tavazoie, S. (2012) Newly identified genetic variations in common Escherichia coli MG1655 stock cultures, J Bacteriol 194, 303-306. (86) Jensen, K. F. (1993) The Escherichia coli K-12 "wild types" W3110 and MG1655 have an rph frameshift mutation that leads to pyrimidine starvation due to low pyrE expression levels, J Bacteriol 175, 3401-3407. (87) Neidhardt, F. C., Bloch, P. L., and Smith, D. F. (1974) Culture medium for enterobacteria, J Bacteriol 119, 736-747. (88) Lawther, R. P., Calhoun, D. H., Gray, J., Adams, C. W., Hauser, C. A., and Hatfield, G. W. (1982) DNA sequence fine-structure analysis of ilvG (IlvG+) mutations of Escherichia coli K-12, J Bacteriol 149, 294-298. 
(89) Thomason, L. C., Costantino, N., and Court, D. L. (2007) E. coli genome manipulation by P1 transduction, Curr Protoc Mol Biol Chapter 1, Unit 1.17.

(90) Cherepanov, P. P., and Wackernagel, W. (1995) Gene disruption in Escherichia coli: TcR and KmR cassettes with the option of Flp-catalyzed excision of the antibiotic-resistance determinant, Gene 158, 9-14.
(91) Gardner, J. G., and Keating, D. H. (2010) Requirement of the type II secretion system for utilization of cellulosic substrates by Cellvibrio japonicus, Appl Environ Microbiol 76, 5079-5087.
(92) Drepper, T., Huber, R., Heck, A., Circolone, F., Hillmer, A. K., Buchs, J., and Jaeger, K. E. (2010) Flavin mononucleotide-based fluorescent reporter proteins outperform green fluorescent protein-like proteins as quantitative in vivo real-time reporters, Appl Environ Microbiol 76, 5990-5994.
(93) Liebert, C. A., Hall, R. M., and Summers, A. O. (1999) Transposon Tn21, flagship of the floating genome, Microbiol Mol Biol Rev 63, 507-522.
(94) De Hoon, M. J., Imoto, S., Kobayashi, K., Ogasawara, N., and Miyano, S. (2004) Predicting the operon structure of Bacillus subtilis using operon length, intergene distance, and gene expression information, Pac Symp Biocomput, 276-287.
(95) Muller, J., Oehler, S., and Muller-Hill, B. (1996) Repression of lac promoter as a function of distance, phase and quality of an auxiliary lac operator, J Mol Biol 257, 21-29.
(96) Brosius, J., Erfle, M., and Storella, J. (1985) Spacing of the -10 and -35 regions in the tac promoter. Effect on its in vivo activity, J Biol Chem 260, 3539-3541.
(97) Pedelacq, J. D., Cabantous, S., Tran, T., Terwilliger, T. C., and Waldo, G. S. (2006) Engineering and characterization of a superfolder green fluorescent protein, Nat Biotechnol 24, 79-88.
(98) McDowell, J. C., Roberts, J. W., Jin, D. J., and Gross, C. (1994) Determination of intrinsic transcription termination efficiency by RNA polymerase elongation rate, Science 266, 822-825.
(99) Deuschle, U., Kammerer, W., Gentz, R., and Bujard, H. (1986) Promoters of Escherichia coli: a hierarchy of in vivo strength indicates alternate structures, EMBO J 5, 2987-2994.
(100) Campbell, R. E., Tour, O., Palmer, A. E., Steinbach, P. A., Baird, G. S., Zacharias, D. A., and Tsien, R. Y. (2002) A monomeric red fluorescent protein, Proc Natl Acad Sci U S A 99, 7877-7882.


(101) Abe, H., and Aiba, H. (1996) Differential contributions of two elements of rho-independent terminator to transcription termination and mRNA stabilization, Biochimie 78, 1035-1042.
(102) Chen, Y. J., Liu, P., Nielsen, A. A., Brophy, J. A., Clancy, K., Peterson, T., and Voigt, C. A. (2013) Characterization of 582 natural and synthetic terminators and quantification of their design constraints, Nat Methods 10, 659-664.
(103) Hawley, D. K., and McClure, W. R. (1983) Compilation and analysis of Escherichia coli promoter DNA sequences, Nucleic Acids Res 11, 2237-2255.
(104) Hoopes, B. C., and McClure, W. R. (1981) Studies on the selectivity of DNA precipitation by spermine, Nucleic Acids Res 9, 5493-5504.
(105) Wu, N., Matand, K., Kebede, B., Acquaah, G., and Williams, S. (2010) Enhancing DNA electrotransformation efficiency in Escherichia coli DH10B electrocompetent cells, Electronic Journal of Biotechnology 13, 21-22.
(106) Gonzales, M. F., Brooks, T., Pukatzki, S. U., and Provenzano, D. (2013) Rapid protocol for preparation of electrocompetent Escherichia coli and Vibrio cholerae, J Vis Exp 8.
(107) Franden, M. A., Pilath, H. M., Mohagheghi, A., Pienkos, P. T., and Zhang, M. (2013) Inhibition of growth of Zymomonas mobilis by model compounds found in lignocellulosic hydrolysates, Biotechnol Biofuels 6, 99.
(108) Schwalbach, M. S., Keating, D. H., Tremaine, M., Marner, W. D., Zhang, Y., Bothfeld, W., Higbee, A., Grass, J. A., Cotten, C., Reed, J. L., da Costa Sousa, L., Jin, M., Balan, V., Ellinger, J., Dale, B., Kiley, P. J., and Landick, R. (2012) Complex physiology and compound stress responses during fermentation of alkali-pretreated corn stover hydrolysate by an Escherichia coli ethanologen, Appl Environ Microbiol 78, 3442-3457.


For Table of Contents Use Only

OptSSeq: High-throughput sequencing readout of growth enrichment defines optimal gene expression elements for homoethanologenesis Indro Neil Ghosh and Robert Landick

[Table of Contents graphic. Panels: Expression Library; Growth Enrichment (OD versus t); High-Throughput Sequencing; Optimal Combinations; Optimal Expression Levels (growth rate versus enzyme copies/cell). Optimal expression signals shown: promoter TTTACA-CATAATGTG; RBSs UAAGGAGAU and UAAGGGGGA.]

[Figure 1 graphic. Panel labels: (1) PCR assembly of promoters, RBSs, and genes with diverse expression (promoter library; RBS-ORF fusion libraries); (2) combinatorial assembly (expression library); (3) growth enrichment of superior variants (optimized expression library); (4) HTS identification of optimal expression elements (optimal combinations of promoters and RBSs; optimum 1, optimum 2).]

Figure 1. Optimization by Selection and Sequencing (OptSSeq). (1) Libraries of genetic elements (promoters and RBSs) are designed to span a wide range of expression levels (transcription and translation initiation rates) and amplified into DNA fragments. (2) The library-containing DNA fragments are linked together by combinatorial Gibson assembly. (3) The expression library plasmids are transformed into a strain in which optimal expression of the genes will be linked to the rate of cell growth, and the fastest growing strain variants are growth-enriched. (4) The sections of the cassettes containing genetic expression elements (promoters and RBSs) are sequenced by high-throughput sequencing (HTS); the analyzed results are used to determine the optimal gene expression elements for the expression cassette.
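Step 3 depends on a simple property of exponential growth: in a mixed culture, each variant's share of the population changes in proportion to exp(growth rate × time), so faster growers take over across passages. The minimal Python sketch below illustrates this for a hypothetical three-variant library. The growth rates, starting fractions, and 24 h passage length are illustrative assumptions rather than values from this study, and dilution between passages is assumed to leave the fractions unchanged.

```python
import math

def enrich(fractions, growth_rates, hours):
    """Return variant fractions after `hours` of exponential co-culture.

    Each variant i grows as f_i * exp(mu_i * t); the results are
    renormalized so that the fractions sum to 1.
    """
    grown = [f * math.exp(mu * hours) for f, mu in zip(fractions, growth_rates)]
    total = sum(grown)
    return [g / total for g in grown]

# Hypothetical library: equal starting shares, growth rates in h^-1 (assumed).
fractions = [1 / 3, 1 / 3, 1 / 3]
growth_rates = [0.05, 0.15, 0.30]

for passage in range(1, 4):
    # Assumed passage length of 24 h; dilution itself does not alter fractions.
    fractions = enrich(fractions, growth_rates, hours=24.0)
    print(passage, [round(f, 4) for f in fractions])
```

In this toy setting the fastest variant dominates after three passages, which is the behavior that the sequencing step is designed to read out.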

[Figure 2 graphic. (A) Metabolic map with labels: glycolysis (½ × glucose, PEP, pyruvate, ADP/ATP); competing fermentative pathways (succinate fermentation, lactate fermentation, acetate excretion, ethanol fermentation via acetyl CoA, and the pyruvate formate lyase reaction producing formate) with deletions ΔfrdA, ΔldhA, ΔackA, and ΔadhE; homoethanol pathway (Z. mobilis pdc + adhB + adhA: pyruvate to acetaldehyde and CO2 by Pdc, acetaldehyde to ethanol by AdhB and AdhA, with NADH/NAD+ recycling); biogenesis (fatty acids, amino acids, other biomolecules). (B) Plasmid maps: pPBwt (lac promoter, Z. mobilis pdc RBS, pdc, Z. mobilis adhB RBS, adhB; pBBR backbone with aacC1 (GnR), mob, rep) and pPBAsyn (dps100 and ydfZ promoters, Z. mobilis pdc RBS, pdc*, Z. mobilis adhB RBS, adhB*, Z. mobilis adhA RBS, adhA*; pBBR backbone with aphA (KnR), mob, rep).]

Figure 2. Engineering a homoethanol pathway in E. coli. (A) Metabolic map of central carbon metabolism in fermentation-deficient RL3019 containing an ethanol cassette. Deletions of mixed acid fermentation pathways make the homoethanol pathway the only route for NADH recycling to enable generation of ATP by glucose fermentation. (B) Two plasmid configurations encoding homoethanologenesis genes from Z. mobilis. pPBAsyn and pPBwt are used as starting points for optimization of enzyme levels. *, E. coli codon-optimized genes.

[Figure 3 graphic. (A) Consensus and degenerate library sequences for the promoter (-35, -10, dis, and TSS regions) and for the pdc, adhB, and adhA RBSs. Libraries: promoter library, 72 variants; pdc RBS library, 12 variants; adhB RBS library, 16 variants; adhA RBS library, 12 variants. Combinatorial assembly into the pBBR backbone (spcR, mob, rep) gives four operon configurations: PBA and ABP (1.7 × 10^5 combinations each), PB (1.4 × 10^4 combinations), and PA (1.0 × 10^4 combinations). (B) Growth enrichment scheme: cultures grown to 0.5 OD600 and diluted for passages 2 and 3; plot of growth rate (h-1) versus passage number for ABP library 1, with labeled rates of 0.03 and 0.09 h-1. (C) Plots of consensus similarity score versus promoter ID# (1 to 72) and of predicted translational initiation rate (AU) versus adhA RBS ID# (1 to 12), adhB RBS ID# (1 to 16), and pdc RBS ID# (1 to 12), each grouped into strong, medium, and weak categories. Annotations: strong promoter enriched; weak adhA RBS enriched; strong adhB RBS enriched; strong pdc RBS enriched. Sequences shown under the plots: promoter ---TTGACA--17nt--GATAATGGG---A-; adhA RBS --UAAAGAUGUucaccAUG--; adhB RBS --gccuUAAGGGGGAuagcuAUG--; pdc RBS --UCAGGAGAUguuacAUG--.]

Figure 3. Enrichment of optimal genetic expression elements during OptSSeq. (A) (Top) Degenerate sequences used to define promoter and RBS libraries. Red, degenerate positions. (Bottom) Structures of the four different operon configurations tested. (B) Scheme for growth enrichment of optimal ethanologenic strains. Representative average growth rates for libraries collected over the course of growth selection are shown on the right (replicate 1 of the ABP library). (C) (Top) Consensus similarity scores (CSS; refs 52 and 53) for the 72 promoter variants and predicted translation initiation rates (TIRs; ref 30) for RBS variants. Expression element variants are ordered into strong (green), medium (yellow), and weak (red) categories. A representative variant (pABP1) found after 3 rounds of enrichment for the ABP library is circled on the plot, with the precise sequence of this isolate shown under the plot.
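The combination counts labeled in Figure 3A follow directly from multiplying the sizes of the libraries that each operon configuration contains. A short Python check is sketched below; the variable names are illustrative, and the mapping of configurations to elements is an assumption based on the genes named in each configuration (P = pdc, B = adhB, A = adhA).

```python
from math import prod

# Library sizes as labeled in Figure 3A.
variants = {"promoter": 72, "pdc_rbs": 12, "adhB_rbs": 16, "adhA_rbs": 12}

# Elements varied in each configuration (assumed from the configuration names).
configurations = {
    "PBA": ["promoter", "pdc_rbs", "adhB_rbs", "adhA_rbs"],
    "ABP": ["promoter", "adhA_rbs", "adhB_rbs", "pdc_rbs"],
    "PB": ["promoter", "pdc_rbs", "adhB_rbs"],
    "PA": ["promoter", "pdc_rbs", "adhA_rbs"],
}

def library_size(elements):
    """Number of distinct cassettes: the product of the element library sizes."""
    return prod(variants[e] for e in elements)

sizes = {name: library_size(elems) for name, elems in configurations.items()}
for name, size in sizes.items():
    print(f"{name}: {size} combinations (~{size:.1e})")
```

The products are 165,888 for PBA and ABP, 13,824 for PB, and 10,368 for PA, which round to the 1.7 × 10^5, 1.4 × 10^4, and 1.0 × 10^4 values shown in the figure.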

[Figure 4 graphic. Heat maps of fractional representation at passages 0, 1, 2, and 3 for (A) promoters, ordered from consensus (Cons.) to nonconsensus (NonCons.) sequence, and (B-D) adhA, adhB, and pdc RBSs, ordered from strong to weak predicted strength. Color scale runs from 0% to 50% fractional representation, with a separate color for >50%. Below each heat map, a plot of degree of enrichment (Epromoter, EadhARBS, EadhBRBS, EpdcRBS; axis 0.00 to 1.00) versus passage number (zero, 1, 2, 3); each plot is labeled with a value of 0.99. Asterisks mark selected variants.]

Figure 4. Heat map of signal enrichment of promoters and RBSs during ABP growth selection. Promoters (A) are ordered based on CSS (refs 52 and 53). adhA RBSs (B), adhB RBSs (C), and pdc RBSs (D) are ordered based on predicted TIRs (ref 30). The degree of enrichment (Ex; Methods) for each step in the enrichment is shown in the plots below the heat maps. Increasingly dark shades of blue correspond to increasingly large fractional representation of a particular sequence in the library, from 0% (white) to 50% (dark blue); magenta, >50% fractional representation. Asterisks (*) indicate expression elements present in isolates chosen for further characterization.
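The quantity plotted in the heat maps, fractional representation, is the share of sequencing reads assigned to each variant at a given passage. The Python sketch below shows one way to compute it and to bin it according to the color scale described in the caption. The read calls and variant labels are hypothetical, the helper names are illustrative, and the degree of enrichment (Ex) defined in Methods is not reproduced here.

```python
from collections import Counter

def fractional_representation(variant_calls):
    """Map each variant ID to its share of all sequenced reads."""
    counts = Counter(variant_calls)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}

def shade(fraction):
    """Bin a fraction following the caption's color scale."""
    if fraction > 0.5:
        return "magenta"  # >50% fractional representation
    return "white" if fraction == 0 else "blue"  # 0% is white, up to 50% is blue

# Hypothetical per-read variant calls for one passage (not data from the paper).
reads = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
fractions = fractional_representation(reads)
for variant, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
    print(variant, f"{frac:.0%}", shade(frac))
```

Repeating the calculation for each passage gives one heat-map column per passage for a given element library.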

[Figure 5 graphic. (A) Color key for the operon configurations (PBA, ABP, PB, PA) indicating enrichment for strong or weak promoters and RBSs, with degree of enrichment E scaled from 0 to 1. (B-E) Heat maps of fractional representation (low to high) for two replicates (Rep1, Rep2) of each library, with variants arranged from strong to weak. Asterisked variants: promoters 1, 6, 10, 14, 15, 25, and 27; pdc RBSs 3, 4, and 5; adhB RBS 1; adhA RBSs 1, 2, 4, and 10. Degrees of enrichment shown below the columns (Rep1, Rep2):
Epromoter: PBA 0.27, 0.07; ABP 0.99, 0.76; PB 0.06, 0.10; PA 0.61, 0.07
EpdcRBS: PBA 0.41, 0.66; ABP 0.99, 0.96; PB 0.45, 0.30; PA 0.76, 0.30
EadhBRBS: PBA 0.90, 0.93; ABP 0.99, 0.73; PB 0.92, 0.96
EadhARBS: PBA 0.23, 0.08; ABP 0.99, 0.72; PA 0.62, 0.63]

Figure 5. Extent of enrichment of promoters and RBSs for the PBA, ABP, PB, and PA libraries. (A) Color-coded representation of promoter and RBS enrichments for strong or weak signals. Less saturated colors represent a low degree of enrichment. (B-E) Heat maps of final enrichments for promoter variants (B), pdc RBS variants (C), adhB RBS variants (D), and adhA RBS variants (E), arranged from highest (top) to lowest (bottom) predicted strengths. The calculated degrees of enrichment (Ex) are shown below each column. Color coding and asterisks are as described in the legend to Figure 4.

[Figure 6 graphic. (A) Ethanol production (pmol s-1 μg-1 TCP) versus growth rate (h-1) for RL3019 carrying pPBwt, pPBAsyn, pPA1, pPA2, pPB1, pPB2, pPBA1, pPBA2, pABP1, or pABP2, and for Z. mobilis ZM4, with the iJR904 prediction; below, a histogram of enzyme molecules per cell (×10^5) for Pdc, AdhB, and AdhA. (B) Growth rate (h-1) versus Pdc-Adh as a fraction of TCP (growth rate vs. enzyme level), with the growth rate limit predicted by the cost of enzyme overexpression. (C) Expression elements (-35, -10, and dis promoter regions; pdc and adhB RBSs): pPB1, promoter #25 (TTTACA-CATAATGTG); pPB2, promoter #1 (TTGACA-TATAATGTG); both with pdc RBS #3 (UAAGGAGAU) and adhB RBS #1 (UAAGGGGGA). Labeled Pdc/cell values: ~170,000 and ~270,000; labeled AdhB/cell values: ~29,000 and ~69,000.]

Figure 6. Performance of optimized ethanologenic expression cassettes. (A) Growth rates and ethanologenesis rates of expression cassettes compared with predictions from the iJR904 metabolic model for growth rate linked to ethanologenesis rate (dotted line). Error bars are standard deviations from triplicate measurements. The histogram below the plot represents the total Pdc, AdhB, and AdhA enzyme molecules per cell for each expression cassette. (B) Relationship between growth rates and ethanologenic enzyme levels. The predicted maximum possible growth rate based on the cost of protein overproduction (red line) was derived from the model of Scott et al. (ref 5) and the unburdened ethanologenic growth rate of 0.38 h-1 predicted by the modified iJR904 model for RL3019 (Methods). (C) Identities of expression elements in pPB1 and pPB2.
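The red line in panel B can be pictured with a simple protein-burden relation in which the attainable growth rate falls linearly as a larger fraction of total cell protein (TCP) is devoted to the overexpressed enzymes. The Python sketch below is a rough illustration under stated assumptions, not the authors' calculation: it assumes the linear form λ = λ0 (1 − φ/φc), takes λ0 = 0.38 h-1 from the caption, and uses φc = 0.48 as an assumed placeholder for the protein fraction at which growth would stop. The function and parameter names are hypothetical.

```python
def growth_rate_limit(phi, lambda0=0.38, phi_c=0.48):
    """Linear protein-burden sketch: lambda = lambda0 * (1 - phi / phi_c).

    phi     -- Pdc-Adh mass as a fraction of total cell protein (TCP)
    lambda0 -- unburdened growth rate in h^-1 (0.38, from the caption)
    phi_c   -- fraction at which growth stops (ASSUMED value, not from the source)
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be a fraction between 0 and 1")
    # Clamp at zero: the limit cannot be negative once phi exceeds phi_c.
    return max(0.0, lambda0 * (1.0 - phi / phi_c))

# Evaluate across the x-axis range shown in panel B.
for phi in (0.0, 0.1, 0.2, 0.3, 0.4):
    print(f"phi={phi:.1f}  limit={growth_rate_limit(phi):.3f} h^-1")
```

Under these assumptions the limit declines from 0.38 h-1 with no burden toward zero as the enzyme fraction approaches φc; a different φc would change the slope but not the linear shape.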
