
Engineering a New Class of Anti-LacI Transcription Factors with Alternate DNA Recognition
Ronald E. Rondon and Corey J. Wilson
ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.8b00324 • Publication Date (Web): 02 Jan 2019


ACS Synthetic Biology

Engineering a New Class of Anti-LacI Transcription Factors with Alternate DNA Recognition

Ronald E. Rondon1 and Corey J. Wilson1†

1 Georgia Institute of Technology, School of Chemical & Biomolecular Engineering



To whom correspondence should be addressed: Corey J. Wilson, Georgia Institute of Technology, School of Chemical & Biomolecular Engineering, 311 Ferst Drive, Atlanta, GA 30332-0100. E-Mail: [email protected].

Keywords: Engineered Transcription Factors, Antilacs, Alternate DNA Recognition

ACS Paragon Plus Environment


ABSTRACT

The lactose repressor, LacI (I+YQR), is an archetypal transcription factor that has been a workhorse in many synthetic genetic networks. LacI represses gene expression in the absence of ligand (apo state) and is induced upon binding of the ligand isopropyl β-D-1-thiogalactopyranoside (IPTG). Recently, laboratory evolution was used to confer inverted function in the native LacI topology, resulting in anti-LacI (antilac) function (IAYQR), where IPTG binding results in gene suppression. Here we engineered 46 antilacs with alternate DNA binding function (IAADR). Phenotypically, IAADR transcription factors are the inverse of wild-type I+YQR function and possess alternate DNA recognition (ADR). This collection of bespoke IAADR binds orthogonally to disparate non-natural operator DNA sequences and suppresses gene expression in the presence of IPTG. This new class of IAADR gene regulators was designed modularly via the systematic pairing of nine alternate allosteric regulatory cores with six alternate DNA binding domains that interact with complementary synthetic operator DNA sequences. The 46 IAADR identified in this study are also orthogonal to the naturally occurring operator O1. Finally, a demonstration of full orthogonality was achieved via the construction of synthetic genetic toggle switches composed of two non-synonymous unit pair operations that control two distinct fluorescent outputs. This new class of IAADR transcription factors will facilitate the expansion of the computational capacity of engineered gene circuits by increasing the number of unique transcription factors (or systems of transcription factors) that can simultaneously regulate one or more promoters, and thereby the number of gene outputs that can be controlled.


INTRODUCTION

The lactose repressor (LacI) is a canonical molecular switch, serving as the central regulatory protein in the lac operon of Escherichia coli (E. coli). The LacI transcription factor is a two-part system composed of a repressor protein and its corresponding operator DNA site (O1), which function together as a biological switch controlling gene expression. Under normal cellular conditions, wild-type LacI will bind to the O1 operator DNA sequence and repress the transcription of downstream genes by physically blocking and compromising the activity of RNA polymerase.1 Upon binding of the chemical signal isopropyl-β-D-thiogalactoside (IPTG) (an analog of the natural inducer 1,6-allolactose), LacI undergoes a conformational shift that results in a ~20-fold reduction in the affinity of the transcription factor for its cognate DNA operator, thereby increasing the amount of mRNA transcribed.2 The functional unit of LacI is a dimer, which represents the minimal requirement for repression (see repressor function (I+), Figure 1A); each monomer is composed of 360 amino acids. The first 60 residues constitute the N-terminal DNA binding domain (DBD). The dimerized DBD recognizes the O1 operator, assisted by the helix-turn-helix (HTH) motif. The allosteric core follows (residues 61-330) and is structurally partitioned into N- and C-subdomains, with three crossovers between the two regions.3 The cleft between these two subdomains forms the inducer binding pocket.1 Dimer assembly is achieved through the monomer-monomer interface located within the C-subdomain, while the N-subdomain is responsible for mediating and propagating the allosteric signal between the ligand binding site and the DNA binding domain (see Supplementary Information, Figure S1). Residues 331-360 make up the C-terminal tetramerization domain, which facilitates the association of two dimeric functional units.4-7 In general, specific protein-DNA interactions require contact with approximately 12 base-pairs.
LacI achieves binding specificity with its natural O1 DNA sequence (~20 base-pairs in length, see Figure 2A) via the coupling of two DNA binding domains through dimerization mediated by the C-subdomain. Several groups have conferred alternate operator DNA binding in the wild-type LacI scaffold.8-11 A summary of the nine DBD and corresponding operators used in this study is given in Figure 2. Sartorius et al. were the first to report sets of alternate DBD and corresponding non-natural operator pairs that were functional in the LacI scaffold.10 Notably, this study revealed two non-natural transcription factor / operator systems that were orthogonal. Namely, transcription factor variant I+HQN interacts with non-natural operator Ottg (Figure 2C), and I+VAN interacts with Otta (Figure 2D, which denotes Otta; I+TAN will be discussed below); however, I+HQN and I+VAN cannot interact with Otta and Ottg (respectively), and both are orthogonal to the native operators O1 and OSYM (Figure 2A). Lewis et al. generated a large library of LacI variants with mutations in positions Y17, Q18, and R22 (i.e., residues previously shown to be critical in DNA recognition).8 This library was screened for activity using a green fluorescent protein (GFP) reporter under the control of various operators in the presence and absence of IPTG. The operators investigated in this work include the native pseudo-palindromic O1 operator, the ideal symmetric OSYM operator12 (Figure 2A), and the non-natural Ogta operator (Figure 2B), a double mutant of OSYM that was no longer recognized by the wild-type DBD. Screening of the library against the Otta operator (Figure 2D) revealed three LacI variants (i.e., I+TAN (Y17T/Q18A/R22N), shown in Figure 2D; I+IAN (Y17I/Q18A/R22N); and I+VAN (Y17V/Q18A/R22N)) that were capable of repressing the Otta operator but were no longer capable of repressing the gene when under the control of the OSYM operator, demonstrating orthogonality. Lewis et al.
conducted a subsequent study in which over 8000 LacI variants were generated, encompassing a fully randomized library of positions Y17, Q18, and R22.9 This library was tested against 64 putative operator variants with the sequence 5’-A ATT NNN GCT ZZZ AAT T-3’, where “N” is any nucleotide and “Z” is the complement necessary to achieve full symmetry. In total, 332 non-synonymous transcription factor (I+XXX) / operator (ONNN) combinations were identified, though not explicitly tested for orthogonality. From this study we selected three transcription factor / operator sets (Figure 2E, 2F, and 2G) that we believed would function orthogonally. In a complementary study, Zhan et al.11 screened a library of palindromic operators containing single or double mutations at positions 7 and 9 within the OSYM DNA operator. This study identified five transcription factor / operator sets with orthogonal DNA binding functions, with three pairs that overlap with other studies (e.g., OSYM/IxYQR12, Ogta/IxNAR9, 10, and Ottg/IxHQN10). Although the authors of this study were able to show that each LacI variant could interact with operator DNA, induction of these systems was not demonstrated. We selected two sets (Figure 2H and 2I) from Zhan et al., where Ogtg’ and Ogta’ are alternate operators in which base pair position 9 was modified.

In addition to the ability to engineer the DNA binding function of LacI, recent studies have shown that laboratory evolution can be used to confer alternate allosteric phenotypes in the LacI scaffold. This is epitomized by the development of repressors bearing the anti-LacI (antilac) phenotype (IA).3, 13, 14 An antilac is a system that allows gene expression only in the absence of inducer and suppresses gene expression upon binding of the chemical signal IPTG (see Figure 1C). Poelwijk et al. engineered IA variants through two successive rounds of error-prone PCR (EP-PCR) in a study aimed at the evolution of gene regulation.14 This effort focused on conferring adaptation to opposing selective pressures. In the first round, mutations resulted in the elimination of logical control over gene expression (i.e., constitutive gene expression), whereas the second round of EP-PCR resulted in the IA phenotype. The authors concluded that the antilacs observed in their study were the result of epistasis via a first-round mutation, S97P. In a set of related studies,3, 13 Wilson et al. proposed an alternate means to achieve antilac function without the use of selective pressures. Experimentation led Wilson et al.
to develop the working hypothesis that epistasis with a single IS mutation (i.e., a super-repressor mutation that renders LacI insensitive to the ligand IPTG, see Figure 1B) is essential for rerouting communication in the LacI scaffold via compensatory mutations. To test this hypothesis, five different IS mutations (i.e., D88A, K84A, V95A, V95F and D275F) were introduced into LacI, followed by a single round of EP-PCR to confer IA function. Between the two studies, 19 new antilac transcription factors IA(X)YQR (where X equals the engineered variant ID; see Supplementary Information, Figure S1 (inset) for the nine antilacs used in this study) were engineered from a given IS progenitor. Each IA(X)YQR bears the wild-type DBD (YQR) and thus interacts with the wild-type operator O1 (see Figure 2). Notably, the resulting antilacs display a great degree of variation in repression strengths and ligand sensitivities.

The present study seeks to engineer a new class of transcription factors via the development of antilac transcription factors with alternate DNA recognition (ADR) to corresponding non-natural operators (IAADR). The objective of the study is to engineer a large collection of IA(X)ADR transcription factors that are orthogonal in operator DNA binding to select IA(X)YQR variants we developed in Richards et al.13 Functionally, this collection of IA(X)ADR transcription factors is expected to suppress transcription in the presence of the chemical signal IPTG (such that the systems reversibly bind to the ligand and operator DNA), with stringent pairwise specificity between the engineered transcription factor and cognate synthetic (symmetric) operator sequence. To achieve this, we modularly designed a large collection of non-natural IA(X)ADR transcription factors, each bearing one of the following ADR DBD-motifs: NAR, HQN, TAN, GKR, HTK or KSL (see Figure 2).
The engineering workflow involves the exchange of the wild-type DBD (YQR) in nine antilac variants IA(X)YQR (see Supplementary Information, Figure S1), with one of the six ADR DBD-motifs given above. All 54 combinations were tested, and this combinatorial set yielded 46 functional IAADR transcription factors, and corresponding controls (i.e., six I+ADR transcription factors).
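The combinatorial design space described above can be enumerated directly. The following is a minimal Python sketch, not the authors' tooling: the construct labels follow the IA(X)ADR naming used in the text, while the variable and function names are illustrative assumptions.

```python
from itertools import product

# Nine antilac regulatory cores, labelled by variant ID as in the text: IA(1) ... IA(9).
CORES = [f"IA({i})" for i in range(1, 10)]

# Six alternate-DNA-recognition (ADR) DBD motifs named in the text.
ADR_MOTIFS = ["NAR", "HQN", "TAN", "GKR", "HTK", "KSL"]


def enumerate_designs(cores, motifs):
    """Return every core/motif pairing as a construct label, e.g. 'IA(1)NAR'."""
    return [f"{core}{motif}" for core, motif in product(cores, motifs)]


designs = enumerate_designs(CORES, ADR_MOTIFS)
print(len(designs))  # 9 cores x 6 motifs = 54 putative constructs
print(designs[:3])   # ['IA(1)NAR', 'IA(1)HQN', 'IA(1)TAN']
```

The enumeration only lists the 54 putative constructs; which 46 of them are functional is an experimental outcome and is not predicted by this sketch.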


RESULTS AND DISCUSSION

Benchmarking Alternate DNA Binding Under Wild-type (I+) Allosteric Control. Given that many of the I+ADR transcription factors of interest in this study are from disparate sources, evaluating each of these unit pair systems using the same genetic architecture, and under the same conditions, is a prudent set of control experiments. Initially we selected eight ADR DBD-motifs (i.e., NAR, HQN, TAN, GKR, HTK, KSL, AWR, and RQR) and corresponding operators for this study (see Figure 2). Our analysis of existing data from Sartorius et al.,10 Lewis et al.,8, 9 and Zhan et al.11 suggests that the selected ADR DBD-motifs (and wild-type YQR) are potentially orthogonal. Thus, we hypothesized that the eight ADR plus the wild-type pair (see Figure 2) will interact with their cognate DNA operators in a way that is orthogonal (i.e., a given DBD will only interact with one operator DNA element, without cross interactions with any of the remaining seven operators). We constructed eight I+ADR and complementary operators using procedures outlined in the Materials and Methods section. The reporter system used in this study is an engineered construct composed of a single DNA operator site (Ox ≡ O1, Ogta, Ottg, Otta, Ogac, Octt, Oagg, Ogta’ or Ogtg’) located downstream of a promoter element, which is followed by the GFP reporter gene (see Supplementary Figure S2). The corresponding transcription factor (I+x ≡ I+YQR, I+NAR, I+HQN, I+TAN, I+GKR, I+HTK, I+KSL, I+AWR, or I+RQR) is located on a separate plasmid (see Supplementary Figure S2). The eight unit pairs (I+ADR/O1) plus wild-type were induced with 10 mM IPTG to ensure saturation of binding sites. Although Zhan et al.
have previously shown a putative ‘repression matrix’ for unit pairs I+NAR/Ogta, I+HQN/Ottg, I+AWR/Ogta’, and I+RQR/Ogtg’ (i.e., showing exclusive pairwise specificity), this set of I+ADR / operator variants was not assayed in the presence of IPTG; thus, inducibility was never conclusively demonstrated. Accordingly, we conducted microplate assays in vivo (in E. coli; see Materials and Methods) to assess the phenotype and performance metrics of the eight ADR unit pairs relative to the wild-type system (i.e., I+YQR/O1). Exhaustive evaluation of each of the eight I+ADR (and wild-type I+YQR) transcription factors against all nine cognate operators demonstrates that six of the I+ADR and corresponding palindromic operators are not only orthogonal to one another but are also orthogonal to the naturally occurring O1/I+YQR unit pair (see Figure 3A). Repressor variant I+RQR had significant cross interaction with the wild-type operator O1 in addition to its cognate operator Ogtg’ (see Figure 3B). Likewise, I+NAR was promiscuous and interacted with both Ogta and Ogta’. The native repressor I+YQR interacted with non-cognate operator Ogta’. Interestingly, the alternate repressor I+AWR was unresponsive to 10 mM IPTG when paired with its putative unit complement, the Ogta’ operator (see Figure 3B). Accordingly, the I+AWR/Ogta’ and I+RQR/Ogtg’ unit pairs were excluded from forthcoming modular design and were classified as unacceptable ADR for the purposes of this study. Previous investigations have shown that variation in the operator DNA sequence alone (i.e., in the absence of a given transcription factor) can affect basal transcription and translation.15 Mechanistically, this is the result of changes in the folding free-energy of hairpin formation, which could interfere with the translational machinery (i.e., resulting in differences in GFP output due to the sequence of the 5’ untranslated region (UTR)).
Likewise, in this study we observe that different DNA operators yielded varying levels of GFP fluorescence in the absence of a given repressor protein. Accordingly, we employed an ‘insulator’ element, specifically the ribozyme RiBoJ, which has been shown to be an effective buffer against transfer-function variability.16 RiBoJ autocatalytically cleaves upstream sequences; therefore, placing RiBoJ between the promoter region and the GFP coding sequence would in principle allow for translation of the reporter gene and eliminate variability in the 5’ UTR (see Figure 3C). RiBoJ was thus added downstream of the operator site but upstream of the ribosome binding site (RBS) for each of the operator sequences tested in this study. In turn, we assessed the performance metrics of the insulated reporter using microplate assays, with and without I+ADR (relative to wild-type I+YQR). In the unrepressed state (i.e., without repressor proteins) all operators produced GFP at similar levels, which was expected. In addition, the GFP output across all conditions increased (i.e., with and without the repressor present), see Figure 3D. However, the basal expression (i.e., “leakiness” or expression in the repressed state) across all operator-repressor combinations increased more than the unrepressed GFP output, which led to a decrease in the apparent dynamic range for many of the repressors tested (see Figure 3E). Thus, these experiments confirmed our hypothesis that the differences in unrepressed output for each of the operator variants were due to interactions stemming from the 5’ UTR, which could be alleviated using an insulator. However, the observed decrease in dynamic range across the variants tested reveals that the use of an insulator may not be beneficial for all applications, specifically those in which maximum differences between repressed and induced states are required. For this reason, we chose to revert to our original design strategy (i.e., sans insulator).

Engineering Non-natural Operator DNA Binding with Alternate Allosteric Phenotypes (IAADR). Alternate phenotypes for cooperative communication in the LacI scaffold have been previously engineered by Wilson et al.3, 13 by first introducing IS point mutations that block allosteric communication, followed by EP-PCR to generate mutations capable of conferring alternate repressive function. EP-PCR was applied to the allosteric core of LacI (residues 62-322), leaving the DNA binding domain (DBD) of LacI unaffected, which led us to hypothesize that the DBD could be altered while leaving allosteric function unaffected. Toward this end, the six ADR DBD-motifs identified to function orthogonally in the wild-type regulatory core (I+x) (Figure 2, within the box) were introduced into nine antilac regulatory cores (i.e., IA(1-9)X) stemming from three independent IS mutations (K84A, V95A, V95F), see Supplementary Figure S1.
The nine IA(1-9)X were selected (opposed to all fourteen published IAX13) based on performance metrics in solution. Using the modular protein design strategy we proposed in Davey and Wilson17, 54 putative antilacs with alternate DNA recognition (IAADR) were constructed. In turn, we assessed the performance metrics for all 54 IAADR (plus nine wild-type controls IAYQR), using the single operator system we engineered for this study. The majority of the engineered systems display a high degree of modularity as most of the IAADR unit pairs maintain their pairwise specificity for their cognate operator, while maintaining allosteric communication, see Figure 4. Out of the 54 putative antilacs, 46 IAADR unit pairs exhibit the expected phenotype; whereas, the remaining 8 transcription factors were unresponsive to 10mM IPTG. Interestingly, three of the parental controls (IA(1)YQR, IA(2)YQR, and IA(4)YQR) were unresponsive to 10mM IPTG. It is important to note that the nine antilacs chosen for this study were originally reported to exhibit IA phenotypes.13 In the Richards et al. report, we utilized the pZS*22-sfGFP plasmid, which contains the PLlacO-1 promoter and operator region.18 The PLlacO-1 promoter was constructed by replacing the cI binding sites with sequences encoding the O1 operator and therefore contains one full O1 sequence directly downstream of the -10 hexamer, in addition to an 18bp fragment of O1 between the -35 and -10 hexamers (O1+, see Figure 2). This fragment encompasses more than the minimal length required for the specific recognition of the lactose operator19, thus O1+ provides a putative alternate binding site for the reported IAYQR antilacs. In contrast, our current construct for reporting utilizes a system containing only a single O1 operator site (instead of O1+). 
Accordingly, the observed phenotype for IA(1)YQR, IA(2)YQR, and IA(4)YQR can be reconciled, as the systems reported in our previous study were engineered toward O1+ (not O1); thus, modulated antilac (IA(x)YQR) binding may vary when assayed via a pristine O1 DNA sequence.

Assessment of Orthogonal Gene Suppression Imposed by Engineering Antilacs with ADR. To explicitly test the resulting IAADR for orthogonal gene suppression, we exhaustively evaluated all 54 antilac constructs (and nine IA(x)YQR controls) as unit pairs with all six non-natural operators (plus the naturally occurring O1 operator). This combinatorial set resulted in nine suppression matrices (i.e., 441 antilac/operator unit pairs) evaluated with and without 10 mM IPTG (see Figure 5). While our results in the previous section identified 46 functional IAADR unit pairs, full assessment of all 54 constructs was necessary to determine whether any unresponsive IAADR constructs could interact with other operators not designated as complementary by design. Within each of the nine antilac clusters, all 46 functional IAADR displayed orthogonal gene suppression upon ligand binding (i.e., where a single transcription factor suppresses gene expression via a single complementary synthetic DNA operator), see Figure 5. In other words, in the presence of ligand, suppressed gene expression was observed for all 46 functional variants only in cases in which a given IAADR is paired with its complementary operator. In the absence of ligand, IA(5)HQN exhibits promiscuous cross-interaction with non-cognate DNA operator Otta, reducing (i.e., repressing) gene expression by 30% relative to all other non-cognate operators (see Figure 5E). Suppression matrices for the IA(1)ADR, IA(4)ADR, IA(5)ADR, and IA(8)ADR sets exhibited orthogonal suppression for all six engineered antilacs and corresponding non-natural operators, see Figure 5A, D, E, and H (respectively). Matrices for the IA(3)ADR, IA(6)ADR, and IA(9)ADR sets were the second most successful clusters, with five (out of six) ADR-antilacs displaying exclusive ligand-induced interaction with complementary operator DNA, see Figure 5C, F and I (respectively). The IA(2)ADR set (Figure 5B) was the least successful modular design cluster, with only three (of six) unit pairs exhibiting orthogonal (and specific) operator DNA interactions. Notably, antilac variant IA(2)YQR has the smallest reported dynamic range at 8.18%13, and in this study resulted in the poorest modular design outcomes. Each of the three parental systems (IA(1)YQR, IA(2)YQR, and IA(4)YQR) that were unresponsive to 10 mM IPTG partially represses the expression of GFP.
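The size of the suppression-matrix screen and the orthogonality criterion described above can be made concrete with a short sketch. This is an illustration under stated assumptions rather than the authors' analysis: the function name, the fractional-suppression representation, the 0.5 threshold, and the toy values are all hypothetical and are not measurements from the study.

```python
# Screen size: 54 engineered antilacs plus 9 parental IA(x)YQR controls,
# each paired with 6 non-natural operators plus the natural O1 operator.
N_TRANSCRIPTION_FACTORS = 54 + 9
N_OPERATORS = 6 + 1
N_UNIT_PAIRS = N_TRANSCRIPTION_FACTORS * N_OPERATORS
print(N_UNIT_PAIRS)  # 441


def is_orthogonal(suppression, cognate, threshold):
    """Return True if each transcription factor suppresses only its cognate operator.

    suppression: dict mapping (tf, operator) -> fractional drop in reporter
                 output on adding ligand (0.0 = none, 1.0 = complete).
    cognate:     dict mapping tf -> its designed (complementary) operator.
    threshold:   hypothetical cutoff at or above which a pair counts as suppressed.
    """
    for (tf, operator), value in suppression.items():
        suppressed = value >= threshold
        if suppressed != (operator == cognate[tf]):
            return False
    return True


# Toy values, invented purely for illustration (NOT data from the study).
cognate = {"IA(8)HQN": "Ottg", "IA(8)TAN": "Otta"}
toy = {
    ("IA(8)HQN", "Ottg"): 0.80, ("IA(8)HQN", "Otta"): 0.05,
    ("IA(8)TAN", "Ottg"): 0.02, ("IA(8)TAN", "Otta"): 0.75,
}
print(is_orthogonal(toy, cognate, threshold=0.5))  # True
```

A set fails this check either when a transcription factor suppresses a non-cognate operator (crosstalk) or when it fails to suppress its own operator (an unresponsive construct), which mirrors the two failure modes discussed in the text.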
The order of repression (i.e., DNA binding sans ligand) for the putative antilacs is as follows: IA(2)YQR >> IA(4)YQR > IA(1)YQR (i.e., IA(2)YQR has the greatest repression strength). Practically speaking, the exceptional repression strength conferred by the IA(2)x regulatory core appears to limit the designability of IA(2)ADR (see Figure 5B). In contrast, engineered ADR-antilacs from parents IA(4)x and IA(1)x cores yield the maximum number (six) of IAADR sets. Accordingly, in certain cases initial (i.e., without ligand) repression strength appears to impose a threshold on antilac ADR modular design. In general, our hypothesis stands: independently engineered regulatory cores that exhibit IA phenotypes can be modularly paired with alternate operator DNA binding motifs.

Assessment of Increased Signal Concentration on IAADR Suppression. As noted in the previous section, the IA(2)ADR set (Figure 5B) was the least successful modular design cluster, with three (i.e., IA(2)TAN, IA(2)HTK, and IA(2)KSL) unit pairs exhibiting unresponsive phenotypes to 10 mM IPTG. To examine whether IA(2)TAN, IA(2)HTK, and IA(2)KSL were entirely unresponsive to the inducer, we assessed gene suppression for these variants using excess concentrations of IPTG (i.e., 100 mM), see Figure 6A. Even at 100 mM IPTG, the three variants remained incapable of increased suppression, such that any observed changes in gene output were statistically insignificant. To confirm that a concentration of 10 mM represented saturating conditions throughout all variants, two sets of antilacs (i.e., IA(1)X, Figure 6B, and IA(8)X, Figure 6C) were assayed at 100 mM, such that the two clusters represent antilacs originating from two different IS parents (V95A and K84A, respectively). GFP expression levels for the two clusters revealed no significant enhancements in gene suppression at 100 mM IPTG (relative to 10 mM ligand concentration).
Similar results were obtained for the six I+ADR (and I+YQR), in which IPTG functions as an inducer molecule (as opposed to a co-repressor), see Figure 6D. Taken together, these results confirm that 10 mM IPTG is saturating for IAADR variants.

Effect of Operator Number and Symmetry on IAADR Performance. In addition to ligand concentration, properties of the operator can potentially influence the modular function of a given unit pair. In principle, the number of operators and operator symmetry (i.e., in the case of O1) are tunable. To elucidate the impact of the number of operator sites on the observed characteristics of the repressor variants, we devised a construct containing two tandem (repeating) operators, see Figure 7A (inset). We then proceeded to assay the activity of repressors bearing the corresponding DNA binding domain for a subset of the antilacs and compared these results with those obtained with only the single promoter/operator region. As can be seen in Figure 7A, in all cases, the presence of a second operator region allowed for a decrease in basal expression while also decreasing the maximum gene output in the unrepressed state. More importantly, in some cases (i.e., IA(3)HTK and IA(7)HTK) the presence of the second operator region can result in a gain of IA function, see Figure 7A. This serves to elucidate the importance of the number and location of operator sites in the context of genetic circuits and provides yet another way in which the parameters of a system can be tuned to meet design needs. The six non-natural operators that resulted in orthogonal complementarity to discrete IAADR were all symmetric and were modified at base pair positions 5, 6, and 7 (see Figure 2). However, alternate operators in which base pair position 9 was modified (i.e., Ogtg’ and Ogta’) resulted in crosstalk with more than one repressor, see Figure 3B. Certain parental unit pairs (i.e., IA(1)YQR, IA(2)YQR, and IA(4)YQR) were unresponsive to 10 mM IPTG when paired with O1. However, in the presence of symmetric non-natural operators, IAADR variants derived from these parental regulatory cores were responsive to 10 mM IPTG, see Figure 5A, B and D. To assess whether operator symmetry can influence the performance of antilacs, the nine parental IAYQR chosen for this study were assayed in the presence of OSYM operator DNA (i.e., IAYQR/OSYM), see Figure 7B. The OSYM operator is a symmetric variant of O1 in which the right side has been mutated to match the left side of the operator with perfect symmetry, in a way that is analogous to the six non-natural operators that were identified as orthogonal. The natural repressor I+YQR has a 10-fold greater affinity for OSYM when compared to the native O1 operator.12 Interestingly, variant IA(4)YQR has a restored antilac phenotype when paired with OSYM, see Figure 7B.
However, the pairing of OSYM with unresponsive variants IA(1)YQR and IA(2)YQR does not confer antilac function. An increase in dynamic range was observed for unit pairs IA(7)YQR / OSYM and IA(5)YQR / OSYM. Thus, the presence of symmetric DNA can (in some cases) compensate for IA(X)YQR antilac binding properties that were engineered to pair with O1+. Moreover, this highlights the importance of iterative engineering between transcription factor and operator sets. Assessment of Full Orthogonality – Engineering Single-Signal Toggle Switches. To establish the ability to use two LacI variants with different DNA binding domains within the same cell without crosstalk (i.e., full orthogonality), two classes of synthetic single-signal toggle switches were constructed in which the fluorescent proteins GFP and mCherry (or red fluorescent protein (RFP)) were placed under the control of two distinct operators. The basic architecture of the gene circuit involves an I+ / Oxxx unit pair, regulating the expression of GFP; and an orthogonal IAADR and cognate operator, regulating the expression of mCherry. In the presence of IPTG the putative toggle switch will express GFP while suppressing mCherry production. However, in the absence of IPTG the simple gene circuit will facilitate the repression of GFP and the expression of mCherry. Here we constructed 5 toggle switches, utilizing five different antilac unit pairs (i.e., IA(8)HQN, IA(1)HQN, IA(6)HQN, IA(6)GKR, and IA(5)GKR) and two different complementary repressors (i.e., I+YQR and I+TAN). The first three single-signal toggle switches we constructed using the I+YQR / O1 unit pair coupled with an antilac unit pair IA(8)HQN / Ottg, IA(1)HQN / Ottg or IA(6)HQN / Ottg, see Figures 8A, B and C (respectively). 
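The intended logic of the single-signal toggle switch described above can be written as a two-row truth table. The sketch below is an idealized Boolean model added for illustration: it assumes fully on/off outputs and ignores leakiness, dynamic range, and kinetics, and the function name is hypothetical.

```python
def toggle_outputs(iptg_present: bool) -> dict:
    """Idealized Boolean model of the single-signal toggle switch.

    GFP is controlled by a repressor with wild-type allostery (I+):
    repressed without IPTG, expressed with IPTG.
    mCherry is controlled by an antilac (IA):
    expressed without IPTG, suppressed with IPTG.
    """
    gfp_on = iptg_present          # I+ releases its operator when IPTG is bound
    mcherry_on = not iptg_present  # IA binds its operator when IPTG is bound
    return {"GFP": gfp_on, "mCherry": mcherry_on}


for iptg in (False, True):
    print(f"IPTG={iptg}: {toggle_outputs(iptg)}")
```

In this idealized picture exactly one reporter is on in each state, and a single ligand selects between them; this depends on the two unit pairs not cross-interacting, which is the orthogonality requirement the toggle switches are meant to demonstrate.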
As expected, in the absence of IPTG, mCherry fluorescence dominates as GFP expression is repressed by I+YQR; upon the addition of IPTG, the system toggles to the opposing state in which GFP fluorescence dominates and mCherry expression is suppressed by the IA variant bearing the HQN DBD-motif. All of the systems demonstrated orthogonal regulation between the two unit pairs (without apparent cross interaction), and the coupled operations displayed performance metrics similar to those observed for independently controlled systems. In the second iteration, to demonstrate the orthogonality between synthetic operators, GFP was placed under the control of the I+TAN repressor, and RFP (rather than mCherry) was placed under the control of IA(5)GKR or IA(6)GKR, see Figures 8D and E (respectively). Once again, we observed an RFP-dominated state (-IPTG) and a GFP-dominated state
(+IPTG), both of which are stable. Canonical toggle switches (as described by Collins et al.20) utilize two repressors coupled to the regulation of two separate promoters, such that repressor one inhibits transcription from promoter one and is induced by ligand one, whereas repressor two inhibits transcription from promoter two and is induced by ligand two. The single-signal toggle switches reported in this study represent a new architecture that allows for toggled expression of two separate outputs (as opposed to a single output, as presented in the canonical system).

CONCLUSION

The control of gene expression is an important tool for metabolic engineering, the design of synthetic gene networks, gene-function analysis, and protein manufacturing.21-23 The most successful approaches to date are based on modulating mRNA synthesis via an inducible coupling to transcriptional effectors, which requires a biosensing function. A hallmark of biological sensing is the conversion of an exogenous signal, usually a small molecule or environmental cue such as a protein-ligand interaction, into a useful output or response. Over the past 17 years, biomolecular engineers have designed a broad variety of genetic architectures (e.g., oscillators,24-27 sensors,28 and switches20,29) that can be used in combination to confer new cellular functions30 using a relatively small collection of (DNA-binding) transcription factors (e.g., LacI, TetR, AraC, LuxR). The vast majority of the designed DNA-binding-based genetic circuits use fairly simple transcriptional controls (i.e., repression and induction via a single operator). However, the construction of synthetic, multi-input promoters is constrained by the number of unique transcription factors (or systems of transcription factors) that can simultaneously regulate a single promoter.
This fundamental engineering constraint is an obstacle to synthetic biologists because it limits the computational capacity of engineered gene circuits. One solution to this problem is to engineer bespoke non-natural transcription factors that are orthogonal to native cellular environments and that can work as systems to confer orthogonal control over gene production. One of the most utilized regulatory proteins is the lactose repressor (LacI),21-24,26 which controls gene expression via canonical induction (I+), see Figure 1A. In a recent review article,17 we explored the mechanochemical structure-function relationship of LacI and investigated the designability (tunability) of LacI. This review of the literature revealed strategies for the modular design of novel regulatory proteins fashioned after the LacI topology and mechanochemical properties. Moreover, in a recent study we engineered LacI to confer alternate allosteric control, producing antilac functions (IA), see Figure 1C. Taken together, these advances represent a schema to systematically engineer new classes of transcription factors that can work together as systems without crosstalk, and will allow for the systematic improvement of circuit performance. To this end, we engineered 46 IAADR that exclusively interact with cognate non-natural operator DNA. The resulting IAADR are responsive to the chemical signal IPTG and interact with six mutually exclusive synthetic operator sequences, all orthogonal to the wild-type LacI unit pair. This study also allowed us to elucidate a set of IAADR engineering rules. First, ADR DBD-motifs that are responsive to IPTG in I+ allosteric cores will likely be responsive in engineered IA regulatory cores. Second, ADR and cognate operator orthogonality shown in I+ADR unit pairs will likely translate to synonymous IAADR unit pairs.
Third, putative IAYQR that are unresponsive to IPTG but recognize cognate non-natural operator DNA (i.e., partially suppress GFP expression) can be leveraged to confer fully functional IAADR (i.e., unit pairs that are responsive to IPTG). Mechanistically, conferred function from unresponsive parental variants (e.g., IA(1)YQR and IA(4)YQR) is the result of raising the level of gene expression upon pairing with ADR-DBD, such that the addition of IPTG confers a conformational change in a given IAADR that stabilizes the complex between the ADR-antilac and its cognate non-natural operator. Finally, strong initial suppression strength of unresponsive
partial variants (e.g., IA(2)YQR) can potentially limit antilac ADR modular design. Presumably, design limitations are due to marginal changes in the dynamic range upon DBD exchange (see Supplementary Figure S3). In addition to unit pair orthogonality, the 46 IAADR that we engineered in this study exhibit diversity in dynamic range (i.e., Maximum GFP Output/Minimum GFP Output), or differences in gene expression upon suppression (see Supplementary Figure S3). Accordingly, we can potentially leverage these engineered unit pairs for applications that require bespoke expression output controls.

METHODS AND MATERIALS

Construction of LacI Mutants and Operator Variants. The genes for all LacI variants were based on pLacI (Novagen), which features a low copy number p15A origin, a chloramphenicol resistance marker, and the gene for the lactose repressor regulated by a constitutive LacI promoter. Mutations to the DNA binding domain were introduced via routine site-directed mutagenesis using Phusion DNA Polymerase, as summarized in Meyer et al.3 A reporter plasmid system was constructed starting with the pZS*22-sfGFP reported in Richards et al.13 This plasmid features a low copy number pSC101* origin of replication and a kanamycin resistance marker. The region of the plasmid excluding the promoter and operator was PCR amplified, visualized on an agarose gel, and gel extracted (Omega). A small fragment containing the constitutive trc promoter (a hybrid of the trp and lacUV5 promoters), a 5 bp spacer segment, and the operator sequence was constructed via oligos (Eurofins Genomics) and placed into the pZS*22-sfGFP vector through Circular Polymerase Extension Cloning (CPEC). The resulting plasmid was named ptrc and is shown in Supplementary Figure S2. For the toggle switch assay, it was necessary to introduce a third plasmid bearing a second LacI variant. This was accomplished via a plasmid with an alternate selection marker and compatible origin of replication.
To this end, the AmpR coding region was PCR amplified from the pLS1 plasmid, visualized on an agarose gel, and gel extracted (Omega). This was then combined with the LacI coding region from pLacI via Splicing by Overlap Extension (SOE). Finally, the pBR322 origin of replication was PCR amplified from the pET28b vector (a gift from the Kane lab), visualized on an agarose gel, gel extracted (Omega), and combined with the LacI and AmpR coding regions via CPEC. This plasmid could then be cotransformed along with the pLacI plasmid and the operator-containing reporter plasmid for assaying. This plasmid can also be found in Supplementary Figure S2. The toggle switch was constructed as follows. The mCherry gene was first PCR amplified from the pET28b vector, visualized on an agarose gel, and gel extracted (Qiagen). A synthetic, constitutive promoter was then identified from the iGEM Registry of Standard Biological Parts (BBa_J23119) and synthesized via oligos (Eurofins Genomics) with a synthetic operator site directly downstream and the RBS BBa_B0029 directly following. Similarly, a strong synthetic promoter was selected from work by Voigt et al.31 and synthesized via oligos. The three fragments were combined via SOE and introduced into the linearized form of the ptrc plasmid via CPEC. Similarly, RFP was amplified from the pSb3t5 plasmid, visualized on an agarose gel, and gel extracted (Qiagen), and the resulting toggle plasmid was built in a similar manner.

Microplate Assay for Transcription Factor Operator Unit Pair Phenotyping. All experiments were performed in the cell strain 3.32, an E. coli K12 strain in which LacI and the lac operon have been deleted. The plasmids pLacI and ptrc were co-transformed and plated on LB agar with kanamycin and chloramphenicol. Microplate assays were performed as outlined by Richards et al.13 Briefly, colonies were inoculated in 1 mL of LB and grown overnight at 37 °C and shaken at
185 rpm, at which point the cultures were diluted into 200 µL wells in M9 minimal media supplemented with 0.2% casamino acids, 1 mM thiamine HCl, and the appropriate antibiotics, containing 0, 10 mM, or 100 mM IPTG. Each sample was aliquoted into six replicate wells in a clear, sterile, conical-bottom 96-well plate (Fisher Scientific) and grown in a 37 °C shaker at 185 rpm, covered with a Breathe-Easier sealing membrane (USA Scientific) to prevent evaporation. After 24 hours, all wells were transferred to a black 96-well plate (COSTAR) for assaying, and GFP fluorescence (ex. 485 nm, em. 510 nm, gain 50) and optical density (OD600) were measured using a Synergy HT plate reader (BioTek). Corrections for pathlength were made using OD900 and OD975, and the fluorescence values were normalized to the optical density and averaged among all replicates. For the toggle switch construct, mCherry fluorescence was measured at ex. 587 nm, em. 610 nm, gain 100, and RFP was measured at ex. 585 nm, em. 610 nm. For each operator variant, the maximum GFP expression was determined using the LacStop control plasmid. LacStop is a defective LacI variant on the pLacI plasmid that has stop codons at positions 2 and 3 and therefore produces no repressor while still exerting the metabolic load of a second plasmid. Inducibility was determined by comparing the mean (n>4) GFP output in the presence and absence of inducer using a Student’s t-test with unequal variances and allowing for unequal sample sizes. The significance level was set to 0.01.

FIGURE CAPTIONS:

FIGURE 1: Lac repressor phenotypes. Lac repressor function is classified into four distinct phenotypes. (A) The wild type I+ repressor phenotype inhibits gene expression (green arrow) in the absence of inducer (red square) by binding the operator (blue oval). Gene expression is induced when the repressor undergoes a conformational shift upon binding the inducer, IPTG.
(B) The IS or insensitive phenotype cannot be induced, due to an inability to bind inducer, shift conformation, or dissociate from the operator sequence. (C) The IA or antilac phenotype allows gene expression in the absence of inducer, and binding of the co-repressor (red square) results in inhibition of gene expression (i.e., suppression). (D) The I- or nonfunctional phenotype is incapable of inhibiting gene expression, either from an inability to fold and assemble or from an inability to associate with the operator.

FIGURE 2: The nucleotide sequences for each of the operator variants considered in this work, along with OSYM, Ogtg’ and Ogta’, which were initially considered but were not included in the final repression matrix. Nucleotide positions 5, 6, and 7 (purported to determine specificity) are shown in color in the left half site and bolded in the right half site. The central CG base pair is also shown in bold. Amino acid sequences around the DNA recognition helix for all variants are also shown, with positions 17, 18 and 22 shown in color. A cartoon depiction of the orientation of the three residues within the helix is included. The N-terminal headpiece of LacI in complex with its operator (PDB: 1L1M) is also included, with the three residues shown as spheres and DNA positions 14, 15, 16 shown in black. The boxed unit pairs represent the orthogonal alternate DNA binding sets identified in this study. Note: (i) Operator DNA nomenclature is defined as Oxxx, where xxx corresponds to variable DNA positions 5, 6, 7. (ii) Ogtg’ and Ogta’ are alternate operators in which base pair position 9 was modified.

FIGURE 3: (A) Repression matrix for the wild-type allosteric core. Scale bar on the left shows a reference for GFP output. The different operators are shown along the left, while the recognition helices are displayed along the top of the table. The bottom left triangle shows GFP output in the absence of IPTG, while the top right shows GFP output in the presence of 10 mM IPTG.
Red stars denote statistical significance at the α = 0.01 level upon induction. Maximum GFP expression by operator is indicated via the LacStop control (a defective repressor that is incapable of
repression but exerts approximately the same metabolic load on the cell). (B) Repression matrix justifying the exclusion of I+AWR/Ogta’ and I+RQR/Ogtg’ unit pairs. (C) Cartoon description of the RiboJ insulator function. (D) Output is shown for operators in the presence and absence of the RiboJ insulator for comparison. (E) Dynamic range for each (operator-I+) pair in the presence and absence of RiboJ.

FIGURE 4: Combinatorial set of nine distinct antilacs (IA) from three IS parents paired with six alternate, symmetric DBDs and the wild type YQR. For a list of mutations corresponding to each antilac, see Figure S1. The IA phenotype is denoted in blue and those that are unresponsive to ligand are denoted in gray. The combinatorial set that composes the 54 putative antilacs with alternate DNA binding is outlined with a box. 46 of the 54 IA(x)ADR are functional and are identified as blue filled circles. Unresponsive variants (gray filled circles) are either I- or IS phenotypes, see Figure 1. The top row represents previously tested IA variants with the wild-type DNA binding domain; however, in this study a simple O1 operator was used instead of O1+, resulting in three unresponsive antilacs from the original set (i.e., IA(1)YQR, IA(2)YQR, and IA(4)YQR).

FIGURE 5: Suppression matrices for the nine antilac allosteric cores (A) IA(1), (B) IA(2), (C) IA(3), (D) IA(4), (E) IA(5), (F) IA(6), (G) IA(7), (H) IA(8), (I) IA(9); antilac core descriptions are given in Figure S1. The leftmost column shows the maximum (unrepressed) GFP output for each operator in the presence of LacStop. The different operators are shown along the left, while the recognition helices are displayed along the top of the table. The bottom left triangle shows GFP output in the absence of IPTG, while the top right shows GFP output in the presence of 10 mM IPTG. Red stars in the upper right-hand corner signify statistical significance at the α = 0.01 level upon suppression.
The black boxes denote the engineered IAADR (ADR = Alternate DNA Recognition).

FIGURE 6: (A) OD600-normalized GFP output at various conditions (no IPTG/10 mM IPTG/100 mM IPTG) for IA(2) mutants reclassified as unresponsive, showing no significant change in expression and thus demonstrating true insensitivity to the inducer IPTG. (B) GFP output at various conditions (no IPTG/10 mM IPTG/100 mM IPTG) for IA(1)ADRs. (C) GFP output at various conditions (no IPTG/10 mM IPTG/100 mM IPTG) for all IA(8)ADRs. (D) GFP output at various conditions (no IPTG/10 mM IPTG/100 mM IPTG) for all I+ADR.

FIGURE 7: (A) Promoter architecture for the original promoter-operator constructs as well as the tandem operator used to investigate the impact of the number of operator sites on repression (using Octt and all nine IAHTK antilacs). (B) Evaluation of the nine ADR repressors as unit pairs with OSYM.

FIGURE 8: OD-normalized fluorescence for the toggle switch including GFP under the control of the WT O1 operator and mCherry under the control of Ottg. The first three toggle switches were constructed using the I+YQR / O1 unit pair coupled with an antilac unit pair (A) IA(8)HQN / Ottg, (B) IA(1)HQN / Ottg or (C) IA(6)HQN / Ottg. mCherry fluorescence was measured at 585ex/610em and GFP at 485ex/510em. Error bars represent the 95% confidence interval of the mean from six trials. Fluorescence is shown in the absence and presence of 10 mM IPTG (inducer). OD-normalized fluorescence for the toggle switch containing mRFP under the control of the (E) Ogac operator and GFP under the control of the (D) Otta operator. mRFP fluorescence was measured at 585ex/610em. Error bars represent the 95% confidence interval of the mean from six trials. Fluorescence is shown in the absence and presence of 10 mM IPTG (inducer).
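
The plate-reader analysis described in Methods and Materials (normalizing fluorescence to optical density, averaging replicates, computing the dynamic range, and testing inducibility with an unequal-variance t-test at a 0.01 significance level) can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' pipeline: all function names are hypothetical, the pathlength correction is omitted, the p-value is obtained by numerically integrating the t density instead of calling a statistics library, and the example readings are invented purely for illustration.

```python
import math
from statistics import mean, variance


def normalize(fluorescence, od600):
    """Divide each well's fluorescence by its OD600 (pathlength correction omitted)."""
    return [f / od for f, od in zip(fluorescence, od600)]


def dynamic_range(minus_iptg, plus_iptg):
    """Maximum mean output divided by minimum mean output across the two conditions."""
    means = (mean(minus_iptg), mean(plus_iptg))
    return max(means) / min(means)


def _t_pdf(x, df):
    # Student's t density, computed in log space for numerical stability.
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df):
    """Two-sided p-value via Simpson integration of the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    steps = 2 * max(1000, math.ceil(upper * 100))  # even count, step <= 0.005
    h = upper / steps
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def welch_t_test(a, b, alpha=0.01):
    """Unequal-variance t-test; needs >= 2 values per group and nonzero spread."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    p = two_sided_p(t, df)
    return t, df, p, p < alpha


# Invented OD-normalized GFP readings for one hypothetical antilac unit pair.
minus_iptg = [980.0, 1010.0, 1005.0, 995.0, 1020.0, 990.0]
plus_iptg = [105.0, 98.0, 110.0, 95.0, 102.0, 90.0]

t, df, p, significant = welch_t_test(minus_iptg, plus_iptg)
print(f"dynamic range = {dynamic_range(minus_iptg, plus_iptg):.1f}")
print(f"t = {t:.1f}, df = {df:.1f}, p = {p:.2e}, significant at 0.01: {significant}")
```

In practice a statistics package would normally supply the t-test; the integration is written out here only to keep the sketch free of external dependencies.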

REFERENCES:
[1] Wilson, C. J., Zhan, H., Swint-Kruse, L., and Matthews, K. S. (2007) The lactose repressor system: paradigms for regulation, allosteric behavior and protein folding, Cell Mol Life Sci 64, 3-16.
[2] O'Gorman, R. B., Rosenberg, J. M., Kallai, O. B., Dickerson, R. E., Itakura, K., Riggs, A. D., and Matthews, K. S. (1980) Equilibrium binding of inducer to lac repressor.operator DNA complex, J Biol Chem 255, 10107-10114.
[3] Meyer, S., Ramot, R., Kishore Inampudi, K., Luo, B., Lin, C., Amere, S., and Wilson, C. J. (2013) Engineering alternate cooperative-communications in the lactose repressor protein scaffold, Protein Eng Des Sel 26, 433-443.
[4] Brenowitz, M., Pickar, A., and Jamison, E. (1991) Stability of a Lac repressor mediated "looped complex", Biochemistry 30, 5986-5998.
[5] Mossing, M. C., and Record, M. T., Jr. (1986) Upstream operators enhance repression of the lac promoter, Science 233, 889-892.
[6] Pfahl, M., Gulde, V., and Bourgeois, S. (1979) "Second" and "third operator" of the lac operon: an investigation of their role in the regulatory mechanism, J Mol Biol 127, 339-344.
[7] Reznikoff, W. S., Winter, R. B., and Hurley, C. K. (1974) The location of the repressor binding sites in the lac operon, Proc Natl Acad Sci U S A 71, 2314-2318.
[8] Daber, R., and Lewis, M. (2009) A novel molecular switch, J Mol Biol 391, 661-670.
[9] Milk, L., Daber, R., and Lewis, M. (2010) Functional rules for lac repressor-operator associations and implications for protein-DNA interactions, Protein Sci 19, 1162-1172.
[10] Sartorius, J., Lehming, N., Kisters, B., von Wilcken-Bergmann, B., and Muller-Hill, B. (1989) lac repressor mutants with double or triple exchanges in the recognition helix bind specifically to lac operator variants with multiple exchanges, EMBO J 8, 1265-1270.
[11] Zhan, J., Ding, B., Ma, R., Ma, X., Su, X., Zhao, Y., Liu, Z., Wu, J., and Liu, H. (2010) Develop reusable and combinable designs for transcriptional logic gates, Mol Syst Biol 6, 388.
[12] Sadler, J. R., Sasmor, H., and Betz, J. L. (1983) A perfectly symmetric lac operator binds the lac repressor very tightly, Proc Natl Acad Sci U S A 80, 6785-6789.
[13] Richards, D. H., Meyer, S., and Wilson, C. J. (2017) Fourteen Ways to Reroute Cooperative Communication in the Lactose Repressor: Engineering Regulatory Proteins with Alternate Repressive Functions, ACS Synth Biol 6, 6-12.
[14] Poelwijk, F. J., de Vos, M. G., and Tans, S. J. (2011) Tradeoffs and optimality in the evolution of gene regulation, Cell 146, 462-470.
[15] Daber, R., and Lewis, M. (2009) Towards evolving a better repressor, Protein Eng Des Sel 22, 673-683.
[16] Bashor, C. J., and Collins, J. J. (2012) Insulating gene circuits from context by RNA processing, Nat Biotechnol 30, 1061-1062.
[17] Davey, J. A., and Wilson, C. J. (2017) Deconstruction of complex protein signaling switches: a roadmap toward engineering higher-order gene regulators, Wiley Interdiscip Rev Nanomed Nanobiotechnol 9.
[18] Lutz, R., and Bujard, H. (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1-I2 regulatory elements, Nucleic Acids Res 25, 1203-1210.
[19] Bahl, C. P., Wu, R., Stawinsky, J., and Narang, S. A. (1977) Minimal length of the lactose operator sequence for the specific recognition by the lactose repressor, Proc Natl Acad Sci U S A 74, 966-970.
[20] Gardner, T. S., Cantor, C. R., and Collins, J. J. (2000) Construction of a genetic toggle switch in Escherichia coli, Nature 403, 339-342.
[21] Clancy, K., and Voigt, C. A. (2010) Programming cells: towards an automated 'Genetic Compiler', Curr Opin Biotechnol 21, 572-581.
[22] Nielsen, A. A., Segall-Shapiro, T. H., and Voigt, C. A. (2013) Advances in genetic circuit design: novel biochemistries, deep part mining, and precision gene expression, Curr Opin Chem Biol 17, 878-892.
[23] Voigt, C. A. (2006) Genetic parts to program bacteria, Curr Opin Biotechnol 17, 548-557.
[24] Elowitz, M. B., and Leibler, S. (2000) A synthetic oscillatory network of transcriptional regulators, Nature 403, 335-338.
[25] Fung, E., Wong, W. W., Suen, J. K., Bulter, T., Lee, S. G., and Liao, J. C. (2005) A synthetic gene-metabolic oscillator, Nature 435, 118-122.
[26] Stricker, J., Cookson, S., Bennett, M. R., Mather, W. H., Tsimring, L. S., and Hasty, J. (2008) A fast, robust and tunable synthetic gene oscillator, Nature 456, 516-519.
[27] Tigges, M., Marquez-Lago, T. T., Stelling, J., and Fussenegger, M. (2009) A tunable synthetic mammalian oscillator, Nature 457, 309-312.
[28] Kobayashi, H., Kaern, M., Araki, M., Chung, K., Gardner, T. S., Cantor, C. R., and Collins, J. J. (2004) Programmable cells: interfacing natural and engineered gene networks, Proc Natl Acad Sci U S A 101, 8414-8419.
[29] Kramer, B. P., Viretta, A. U., Daoud-El-Baba, M., Aubel, D., Weber, W., and Fussenegger, M. (2004) An engineered epigenetic transgene switch in mammalian cells, Nat Biotechnol 22, 867-870.
[30] Lu, T. K., Khalil, A. S., and Collins, J. J. (2009) Next-generation synthetic gene networks, Nat Biotechnol 27, 1139-1150.
[31] Chen, Y. J., Liu, P., Nielsen, A. A., Brophy, J. A., Clancy, K., Peterson, T., and Voigt, C. A. (2013) Characterization of 582 natural and synthetic terminators and quantification of their design constraints, Nat Methods 10, 659-664.

ASSOCIATED CONTENT

Supporting Information: The Supporting Information (SI) includes nine IA primary structures, vector maps, and suppression matrices. SI is available free of charge on the ACS Publications website at DOI: xxxx

Corresponding Author: * [email protected].

Author Contributions: The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Funding Sources: This work was supported by NSF Awards MCB 1747439, CBET 1804639, and CBET 1844289 to CJW.

Notes: The authors declare no competing financial interests.

We would like to thank Andrew Short and Namratha Vedire for contributing to the cover art.

TOC Figure

Figures: Engineering a New Class of Anti-LacI Transcription Factors with Alternate DNA Recognition

Ronald E. Rondon1 and Corey J. Wilson1†

1 Georgia Institute of Technology, School of Chemical & Biomolecular Engineering

† To whom correspondence should be addressed: Corey J. Wilson, Georgia Institute of Technology, School of Chemical & Biomolecular Engineering, 311 Ferst Drive, Atlanta, GA 30332-0100. E-Mail: [email protected].

Figure 1. Panels: (A) Repressor (I+); (B) Insensitive (IS); (C) Antilac (IA); (D) Non-functional (I-). Each panel shows the operator in the -IPTG and +IPTG states.

Figure 2

Figure 3. Panels A-E. Panel C labels: promoter, operator, RiboJ, 5’ UTR, RBS; Auto-Catalyzed Ribozyme Cleavage.

Figure 4. Labels: IS parents K84A, V95A, V95F; antilac cores IA(1) to IA(9); DBD motifs YQR, NAR, HQN, TAN, GKR, HTK, KSL. Legend: IA phenotype, unresponsive.

Figure 5. Panels A-I.

Figure 6. Panels A-D.

Figure 7. Panels A and B.

Figure 8. Panels A-E.
