Genetic Design via Combinatorial Constraint Specification - ACS


Article

Genetic design via combinatorial constraint specification Swapnil Bhatia, Michael Smanski, Christopher A. Voigt, and Douglas Densmore ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.7b00154 • Publication Date (Web): 05 Sep 2017 Downloaded from http://pubs.acs.org on September 6, 2017


ACS Synthetic Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Genetic design via combinatorial constraint specification

Swapnil P. Bhatia¹, Michael J. Smanski², Christopher A. Voigt³, and Douglas M. Densmore¹

¹ Biological Design Center, Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215, USA

² Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, St Paul, MN 55108, USA

³ Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

Correspondence and requests for materials should be addressed to D.M.D ([email protected]).


We present a formal language for specifying via constraints a “design space” of DNA constructs composed of genetic parts, and an algorithm for automatically and correctly creating a novel representation of the space of satisfying designs. The language is simple, captures a large class of design spaces, and possesses algorithms for common operations on design spaces. The flexibility of this approach is demonstrated using a 16-gene nitrogen fixation pathway and genetic logic circuits.

Genetic engineers 1, 2 introduce new behaviors into organisms using recombinant DNA comprising a set of genetic parts.3 Current design practices focus on the manual selection of parts from a collection of sequences4, 5 and their ordered concatenation to form the final construct6. The increased access to and reduced cost of DNA synthesis and assembly7 has made it possible to simultaneously build and verify many variations of parts in the final constructs. For example, combinatorial libraries have been designed8 that vary the identity of promoters and ribosome binding sites (RBSs) at specific locations in a construct to explore the space of gene expression9. These libraries implicitly encode design rules10, 11 that might be advantageous to vary in the library, such as the order and orientation of genes and their organization into operons. Efficiently designing many constructs that conform to complex rules, however, remains a challenge even if their physical construction is not12. For example, imagine a system that comprises a dozen genes and one hundred genetic parts. If a designer seeks to build a thousand variants, it would be prohibitively difficult to design each construct using a DNA-level parts oriented software environment while complying with the necessary rules. Here, we present a specification language that enables the formalized definition of a “design space”. A design space encodes all possible designs adhering to the specified rules without having to explicitly enumerate each design. Combinatorial design algorithms are developed to convert the single design space specification into a user-defined number of final constructs, all of which conform to the rules by definition.


The field of synthetic biology has grown rapidly, and genetic engineering projects are becoming increasingly sophisticated. A variety of computer-aided design algorithms and software environments have been developed to aid the engineering of DNA constructs. These can be broadly classified into those that aid specification, design, assembly, or verification. Specification tools describe the desired function of the biological construct13-16 or how the parts in the construct are organized17, 18. Design tools transform these specifications into collections of realizable DNA constructs and optimize them for performance19-23. Assembly tools plan how to physically create these constructs efficiently, taking into account the needs of biological protocols, resource optimization, and automation24-28. Verification tools determine if specific functional properties are maintained by a final construct29 or if experimental data indicate that the construct performs its function properly30, 31.

Here, we propose a novel approach that allows for the specification of (possibly infinite) design spaces using a small, finite set of operators. This approach is purposely “structural” in that it is a correct-by-construction method that captures the composition of a design and not its function. It is compatible with current incremental, artisanal experimental design processes and also provides a “data model” for future formalized, functional specification mechanisms. The approach also allows a designer to systematically sample designs or verify existing designs for adherence to genetic design rules.

Our approach hinges on a formal “design language” to describe designs of interest and on algorithms to sample designs. The formal design language starts with a graph data structure representing a design space – the set of all designs plausible as per the designer’s specification – instead of a list of individual designs. It enables the reuse of design know-how and a concise, well-defined, and standardizable notation for describing designs. This approach leverages existing concepts from theoretical computer science and language theory32. Construction of a combinatorial set of objects from an elementary alphabet is a central paradigm in computing33, leading to languages and a study of their generative power, which has proven foundational in the creation of software toolchains for programming34. We adapt these conceptual computing foundations to the specification and enumeration of genetic designs and propose algorithms and data structures for specifying design spaces via constraints.

Results and Discussion

The proposed formal design language comprises five operators and is sufficiently expressive to capture a large and clearly identifiable class of design spaces (Table 1). It requires a designer to provide two data items. First, the designer provides a parts library and classifies the parts into categories (or types), each category being given a distinct name. Second, the designer uses the five types of operators shown in Table 1 to describe the space of designs they have in mind. Thus, a design specification is an expression that combines part categories with design operators. For example, a designer might define the categories “promoter”, “RBS”, “CDS”, and “terminator”, and define the space of operons as: promoter then one-or-more { RBS then CDS } then terminator.

Figure 1(a) shows a more detailed example. The specification text describes the design space of forward-oriented single- and multi-operon designs where promoters are allowed to occur internal to operons. (For simplicity, translation is not included in this example.) The text enforces that the designs be transcriptionally valid: all genes (CDSes) have an upstream promoter and a downstream terminator. The red text specifies that the design must contain a promoter flanked by spacers (e.g., insulators). The blue text specifies that either a spacer or one or more insulated promoters are located upstream of a gene. Whichever alternative occurs in a design, it must be followed by a coding sequence. The green text specifies that the portion defined above may be repeated zero or more times to build an operon. This, when combined with the orange text, generalizes the specification to include design variations with a single gene, multi-gene operons, and multiple operons, each of which needs to be followed by a terminator, which may have upstream or downstream spacers.
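The operon expression above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a design template is a list of category names and that each operator can be modeled as a regular-expression combinator. The helper names (`category`, `then`, `either`, `zero_or_more`, `one_or_more`, `conforms`) and the `exactly_two_cds` constraint are hypothetical and are not the specification syntax described in the Supplementary Information. The “and” operator is approximated by requiring a template to match several specifications at once.

```python
import re

# Hypothetical helpers: each returns a regex fragment that consumes whole,
# space-terminated category tokens (e.g. "promoter ").
def category(name):
    return f"(?:{re.escape(name)} )"

def then(*specs):
    return "(?:" + "".join(specs) + ")"

def either(*specs):  # stands in for the "or" operator
    return "(?:" + "|".join(specs) + ")"

def zero_or_more(spec):
    return f"(?:{spec})*"

def one_or_more(spec):
    return f"(?:{spec})+"

def conforms(template, *specs):
    """True if the template matches every spec (several specs act as "and")."""
    text = "".join(name + " " for name in template)
    return all(re.fullmatch(spec, text) for spec in specs)

# promoter then one-or-more { RBS then CDS } then terminator
operon = then(
    category("promoter"),
    one_or_more(then(category("RBS"), category("CDS"))),
    category("terminator"),
)

# Illustrative extra constraint (an assumption): exactly two CDSes overall.
other = zero_or_more(either(category("promoter"), category("RBS"),
                            category("terminator")))
exactly_two_cds = then(other, category("CDS"), other, category("CDS"), other)

one_gene = ["promoter", "RBS", "CDS", "terminator"]
two_genes = ["promoter", "RBS", "CDS", "RBS", "CDS", "terminator"]
print(conforms(one_gene, operon))                    # operon spec alone
print(conforms(two_genes, operon, exactly_two_cds))  # operon "and" two CDSes
```

A regular expression is used here only because it is a compact stand-in for the graph representation; the paper itself builds an explicit node-arc graph.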


After writing the design specification, the first algorithmic step is to build the design space. This is done in a hierarchical manner, the steps for which are shown in Figure 1(b) for the simple example above. Each line of the specification text is translated piecewise into a searchable graph data structure composed of colored nodes with labeled arcs between them. Nodes are green, white, or red, indicating whether a design can start, continue, or stop at that node, respectively. The arc labels are the part categories defined by the designer, or combinations of categories derived from the specification by the algorithm. The graph can be “read” by starting at the green node and progressing along the arcs until a red node is reached. The parts encountered in the part categories along the path are concatenated to build the construct. Alternative paths correspond to alternative constructs, all of which conform to the designer’s specification.

Conversion of the specification to the graph is done in the following way. Each portion of the specification maps to a distinct but fixed mini-graph template, and the operators combine these mini-graphs into a final graph representing the full specification32. Graph construction requires a suite of graph rewriting algorithms, including end-start state joining and graph tensor product35. Once a graph is built, it may contain “dead end” paths. These are removed using a basic reachability test algorithm. For example, in Figure 1(a), the red text translates into three two-node graphs corresponding to the two spacers and the promoter, which are then joined into a linear graph, shown at the top of Figure 1(b), to represent a linear part order constraint. The blue “one-or-more” constraint translates into two copies of the linear graph constructed above: one represents the single-occurrence case and the other represents the multiple-occurrence case. These graphs are then joined with the spacer and CDS constraints to obtain the graph shown in the blue rectangle in Figure 1(b). This graph represents the red and blue text together. Similar hierarchical parsing and application of graph construction rules are used to obtain the final graph shown in Figure 1(b). Other details regarding the implementation of the algorithms to build the design space are described in the Supplementary Information.
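The red-text example (two spacers and a promoter joined into a linear graph) and the dead-end removal step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: graphs are plain dictionaries, `join` handles only linear graphs with a single end node, and the names `atom`, `join`, `reachable`, and `prune` are hypothetical.

```python
from collections import defaultdict
from itertools import count

_fresh = count()  # source of unused node ids

def atom(category):
    """Two-node mini-graph for one part category: start --category--> end."""
    s, e = next(_fresh), next(_fresh)
    return {"start": s, "ends": {e}, "arcs": [(s, category, e)]}

def join(g, h):
    """End-start joining for linear graphs: merge g's end node with h's start."""
    (end,) = g["ends"]  # assumption: g has exactly one end node
    rename = lambda n: end if n == h["start"] else n
    arcs = g["arcs"] + [(rename(u), c, rename(v)) for u, c, v in h["arcs"]]
    return {"start": g["start"],
            "ends": {rename(e) for e in h["ends"]},
            "arcs": arcs}

def reachable(sources, edges):
    """All nodes reachable from `sources` along directed `edges`."""
    adjacent = defaultdict(set)
    for u, v in edges:
        adjacent[u].add(v)
    seen, stack = set(sources), list(sources)
    while stack:
        for v in adjacent[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def prune(g):
    """Keep only nodes that lie on some start-to-end path (no dead ends)."""
    forward = reachable({g["start"]}, [(u, v) for u, _, v in g["arcs"]])
    backward = reachable(g["ends"], [(v, u) for u, _, v in g["arcs"]])
    live = forward & backward
    arcs = [(u, c, v) for u, c, v in g["arcs"] if u in live and v in live]
    return {"start": g["start"], "ends": g["ends"] & live, "arcs": arcs}

# spacer then promoter then spacer, as in the red text of Figure 1(a)
insulated_promoter = join(join(atom("spacer"), atom("promoter")), atom("spacer"))

# Add an artificial dead-end arc to show what pruning removes.
dead_end = next(_fresh)
insulated_promoter["arcs"].append((insulated_promoter["start"], "CDS", dead_end))

pruned = prune(insulated_promoter)
print([c for _, c, _ in pruned["arcs"]])
```

Merging nodes this way is only safe for linear chains; graphs with loops or branching ends would need the more general rewriting rules that the text refers to.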


As a specification is parsed, its operators add new nodes and arcs to the specification graph, increasing its size, and thus affecting the rate at which designs may be generated. The increase in the graph size depends on the type and number of operator instances in the specification. Of the operators shown in Table 1, the “and” operator is the most expensive, but often enables the writing of concise specifications. To illustrate our approach and evaluate its performance, we applied it to the task of specifying the basic operon (Figure 1), the nitrogen fixation (nif) gene sub-cluster (Figure SI 1), and a NOR-gate based logic circuit (Figure SI 2) design spaces. The basic operon design space describes forward oriented single and multi-operon designs in which each CDS has an upstream promoter and a downstream terminator and operons are allowed to have internal promoters. The nif gene sub-cluster is part of a pathway for nitrogen fixation, with operons that follow specific design rules. The NOR-gate based logic circuit design space is the space of genetic circuits composed of repressors driven by tandem repressible promoters (See SI for details). The three resulting specifications contained 16, 63, and 70 operator instances, respectively (with 0 (0%), 11 (17%), and 9 (12%) of them being “and” operators, respectively). The NOR-based genetic logic circuit design space, for example, contains 1,426 nodes and 2,082 arcs, whereas the nif design space contains about 500 nodes and 1000 arcs. Generally, nodes and arcs in a design space increase as the specification is extended and may decrease as constraints are added, or as the graph is pruned via reachability analysis36. Linear part order constraints increase the number of nodes and arcs linearly, “or” constraints increase them additively, whereas “and” constraints increase them multiplicatively. Once the design space is created, it can be searched to generate designs that conform to the specification. 
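The multiplicative cost of “and” can be seen in a small sketch of a product construction, in which every node of the combined graph is a pair of nodes, one from each operand. This assumes the tensor product mentioned above behaves like the standard product of two labeled graphs. The two example constraints (“starts with a promoter” and “ends with a terminator”) and the function names `and_product` and `accepts` are illustrative assumptions.

```python
from itertools import product

def and_product(g, h):
    """Product graph: an arc exists only where both operands have the same label."""
    arcs = [((u1, u2), a, (v1, v2))
            for (u1, a, v1), (u2, b, v2) in product(g["arcs"], h["arcs"])
            if a == b]
    return {"start": (g["start"], h["start"]),
            "ends": set(product(g["ends"], h["ends"])),
            "arcs": arcs}

def accepts(graph, template):
    """Follow the template's categories from the start; succeed on an end node."""
    states = {graph["start"]}
    for category in template:
        states = {v for u, a, v in graph["arcs"] if u in states and a == category}
    return bool(states & graph["ends"])

# Constraint 1 (illustrative): the design starts with a promoter.
starts_with_promoter = {
    "start": 0, "ends": {1},
    "arcs": [(0, "promoter", 1), (1, "promoter", 1),
             (1, "CDS", 1), (1, "terminator", 1)],
}
# Constraint 2 (illustrative): the design ends with a terminator.
ends_with_terminator = {
    "start": 0, "ends": {1},
    "arcs": [(0, "promoter", 0), (0, "CDS", 0), (0, "terminator", 1),
             (1, "promoter", 0), (1, "CDS", 0), (1, "terminator", 1)],
}

both = and_product(starts_with_promoter, ends_with_terminator)
nodes = {n for u, _, v in both["arcs"] for n in (u, v)}
print(len(nodes))  # bounded by 2 * 2 nodes for these two-node operands
print(accepts(both, ["promoter", "CDS", "terminator"]))
```

Some product nodes are typically unreachable from the start, which is why pruning by reachability can shrink the graph again after an “and” is applied.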
The shortest designs can be generated by finding all the shortest green-to-red paths. Since every such path produces a valid design, the approach produces valid designs orders of magnitude more efficiently than the naïve algorithm, which generates all possible designs and checks each one against the specification (the Supplementary Information provides insight on algorithm performance). The graph data structure constructed in our approach can be integrated with downstream software tools that use a specialized objective function but benefit from a pre-constrained search space – for example, a genetic circuit design tool16 where constraint-based combinatorial design must be coupled with characterization data-based circuit design. The graph is also useful in validating designs generated otherwise: a design’s part sequence may be checked against all valid paths in the graph.

A key aspect of this approach is that the specification graph need only be constructed once for a specification and can be saved or transmitted electronically for further processing. The complete combinatorial structure of the design space is captured in a simple, electronically transmittable data structure. Its visualization offers intuitive insight into the specification and helps in proving its correctness and eliminating errors. Figure 2(a) shows examples of colored paths through the specification graph, and corresponding examples of the satisfying designs are shown in Figure 2(b). The category labels on the arrows of a satisfying path are concatenated to generate design templates, which can in turn be translated into a set of satisfying designs.

Using a simple search algorithm, generating 10,000 design templates, for example, requires the most search effort in the NOR graph (133K operations with 40.9 MB of memory taking 83 s), the least in the operon graph (57K operations with 0.77 MB of memory taking 1.1 s), and an intermediate amount in the nif graph (40K operations with 65.9 MB of memory taking 81 s). Simply put, our approach allows for the generation of thousands of fully-constrained designs for genetic circuits or gene clusters in seconds using desktop computing equipment. The search effort is indicative of the approximate complexity of each of the specifications.
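The search, validation, and template-expansion steps described above can be sketched with a breadth-first walk. This is a minimal illustration, not the search algorithm whose performance is reported here. The toy operon graph, the `max_len` cutoff, the function names, and the part names in `library` are all hypothetical.

```python
from collections import deque
from itertools import product

def shortest_templates(graph, limit, max_len=20):
    """Breadth-first search: return up to `limit` templates, shortest first."""
    found, seen = [], set()
    queue = deque([(graph["start"], ())])
    while queue and len(found) < limit:
        node, path = queue.popleft()
        if node in graph["ends"] and path not in seen:
            seen.add(path)
            found.append(path)
        if len(path) < max_len:  # cutoff keeps the walk finite on cyclic graphs
            for u, category, v in graph["arcs"]:
                if u == node:
                    queue.append((v, path + (category,)))
    return found

def is_valid(graph, template):
    """Check an existing design template against the paths of the graph."""
    states = {graph["start"]}
    for category in template:
        states = {v for u, c, v in graph["arcs"] if u in states and c == category}
    return bool(states & graph["ends"])

def expand(template, library):
    """Substitute every combination of concrete parts into a template."""
    return [list(parts) for parts in product(*(library[c] for c in template))]

# Toy design space: promoter, one or more CDSes, terminator.
# Node 0 plays the role of the green node, node 3 the red node.
operon = {
    "start": 0, "ends": {3},
    "arcs": [(0, "promoter", 1), (1, "CDS", 2),
             (2, "CDS", 2), (2, "terminator", 3)],
}
# Hypothetical parts library, keyed by category.
library = {"promoter": ["pA", "pB"], "CDS": ["geneX"], "terminator": ["t1"]}

templates = shortest_templates(operon, limit=3)
print(templates)
print(expand(templates[0], library))
print(is_valid(operon, ["CDS", "terminator"]))
```

Because every walk that ends on a red node is valid by construction, no generated template needs to be checked afterwards; `is_valid` is only needed for designs produced by other means.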
The specifications differ in the complexity of their constraints: a NOR circuit might require all distinct repressors, matching repressor-promoter pairs, and acyclic connectivity among promoters being repressed and those driving the repressors, whereas an operon enforces the simpler promoter-gene-terminator co-orientation constraint. The scalability of the proposed approach depends on the type and number of constraints in a specification: “then” and “or” constraints can scale to thousands of genetic components, whereas “and” constraints, whose effect on the design space is multiplicative in the worst case, may not scale beyond a few tens of such constraints, depending on their complexity. Nevertheless, the efficiency of the approach can be improved using more sophisticated search algorithms or constraint solvers. Moreover, for “and” constraints, one could selectively retain only a portion of the design space after the addition of each constraint, thus trading off design diversity for a smaller memory requirement. In this way, our approach allows for the generation of thousands of fully-constrained designs in seconds for many classes of practical design spaces.

Here, we have proposed a simple, easily visualized, formal design language (the Supplementary Information provides the specification syntax) that allows a single design file to be converted to a library of constructs, all of which conform to the design specification. The Supplementary Information contains several specification examples. These algorithms can be easily connected to higher-level design and visualization software. For example, the graph can be integrated with downstream software tools that use a specialized objective function but benefit from a pre-constrained search space16, 26, 37, 38. More generally, the graph data structure presented can serve as a link between high-level design automation for genetic circuits, statistical design of experiments, or metabolic engineering, and the low-level part- and sequence-based design rules needed to build a library of DNA constructs. The graph representation of a design space could also be integrated into synthetic biology data exchange standards17. The abstraction from physical DNA parts and the syntactic form of the design file also facilitate the distribution and archival of design spaces across the community. Design repositories could make these spaces available for others to apply to design and validate their systems.
While specific designs themselves may not be portable across projects (e.g., because they are designed for different organisms), aspects of the design specification (e.g., constraints that enforce the transcription-translation part organization) may be transferable. Thus, the creation of abstract design languages and the algorithms to convert them into constructs will help facilitate the dissemination of design information in the genetic engineering community.

Acknowledgements

S.P.B., M.J.S., C.A.V., and D.M.D. were supported by the Defense Advanced Research Projects Agency (DARPA) Living Foundries grant HR0011-12-C-0067 during the creation of this research. This work was also supported by the Institute for Collaborative Biotechnologies through grant W911NF-09-0001 from the U.S. Army Research Office. S.P.B. and D.M.D. were also supported by the NSF’s Living Computing Project (Award #1522074). The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. We thank Dr. Nicholas Roehner for helping elucidate the application of the proposed design language to genetic circuits.

Author Contributions

S.P.B., C.A.V., and D.M.D. conceived of the study. S.P.B. created and implemented the algorithms. M.J.S. and S.P.B. gathered the data and performed the analysis. S.P.B. wrote the manuscript and all the authors revised the final version.

Competing Financial Interests

The authors declare no competing financial interests.

Supporting Information

Document detailing the formal definitions of a design space, our algorithm performance analysis, our specification syntax, and two case studies using this approach.

Figure Captions

Figure 1: Building a design space based on genetic parts and constraints. (A) Shown is an example of a simple design specification describing a transcriptionally valid, forward-oriented, multi-gene multi-operon design. Transcriptionally valid means every coding sequence has an upstream promoter and a downstream terminator. The specification is read in the order red, blue, green, orange, and this corresponds to the growth of the node-arc diagram. Operators are shown in bold (one-or-more, zero-or-more, or, then) and the remaining terms are user-defined names of part categories. (B) The stepwise translation of the textual specification into the node-arc diagram is illustrated, and the colored rectangles correspond to part A. The green node is the start state and the red nodes are the potential end states. SBOLv symbols correspond to part classes: bent arrow/promoter, T/terminator, circle with X/spacer. (C) Node and arc counts for the nif design space are shown as constraints are added incrementally.

Figure 2: Searching the design space to build genetic constructs. (A) The example design space constructed in Figure 1 is shown. The colors and numbers correspond to five valid paths that the search algorithm can take to build alternative constructs to satisfy the constraints. All paths start at the green node and must end at one of the red nodes. When a path is found, the parts (labeled on each arc) are concatenated into a construct. (B) The five constructs that are built from each path are shown. The colors and numbers correspond to part A. (C) The relationship between the number of nodes in a design space, as new constraints are incrementally added, and the number of valid designs is shown for the three design spaces (operon – squares, nif – circles, and NOR circuits – triangles). For the operon space, colored squares correspond to the colors in Figure 1(A).


Table 1: Genetic Design Space Operators, Descriptions, and Examples

Operator type | Specification example | Description | Design examples (2)
part set (3) | A | Any design in set A | a, c
then | A then B | Concatenate every design in A with that in B | ab, ac, cb
or | A or B | Any design in A or B | a, b
and | A and B | Any design in A and B |
zero-or-more | zero-or-more A | Concatenate any designs in A zero or more times | c, a, aa
one-or-more | one-or-more A | Concatenate any designs in A one or more times | a, aa

The table also includes a “Design space example” column (1) that shows a node-arc diagram for each operator.

1. Arcs denote a part from the “part set”, circles represent part junctions, green circles represent a potential beginning of a design and red circles represent a potential end of a design.
2. Design examples are designs that can be created given the specification example. This is not an exhaustive list. [symbol] denotes a null design containing no parts.
3. Each part category defined by a designer may be thought of as an operator that defines a design space containing a single design with a single part from that part category.


References

1. Arkin, A. 2008. Setting the standard in synthetic biology. Nat Biotechnol 26, 771-774.
2. Purnick, P.E. & Weiss, R. 2009. The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol 10, 410-422.
3. Endy, D. 2005. Foundations for engineering biology. Nature 438, 449-453.
4. Ham, T.S. et al. 2012. Design, implementation and practice of JBEI-ICE: an open source biological part registry platform and tools. Nucleic Acids Res 40, e141.
5. Peccoud, J. et al. 2008. Targeted development of registries of biological parts. PLoS One 3, e2671.
6. Andrianantoandro, E., Basu, S., Karig, D.K. & Weiss, R. 2006. Synthetic biology: new engineering rules for an emerging discipline. Mol Syst Biol 2, 2006.0028.
7. Casini, A., Storch, M., Baldwin, G.S. & Ellis, T. 2015. Bricks and blueprints: methods and standards for DNA assembly. Nat Rev Mol Cell Biol 16, 568-576.
8. Schaerli, Y. & Isalan, M. 2013. Building synthetic gene circuits from combinatorial libraries: screening and selection strategies. Mol Biosyst 9, 1559-1567.
9. Smanski, M.J. et al. 2014. Functional optimization of gene clusters by combinatorial design and assembly. Nat Biotechnol.
10. Bilitchenko, L. et al. 2011. Eugene – a domain specific language for specifying and constraining synthetic biological parts, devices, and systems. PLoS One 6, e18882.
11. Galdzicki, M., Rodriguez, C., Chandran, D., Sauro, H.M. & Gennari, J.H. 2011. Standard biological parts knowledgebase. PLoS One 6, e17005.
12. Ellis, T., Adie, T. & Baldwin, G.S. 2011. DNA assembly for synthetic biology: from parts to pathways and beyond. Integr Biol (Camb) 3, 109-118.
13. Beal, J., Lu, T. & Weiss, R. 2011. Automatic compilation from high-level biologically-oriented programming language to genetic regulatory networks. PLoS One 6, e22490.
14. Jang, S.S., Oishi, K.T., Egbert, R.G. & Klavins, E. 2012. Specification and simulation of synthetic multicelled behaviors. ACS Synth Biol 1, 365-374.
15. Huynh, L., Tsoukalas, A., Koppe, M. & Tagkopoulos, I. 2013. SBROME: a scalable optimization and module matching framework for automated biosystems design. ACS Synth Biol 2, 263-273.
16. Nielsen, A.A. et al. 2016. Genetic circuit design automation. Science 352, aac7341.
17. Galdzicki, M. et al. 2014. The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology. Nat Biotechnol 32, 545-550.
18. Oberortner, E. & Densmore, D. 2015. Web-based software tool for constraint-based design specification of synthetic biological systems. ACS Synth Biol 4, 757-760.
19. Chen, J., Densmore, D., Ham, T.S., Keasling, J.D. & Hillson, N.J. 2012. DeviceEditor visual biological CAD canvas. J Biol Eng 6, 1.
20. Mirschel, S., Steinmetz, K., Rempel, M., Ginkel, M. & Gilles, E.D. 2009. PROMOT: modular modeling for systems biology. Bioinformatics 25, 687-689.
21. Oishi, K. & Klavins, E. 2014. Framework for engineering finite state machines in gene regulatory networks. ACS Synth Biol 3, 652-665.
22. Yaman, F., Bhatia, S., Adler, A., Densmore, D. & Beal, J. 2012. Automated selection of synthetic biology parts for genetic regulatory networks. ACS Synth Biol 1, 332-344.
23. Beal, J. et al. 2012. An end-to-end workflow for engineering of biological networks from high-level specifications. ACS Synth Biol 1, 317-331.
24. Hillson, N.J., Rosengarten, R.D. & Keasling, J.D. 2011. j5 DNA Assembly Design Automation Software. ACS Synth Biol 1, 14-21.
25. Linshiz, G. et al. 2012. PaR-PaR laboratory automation platform. ACS Synth Biol 2, 216-222.
26. Appleton, E., Tao, J., Haddock, T. & Densmore, D. 2014. Interactive assembly algorithms for molecular cloning. Nat Methods 11, 657-662.
27. Densmore, D. et al. 2010. Algorithms for automated DNA assembly. Nucleic Acids Res 38, 2607-2616.
28. Linshiz, G. et al. 2008. Recursive construction of perfect DNA molecules from imperfect oligonucleotides. Mol Syst Biol 4, 191.
29. Cai, Y., Hartnett, B., Gustafsson, C. & Peccoud, J. 2007. A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts. Bioinformatics 23, 2760-2767.
30. Beal, J. et al. 2014. Model-driven Engineering of Gene Expression from RNA Replicons. ACS Synth Biol.
31. Davidsohn, N. et al. 2015. Accurate predictions of genetic circuit behavior from part characterization and modular composition. ACS Synth Biol 4, 673-681.
32. Hopcroft, J.E. & Ullman, J.D. 2006. Automata theory, languages, and computation. Prentice Hall.
33. Floyd, R.W. & Beigel, R. 1994. The language of machines: an introduction to computability and formal languages. (Computer Science Press, Inc.).
34. Aho, A.V., Sethi, R. & Ullman, J.D. 1986. Compilers: principles, techniques, and tools. (Addison-Wesley Longman Publishing Co., Inc.).
35. Rabin, M.O. & Scott, D. 1959. Finite automata and their decision problems. IBM Journal of Research and Development 3, 114-125.
36. Alur, R. 2011. in Embedded Software (EMSOFT), 2011 Proceedings of the International Conference on 273-278.
37. Chandran, D., Bergmann, F.T. & Sauro, H.M. 2009. TinkerCell: modular CAD tool for synthetic biology. J Biol Eng 3, 19.
38. Myers, C.J. et al. 2009. iBioSim: a tool for the analysis and design of genetic circuits. Bioinformatics 25, 2848-2849.
