
EcoSynther: A customized platform to explore biosynthetic potential in E. coli. Shaozhen Ding, Xiaoping Liao, Weizhong Tu, Ling Wu, Yu Tian, Qiuping Sun, Junni Chen, and Qian-Nan Hu ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.7b00605 • Publication Date (Web): 27 Sep 2017 Downloaded from http://pubs.acs.org on September 28, 2017


ACS Chemical Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


EcoSynther: A customized platform to explore biosynthetic potential in E. coli

Shaozhen Ding†‡, Xiaoping Liao‡, Weizhong Tu§, Ling Wu‡, Yu Tian‡∥, Qiuping Sun‡, Junni Chen§, and Qian-Nan Hu*†‡

† Shanghai Institutes for Biological Sciences, CAS, People’s Republic of China
‡ Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, People’s Republic of China
§ Wuhan LifeSynther Science and Technology Co. Limited, Wuhan, People’s Republic of China
∥ University of Chinese Academy of Sciences, Beijing, People’s Republic of China
* To whom correspondence should be addressed.

Abstract

Computational tools for chassis-centered biosynthetic pathway design are important for building productive heterologous biosynthesis systems, because such design must take an enormous number of foreign biosynthetic reactions into account. In many cases, a pathway that produces a target molecule in a microbial host consists of both native and heterologous reactions. Because tens of thousands of biosynthetic reactions exist in nature, it is not trivial to identify which of them could serve as heterologous steps toward the target molecule in a specific organism. In the present work, we integrate more than ten thousand reactions that are non-native to E. coli and use a probability-based algorithm to search for pathways. Moreover, we built a user-friendly web server named EcoSynther. It explores the precursors and heterologous reactions needed to produce a target molecule in Escherichia coli K12 MG1655, and then applies flux balance analysis to calculate the theoretical yield of each candidate pathway. Compared with other chassis-centered biosynthetic pathway design tools, EcoSynther has two unique features: (1) it searches automatically without requiring a known precursor in E. coli, and (2) it evaluates the candidate pathways under constraints from E. coli physiological states and growth conditions. EcoSynther is available at http://www.rxnfinder.org/ecosynther/.

Keywords: pathway design; genome-scale metabolic model; flux balance analysis; heterologous reactions; calculation of theoretical yields

Introduction

Microbial organisms have received much attention in recent decades because they can produce large amounts of value-added products, including beverages, enzymes, commodity chemicals, and specialty chemicals, from low-cost substrates through metabolic engineering or synthetic biology (1). The biosynthetic potential of these low-cost substrates can be explored with BioSynther (2). Although biosynthetic routes for a large number of target molecules have been successfully constructed in various organisms (3, 4), production of many high-value metabolites still depends on systematically planned experiments, because engineering a synthetic pathway involves considerable work and metabolic networks are complex (5). In recent decades, more and more biological reactions that exist in nature have been discovered, and several databases have been established, such as KEGG (6), Rhea (7), MetaCyc (8), and RxnFinder (9). In addition, genome-scale metabolic networks as well as statistics-based models have been established to explore the biosynthetic potential of specific organisms (10, 11). To (re)design biosynthetic pathways efficiently and narrow down the candidate pathways for further experiments, several pathway-finding algorithms (1, 5) have been developed on the basis of the biological reaction databases mentioned above.

Biosynthetic pathway design tools can be divided into two categories. The tools in the first category explore pathways without considering host organisms. For instance, given user-defined ‘target’ and ‘source’ compounds, Metabolic tinker (12) designs thermodynamically feasible biosynthetic pathways based on metabolic compounds and reactions available from ChEBI (13) and Rhea (7). Retrace (14), which was built on the observation that at least one atom is transferred from substrate to product in a biologically interesting pathway, retrieves branching pathways in an atom-level representation of the metabolic network. Instead of using known reactions, BNICE (15) uses the Enzyme Commission classification to formulate enzyme reaction rules and can therefore predict pathways with novel reactions. Based on RDM patterns and structure alignment of substrate-product pairs in KEGG (6), PathPred (16), a web-based server adopted in KEGG (6), predicts plausible enzyme-catalyzed reaction pathways for xenobiotic biodegradation in bacteria and for biosynthesis of secondary metabolites in plants from a query compound. Although these tools can retrieve biosynthetic pathways from substrates to products, they cannot assess the suitability of these pathways in the context of a specific host organism.

To identify the heterologous reactions needed to produce a desired metabolite in a specific host organism, a second category of pathway design tools was developed, including FMM (17), PHT (18), and MRE (19). FMM (17) and PHT (18) do not rank suitable biosynthetic pathways; they only report which enzymes are not natively available in the specific host or which heterologous reactions need to be introduced. In contrast, MRE ranks biosynthesis routes by taking the endogenous metabolic system into consideration. While these chassis-centered tools can design biosynthesis pathways for producing a desired metabolite in a specific host organism, they neither report yield information online nor retrieve results when only a target molecule is specified.

In this study, based on a much larger number of reactions non-native to E. coli and the genome-scale metabolic network iJO1366 (20), we developed a novel pathway design tool named EcoSynther. It automatically retrieves heterologous pathways for a target molecule together with its precursors in E. coli, and then evaluates the search results according to the number of non-native steps and the theoretical yields, while considering the E. coli physiological state under specific growth conditions.

Results & Discussion

Lycopene biosynthesis by EcoSynther. Lycopene, one of the major carotenoids, has received much attention because of its beneficial biological and pharmaceutical activities, such as anti-cancer, anti-inflammatory, and anti-oxidative activities (21). In addition, lycopene can be used in functional foods, feed supplements, and nutraceuticals. Several methods, including natural extraction, chemical synthesis, and fermentation, have been applied to produce lycopene. Many of these approaches have limitations; fermentation is currently the most effective known method. In EcoSynther, users can retrieve the precursor isopentenyl diphosphate and several heterologous pathways for synthesizing lycopene (Figure 1). The theoretical yields and reaction fluxes of the first three heterologous pathways in the search results are shown in Figure 2; the first pathway is consistent with an experimental study (22). EcoSynther does not list the native reactions of a candidate pathway in detail because, when the whole metabolism of E. coli is considered, the FBA result always involves hundreds of native reactions with non-zero flux. Instead, the native reactions are shown in metabolic sub-network maps using Escher (23). The web server visualization, including the Escher map, for the first pathway in Figure 1 is shown in S1.pbf, and the corresponding reactions with non-zero flux are listed in S2.xls.

Figure 1. Searching results in EcoSynther for lycopene production (The marked pathway is consistent with experimental result)


Figure 2. Calculated theoretical yields and reaction fluxes for the first three heterologous pathways for lycopene production. The calculation was performed by (i) setting glucose as the main carbon source, (ii) setting the oxygen state to aerobic, and (iii) requiring the mutant E. coli to grow at no less than 80% of the maximum theoretical growth rate of the wild type (WT). (a) The first pathway (①) uses IPPP as the precursor, whereas the second pathway (②) uses farnesyl diphosphate. (b) The third pathway (③) also uses farnesyl diphosphate as the precursor; in this pathway, however, phytoene is not converted to lycopene directly but is first converted to the intermediate all-trans-zeta-carotene, which is then converted to lycopene. The theoretical yield of lycopene is 0.118 g/g for all three pathways. Abbreviations: IPPP, isopentenyl diphosphate; GGDP, geranylgeranyl diphosphate.

Resveratrol biosynthesis by EcoSynther. Resveratrol has emerged as a promising molecule because of its anti-oxidative, anti-inflammatory, and chemopreventive activities (24). However, extracting resveratrol from plants such as peanuts and grapes is tedious and inefficient. Using environmentally friendly feedstocks and lower-energy processes, a heterologous biosynthesis pathway (Figure 3) for resveratrol production has been successfully constructed in E. coli (24). In EcoSynther, the precursor tyrosine and several heterologous pathways (Figure 4), including the experimentally proven one (the first pathway), are retrieved automatically. The theoretical yields of resveratrol and the reaction fluxes for the first three heterologous pathways in the search results are shown in Figure 5.

Figure 3. Heterologous pathway to produce resveratrol in the E. coli host (TAL: tyrosine ammonia lyase; 4CL: 4-coumarate:CoA ligase; STS: stilbene synthase).


Figure 4. Searching results in EcoSynther for resveratrol production (The marked pathway is consistent with experimental result).

Figure 5. Calculated theoretical yields and reaction fluxes for the first three heterologous pathways for resveratrol production. The calculation was performed by (i) setting glucose as the main carbon source, (ii) setting the oxygen state to aerobic, and (iii) requiring the mutant E. coli to grow at no less than 80% of the maximum theoretical growth rate of the wild type (WT).

Metabolites biosynthesized by EcoSynther. A previous study (25) reported 1,777 non-native metabolites that could be synthesized by E. coli. With its larger body of reaction data, EcoSynther retrieves biosynthetic pathways for 4,489 non-native metabolites (S3.xls) within 10 heterologous steps. The minimum number of heterologous reactions (from 1 to 10) needed to produce each of the 4,489 non-native metabolites is shown in Figure 6. The minimum number of heterologous steps is further grouped into several ranges, and the 10 precursors most frequently used in E. coli are summarized for each range (Table 1).

Figure 6. Heterologous reactions needed for production of non-native metabolites.


Table 1. The top 10 precursors frequently utilized in E. coli for non-native metabolites production in different ranges of minimum heterologous steps.


Minimum number of heterologous reactions (steps)

1 – 3 steps | 4 – 6 steps | 7 – 10 steps
Farnesyl diphosphate | Farnesyl diphosphate | Farnesyl diphosphate
4-Hydroxybenzoate | 4-Hydroxybenzoate | 4-Hydroxybenzoate
Malonyl-CoA | Malonyl-CoA | Malonyl-CoA
Geranyl diphosphate | Geranyl diphosphate | Geranyl diphosphate
3-hydroxycinnamic acid | 3-hydroxycinnamic acid | 3-hydroxycinnamic acid
Isopentenyl diphosphate | Isopentenyl diphosphate | Isopentenyl diphosphate
Higher fatty acid | CDP-diacylglycerol | CDP-diacylglycerol
S-Adenosylmethionine | Choline | L-Tyrosine
Pyruvate | Undecaprenyl diphosphate | Undecaprenyl diphosphate
1,2-Diacylglycerol | L-Phenylalanine | L-Phenylalanine

Stability testing of the pathway-searching algorithm. Because the pathway-searching algorithm is probability-based, its results may differ slightly from one retrieval to the next, so we tested its stability. In two search cases (lycopene and resveratrol), the number of iterations and the maximum number of heterologous steps were set to 20,000 and 5, respectively. We ran the search 11 times for each case and counted the node combinations (the "Combinations of conjunctive nodes" column in Table 2) by considering only the conjunctive nodes in each heterologous pathway. For example, in the first heterologous pathway for lycopene production (Figure 1), the metabolites IPPP, GGDP, phytoene, and lycopene were regarded as conjunctive nodes, and their order was regarded as one node combination. The result of the first search in each case was taken as the reference against which the other 10 results were compared; details for the lycopene case are shown in Table 2. The average consistency was 71.055% for lycopene production and 84.996% for resveratrol production; more details are given in S4.xls.

Table 2. Stability testing of the pathway-searching algorithm in the lycopene case.

Pathways retrieved | Combinations of conjunctive nodes | Consistency with the first results | Percentage
67 | 60 | 46 | 0.6969
68 | 59 | 46 | 0.6969
69 | 57 | 46 | 0.6969
71 | 60 | 46 | 0.6969
71 | 61 | 45 | 0.6818
72 | 63 | 48 | 0.7272
73 | 61 | 46 | 0.6969
75 | 64 | 50 | 0.7575
75 | 63 | 47 | 0.7121
86 | 71 | 49 | 0.7424
Average percentage: 0.71055
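The percentages in Table 2 can be reproduced with a few lines of arithmetic. The sketch below is only a consistency check, not the authors' script: it assumes that each percentage is the consistency count divided by 66, a reference number of node combinations that is inferred here from the reported ratios (for example 46/66 = 0.6969...) and is not stated in the text.

```python
# Consistency counts from Table 2 (lycopene case), one per repeated search.
consistent = [46, 46, 46, 46, 45, 48, 46, 50, 47, 49]

# Assumption: the first (reference) search produced 66 node combinations.
# This value is inferred from the reported ratios, not given in the paper.
REFERENCE_COMBINATIONS = 66


def consistency_fractions(counts, reference):
    """Return the fraction of reference node combinations found in each search."""
    return [count / reference for count in counts]


fractions = consistency_fractions(consistent, REFERENCE_COMBINATIONS)
average = sum(fractions) / len(fractions)

print([f"{value:.4f}" for value in fractions])
print(f"average = {average:.4f}")
```

Under this assumption the unrounded average is about 0.7106; the reported 0.71055 is what results when the per-search values are first truncated to four decimals.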

Discussion. In this paper, we introduced EcoSynther, an efficient tool that retrieves heterologous pathways for producing a target molecule in E. coli and calculates maximum theoretical yields while considering the E. coli physiological state under specific growth conditions. EcoSynther retrieved heterologous pathways for about half of the non-native metabolites (4,489 of 8,671); that no pathway was found for the remaining metabolites indicates that our understanding of the natural metabolic systems involving them is still limited. Nearly 62% (2,801 of 4,489) of the retrievable metabolites could be biosynthesized in E. coli within three heterologous reactions, which should encourage researchers to explore the biosynthetic potential of E. coli more deeply. Although previous studies (26, 27) have shown that pathway design tools are very helpful in synthetic biology, limited computing performance and our limited understanding of natural metabolism remain common obstacles that slow the rational in silico design of biosynthetic pathways. EcoSynther differs from previously built pathway prediction tools in that it does not require a precursor molecule to be specified and is organism-specific. This is useful for exploring different precursor molecules for a specified metabolite in E. coli, and the tool also attempts to minimize the use of heterologous metabolites in the generated pathways.


EcoSynther was developed using Rhea (7), KEGG (6), and the E. coli genome-scale metabolic network (20) to automatically retrieve the heterologous reactions needed to produce a target molecule without a user-defined precursor. The platform also evaluates each result quantitatively, based on heterologous pathway length and on theoretical yields constrained by the specified metabolic system. It can serve as a chassis-centered biosynthetic pathway design tool for E. coli metabolic engineering.

Methods

Data. Rhea (7) contains 8,738 non-redundant, bidirectional reactions, and KEGG (6) contains 9,998 enzymatic reactions. KEGG compound IDs used in KEGG reactions were mapped to the ChEBI (28) IDs used in Rhea (7) reactions whenever a specific ChEBI ID was available. To establish an integrated database of non-redundant biological reactions from KEGG and Rhea, we checked whether the reactant species and the product species of two reactions were respectively identical, ignoring H2O, H, and reaction directionality. The integrated database was then compared with the E. coli model (iJO1366) (20) to remove native reactions, taking the reaction directionality in iJO1366 into account. Note that transport reactions, elongation reactions, and reactions containing polymers or genetic compounds were excluded. The resulting heterologous reaction database contains 12,011 reactions and 9,440 metabolites, of which 8,671 are non-native molecules.

Functions of EcoSynther. To retrieve the heterologous reactions needed to biosynthesize a target molecule in E. coli, users specify several parameters. The substrate and condition parameters define the growth environment, namely the main carbon source and the oxygen state. The maximum number of heterologous reactions is constrained because there is a practical limit to the number of heterologous genes that can be inserted into a host organism. The iterations parameter sets the number of iterations of the pathway-searching algorithm, and the biomass parameter sets the minimum growth rate of the mutant E. coli as a percentage of the maximum theoretical growth rate of the wild type (WT). In addition, users can optionally enter one or more intermediates to filter out biosynthetic pathways that do not involve them.

Once these parameters are set, EcoSynther retrieves metabolic routes. The user can then specify the number of top-ranked pathways to display; the pathways are ranked by pathway length. Detailed information, including heterologous reactions, theoretical yields, and the Escher map, is displayed separately for each heterologous pathway.

Workflow of EcoSynther. Figure 7 shows the workflow of EcoSynther. First, the pathway-searching algorithm identifies the heterologous reactions needed to biosynthesize the target molecule on the basis of the user-input parameters and the heterologous reaction database. Second, the heterologous reactions are added to the E. coli iJO1366 model to construct an integrated metabolic network. Third, the carbon flux of each reaction and the theoretical yield of the target molecule are calculated by Flux Balance Analysis (FBA) (29) from the integrated metabolic network and user-defined parameters such as substrate and condition. Finally, Django, CSS, JavaScript, HTML, and Escher (23) maps are used to visualize the heterologous and native pathways for the specified molecule biosynthesized in E. coli.
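The redundancy check described under Data (same reactant and product species, ignoring H2O, H, and direction) can be illustrated with a canonical reaction key. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the reaction IDs, and the compound labels are hypothetical, and compounds are assumed to have already been mapped to a common identifier scheme.

```python
# Species ignored when comparing two reactions, as stated in the text.
IGNORED_SPECIES = frozenset({"H2O", "H"})


def reaction_key(reactants, products, ignored=IGNORED_SPECIES):
    """Build a direction-independent key from the species on each side.

    Stoichiometric coefficients are not compared, only species identity.
    """
    left = frozenset(reactants) - ignored
    right = frozenset(products) - ignored
    # An unordered pair of sides makes A -> B and B -> A compare equal.
    return frozenset({left, right})


def merge_reactions(reactions):
    """Keep the first reaction ID seen for each canonical key."""
    unique = {}
    for reaction_id, reactants, products in reactions:
        unique.setdefault(reaction_key(reactants, products), reaction_id)
    return unique


# Hypothetical entries: the second one is the first written in reverse,
# with a proton instead of water, so the two should be merged.
example = [
    ("kegg_like_1", ["A", "H2O"], ["B", "C"]),
    ("rhea_like_1", ["B", "C", "H"], ["A"]),
    ("kegg_like_2", ["A"], ["D"]),
]
merged = merge_reactions(example)
print(sorted(merged.values()))
```

Using an unordered pair of frozen sets keeps the comparison symmetric in direction; filtering against the native reactions of iJO1366, which the text says does respect directionality, would need a separate, ordered key.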


Figure 7. Workflow of EcoSynther

Heterologous pathway-searching algorithm. Based on the heterologous reaction database, we use a network-based search algorithm to identify heterologous pathways for a target molecule. Starting from the target metabolite, a single reaction is selected from the reactions that have the target metabolite as a main product. A previous study (30) showed that a uniform weighting selection scheme consistently outperformed connectivity-based weighting schemes; we therefore also select among candidate reactions with equal probability. That study placed no restrictions on the chosen reactions, so reactions with multiple non-native reactants could be chosen and the constructed pathway could be a tree. In the present study, we select only reactions in which at most one reactant is a non-native metabolite. The search then proceeds repeatedly until a heterologous reaction with only native reactants is reached.

Calculate carbon flux/theoretical yields. For a specific target molecule, the pathway-searching algorithm retrieves heterologous pathways, each of which is integrated into the model (iJO1366) separately. Flux Balance Analysis (FBA) in COBRApy (31), using Eq. 1-4, is then applied to calculate the carbon flux of the reactions in each candidate pathway under the specified conditions (parameter settings). The secretion rate of the target molecule is the objective function (Eq. 1). The integrated metabolic network, consisting of the E. coli genome-scale metabolic model (iJO1366) and the heterologous reactions, is converted into a mathematical model by forming a stoichiometric matrix Sij, in which rows represent metabolites (set I) and columns represent reactions (set J). A steady-state mass-balance constraint is imposed (Eq. 2): the fluxes through the reactions satisfy S·v = 0, which defines a system of linear equations.

Reactions in set J have corresponding flux constraints (a lower bound and an upper bound) (Eq. 3), including those on the uptake rates of the main carbon source and oxygen. Three main carbon sources (glucose, xylose, and glycerol) and two oxygen states (aerobic or anaerobic) are available on the EcoSynther website, following a previous study (32). To approximate experimentally measured results, the maximum uptake rates of the main carbon source and of oxygen, when specified, were set to 20 mmol gDW^-1 hr^-1 (33, 34). The heterologous reactions needed to produce the target molecule were treated as reversible, with the lower bound set to -1000 mmol gDW^-1 hr^-1 and the upper bound set to 1000 mmol gDW^-1 hr^-1. Growth was incorporated into the genome-scale model through the biomass reaction. By setting the minimum growth rate of the mutant E. coli as a percentage of the maximum theoretical growth rate of the WT (Eq. 4), users define the minimal growth rate that the mutant E. coli must achieve to sustain growth. The theoretical yield of the target molecule for each candidate pathway is calculated with Eq. 5.
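The backward, uniformly weighted search described under "Heterologous pathway-searching algorithm" can be sketched in a few lines. This is an illustrative sketch, not the EcoSynther code: the function name, the data layout, and the toy network (loosely modelled on the lycopene pathways of Figure 2, with simplified stoichiometry and hypothetical reaction IDs) are assumptions, and "main product" is simplified to "any product".

```python
import random


def search_pathways(target, reactions, native, max_steps=5, iterations=1000, seed=None):
    """Randomly walk backwards from the target until only native reactants remain.

    reactions maps a reaction ID to (reactants, products); native is the set of
    host metabolites. Only reactions with at most one non-native reactant are used.
    """
    rng = random.Random(seed)
    producers = {}
    for reaction_id, (reactants, products) in reactions.items():
        non_native = [m for m in reactants if m not in native]
        if len(non_native) <= 1:
            for product in products:
                producers.setdefault(product, []).append((reaction_id, non_native))

    found = set()
    for _ in range(iterations):
        current, path, visited = target, [], {target}
        while len(path) < max_steps:
            candidates = producers.get(current, [])
            if not candidates:
                break  # dead end: nothing produces this metabolite
            reaction_id, non_native = rng.choice(candidates)  # uniform weighting
            path.append(reaction_id)
            if not non_native:
                # All reactants are native: store the pathway precursor-first.
                found.add(tuple(reversed(path)))
                break
            current = non_native[0]
            if current in visited:
                break  # avoid cycling through the same metabolite
            visited.add(current)
    return sorted(found, key=len)  # shortest pathways first


# Hypothetical toy network; IPP and FPP are treated as native precursors.
toy_reactions = {
    "ipp_to_ggdp": (["IPP"], ["GGDP"]),
    "fpp_to_ggdp": (["FPP", "IPP"], ["GGDP"]),
    "ggdp_to_phytoene": (["GGDP"], ["phytoene"]),
    "phytoene_to_lycopene": (["phytoene"], ["lycopene"]),
    "phytoene_to_zeta": (["phytoene"], ["zeta_carotene"]),
    "zeta_to_lycopene": (["zeta_carotene"], ["lycopene"]),
}
paths = search_pathways("lycopene", toy_reactions, {"IPP", "FPP"},
                        max_steps=5, iterations=2000, seed=0)
for path in paths:
    print(len(path), " -> ".join(path))
```

Because each iteration is an independent random walk, a small number of iterations may miss some pathways, which is consistent with the run-to-run variation examined in the stability test.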


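$$\max \; v_{\mathrm{target}} \qquad \text{(Eq. 1)}$$

$$\sum_{j \in J} S_{ij}\, v_j = 0 \quad \forall i \in I \qquad \text{(Eq. 2)}$$

$$v_j^{\mathrm{lower\_bound}} \le v_j \le v_j^{\mathrm{upper\_bound}} \quad \forall j \in J \qquad \text{(Eq. 3)}$$

$$\frac{v_{\mathrm{biomass}}}{v_{\mathrm{biomass}}^{\mathrm{WT,max}}} \ge c, \quad c \in [0, 1] \qquad \text{(Eq. 4)}$$

$$\mathrm{Yield} = \frac{v_{\mathrm{target}} \cdot M_{\mathrm{target}}}{v_{\mathrm{substrate}} \cdot M_{\mathrm{substrate}}} \qquad \text{(Eq. 5)}$$

In Eq. 5, the fluxes v are in mmol gDW^-1 hr^-1 and the molecular weights M are in g mol^-1, so the yield is in g/g.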
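As a worked illustration of Eq. 4 and Eq. 5, the sketch below computes the growth floor and converts a target flux into a mass yield. It is a hedged example, not part of EcoSynther: the function names are hypothetical, the wild-type growth rate of 1.0 hr^-1 is a placeholder, the molecular weights are standard approximate values, and the lycopene flux of 0.792 mmol gDW^-1 hr^-1 is back-calculated here from the reported 0.118 g/g rather than taken from the paper.

```python
def minimum_growth_rate(wt_max_growth, fraction):
    """Eq. 4: growth floor as a fraction of the wild-type maximum growth rate."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return fraction * wt_max_growth


def theoretical_yield(v_target, mw_target, v_substrate, mw_substrate):
    """Eq. 5: grams of target produced per gram of substrate consumed.

    Fluxes are in mmol gDW^-1 hr^-1 and molecular weights in g mol^-1;
    the flux units cancel, leaving g/g.
    """
    return (v_target * mw_target) / (v_substrate * mw_substrate)


# Placeholder wild-type maximum growth rate (hr^-1); not a value from the paper.
growth_floor = minimum_growth_rate(wt_max_growth=1.0, fraction=0.8)

# Glucose uptake of 20 mmol gDW^-1 hr^-1 as in the text; approximate molecular weights.
GLUCOSE_MW = 180.16    # g/mol
LYCOPENE_MW = 536.9    # g/mol
lycopene_yield = theoretical_yield(
    v_target=0.792,    # illustrative flux, back-calculated from 0.118 g/g
    mw_target=LYCOPENE_MW,
    v_substrate=20.0,
    mw_substrate=GLUCOSE_MW,
)
print(f"growth floor = {growth_floor:.2f} hr^-1")
print(f"yield = {lycopene_yield:.3f} g/g")
```

In a full calculation, the target flux would come from the FBA solution that maximizes Eq. 1 subject to Eq. 2-4; only the final unit conversion is shown here.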

Draw Metabolic Map. To visualize which native reactions are active when E. coli produces the target molecule under specific growth conditions, EcoSynther integrates Escher (23), a web application for visualizing data on biological pathways. After the carbon fluxes are calculated, users can choose to visualize several sub-networks, such as e_coli_core.Core metabolism, iJO1366.Central metabolism, and iJO1366.Fatty acid beta-oxidation. Note that the color and thickness of the lines represent the carbon fluxes of the active reactions: green represents a small carbon flux and red a large one, and the thicker the line, the larger the carbon flux it represents.

Author Information Corresponding Author *E-mail: [email protected] ORCID: Qian-Nan Hu 0000-0001-5213-472X Notes: The authors declare no competing financial interest.

Acknowledgements This work was supported by the National Science Foundation of China [31270101; 31570092]; the national high technology research and development program [2012CB721000], and the Natural Science Foundation of Tianjin, China.

Associated Content

Supporting Information: The web server visualization of the first pathway for lycopene production using EcoSynther is shown in S1.pbf, and the corresponding reactions with non-zero flux are listed in S2.xls. Metabolites for which EcoSynther retrieved biosynthetic pathways are listed in S3.xls. Detailed information about the stability testing for the two target molecules (lycopene and resveratrol) is given in S4.xls.

References

(1) Long, M. R., Ong, W. K., and Reed, J. L. (2015) Computational methods in metabolic engineering for strain design. Curr. Opin. Biotechnol. 34, 135-141.
(2) Tu, W., Zhang, H., Liu, J., and Hu, Q.-N. (2016) BioSynther: a customized biosynthetic potential explorer. Bioinformatics 32, 472-473.
(3) Luo, Y., Li, B., Liu, D., Zhang, L., and Chen, Y. (2015) Engineered biosynthesis of natural products in heterologous hosts. Chem. Soc. Rev. 44, 5265-5290.
(4) Jakočiūnas, T., Jensen, M., and Keasling, J. (2016) CRISPR/Cas9 advances engineering of microbial cell factories. Metab. Eng. 34, 44-59.
(5) Medema, M., van Raaphorst, R., Takano, E., and Breitling, R. (2012) Computational tools for the synthetic design of biochemical pathways. Nat. Rev. Microbiol. 10, 191-202.
(6) Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y., and Morishima, K. (2017) KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, D353-D361.
(7) Morgat, A., Lombardot, T., Axelsen, K. B., Aimo, L., Niknejad, A., Hyka-Nouspikel, N., Coudert, E., Pozzato, M., Pagni, M., Moretti, S., Rosanoff, S., Onwubiko, J., Bougueleret, L., Xenarios, I., Redaschi, N., and Bridge, A. (2017) Updates in Rhea - an expert curated resource of biochemical reactions. Nucleic Acids Res. 45, D415-D418.
(8) Caspi, R., Billington, R., Ferrer, L., Foerster, H., Fulcher, C. A., Keseler, I. M., Kothari, A., Krummenacker, M., Latendresse, M., Mueller, L. A., Ong, Q., Paley, S., Subhraveti, P., Weaver, D. S., and Karp, P. D. (2016) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 44, D471-D480.
(9) Hu, Q.-N., Deng, Z., Hu, H., Cao, D. S., and Liang, Y. Z. (2011) RxnFinder: biochemical reaction search engines using molecular structures, molecular fragments and reaction similarity. Bioinformatics 27, 2465-2467.
(10) Colletti, P. F., Goyal, Y., Varman, A. M., Feng, X., Wu, B., and Tang, Y. J. (2011) Evaluating Factors That Influence Microbial Synthesis Yields by Linear Regression with Numerical and Ordinal Variables. Biotechnol. Bioeng. 108, 893-902.
(11) Varman, A. M., Xiao, Y., Leonard, E., and Tang, Y. J. (2011) Statistics-based model for prediction of chemical biosynthesis yield from Saccharomyces cerevisiae. Microb. Cell Fact. 10.
(12) McClymont, K., and Soyer, O. S. (2013) Metabolic tinker: an online tool for guiding the design of synthetic metabolic pathways. Nucleic Acids Res. 41, 113-121.
(13) Degtyarenko, K., Matos, P. d., Ennis, M., Hastings, J., Zbinden, M., McNaught, A., Alcántara, R., Darsow, M., Guedj, M., and Ashburner, M. (2008) ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 36, D344-D350.
(14) Pitkänen, E., Jouhten, P., and Rousu, J. (2009) Inferring branching pathways in genome-scale metabolic networks. BMC Syst. Biol. 3, doi:10.1186/1752-0509-3-103.
(15) Hatzimanikatis, V., Li, C., Ionita, J. A., Henry, C. S., Jankowski, M. D., and Broadbelt, L. J. (2005) Exploring the diversity of complex metabolic networks. Bioinformatics 21, 1603-1609.
(16) Moriya, Y., Shigemizu, D., Hattori, M., Tokimatsu, T., Kotera, M., Goto, S., and Kanehisa, M. (2010) PathPred: an enzyme-catalyzed metabolic pathway prediction server. Nucleic Acids Res. 38, W138-W143.
(17) Chou, C.-H., Chang, W.-C., Chiu, C.-M., Huang, C.-C., and Huang, H.-D. (2009) FMM: a web server for metabolic pathway reconstruction and comparative analysis. Nucleic Acids Res. 37, W129-W134.
(18) Rahman, S. A., Advani, P., Schunk, R., Schrader, R., and Schomburg, D. (2005) Metabolic pathway analysis web service (Pathway Hunter Tool at CUBIC). Bioinformatics 21, 1189-1193.
(19) Kuwahara, H., Alazmi, M., Cui, X., and Gao, X. (2016) MRE: a web tool to suggest foreign enzymes for the biosynthesis pathway design with competing endogenous reactions in mind. Nucleic Acids Res. 44, W217-W225.
(20) Orth, J., Conrad, T., Feist, A., and Palsson, B. (2011) A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Mol. Syst. Biol. 7, 535-544.
(21) Kim, Y.-S., Lee, J.-H., Kim, N.-H., Yeom, S.-J., Kim, S.-W., and Oh, D.-K. (2011) Increase of lycopene production by supplementing auxiliary carbon sources in metabolically engineered Escherichia coli. Appl. Microbiol. Biotechnol. 90, 489-497.
(22) Alper, H., Jin, Y.-S., Moxley, J. F., and Stephanopoulos, G. (2005) Identifying gene targets for the metabolic engineering of lycopene biosynthesis in Escherichia coli. Metab. Eng. 7, 155-164.
(23) King, Z. A., Dräger, A., Ebrahim, A., Sonnenschein, N., Lewis, N. E., and Palsson, B. O. (2015) Escher: A Web Application for Building, Sharing, and Embedding Data-Rich Visualizations of Biological Pathways. PLoS Comput. Biol. DOI:10.1371/journal.pcbi.1004321.
(24) Wu, J., Liu, P., Fan, Y., Bao, H., Du, G., Zhou, J., and Chen, J. (2013) Multivariate modular metabolic engineering of Escherichia coli to produce resveratrol from l-tyrosine. J. Biotechnol. 167, 404-411.
(25) Zhang, X., Tervo, C. J., and Reed, J. L. (2016) Metabolic Assessment of E. coli as a Biofactory for Commercial Products. Metab. Eng. 35, 64-74.
(26) Cho, A., Yun, H., Park, J. H., and Lee, S. Y. (2010) Prediction of novel synthetic pathways for the production of desired chemicals. BMC Syst. Biol. 4, 35.
(27) Yim, H., Haselbeck, R., Niu, W., Pujol-Baxley, C., Burgard, A., Boldt, J., Khandurina, J., Trawick, J. D., Osterhout, R. E., Stephen, R., Estadilla, J., Teisan, S., Schreyer, H. B., Andrae, S., Yang, T. H., Lee, S. Y., Burk, M. J., and Van Dien, S. (2011) Metabolic engineering of Escherichia coli for direct production of 1,4-butanediol. Nat. Chem. Biol. 7, 445-452.
(28) Hastings, J., Matos, P. d., Dekker, A., Ennis, M., Harsha, B., Kale, N., Muthukrishnan, V., Owen, G., Turner, S., Williams, M., and Steinbeck, C. (2013) The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res. 41, D456-D463.
(29) Orth, J. D., Thiele, I., and Palsson, B. Ø. (2010) What is flux balance analysis? Nat. Biotechnol. 28, 245-248.
(30) Yousofshahi, M., Lee, K., and Hassoun, S. (2011) Probabilistic pathway construction. Metab. Eng. 13, 435-444.
(31) Ebrahim, A., Lerman, J. A., Palsson, B. O., and Hyduke, D. R. (2013) COBRApy: COnstraints-Based Reconstruction and Analysis for Python. BMC Syst. Biol. 7, 74.
(32) Feist, A. M., Zielinski, D. C., and Orth, J. D. (2010) Model-driven evaluation of the production potential for growth-coupled products of Escherichia coli. Metab. Eng. 12, 173-186.
(33) Varma, A., Boesch, B. W., and Palsson, B. (1993) Stoichiometric Interpretation of Escherichia coli Glucose Catabolism under Various Oxygenation Rates. Appl. Environ. Microbiol. 59, 2465-2473.
(34) Varma, A., and Palsson, B. O. (1994) Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol. 60, 3724-3731.
