Engineering of a Synthetic Metabolic Pathway for the Assimilation of (d

Jul 17, 2015 - The C2 molecule ethylene glycol (EG) is not accessible via the natural metabolic network of E. coli when (d)-xylose is used as the subs...
ACS Synthetic Biology

Engineering of a synthetic metabolic pathway for the assimilation of (D)-xylose into value-added chemicals

Yvan Cam(1-4)*, Ceren Alkim(1-4)*, Debora Trichez(1-4), Vincent Trebosc(1-4), Amélie Vax(1-4), François Bartolo(1,5), Philippe Besse(1,5), Jean Marie François(1-4), Thomas Walther(1-4)

(*) These authors contributed equally to the study.

Short title: A synthetic pathway for xylose assimilation

Affiliations

1 Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, 31077 Toulouse, France;
2 INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés (LISBP), Toulouse, France;
3 CNRS, UMR5504, Toulouse, France;
4 TWB, 3 rue des Satellites, Canal Biotech Building 2, 31400 Toulouse, France;
5 Département Génie Mathématiques et Modélisation (GMM), 135 Avenue de Rangueil, 31077 Toulouse, France

Key words: synthetic pathway, xylose utilization, glycolic acid production, Escherichia coli

Subject area: synthetic biology, chemical biology, metabolic engineering

Corresponding authors:

Yvan Cam ([email protected])
Thomas Walther ([email protected])

Abstract

A synthetic pathway for (D)-xylose assimilation was stoichiometrically evaluated and implemented in Escherichia coli strains. The pathway proceeds via isomerization of (D)-xylose to (D)-xylulose, phosphorylation of (D)-xylulose to obtain (D)-xylulose-1-phosphate (X1P), and aldolytic cleavage of the latter to yield glycolaldehyde and DHAP. Stoichiometric analyses showed that this pathway provides access to ethylene glycol with a theoretical molar yield of 1. Alternatively, both glycolaldehyde and DHAP can be converted to glycolic acid with a theoretical yield that is 20 % higher than for the exclusive production of this acid via the glyoxylate shunt. Simultaneous expression of xylulose-1 kinase and X1P aldolase activities, provided by human ketohexokinase-C and human aldolase-B, respectively, restored growth of a (D)-xylulose-5-kinase mutant on xylose. This strain produced ethylene glycol as the major metabolic endproduct. Metabolic engineering provided strains that assimilated the entire C2 fraction into the central metabolism, or that produced 4.3 g/l glycolic acid at a molar yield of 0.9 in shake flasks.

Introduction

The high volatility of oil market prices and growing environmental concerns have led to a search for alternative and renewable feedstocks for the production of chemicals.1,2 In particular, lignocellulosic biomass is considered to have a high potential to serve as a substitute for fossil resources because it is not used in human nutrition and can be readily converted into sugar monomers that can be used in industrial fermentation processes.3 Lignocellulosic biomass is rich in (D)-xylose. Depending on the bio-refinery process employed, the xylose fraction in the resulting hydrolysates varies between 6 and 25 % for whole-plant hydrolysates4 and reaches 70-80 % in the hemicellulosic fraction.5,6 Thus, the efficient use of lignocellulosic biomass requires the development of optimal metabolic pathways for the conversion of (D)-xylose into the desired value-added product.

Metabolic engineering strategies commonly aim at increasing product yields by deleting reactions that consume the desired compound, by increasing the activities of rate-limiting reactions, and by removing feed-back inhibition exerted by final or intermediate products on their own biosynthesis. While this approach has proven successful in a vast number of cases, it ultimately relies on the rather local modification of a preexisting metabolic network. It is therefore limited to a predefined stoichiometry which was optimized by evolution for biomass and energy production,7–10 and not for the synthesis of value-added chemicals. This fact may render microbial product syntheses sub-optimal from a purely stoichiometric point of view. The identification and evaluation of alternative chemically feasible pathways that provide a higher yield for the synthesis of value-added metabolites,11–14 and the implementation of these synthetic pathways by enzyme engineering and/or rational strain design2,15–18 are therefore highly important.

In this context we investigate the consequences of replacing the natural pathway for (D)-xylose assimilation in Escherichia coli by a synthetic route, which changes the stoichiometry of xylose metabolism and therefore product formation, and which by-passes the regulatory constraints of the natural metabolic pathway.

In wild-type E. coli cells, (D)-xylose is first converted into (D)-xylulose-5P via successive isomerization and phosphorylation reactions (Figure 1). (D)-Xylulose-5P is then further processed through the pentose phosphate pathway and enters the Embden-Meyerhof-Parnas (EMP) pathway in the form of fructose-6P and glyceraldehyde-3P. We herein investigate the metabolic consequences of assimilating (D)-xylose via isomerization to (D)-xylulose, phosphorylation of the latter in position 1 to yield (D)-xylulose-1P, and subsequent aldolytic cleavage of xylulose-1P into the C3 molecule dihydroxyacetone phosphate (DHAP) and the C2 molecule glycolaldehyde (Figure 1; in the following referred to as the xylulose-1P pathway). DHAP is then metabolized via the EMP pathway to produce biomass precursors and energy. The metabolic fate of glycolaldehyde in E. coli is less clear since this organism naturally possesses a panel of enzymatic activities that may either reduce this C2 molecule to ethylene glycol, or oxidize it to glycolic acid and further to glyoxylic acid, from where it may eventually re-enter the central metabolic pathways. Alternatively, glycolaldehyde may be metabolized by a yet unknown metabolic route.

The degradation of xylitol by the described pathway was earlier observed in human liver cells.19 We herein analyzed the stoichiometric consequences of replacing the natural (D)-xylose assimilation pathway by a synthetic reaction sequence on the production of selected value-added chemicals. We then implemented the synthetic pathway in E. coli by characterizing candidate enzymes that possess the required enzymatic activities, by identifying the minimal set of synthetic pathway reactions that confer growth on (D)-xylose, and by exploring the metabolic fate of pathway intermediates. We characterized the transcriptional response of cells in the presence of the synthetic pathway and found that it was partially controlled by the presence of the pathway intermediate glycolaldehyde. Finally, we engineered the E. coli host strain to optimize the production of glycolic acid via the new pathway, obtaining 90 % of the theoretical yield on (D)-xylose.

Results

Stoichiometric evaluation of the synthetic pathway for conversion of (D)-xylose into selected value-added chemicals

The synthetic xylulose-1P pathway changes the overall stoichiometry of xylose metabolism. To evaluate the potential of the new pathway regarding the production of value-added chemicals we calculated the theoretical maximum yield for molecules that can be produced via the annotated activities in the natural metabolic network of E. coli20,21 and which are in the ‘vicinity’ of the synthetic pathway. We extended this analysis to cover the natural xylose isomerase (XI) pathway of E. coli, and alternative natural pathways for xylose assimilation which were found in other microorganisms. In particular we analyzed the stoichiometry of the xylose reductase-xylitol dehydrogenase (XR-XDH) pathway, which is the preferred pathway for xylose assimilation in yeast and fungi;22 the Dahms pathway,23 which proceeds via the oxidation of xylose to xylonate, the dehydration of the latter to 2-dehydro-3-deoxy-D-pentonate, and aldolytic cleavage to yield glycolaldehyde and pyruvate; and the Weimberg pathway,24 which starts as the Dahms pathway but dehydrates 2-dehydro-3-deoxy-D-pentonate to yield 2-oxoglutarate.

We extended a previously published stoichiometric model of the central metabolic pathways in E. coli25 by the reactions comprising the natural and the synthetic xylose pathways (xylose isomerase, xylulose-5-kinase, xylulose-1-kinase, xylulose-1P aldolase, xylose reductase, xylitol dehydrogenase, xylose dehydrogenase, xylonate dehydratase, 2-dehydro-3-deoxy-D-pentonate aldolase, 2-dehydro-3-deoxy-D-pentonate dehydratase), the reactions catalyzed by glycolaldehyde reductase, glycolaldehyde dehydrogenase, glycolate oxidase, glyoxylate reductase, and the glycerol-3P dependent pathway towards 1,3-propanediol (Figure S1). Since the membrane-associated enzyme glycolate oxidase requires the presence of oxygen as terminal electron acceptor, all simulations implicating this enzyme were limited to aerobic conditions.

Elementary flux mode analysis,26 which is implemented in the software package CellNetAnalyzer,27 was used to analyze the stoichiometric model. The results of the simulations are summarized in Table 1. The C2 molecule ethylene glycol (EG) is not accessible via the natural metabolic network of E. coli when (D)-xylose is used as the substrate.20,21 In contrast, the synthetic pathway has the potential to produce 1 mol EG per mol of xylose, thus providing increased metabolic flexibility. When calculating the maximum theoretical yield for the production of glycolic acid (GA) from (D)-xylose, we found a value of 2 mol glycolic acid per mol xylose for the synthetic pathway. One mole of GA is directly produced from the C2 molecule glycolaldehyde which is released in the aldolytic cleavage of (D)-xylulose-1P (X1P). Another mole of GA is derived from the C3 product of the aldolase reaction (DHAP) which is metabolized via the EMP pathway and the glyoxylate shunt (Figure S1). The theoretical GA yield of the natural metabolic network is only 1.66 mol/mol (Figure S1). Thus, the stoichiometry of the synthetic pathway confers a 20 % advantage over natural xylose metabolism. The reason for the superiority of the synthetic pathway is that it produces the C2 precursor glycolaldehyde through the carbon-conserving aldolytic cleavage of X1P. The C2 precursor in the natural metabolic network is glyoxylic acid. This metabolite is derived from isocitrate (via isocitrate lyase), whose production requires entry of carbon into the Krebs cycle and therefore causes carbon loss due to the decarboxylation of pyruvate.
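The arithmetic behind this comparison can be sketched in a few lines of Python. The snippet below is only an illustration: it checks the carbon balance of the three synthetic steps and recomputes the relative advantage from the two theoretical GA yields quoted above (2 and 1.66 mol/mol). The dictionary and function names are assumptions made for this example, and cofactors such as ATP are deliberately ignored, so it is a carbon bookkeeping check and not a replacement for the elementary flux mode analysis.

```python
# Carbon atoms per molecule for the intermediates named in the text.
CARBONS = {
    "xylose": 5,
    "xylulose": 5,
    "xylulose-1P": 5,
    "glycolaldehyde": 2,
    "DHAP": 3,
}

# Synthetic pathway steps as (substrates, products).
# Cofactors (ATP/ADP) are omitted on purpose: only carbon is tracked.
STEPS = [
    (["xylose"], ["xylulose"]),                      # isomerization
    (["xylulose"], ["xylulose-1P"]),                 # phosphorylation in position 1
    (["xylulose-1P"], ["glycolaldehyde", "DHAP"]),   # aldolytic cleavage
]


def carbon_balanced(step):
    """Return True if substrates and products carry the same number of carbons."""
    substrates, products = step
    return sum(CARBONS[m] for m in substrates) == sum(CARBONS[m] for m in products)


def relative_advantage(synthetic_yield, natural_yield):
    """Relative gain of the synthetic over the natural yield (0.2 means 20 %)."""
    return synthetic_yield / natural_yield - 1.0


all_balanced = all(carbon_balanced(step) for step in STEPS)
ga_advantage = relative_advantage(2.0, 1.66)  # yields in mol GA per mol xylose

print("all steps carbon-balanced:", all_balanced)
print("relative GA yield advantage:", round(ga_advantage, 3))
```

With the yields given in the text the ratio comes out at roughly 0.20, which matches the stated 20 % advantage.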

It is of note that the stoichiometric advantage of the synthetic xylulose-1P pathway for the production of C2 molecules is also maintained when comparing it to the xylose reductase (XR)–xylitol dehydrogenase (XDH) pathway, which is the preferred route for (D)-xylose assimilation in yeast and fungi.22 The combined action of the (commonly) NADPH-dependent XR and NAD-dependent XDH activities results in the production of (D)-xylulose which is then phosphorylated to yield (D)-xylulose-5P before entering the pentose phosphate pathway. Since the stoichiometric model of the metabolic network in E. coli assumes the NADH and NADPH cofactors to be interconvertible by transhydrogenase reactions,25 the theoretical yield of the XR/XDH pathway is identical to that of the (D)-xylose isomerase pathway. The Dahms pathway has a stoichiometry very similar to that of the xylulose-1P pathway, and provides access to EG and GA at identical theoretical yields. However, since this pathway produces pyruvate instead of DHAP, it requires gluconeogenic activity to confer growth and is energetically less efficient. The Weimberg pathway provides no access to EG and has the smallest theoretical GA yield (1 mol/mol) of all analyzed metabolic routes (Table 1).

While the synthetic pathway offers a significant advantage over the natural xylose metabolism when regarding the production of the C2 compounds GA and EG, the pathway is outperformed by the natural E. coli reaction network when analyzing maximum yields of malate, succinate, ethanol, and 1,3-propanediol (Table 1). The reason for the lower performance of the synthetic pathway is that the C2 product of the aldolase reaction can only be re-integrated into central metabolism via glyoxylate through the action of malate synthase (Figure S1). This reaction requires acetyl-CoA as a substrate which is produced by pyruvate dehydrogenase in E. coli, and which therefore causes the carbon loss of one mole CO2 per mole of glyoxylate that is utilized (the metabolic route comprised of pyruvate oxidase and acetyl-CoA synthetase has equal stoichiometry regarding the carbon balance).

In conclusion, our analyses have identified a significant stoichiometric advantage of the synthetic pathway over natural xylose metabolism when regarding the synthesis of the C2 molecules GA and EG, whereas the production of C3 and C4 molecules was predicted to be less efficient. We therefore focused our work on the biosynthesis of the listed C2 molecules.

Identification and characterization of candidate enzymes possessing (D)-xylulose-1-kinase and (D)-xylulose-1P aldolase activity

The construction of the synthetic pathway required the identification of enzymes that had xylulose-1-kinase and xylulose-1P aldolase activities. Human ketohexokinase C (Khk-C) was earlier reported to accept (D)-xylulose as a substrate,28,29 and human aldolase B (Aldo-B) was found to be able to cleave X1P into glycolaldehyde and DHAP.29 However, since the heterologous expression of human genes in E. coli may yield insufficient enzymatic activities, we also tested alternative candidate enzymes of bacterial origin. In particular, we analyzed the kinetic parameters of the E. coli enzymes (L)-fuculokinase, encoded by fucK,30 and (L)-rhamnulose kinase, encoded by rhaB,31 on (D)-xylulose. These enzymes were chosen because FucK was previously reported to have (D)-xylulose-1 kinase activity,32 and because RhaB has 1-kinase activity on a hexose sugar that is sterically cognate with (D)-xylulose. The genes encoding the mentioned enzymes were cloned into the pET-28a expression vector, thereby adding an N-terminal His-tag. The enzymes were expressed in E. coli BL21(DE3) cells, and purified using affinity chromatography. The purity of the protein fractions was verified by SDS-PAGE (not shown) prior to enzymatic tests.

We found that all three candidate kinases were active on (D)-xylulose, with RhaB having the highest activity on the pentose sugar and FucK the lowest (Table 2). Neither fuculokinase nor rhamnulose kinase could be saturated at (D)-xylulose concentrations of up to 50 mM, and both were found to have a much lower affinity for (D)-xylulose than Khk-C (Table 2). For the latter enzyme we found Km values of 0.5 mM and 0.31 mM on (D)-xylulose and (D)-fructose, respectively, which is comparable to the reported literature values (0.44 mM and 0.8 mM28), and which indicates that Khk-C is active on both sugars with almost equal specificity. Since we considered the kinetic parameters of Khk-C adequate for the in vivo function of our pathway, we tested the capacity of E. coli to heterologously express this enzyme. We found a xylulose-1 kinase activity of ~0.59 µmol/(mg protein · min) in crude cell extract upon expression of Khk-C from the medium copy pACT3-khkC plasmid. This value is in the range of the activities of enzymes that are part of the glycolytic pathway,33 and we therefore decided to continue our work using this enzyme.
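To illustrate what the reported Km values mean in practice, the following sketch computes the fractional saturation v/Vmax of Khk-C at a few substrate concentrations. It assumes simple Michaelis-Menten kinetics, which the text implies by quoting Km values but does not state explicitly; the function name and the chosen test concentrations are hypothetical and serve only as an example.

```python
def fractional_saturation(substrate_mM, km_mM):
    """Return v/Vmax = S / (Km + S) for Michaelis-Menten kinetics."""
    if substrate_mM < 0 or km_mM <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return substrate_mM / (km_mM + substrate_mM)


# Km values of Khk-C measured in this study (mM).
KM_KHKC_MM = {"(D)-xylulose": 0.5, "(D)-fructose": 0.31}

# Hypothetical substrate concentrations (mM), chosen only for illustration.
TEST_CONCENTRATIONS_MM = (0.1, 0.5, 5.0, 50.0)

for sugar, km in KM_KHKC_MM.items():
    for s in TEST_CONCENTRATIONS_MM:
        frac = fractional_saturation(s, km)
        print(f"{sugar}: S = {s} mM -> v/Vmax = {frac:.2f}")
```

Under this assumption Khk-C runs at half of its maximal rate at 0.5 mM (D)-xylulose and is close to saturation at 5 mM, whereas an enzyme that is not saturated at 50 mM, as observed for FucK and RhaB, must have a far higher Km.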

In a similar way we searched for candidate X1P aldolase enzymes. In addition to Aldo-B, which had a reported activity on X1P, we tested the enzymes fructose-1,6-bisphosphate (F16bP) aldolase (FbaB) from E. coli, and tagatose-1,6-bisphosphate (T16bP) aldolase (LacD) from Lactococcus lactis. The functional expression of the aldolases was verified on F16bP (Table 3). We confirmed the X1P aldolase activity of Aldo-B and found that FbaB was also active on this substrate. The L. lactis T16bP aldolase was inactive on X1P (Table 3). Upon expression of Aldo-B and FbaB from high copy plasmids in wild-type E. coli cells we obtained an X1P aldolase activity of 0.09 µmol/(mg protein · min) in crude cell extract for Aldo-B, whereas the X1P aldolase activity measured upon the expression of FbaB was nearly indistinguishable from the background. We therefore decided to continue our work using Aldo-B as the X1P aldolase.

Simultaneous expression of (D)-xylulose-1-kinase and (D)-xylulose-1P aldolase confers growth of a (D)-xylulose-5-kinase deleted mutant on (D)-xylose

We next identified the core set of synthetic pathway reactions whose presence is necessary to confer growth on (D)-xylose. As expected, a xylB mutant strain which is defective in the (D)-xylulose-5 kinase could not grow on (D)-xylose (Figure 2A). The fact that this strain was able to proliferate on medium containing 20 mmol/l dihydroxyacetone as the only carbon source (Figure 2A) indicated that growth of the ΔxylB mutant should be restored if sufficient DHAP could be supplied via the synthetic pathway. The ΔxylB mutant strain was indeed able to proliferate on (D)-xylose upon simultaneous expression of khk-C and aldo-B, but not during expression of either of these genes alone (Figure 2B). These results indicate that the synthetic pathway is functional in vivo, and that E. coli does not naturally express either of the two required activities when cultivated on (D)-xylose.

The metabolic fate and regulatory role of glycolaldehyde

The capacity of the synthetic pathway to enable growth on (D)-xylose led us to investigate the physiological consequences of expressing this pathway in E. coli. Growth and product formation kinetics of the wild-type strain and the ΔxylB mutant strain expressing the synthetic pathway enzymes Khk-C and Aldo-B (strain Pen205) during cultivation on (D)-xylose are depicted in Figure 3. We found that the specific growth rate of strain Pen205 was reduced by approximately 3.5-fold compared to wild-type cells (0.11 h⁻¹ vs. 0.41 h⁻¹). The wild-type strain transiently accumulated pyruvate and lactate, and produced 75 mmol/l acetate and 18 mmol/l formate from 70 mmol/l (D)-xylose (Figure 3A). In contrast, we could not detect lactate, pyruvate, or acetate in the cultivation medium of strain Pen205, but found that EG and formate accumulated to concentrations of 27 mmol/l and 16 mmol/l, respectively (Figure 3B). Furthermore, the wild-type strain consumed 6.9 ±1.4 mmol/l glycolaldehyde and produced 3.3 ±0.4 mmol/l EG within 11 h of incubation on (D)-xylose medium which was supplemented with 10 mmol/l glycolaldehyde (not shown). Thus, E. coli has the natural capability to reduce glycolaldehyde to EG, which is presumably brought about by one or several of the enzymes which were previously shown to have glycolaldehyde reductase activity.34 On the other hand, the amount of EG produced during the growth of strain Pen205 on (D)-xylose and during exposure of wild-type cells to glycolaldehyde could only partially account for the (D)-xylose or glycolaldehyde, respectively, which had been consumed. Thus, glycolaldehyde was utilized by one or several additional pathways.
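The numbers in this paragraph can be cross-checked with a short calculation. The sketch below converts the two specific growth rates into doubling times, recomputes the fold difference, and estimates which fraction of the consumed glycolaldehyde was recovered as EG in the supplementation experiment. It uses only the mean values quoted above and ignores the reported standard deviations; the function and variable names are assumptions made for this example.

```python
import math


def doubling_time_h(mu_per_h):
    """Doubling time (h) for exponential growth at specific growth rate mu (1/h)."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h


# Specific growth rates on (D)-xylose reported in the text (1/h).
MU_WILD_TYPE = 0.41
MU_PEN205 = 0.11

fold_reduction = MU_WILD_TYPE / MU_PEN205

# Glycolaldehyde supplementation experiment with wild-type cells (mmol/l, mean values).
GLYCOLALDEHYDE_CONSUMED = 6.9
EG_PRODUCED = 3.3
eg_recovery = EG_PRODUCED / GLYCOLALDEHYDE_CONSUMED

print(f"doubling time wild type: {doubling_time_h(MU_WILD_TYPE):.2f} h")
print(f"doubling time Pen205:    {doubling_time_h(MU_PEN205):.2f} h")
print(f"fold reduction in growth rate: {fold_reduction:.1f}")
print(f"fraction of glycolaldehyde recovered as EG: {eg_recovery:.2f}")
```

Roughly half of the consumed glycolaldehyde is recovered as EG in this estimate, which is consistent with the conclusion that additional glycolaldehyde-consuming routes must exist.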

To better understand the metabolic re-arrangements triggered by the presence of the synthetic pathway and its intermediate glycolaldehyde, we compared the genome-wide transcriptional activity of the wild-type strain growing on (D)-xylose (condition 1), strain Pen205 growing on (D)-xylose (condition 2), and the wild-type strain growing on (D)-xylose in the presence of 10 mmol/l glycolaldehyde (condition 3). We found that 288 genes were significantly up- or down-regulated by at least 3-fold (p≤0.05) in strain Pen205 and in response to glycolaldehyde compared to growth of wild-type cells on pure (D)-xylose (Figure 4). Upon hierarchical clustering of these 288 genes we could identify 6 characteristic clusters. Representative gene ontology (GO) terms and KEGG pathway categories that were over-represented in each of the gene clusters are depicted below the graphs showing the individual clusters (Figure 4, see Supplementary File 1 for complete gene lists and statistical analyses). Clusters 1 and 2 were very similar and contained genes which were induced in strain Pen205 and in the presence of glycolaldehyde. They differed only in that cluster 1 contained genes whose expression was even further increased in cells exposed to glycolaldehyde. We found that clusters 1 and 2 were enriched for genes implicated in the metabolism of glycolate (glcABDF), glyoxylate and allantoin (hyi, gcl, glxR, allB, aceA), and pyruvate (gloA, poxB). The glc operon is induced by glycolate,35 and glyoxylate induces the transcription of the operon which contains (among others) the genes hyi, gcl, glxR, and allB.36,37 The finding that the transcription of these genes was up-regulated (i) indicated that both compounds were present in strain Pen205 and in wild-type cells exposed to glycolaldehyde, and (ii) suggested that a significant fraction of glycolaldehyde was consumed via glycolate and glyoxylate (Figure S1, see below). Cluster 1 also contained the entire frmABR operon, which comprises genes implicated in the detoxification of formaldehyde to formate.38,39 This may explain why formate accumulated in the supernatant of strain Pen205 under aerobic conditions (as witnessed by the absence of other fermentative end-products, such as lactate and ethanol). Gene cluster 3 contained genes upregulated in strain Pen205 and was enriched for genes implicated in the SOS response (recA, yebG, umuCD, sulA, dinI), and the citrate (sdhACD) and glyoxylate cycle (aceBK). These results indicated that the presence of the synthetic pathway derepressed the oxidative metabolism and triggered a pronounced stress response independently from its intermediate glycolaldehyde. The enrichment of genes implicated in transport processes (livJ, fhuA, gatAB, ompF, exbBD) in cluster 4 suggested that the cells exposed to extracellular glycolaldehyde may have tried to escape the effect of this compound by down-regulating transport proteins. Cluster 5 was enriched for genes which function under oxygen-limiting conditions (GO terms: citrate cycle and glycerophospholipid metabolism: frdABCD, glpABQT, gldA). The high expression of these genes in the wild-type cultures lets us speculate that these cells were strongly oxygen-limited (presumably due to the much faster growth of these cells, see above). This notion is corroborated by the accumulation of the fermentative end-products lactate and formate (Figure 3A). In addition, cluster 5 was enriched for genes implicated in histidine, alanine, aspartate, and glutamate metabolism (hisBCDG, aspA, asnB, pyrBI). In cluster 6, genes were overrepresented which have functions in flagellar assembly and chemotaxis, indicating that the presence of the synthetic pathway, but not its intermediate glycolaldehyde alone, caused a strong perturbation of these systems.
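The selection criterion used for the 288 genes (at least 3-fold up- or down-regulation with p≤0.05) can be written as a small filter. The sketch below is not the authors' analysis pipeline, which is not described in this section; the gene names and all numerical values are invented placeholders, and the record layout is an assumption made for illustration.

```python
import math
from typing import NamedTuple


class GeneResult(NamedTuple):
    name: str
    fold_change: float  # expression ratio versus the reference condition
    p_value: float


def is_differential(gene, min_fold=3.0, max_p=0.05):
    """True if the gene changes at least min_fold in either direction with p <= max_p."""
    log_change = abs(math.log2(gene.fold_change))
    return log_change >= math.log2(min_fold) and gene.p_value <= max_p


# Hypothetical results; names and numbers are placeholders, not data from the study.
results = [
    GeneResult("geneA", 8.0, 0.001),  # 8-fold up, significant
    GeneResult("geneB", 0.2, 0.010),  # 5-fold down, significant
    GeneResult("geneC", 2.0, 0.001),  # significant, but below the fold threshold
    GeneResult("geneD", 6.0, 0.200),  # large change, but not significant
]

selected = [gene.name for gene in results if is_differential(gene)]
print(selected)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a ratio of 0.2 (5-fold down) passes the same threshold as a ratio of 5.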

In summary, our transcriptome data show that part of the transcriptional response to the synthetic pathway is caused by the presence of its intermediate glycolaldehyde, and its oxidized derivatives glycolate and glyoxylate. In particular, the transcriptional upregulation of glyoxylate, pyruvate, and formaldehyde metabolism (Clusters 1, 2), and the down-regulation of genes implicated in fermentative metabolism and amino acid synthesis (Cluster 5) can also be observed in wild-type cells exposed to glycolaldehyde. However, cells appear to be more stressed when growing with the synthetic pathway, as indicated by the upregulation of the SOS response (Cluster 3) and the down-regulation of flagellar assembly and chemotaxis (Cluster 6). The observations that (i) a nearly complete pathway between glycolaldehyde and malate could be traced based on the increased transcriptional activity of the genes coding for glycolate oxidase and malate synthase, and (ii) glycolic acid accumulated only very transiently (Figure 3B), suggested that a significant part of the glycolaldehyde was assimilated into the central metabolism via the intermediates glycolic acid, glyoxylic acid and malate.

Strain engineering for complete assimilation of glycolaldehyde and increasing glycolic acid yield

Contrary to the glycolate oxidase and malate synthase genes, the gene encoding glycolaldehyde dehydrogenase, aldA,40 was not upregulated in response to the synthetic pathway or glycolaldehyde. To explore the possibility of assimilating the entire C2 fraction produced by the synthetic pathway back into the central metabolism, we overexpressed aldA from a medium copy plasmid in parallel with the synthetic pathway enzymes Khk-C and Aldo-B in a ΔxylB mutant strain (Pen221). We found that the molar yield of EG decreased by 44 % and that of GA increased two-fold (Figure 5 and Table S1). However, the total fraction of the C2 molecules GA and EG (YC2 = ([EG] + [GA])/[xylose]) which accumulated in the supernatant of Pen221 decreased to 30 % of the amount of glycolaldehyde that was produced from xylose. A further increase in aldA expression was achieved by using a high copy plasmid (pEXT20-khkC-aldoB-aldA, Pen462), which abolished EG production without changing the accumulation of GA (Figure 5). When strain Pen462 was cultivated in baffled flasks to increase aeration of the culture, only traces of EG and GA could be detected in the supernatant, indicating the complete assimilation of the C2 fraction into the central metabolism (Figures 5 and S2A, Table S1). Furthermore, the inactivation of glycolate oxidase, which was brought about by the deletion of the glcD subunit, resulted in the accumulation of GA to concentrations of 3.1 ±0.15 g/l (0.75 mol/mol) and 4.3 ±0.006 g/l (0.9 mol/mol) when aldA was expressed from a medium or a high copy plasmid, respectively (Figure 5). Together, these data indicated that a major fraction of glycolaldehyde was indeed assimilated via the intermediates glyoxylate and malate by strain Pen462, and that GA could be produced via the synthetic pathway at 90 % of the theoretical yield (which is 1 mol GA per mol xylose without engineering of the glyoxylate shunt, see Figure S1). However, the observation that the total C2 yield in strain Pen224 (ΔxylB ΔglcD double mutant over-expressing the synthetic pathway enzymes Khk-C and Aldo-B but not AldA) did not reach 100 % (Figure 5) indicates that yet another unidentified glycolaldehyde-consuming pathway exists in E. coli.
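The yield definition used above, YC2 = ([EG] + [GA])/[xylose], and the conversion between the mass concentrations and molar yields quoted for GA can be reproduced with the sketch below. The molar masses (76.05 g/mol for glycolic acid, 150.13 g/mol for xylose) are standard values that do not appear in the text, and the amount of xylose consumed is back-calculated from the reported titer and yield, so it is an inferred figure and not a measurement from the study. Function names are assumptions made for this example.

```python
MW_GLYCOLIC_ACID = 76.05  # g/mol (C2H4O3); standard value, not taken from the text
MW_XYLOSE = 150.13        # g/mol (C5H10O5); standard value, not taken from the text


def g_per_l_to_mmol_per_l(conc_g_per_l, molar_mass):
    """Convert a mass concentration (g/l) into a molar concentration (mmol/l)."""
    return conc_g_per_l / molar_mass * 1000.0


def c2_yield(eg_mmol, ga_mmol, xylose_consumed_mmol):
    """Y_C2 = ([EG] + [GA]) / [xylose], all concentrations in mmol/l."""
    if xylose_consumed_mmol <= 0:
        raise ValueError("xylose consumption must be positive")
    return (eg_mmol + ga_mmol) / xylose_consumed_mmol


# Reported GA titer and molar yield with aldA on the high copy plasmid.
ga_mmol = g_per_l_to_mmol_per_l(4.3, MW_GLYCOLIC_ACID)

# Back-calculated xylose consumption implied by a molar yield of 0.9 (inferred).
xylose_mmol = ga_mmol / 0.9
xylose_g_per_l = xylose_mmol * MW_XYLOSE / 1000.0

print(f"GA: {ga_mmol:.1f} mmol/l")
print(f"implied xylose consumed: {xylose_mmol:.1f} mmol/l ({xylose_g_per_l:.1f} g/l)")
print(f"Y_C2 with no EG: {c2_yield(0.0, ga_mmol, xylose_mmol):.2f}")
```

Under these assumptions 4.3 g/l GA corresponds to about 56.5 mmol/l, which at a molar yield of 0.9 implies that roughly 63 mmol/l of xylose was consumed.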

Discussion

We have explored the theoretical and practical potential of a synthetic pathway for the assimilation of (D)-xylose in E. coli. This pathway alters the stoichiometry of xylose assimilation, thereby providing access to the value-added chemical EG which cannot be obtained via the natural metabolic network of E. coli, and conferring a 20 % higher maximum theoretical GA yield compared to the natural metabolism of this microorganism. In contrast, it decreases the theoretical yield for the biosynthesis of other metabolites of industrial interest. On a more conceptual level these results indicate that it is indeed possible to heavily alter the stoichiometry of sugar metabolism to be better suited for the production of a target molecule, and that this approach has a high potential to complement metabolic engineering strategies that focus on the rather local modification of a metabolic network.

Our approach essentially consists in the carbon-conserving production of C2 molecules via the asymmetric aldolytic cleavage of a xylose-derived C5 compound, and in this regard it is similar to the work of Liu41 and Stephanopoulos42. Liu and colleagues41 previously demonstrated the production of EG from (D)-xylose via the Dahms pathway23. Their reaction sequence proceeds via the oxidation of xylose to xylonate, the dehydration of the latter to 2-dehydro-3-deoxy-D-pentonate, followed by an aldolytic cleavage to yield glycolaldehyde and pyruvate.41 This pathway also provides access to EG and a stoichiometric advantage of 20 % over the natural metabolism for the production of GA. However, our pathway has a higher energy output (one additional ATP per molecule of xylose), and does not depend on an initial oxidation step, which provides more metabolic flexibility, notably under oxygen-limiting conditions. In a recent patent, Stephanopoulos et al.42 demonstrated the assimilation of (D)-xylose via isomerization to (D)-xylulose, epimerization of (D)-xylulose to (D)-ribulose, phosphorylation of the latter to yield (D)-ribulose-1P, which was followed by an aldolytic cleavage to obtain glycolaldehyde and DHAP. While this pathway has the same stoichiometry as ours, it additionally employs the synthetic (D)-xylulose epimerase reaction, which may complicate implementation of the pathway since another enzymatic activity has to be identified and optimized by protein engineering.

On the industrial scale, EG and GA are exclusively produced from fossil resources.43,44 EG is mainly used as an anti-freezing agent and as a precursor for plastics.43,45 Its annual production volume is expected to reach ~28 Mt in 2015.46 GA is used in the textile industry and in skin care products.44,47 Its annual production currently amounts to ~0.4 Mt.48 Thus, both molecules have considerable industrial value and the replacement of current petroleum-based syntheses by biochemical processes may prove highly beneficial. Liu41 and Stephanopoulos42 focused on the optimization of EG production, obtaining a molar product yield on xylose of 0.67 and 0.84, respectively. We chose to optimize the production of GA because it can be derived from both products of the xylulose-1P aldolase reaction, i.e. glycolaldehyde and DHAP, contrary to EG, which can only be produced from glycolaldehyde. Thus, the maximum theoretical carbon yield of GA (0.8 Cmol/Cmol) is much higher than for EG (0.4 Cmol/Cmol), which renders a potential industrial application of this technology more attractive and more sustainable. We found that very few genetic modifications, namely the inactivation of glycolate oxidase by deleting the subunit glcD and the overexpression of the aldehyde dehydrogenase AldA, increased the GA yield from zero to 0.9 mol/mol in the strain that assimilated (D)-xylose via the synthetic pathway. This value corresponds to 90 % of the theoretical maximum of this pathway, and is nearly identical to the maximum GA yield that was obtained with an optimized E. coli strain that accumulated ~50 g/L GA during fed-batch cultivation on glucose.49 This strain produced glycolic acid via a heavily engineered Krebs cycle and glyoxylate shunt and contained genetic modifications (among others) that resulted in the overexpression of glyoxylate reductase (ghrA) and isocitrate lyase (aceA), the deletion of both malate synthases (aceB, glcB) and glycolate oxidase (glcDEF), the transcriptional derepression of the glyoxylate shunt and Krebs cycle (by deletion of iclR and arcA), the deletion of the Entner-Doudoroff pathway (edd-eda), and a nearly complete inactivation of isocitrate dehydrogenase, icd.49 Whether the GA yield from (D)-xylose can be further improved by the simultaneous action of our synthetic pathway and the engineered Krebs cycle and glyoxylate shunt49 is currently under investigation.
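The carbon yields quoted in this paragraph follow from simple carbon bookkeeping, which the short Python sketch below illustrates. It assumes (D)-xylose as a C5 substrate and GA and EG as C2 products. The function name `carbon_yield` is hypothetical, and the value of 2 mol GA per mol xylose is not stated explicitly in the text but is inferred here from the quoted maximum of 0.8 Cmol/Cmol.

```python
# Carbon bookkeeping for C2 products made from (D)-xylose (C5).
# The function name is illustrative and not taken from the study.

XYLOSE_CARBONS = 5  # (D)-xylose is a pentose
C2_CARBONS = 2      # glycolic acid (GA) and ethylene glycol (EG) are C2 compounds


def carbon_yield(mol_product_per_mol_xylose, product_carbons=C2_CARBONS):
    """Convert a molar yield (mol/mol) into a carbon yield (Cmol/Cmol)."""
    return mol_product_per_mol_xylose * product_carbons / XYLOSE_CARBONS


# Assumption: 2 mol GA per mol xylose, inferred from the quoted 0.8 Cmol/Cmol.
ga_max = carbon_yield(2.0)
# EG can only be formed from glycolaldehyde: at most 1 mol per mol xylose.
eg_max = carbon_yield(1.0)
# Observed GA yield of 0.9 mol/mol in the engineered strain.
ga_observed = carbon_yield(0.9)

print(f"GA maximum:  {ga_max:.2f} Cmol/Cmol")
print(f"EG maximum:  {eg_max:.2f} Cmol/Cmol")
print(f"GA observed: {ga_observed:.2f} Cmol/Cmol")
```

On this basis the observed 0.9 mol/mol corresponds to 0.36 Cmol/Cmol, which is 90 % of the 1 mol/mol attainable without engineering of the glyoxylate shunt.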

While we have demonstrated the function of the synthetic xylose pathway in E. coli, its use is not restricted to this bacterium, and the pathway can be expected to be portable to other organisms. It was suggested that other microorganisms such as yeast or Corynebacterium glutamicum may be better suited for the production of GA since they have higher resistance to low pH and/or to the product itself.50,51 However, work on Saccharomyces cerevisiae and Kluyveromyces lactis strains,50 whose Krebs and glyoxylate pathways were modified in a similar way as previously described for E. coli,49 has shown that engineering of yeast to produce GA from sugars was not trivial. Both engineered microorganisms showed significant production of GA only when cultivated in the presence of ethanol. When incubated on pure glucose or xylose, the GA yield remained rather low (0.02 and 0.08 mol/mol, respectively, for the engineered S. cerevisiae strain).50 In line with these results, the highest reported product titer of 15 g/L GA observed with an engineered K. lactis strain was obtained on a medium that contained ethanol and (D)-xylose at a mass ratio of ~35:1.50 Similarly, a C. glutamicum strain which carried modifications in the Krebs cycle and glyoxylate shunt produced GA only from acetate (at a final concentration of 5.3 g/l and a yield of 0.43 mol/mol), whereas glucose served only as a carbon source to support growth.51 Given the simplicity of our pathway, it may therefore represent an effective alternative or complement for the engineering of these organisms to increase GA production from sustainable ‘second-generation’ feedstocks.


Materials and Methods


Media and cultivation conditions

Cells were cultivated on Luria-Bertani (LB) medium52 during genetic manipulations and during the first preculture of cells that had been taken from the glycerol stock (30 % (v/v), kept at -80 °C). All other experiments were performed using M9 mineral medium that contained glucose or xylose at concentrations of 20 g/l or 10 g/l, respectively, 18 g/l Na2HPO4 * 12 H2O, 3 g/l KH2PO4, 0.5 g/l NaCl, 2 g/l NH4Cl, 0.5 g/l MgSO4 * 7 H2O, 0.015 g/l CaCl2 * 2 H2O, 0.010 g/l FeCl3, 0.006 g/l Thiamine HCl, 0.4 mg/l NaEDTA * 2 H2O, 1.8 mg/l CoCl2 * 6 H2O, 1.8 mg/l ZnSO4 * 7 H2O, 0.4 mg/l Na2MoO4 * 2 H2O, 0.1 mg/l H3BO3, 1.2 mg/l MnSO4 * H2O, 1.2 mg/l CuCl2 * 2 H2O. This medium was buffered by addition of 20 g/l 3-(N-morpholino)propanesulfonic acid (MOPS), adjusted to pH 7, and filter sterilized. If required, media were supplemented with the appropriate antibiotics (ampicillin 100 µg/ml, kanamycin 50 µg/ml, chloramphenicol 25 µg/ml). All chemicals were purchased from Sigma.

Precultures (10 ml of LB medium in 50 ml test tubes (BD Falcon)) were inoculated from frozen glycerol stocks and cultivated overnight. Cells from these cultures were used to inoculate 50 ml of glucose mineral medium in 250 ml shake flasks at an OD of ~0.25, and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added at a concentration of 1 mM when the OD reached ~0.6. After an overnight incubation, cells were spun down by centrifugation (4000 x g, Allegra 21-R, Beckman-Coulter), washed twice with sterile water, and suspended in fresh M9 mineral medium containing 1 mM IPTG and (D)-xylose as the only carbon source to an OD of ~0.5. All cultivations were carried out at 37 °C on a rotary shaker (Infors HT) running at 200 rpm. Cell growth was followed by monitoring the OD600 using a spectrophotometer (Biochrom Libra S11), and culture supernatants were withdrawn regularly for HPLC analysis.

Quantification of extracellular metabolites by HPLC analyses

Clear supernatant was obtained by centrifugation of the culture samples at 13000 rpm for 5 min in a bench-top centrifuge (Eppendorf 5415D) and stored at -20 °C until further analysis. Samples were prepared for HPLC analyses by filtration through a syringe filter (0.2 µm pore size, Sartorius Minisart RC4). HPLC analyses were performed on an Ultimate 3000 system (Dionex, Sunnyvale, USA) equipped with an autosampler (WPS-3000RS, Dionex) holding the samples at 4 °C, an RI detector (RID 10A, Shimadzu) and a UV/VIS detector (SPD-20A, Shimadzu). Analytes were separated on an Aminex HPX-87H (300 x 7.8 mm, 9 µm, Biorad) column protected by a Micro-Guard Cation H (30 x 4.6 mm, Biorad) guard-column. Column temperature was held at 32 °C, flow rate was fixed at 0.5 ml/min, and analytes were eluted with a solution of 1.25 mM H2SO4. Injected sample volume was 20 µl.

Plasmid and strain constructions

(Plasmid construction) For the enzymatic characterization of candidate kinases (Khk-C, RhaB and FucK) or aldolases (AldoB, FbaB and LacD), the genes encoding those enzymes were cloned into the pET-28a(+) plasmid (Novagen). The genes were PCR amplified using Phusion polymerase (Biolabs), the primers listed in Table 4, and genomic DNA from E. coli MG1655 (for rhaB and fbaB), genomic DNA from E. coli BL21 (for fucK), the plasmids pET-khk-C (for khk-C)28 and pEX-K-aldoB (for aldo-B, Eurofins), or the synthetic gene lacD (Eurofins, sequence from Lactococcus lactis MG1820) as the template. The pEX-K-aldoB plasmid contained a codon-optimized version of the human aldo-B gene. The PCR fragments were sub-cloned into pGEM®-T Easy Cloning Vector (Promega) and ligation products were transformed into commercial chemically competent E. coli NEB5-α (Biolabs) cells. Plasmids were extracted using the GeneJET plasmid Miniprep kit (Thermo Scientific) and checked for correct insertion of the genes by sequencing (Cogenics). Genes were then ligated into the pET-28a(+) vector using T4 ligase and NdeI and BamHI restriction sites (all enzymes were from Biolabs), obtaining the corresponding expression vectors (Table 5). The expression vector for khk-C (pET28-khkC) was obtained by ligating the DNA fragment obtained by NdeI/BamHI-digestion of pET-khk-C28 into the corresponding pET-28a sites. All pET-28a-derived plasmids were transformed into commercial BL21 (DE3) (Invitrogen) cells following the provider’s instructions. To express synthetic pathway genes in the E. coli MG1655-derived strains, they were cloned into pEXT20 or pACT3 expression vectors.53 The genes khk-C and aldoB were amplified from pET-khk-C and pEX-K-aldoB, respectively, with primers listed in Table 4. The DNA fragments were purified using a PCR purification kit (Thermo Scientific) and recombined in one step into the SalI/HindIII-digested pEXT20 vector using the In-Fusion® HD Cloning Kit (Clontech). The obtained plasmid was named pEXT20-khkC-aldoB. The aldehyde dehydrogenase A encoding aldA gene was amplified from genomic DNA of strain E. coli MG1655 using the primers listed in Table 4 and sub-cloned into the pGEM®-T Easy vector. The resulting vector was digested with KpnI/HindIII or EcoRI/SmaI, and the resulting fragment was cloned into the corresponding sites of pACT3 or pEXT20-khkC-aldoB, respectively.

(Strain construction) All strains were derived from the Escherichia coli K-12 MG1655 wild-type strain. Gene deletions were introduced successively using the phage transduction method adapted from Miller.54 Strains carrying the desired single deletions were recovered from the Keio collection.55 Chemically competent cells were prepared using the one-step protocol from Chung et al.56 Transformation was performed using standard protocols.57 Integration and excision of the FRT-kanamycin cassette from the target locus was analyzed by colony PCR using Taq polymerase (Biolabs) and the flanking locus-specific primers (Table 4). Plasmids were transformed into the strains using the protocol of Chung et al.56 The constructed strains are listed in Table 6.

Expression and purification of enzymes

The E. coli BL21(DE3) strain bearing the pET-28a(+) (Novagen) plasmid with the gene of interest was cultivated overnight in a 50 ml test tube containing 10 ml of LB medium at 37 °C and 200 rpm. This pre-culture was used to inoculate 200 ml of LB medium in a 1 l baffled flask at an OD600 of 0.05. The culture was incubated at 37 °C and 200 rpm. Expression was induced by addition of 1 mM IPTG at an OD600 of ~0.6. After 3 h of incubation at 37 °C, cells were centrifuged and pellets were stored at -20 °C. Protein purification was performed by suspending the pellet in 1.5 ml of lysis buffer (50 mmol/l Hepes, 0.3 mol/l NaCl, pH 7.5). Cells were then sonicated at 30 % power output (Bioblock Scientific; VibraCell™ 72434) and cellular debris was removed by spinning the samples at 15000 x g at 4 °C (Eppendorf centrifuge 5415D) for 15 min. 0.3 ml of TALON® His Tag Purification Resin (Clontech) was washed twice with 3 ml of lysis buffer before being incubated with the protein extract on a rotating wheel for 20 min at room temperature. The samples were centrifuged for 5 min at 700 x g at 4 °C before removing the protein extract from the pellet. The pellet was washed twice with ten bead volumes of lysis buffer that contained 0 (first wash) and 15 mM (second wash) imidazole. Proteins were eluted using 500 µl of water that was precooled to 4 °C and contained 200 mM imidazole.

Preparation of protein extracts and enzymatic assays

Cells were harvested from exponentially growing cultures and separated from the cultivation medium by centrifugation at 4000 x g (Allegra 21-R, Beckman-Coulter) at 4 °C for 10 min. Cells were then suspended in 15 ml Hepes buffer (100 mM Hepes, 85 mM KCl and 7.5 mM MgCl2, pH 7) and sonicated in four intermittent cycles of 30 s at 30 % power output (Bioblock Scientific, VibraCell 72434). Between sonication cycles samples were placed on ice for 1 min. Protein extracts were centrifuged for 15 min at 15000 x g and 4 °C to remove debris. The clear extracts were used for the enzymatic analysis. Total protein concentration was determined with the method of Bradford.58

Kinase activities were assayed by coupling the ADP release in the reaction to the oxidation of NADH via pyruvate kinase and lactate dehydrogenase. The assay mix contained 90 mM Hepes (pH 7), 77 mM KCl, 12 mM MgCl2, 4 mM ATP, 0.2 mM NADH, 2 mM phosphoenolpyruvate, and 4 Units/ml of a pyruvate kinase/lactate dehydrogenase mixture (Sigma). The reaction was started by adding appropriate concentrations of (D)-fructose (Sigma) or (D)-xylulose (Carbosynth).

Aldolase activities were assayed by coupling the release of glyceraldehyde or glycolaldehyde to the oxidation of NADH catalyzed by glycerol-3P-dehydrogenase (Gdh). When the enzymes were tested on (D)-xylulose-1P, the substrate was synthesized in situ by the action of ketohexokinase. The reaction mix contained 90 mM Hepes, 77 mM KCl, 6.8 mM MgCl2, 0.2 mM NADH and 0.6 U/ml Gdh (Sigma) for assays on the substrate fructose-1,6bP (Sigma). For assays on xylulose-1P it additionally contained 4 mM ATP and 0.02 U/ml ketohexokinase (Khk-A, prospec bio). The reactions were started by adding appropriate amounts of (D)-xylulose or fructose-1,6-bisphosphate (Sigma). All assays were performed in microtiter plates in a reaction volume of 250 µl. The oxidation of NADH was followed at 340 nm using a microplate reader (BioRad 680XR).
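Both coupled assays report enzyme activity as the rate of NADH oxidation at 340 nm. The following Python sketch shows one way such a rate could be converted into a volumetric and a specific activity using the Beer-Lambert law. It is not the authors' data-processing procedure: the function names, the example numbers and the optical path length are hypothetical, and the NADH extinction coefficient of 6.22 mM^-1 cm^-1 is the commonly used literature value rather than a figure from this study. The path length of a 250 µl well depends on the plate and would have to be determined for the reader in use.

```python
# Converting an A340 slope from an NADH-coupled assay into enzyme activity.
# All names and example values are illustrative assumptions.

NADH_EXT_COEFF_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm (common literature value)


def volumetric_activity(delta_a340_per_min, path_cm, assay_volume_ml, sample_volume_ml):
    """Return activity in U per ml of enzyme sample (1 U = 1 µmol NADH per min)."""
    # Beer-Lambert: concentration change (mM/min) = dA/dt / (epsilon * path length).
    # 1 mM equals 1 µmol/ml, so this is also µmol per min per ml of assay mix.
    rate_mm_per_min = abs(delta_a340_per_min) / (NADH_EXT_COEFF_MM_CM * path_cm)
    # Scale from the assay volume back to the volume of enzyme sample added.
    return rate_mm_per_min * assay_volume_ml / sample_volume_ml


def specific_activity(units_per_ml, protein_mg_per_ml):
    """Return activity in U per mg protein (protein e.g. from a Bradford assay)."""
    return units_per_ml / protein_mg_per_ml


# Hypothetical example: 0.25 ml assay, 0.025 ml extract, assumed 1 cm path length.
u_per_ml = volumetric_activity(-0.0622, path_cm=1.0,
                               assay_volume_ml=0.25, sample_volume_ml=0.025)
u_per_mg = specific_activity(u_per_ml, protein_mg_per_ml=0.5)
print(f"{u_per_ml:.3f} U/ml, {u_per_mg:.3f} U/mg")
```

Taking the absolute value of the slope reflects that NADH consumption lowers the absorbance, so the raw slope is negative.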


RNA extraction and microarray analysis

M9 mineral medium containing (D)-xylose as the only carbon source (100 ml in 500 ml shake flasks) was inoculated from exponentially growing wild-type cells (cultivated on xylose M9 medium) to an OD of ~0.1. Cultures were incubated as described above until the OD reached ~1. Then they were split into two 50 ml aliquots and further cultivated in 250 ml shake flasks in the presence or absence of 10 mM glycolaldehyde. After 30 min of incubation, 1 ml of the cell suspension was withdrawn and centrifuged at 1500 x g (Eppendorf 5415D) for 5 min. The supernatant was removed and the cell pellets were directly subjected to RNA extraction. Cultures of strain Pen205 were treated analogously. The RNeasy Mini Kit (QIAGEN) was used to extract RNA. Quantity and quality of the samples were determined by NanoDrop (Thermo) and Bioanalyzer (Agilent Technologies), respectively. Samples with a RIN (RNA Integrity Number) higher than or equal to 8.00 were used for further microarray analysis. The RNA samples were converted to cDNA and labeled using the Low Input Quick Amp Labeling kit (Agilent) and hybridized on E. coli Gene Expression Microarrays (8x15K, Agilent) following the Agilent One-Color Microarray-Based Gene Expression Analysis Protocol. The slides were scanned on a Tecan scanner MS200 and analyzed by Feature Extraction V.11.5.1.1. RNA was extracted and analyzed from three independent experiments for each condition.

Data treatment and statistical analyses

Raw data were background-corrected,59 normalized,60,61 and log2 transformed. Genes that varied by more than ±1.58-fold (p < 0.001, log2 scale) from the wild-type reference condition were filtered, and hierarchically clustered using the Pearson correlation.62 The heatmap shows the expression levels of genes which were normalized to an average of zero and a standard deviation of one. Characteristic gene clusters were analyzed for enrichment of Gene Ontology and KEGG pathway categories63 by applying hyper-geometric tests according to Boyle et al.64
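The filtering, scaling and enrichment steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline used in the study: the function names and example data are hypothetical, the ±1.58 cut-off is interpreted as a log2 ratio (about 3-fold), the p-value filter and the clustering are omitted, the sample standard deviation is used for scaling, and the enrichment p-value is taken as the one-sided upper tail of the hypergeometric distribution.

```python
import math
from statistics import mean, stdev


def filter_by_log2_ratio(log2_ratios, threshold=1.58):
    """Keep genes whose log2 ratio versus the reference exceeds the cut-off."""
    return {gene: r for gene, r in log2_ratios.items() if abs(r) > threshold}


def standardize(values):
    """Scale one gene's expression profile to mean 0 and standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    return [(v - m) / s for v in values]


def hypergeom_enrichment_p(k, cluster_size, annotated, total):
    """P(X >= k): chance of finding at least k annotated genes in a cluster of
    `cluster_size` genes drawn from `total` genes, of which `annotated` genes
    carry the category of interest."""
    upper = min(cluster_size, annotated)
    hits = sum(
        math.comb(annotated, i) * math.comb(total - annotated, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return hits / math.comb(total, cluster_size)


# Hypothetical example data (not from the study).
ratios = {"geneA": 2.1, "geneB": -0.4, "geneC": -1.9}
selected = filter_by_log2_ratio(ratios)
profile = standardize([1.0, 2.0, 3.0])
p_value = hypergeom_enrichment_p(k=3, cluster_size=10, annotated=5, total=100)
print(sorted(selected), profile, p_value)
```

A small p-value for a category indicates that the cluster contains more genes of that category than expected by chance. In practice a correction for multiple testing would normally be applied across categories.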


Acknowledgements

The study was financed by the Toulouse White Biotechnology (TWB) consortium (Project: PENTOSYS). DT was supported by a post-doctoral grant (Science without borders program) provided by the CAPES foundation (Ministry of Education, Brazil). YC was supported by a post-doctoral grant provided by the Institut National de la Recherche Agronomique (INRA, France) and by the Région Midi-Pyrénées. We thank Dr. Asipu and Dr. Bonthron for kindly providing us with the pET-khk-C plasmid, and the Biochip platform at the LISBP for carrying out the gene chip analyses.


References

(1) Curran, K. A., and Alper, H. S. (2012) Expanding the chemical palate of cells by combining systems biology and metabolic engineering. Metab. Eng. 14, 289–297.
(2) Zhang, J., Babtie, A., and Stephanopoulos, G. (2012) Metabolic engineering: enabling technology of a bio-based economy. Curr. Opin. Chem. Eng. 1, 355–362.
(3) Vanholme, B., Desmet, T., Ronsse, F., Rabaey, K., Van Breusegem, F., De Mey, M., Soetaert, W., and Boerjan, W. (2013) Towards a carbon-negative sustainable bio-based economy. Front. Plant Sci. 4, 174.
(4) Villegas, J. D., and Gnansounou, E. (2008) Techno-economic and environmental evaluation of lignocellulosic biochemical refineries: need for a modular platform for integrated assessment (MPIA). J. Sci. Ind. Res. 67, 927–940.
(5) Lee, J.-W., and Jeffries, T. W. (2011) Efficiencies of acid catalysts in the hydrolysis of lignocellulosic biomass over a range of combined severity factors. Bioresour. Technol. 102, 5884–5890.
(6) Perego, P., Converti, A., Palazzi, E., Del Borghi, M., and Ferraiolo, G. (1990) Fermentation of hardwood hemicellulose hydrolysate by Pachysolen tannophilus, Candida shehatae and Pichia stipitis. J. Ind. Microbiol. 6, 157–164.
(7) Bar-Even, A., Flamholz, A., Noor, E., and Milo, R. (2012) Rethinking glycolysis: on the biochemical logic of metabolic pathways. Nat. Chem. Biol. 8, 509–517.
(8) Heinrich, R., Montero, F., Klipp, E., Waddell, T. G., and Melendez-Hevia, E. (1997) Theoretical approaches to the evolutionary optimization of glycolysis: thermodynamic and kinetic constraints. Eur. J. Biochem. 243, 191–201.
(9) Melendez-Hevia, E., Waddell, T. G., Heinrich, R., and Montero, F. (1997) Theoretical approaches to the evolutionary optimization of glycolysis--chemical analysis. Eur. J. Biochem. 244, 527–43.
(10) Noor, E., Eden, E., Milo, R., and Alon, U. (2010) Central carbon metabolism as a minimal biochemical walk between precursors for biomass and energy. Mol. Cell 39, 809–820.
(11) Bar-Even, A., Noor, E., Flamholz, A., and Milo, R. (2013) Design and analysis of metabolic pathways supporting formatotrophic growth for electricity-dependent cultivation of microbes. Biochim. Biophys. Acta 1827, 1039–1047.
(12) Bar-Even, A., Flamholz, A., Noor, E., and Milo, R. (2012) Thermodynamic constraints shape the structure of carbon fixation pathways. Biochim. Biophys. Acta 1817, 1646–1659.
(13) Bar-Even, A., Noor, E., Lewis, N. E., and Milo, R. (2010) Design and analysis of synthetic carbon fixation pathways. Proc. Natl. Acad. Sci. U. S. A. 107, 8889–8894.
(14) Medema, M. H., van Raaphorst, R., Takano, E., and Breitling, R. (2012) Computational tools for the synthetic design of biochemical pathways. Nat. Rev. Microbiol. 10, 191–202.
(15) Adkins, J., Pugh, S., McKenna, R., and Nielsen, D. R. (2012) Engineering microbial chemical factories to produce renewable “biomonomers.” Front. Microbiol. 3, 313.
(16) Lee, J. W., Na, D., Park, J. M., Lee, J., Choi, S., and Lee, S. Y. (2012) Systems metabolic engineering of microorganisms for natural and non-natural chemicals. Nat. Chem. Biol. 8, 536–546.
(17) Pirie, C. M., De Mey, M., Jones Prather, K. L., and Ajikumar, P. K. (2013) Integrating the protein and metabolic engineering toolkits for next-generation chemical biosynthesis. ACS Chem. Biol. 8, 662–672.
(18) Way, J. C., Collins, J. J., Keasling, J. D., and Silver, P. A. (2014) Integrating biological redesign: where synthetic biology came from and where it needs to go. Cell 157, 151–161.
(19) James, H. M., Bais, R., Edwards, J. B., Rofe, A. M., and Conyers, A. J. (1982) Models for the metabolic production of oxalate from xylitol in humans: a role for fructokinase and aldolase. Aust. J. Exp. Biol. Med. Sci. 60, 117–122.
(20) Kanehisa, M., Goto, S., Sato, Y., Kawashima, M., Furumichi, M., and Tanabe, M. (2014) Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 42, D199–205.
(21) Kanehisa, M., and Goto, S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30.


(22) Karhumaa, K., Garcia Sanchez, R., Hahn-Hägerdal, B., and Gorwa-Grauslund, M.-F. (2007) Comparison of the xylose reductase-xylitol dehydrogenase and the xylose isomerase pathways for xylose fermentation by recombinant Saccharomyces cerevisiae. Microb. Cell Factories 6, 5.
(23) Dahms, A. S. (1974) 3-Deoxy-D-pentulosonic acid aldolase and its role in a new pathway of D-xylose degradation. Biochem. Biophys. Res. Commun. 60, 1433–1439.
(24) Weimberg, R. (1961) Pentose oxidation by Pseudomonas fragi. J. Biol. Chem. 236, 629–635.
(25) Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E. D. (2002) Metabolic network structure determines key aspects of functionality and regulation. Nature 420, 190–193.
(26) Schuster, S., Dandekar, T., and Fell, D. A. (1999) Detection of elementary flux modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol. 17, 53–60.
(27) Klamt, S., Saez-Rodriguez, J., and Gilles, E. D. (2007) Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst. Biol. 1, 2.
(28) Asipu, A., Hayward, B. E., O’Reilly, J., and Bonthron, D. T. (2003) Properties of normal and mutant recombinant human ketohexokinases and implications for the pathogenesis of essential fructosuria. Diabetes 52, 2426–2432.
(29) Bais, R., James, H. M., Rofe, A. M., and Conyers, R. A. (1985) The purification and properties of human liver ketohexokinase. A role for ketohexokinase and fructose-bisphosphate aldolase in the metabolic production of oxalate from xylitol. Biochem. J. 230, 53–60.
(30) Elsinghorst, E. A., and Mortlock, R. P. (1988) D-arabinose metabolism in Escherichia coli B: induction and cotransductional mapping of the L-fucose-D-arabinose pathway enzymes. J. Bacteriol. 170, 5423–5432.
(31) Badía, J., Baldomà, L., Aguilar, J., and Boronat, A. (1989) Identification of the rhaA, rhaB and rhaD gene products from Escherichia coli K-12. FEMS Microbiol. Lett. 53, 253–257.
(32) Heath, E. C., and Ghalambor, M. A. (1962) The metabolism of L-fucose. I. The purification and properties of L-fuculose kinase. J. Biol. Chem. 237, 2423–2426.
(33) Siddiquee, K. A. Z., Arauzo-Bravo, M. J., and Shimizu, K. (2004) Effect of a pyruvate kinase (pykF-gene) knockout mutation on the control of gene expression and metabolic fluxes in Escherichia coli. FEMS Microbiol. Lett. 235, 25–33.
(34) Lee, C., Kim, I., and Park, C. (2013) Glyoxal detoxification in Escherichia coli K-12 by NADPH dependent aldo-keto reductases. J. Microbiol. Seoul Korea 51, 527–530.
(35) Pellicer, M. T., Badía, J., Aguilar, J., and Baldomà, L. (1996) glc locus of Escherichia coli: characterization of genes encoding the subunits of glycolate oxidase and the glc regulator protein. J. Bacteriol. 178, 2051–2059.
(36) Cusa, E., Obradors, N., Baldomà, L., Badía, J., and Aguilar, J. (1999) Genetic analysis of a chromosomal region containing genes required for assimilation of allantoin nitrogen and linked glyoxylate metabolism in Escherichia coli. J. Bacteriol. 181, 7479–7484.
(37) Walker, J. R., Altamentova, S., Ezersky, A., Lorca, G., Skarina, T., Kudritska, M., Ball, L. J., Bochkarev, A., and Savchenko, A. (2006) Structural and biochemical study of effector molecule recognition by the E. coli glyoxylate and allantoin utilization regulatory protein AllR. J. Mol. Biol. 358, 810–828.
(38) Gonzalez, C. F., Proudfoot, M., Brown, G., Korniyenko, Y., Mori, H., Savchenko, A. V., and Yakunin, A. F. (2006) Molecular basis of formaldehyde detoxification. Characterization of two S-formylglutathione hydrolases from Escherichia coli, FrmB and YeiG. J. Biol. Chem. 281, 14514–14522.
(39) Gutheil, W. G., Holmquist, B., and Vallee, B. L. (1992) Purification, characterization, and partial sequence of the glutathione-dependent formaldehyde dehydrogenase from Escherichia coli: a class III alcohol dehydrogenase. Biochemistry (Mosc.) 31, 475–481.
(40) Baldomà, L., and Aguilar, J. (1987) Involvement of lactaldehyde dehydrogenase in several metabolic pathways of Escherichia coli K12. J. Biol. Chem. 262, 13991–13996.
(41) Liu, H., Ramos, K. R. M., Valdehuesa, K. N. G., Nisola, G. M., Lee, W.-K., and Chung, W.-J. (2013) Biosynthesis of ethylene glycol in Escherichia coli. Appl. Microbiol. Biotechnol. 97, 3409–3417.


(42) Stephanopoulos, G., Pereira, B., DeMey, M., Dugar, D., and Avalos, J. L. (2013) Engineering microbes and metabolic pathways for the production of ethylene glycol. WO 2013/126721.
(43) Rebsdat, S., and Mayer, D. (2000) Ethylene Glycol, in Ullmann’s Encyclopedia of Industrial Chemistry. Wiley-VCH.
(44) Miltenberger, K. (2000) Hydroxycarboxylic Acids, Aliphatic, in Ullmann’s Encyclopedia of Industrial Chemistry. Wiley-VCH.
(45) Aggarwal, S. L., and Sweeting, O. J. (1957) Polyethylene: Preparation, Structure, And Properties. Chem. Rev. 57, 665–742.
(46) Shell Global. Mono-ethylene glycol. http://www.shell.com/global/products-services/solutions-for-businesses/chemicals/media-centre/factsheets/meg.html.
(47) Sharad, J. (2013) Glycolic acid peel therapy - a current review. Clin. Cosmet. Investig. Dermatol. 6, 281–288.
(48) Transparency Market Research. Glycolic Acid Market Segment Forecast up to 2018, Research Report. http://www.transparencymarketresearch.com/glycolic-acid-market.html.
(49) Dischert, W., and Soucaille, P. (2012) Method for producing high amount of glycolic acid by fermentation. US 2012/0315682.
(50) Koivistoinen, O. M., Kuivanen, J., Barth, D., Turkia, H., Pitkänen, J.-P., Penttilä, M., and Richard, P. (2013) Glycolic acid production in the engineered yeasts Saccharomyces cerevisiae and Kluyveromyces lactis. Microb. Cell Factories 12, 82.
(51) Zahoor, A., Otten, A., and Wendisch, V. F. (2014) Metabolic engineering of Corynebacterium glutamicum for glycolate production. J. Biotechnol. 192 Pt B, 366–375.
(52) Bertani, G. (1951) Studies on lysogenesis. I. The mode of phage liberation by lysogenic Escherichia coli. J. Bacteriol. 62, 293–300.
(53) Dykxhoorn, D. M., St Pierre, R., and Linn, T. (1996) A set of compatible tac promoter expression vectors. Gene 177, 133–136.
(54) Miller, J. H. (1992) A Short Course in Bacterial Genetics: A Laboratory Manual and Handbook for Escherichia coli and Related Bacteria. Cold Spring Harbor Laboratory Press, Plainview (NY).
(55) Baba, T., Ara, T., Hasegawa, M., Takai, Y., Okumura, Y., Baba, M., Datsenko, K. A., Tomita, M., Wanner, B. L., and Mori, H. (2006) Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 2006.0008.
(56) Chung, C. T., Niemela, S. L., and Miller, R. H. (1989) One-step preparation of competent Escherichia coli: transformation and storage of bacterial cells in the same solution. Proc. Natl. Acad. Sci. U. S. A. 86, 2172–2175.
(57) Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
(58) Bradford, M. M. (1976) A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72, 248–254.
(59) Ritchie, M. E., Silver, J., Oshlack, A., Holmes, M., Diyagama, D., Holloway, A., and Smyth, G. K. (2007) A comparison of background correction methods for two-colour microarrays. Bioinforma. Oxf. Engl. 23, 2700–2707.
(60) Bolstad, B. M., Irizarry, R. A., Astrand, M., and Speed, T. P. (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinforma. Oxf. Engl. 19, 185–193.
(61) Yang, Y. H., and Thorne, N. P. (2003) Normalization for two-colour cDNA microarray data, in Science and Statistics: A Festschrift for Terry Speed, pp 403–418. Goldstein, D. R., Bethesda.
(62) Pearson, K. (1895) Notes on regression and inheritance in the case of two parents. Proc. R. Soc. Lond. 58, 240–242.
(63) Falcon, S., and Gentleman, R. (2007) Using GOstats to test gene lists for GO term association. Bioinforma. Oxf. Engl. 23, 257–258.
(64) Boyle, E. I., Weng, S., Gollub, J., Jin, H., Botstein, D., Cherry, J. M., and Sherlock, G. (2004) GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinforma. Oxf. Engl. 20, 3710–3715.
(65) Cherepanov, P. P., and Wackernagel, W. (1995) Gene disruption in Escherichia coli: TcR and KmR cassettes with the option of Flp-catalyzed excision of the antibiotic-resistance determinant. Gene 158, 9–14.


Figure and Table Legends

Figure 1: Natural and synthetic pathways for the assimilation of (D)-xylose and (D)-glucose. Reactions of the synthetic xylulose-1P pathway are shown in blue. Annotated enzymatic activities in E. coli are shown in black. Potential products are indicated in red. [1] (D)-xylose isomerase, [2] (D)-xylulose-1-kinase, [3] (D)-xylulose-1P aldolase, [4] glycolaldehyde dehydrogenase, [5] glycolaldehyde reductase, [6] glycolate oxidase.

Figure S1: Predicted optimum carbon flux distribution during glycolic acid production in the natural metabolic network of E. coli and during function of the synthetic pathway. Fluxes are indicated in blue as mole percent per consumed xylose (upper values: during function of the synthetic pathway, lower values: absence of the synthetic pathway). Unbalanced extracellular metabolites are boxed. (Abbreviations: EG – ethylene glycol, GA – glycolic acid, GlyAld – glycolaldehyde, Glyox – glyoxylic acid, PPP – pentose phosphate pathway, G6P – glucose-6P, F6P – fructose-6P, F16bP – fructose-1,6-bisP, DHAP – dihydroxyacetonephosphate, GA3P – glyceraldehyde-3P, 13bPG – 1,3-bisP-glycerate, 3PG – 3P-glycerate, PEP – phosphoenolpyruvate, Ac-CoA – acetyl-CoA, Cit – citrate, Ici – isocitrate, 2-OG – 2-oxoglutarate, Suc-CoA – succinyl-CoA, Fum(in) – intracellular fumarate, Mal(in) – intracellular malate, QH2 – quinol, 6PG – 6P-gluconate, Ribu5P – ribulose-5P, Xyl5P – xylulose-5P, Xylu1P – xylulose-1P, Gly3P – glycerol-3P, 3HPA – 3-hydroxypropanal, 2KD6PG – 2-keto-3-deoxygluconate-6P).


Figure 2: (A) Growth of the wild-type and the ΔxylB E. coli strains on 100 mmol/l (D)-glucose, 70 mmol/l (D)-xylose, or 20 mmol/l DHA as the only carbon sources. (B) Restoration of growth of the ΔxylB mutant strain on 70 mmol/l (D)-xylose as the only carbon source depending on the expression of synthetic pathway enzymes. Data is presented as means and standard deviations of at least two independent experiments.


Figure 3: Growth and product formation kinetics of the strains (A) E. coli MG1655 (wild-type) and (B) Pen205 (ΔxylB expressing pEXT20-khkC-aldoB) on mineral medium containing (D)-xylose at 70 mmol/l initial concentration. Cell growth is indicated as OD600nm. Data is presented as means and standard deviations of at least two independent experiments.


Figure 4: Analysis of the genome-wide transcriptional activity in E. coli strains growing on mineral medium containing 70 mmol/l (D)-xylose as the only carbon source. (C1) wild-type cells, (C2) strain Pen205, (C3) wild-type cells in xylose medium supplemented with 10 mmol/l glycolaldehyde (3 replicates). Red and blue represent high and low expression levels, respectively. Six characteristic gene clusters were identified by hierarchical clustering. KEGG ontology terms enriched in these clusters (with p≤0.01) are listed below the graphs representing each cluster.


Figure 5: Molar yields of ethylene glycol (EG), glycolic acid (GA), and the total C2 compound yield (=([EG] + [GA])/[xylose]) obtained during cultivation of mutants derived from strain E. coli MG1655 ΔxylB. All cultures were started with an initial (D)-xylose concentration of 70 mmol/l and incubated in 50 ml medium shaken in 250 ml flasks, except for condition (*) which was started at 30 mmol/l (D)-xylose and incubated in 50 ml medium in 500 ml baffled flasks. Yields were calculated from the concentrations of EG, GA and xylose which were measured during stationary phase. Data is presented as means and standard deviations of at least two independent experiments.
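The yield definition given in the Figure 5 legend can be written out as a short calculation. The Python sketch below is illustrative only: the function name `c2_yield`, the reading of [xylose] as consumed xylose (initial minus residual concentration), and the example concentrations are assumptions and are not taken from the study.

```python
def c2_yield(eg_mmol_l, ga_mmol_l, xylose_initial_mmol_l, xylose_residual_mmol_l=0.0):
    """Total C2 yield (mol per mol xylose) as ([EG] + [GA]) / [xylose].

    Assumption: [xylose] is the xylose consumed, i.e. initial minus residual.
    """
    consumed = xylose_initial_mmol_l - xylose_residual_mmol_l
    if consumed <= 0:
        raise ValueError("no xylose consumed; the yield is undefined")
    return (eg_mmol_l + ga_mmol_l) / consumed


# Hypothetical stationary-phase concentrations in mmol/l (not measured data).
example = c2_yield(eg_mmol_l=31.5, ga_mmol_l=3.5, xylose_initial_mmol_l=70.0)
print(f"Total C2 yield: {example:.2f} mol/mol")
```

Passing a residual xylose concentration covers cultures that did not exhaust the substrate; with the default of zero the denominator is simply the initial concentration.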


Figure S2: Growth and product formation kinetics of the E. coli strains (A) Pen462 (ΔxylB pEXT20-khkC-aldoB-aldA) and (B) Pen492 (ΔxylB ΔglcD pEXT20-khkC-aldoB-aldA) on xylose mineral medium. Strain Pen462 was incubated at an initial (D)-xylose concentration of 30 mmol/l in 50 ml medium shaken in 500 ml baffled flasks; strain Pen492 was incubated at an initial (D)-xylose concentration of 70 mmol/l in 50 ml medium shaken in 250 ml unbaffled flasks. Data is presented as means and standard deviations of at least two independent experiments.


Table 1: Theoretical maximum yields of selected compounds on (D)-xylose depending on the use of the synthetic xylulose-1P pathway and different natural pathways

Theoretical yield [mol/mol] (a)

Compound         Synthetic X1P pathway (b)  XI pathway (c)   XR-XDH pathway (d)  Dahms pathway (23)  Weimberg pathway (24)
Glycolic acid    2                          1.66             1.66                2                   1
Ethylene glycol  1                          0                0                   1                   0
Malic acid       1                          1.1 (1.66) (e)   1.1 (1.66) (e)      1                   1
Succinic acid    0.5                        0.8 (1.4) (e)    0.8 (1.4) (e)       0.5                 1
Propanediol      1                          1.16             1.16                1                   1
Ethanol          1                          1.66             1.66                1                   1

(a) The theoretical yield corresponds to the maximum product formation in the absence of growth. (b) Xylulose-1P (X1P) pathway. (c) The xylose isomerase (XI) pathway is the natural pathway in E. coli. (d) Xylose reductase (XR) - xylitol dehydrogenase (XDH) pathway. (e) Numbers in parentheses correspond to the maximum yield under anaerobic conditions.
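To compare products of different size, the molar yields in Table 1 can be converted to carbon yields (moles of product carbon per mole of xylose carbon). The Python sketch below does this for the X1P and XI columns. The conversion itself and the helper name `carbon_yield` are additions for illustration and do not appear in the study; the carbon numbers are the standard ones for each compound, with xylose as a C5 sugar.

```python
XYLOSE_CARBONS = 5

# Carbon atoms per molecule of each product.
CARBONS = {
    "Glycolic acid": 2,
    "Ethylene glycol": 2,
    "Malic acid": 4,
    "Succinic acid": 4,
    "Propanediol": 3,
    "Ethanol": 2,
}

# Aerobic theoretical molar yields [mol/mol] copied from Table 1.
X1P_YIELD = {"Glycolic acid": 2, "Ethylene glycol": 1, "Malic acid": 1,
             "Succinic acid": 0.5, "Propanediol": 1, "Ethanol": 1}
XI_YIELD = {"Glycolic acid": 1.66, "Ethylene glycol": 0, "Malic acid": 1.1,
            "Succinic acid": 0.8, "Propanediol": 1.16, "Ethanol": 1.66}


def carbon_yield(compound, molar_yield):
    """Convert a molar yield on xylose into a carbon yield (C-mol per C-mol)."""
    return molar_yield * CARBONS[compound] / XYLOSE_CARBONS


for compound in CARBONS:
    x1p = carbon_yield(compound, X1P_YIELD[compound])
    xi = carbon_yield(compound, XI_YIELD[compound])
    print(f"{compound:16s} X1P: {x1p:.2f}  XI: {xi:.2f}")
```

Carbon yields above 1 are possible for products whose synthesis fixes CO2, which is why the conversion is a comparison aid rather than a mass balance.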


Table 2: Kinetic parameters of purified candidate (D)-xylulose-1-kinases on different substrates

Enzyme                                 Natural substrate (*)          (D)-xylulose
                                       Vmax [U/mg]    Km [mM]         Vmax [U/mg]    Km [mM]
Ketohexokinase Khk-C, H. sapiens       6.59 ±1.4      0.31 ±0.1       4.40 ±0.7      0.50 ±0.06
(L)-rhamnulose kinase RhaB, E. coli    nd             nd              9.71 ±1.8      ns
(L)-fuculokinase FucK, E. coli         20.12 ±3.5     0.06 ±0.01      0.17 ±0.05     ns

(*) (D)-fructose for Khk-C, (L)-fuculose for FucK. ns - not saturated, nd - not detected. Data is presented as means and standard deviations of three replicates.
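The Vmax and Km values in Table 2 can be turned into expected rates at a given substrate concentration. The Python sketch below assumes simple Michaelis-Menten kinetics, which the table implies by reporting Vmax and Km but which is an assumption here; the function name `mm_rate` and the 1 mM test concentration are hypothetical. Only Khk-C is included because it is the only enzyme with both parameters reported for both substrates.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S]); units follow Vmax."""
    if s < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * s / (km + s)


# Mean values for Khk-C from Table 2: substrate -> (Vmax [U/mg], Km [mM]).
KHK_C = {
    "(D)-fructose": (6.59, 0.31),
    "(D)-xylulose": (4.40, 0.50),
}

for substrate, (vmax, km) in KHK_C.items():
    ratio = vmax / km                 # Vmax/Km, a rough specificity measure
    rate = mm_rate(vmax, km, 1.0)     # predicted rate at a hypothetical 1 mM
    print(f"{substrate}: Vmax/Km = {ratio:.1f} U/(mg*mM), v(1 mM) = {rate:.2f} U/mg")
```

The Vmax/Km ratio is used here as a quick way to compare the two substrates at low concentration; it ignores the reported standard deviations.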


Table 3: Kinetic parameters of purified candidate (D)-xylulose-1P aldolases on different substrates

Enzyme                                       (D)-Fructose-1,6bP             (D)-Xylulose-1P
                                             Vmax [U/mg]    Km [mM]         Vmax [U/mg]    Km [mM]
Fructose-16bP aldolase Aldo-B, H. sapiens    0.46 ±0.05     0.03 ±0.01      0.81 ±0.2      nd
Fructose-16bP aldolase FbaB, E. coli         0.52 ±0.1      0.33 ±0.07      0.18 ±0.04     nd
Tagatose-16bP aldolase LacD, L. lactis       2.87 ±0.5      nd              0              0

nd - not detected. Data is presented as means and standard deviations of three replicates.


Table 4: Primers used in this study

Primer              Sequence

Construction of pET28a expression plasmids
aldoB_NdeI_f        CATATGATGGCACATCGCTTTCCGGCTCTGA
aldoB_BamHI_r       GGATCCTTAATACGTGTAACAGGCC
fbaB_NdeI-f         CATATGACAGATATTGCGCAGTTGCTTGG
fbaB_BamHI-r        GGATCCTCAGGCGATAGTAATTTTGCTAT
fucK_NdeI_f         CCATGGATGCACCATCACCATCACCATATGTTATCCGGCTATATTGCAGGAG
fucK_BamHI-SalI_r   GGATCCGTCGACATTAACGGCGAAATTGTTTCAGCATT
lacD_NdeI_f         CATATGGTACTTACAGAACAGAAACG
lacD_BamHI_r        GGATCCCTATACTTTATCAGTCCATGGAC
rhaB_NdeI_f         CATATGACCTTTCGCAATTGTGT
rhaB_BamHI_r        GGATCCTCATGCGCAAAGCTCCTTTG

Construction of khkC-aldoB operon
fw-aldofu           GATGGCATCGTGTGAAGGAGGAACCGTATGGCACATCGCTTTCCGGCTCTG
rev-aldofu          ATGCCTGCAGGTCGATTAATACGTGTAACAGGCCGTAAA
fwkhkhfu            CGGTACCCGGGGATCAGGAGGCACACGATGGAAGAGAAGCAGATC
revkhkhfu           TCACACGATGCCATCAAAGCCCTGC

Cloning of aldA
aldA_rbs_f          CCTCTAGAGTCGACCTGCAGAGGAGGATTCATATGTCAGTACCCGTTCAACATCC
aldA_rbs_r          GCCAAAACAGAAGCTTTTAAGACTGTAAATAAACCACC
aldA_EcoRI_f        TTGAATTCAGGAGGATTCATATGTCAGTACCCGTTCAACA
aldA_SmaI_r         TTCCCGGGTTAAGACTGTAAATAAACCA

Verification primers for gene knock-outs
xylB_loc_f          GTTATCGGTAGCGATACCGGGCATTTT
xylB_loc_r          GGATCCTGAATTATCCCCCACCCGGTCAGGCA
glcD_loc_f          TCCCGGACCTCGTGCACAGGTA
glcD_loc_r          TCCGTTGTTCACCATCTCTTCATAG

* Restriction sites are italicized and the start/stop codons are shown in bold.
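Several primer names in Table 4 include the restriction enzyme whose site the primer introduces, so a transcribed primer list can be sanity-checked by confirming that each named site occurs in the sequence. The Python sketch below does this for a subset of the primers. The recognition sequences are the standard ones for these enzymes; the helper name `missing_sites` and the name-parsing convention (splitting on `_` and `-`) are assumptions made for this illustration.

```python
import re

# Standard recognition sequences of the enzymes named in the primers.
SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "EcoRI": "GAATTC",
    "SmaI": "CCCGGG",
}

# Subset of Table 4 (primer name -> sequence, 5' to 3').
PRIMERS = {
    "aldoB_NdeI_f": "CATATGATGGCACATCGCTTTCCGGCTCTGA",
    "aldoB_BamHI_r": "GGATCCTTAATACGTGTAACAGGCC",
    "fucK_NdeI_f": "CCATGGATGCACCATCACCATCACCATATGTTATCCGGCTATATTGCAGGAG",
    "fucK_BamHI-SalI_r": "GGATCCGTCGACATTAACGGCGAAATTGTTTCAGCATT",
    "aldA_EcoRI_f": "TTGAATTCAGGAGGATTCATATGTCAGTACCCGTTCAACA",
    "aldA_SmaI_r": "TTCCCGGGTTAAGACTGTAAATAAACCA",
}


def missing_sites(name, seq):
    """Return the enzymes named in `name` whose site is absent from `seq`."""
    enzymes = [token for token in re.split(r"[_-]", name) if token in SITES]
    return [enzyme for enzyme in enzymes if SITES[enzyme] not in seq.upper()]


# Keep only primers with at least one named site missing.
problems = {name: missing_sites(name, seq) for name, seq in PRIMERS.items()}
problems = {name: missing for name, missing in problems.items() if missing}
print(problems if problems else "all named restriction sites found")
```

The check only looks for the site anywhere in the primer; it does not verify position, reading frame, or the start/stop codons mentioned in the table footnote.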


Table 5: Plasmids used in this study

Name                    Relevant characteristics                                          Reference
pGEM-T                  AmpR, used for PCR fragment subcloning                            Promega
pACT3                   CmR                                                               53
pEXT20                  AmpR                                                              53
pET-28a(+)              KanR                                                              Novagen
pCP20                   AmpR, plasmid used for removing Kan cassette                      65
pET-khk-c               pET-11a derivative carrying khk-c gene from Homo sapiens          28
pEX-K aldoB             AmpR, pEX-K plasmid carrying a synthetic aldoB gene from Homo     Eurofins
                        sapiens codon-optimized for E. coli
pET-28a khk-c           pET-28a(+) derivative carrying H. sapiens khk-c gene from         This work
                        Asipu et al., 2003
pET-28a fucK            pET-28a(+) derivative carrying E. coli fucK gene                  This work
pET-28a rhaB            pET-28a(+) derivative carrying E. coli rhaB gene                  This work
pET-28a aldoB           pET-28a(+) derivative carrying a synthetic aldoB gene from        This work
                        H. sapiens codon optimized for E. coli
pET-28a fbaB            pET-28a(+) derivative carrying E. coli fbaB gene                  This work
pET-28a lacD            pET-28a(+) derivative carrying Lactococcus lactis lacD gene       This work
pEXT20-aldoB            pEXT20 derivative carrying codon optimized aldoB gene             This work
pEXT20-khkC             pEXT20 derivative carrying H. sapiens khk-c gene from             This work
                        Asipu et al., 2003
pEXT20-khkC-aldoB       pEXT20 derivative carrying both H. sapiens khk-c gene from        This work
                        Asipu et al., 2003 and aldoB gene
pEXT20-khkC-aldoB-aldA  pEXT20 derivative carrying H. sapiens khk-c gene from             This work
                        Asipu et al., 2003, codon optimized aldoB gene and aldA gene from E. coli
pACT3-aldA              pACT3 derivative carrying aldA gene from E. coli                  This work


Table 6: Escherichia coli strains used in this study

Strain reference  Genotype                                                                Reference
MG1655            F- λ- ilvG- rfb-50 rph-1                                                ATCC 47076
NEB5-α            fhuA2 Δ(argF-lacZ)U169 phoA glnV44 Φ80 Δ(lacZ)M15 gyrA96 recA1          NEB
                  relA1 endA1 thi-1 hsdR17
BL-21 (DE3)       F- dcm ompT hsdS(rB- mB-) gal [malB+]K-12(λS)                           Invitrogen
JW3536-2          F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔxylB747::kan, rph-1,     Baba et al., 2006
                  Δ(rhaD-rhaB)568, hsdR514
JW2946-1          F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ-, ΔglcD753::kan, rph-1,     Baba et al., 2006
                  Δ(rhaD-rhaB)568, hsdR514
Pen155            MG1655 ΔxylB::FRT                                                       This work
Pen220            Pen155 ΔglcD::FRT                                                       This work
Pen832            MG1655 containing pEXT20                                                This work
Pen833            Pen155 containing pEXT20                                                This work
Pen834            Pen155 containing pEXT20-aldoB                                          This work
Pen835            Pen155 containing pEXT20-khkC                                           This work
Pen205            Pen155 containing pEXT20-khkC-aldoB                                     This work
Pen221            Pen155 containing pEXT20-khkC-aldoB and pACT3-aldA                      This work
Pen462            Pen155 containing pEXT20-khkC-aldoB-aldA                                This work
Pen224            Pen220 containing pEXT20-khkC-aldoB                                     This work
Pen299            Pen220 containing pEXT20-khkC-aldoB and pACT3-aldA                      This work
Pen492            Pen220 containing pEXT20-khkC-aldoB-aldA                                This work


Table S1. Biomass and product yields obtained during cultivation of wild-type and mutant E. coli strains on (D)-xylose

Strain genotype                             Biomass        Glycolic acid  Ethylene glycol  Total C2 yield*
                                            [g/g]          [mol/mol]      [mol/mol]        [mol/mol]
wt                                          0.179 ±0.002   0              0                0
ΔxylB pEXT20-khkC-aldoB                     0.158 ±0.006   0.05 ±0.02     0.45 ±0.02       0.50 ±0.01
ΔxylB pEXT20-khkC-aldoB pACT3-aldA          0.165 ±0.008   0.10 ±0.02     0.20 ±0.09       0.30 ±0.12
ΔxylB pEXT20-khkC-aldoB-aldA                0.138 ±0.008   0.25 ±0.02     0.01 ±0.01       0.26 ±0.01
ΔxylB pEXT20-khkC-aldoB-aldA**              0.332 ±0.004   0              0.04 ±0.01       0.04 ±0.01
ΔxylB ΔglcD pEXT20-khkC-aldoB               0.148 ±0.001   0.36 ±0.02     0.47 ±0.01       0.83 ±0.02
ΔxylB ΔglcD pEXT20-khkC-aldoB pACT3-aldA    0.161 ±0.001   0.76 ±0.04     0.33 ±0.08       1.09 ±0.09
ΔxylB ΔglcD pEXT20-khkC-aldoB-aldA          0.166 ±0.008   0.92 ±0.02     0.01 ±0.01       0.93 ±0.01

Data is presented as means and standard deviations of at least two independent experiments. (*) Calculated using equation: YC2 = ([EG] + [GA])/[xylose]. All cultures were started with an initial (D)-xylose concentration of 70 mmol/l and incubated in 50 ml medium shaken in 250 ml flasks, except for condition (**) which was started at 30 mmol/l (D)-xylose and incubated in 50 ml medium in 500 ml baffled flasks. Yields were calculated from the concentrations of cells, EG, GA and xylose which were measured during stationary phase.
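Because the total C2 yield in Table S1 is defined as the sum of the EG and GA yields, the mean values can be cross-checked row by row, and the best glycolic acid yield can be related to the theoretical maximum of 2 mol/mol listed for the X1P pathway in Table 1. The Python sketch below is such a check on the reported means only; the helper name `inconsistent_rows` and the rounding tolerance are assumptions, and standard deviations are ignored.

```python
# (strain genotype, glycolic acid, ethylene glycol, reported total C2),
# mean yields in mol/mol copied from Table S1.
ROWS = [
    ("wt", 0.0, 0.0, 0.0),
    ("ΔxylB pEXT20-khkC-aldoB", 0.05, 0.45, 0.50),
    ("ΔxylB pEXT20-khkC-aldoB pACT3-aldA", 0.10, 0.20, 0.30),
    ("ΔxylB pEXT20-khkC-aldoB-aldA", 0.25, 0.01, 0.26),
    ("ΔxylB pEXT20-khkC-aldoB-aldA**", 0.0, 0.04, 0.04),
    ("ΔxylB ΔglcD pEXT20-khkC-aldoB", 0.36, 0.47, 0.83),
    ("ΔxylB ΔglcD pEXT20-khkC-aldoB pACT3-aldA", 0.76, 0.33, 1.09),
    ("ΔxylB ΔglcD pEXT20-khkC-aldoB-aldA", 0.92, 0.01, 0.93),
]

THEORETICAL_GA_X1P = 2.0  # mol/mol, X1P pathway column of Table 1


def inconsistent_rows(rows, tol=0.005):
    """Return strains whose GA + EG differs from the reported total by more than tol."""
    return [name for name, ga, eg, total in rows if abs(ga + eg - total) > tol]


best = max(ROWS, key=lambda row: row[1])        # row with the highest GA yield
fraction = best[1] / THEORETICAL_GA_X1P         # share of the theoretical maximum

print("Inconsistent rows:", inconsistent_rows(ROWS))
print(f"Best GA yield: {best[1]:.2f} mol/mol ({best[0]}), "
      f"{fraction:.0%} of the X1P theoretical maximum")
```

The tolerance of 0.005 allows for rounding of the tabulated means to two decimals; a tighter check would need the unrounded data.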
