
ACS Synth. Biol., Just Accepted Manuscript • Publication Date (Web): 06 Jun 2017


Production of Medium Chain Fatty Acids by Yarrowia lipolytica: Combining molecular design and TALEN to engineer the Fatty Acid Synthase

Coraline Rigouin¹, Marc Guéroult¹, Christian Croux¹, Gwendoline Dubois¹, Vinciane Borsenberger¹, Sophie Barbe¹, Alain Marty¹, Fayza Daboussi¹, Isabelle André¹ and Florence Bordes¹*

¹ LISBP, Université de Toulouse, CNRS, INRA, INSA, Toulouse, France

ABSTRACT

Yarrowia lipolytica is a promising organism for the production of lipids of biotechnological interest, particularly for biofuel. In this study, we engineered the key enzyme involved in lipid biosynthesis, the giant multifunctional Fatty Acid Synthase (FAS), to shorten the chain length of the synthesized fatty acids. Taking as a starting point that the Ketoacyl Synthase (KS) domain of Yarrowia lipolytica FAS is directly involved in chain length specificity, we used molecular modelling to investigate molecular recognition of palmitic acid (C16 fatty acid) by the KS. This pointed to the key role of an isoleucine residue, I1220, in the fatty acid binding site, which could be targeted by mutagenesis. To address this challenge, TALEN (Transcription Activator-Like Effector Nucleases)-based genome editing technology was applied for the first time to Yarrowia lipolytica and proved to be very efficient for inducing targeted genome modifications. Among the generated FAS mutants, those having a bulky aromatic amino acid residue in place of the native isoleucine at position 1220 led to a significant increase in myristic acid (C14) production compared to the parental wild-type KS. In particular, the best performing mutant, I1220W, accumulates C14 at a level of 11.6% of total fatty acids. Overall, this work illustrates how a combination of molecular modelling and genome-editing technology can offer novel opportunities to rationally engineer complex systems for synthetic biology.

KEYWORDS: Biofuel, Fatty Acid Synthase, TALEN, Computer-aided engineering, Medium Chain Fatty Acid, Ketoacyl Synthase specificity, Yarrowia lipolytica


INTRODUCTION

Lipids produced by microorganisms constitute a very promising route toward the development of fuels and chemicals from low-cost carbon feedstock. In this context, the use of oleaginous microorganisms is of particular interest. Among the lipid-producing microorganisms, Yarrowia lipolytica, a GRAS (Generally Recognized As Safe) organism with considerable potential in industrial biotechnology,¹,² is capable of producing and accumulating large amounts of lipids (more than 50% of its dry weight) in large-scale fermentation.³,⁴ Given such potential, this yeast is likely to become a microorganism of choice as a platform for lipid and biofuel production.³⁻⁶

In the current context of lowering the environmental impact of fossil fuels, the aviation industry is paying great attention to the development of sustainable fuels. Aviation fuels have a number of requirements: they must remain liquid at low temperature and have a high energy content by volume. For these reasons, biokerosene is essentially composed of saturated Fatty Acids (FA) of medium chain length (Medium Chain Fatty Acids, MCFAs).⁷

Fatty acids are produced in the cytosol of Yarrowia lipolytica via the fungal type I Fatty Acid Synthase (FASI), a giant multifunctional protein that forms a dodecameric complex (α6β6) of 2.6 MDa. It is encoded by two genes (α and β) and integrates all steps of fatty acid synthesis (Figure 1 and Figure S1 of the Supporting Information). The underlying chemistry resembles that of type II fatty acid synthesis in bacteria and plant plastids but, in the latter case, it is achieved by sets of dissociated monofunctional proteins.⁸ FAS enzymes are essential to the organism as they drive the only pathway for de novo fatty acid synthesis. Depending on the FAS system harbored by the organism, substrate chain length specificity appears to be driven by different catalytic steps. In bacteria, plants or animals, whose elongation cycle terminates with the hydrolysis of the acyl group esterified on the Acyl Carrier Protein (ACP), the thioesterase specificity is a key determinant of FA chain length.⁹,¹⁰ In the yeast FASI system, there is no thioesterase activity and the elongation cycle terminates with the transfer of the acyl group from the ACP to coenzyme A by the Malonyl Palmitoyl Transferase (MPT). Recently, this termination step has been suspected to play a role in FA chain length. Notably, reports describe the expression of FASI systems coupled or fused with a thioesterase: in Yarrowia lipolytica such an approach leads either to the production of MCFA¹¹ or to the accumulation of oleic acid,¹² whereas in Saccharomyces cerevisiae it triggers short chain FA production.¹³ The Ketoacyl Synthase (KS) domain catalyzes the condensation between acyl-ACP and malonyl-ACP and leads to the production of fatty acids composed of 16 to 18 carbons (C16 and C18 fatty acids¹⁴). KS domains have been described as being involved in fatty acid chain length determination in the FASI system¹⁵,¹⁶ as well as in the simple FASII system.¹⁷,¹⁸ If the KS domain of Yarrowia lipolytica FAS is indeed the molecular ruler of FA chain length, engineering this domain could lead to variants with modified FA chain length specificity.

Figure 1. Linear domain organization of YlFAS genes at approximate sequence scale showing catalytic (colored) and scaffolding (grey) domains. ACP: Acyl Carrier Protein; KR: Ketoacyl Reductase; KS: Ketoacyl Synthase; PPT: PhosphoPantetheine Transferase; AT: Acetyl Transferase; ER: Enoyl Reductase; DH: Dehydratase; MPT: Malonyl Palmitoyl Transferase.

Computer-aided engineering strategies have shown tremendous potential to alter enzyme specificity by focusing mutations on key molecular determinants of the active site involved in substrate recognition.¹⁹⁻²¹ Rational insight increases the odds of identifying a hit with the desired specificity within the explored sequence space and thus considerably narrows down the number of mutants to construct and screen. Taking as a starting point that the KS domain of the FASI system is directly involved in fatty acid chain length specificity in Yarrowia lipolytica, we used molecular modelling to investigate in detail the molecular determinants involved in fatty acid chain recognition, and to identify key amino-acid residues to mutate in order to alter KS specificity toward shortened fatty acids.

The capacity to construct mutants in an essential and complex system such as FAS strongly depends on the molecular tools available to edit the genome of the organism. Gene disruption, gene replacement and, more generally, genome modification in Yarrowia lipolytica have relied until very recently exclusively on Homologous Recombination (HR) based on the insertion of a selectable marker.²² These methods are long and tedious, in part because of the low rate of HR in this organism (varying between 2% and 45% depending on the targeted locus).²²⁻²⁴ The emergence of powerful technologies based on double-strand break (DSB) repair mechanisms has revolutionised the engineering of organisms that were previously difficult to manipulate (mammalian cells, plants, microalgae, insects…).²⁵,²⁶ A nuclease-induced DSB can be repaired by two different pathways: Non-Homologous End-Joining (NHEJ), which efficiently introduces insertion/deletion mutations that can disrupt the translational reading frame, and homology-directed repair, which occurs when exogenous DNA is supplied. Consequently, point mutations or large insertions/deletions are easily introduced through recombination at the target locus with exogenously supplied DNA donor templates.²⁷

CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)/Cas9-based genome editing was successfully applied to Yarrowia lipolytica to raise HR frequency.²⁸,²⁹ This powerful technology enables the deletion, replacement or insertion of sequences, or the introduction of specific mutations in a coding sequence. The last of these has not yet been implemented in Yarrowia lipolytica. In this study, we report the use of TALEN-based technology, which has been extensively studied in terms of activity and specificity³⁰,³¹ and successfully used to manipulate the genome of many organisms, to edit the sequence of the FAS enzyme. This TALEN-based technology made it possible to perform site-directed mutagenesis directly in the genome of Yarrowia lipolytica at a position predicted to affect enzyme activity. Using a combination of computer-aided design and genome editing of this giant multi-domain protein, we successfully obtained lipid-producing mutants displaying a modified fatty acid profile with high potential for biokerosene application.

RESULTS AND DISCUSSION


3D modelling of the Ketoacyl Synthase domain from Yarrowia lipolytica FASI

Fungal FAS is a dodecamer made of 6 α subunits and 6 β subunits. The KS domain forms a dimer and is located on the α subunit (Figure 1). It catalyzes the decarboxylative condensation of the ACP-attached malonyl moiety with either the acetyl-ACP starter group or the elongated acyl-ACP group (Figure S1 of the Supporting Information), accepting acyl-ACP substrates of up to 16 or 18 carbons in length. In order to elucidate the molecular determinants likely to be involved in the recognition of fatty acids, a three-dimensional (3D) model of the KS domain of Yarrowia lipolytica Fatty Acid Synthase (YlFAS) was constructed by comparative modelling. To further investigate the binding mode of fatty acids in the active site of YlKS, we used the X-ray structure of β-Ketoacyl synthase I from Escherichia coli in complex with capric acid (C10) (PDB ID: 1F91)³² as a template to model acyl chains of different lengths (C10 capric acid and C16 palmitic acid) into the active site of the YlKS structural model (Figure 2A). These complexes were then used to map the amino-acid residues that interact with C16 and form the acyl-binding pocket. As a result, M1217, I1220 and M1226 were identified as relevant amino-acid residues to target by mutagenesis in order to limit the chain length of the fatty acids that can bind in the enzyme active site. Next, we performed a sequence analysis using 109 sequences of the KS3 family extracted from ThymeDB³³ in order to assess amino-acid conservation at these three positions. Overall, the amino-acid residues at positions 1217 and 1226 appeared relatively conserved within the family, whereas substantial amino-acid variability was observed at position 1220, suggesting a less critical functional role for this residue. Mutagenesis focused on residue I1220 thus appeared particularly appealing, as this highly variable residue was also found to interact with the acyl chain, in close contact with carbons 10-12 of palmitic acid (C16) (Figure 2A). Our idea was thus to obstruct the active site by substituting I1220 with bulky amino acids (such as the aromatic residues F, Y or W), with the aim of inducing steric hindrance that prevents the binding of long acyl chains (Figure 2B). Nonetheless, in order to probe more extensively the effect of mutations at this position, we decided to systematically replace the residue at position 1220 with each of the 19 other possible amino acids.
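The conservation check described above can be sketched as a per-position Shannon entropy computed over an alignment column. This is only an illustration: the toy columns below are invented and are not the 109 KS3 sequences used in the study, the function name `column_entropy` is hypothetical, and the text does not state which conservation measure was actually applied.

```python
import math
from collections import Counter


def column_entropy(residues):
    """Shannon entropy (bits) of the residues observed in one alignment column.

    0.0 means the position is fully conserved; larger values mean more variability.
    """
    counts = Counter(residues)
    total = len(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical toy columns (NOT data from the study): one letter per sequence.
toy_columns = {
    1217: "MMMMMMMM",  # fully conserved
    1220: "IVLFIAMT",  # variable
    1226: "MMMMMMML",  # nearly conserved
}

for position, residues in toy_columns.items():
    print(position, round(column_entropy(residues), 2))
```

With real data, a position whose entropy is clearly higher than that of its neighbours would be the kind of "highly variable" site the text singles out as a tolerant mutagenesis target.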

Figure 2. Molecular docking of fatty acids in the active site of KS from Yarrowia lipolytica. (A) View of a C16 docked in the active site of the parental wild-type KS. The amino-acid residue I1220 is shown as sticks. (B) View of a C14 docked in the active site of mutant I1220W, shown to illustrate the effect of a bulky amino-acid mutation introduced at position 1220 on the binding of shortened fatty acids such as C14. Carbon chain numbering is given for reference.

The FAS enzyme is essential for the yeast as it provides the only de novo pathway for FA synthesis. As our first trials, which consisted of deleting the gene and then transforming the resulting strain with a mutated KS, were cumbersome, we changed strategy and opted for a TALEN-based technology to introduce the mutation in vivo directly in the yeast genome.

The general TALEN architecture consists of a custom TALE DNA-binding domain fused to the N-terminal end of the non-specific FokI nuclease domain. The FokI cleavage domain functions enzymatically as a dimer, so two DNA-binding domains are required to create a functional TALEN.³⁴ Each unit of the pair binds to one of two adjacent binding sites that are separated by a spacer sequence. The cleavage site is located in the middle of the spacer region. Here, we designed a TALEN whose cleavage site is centered on the I1220 residue to be mutated (Figure 3).
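The geometry of a TALEN pair described above (two binding sites flanking a spacer, with cleavage in the middle of the spacer) can be sketched as follows. The coordinates are hypothetical and do not correspond to the real αFAS sequence; the class and function names are likewise illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TalenPair:
    """0-based, half-open coordinates of the two TALE binding sites on the target."""
    left_start: int
    left_end: int      # the spacer begins here
    right_start: int   # the spacer ends here
    right_end: int

    def spacer(self):
        """Return the (start, end) interval lying between the two binding sites."""
        return self.left_end, self.right_start

    def cleavage_center(self):
        """Midpoint of the spacer, where the FokI dimer is expected to cut."""
        start, end = self.spacer()
        return (start + end) / 2


def cut_is_within(pair, codon_start, codon_end):
    """True if the spacer midpoint falls inside the half-open codon interval."""
    return codon_start <= pair.cleavage_center() < codon_end


# Hypothetical coordinates, for illustration only.
pair = TalenPair(left_start=100, left_end=117, right_start=132, right_end=149)
codon = (123, 126)  # hypothetical position of the ATC codon to be mutated
print(pair.spacer(), pair.cleavage_center(), cut_is_within(pair, *codon))
```

A check of this kind only expresses the design criterion stated in the text, namely that the targeted codon sits at the center of the spacer.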

Figure 3. (A) TALEN target sites on the αFAS gene (KS domain in red). Schematic representation of the N-terminal and C-terminal domains of the TALEN (displayed in purple), TALEN binding sites (in blue) and the FokI domain (in green). The I1220 ATC codon is identified by the arrow and highlighted in yellow.

Design of TALEN targeting position 1220

To first validate the TALEN functionality, we monitored the ability of the system to generate a FAS-negative (FAS-) phenotype via error-prone NHEJ repair. Selection was performed on YNB medium complemented with oleic acid (called YNB OA) to allow the growth of the FAS- mutants. In order to prevent degradation of this exogenous fatty acid or of any new type of fatty acid produced by the engineered strain, we used strain JMY1233, in which the β-oxidation pathway has been removed. When strain JMY1233 was transformed with the control plasmids pL68BT and pU18BT and plated on YNB OA, only one type of transformant was obtained: large, white and thick colonies, comparable to those formed by the wild-type strain and therefore unambiguously attributed to a wild-type FAS (FAS+) phenotype. When JMY1233 was transformed with pL68_TALKSl and pU18_TALKSr, two types of transformants were obtained. Out of the 930 transformants obtained, 3% were large, white and thick colonies (L colonies), corresponding to a FAS+ phenotype, whereas 97% were small and translucent colonies (T colonies) that were assumed to represent the FAS- phenotype (Table 1).
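The 3% and 97% figures follow directly from the colony counts reported in Table 1 for the TALEN-only transformation (30 L and 900 T colonies out of 930). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def phenotype_fractions(counts):
    """Return each phenotype's share of the total number of transformants."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Colony counts for the TALEN-only transformation, as listed in Table 1.
talen_only = {"L": 30, "T": 900, "S": 0}

fractions = phenotype_fractions(talen_only)
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```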

Nineteen T colonies were cultured in rich medium supplemented with oleic acid and sequenced at the targeted locus. We found that 100% of these colonies, chosen for their FAS- phenotype, displayed an insertion or a deletion (InDel mutation) at the targeted locus (Table S1 of the Supporting Information). We concluded that the T colony phenotype was indeed associated with NHEJ repair leading to InDel mutations at the targeted locus, and therefore that transformation with TALEN leads to 97% NHEJ repair at the targeted locus. The 3% of L colonies with a FAS+ phenotype may be attributed to a lack of cleavage by the TALEN or to faithful repair. Alternatively, this phenotype could result from a small InDel that does not disrupt the reading frame of the αFAS gene and leaves a functional FAS. Seven of these FAS+ L colonies were sequenced and showed no modification at the targeted site compared to the wild-type sequence, confirming the FAS+ genotype.
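The distinction drawn above between InDels that disrupt the reading frame and those that do not comes down to whether the net length change is a multiple of three. The sketch below illustrates this rule with a hypothetical helper; note that an in-frame InDel can still inactivate the enzyme, so the classification says nothing about FAS activity by itself.

```python
def indel_effect(inserted_bp, deleted_bp):
    """Classify an InDel by its net length change relative to the reading frame."""
    if inserted_bp == 0 and deleted_bp == 0:
        return "no change"
    net = inserted_bp - deleted_bp
    # Python's % returns a non-negative result for a positive modulus,
    # so net deletions (negative values) are handled correctly.
    return "in-frame" if net % 3 == 0 else "frameshift"


# Hypothetical examples, not sequences from Table S1.
for ins, dele in [(0, 4), (0, 3), (2, 0), (1, 4)]:
    print(f"+{ins}/-{dele}: {indel_effect(ins, dele)}")
```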

In conclusion, we demonstrate here that TALEN are functional and highly efficient in Yarrowia lipolytica. Moreover, by targeting the essential αFAS gene, we could easily develop a screen for identifying colonies that underwent NHEJ repair. Altogether, these results offer new insights for undertaking HR experiments in order to saturate position 1220.


Site-directed mutagenesis by homologous recombination

To perform site-directed mutagenesis at position 1220, we designed a matrix to serve as exogenous DNA for HR at the TALEN-targeted site. The matrix is a 2000 bp double-stranded DNA centered on the I1220 codon (ATC). Altogether, 20 DNA matrixes were constructed, carrying either the I1220 (wild-type) codon or one of the 19 other possible codons in its place. In addition, to prevent the TALEN from binding to the matrix, and to the chromosome after HR, 4 silent mutations were introduced in the sequences corresponding to each of the TALEN recognition sites (Figure S2 of the Supporting Information).
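The set of 20 matrixes described above differs only at the codon for position 1220, which can be sketched as a simple codon substitution in a donor sequence. The codons actually used in the study are not given in the text, so the one-codon-per-residue table below (taken from the standard genetic code) is an assumption, as are the toy donor sequence and the function name.

```python
# One illustrative codon per amino acid (standard genetic code).
# Assumption: the study's real codon choices are not stated in the text.
CODON_FOR = {
    "A": "GCC", "R": "CGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "TCC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTC",
}


def donor_variants(donor, codon_start, wild_type_codon="ATC"):
    """Return {mutation name: donor sequence} for all 20 residues at one codon."""
    if donor[codon_start:codon_start + 3] != wild_type_codon:
        raise ValueError("wild-type codon not found at the given offset")
    variants = {}
    for aa, codon in CODON_FOR.items():
        name = "I1220" + aa  # "I1220I" is the wild-type control matrix
        variants[name] = donor[:codon_start] + codon + donor[codon_start + 3:]
    return variants


# Hypothetical short stand-in for the 2000 bp matrix, with ATC at offset 6.
toy_donor = "GGTACCATCGGATCC"
variants = donor_variants(toy_donor, codon_start=6)
print(len(variants), variants["I1220W"])
```

The silent mutations in the TALEN recognition sites are not modelled here; they would be applied once to the shared flanking sequence before the codon substitution.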

In order to verify that TALEN can induce repair by HR when exogenous DNA is provided, we first carried out the experiment with the matrix carrying the mutation I1220F (mI1220F), identified by molecular modelling as relevant to obstruct the active site and potentially prevent elongation of FA longer than C16. When JMY1233 was transformed with the control plasmids and the matrix, all transformants were large white colonies, homogeneous in size, on YNB OA medium. However, when the transformation was performed with the matrix and the TALEN-expressing plasmids, colonies varying in size and appearance were obtained. More precisely, three types of colonies were obtained on YNB OA plates 5 days after transformation. Among the 1330 transformants obtained, the previously observed L and T phenotypes were encountered once again: 30 transformants (2%) formed large white colonies with the FAS+ phenotype (L colonies), and 800 (59%) were small translucent colonies (T colonies) with the FAS- phenotype (Table 1 and Figure S3 of the Supporting Information). Interestingly, a new phenotype was also obtained for 500 clones (37%) that formed smaller white colonies (S colonies). Ten L colonies and 15 S colonies were analyzed by PCR using the locus-specific primer KS_P1 and the matrix-specific primer HR_screening R (Figure S2 of the Supporting Information). Based on PCR amplification, we found that 20% of the L colonies and 95% of the S colonies underwent HR. The genotype was confirmed by sequencing the PCR amplicons (Table S1). The T colonies were unable to grow without oleic acid complementation; their sequencing confirmed that they underwent NHEJ repair (Table S1 of the Supporting Information). We showed in this particular example that TALEN-induced HR is efficient in Yarrowia lipolytica, with a frequency of up to 40%. This is far higher than the HR efficiency obtained by traditional methods (using selection markers) at this particular locus, estimated to be less than 1% (data not shown).
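One rough way to relate the colony counts to an overall HR frequency is to weight the HR rate observed among the sequenced colonies of each class by that class's share of all transformants. This is a sketch under the assumption that the small sequenced samples are representative; the text does not say how the "up to 40%" figure was derived, so this calculation should not be read as the authors' own.

```python
from fractions import Fraction

# TALEN + mI1220F transformation, values from Table 1:
# class -> (number of colonies, HR-positive among sequenced, number sequenced)
table1 = {
    "L": (30, 2, 10),
    "T": (800, 1, 9),
    "S": (500, 14, 15),
}


def estimated_hr_fraction(classes):
    """Weight each class's observed HR rate by its share of all transformants."""
    total = sum(n for n, _, _ in classes.values())
    hr = sum(n * Fraction(pos, seq) for n, pos, seq in classes.values())
    return hr / total


estimate = estimated_hr_fraction(table1)
print(f"{float(estimate):.1%}")
```

The estimate lands in the same range as the reported frequency, with the S colonies contributing by far the largest share of HR events.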

Table 1: Results of the transformations of strain JMY1233 with TALEN-expressing plasmids (+: pU18_TALKSr and pL68_TALKSl) or control plasmids (-: pU18BT and pL68BT), plated on YNB OA, without (-) or with (+) addition of the matrix mI1220F. For each transformation, the total number of colonies and the number of colonies of each type (L, T or S) are given. Below each row of counts are the sequencing results at the targeted locus (number of colonies displaying a given genotype / number of sequenced colonies; WT: Wild-Type; HR: repair via Homologous Recombination; NHEJ: repair via Non-Homologous End-Joining).

TALEN plasmids | Matrix | Total number of transformants | L colonies | T colonies | S colonies
- | - | 1000 | 1000 | 0 | 0
+ | - | 930 | 30 | 900 | 0
  sequencing | | | 7/7 WT genotype (FAS+) | 19/19 NHEJ genotype (FAS-) | /
- | + | 2000 | 2000 | 0 | 0
  sequencing | | | 10/10 WT genotype (FAS+) | / | /
+ | + | 1330 | 30 | 800 | 500
  sequencing | | | 8/10 WT, 0/10 NHEJ, 2/10 HR | 0/9 WT, 8/9 NHEJ, 1/9 HR | 1/15 WT, 0/15 NHEJ, 14/15 HR

In order to verify that the S phenotype was not due to a growth delay related to the silent mutations introduced into the matrix, we compared the growth and lipid profiles of the wild-type strain JMY1233 and of a JMY1233 strain showing the S phenotype after transformation with the TALEN plasmids and a wild-type matrix (carrying the 8 silent mutations but the WT ATC codon of isoleucine at position 1220). Both strains grew at the same rate and gave similar lipid profiles (Figure S5 of the Supporting Information). We conclude that the S phenotype likely reflects cells that underwent TALEN cleavage and required more time to grow after HR repair. One cannot exclude that the S phenotype may also reflect a decrease in FAS activity that impairs cell growth.

Based on these results, we carried out the 19 other transformations using the corresponding matrixes and focused the screening on the S colonies, as they gave the best HR frequency and were present for most transformations (with the exception of the transformations with the mI1220D and mI1220R matrixes). Notably, the number of transformants and the frequencies of homologous recombination and NHEJ repair varied depending on the matrix added. For 18 of the 20 transformations, we identified at least 2 “HR positive” clones by PCR screening of fewer than 10 S colonies. The data corresponding to the screening of these 19 recombination experiments are presented in Table S4. Introduction of the expected mutation was subsequently confirmed for those two clones for each substitution by sequencing the targeted locus. Regarding the transformations with the two remaining matrixes, mI1220R and mI1220D, we noticed that on YNB OA plates the percentage of T colonies was over 99% while no S colonies were detected. We hence hypothesized that these particular amino acids at position 1220 could be deleterious for FAS activity, leading to a FAS- phenotype upon HR. We screened T colonies by growing them in rich medium supplemented with oleic acid and found that 2 of the 5 colonies screened had undergone HR (Table S4).


Fatty acid specificity of I1220 mutants

After the mutants had lost the TALEN plasmids and subsequently recovered prototrophy (see experimental procedures), they were grown in minimum medium. Lipids were extracted and the fatty acid composition was analyzed. When Ile at position 1220 was replaced by Arg or Asp, the mutants required oleic acid complementation to grow. This suggests that FAS activity is either completely lost or so impaired that FA synthesis by the mutated FAS is insufficient to support growth. Analysis of the FA profile of cultures grown in the presence of oleic acid, added to compensate for the lack of suitable lipids, showed that no neo-synthesis of FA occurred; it can therefore be concluded that these mutated FAS are inactive. For all other amino-acid substitutions at position 1220, cells were able to grow without addition of exogenous fatty acids to the medium, suggesting that the mutated strains are able to make their membrane lipids. When Ile1220 was replaced by Glu or Lys, no change in FA specificity was observed. However, we noticed limited growth and a lower accumulation of lipids (Figure 4A). This suggests that charged amino-acid residues likely disturb the loading of acyl-ACP into the active site, leading to an impaired FAS.

When Ile1220 was replaced by Ala, Gly, Val, Leu, Met, Ser, Thr, Cys, Pro, Gln or Asn, the corresponding strains displayed a FA profile overall similar to that of the wild-type after 4 days (data not shown) and 8 days (Figure 4A) of culture, with minor changes in chain length specificity. Notably, we found an enhanced FA accumulation for strains JMY1233_I1220L and JMY1233_I1220M. In addition, C14 represented 0.07% of DCW for JMY1233 and reached 0.24% for JMY1233_I1220M, which represents a 3.4-fold increase.

The most interesting behaviors were observed when Ile1220 was replaced by an aromatic amino acid (Trp, Tyr, Phe or His), as the FA profile was modified and significant amounts of C14 were produced (Figure 4B). C14 represented 2.3% of total FA (0.26% of DCW) for strain JMY1233_I1220Y, 2% of total FA (0.47% of DCW) for JMY1233_I1220H, 5.8% of total FA (1.3% of DCW) for JMY1233_I1220F and 11.6% of total FA (2% of DCW) for JMY1233_I1220W, corresponding respectively to a 4-, 7-, 18- and 29-fold increase in C14 accumulation (% of DCW) compared to the wild-type FAS. Of note, traces of lauric acid (C12) were also detected for JMY1233_I1220W (Figure 4B). To ensure the significance of the results, cultures and lipid extractions were repeated in an independent experiment for those clones. Results are given in Figure S6 of the Supporting Information.
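The fold increases quoted above are ratios of each mutant's C14 content (% of DCW) to that of the wild-type strain (0.07% of DCW). The sketch below recomputes them from the rounded percentages given in the text, so the results may differ slightly from the reported values, which were presumably derived from unrounded measurements. The function name is a hypothetical helper.

```python
# C14 content as % of dry cell weight (DCW), as quoted in the text (rounded).
c14_pct_dcw = {
    "wild-type": 0.07,
    "I1220Y": 0.26,
    "I1220H": 0.47,
    "I1220F": 1.3,
    "I1220W": 2.0,
}


def fold_increase(values, reference="wild-type"):
    """Ratio of each strain's value to the reference strain's value."""
    ref = values[reference]
    return {k: v / ref for k, v in values.items() if k != reference}


folds = fold_increase(c14_pct_dcw)
for strain, fold in folds.items():
    print(f"{strain}: {fold:.1f}-fold")
```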

Notably, mutants capable of accumulating MCFA could not reach the same DCW as the wild-type strain. This suggests that these FAS mutants, although capable of producing enough FA to support cell growth, have a decreased catalytic efficiency. In addition, one cannot exclude that incorporation of MCFA into the cell membrane causes alterations that compromise cell viability.

Figure 4: (A) Fatty Acid (FA) profiles of the JMY1233m (wt(m)) strain and the mutants JMY1233_I1220X (X = Ala(A), Gly(G), Val(V), Leu(L), Met(M), Ser(S), Thr(T), Cys(C), Pro(P), Gln(Q), Asn(N), Glu(E) or Lys(K)). (B) FA profiles of the JMY1233m (wt(m)) strain and the mutants JMY1233_I1220X (X = Tyr(Y), Trp(W), Phe(F) or His(H)). For each clone, total FA accumulation and C14 accumulation are given in % w/w of DCW. Averages and standard errors are given for two clones cultivated separately. DCW and FA content were measured after 8 days of culture in minimum medium.

Altogether, these results support the molecular modelling studies, which suggested that a bulky aromatic amino acid at position 1220 could impede the accommodation of long chain FA in the YlKS binding site, hence promoting medium chain FA production. Although all aromatic mutations (Trp, Tyr, Phe or His) led to a significant increase in C14 production, some differences were observed depending on the physico-chemical properties of the amino-acid side chain. Whilst the highest C14 production was achieved with hydrophobic and bulky amino-acid residues such as Trp and Phe, lower amounts of C14 were observed for the Tyr and His mutations (Figure 4B).


DISCUSSION

Computer-aided design was successfully applied to the engineering of a giant multifunctional FAS enzyme to modulate its substrate specificity. In this study, we showed that a single amino-acid mutation introduced in one of the eight domains of the FAS complex could lead to major changes in the lipid production profile. These results support the role of the Ketoacyl Synthase domain in fatty acid chain length determination. Introducing a bulky aromatic amino-acid residue at position 1220 impairs the accommodation of long chain FA in the active site: the mutants JMY1233_I1220F and JMY1233_I1220W produced Medium Chain Fatty Acids at up to 6% and 12% of the total fatty acids, respectively. In parallel with our work on the FAS from Yarrowia lipolytica, Gajewski et al. (16) engineered the FASI from Saccharomyces cerevisiae. Consistent with our results, the authors showed that increasing the steric hindrance of two amino acids from the binding pocket of the KS domain led to a significant increase in short chain FA. In our 3D model, the equivalent amino acid positions in YlFAS face the C6-C10 carbons of palmitic acid, and their mutation in ScFAS led to the production of short chain FA (C6 and C8 FA). Our mutation is closer to the C10-C14 carbons of palmitic acid and leads to the production of MCFA. This shows that the KS binding pocket topology is crucial in FA chain length determination and that its size can be advantageously modulated to produce a variety of molecules of interest: MCFA valuable for biokerosene production or short chain FA for gasoline replacement.

In the recent publication of Xu et al., YlFAS was engineered by replacing the MPT domain by a small thioesterase (11). This strategy led to the production of 29% of free C14 in the mutant EcTesA. With our strategy, we obtained 12% of total C14 fatty acid for the mutant JMY1233_I1220W. Our objective was to demonstrate that rational design can guide the engineering of enzymes involved in complex systems such as FAS and help to modify their substrate specificity toward shorter fatty acid chains. This work shows that molecular design is highly valuable to understand and control the molecular determinants involved in the substrate specificity of the KS domain. The strategy followed herein could be extended to other FAS domains in order to extend the diversity of synthesized fatty acids. Interestingly, while Xu et al. focused on the termination step, allowing the transfer of shorter fatty acids out of the FAS, we focused on the elongation process, trying to prevent the synthesis of longer fatty acids. We believe that combining both strategies may lead to a synergistic effect and greatly enhance the production of shorter chain fatty acids, as has recently been demonstrated for Saccharomyces cerevisiae (13,16).

In our experiments, increased production of MCFA seems to slightly reduce lipid accumulation. Whether this is due to the catalytic efficiency of the mutated FAS or to the specificity of the diacylglycerol acyltransferase needs to be investigated. Furthermore, the lipid profile may not completely reflect the FAS products, as the FA may be modified by other enzymes before their incorporation into triglycerides. This opens the way to further investigation in order to improve the MCFA yield.

In this study, we also revealed for the first time the potential of targeted genome engineering using TALEN technology in Yarrowia lipolytica. We showed in our experiments that TALEN was efficient at 97% for Non-Homologous End Joining (NHEJ) repair and up to ca. 40% for Homologous Recombination (HR) repair. These values are close to the efficiencies reported for Mfe1 disruption (90 ± 7%) (29) and for TRP1 disruption (85.6%) (28) by NHEJ using CRISPR-Cas9 technology in Yarrowia lipolytica. Regarding HR efficiency using CRISPR-Cas9, HR was tested in both papers for gene disruption and was estimated to reach 64 ± 11% using a selectable marker for Mfe1 (29) and 11.1 ± 3.6% for TRP1 (28). These papers confirm that the CRISPR-Cas9 system is a powerful technology to perform gene disruption. The innovative aspect of our study, however, lies in the use of TALEN technology to introduce a point mutation at a specific locus. Since the HR event could not be selected, its detection was challenging, and yet we reached a 40% HR efficiency. This highlights the power of the technology and makes our work pioneering in the field. Developing new technologies for genome editing is crucial, especially for organisms that have suffered from a lack of available genetic tools. Thanks to the development of CRISPR-Cas9 and now TALEN technologies in Yarrowia lipolytica, one can choose the appropriate tool depending on the targeted application. This opens new opportunities for gene editing in Yarrowia lipolytica and promises to widen the use of this organism for biotechnological applications. By combining a genome editing technology with molecular modelling, this work demonstrates the possibility of reducing the chain length of the fatty acids produced in Yarrowia lipolytica, showing once again that this strain can be used as a platform for lipid production.


THEORETICAL AND EXPERIMENTAL PROCEDURES


Computational methods.

3D protein modelling. In order to build a three-dimensional model of the Ketoacyl Synthase domain of Yarrowia lipolytica (YlKS), the sequence of the Yarrowia lipolytica FAS (YlFAS) subunit α (YALI0B19382p) was first searched with BLAST against the Protein Data Bank (PDB) to identify homologous sequences with available crystallographic structures. Three structures showing a sequence identity between 68% and 70% with the target YlFAS α subunit sequence were identified. They all correspond to fungal FAS complexes (two from S. cerevisiae (PDB ID: 2PFF and 2VKZ) and one from T. lanuginosa (PDB ID: 4V58)). These three structures were used as structural templates for the modelling of YlFAS subunit α. Modelling was based upon sequence alignment using MUSCLE (35), and MODELLER (36) was used to generate 100 potential models of the YlKS domain. Orientation of side chains in the selected model was determined using SCWRL4 (37). The quality of the model was then assessed with ProQ2 (38). The domain corresponding to YlKS was determined from the one described in S. cerevisiae (39) and corresponds to residues 980 to 1629. Starting from this model, a dimeric structure was built based on the crystal structure symmetry observed in the FAS complex from T. lanuginosa (PDB ID: 4V58). The model was then minimized to remove steric clashes, followed by a short Molecular Dynamics (MD) simulation (8 ns) in explicit water solvent. Using the final model of YlKS obtained from the MD simulation, we then placed an acyl chain corresponding either to C10 or C16 into the enzyme active site by manual molecular docking. This docking was performed by superimposing the YlKS model onto the structure of β-Ketoacyl synthase I from Escherichia coli in complex with capric acid (C10) (PDB ID: 1F91) (32). The model of the complex with palmitic acid (C16) was constructed by extending the docked C10 molecule. These complexes were minimized with position restraints on Cα atoms.

MD simulations were performed on YlKS in the presence of a bound acyl chain using the Gromacs software (40) with the amber99sb-ildn force field (41) and TIP3P water (42). Acyl chain parameters were extracted from the Lipid14 force field (43). The systems were embedded in a TIP3P water box in minimum salt conditions. The fully hydrated systems were minimized, and 100 ps of constant volume and temperature MD simulation was performed at 300 K using the Berendsen thermostat (44) with all protein heavy atoms fixed. The whole systems were then simulated for 1 ns at constant pressure and constant temperature (300 K) while all protein heavy atoms were fixed. The resulting systems were used as starting points for the production phases. Simulations were performed at constant temperature (300 K) using the V-rescale thermostat (45) and constant pressure (1 bar) using the Parrinello-Rahman coupling algorithm (46). The integration time step was 2 fs and all bonds were constrained using P-LINCS (47). Water molecules were kept rigid using the SETTLE algorithm (48). Lennard-Jones interactions were cut off at 1.0 nm. Long-range electrostatic interactions were treated using the particle mesh Ewald approach (49) with a 1.0 nm direct space cut-off. The neighbor list was updated every 10 ps and the center of mass motion was removed at every step.
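For readers who wish to set up a comparable run, the production-phase settings listed above can be gathered in one place. The following Python sketch is an illustration only and not the input actually used in this work: it renders the stated values as a GROMACS-style parameter fragment, using standard .mdp option names. Quantities that are not given in the text (run length, coupling time constants, coupling groups) are deliberately left out, as is the neighbor-list interval, so the fragment is incomplete by design. The names PRODUCTION_SETTINGS, render_mdp and time_step_fs are hypothetical.

```python
# Illustrative sketch: production-phase MD settings stated in the text,
# rendered as a GROMACS-style .mdp fragment. Values not given in the text
# (nsteps, tau_t, tau_p, tc-grps, nstlist) are intentionally omitted.

PRODUCTION_SETTINGS = {
    "integrator": "md",
    "dt": 0.002,                      # integration time step: 2 fs, in ps
    "tcoupl": "v-rescale",            # V-rescale thermostat
    "ref_t": 300,                     # target temperature, K
    "pcoupl": "Parrinello-Rahman",    # pressure coupling algorithm
    "ref_p": 1.0,                     # target pressure, bar
    "constraints": "all-bonds",       # all bonds constrained
    "constraint_algorithm": "lincs",  # LINCS family of constraint solvers
    "rvdw": 1.0,                      # Lennard-Jones cut-off, nm
    "coulombtype": "PME",             # particle mesh Ewald electrostatics
    "rcoulomb": 1.0,                  # direct-space cut-off, nm
    "nstcomm": 1,                     # remove center-of-mass motion every step
}


def render_mdp(settings):
    """Return the settings as aligned 'key = value' lines."""
    width = max(len(key) for key in settings)
    return "\n".join(f"{key:<{width}} = {value}" for key, value in settings.items())


def time_step_fs(settings):
    """Convert the time step from ps (mdp unit) to fs."""
    return settings["dt"] * 1000.0


if __name__ == "__main__":
    print(render_mdp(PRODUCTION_SETTINGS))
```

Keeping the settings in a dictionary makes it straightforward to compare the equilibration and production phases, which differ mainly in the thermostat and in the restraints applied to the protein.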

Sequence analysis. Sequence analysis was performed using 109 sequences belonging to the KS3 family extracted from ThymeDB (33). Multiple sequence alignment was performed using MUSCLE (35). Position-dependent amino acid residue variation in the multiple sequence alignment data was analyzed using the Shannon information entropy measure (Hx), calculated using the in-house SEQUESTER software (50,51).
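To make the entropy measure concrete, the sketch below computes a per-column Shannon entropy for a toy alignment. It is not the SEQUESTER implementation: the base-2 logarithm, the decision to ignore gap characters, and the four-sequence alignment are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column, ignore_gaps=True):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    residues = [r for r in column if not (ignore_gaps and r == "-")]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # H = sum over residue types of -p * log2(p)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column of a multiple sequence alignment."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy(column) for column in zip(*sequences)]


# Toy alignment (hypothetical data, not taken from the KS3 family).
toy_alignment = ["MIKA", "MLKA", "MIRA", "MVK-"]
entropies = alignment_entropies(toy_alignment)

if __name__ == "__main__":
    for position, h in enumerate(entropies, start=1):
        print(f"position {position}: H = {h:.3f} bits")
```

A fully conserved column gives an entropy of zero, whereas a variable position such as the second column of the toy alignment gives a higher value, which is how variable positions in a binding pocket can be distinguished from conserved ones.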


Experimental methods

Media. Rich medium YPD (yeast extract 10 g/L, bactopeptone 10 g/L, glucose 10 g/L) was used for growing cells prior to genomic DNA extraction and to start cultures. Minimal medium YNB (glucose 10 g/L, YNB w/o AA 1.7 g/L, NH4Cl 5 g/L, phosphate buffer pH 6.8 50 mM, agar 15 g/L) was used to select colonies after transformation. When necessary, oleic acid (80% purity, Sigma) prepared in Tween® 40 was added at 0.1% (w/v) final concentration.

To grow cells for lipid content analysis, a specific minimal medium for lipid accumulation was used: glucose (80 g/L), ammonium sulfate (1.5 g/L), phosphate buffer pH 6.8 (100 mM), trace elements (CoCl2 0.5 mg/L, CuSO4 0.9 mg/L, Na2MoO4 0.06 mg/L, CaCl2 23 mg/L, H3BO3 3 mg/L, MnSO4 3.8 mg/L, MgSO4 10 mg/L, ZnSO4 40 mg/L, FeSO4 40 mg/L) and vitamins (D-biotin 0.05 mg/L, pantothenate 1 mg/L, nicotinic acid 1 mg/L, myo-inositol 25 mg/L, thiamine hydrochloride 1 mg/L, pyridoxol hydrochloride 1 mg/L, p-aminobenzoic acid 0.2 mg/L). For mutants that required oleic acid to grow, 0.2% (w/v) was added.

TALEN design and plasmids. A TALE-Nuclease (described and used to stimulate targeted gene modifications) (34,52) was designed to generate a double-strand break centered on the I1220 codon position. The TALE-Nuclease_KS, encoded by the TAL_KSr (SEQ ID NO: 1) and TAL_KSl (SEQ ID NO: 2) plasmids, was designed to cleave the DNA sequence (5’-TGTTCCGGTTCCGGTATgggtggtatcaccgcCCTGCGAGGCATGTTCA-3’). The sequences of the corresponding TAL_KSl and TAL_KSr (Table S2 of Supporting Information) were synthesized following the Golden Gate TALEN kit (Addgene) and cloned into a shuttle plasmid designed and constructed for use in Yarrowia lipolytica. The empty plasmids pL68 and pU18 harbor a yeast origin of replication (ARS68 and ARS18, respectively) (53) and a selection marker (LEU2 or URA3, respectively), in addition to an origin of replication for Escherichia coli and the kanamycin resistance gene (Figure S5 of Supporting Information). Shuttle plasmids pL68BT and pU18BT were built from the pL68 and pU18 plasmids by insertion of the N-terminal and C-terminal sequences for optimal TALEN scaffolding (54) between the constitutive promoter pTEF and the Lip2 terminator. Subsequently, TAL_KSr and TAL_KSl were cloned into the pU18BT and pL68BT plasmids, giving pU18TAL_KSr and pL68TAL_KSl, respectively. Plasmids were amplified in Escherichia coli and extracted, and their sequences were checked before further use in Yarrowia lipolytica.
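The layout of the target site can be inspected with a few lines of Python. The sketch below assumes the usual notation in which the upper-case stretches are the two TALE binding half-sites and the lower-case stretch is the spacer in which cleavage occurs; this reading of the letter case is an assumption, since it is not stated explicitly in the text, and the function name is hypothetical.

```python
import re

# Target sequence as given in the text (5' to 3').
TARGET = "TGTTCCGGTTCCGGTATgggtggtatcaccgcCCTGCGAGGCATGTTCA"


def split_target(site):
    """Split a target into (left half-site, spacer, right half-site) by letter case.

    Assumes an UPPER-lower-UPPER layout; raises ValueError otherwise.
    """
    match = re.fullmatch(r"([ACGT]+)([acgt]+)([ACGT]+)", site)
    if match is None:
        raise ValueError("expected an UPPER-lower-UPPER target layout")
    return match.groups()


left, spacer, right = split_target(TARGET)

if __name__ == "__main__":
    print(f"left half-site : {left} ({len(left)} nt)")
    print(f"spacer         : {spacer} ({len(spacer)} nt)")
    print(f"right half-site: {right} ({len(right)} nt)")
```

Under this assumption the site consists of two 17-nucleotide half-sites separated by a 15-nucleotide spacer.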

Matrix design. In order to allow homologous recombination at the target locus, 19 different matrices were synthesized by PCR fusion of two overlapping PCR products, which were themselves synthesized using primers carrying the desired mutations replacing the I1220 codon with codons for each of the 19 other amino acids. In addition, four silent mutations were introduced into each TALEN target site to prevent the TALEN from binding to the matrix, and to the chromosome after HR. These silent mutations also allowed the design of a matrix-specific primer, subsequently used for the screening of the desired clones, i.e. those in which homologous recombination between the matrix and the chromosome had occurred (Figure 3B and Figure S2 of Supporting Information).
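The enumeration of the 19 substitutions can be sketched as follows. This is an illustration under stated assumptions and not the primer design used here: the open reading frame is a hypothetical three-codon toy sequence, the codon chosen for each amino acid is simply the alphabetically first one in the standard genetic code (the codons actually used are given by the primers in the Supporting Information), and the silent mutations in the TALEN target site are not modelled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitution_variants(orf, codon_index, table=CODON_TABLE):
    """Return {amino acid: mutated ORF} for every residue other than the original.

    One codon per amino acid is used (the alphabetically first one), which is
    an arbitrary choice made for this sketch.
    """
    start = 3 * codon_index
    original = table[orf[start:start + 3]]
    variants = {}
    for codon in sorted(table):
        aa = table[codon]
        if aa == "*" or aa == original or aa in variants:
            continue
        variants[aa] = orf[:start] + codon + orf[start + 3:]
    return variants


# Hypothetical toy ORF: Ala-Ile-Ala, with the Ile codon at index 1.
variants = substitution_variants("GCTATTGCT", 1)

if __name__ == "__main__":
    for aa, sequence in sorted(variants.items()):
        print(aa, sequence)
```

Each returned sequence differs from the toy template only at the targeted codon, mirroring the one-codon difference between the matrices.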

Strain and transformation. The strain Yarrowia lipolytica JMY1233 (MATa, leu2-270, ura3-302, xpr2-322, ∆pox1-6) was used in this study. Its β-oxidation pathway has been deleted (55) and it is auxotrophic for uracil and leucine. This strain is used as a platform for the engineering of strains producing new or original lipids. β-oxidation was removed to prevent degradation of any new types of fatty acid produced by the engineered strain. JMY1233 cells were made competent with the Frozen-EZ Yeast Transformation II Kit™ (Zymo Research). Transformations were performed as described by the manufacturer using 50 µL of competent cells and 500 ng of each of the control plasmids pU18BT and pL68BT or of the TALEN plasmids pU18TAL_KSr and pL68TAL_KSl. For homologous recombination experiments, 500 ng of matrix was used in addition to the pU18BT and pL68BT plasmids or the pU18TAL_KSr and pL68TAL_KSl plasmids. Transformants were selected on YNB AO plates for the selection of FAS+ and FAS- colonies. Genomic DNA was extracted from cultures of the selected clones and screened for mutation events by PCR. After selection, positive clones were cultured in rich medium and streaked on plates to allow for the loss of the replicative TALEN plasmids. Once the loss of both plasmids was verified, the resulting auxotrophic strains were transformed with the plasmids pL68 and pU18 to regain prototrophy.

PCR screening on genomic DNA. Primers used for screening and sequencing are listed in Table S3 of Supporting Information. Genomic DNA extractions were performed on 2 mL overnight cultures in YPD medium using the QIAprep Spin Miniprep™ kit (Qiagen). PCRs were carried out with 1 µL of gDNA. For the screening of NHEJ clones, the KS-T7_F and KS-T7_R primers were used to amplify a 500 bp fragment centered on the I1220 codon position. The KS-T7_F primer was then used to sequence the PCR product. For the screening of clones from the HR experiment, the primer KS_P1 was used with HR_screening_R or WT_screening_R. The KS_P1 and KS_T2 primers were then used to amplify a 4000 bp fragment, and the KS-T7_F primer was used to sequence this PCR product in order to confirm the introduction of the desired mutations at position I1220.

Cultures, lipid extraction and analysis. Mutants were grown in 50 mL of the minimal medium required for lipid accumulation at 28°C for 8 days. 2 mL samples (cells + medium) were collected after 4 and 8 days to measure growth and were lyophilized for further analysis. After 8 days, Dried Cell Weights (DCW) were measured on 10 mL cultures. For mutants requiring oleic acid complementation to grow, oleic acid was added to the medium at 0.2% (w/v). FA profiles were analyzed after extraction and transmethylation in hot acidic methanol (56). Briefly, 2 mL of a methanol solution containing 2.5% sulfuric acid (with heptadecanoic acid, C17, as a standard at 0.2 mg/mL) and 1 mL of toluene were added to the dried sample. Samples were heated at 80°C for 3 h. Once the samples had cooled down, biphasic liquid extraction was carried out using 1.5 mL of 0.5 M NaCl and 1.5 mL of hexane (containing the internal standard methyl eicosanoate, C20, at 0.1 mg/mL). Analyses were performed on the organic phase by gas chromatography coupled with mass spectrometry, using a TRACE™ 1310 equipped with a TRACE™ TR-5 column (Thermo Scientific). For mutants complemented with OA, the total FA extracted were compared with the amount of OA added to the medium at the initial time.
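The conversion from chromatographic peak areas to FA content in % w/w of DCW can be sketched as below. This is a simplified illustration and not the exact calculation applied to the data reported here: it assumes equal detector response for all fatty acid methyl esters, uses only the C17 standard added with the methanol solution (2 mL at 0.2 mg/mL), ignores the second standard (C20), and relies on invented peak areas and an invented dry sample mass. All names are hypothetical.

```python
# Amount of C17 standard added with the methanol solution (from the text).
C17_CONC_MG_PER_ML = 0.2
METHANOL_VOLUME_ML = 2.0


def fa_masses_mg(peak_areas, standard="C17:0"):
    """Estimate the mass of each FA from its peak area relative to the standard.

    Assumes the same response factor for every species (a simplification).
    """
    standard_mass = C17_CONC_MG_PER_ML * METHANOL_VOLUME_ML
    standard_area = peak_areas[standard]
    return {
        fa: area / standard_area * standard_mass
        for fa, area in peak_areas.items()
        if fa != standard
    }


def percent_of_dcw(masses_mg, dry_mass_mg):
    """Express each FA mass as % w/w of the dry cell weight of the sample."""
    return {fa: 100.0 * mass / dry_mass_mg for fa, mass in masses_mg.items()}


# Invented example values, for illustration only.
example_areas = {"C14:0": 50.0, "C16:0": 150.0, "C18:1": 300.0, "C17:0": 100.0}
example_masses = fa_masses_mg(example_areas)
example_percent = percent_of_dcw(example_masses, dry_mass_mg=20.0)
total_fa_percent = sum(example_percent.values())

if __name__ == "__main__":
    for fa, value in example_percent.items():
        print(f"{fa}: {value:.2f} % w/w of DCW")
    print(f"total FA: {total_fa_percent:.2f} % w/w of DCW")
```

For strains complemented with oleic acid, the same calculation would additionally require subtracting or comparing against the amount of OA supplied in the medium, as described in the text.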


ASSOCIATED CONTENT

Supporting Information. Supplementary tables and figures showing details of the FAS cycle, library screening, culture and analysis of JMY1233 and JMY1233m, and plasmid constructions.


AUTHOR INFORMATION

Corresponding Author

*[email protected]

Funding Sources

Financial support for this study was provided by the Agence Nationale de la Recherche (ANR) and the Commissariat aux Investissements d’Avenir via the project ProBio3 “Biocatalytic production of lipidic bioproducts from renewable resources and industrial by-products: BioJet Fuel Application” (ANR-11-BTBT-0003).


ACKNOWLEDGMENTS

The authors thank Jean Marc Nicaud for providing plasmids containing Yarrowia lipolytica autonomously replicating sequences. The authors are also grateful to C. Topham for his help with sequence co-evolution analysis using Sequester. We also thank the ICEO high-throughput facility of the Laboratoire d’Ingénierie des Systèmes Biologiques et des Procédés (Toulouse, France) for providing access to GC-MS equipment. This work was granted access to the HPC resources of the Computing Center of Region Midi-Pyrénées (CALMIP, Toulouse, France).


ABBREVIATIONS

KS, ketoacyl synthase; FAS, fatty acid synthase; TALEN, transcription activator-like effector nuclease; ACP, acyl carrier protein; C10, capric acid; C14, myristic acid; C16, palmitic acid; C18 or OA, oleic acid; YNB, yeast nitrogen base; NHEJ, non-homologous end joining; DSB, double-strand break; HR, homologous recombination; FA, fatty acid; MCFA, medium chain fatty acid; LCFA, long chain fatty acid; CRISPR, clustered regularly interspaced short palindromic repeats; DCW, dried cell weight.


REFERENCES

(1) Madzak, C. Yarrowia Lipolytica: Recent Achievements in Heterologous Protein Expression and Pathway Engineering. Appl. Microbiol. Biotechnol. 2015, 99 (11), 4559–4577.

(2) Zhu, Q.; Jackson, E. N. Metabolic Engineering of Yarrowia Lipolytica for Industrial Applications. Curr. Opin. Biotechnol. 2015, 36, 65–72.

(3) Ledesma-Amaro, R.; Nicaud, J.-M. Yarrowia Lipolytica as a Biotechnological Chassis to Produce Usual and Unusual Fatty Acids. Prog. Lipid Res. 2016, 61, 40–50.

(4) Beopoulos, A.; Verbeke, J.; Bordes, F.; Guicherd, M.; Bressy, M.; Marty, A.; Nicaud, J.-M. Metabolic Engineering for Ricinoleic Acid Production in the Oleaginous Yeast Yarrowia Lipolytica. Appl. Microbiol. Biotechnol. 2014, 98 (1), 251–262.

(5) Blazeck, J.; Hill, A.; Liu, L.; Knight, R.; Miller, J.; Pan, A.; Otoupal, P.; Alper, H. S. Harnessing Yarrowia Lipolytica Lipogenesis to Create a Platform for Lipid and Biofuel Production. Nat. Commun. 2014, 5, 3131.

(6) Qiao, K.; Wasylenko, T. M.; Zhou, K.; Xu, P.; Stephanopoulos, G. Lipid Production in Yarrowia Lipolytica Is Maximized by Engineering Cytosolic Redox Metabolism. Nat. Biotechnol. 2017, 35 (2), 173–177.

(7) Allouche, Y.; Cameleyre, X.; Guillouet, S.; Hulin, S.; Thevenieau, F.; Akomia, L.; Molina-Jouve, C. ProBio3 Project: How to Achieve Scientific and Technological Challenges to Boost the Sustainable Microbial Production of Lipids as Biojet Fuel and Chemical Compounds. OCL 2013, 20 (6), D605.

(8) Leibundgut, M.; Maier, T.; Jenni, S.; Ban, N. The Multienzyme Architecture of Eukaryotic Fatty Acid Synthases. Curr. Opin. Struct. Biol. 2008, 18 (6), 714–725.

(9) Jing, F.; Cantu, D. C.; Tvaruzkova, J.; Chipman, J. P.; Nikolau, B. J.; Yandeau-Nelson, M. D.; Reilly, P. J. Phylogenetic and Experimental Characterization of an Acyl-ACP Thioesterase Family Reveals Significant Diversity in Enzymatic Specificity and Activity. BMC Biochem. 2011, 12, 44.

(10) Schütt, B. S.; Brummel, M.; Schuch, R.; Spener, F. The Role of Acyl Carrier Protein Isoforms from Cuphea Lanceolata Seeds in the De-Novo Biosynthesis of Medium-Chain Fatty Acids. Planta 1998, 205 (2), 263–268.

(11) Xu, P.; Qiao, K.; Ahn, W. S.; Stephanopoulos, G. Engineering Yarrowia Lipolytica as a Platform for Synthesis of Drop-in Transportation Fuels and Oleochemicals. Proc. Natl. Acad. Sci. 2016, 113 (39), 10848–10853.

(12) Stefan, A.; Hochkoeppler, A.; Ugolini, L.; Lazzeri, L.; Conte, E. The Expression of the Cuphea Palustris Thioesterase CpFatB2 in Yarrowia Lipolytica Triggers Oleic Acid Accumulation. Biotechnol. Prog. 2016, 32 (1), 26–35.

(13) Zhu, Z.; Zhou, Y. J.; Krivoruchko, A.; Grininger, M.; Zhao, Z. K.; Nielsen, J. Expanding the Product Portfolio of Fungal Type I Fatty Acid Synthases. Nat. Chem. Biol. 2017, 13 (4), 360–362.

(14) Tehlivets, O.; Scheuringer, K.; Kohlwein, S. D. Fatty Acid Synthesis and Elongation in Yeast. Biochim. Biophys. Acta 2007, 1771 (3), 255–270.

(15) Sangwallek, J.; Kaneko, Y.; Sugiyama, M.; Ono, H.; Bamba, T.; Fukusaki, E.; Harashima, S. Ketoacyl Synthase Domain Is a Major Determinant for Fatty Acyl Chain Length in Saccharomyces Cerevisiae. Arch. Microbiol. 2013, 195 (12), 843–852.

(16) Gajewski, J.; Pavlovic, R.; Fischer, M.; Boles, E.; Grininger, M. Engineering Fungal de Novo Fatty Acid Synthesis for Short Chain Fatty Acid Production. Nat. Commun. 2017, 8, 14650.

(17) Christensen, C. E.; Kragelund, B. B.; von Wettstein-Knowles, P.; Henriksen, A. Structure of the Human Beta-Ketoacyl [ACP] Synthase from the Mitochondrial Type II Fatty Acid Synthase. Protein Sci. 2007, 16 (2), 261–272.

(18) Val, D.; Banu, G.; Seshadri, K.; Lindqvist, Y.; Dehesh, K. Re-Engineering Ketoacyl Synthase Specificity. Structure 2000, 8 (6), 565–566.

(19) Verges, A.; Cambon, E.; Barbe, S.; Salamone, S.; Le Guen, Y.; Moulis, C.; Mulard, L. A.; Remaud-Siméon, M.; André, I. Computer-Aided Engineering of a Transglycosylase for the Glucosylation of an Unnatural Disaccharide of Relevance for Bacterial Antigen Synthesis. ACS Catal. 2015, 5 (2), 1186–1198.

(20) Champion, E.; Guérin, F.; Moulis, C.; Barbe, S.; Tran, T. H.; Morel, S.; Descroix, K.; Monsan, P.; Mourey, L.; Mulard, L. A.; Tranier, S.; Remaud-Siméon, M.; André, I. Applying Pairwise Combinations of Amino Acid Mutations for Sorting out Highly Efficient Glucosylation Tools for Chemo-Enzymatic Synthesis of Bacterial Oligosaccharides. J. Am. Chem. Soc. 2012, 134 (45), 18677–18688.

(21) Champion, E.; André, I.; Moulis, C.; Boutet, J.; Descroix, K.; Morel, S.; Monsan, P.; Mulard, L. A.; Remaud-Siméon, M. Design of Alpha-Transglucosidases of Controlled Specificity for Programmed Chemoenzymatic Synthesis of Antigenic Oligosaccharides. J. Am. Chem. Soc. 2009, 131 (21), 7379–7389.

(22) Fickers, P.; Le Dall, M. T.; Gaillardin, C.; Thonart, P.; Nicaud, J. M. New Disruption Cassettes for Rapid Gene Disruption and Marker Rescue in the Yeast Yarrowia Lipolytica. J. Microbiol. Methods 2003, 55 (3), 727–737.

(23) Verbeke, J.; Beopoulos, A.; Nicaud, J.-M. Efficient Homologous Recombination with Short Length Flanking Fragments in Ku70 Deficient Yarrowia Lipolytica Strains. Biotechnol. Lett. 2013, 35 (4), 571–576.

(24) Kretzschmar, A.; Otto, C.; Holz, M.; Werner, S.; Hübner, L.; Barth, G. Increased Homologous Integration Frequency in Yarrowia Lipolytica Strains Defective in Non-Homologous End-Joining. Curr. Genet. 2013, 59 (1–2), 63–72.

(25) Gaj, T.; Gersbach, C. A.; Barbas, C. F. ZFN, TALEN and CRISPR/Cas-Based Methods for Genome Engineering. Trends Biotechnol. 2013, 31 (7), 397–405.

(26) Kim, H.; Kim, J.-S. A Guide to Genome Engineering with Programmable Nucleases. Nat. Rev. Genet. 2014, 15 (5), 321–334.

(27) Sander, J. D.; Joung, J. K. CRISPR-Cas Systems for Editing, Regulating and Targeting Genomes. Nat. Biotechnol. 2014, 32 (4), 347–355.

(28) Gao, S.; Tong, Y.; Wen, Z.; Zhu, L.; Ge, M.; Chen, D.; Jiang, Y.; Yang, S. Multiplex Gene Editing of the Yarrowia Lipolytica Genome Using the CRISPR-Cas9 System. J. Ind. Microbiol. Biotechnol. 2016, 43 (8), 1085–1093.

(29) Schwartz, C. M.; Hussain, M. S.; Blenner, M.; Wheeldon, I. Synthetic RNA Polymerase III Promoters Facilitate High-Efficiency CRISPR-Cas9-Mediated Genome Editing in Yarrowia Lipolytica. ACS Synth. Biol. 2016, 5 (4), 356–359.

(30) Boch, J.; Scholze, H.; Schornack, S.; Landgraf, A.; Hahn, S.; Kay, S.; Lahaye, T.; Nickstadt, A.; Bonas, U. Breaking the Code of DNA Binding Specificity of TAL-Type III Effectors. Science 2009, 326 (5959), 1509–1512.

(31) Moscou, M. J.; Bogdanove, A. J. A Simple Cipher Governs DNA Recognition by TAL Effectors. Science 2009, 326 (5959), 1501.

(32) Olsen, J. G.; Kadziola, A.; von Wettstein-Knowles, P.; Siggaard-Andersen, M.; Larsen, S. Structures of Beta-Ketoacyl-Acyl Carrier Protein Synthase I Complexed with Fatty Acids Elucidate Its Catalytic Machinery. Structure 2001, 9 (3), 233–243.

(33) Chen, Y.; Kelly, E. E.; Masluk, R. P.; Nelson, C. L.; Cantu, D. C.; Reilly, P. J. Structural Classification and Properties of Ketoacyl Synthases. Protein Sci. 2011, 20 (10), 1659–1667.

(34) Christian, M.; Cermak, T.; Doyle, E. L.; Schmidt, C.; Zhang, F.; Hummel, A.; Bogdanove, A. J.; Voytas, D. F. Targeting DNA Double-Strand Breaks with TAL Effector Nucleases. Genetics 2010, 186 (2), 757–761.

(35) Edgar, R. C. MUSCLE: A Multiple Sequence Alignment Method with Reduced Time and Space Complexity. BMC Bioinformatics 2004, 5, 113.

(36) Eswar, N.; Webb, B.; Marti-Renom, M. A.; Madhusudhan, M. S.; Eramian, D.; Shen, M.-Y.; Pieper, U.; Sali, A. Comparative Protein Structure Modeling Using Modeller. Curr. Protoc. Bioinformatics 2006, Chapter 5, Unit 5.6.

(37) Krivov, G. G.; Shapovalov, M. V.; Dunbrack, R. L. Improved Prediction of Protein Side-Chain Conformations with SCWRL4. Proteins 2009, 77 (4), 778–795.

(38) Ray, A.; Lindahl, E.; Wallner, B. Improved Model Quality Assessment Using ProQ2. BMC Bioinformatics 2012, 13, 224.

(39) Lomakin, I. B.; Xiong, Y.; Steitz, T. A. The Crystal Structure of Yeast Fatty Acid Synthase, a Cellular Machine with Eight Active Sites Working Together. Cell 2007, 129 (2), 319–332.

(40) Pronk, S.; Páll, S.; Schulz, R.; Larsson, P.; Bjelkmar, P.; Apostolov, R.; Shirts, M. R.; Smith, J. C.; Kasson, P. M.; van der Spoel, D.; Hess, B.; Lindahl, E. GROMACS 4.5: A High-Throughput and Highly Parallel Open Source Molecular Simulation Toolkit. Bioinformatics 2013, 29 (7), 845–854.

(41) Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved Side-Chain Torsion Potentials for the Amber Ff99SB Protein Force Field. Proteins 2010, 78 (8), 1950–1958.

(42) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of Simple Potential Functions for Simulating Liquid Water. J. Chem. Phys. 1983, 79 (2), 926–935.

(43) Dickson, C. J.; Madej, B. D.; Skjevik, Å. A.; Betz, R. M.; Teigen, K.; Gould, I. R.; Walker, R. C. Lipid14: The Amber Lipid Force Field. J. Chem. Theory Comput. 2014, 10 (2), 865–879.

(44) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.; DiNola, A.; Haak, J. R. Molecular Dynamics with Coupling to an External Bath. J. Chem. Phys. 1984, 81 (8), 3684–3690.

(45) Bussi, G.; Donadio, D.; Parrinello, M. Canonical Sampling through Velocity Rescaling. J. Chem. Phys. 2007, 126 (1), 014101.

(46) Parrinello, M.; Rahman, A. Polymorphic Transitions in Single Crystals: A New Molecular Dynamics Method. J. Appl. Phys. 1981, 52 (12), 7182–7190.

(47) Hess, B. P-LINCS: A Parallel Linear Constraint Solver for Molecular Simulation. J. Chem. Theory Comput. 2008, 4 (1), 116–122.

(48) Miyamoto, S.; Kollman, P. A. Settle: An Analytical Version of the SHAKE and RATTLE Algorithm for Rigid Water Models. J. Comput. Chem. 1992, 13 (8), 952–962.

(49) Cheatham, T. E. I.; Miller, J. L.; Fox, T.; Darden, T. A.; Kollman, P. A. Molecular Dynamics Simulations on Solvated Biomolecular Systems: The Particle Mesh Ewald Method Leads to Stable Trajectories of DNA, RNA, and Proteins. J. Am. Chem. Soc. 1995, 117 (14), 4193–4194.

(50) Ladevèze, S.; Tarquis, L.; Cecchini, D. A.; Bercovici, J.; André, I.; Topham, C. M.; Morel, S.; Laville, E.; Monsan, P.; Lombard, V.; Henrissat, B.; Potocki-Véronèse, G. Role of Glycoside Phosphorylases in Mannose Foraging by Human Gut Bacteria. J. Biol. Chem. 2013, 288 (45), 32370–32383.

(51) Daudé, D.; Topham, C. M.; Remaud-Siméon, M.; André, I. Probing Impact of Active Site Residue Mutations on Stability and Activity of Neisseria Polysaccharea Amylosucrase. Protein Sci. 2013, 22 (12), 1754–1765.

(52) Cermak, T.; Doyle, E. L.; Christian, M.; Wang, L.; Zhang, Y.; Schmidt, C.; Baller, J. A.; Somia, N. V.; Bogdanove, A. J.; Voytas, D. F. Efficient Design and Assembly of Custom TALEN and Other TAL Effector-Based Constructs for DNA Targeting. Nucleic Acids Res. 2011, 39 (12), e82.

(53) Fournier, P.; Abbas, A.; Chasles, M.; Kudla, B.; Ogrydziak, D. M.; Yaver, D.; Xuan, J. W.; Peito, A.; Ribet, A. M.; Feynerol, C. Colocalization of Centromeric and Replicative Functions on Autonomously Replicating Sequences Isolated from the Yeast Yarrowia Lipolytica. Proc. Natl. Acad. Sci. U. S. A. 1993, 90 (11), 4912–4916.

(54) Daboussi, F.; Leduc, S.; Maréchal, A.; Dubois, G.; Guyot, V.; Perez-Michaut, C.; Amato, A.; Falciatore, A.; Juillerat, A.; Beurdeley, M.; Voytas, D. F.; Cavarec, L.; Duchateau, P. Genome Engineering Empowers the Diatom Phaeodactylum Tricornutum for Biotechnology. Nat. Commun. 2014, 5, 3831.


(55) Beopoulos, A.; Cescut, J.; Haddouche, R.; Uribelarrea, J.-L.; Molina-Jouve, C.; Nicaud, J.-M. Yarrowia Lipolytica as a Model for Bio-Oil Production. Prog. Lipid Res. 2009, 48 (6), 375–387.
(56) Browse, J.; Kunst, L.; Anderson, S.; Hugly, S.; Somerville, C. A Mutant of Arabidopsis Deficient in the Chloroplast 16:1/18:1 Desaturase. Plant Physiol. 1989, 90 (2), 522–529.


Linear domain organization of the YlFAS genes at approximate sequence scale, showing catalytic (colored) and scaffolding (grey) domains. ACP: Acyl Carrier Protein; KR: Ketoacyl Reductase; KS: Ketoacyl Synthase; PPT: PhosphoPantetheine Transferase; AT: Acetyl Transferase; ER: Enoyl Reductase; DH: Dehydratase; MPT: Malonyl Palmitoyl Transferase.


Molecular docking of fatty acids in the active site of KS from Y. lipolytica. (A) View of a C16 fatty acid docked in the active site of the parental wild-type KS. The location of amino acid residue I1220 is shown as sticks. (B) View of a C14 fatty acid docked in the active site of mutant I1220W, shown to illustrate the effect of a bulky amino acid introduced at position 1220 on the binding of shortened fatty acids such as C14. Carbon chain numbering is given for reference.


(A) TALEN target sites on the αFAS gene (KS domain in red). Schematic representation of the N-terminal and C-terminal domains of the TALEN (displayed in purple), the TALEN binding sites (in blue) and the FokI domain (in green). The I1220 ATC codon is indicated by the arrow and highlighted in yellow.


(A) Fatty acid (FA) profiles of the JMY1233m (wt(m)) strain and the mutants JMY1233_I1220X (X = Ala(A), Gly(G), Val(V), Leu(L), Met(M), Ser(S), Thr(T), Cys(C), Pro(P), Gln(Q), Asn(N), Glu(E) or Lys(K)). (B) FA profiles of the JMY1233m (wt(m)) strain and the mutants JMY1233_I1220X (X = Tyr(Y), Trp(W), Phe(F) or His(H)). For each clone, total FA accumulation and C14 accumulation are given in % w/w of DCW. Averages and standard errors are given for two clones cultivated separately. DCW and FA content were measured after 8 days of culture in minimal medium.
