Design and Assembly of DNA Sequence Libraries ... - ACS Publications

Jun 15, 2016 - Design and Assembly of DNA Sequence Libraries for Chromosomal. Insertion in Bacteria Based on a Set of Modified MoClo Vectors...

Letter

Design and assembly of DNA sequence libraries for chromosomal insertion in bacteria based on a set of modified MoClo vectors Daniel Schindler, Sarah Milbredt, Theodor Sperlea, and Torsten Waldminghaus ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.6b00089 • Publication Date (Web): 15 Jun 2016 Downloaded from http://pubs.acs.org on June 21, 2016

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

ACS Synthetic Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Manuscript for ACS Synthetic Biology

Design and assembly of DNA sequence libraries for chromosomal insertion in bacteria based on a set of modified MoClo vectors

Daniel Schindler‡, Sarah Milbredt‡, Theodor Sperlea‡, Torsten Waldminghaus*

‡equal contribution

Chromosome Biology Group, LOEWE Center for Synthetic Microbiology, SYNMIKRO, Philipps-Universität Marburg, Hans-Meerwein-Str. 6, D-35043 Marburg, Germany

Abstract

Efficient assembly of large DNA constructs is a key technology in synthetic biology. One of the most popular assembly systems is the MoClo standard, in which restriction and ligation of multiple fragments occur in a one-pot reaction. The system is based on a smart vector design and type IIs restriction enzymes, which cut outside their recognition site. While the initial MoClo vectors were developed for the assembly of multiple transcription units in plants, several derivatives of the vectors have been developed in recent years. Here we present a new set of MoClo vectors for the assembly of fragment libraries and insertion of constructs into bacterial chromosomes. The vectors are accompanied by a computer program that generates a degenerate synthetic DNA sequence that excludes ‘forbidden’ DNA motifs. We demonstrate the usability of the new approach by construction of a stable fluorescence repressor operator system (FROS).

Keywords

genome engineering; chromosome; software; Escherichia coli; sequence design; synthetic biology

Introduction

Biotechnology as well as basic research in biology often involves changing the organism of interest. In some cases, one might want to teach microorganisms to produce a valuable chemical; in other cases, one wants to see the effect of additional factors or how cells compete without a certain component. Thus, the ability to introduce changes efficiently is key for future life science developments. Alterations of organisms will, in most cases, be made at the DNA level, from which the phenotypic characteristics are derived. The development of genetic modification started in the 1970s with the first recombinant DNA being used to transform cells and has since been extended enormously. The research field of synthetic biology in particular has brought a multitude of new techniques for DNA manipulation and assembly (1-5). These new DNA assembly approaches were developed to overcome certain limitations of traditional cloning strategies. One major issue is that cloning based on DNA ligase and regular restriction endonucleases often leaves the respective cut sites as a scar in the assembled product. However, there are at least four DNA assembly approaches for scar-free assembly of DNA fragments (1-3). First, Gibson assembly is based on homologous ends of DNA fragments, which are fused in an in vitro reaction including an exonuclease, a DNA polymerase and a DNA ligase (5). This is similar to the second approach, where the homologous ends are fused in vivo by the highly efficient recombination system of the yeast Saccharomyces cerevisiae (6). In a third approach, the ligase cycling reaction (LCR), the homology is not mediated by the DNA fragment ends but by a bridging oligonucleotide (7). A fourth approach makes use of type IIs restriction enzymes (8). These enzymes are distinct from other restriction enzymes in that they cut outside their recognition site. They are directional, and the positioning of the recognition site determines where the DNA is cut. Notably, the actual cut site can be freely chosen, allowing the design of scar-less assemblies.
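
The property that a type IIs enzyme cuts outside its recognition site can be illustrated with a minimal Python sketch. It assumes the commonly documented BsaI geometry (recognition sequence GGTCTC, top strand cut 1 nt downstream, bottom strand cut 5 nt downstream, leaving a 4-nt overhang). The function name and the example sequence are hypothetical, and only forward-oriented sites are handled.

```python
BSAI_SITE = "GGTCTC"   # BsaI recognition sequence on the top strand (5'->3')
TOP_OFFSET = 1         # top strand is cut 1 nt downstream of the site
OVERHANG_LEN = 4       # bottom strand is cut 4 nt further downstream


def bsai_overhangs(seq):
    """Return (cut_position, overhang) for each forward-oriented BsaI site."""
    seq = seq.upper()
    hits = []
    start = seq.find(BSAI_SITE)
    while start != -1:
        cut = start + len(BSAI_SITE) + TOP_OFFSET
        overhang = seq[cut:cut + OVERHANG_LEN]
        # Ignore sites too close to the end to yield a full overhang.
        if len(overhang) == OVERHANG_LEN:
            hits.append((cut, overhang))
        start = seq.find(BSAI_SITE, start + 1)
    return hits


# Hypothetical example: the overhang is whatever follows the site,
# so it can be chosen freely by the designer.
print(bsai_overhangs("TTGGTCTCAAATGCCGTA"))  # [(9, 'AATG')]
```

Because the overhang is read from the sequence next to the site rather than from the site itself, changing the four bases after the spacer nucleotide changes the sticky end without touching the recognition sequence.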

An important benefit of the four described methods as compared to traditional cloning is their suitability for fast, single-reaction multi-fragment assembly. The first three approaches depend on homologous regions of about 20-40 bp, which determine the position of fragments in a multi-part assembly. With type IIs restriction sites, the required homology is limited to only 4 bp. This fact was used to develop hierarchical assembly systems based on vectors with defined 4 bp sequences that fit one another (8-10). Such a system allows the efficient assembly of many fragments into a destination vector independent of the actual sub-fragment sequence or size. Probably the most popular type IIs-based assembly framework is the MoClo system developed by Sylvestre Marillonnet and colleagues (9, 11). It consists of sets of seven vectors, with the 4 bp overhang ends of each vector matching the overhangs of the preceding and following vector, respectively. Assembling fragments from one vector set (one level) into the next is possible because the resistance markers as well as the type IIs restriction enzymes and sites alternate between levels. A set of endlinkers is used to generate matching ends for assembly of different numbers of fragments into one acceptor vector (9, 11).
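
The requirement that the 4 bp overhangs of neighbouring vectors match can be expressed as a small consistency check. The following Python sketch is an illustration only: the function names are hypothetical, and the overhang values in the usage example are placeholders rather than the fusion sites of the vectors described here.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_chain(parts):
    """Check an ordered list of (left_overhang, right_overhang) tuples.

    Returns a list of problems; an empty list means the chain is consistent.
    """
    if not parts:
        return []
    problems = []
    # Each part must end with the overhang the next part starts with.
    for i in range(len(parts) - 1):
        if parts[i][1] != parts[i + 1][0]:
            problems.append(f"part {i} does not join part {i + 1}")
    # Junctions must be non-palindromic and mutually distinct,
    # otherwise parts could ligate to themselves or in the wrong order.
    junctions = [parts[0][0]] + [right for _, right in parts]
    seen = set()
    for overhang in junctions:
        if overhang == revcomp(overhang):
            problems.append(f"{overhang} is palindromic")
        if overhang in seen or revcomp(overhang) in seen:
            problems.append(f"{overhang} clashes with an earlier junction")
        seen.add(overhang)
    return problems


# Placeholder overhangs for three consecutive positions.
chain = [("GGAG", "TGCC"), ("TGCC", "GCAA"), ("GCAA", "ACTA")]
print(check_chain(chain))  # []
```

A non-empty result points to the position where the order of fragments would no longer be uniquely determined by the overhangs.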

One important benefit of the MoClo approach is that it is based on mixing complete plasmids, eliminating the need for PCR or fragment isolation. Recently, the MoClo system was adapted to or optimized for special purposes such as transcription unit assembly in plants, mammals, fungi or bacteria (12-15). Here we present modifications of the MoClo system for efficient cloning of sequence libraries and for the construction of fragments to be inserted into the E. coli chromosome. We introduce a computer tool for sequence design and show the feasibility of our approach by designing and assembling a FROS array (fluorescence repressor operator system).


FROS is a widely used tool for spatial and temporal visualization of genetic loci in vivo and has been applied to various organisms (16-18). Fluorescently labelled DNA binding proteins are used to highlight specific binding sites, which are integrated at a gene locus of interest by homologous recombination. It was initially applied with tandem repeats of the lac operator and a GFP-fused Lac repressor in yeast and CHO cells (16). A tet operator-based FROS system was also generated and used in yeast (19). As the transfer of FROS to bacteria was not very successful due to instability caused by large homologous regions, arrays were optimized by insertion of random spacers between the operator repeats to decrease homology (20, 21). As a further improvement, the number of binding sites can be reduced from 250 to 64 to limit interference with the replication machinery (22). FROS was subsequently applied successfully in various bacteria to gain new insights into the localization, replication and segregation of chromosomes (17, 23, 24). Nevertheless, the design and generation of DNA sequences with many repetitive elements remains challenging. In this paper we present a new set of MoClo vectors that allowed generation of a FROS array with 64 binding sites each of two different operators in just 4 cloning steps, based on a single pair of degenerate oligonucleotides, and its subsequent integration into the chromosome of E. coli.

Results and discussion

A computer tool to generate sequences with restricted diversity

Efficient assembly of DNA fragments is critical for modern molecular biology approaches. It was predicted that software tools will have an increasing importance for DNA assembly approaches (2). Often, sequences are needed that have specific DNA motifs at defined sites but not at others. Other DNA motifs, for example restriction sites, need to be excluded throughout the whole construct. It might be straightforward to design a single exact sequence with these characteristics based on extension of two DNA oligonucleotides with an overlap region at one end (Fig. 1A). However, efficient cloning strategies should allow working with libraries generated from mixtures of DNA oligonucleotides to lower the overall costs. Here we present the computer program MARSeG (Motif Avoiding Randomized Sequence Generator), which generates degenerate sequences with a high degree of diversity while excluding a list of DNA motifs provided by the user (Fig. 1A). An example of its application could be the design of 20 spacer sequences with a length of 200 bp each that are used to separate transcription units within a large-scale gene circuit assembly. Notably, these spacer sequences should not harbor recognition sites for a list of restriction enzymes. Instead of designing and buying 20 individual sequences, one could just order a fully randomized sequence with 200 Ns and receive an oligonucleotide mix to be cloned into a vector backbone. However, a certain proportion of these sequences will have at least one of the ‘forbidden’ DNA motifs. We tested this assumption by comparing random sequences with sequences generated with MARSeG (Fig. 1B). Almost 60 % of completely random sequences with a length of 200 bp contain at least one motif from a list of 19 restriction enzyme recognition sites (motif list III in Table S1, Fig. 1B). The computer tool MARSeG reduces the diversity of sequences in such a way that ‘forbidden’ motifs are excluded while maintaining high sequence diversity. This leads to sequence collections without any occurrence of the ‘forbidden’ motifs (Fig. 1B). The degree of MARSeG library diversity will depend on the number and type of DNA motifs to be excluded, as shown by analysis of the overall sequence homology for 100 example sequences for three different lists with two, ten or nineteen ‘forbidden’ DNA motifs, respectively (Table S1, Fig. 1C). An alternative approach would be to generate many sequences of the desired length and exclude all sequences that contain one or more of the ‘forbidden’ motifs or other undesired characteristics (25). However, this approach will only generate individual sequences and not sequence libraries as MARSeG does. MARSeG is open source and available, including a detailed user manual, through the web site (http://www.synmikro.com/marseg).
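
The comparison with fully random sequences can be approximated by a short simulation. The following Python sketch is not the MARSeG algorithm; it only screens random sequences for motifs on both strands, which corresponds to the generate-and-filter alternative mentioned above. The motif list is a hypothetical four-site example (the lists actually used are given in Table S1), and all function names are illustrative.

```python
import random


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def contains_motif(seq, motifs):
    """True if any motif occurs on either strand of seq."""
    seq = seq.upper()
    return any(m in seq or revcomp(m) in seq for m in motifs)


def fraction_with_motif(motifs, length=200, n=500, seed=0):
    """Fraction of n random sequences that contain at least one motif."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(n):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        hits += contains_motif(seq, motifs)
    return hits / n


# Hypothetical motif list (BsaI, BpiI, EcoRI, BamHI recognition sites).
FORBIDDEN = ["GGTCTC", "GAAGAC", "GAATTC", "GGATCC"]
print(fraction_with_motif(FORBIDDEN))
```

With a longer motif list the fraction of affected random sequences rises, which is the effect the comparison in Fig. 1B quantifies for the 19-site list.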


A new set of MoClo vectors optimized for library cloning and insertion into the E. coli chromosome

The MoClo vectors are widely used and some specialized derivatives or part libraries have been developed (12, 13, 26). We changed the existing vectors to facilitate library cloning, multi-fragment assembly and insertion of constructs into bacterial chromosomes via homologous recombination techniques. An overview of the modifications is depicted in Figure 2A and a list of new vector sets is given in Table S2. The starting point for our modifications was a set of MoClo vectors kindly provided by Sylvestre Marillonnet. The respective level 1 vectors have been described previously, and the level M and P vectors differ from previous vectors in that they do not contain T-DNA borders for Agrobacterium delivery (9). Working with libraries instead of individual sequences poses special requirements on the DNA assembly system. Most importantly, the percentage of positive clones should be near 100 % because clones are not selected individually. To suppress vectors still containing the lacZ cassette instead of the desired fragment, we added the ccdB gene in such a way that it is lost together with the lacZ gene upon successful cloning (ccdB+ vectors). The ccdB gene product is a small cytotoxin that kills E. coli cells that are not engineered to express the antitoxin CcdA or do not possess a mutated gyrase (27). As expected, cloning with the ccdB+ vector led to elimination of the blue colonies still harboring the lacZ-ccdB MoClo cassette (Fig. 2B).

A second change to previous MoClo vectors is a size reduction of the sequence remaining between level 1 fragments in higher-level assemblies. The respective sequences were placed in the original level 1 vectors between the BpiI and BsaI sites and contain restriction sites to facilitate the analysis of assembled transcription sites. For this purpose they were certainly helpful, but they could be deleterious in other cases, for example as potential recombination sites if occurring too frequently. To keep these short sequences remaining between the assembled fragments as small as possible, we deleted the 12 bp between the BsaI and the BpiI cut sites for the whole level 1 vector set.

Very often it is desirable to introduce constructed gene circuits or pathways into the host chromosome, as the genomic stability is higher compared to plasmid-based expression (28). In addition, the cell-to-cell variability of plasmid copy numbers makes it difficult to derive quantitative data for exact measurements of expression phenomena (29). Insertions into the E. coli chromosome are straightforward with the phage lambda-based recombination system (30). However, a frequent problem is false positives originating from transferred plasmids, even if those just served as PCR template or were supposed to be cut by restriction enzymes. To eliminate this problem, we exchanged the original pMB1 replication origin for oriR6K. This conditional replication origin replicates only in E. coli strains expressing the lambda pir gene, and thus replicons based on oriR6K are not able to replicate in wild-type E. coli. As a proof of principle, we cloned building blocks for a chromosomal insertion into level 1 vectors, including homologous regions targeting the lac locus, a chloramphenicol resistance marker flanked by FRT sites to remove the cassette after successful integration via ‘flipping’, and a fluorescence gene with a constitutive promoter. After a one-step assembly of all four parts into one of our new vectors, the assembled construct could readily be inserted into the E. coli chromosome by recombineering to generate red fluorescent cells (data not shown). All vectors as well as the parts described here and below (72 plasmids in total) are available through a request form on our homepage (http://www.synmikro.com/plasmidrequest). An overview of the new MoClo vectors and their position within the MoClo hierarchy is shown in supplementary Figure S1.


Application of new vectors to construct a repressor-operator array

To test the usability of the MARSeG program and the new MoClo vectors, we applied these tools to a more challenging assembly, namely the construction of a FROS system. Such systems consist of an array of operator sites which are bound by a fluorescence marker fused to the respective repressor protein to visualize a specific genomic region by microscopy. These arrays are difficult to assemble because the operator sequences are homologous to one another. Such repetitive sequences have been shown to be especially difficult to assemble with methods relying on larger homology regions, such as Gibson assembly (8, 31). The array we designed contains tet as well as lac operators to allow more flexibility in the choice of binding proteins. Construction of a FROS array of 128 operators (64 TetO plus 64 LacO) was based on building blocks with 8 alternating operator sequences separated by variable linker sequences to reduce homology between building blocks (Fig. 3A). The basic building blocks were generated by elongation of two overlapping DNA oligonucleotides designed with MARSeG (Fig. 3A, see Methods section for details). Fragment libraries were applied to a MoClo reaction with seven level 1 vectors (Fig. 3A). After transformation of the cloning reaction, the generated plasmid libraries were purified directly from the liquid E. coli culture and used for a four-part assembly into level M vectors (Fig. 3A). Each of these parts should contain 32 operators, with the operators separated by spacer sequences of differing length that were designed by MARSeG to be diverse but not to contain recognition sites of the type IIs endonucleases used (Fig. 3A). To test this diversity, we sequenced the operator array regions after the second assembly step (pMA281-284, see suppl. Table S4) and aligned all spacer sequences with one another. The respective homology matrix is shown in Figure 3B. Notably, none of the 155 spacer sequences appeared more than once in the array. Homologies ranged between 0 and 90 %, clearly showing that the design and cloning approach presented here is able to produce a suitable amount of sequence diversity.
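
The figure legends state that pairwise homologies were calculated with a Smith-Waterman algorithm. A minimal Python sketch of such a local alignment score is given below. The scoring scheme (match +1, mismatch -1, linear gap -1) and the normalization to the length of the shorter sequence are assumptions made for illustration; the parameters used for the published matrices are not stated in this text.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are never allowed to drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def percent_homology(a, b):
    """Score relative to the best possible score of the shorter sequence."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return 100.0 * smith_waterman_score(a, b) / shorter


def homology_matrix(seqs):
    """Symmetric matrix of pairwise percent homology values."""
    return [[percent_homology(x, y) for y in seqs] for x in seqs]


print(percent_homology("GATTACA", "TTAC"))  # 100.0
```

Applied to a list of spacer sequences, `homology_matrix` yields the kind of all-against-all table that is color coded in a homology matrix plot.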

To further test the functionality of the constructed array, it was integrated into the E. coli chromosome via the new vector system as described above. Cells carrying this integration were transformed with a plasmid allowing inducible expression of the fluorescence protein mVenus fused to the TetR repressor. Fluorescence microscopy showed clear formation of foci in cells with the constructed FROS array insertion, as expected (Fig. 4A). In contrast, only diffuse fluorescence was seen in cells lacking the FROS array (Fig. 4A). These results demonstrate the functionality of the constructed FROS array. A common problem with FROS arrays in which the spacers between the operators share the same sequence, rather than being diverse as in our case, is genetic instability caused by homologous recombination events. This can lead to undesired size reduction of the respective FROS array. To test if the FROS array presented here is resistant to such recombination events, we cultured the E. coli strain carrying the array for an extended period of 120 h (see Material section for details). At 24 h intervals we measured the array size by Southern blotting (Fig. 4B). No fragments smaller than the expected 11247 bp were detected over the entire test period, supporting genetic stability of the constructed FROS array (Fig. 4B). To further test if the genetic stability of the FROS array with MARSeG-based design outperforms that of an array with the same operator setup but similar instead of diverse spacer sequences, we constructed such a “bad-design-array” with 128 operator sites as above. We cultivated E. coli MG1655 carrying the respective plasmid pMA704 continuously for several days in parallel to cells carrying a similar plasmid with the MARSeG-designed spacers (pMA290). The plasmid DNA was isolated at 24 h intervals and cut with BpiI to release the 4817 bp FROS array. A respective band can be seen at all analyzed time points for the FROS array with MARSeG design (Fig. 4C, left). In contrast, the FROS array band becomes weaker starting at 48 h of cultivation in the case of the similar spacer sequences (Fig. 4C, right). In addition, smaller bands occur on the agarose gel at later time points, clearly indicating plasmid size reduction through homologous recombination. We conclude that the FROS array designed using MARSeG has a higher genetic stability than an array in which all spacer sequences are similar. It is important to note that stable FROS arrays with variable linkers have been constructed before (17). However, the assembly approach presented here offers three improvements over the earlier work. First, the MARSeG design excludes unwanted restriction sites to avoid cloning of erroneously cut sub-fragments instead of full fragments. Second, the previous assembly approach required laborious purification of DNA fragments for cloning, in contrast to the plasmid-based MoClo approach used here. Third, the previous approach included a doubling of operator sites in each assembly step, while the MoClo hierarchy used here generates a four-fold increase of sites in each step. This reduces the number of cloning steps, which becomes more important the bigger the assembly of interest is.
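
The effect of the fold increase per step on the number of hierarchical assembly steps can be worked through with a few lines of Python. This is a simple arithmetic sketch: the function name is hypothetical, and it counts only the multiplication steps starting from the 8-operator building blocks, not the initial cloning of the building blocks or the assembly of the integration cassette.

```python
def steps_needed(start_sites, target_sites, fold):
    """Number of assembly rounds needed to reach target_sites."""
    steps = 0
    sites = start_sites
    while sites < target_sites:
        sites *= fold   # each round multiplies the number of operator sites
        steps += 1
    return steps


# From 8-operator building blocks to the 128-operator array:
print(steps_needed(8, 128, fold=2))  # 4  (8 -> 16 -> 32 -> 64 -> 128)
print(steps_needed(8, 128, fold=4))  # 2  (8 -> 32 -> 128)
```

The gap widens with array size, since the number of rounds grows with the logarithm of the target size to the base of the fold increase.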

Chromosomal insertions into the lac operon are a common approach but are limited to E. coli strains that actually carry this gene region. To allow more flexibility and potentially target multiple chromosomal sites, we have designed and constructed flanking regions for five additional chromosomal loci in the new MoClo vectors (Suppl. Table S4, suppl. Fig. S2). We have used the respective vectors to assemble a cassette targeting tnaA and could successfully use it for insertion of the FROS array (32) (Suppl. Tables S3 and S4). As for the integration into lacZ, we observed fluorescence foci showing functional chromosomal integration (data not shown).


Conclusions

The ability to efficiently assemble DNA constructs and integrate them into a host genome is still a main bottleneck in basic and applied molecular biology research. New methods have been developed in recent years allowing multi-fragment assembly based on different principles. The next step must be the adaptation and optimization of these new approaches to specific systems. Here we present tools for the design and efficient multi-fragment assembly of genetic constructs for chromosomal insertion. Our new MoClo vectors are fully compatible with previously published MoClo kits of the Marillonnet group. Although we focus on manipulation of the E. coli chromosome, our approach should be applicable in many bacteria that allow genetic modification via homologous recombination. We expect the approach presented here to be especially valuable for the design and construction of synthetic chromosomes, which is now technically possible (1, 4, 5, 33).

Material and Methods

A detailed description of the materials and methods is provided in the supplementary material.

Author Contribution

DS, SM and TS contributed equally to this work.

Acknowledgments

We gratefully thank Sylvestre Marillonnet (Halle, Germany) for providing MoClo vectors and helpful discussions. Johan Elf (Uppsala, Sweden), Federico Katzen and Xiquan Liang (Thermo Fisher Scientific, Carlsbad, USA), Alexander Böhm (Marburg, Germany, †), William Margolin (Houston, USA) and Michael L. Kahn (Washington, USA) are acknowledged for providing strains and/or plasmids. We thank Julian Sohl, Joel Eichmann and Patrick Sobetzko from the Waldminghaus lab for helping with experiments and data analysis as well as Nadine Schallopp for excellent technical assistance and the whole working group for fruitful discussions. We are grateful to Manuel Seip for help with setting up the web pages. This work was supported within the LOEWE program of the State of Hesse.

Supporting Information
- Fig. S1: New vectors and their hierarchy within the MoClo system.
- Fig. S2: Insertion sites on the E. coli chromosome to be targeted with the new MoClo vectors.
- Supplementary Methods.
- Table S1: DNA motifs to be excluded in sequences designed by MARSeG.
- Table S2: Overview of optimized MoClo plasmids and their respective changes.
- Table S3: Strains used in this study.
- Table S4: Plasmids used in this study.
- Table S5: Oligonucleotides used in this study.


Figure 1: Generation of diverse DNA sequences that exclude a list of motifs by MARSeG. (A) Double-stranded DNA fragment libraries are generated by annealing and extension of two single-stranded oligonucleotides (black lines) with partial overlap (gray boxes). Three possible designs are shown below. A fully defined sequence (first line) could exclude a list of motifs but does not confer diversity; a fully randomized sequence (second line) confers diversity but might lead to sequences including unwanted motifs. Sequences generated with MARSeG (third line) confer diversity while excluding unwanted motifs. (B) Sequences of 200 bp were generated completely at random (red) or with MARSeG (blue), and the number of motifs from a list of 19 ‘forbidden’ restriction enzyme recognition sites (list III in Table S1) was counted for 500 derived sequences. (C) Trade-off between the number of excluded motifs and diversity in MARSeG-generated sequences. After generating degenerate sequences using MARSeG with three motif lists as indicated (Table S1), 100 sequences were derived from each respective template. Pairwise sequence homology values were calculated using a Smith-Waterman algorithm. The degree of homology is color coded as indicated.


Figure 2: Optimization of MoClo vectors for library cloning and chromosomal insertions. (A) Schematic drawing of changes to existing MoClo vectors: arrows = insertion, double arrows = exchange, red cross = deletion. (B) Cloning into standard MoClo vectors produces some background consisting of original vectors, indicated by blue colonies (top panel). New vectors including ccdB lead to white colonies only (bottom panel). Percentage of colonies is given in the respective color.


Figure 3: Assembly of the LacO/TetO operator array. (A) DNA oligonucleotides were annealed (gray boxes), elongated and enriched via PCR. Linker lengths (in bp) are shown as white numbers. The resulting library was cloned into seven level 1 vectors. Sets of four level 1 vector libraries were assembled into level M acceptor vectors and four resulting individual vectors were combined into level P to gain the final array. For integration into E. coli lacZ, flanking regions and a chloramphenicol cassette (flanked by FRT sites) were assembled together with the final array into level M. (B) Spacer sequence homology matrix. The sequenced FROS array assembly parts (pMA281-284, see suppl. Table S4) were disassembled and the pairwise homologies of spacer sequences were calculated using a Smith-Waterman algorithm. The respective homology is color coded as indicated.


Figure 4: In vivo functionality of the constructed FROS array. (A) Fluorescence microscopy of E. coli cells harboring a plasmid encoding a TetR-mVenus fusion and either no FROS array (top panel; strain SM100) or a chromosomal integration of the new FROS array (bottom panel; strain SM112). The scale bar is 2 µm. (B) Southern blot analysis to test stability of the LacO/TetO array during extended cultivation. Chromosomal DNA was isolated from strain SM93 after different time points of cultivation as indicated and cut with NdeI. DNA was blotted onto a membrane after separation on an agarose gel and the array detected with a probe directed against lacI. The black asterisk highlights the size of the array (11247 bp). As control we used DNA from wild-type E. coli MG1655 without FROS integration, resulting in a fragment of 7520 bp. (C) Genetic stability of a FROS array with MARSeG-designed variable spacer sequences (left) compared to an equivalent array in which all spacer sequences are similar to one another (right). E. coli strains DS366 and DS367 carrying plasmids pMA290 and pMA704, respectively, were cultivated for the indicated time periods. Plasmid DNA was isolated and cut with BpiI to release the array (4817 bp) and the vector backbone (3968 bp) as indicated.


References

1. Schindler, D., and Waldminghaus, T. (2015) Synthetic chromosomes, FEMS microbiology reviews 39, 871-891.
2. Casini, A., Storch, M., Baldwin, G. S., and Ellis, T. (2015) Bricks and blueprints: methods and standards for DNA assembly, Nat Rev Mol Cell Biol 16, 568-576.
3. Karas, B. J., Suzuki, Y., and Weyman, P. D. (2015) Strategies for cloning and manipulating natural and synthetic chromosomes, Chromosome research 23, 57-68.
4. Annaluru, N., Muller, H., Mitchell, L. A., Ramalingam, S., Stracquadanio, G., Richardson, S. M., Dymond, J. S., Kuang, Z., Scheifele, L. Z., Cooper, E. M., Cai, Y., Zeller, K., Agmon, N., Han, J. S., Hadjithomas, M., Tullman, J., Caravelli, K., Cirelli, K., Guo, Z., London, V., Yeluru, A., Murugan, S., Kandavelou, K., Agier, N., Fischer, G., Yang, K., Martin, J. A., Bilgel, M., Bohutski, P., Boulier, K. M., Capaldo, B. J., Chang, J., Charoen, K., Choi, W. J., Deng, P., DiCarlo, J. E., Doong, J., Dunn, J., Feinberg, J. I., Fernandez, C., Floria, C. E., Gladowski, D., Hadidi, P., Ishizuka, I., Jabbari, J., Lau, C. Y., Lee, P. A., Li, S., Lin, D., Linder, M. E., Ling, J., Liu, J., London, M., Ma, H., Mao, J., McDade, J. E., McMillan, A., Moore, A. M., Oh, W. C., Ouyang, Y., Patel, R., Paul, M., Paulsen, L. C., Qiu, J., Rhee, A., Rubashkin, M. G., Soh, I. Y., Sotuyo, N. E., Srinivas, V., Suarez, A., Wong, A., Wong, R., Xie, W. R., Xu, Y., Yu, A. T., Koszul, R., Bader, J. S., Boeke, J. D., and Chandrasegaran, S. (2014) Total synthesis of a functional designer eukaryotic chromosome, Science 344, 55-58.
5. Gibson, D. G., Glass, J. I., Lartigue, C., Noskov, V. N., Chuang, R. Y., Algire, M. A., Benders, G. A., Montague, M. G., Ma, L., Moodie, M. M., Merryman, C., Vashee, S., Krishnakumar, R., Assad-Garcia, N., Andrews-Pfannkoch, C., Denisova, E. A., Young, L., Qi, Z. Q., Segall-Shapiro, T. H., Calvey, C. H., Parmar, P. P., Hutchison, C. A., 3rd, Smith, H. O., and Venter, J. C. (2010) Creation of a bacterial cell controlled by a chemically synthesized genome, Science 329, 52-56.
6. Ma, H., Kunes, S., Schatz, P. J., and Botstein, D. (1987) Plasmid construction by homologous recombination in yeast, Gene 58, 201-216.
7. de Kok, S., Stanton, L. H., Slaby, T., Durot, M., Holmes, V. F., Patel, K. G., Platt, D., Shapland, E. B., Serber, Z., Dean, J., Newman, J. D., and Chandran, S. S. (2014) Rapid and reliable DNA assembly via ligase cycling reaction, ACS synthetic biology 3, 97-106.
8. Engler, C., Kandzia, R., and Marillonnet, S. (2008) A one pot, one step, precision cloning method with high throughput capability, PLoS One 3, e3647.
9. Weber, E., Engler, C., Gruetzner, R., Werner, S., and Marillonnet, S. (2011) A modular cloning system for standardized assembly of multigene constructs, PLoS One 6, e16765.
10. Storch, M., Casini, A., Mackrow, B., Fleming, T., Trewhitt, H., Ellis, T., and Baldwin, G. S. (2015) BASIC: A New Biopart Assembly Standard for Idempotent Cloning Provides Accurate, Single-Tier DNA Assembly for Synthetic Biology, ACS synthetic biology 4, 781-787.
11. Werner, S., Engler, C., Weber, E., Gruetzner, R., and Marillonnet, S. (2012) Fast track assembly of multigene constructs using Golden Gate cloning and the MoClo system, Bioengineered bugs 3, 38-43.
12. Lee, M. E., DeLoache, W. C., Cervantes, B., and Dueber, J. E. (2015) A Highly Characterized Yeast Toolkit for Modular, Multipart Assembly, ACS synthetic biology 4, 975-986.
13. Engler, C., Youles, M., Gruetzner, R., Ehnert, T. M., Werner, S., Jones, J. D., Patron, N. J., and Marillonnet, S. (2014) A golden gate modular cloning toolbox for plants, ACS synthetic biology 3, 839-843.
14. Duportet, X., Wroblewska, L., Guye, P., Li, Y., Eyquem, J., Rieders, J., Rimchala, T., Batt, G., and Weiss, R. (2014) A platform for rapid prototyping of synthetic gene networks in mammalian cells, Nucleic acids research 42, 13440-13451.
15. Weber, E., Gruetzner, R., Werner, S., Engler, C., and Marillonnet, S. (2011) Assembly of designer TAL effectors by Golden Gate cloning, PLoS One 6, e19722.


16. Robinett, C. C., Straight, A., Li, G., Willhelm, C., Sudlow, G., Murray, A., and Belmont, A. S. (1996) In vivo localization of DNA sequences and visualization of large-scale chromatin organization using lac operator/repressor recognition, J Cell Biol 135, 1685-1700.
17. Lau, I. F., Filipe, S. R., Soballe, B., Okstad, O. A., Barre, F. X., and Sherratt, D. J. (2003) Spatial and temporal organization of replicating Escherichia coli chromosomes, Mol Microbiol 49, 731-743.
18. Matzke, A. J., Huettel, B., van der Winden, J., and Matzke, M. (2005) Use of two-color fluorescence-tagged transgenes to study interphase chromosomes in living plants, Plant Physiol 139, 1586-1596.
19. Michaelis, C., Ciosk, R., and Nasmyth, K. (1997) Cohesins: chromosomal proteins that prevent premature separation of sister chromatids, Cell 91, 35-45.
20. Gordon, G. S., Sitnikov, D., Webb, C. D., Teleman, A., Straight, A., Losick, R., Murray, A. W., and Wright, A. (1997) Chromosome and low copy plasmid segregation in E. coli: visual evidence for distinct mechanisms, Cell 90, 1113-1121.
21. Dworkin, J., and Losick, R. (2002) Does RNA polymerase help drive chromosome segregation in bacteria?, Proc Natl Acad Sci U S A 99, 14089-14094.
22. Mettrick, K. A., and Grainge, I. (2016) Stability of blocked replication forks in vivo, Nucleic acids research 44, 657-668.
23. Thanbichler, M., and Shapiro, L. (2006) MipZ, a spatial regulator coordinating chromosome segregation with cell division in Caulobacter, Cell 126, 147-162.
24. Wang, X., Montero Llopis, P., and Rudner, D. Z. (2014) Bacillus subtilis chromosome organization oscillates between two distinct patterns, Proc Natl Acad Sci U S A 111, 12877-12882.
25. Casini, A., Christodoulou, G., Freemont, P. S., Baldwin, G. S., Ellis, T., and MacDonald, J. T. (2014) R2oDNA designer: computational design of biologically neutral synthetic DNA sequences, ACS synthetic biology 3, 525-528.
26. Iverson, S. V., Haddock, T. L., Beal, J., and Densmore, D. M. (2016) CIDAR MoClo: Improved MoClo Assembly Standard and New E. coli Part Library Enable Rapid Combinatorial Design for Synthetic and Traditional Biology, ACS synthetic biology 5, 99-103.
27. Bernard, P. (1996) Positive selection of recombinant DNA by CcdB, Biotechniques 21, 320-323.
28. Santos, C. N., Regitsky, D. D., and Yoshikuni, Y. (2013) Implementation of stable and complex biological systems through recombinase-assisted genome engineering, Nat Commun 4, 2503.
29. Bentley, W. E., and Quiroga, O. E. (1993) Investigation of subpopulation heterogeneity and plasmid stability in recombinant Escherichia coli via a simple segregated model, Biotechnol Bioeng 42, 222-234.
30. Datsenko, K. A., and Wanner, B. L. (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products, Proc Natl Acad Sci U S A 97, 6640-6645.
31. Cermak, T., Doyle, E. L., Christian, M., Wang, L., Zhang, Y., Schmidt, C., Baller, J. A., Somia, N. V., Bogdanove, A. J., and Voytas, D. F. (2011) Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting, Nucleic acids research 39, e82.
32. Waldminghaus, T., Weigel, C., and Skarstad, K. (2012) Replication fork movement and methylation govern SeqA binding to the Escherichia coli chromosome, Nucleic acids research 40, 5465-5476.
33. Messerschmidt, S. J., Kemter, F. S., Schindler, D., and Waldminghaus, T. (2015) Synthetic secondary chromosomes in Escherichia coli based on the replication origin of chromosome II in Vibrio cholerae, Biotechnol J 10, 302-314.


For Table of Contents Use Only

Design and assembly of DNA sequence libraries for chromosomal insertion in bacteria based on a set of modified MoClo vectors
Daniel Schindler*, Sarah Milbredt*, Theodor Sperlea* and Torsten Waldminghaus (* equal contribution).

Graphical Abstract
