ACS Publications - American Chemical Society

Dec 21, 2015 - Cancer Informatics Core, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida 33612, United States. §. Department of En...
0 downloads 0 Views 1MB Size
Subscriber access provided by ORTA DOGU TEKNIK UNIVERSITESI KUTUPHANESI

Article

Identifying regulatory changes to facilitate nitrogen fixation in the non-diazotroph Synechocystis sp. PCC 6803 Thomas J. Mueller, Eric A. Welsh, Himadri B. Pakrasi, and Costas D. Maranas ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.5b00202 • Publication Date (Web): 21 Dec 2015 Downloaded from http://pubs.acs.org on December 23, 2015

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

ACS Synthetic Biology is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Page 1 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

1

Research article

2

Identifying regulatory changes to facilitate nitrogen fixation in the non-

3

diazotroph Synechocystis sp. PCC 6803

4 5

Thomas J. Mueller1

6

Eric A. Welsh2

7

Himadri B. Pakrasi3,4

8

Costas D. Maranas1*

9 10

1Department

11

Park, Pennsylvania, USA

12

2Cancer

13

Tampa, Florida, USA

14

3Department

15

University, St. Louis, Missouri, USA

16

4Department

of Chemical Engineering, Pennsylvania State University, University

Informatics Core, H Lee Moffitt Cancer Center and Research Institute,

of Energy, Environmental, and Chemical Engineering, Washington

of Biology, Washington University, St. Louis, Missouri, USA

17 18

Correspondence: Costas D. Maranas, Department of Chemical Engineering,

19

Pennsylvania State University, 112 Fenske Laboratory, University Park, PA, 16802

20

Email: [email protected]

21

Keywords:

22

Network, Modeling, Synechocystis, Cyanothece

23

Abbreviations: TF, transcription factor

Cyanobacteria,

Computational

Biotechnology,

ACS Paragon Plus Environment

Gene

Regulatory

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 2 of 33

24

Abstract

25

The incorporation of biological nitrogen fixation into a non-diazotrophic

26

photosynthetic organism provides a promising solution to the increasing fixed

27

nitrogen demand, but is accompanied by a number of challenges for accommodating

28

two incompatible processes within the same organism. Here we present regulatory

29

influence networks for two cyanobacteria, Synechocystis PCC 6803 and Cyanothece

30

ATCC 51142, and evaluate them to co-opt native transcription factors that may be

31

used to control the nif gene cluster once it is transferred to Synechocystis. These

32

networks were further examined to identify candidate transcription factors for

33

other metabolic processes necessary for temporal separation of photosynthesis and

34

nitrogen

35

transcription factors native to Synechocystis, LexA and Rcp1, were identified as

36

promising candidates for the control of the nif gene cluster and other pertinent

37

metabolic processes, respectively. Lessons learned in the incorporation of nitrogen

38

fixation into a non-diazotrophic prokaryote may be leveraged to further progress

39

the incorporation of nitrogen fixation in plants.

fixation,

glycogen

catabolism

and

cyanophycin

synthesis.

Two

40 41

1 Introduction

42

Global use of nitrogen fertilizer has seen a sevenfold increase from 1960 to 1995,

43

and further increases are predicted barring substantial increases in fertilizer

44

efficiency1. Increasing levels of biological nitrogen fixation can replace a portion of

45

the required fertilizer, decreasing nitrogen demands2, 3. Some plants, such as

46

legumes, have formed symbiotic relationships with rhizobium in order to acquire

ACS Paragon Plus Environment

Page 3 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

47

fixed nitrogen4, whereas other plants must acquire fixed nitrogen from the

48

environment5. The inclusion of nitrogen fixation in a photosynthetic organism

49

requires strict regulation of the oxygen levels within the organism, as nitrogenase is

50

irreversibly inhibited by oxygen6. Our goal in this paper is to identify how to provide

51

the regulatory structures to control the inherently incompatible processes of

52

photosynthesis and nitrogen fixation.

53 54

Cyanobacteria are photosynthetic prokaryotes that serve as a promising test system

55

in transferring nitrogen-fixing capabilities, as both diazotrophic and non-

56

diazotrophic species exist. Some cyanobacteria, such as Synechocystis PCC 6803

57

(hereafter referred to as Synechocystis), require fixed nitrogen from the

58

environment7 whereas others, such as Cyanothece ATCC 51142 (hereafter referred

59

to as Cyanothece) and Anabaena PCC 7120, are diazotrophic8, 9. In order to

60

accommodate

61

cyanobacteria separate processes either spatially or temporally6. Organisms such as

62

Anabeana use specialized cells called heterocysts to physically separate nitrogen

63

fixation from cells performing photosynthesis9. Others, such as Cyanothece,

64

temporally separate nitrogen fixation and photosynthesis8. Cyanothece generates

65

glycogen during the day and then consumes it through a burst of respiration in the

66

early dark hours, generating both the energy necessary for nitrogen fixation as well

67

as the requisite anaerobic environment10. Therefore, on a standard 12h/12h light-

68

dark cycle, the nitrogen fixation genes are highly overexpressed during the dark

69

period under anoxic conditions8.

both

nitrogen

fixation

and

photosynthesis,

ACS Paragon Plus Environment

diazotrophic

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

70 71

Synechocystis and Cyanothece have been extensively studied compared to other

72

cyanobacteria. When both organisms are grown under constant light at 30°C and

73

non-nitrogen fixing conditions, the doubling times vary from approximately 24

74

hours for Synechocystis11 to 45 hours for Cyanothece12. Synechocystis was the first

75

cyanobacterium to have its genome sequenced13 and has served as a genetically

76

tractable model organism for cyanobacteria14. A number of its regulatory genes have

77

been further investigated15-17 and the structure of segments of the regulatory

78

network has been proposed18. While Synechocystis is a more common target of

79

research, recent work has compiled both transcriptomic and proteomic data for

80

Cyanothece8, 19. In addition, there have been significant efforts spent elucidating the

81

metabolism of both Cyanothece and several closely related strains11, 12, 20, 21 and the

82

regulatory network of Cyanothece22. The regulatory network presented here was

83

generated in order to facilitate comparison between the two organisms. Given the

84

close relationship in Cyanothece between nitrogen fixation and its diurnal

85

oscillations, any comparisons between different organisms focusing on nitrogen

86

fixation must trace the light-dark cycle.

87 88

Synechocystis will be used as a proof of concept for the incorporation of temporal

89

separation of photosynthesis and nitrogen fixation in a non-diazotrophic organism.

90

Cyanothece is a logical source for the nitrogen fixation genes given its close

91

phylogenetic relationship to Synechocystis and the fact that it contains the largest

92

contiguous nif cluster in cyanobacteria10.

ACS Paragon Plus Environment

Page 4 of 33

Page 5 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

93 94

In order to identify candidate regulators for the processes necessary to incorporate

95

nitrogen fixation into Synechocystis, we generated, examined, and compared

96

regulatory networks for both Synechocystis and Cyanothece. We identified

97

transcription factors from both Synechocystis and Cyanothece that could be co-opted

98

to control these processes in Synechocystis. Two of these identified transcription

99

factors, LexA and Rcp1, are native to Synechocystis and have both expression

100

profiles and environmental responses that make them compelling candidates for

101

controlling the nif cluster and other pertinent metabolic processes, respectively.

102

Cyanobacteria have been used as a platform system for investigating CO2

103

concentrating mechanisms in plants23, 24. Therefore lessons learned in manipulating

104

regulatory structures in cyanobacteria could ultimately inform the incorporation of

105

nitrogen fixation in plants.

106 107

2 Results and Discussion

108

2.1 Comparison of peak expression timing

109

The regulatory networks for both Synechocystis PCC 6803 (Synechocystis) and

110

Cyanothece ATCC 51142 (Cyanothece) were generated by minimizing the extent of

111

experimental expression variance not explained for a fixed number of regulatory

112

influences. The number of influences for each gene cluster was chosen as the

113

minimum number of influences that would allow for over 50% of the expression

114

change to be explained. The Synechocystis network contains 45 gene clusters, 17

115

transcription factor (TF) clusters, and an average degree (i.e., TF clusters affecting a

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

116

gene cluster) of 5. The Cyanothece network contains 221 gene clusters, 16 TF

117

clusters, and has an average degree of 2.23 (Table 1). Parts of the networks are

118

shown in Figure 1. The regulatory influences of each TF cluster on each gene cluster

119

can be found in Supplemental File S1. This difference in average degree could be

120

caused in part by the differing methods used to process the original transcriptomic

121

data.

122 123

In order for a non-diazotrophic organism such as Synechocystis to temporally

124

separate nitrogen fixation and photosynthesis, a number of genes must have

125

expression profiles similar to those of their Cyanothece counterparts. The genes in

126

these two organisms that have differing expression profiles extends past those

127

involved in nitrogen fixation and its associated processes. Performing a bidirectional

128

BLAST search on the set of cycling Cyanothece genes identified 306 homolog pairs

129

between the two organisms where both genes were in the set of cycling genes. Only

130

114 of these pairs have elevated expression for both genes at the same segment of

131

the light-dark cycle. Another 77 peaked at different times during the same period

132

(e.g. a gene peaks in the early light and its homolog peaks in the late light). Figure 2

133

shows the distribution of the 115 homolog pairs that peak in different periods (i.e.

134

one homolog peaks in the dark period and the other in the light). Of the 14 homolog

135

pairs with differing peaks categorized in photosynthesis and respiration, six were

136

associated with respiratory electron flow and two with phycobilisomes25. These un-

137

matched peaks can be explained by the different temporal pattern in glycogen

138

consumption and use of phycobilisomes in the two organisms26. Given the number

ACS Paragon Plus Environment

Page 6 of 33

Page 7 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

139

of differences in expression across the genome, we focused only on those genes

140

directly associated with nitrogen fixation or the metabolic processes recruited

141

before or after nitrogen fixation.

142 143

2.2 Candidates for controlling nif cluster expression

144

Several different avenues are possible for imposing the necessary regulatory action

145

on genes involved in these processes. By using native transcription factors, the

146

number of genes that would need to be transferred is minimized and the

147

uncertainty in the expression profile of the transcription factor is reduced, as many

148

of its regulatory partners are already present. Candidate native TFs were identified

149

by determining the gene clusters in Synechocystis with the appropriate expression

150

profile for this process, and by investigating the TF clusters that were identified as

151

regulatory influences for a number of these gene clusters. Some of these identified

152

TFs have expression profiles similar to that of the final desired expression profile.

153

Controlling gene expression using a transcription factor with the same expression

154

profile is a concept underlying the development in gene circuits27. Gene circuits

155

developed using TFs have been designed to respond to environmental cues (i.e.,

156

oxygen level)28. The strategy outlined here, instead, aims to identify TFs native to

157

the organism that already respond to environmental conditions (e.g., low oxygen,

158

absence of light, etc.) that subsequently can be coopted to control the genes of

159

interest. Examining the Cyanothece regulatory network and the regulatory

160

influences predicted for the genes associated with the process identified additional

161

candidate regulators. Using these regulators would require the transfer of the

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

162

regulatory genes to Synechocystis but reduce the need to modify promoter regions

163

for the genes transferred from Cyanothece.

164 165

The expression profile necessary for the transferred nif genes is characterized by an

166

increase in expression during the dark and a rapid decrease in expression at the

167

beginning of illumination. The desired profile can be seen in gene clusters 4, 10, 16,

168

17, 18, 19, and 35 of Synechocystis. There are four TF clusters (10, 15, 16, and 17)

169

that are regulatory influences for at least three of these gene clusters. Each TF

170

cluster contains a single gene that has been shown to respond to environmental

171

conditions pertinent to nitrogen fixation. TF cluster 10 contains gene sll1689, which

172

codes for the group 2 sigma factor sigE. Both the transcript levels of sigE and the

173

protein levels of the associated SigE protein increase under nitrogen depletion29.

174

Studies have shown that sigE expression increases in an NtcA dependent manner29.

175

NtcA, a global nitrogen regulator in cyanobacteria30, responds to intracellular 2-

176

oxoglutarate levels and regulates a number of genes involved in nitrogen

177

assimilation31. In Cyanothece ATCC 51142 (Cyanothece), it has been shown that ntcA

178

expression is out of phase with that of the nif genes32, and has been suggested that

179

ntcA is involved primarily in nitrogen assimilation than in nitrogen fixation33.

180

Nevertheless, both ntcA and sigE do not account for the presence of oxygen, a

181

nitrogenase inhibitor, through detection of either light or intracellular oxygen levels.

182

Therefore, additional synthetic regulation is needed to account for such factors.

183

ACS Paragon Plus Environment

Page 8 of 33

Page 9 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

184

The remaining three TF clusters contain TF genes that respond to these

185

environmental conditions. TF cluster 17 contains the hik16 (slr1805) gene. While

186

hik16 is most commonly associated with the salt stress response34, it also responds

187

to oxidative stress induced by hydrogen peroxide35. The decreased expression of

188

hik16 during the day may be explained by the oxidative stress that can be caused by

189

reactive oxygen species generated by photosynthesis36. Given that it responds to

190

additional stimuli, namely salt stress, hik16 would not be an ideal native TF for use

191

in nif cluster regulation. Both TF clusters 15 and 16 contain genes (lexA and rre34

192

respectively) that respond to oxygen levels within the environment. In

193

Synechocystis, LexA binds to two different sites on the hydrogenase operon and is

194

also known to regulate the sbtA gene, which codes for a sodium-bicarbonate

195

symporter37. Both of these regulatory influences are predicted in the Synechocystis

196

network (Figure 1A). LexA is thought to function as a transcriptional activator16 and

197

is upregulated after the transition to a low O2 environment38. Similarly, rre34

198

(sll0789) was shown to be upregulated under low O2 conditions38. Given the

199

irreversible inhibition of nitrogenase by oxygen, a transcriptional activator that is

200

upregulated under low O2 conditions is a promising candidate for use in regulating

201

the nif cluster in Synechocystis. When comparing the expression profiles of the two

202

candidates in Synechocystis with that of the nif cluster in Cyanothece (Figure 3) we

203

see that lexA peaks at the same time as the nif cluster as compared to the later peak

204

in expression for rre34. This places lexA as the top candidate for regulating the nif

205

cluster in a diazotrophic Synechocystis.

206

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

207

The Cyanothece regulatory network identifies a number of transcription factors

208

involved in the regulation of genes in the nif cluster. 33 genes in the nif cluster were

209

identified as cycling and were contained within 13 different gene clusters

210

(Supplementary File S2). Many of these gene clusters have different sets of

211

regulatory influences although several TF clusters are prominent (Supplementary

212

File S3). 27 of the 33 genes have either TF cluster 4 or TF cluster 15 as a regulatory

213

influence. TF cluster 4 contains three genes, one of which is cce_1898, annotated as

214

patB. In the previous regulatory network of Cyanothece, PatB was predicted to

215

influence the expression of the nitrogenase complex22. PatB was also shown to be

216

required for diazotrophic growth, and is expressed in the heterocysts of Anabaena

217

PCC 712039. TF cluster 15 is composed of only one gene, sigE. SigE has been shown

218

to be upregulated in both Synechocystis and Anabeana PCC 7120 under nitrogen

219

depletion9,

220

influences 8 genes. The deletion of sigD in Anabeana impairs the ability of the

221

organism to establish diazotrophic growth40 (Figure 1B). Given the prediction that

222

PatB controls nif cluster expression, the introduction of patB into Synechocystis and

223

the use of LexA to control its expression is another possible approach to controlling

224

the nif cluster expression in Synechocystis (Figure 4A). By using the native nif

225

cluster regulator, this strategy circumvents the need to modify all promoter regions

226

in the nif cluster individually and also retains the native control of the nif cluster by

227

PatB.

29.

TF cluster 10, another prominent regulator, contains sigD and

228 229

2.3 Candidates for regulating other pertinent metabolic processes

ACS Paragon Plus Environment

Page 10 of 33

Page 11 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

230

In addition to nitrogen fixation two other processes, glycogen catabolism and

231

cyanophycin synthesis, must be active at specific times for the diazotrophic

232

Synechocystis to create the requisite anaerobic environment and to assimilate and

233

store newly fixed nitrogen. In Cyanothece the anaerobic environment and energy

234

necessary for nitrogen fixation are generated through the consumption of glycogen

235

in the early dark period8. Glycogen catabolism in Synechocystis is a metabolic

236

process that must be placed under different regulatory control. There is a pair of

237

homologs (cce_1629 and slr1367) within the two organisms that are annotated as

238

glycogen phosphorylases and identified as cycling. The current expression profile in

239

Synechocystis includes a peak in expression one hour before the onset of light, and a

240

negative log2 ratio one hour into the dark period, when its homolog peaks in

241

Cyanothece (Figure 5A). This is in contrast to the glycogen synthase genes, which

242

peak in the beginning of the light period for both organisms, and does not require

243

any modified regulation. Using the steps discussed to identify native transcription

244

factors for the nif cluster, two gene clusters, 16 and 34, were found to have the

245

desired profile; low expression during the light and peak expression in the early

246

dark. These two clusters share four regulatory influences, the most significant of

247

these is TF cluster 12, which is composed of the rcp1 (slr0474) gene. The Rcp1

248

protein is part of a two-component light sensing system with the phytochrome

249

Cph141. Garcia-Dominguez et al. showed that rcp1 is maximally expressed 15

250

minutes into the dark period, and its expression levels are almost undetectable five

251

minutes after illumination42. Given the role that glycogen catabolism plays in

252

consuming oxygen within the cell, its associated genes should be expressed prior to

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

253

those for nitrogen fixation, making rcp1 an ideal candidate for controlling glycogen

254

phosphorylase. The Cyanothece network identifies the regulatory influences for

255

glycogen phosphorylase as a hypothetical protein (cce_4141) and sigD, which is

256

involved in establishing diazotrophic growth in Anabeana40.

257 258

Cyanophycin is a dynamic nitrogen reserve composed of equimolar quantities of

259

arginine and aspartic acid26. Cyanophycin accumulation in Cyanothece follows the

260

fixation of nitrogen, being mainly generated between two and six hours after the

261

beginning of the dark period26. In contrast, cyanophycin has a minor role in

262

Synechocystis, which, instead, degrades phycobilisomes, leading to cell bleaching26.

263

Therefore, arginine, aspartic acid, and cyanophycin synthesis in Synechocystis

264

should be expressed in a manner similar to that of their counterparts in Cyanothece.

265

The genes involved in arginine and aspartic acid synthesis were identified through

266

the gene-protein-reaction relationships of the reactions that carry flux at maximum

267

biomass production for the genome-scale models iSyn731 and iCyt773 (Figure

268

5C)11. For each reaction, the elevated expression time for each associated gene was

269

compared. Several of the gene pairs shared elevated expression times, although four

270

genes in Synechocystis (sll0573, slr0585, slr1133, and sll0902) either were not

271

cycling, or had elevated expression during the beginning of the light period.

272

Conversely, their Cyanothece counterparts exhibited elevated expression during the

273

beginning of the dark period (Figure 5B). Cyanophycin synthetase catalyzes the

274

elongation reaction of cyanophycin through the incorporation of arginine and

275

aspartic acid43. The gene that codes for this enzyme in Synechocystis, slr2002, is not

ACS Paragon Plus Environment

Page 12 of 33

Page 13 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

276

cycling and therefore will need to be controlled by a different TF. The cyanophycin

277

synthetase gene in Cyanothece, cce_2237, peaks in expression at the beginning of the

278

dark period, with above average expression continuing through the middle of the

279

dark period. Given the similar timing of necessary expression, the native TF

280

identified as candidates for nif cluster and glycogen catabolism regulation, namely,

281

LexA and Rcp1, serve equally well as candidates for control of these genes. Two TF

282

clusters, 3 and 13, regulate the six clusters containing these genes in Cyanothece. TF

283

cluster 13 is composed of a FUR family transcription factor, and TF cluster 3

284

contains cce_1982, which is the homolog of rcp1 in Synechocystis, strengthening its

285

candidacy for regulation of cyanophycin synthesis (Figure 4B).

286 287

This work focuses on the transcriptional regulation of the nif genes and the genes

288

associated with several other key processes. Future work that considers allosteric

289

regulation will be able to provide a more comprehensive set of possible

290

modifications that allow for the incorporation of processes such as nitrogen fixation.

291

The inclusion of additional forms of regulation would capture processes such as the

292

allosteric regulation of NtcA activity through binding with 2-oxoglutarate44.

293 294

In this paper we elucidated regulatory influence networks for two cyanobacteria,

295

Synechocystis

296

transcriptomic data taken during 12h/12h light-dark cycles. These networks were

297

inspected to identify what regulation would need to be introduced or modified to

298

incorporate nitrogen fixation into the non-diazotrophic Synechocystis. One native

PCC

6803

and

Cyanothece

ATCC

ACS Paragon Plus Environment

51142,

generated

using

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Page 14 of 33

299

transcription factor, LexA, arose as a candidate for regulating nif cluster expression,

300

while PatB was identified as a promising candidate to be a regulator of nif cluster

301

genes in Cyanothece that could be transferred to Synechocystis. Using these

302

predictions together, we postulate that transferring patB to Synechocystis and

303

controlling its expression with LexA would lead to the desired nif cluster expression

304

within Synechocystis. Other processes with required expression changes dictated by

305

the temporal separation of photosynthesis and nitrogen fixation, specifically

306

glycogen catabolism and cyanophycin anabolism, were analyzed. The native Rcp1

307

transcription factor arose as an additional candidate for the regulation of these

308

processes within a newly diazotrophic Synechocystis. Given the expression profiles

309

and response to environmental conditions, LexA and Rcp1 are transcription factors

310

native to Synechocystis that should be considered for use as regulatory control of the

311

nif cluster and the associated metabolic processes, respectively.

312

3 Materials and methods

313

3.1 Cycling and homologous gene identification

314

Regulatory influence networks for both organisms were generated using time series

315

transcriptomic data over a standard 12h light-dark cycle

316

51142 (Cyanothece) data from Stockel et al.8 was processed using the IRON software

317

tool46. The log2 ratio of expression levels at each time point to the pooled samples

318

was then calculated for each gene and time point. The criteria from Kucho et al.47

319

were used to identify cycling genes in the Synechocystis PCC 6803 (Synechocystis)

320

data set45 and applied to the Cyanothece data set as well. The Synechocystis data

321

used was not normalized in the range of -1 to +1 in order to be consistent with the

ACS Paragon Plus Environment

8, 45.

The Cyanothece ATCC

Page 15 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

322

Cyanothece data from Stockel et al.8. The nif genes qualified as cycling using two of

323

the three criteria from Kucho et al. (i.e., appropriate period length and statistically

324

significant differences between peak and trough expression each day) but the values

325

of the error function described in Kucho et al. were above the specified cutoff. Given

326

the importance of including these genes within the network for identifying

327

regulation on the nif cluster, only the period length and statistically significant

328

differences in peak and trough expression were used to identify cycling genes. This

329

resulted in a set of 1,552 genes identified as cycling in Cyanothece as compared to

330

the 1,423 cycling genes in Synechocystis (Supplementary File S4).

331 332

To identify homolog pairs a bidirectional BLAST search between the two organisms

333

was performed using an e value cutoff of 10-2. A result was determined to be a hit if

334

the alignment length was at least 75% of the length of both the query and target

335

genes, and if the raw score was at least 25% of the self-hit score of the query gene. A

336

homolog pair was identified if both genes had only each other as a hit. A gene was

337

defined as having elevated expression if the log2 ratio at a time point was at least

338

90% of the maximum log2 ratio for that gene. To compare when peak expression

339

times for homolog pairs, elevated expression was then categorized as occurring in

340

the beginning, middle, or final third of the light or dark period.

341 342

3.2 Cluster generation

343

In order to simplify the system before inferring regulatory influences, hierarchical

344

clustering using Euclidean distances was implemented for both organisms. First the

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

345

optimal agglomeration method available in R was identified for each organism using

346

the cophenetic correlation48. For both organisms the unweighted pair group method

347

with arithmetic mean (UPGMA)49 method generated the dendrogram whose

348

cophenetic distances most closely correlated with the previously calculated

349

Euclidean distances. The SD validity index50 was then used to identify the optimal

350

height cutoff for both dendrograms, generating 45 gene clusters for Synechocystis

351

and 221 gene clusters for Cyanothece.

352 353

The transcription factor (TF) clusters were generated in a similar manner. Genes

354

with a regulatory function were first identified using literature sources. Of the 146

355

regulatory genes identified by Singh et al in Synechocystis51, 76 are present in the set

356

of cycling genes of Saha et al.45. The cycling Cyanothece gene set contained 72 genes

357

that were either annotated as having a regulatory function by Stockel et al.8 or were

358

identified as regulators by McDermott et al.22. Dendrograms for these gene sets

359

were then generated using the same UPGMA agglomeration method. The SD validity

360

index was used to identify the optimal height cutoff for Cyanothece, and these 72

361

genes were clustered into 16 different TF clusters. The identified optimal height

362

cutoff for the Synechocystis transcription factors generated only six TF clusters. The

363

subsequently generated network did not adequately capture the expression of the

364

gene clusters, as the percentage of experimental expression change explained by the

365

network (see Network Formulation) ranged from 8.1-56.4% at maximum regulatory

366

influences. A lower sampled height cutoff yielded 17 TF clusters, and was chosen to

367

give a comparable number of TF clusters for the two networks. The members of

ACS Paragon Plus Environment

Page 16 of 33

Page 17 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

368

each cluster for both organisms are listed in Supplemental File S5. All clustering and

369

cluster analysis was performed using the R language along with the cluster and

370

clusterCrit packages52-54

371 372

3.3 Network elucidation using optimization formulation

373

The network of regulatory influences for each organism was generated by

374

iteratively determining for each gene cluster the maximum amount of experimental

375

expression change that can be accounted for a given number of regulatory

376

influences. J, K, and T denote the total number of gene clusters, TF clusters, and time

377

points, respectively. The experimental expression level of gene cluster j at time

378

point t is represented as Xjt, and the experimental expression level of TF cluster k at

379

time point t is denoted as TFkt. Ckj is the interaction coefficient that describes the

380

influence of TF cluster k on gene cluster j.

381 382

Gene expression was assumed to be a linear additive contribution of the set of

383

influencing TFs. This can be postulated for each gene cluster j and time point t as

384

shown in equation 1.

385

 =     ∀  ∈  & ∀  ∈



(1)



386

Backward finite difference was used to approximate the rate of expression term (i.e.,

387

left-hand side) of equation 1 in equation 3 in the mixed integer linear program

388

formulation shown below. Solving this formulation identifies the interaction

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

389

coefficients (Ckj) that minimize the amount of change in gene cluster expression not

390

explained by the regulatory influences. The formulation was solved iteratively for

391

every gene cluster and a range of maximum regulatory influences.  

 E       +  (2)    D    D D  s. t.  D (3)D ∀  ∈   −  − '     =  −  ∀  ∈ {1, … , − 1}   D ≤  ≤ ./ ∀1 ∈ 2 (4) −./     D  D   /  ≤ max regulatory influences (5)D  C 392

The slack variables  and  represent the positive and negative deviations from

393

experimental measurements. Equations 4 and 5 use the binary variables, ykj, to

394

restrict the number of regulatory influences per gene cluster. If a transcription

395

factor in TF cluster k is in gene cluster j then a regulatory interaction will be

396

imposed for that TF cluster gene cluster pair by fixing ykj to one for that pair.

397 398

In order to determine the number of regulatory influences per cluster, the

399

percentage of experimental expression change accounted for by the network was

400

calculated for the range from one to eleven regulatory influences for both

401

organisms. For a given number of regulatory influences this percentage of explained

402

expression variance is defined as the difference between the error (i.e. the sum of

403

the slack variables) with zero regulatory influences and the given number of

404

regulatory influences, divided by the error with zero influences. The number of

405

regulatory influences was chosen for each cluster as the minimum number of

ACS Paragon Plus Environment

Page 18 of 33

Page 19 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

406

influences that brought the percentage of explained expression change above 50%.

407

The percentage of explained expression change for each cluster at every number of

408

regulatory influences can be found in Supplemental File S6.

409

Author Contributions

410

TJM created and analyzed regulatory networks. EAW processed Cyanothece data.

411

TJM and CDM wrote the paper. All authors read and approved the final manuscript.

412

Acknowledgements

413

We thank Fuzhong Zhang and Tae Seok Moon for discussions during the writing of

414

the manuscript. Supported by funding from the National Science Foundation, grant

415

MCB-1331194.

416

Conflict of Interest

417

The authors declare that they have no conflict of interest.

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

418

Tables

419

Table 1: Statistics for both regulatory networks

Page 20 of 33

Synechocystis

Cyanothece

Genes

1,423

1,552

Gene clusters

45

221

TFs

76

72

TF clusters

17

16

Average degree

5

2.23

Average % explained

54.25%

60.70%

expression change 420 421

ACS Paragon Plus Environment

Page 21 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

422

ACS Synthetic Biology

Figure Captions

423 424

Figure 1: Parts of the regulatory networks for A) Synechocystis and B) Cyanothece.

425

Edges are colored based on whether the predicted interaction is activating (green)

426

or inhibiting (red). A subset of the gene clusters containing nif genes is shown

427

above, all regulatory influences on the nif genes are listed in Supplemental File S2.

428

Dashed lines indicate edges with associated literature support discussed in the

429

Results section.

430 431

Figure 2: Distribution of homolog pairs in which one peaks in the light period and

432

one peaks in the dark between the different subsystems. Subsystem annotations

433

from Saha et al. were used45. *“DNA replication and other processes” refers to the

434

subsystem “DNA replication, restriction, modification, recombination, and repair”.

435 436

Figure 3: Comparison of candidate native transcription factor expression profiles in

437

Synechocystis (lexA: black line, rre34: cyan line) with those of the nif cluster in

438

Cyanothece (dashed lines). The dark periods of the light-dark cycle (hours 0-12 and

439

24-36) are represented with gray shading.

440 441

Figure 4: All proposed regulatory changes. Genes native to Synechocystis are in

442

green; those from Cyanothece are in orange. Proposed regulatory changes are

443

represented as dashed lines. A) Proposed strategy for control of the nif cluster in

444

Synechocystis using lexA and patB. B) Strategies for control of glycogen catabolism

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

445

(horizontal stripes) and cyanophycin synthesis (vertical stripes) genes in

446

Synechocystis using rcp1.

447 448

Figure 5: Metabolic processes requiring modified regulation in a diazotrophic

449

Synechocystis. A) Expression profiles of rcp1 (solid black line) and glycogen

450

phosphorylase (solid brown line) in Synechocystis, and glycogen phosphorylase

451

(dashed brown line) in Cyanothece. B) Expression profiles of rcp1 (solid black line)

452

and cycling cyanophycin synthesis genes (solid lines, colors corresponding to C) in

453

Synechocystis, and the expression profiles of the cycling cyanophycin synthesis

454

genes in Cyanothece (dashed lines). Sll0573 and slr2002 are not shown, as they were

455

not identified as cycling. The dark periods of the light-dark cycle (hours 0-12 and

456

24-36) are represented with gray shading. C) Reactions and genes identified by the

457

iSyn731 and iCyt773 genome-scale models as being involved in cyanophycin

458

synthesis. Reactions with associated genes whose elevated expression times differ

459

between the two organisms are color-coded.

460

ACS Paragon Plus Environment

Page 22 of 33

Page 23 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

461

Supporting Information

462

S1 (.xlsx): Regulatory influences for each gene cluster for both networks

463

S2 (.docx): Annotation, gene cluster, and regulatory influences for the cycling genes

464

within the nif cluster.

465

S3 (.docx): Venn Diagram depicting those transcription factors that are influences

466

for nif genes and have cycling homologs in Synechocystis.

467

S4 (.xlsx): Transcriptomic log2 ratios of cycling Cyanothece genes and cycling

468

Synechocystis genes

469

S5 (.xlsx): Members of each gene and TF cluster for Synechocystis and Cyanothece

470

S6 (.xlsx): Percentage of explained expression change for each gene cluster for 1-10

471

TF influences

472 473

References

474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493

[1] Tilman, D., Cassman, K. G., Matson, P. A., Naylor, R., and Polasky, S. (2002) Agricultural sustainability and intensive production practices, Nature 418, 671-677. [2] Peoples, M. B., Herridge, D. F., and Ladha, J. K. (1995) Biological NitrogenFixation - an Efficient Source of Nitrogen for Sustainable Agricultural Production, Plant Soil 174, 3-28. [3] Dobereiner, J. (1997) Biological nitrogen fixation in the tropics: Social and economic contributions, Soil Biol Biochem 29, 771-774. [4] Gage, D. J. (2004) Infection and invasion of roots by symbiotic, nitrogen-fixing rhizobia during nodulation of temperate legumes, Microbiol Mol Biol Rev 68, 280-300. [5] Kraiser, T., Gras, D. E., Gutierrez, A. G., Gonzalez, B., and Gutierrez, R. A. (2011) A holistic view of nitrogen acquisition in plants, J Exp Bot 62, 1455-1466. [6] Berman-Frank, I., Lundgren, P., and Falkowski, P. (2003) Nitrogen fixation and photosynthetic oxygen evolution in cyanobacteria, Res Microbiol 154, 157164. [7] Herrero, A., Muro-Pastor, A. M., and Flores, E. (2001) Nitrogen control in cyanobacteria, J Bacteriol 183, 411-425. [8] Stockel, J., Welsh, E. A., Liberton, M., Kunnvakkam, R., Aurora, R., and Pakrasi, H. B. (2008) Global transcriptomic analysis of Cyanothece 51142 reveals robust

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538

diurnal oscillation of central metabolic processes, P Natl Acad Sci USA 105, 6156-6161. [9] Kumar, K., Mella-Herrera, R. A., and Golden, J. W. (2010) Cyanobacterial Heterocysts, Csh Perspect Biol 2. [10] Welsh, E. A., Liberton, M., Stoeckel, J., Loh, T., Elvitigala, T., Wang, C., Wollam, A., Fulton, R. S., Clifton, S. W., Jacobs, J. M., Aurora, R., Ghosh, B. K., Sherman, L. A., Smith, R. D., Wilson, R. K., and Pakrasi, H. B. (2008) The genome of Cyanothece 51142, a unicellular diazotrophic cyanobacterium important in the marine nitrogen cycle, P Natl Acad Sci USA 105, 15094-15099. [11] Saha, R., Verseput, A. T., Berla, B. M., Mueller, T. J., Pakrasi, H. B., and Maranas, C. D. (2012) Reconstruction and Comparison of the Metabolic Potential of Cyanobacteria Cyanothece sp ATCC 51142 and Synechocystis sp PCC 6803, Plos One 7. [12] Feng, X. Y., Bandyopadhyay, A., Berla, B., Page, L., Wu, B., Pakrasi, H. B., and Tang, Y. J. (2010) Mixotrophic and photoheterotrophic metabolism in Cyanothece sp ATCC 51142 under continuous light, Microbiol-Sgm 156, 25662574. [13] Kaneko, T., Sato, S., Kotani, H., Tanaka, A., Asamizu, E., Nakamura, Y., Miyajima, N., Hirosawa, M., Sugiura, M., Sasamoto, S., Kimura, T., Hosouchi, T., Matsuno, A., Muraki, A., Nakazaki, N., Naruo, K., Okumura, S., Shimpo, S., Takeuchi, C., Wada, T., Watanabe, A., Yamada, M., Yasuda, M., and Tabata, S. (1996) Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions, DNA Res 3, 109136. [14] Mitschke, J., Georg, J., Scholz, I., Sharma, C. M., Dienst, D., Bantscheff, J., Voss, B., Steglich, C., Wilde, A., Vogel, J., and Hess, W. R. (2011) An experimentally anchored map of transcriptional start sites in the model cyanobacterium Synechocystis sp. PCC6803, Proc Natl Acad Sci U S A 108, 2124-2129. [15] Garcia-Dominguez, M., Reyes, J. C., and Florencio, F. J. (2000) NtcA represses transcription of gifA and gifB, genes that encode inhibitors of glutamine synthetase type I from Synechocystis sp PCC 6803, Mol Microbiol 35, 11921201. [16] Gutekunst, K., Phunpruch, S., Schwarz, C., Schuchardt, S., Schulz-Friedrich, R., and Appel, J. (2005) LexA regulates the bidirectional hydrogenase in the cyanobacterium Synechocystis sp PCC 6803 as a transcription activator, Mol Microbiol 58, 810-823. [17] Li, H., and Sherman, L. A. (2000) A redox-responsive regulator of photosynthesis gene expression in the cyanobacterium Synechocystis sp strain PCC 6803, J Bacteriol 182, 4268-4277. [18] Lemeille, S., Latifi, A., and Geiselmann, J. (2005) Inferring the connectivity of a regulatory network from mRNA quantification in Synechocystis PCC6803, Nucleic Acids Res 33, 3381-3389. [19] Aryal, U. K., Stockel, J., Krovvidi, R. K., Gritsenko, M. A., Monroe, M. E., Moore, R. J., Koppenaal, D. W., Smith, R. D., Pakrasi, H. B., and Jacobs, J. M. (2011)

ACS Paragon Plus Environment

Page 24 of 33

Page 25 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584

ACS Synthetic Biology

Dynamic proteomic profiling of a unicellular cyanobacterium Cyanothece ATCC51142 across light-dark diurnal cycles, Bmc Syst Biol 5. [20] Mueller, T. J., Berla, B. M., Pakrasi, H. B., and Maranas, C. D. (2013) Rapid construction of metabolic models for a family of Cyanobacteria using a multiple source annotation workflow, Bmc Syst Biol 7. [21] Bandyopadhyay, A., Elvitigala, T., Welsh, E., Stockel, J., Liberton, M., Min, H. T., Sherman, L. A., and Pakrasi, H. B. (2011) Novel Metabolic Attributes of the Genus Cyanothece, Comprising a Group of Unicellular Nitrogen-Fixing Cyanobacteria, Mbio 2. [22] McDermott, J. E., Oehmen, C. S., McCue, L. A., Hill, E., Choi, D. M., Stockel, J., Liberton, M., Pakrasi, H. B., and Sherman, L. A. (2011) A model of cyclic transcriptomic behavior in the cyanobacterium Cyanothece sp ATCC 51142, Mol Biosyst 7, 2407-2418. [23] Price, G. D., Badger, M. R., Woodger, F. J., and Long, B. M. (2008) Advances in understanding the cyanobacterial CO2-concentrating-mechanism (CCM): functional components, Ci transporters, diversity, genetic regulation and prospects for engineering into plants, J Exp Bot 59, 1441-1461. [24] Lieman-Hurwitz, J., Rachmilevitch, S., Mittler, R., Marcus, Y., and Kaplan, A. (2003) Enhanced photosynthesis and growth of transgenic plants that express ictB, a gene involved in HCO3- accumulation in cyanobacteria, Plant Biotechnol J 1, 43-50. [25] Vermaas, W. F. J. (2001) Photosynthesis and Respiration in Cyanobacteria, In eLS, John Wiley & Sons, Ltd. [26] Li, H., Sherman, D. M., Bao, S. L., and Sherman, L. A. (2001) Pattern of cyanophycin accumulation in nitrogen-fixing and non-nitrogen-fixing cyanobacteria, Arch Microbiol 176, 9-18. [27] Brophy, J. A. N., and Voigt, C. A. (2014) Principles of genetic circuit design, Nat Methods 11, 508-520. [28] Immethun, C. M., Ng, K. M., DeLorenzo, D. M., Waldron-Feinstein, B., Lee, Y. C., and Moon, T. S. (2015) Oxygen-responsive Genetic Circuits Constructed in Synechocystis sp. PCC 6803, Biotechnol Bioeng. [29] Osanai, T., Ikeuchi, M., and Tanaka, K. (2008) Group 2 sigma factors in cyanobacteria, Physiol Plant 133, 490-506. [30] Vega-Palas, M. A., Flores, E., and Herrero, A. (1992) NtcA, a global nitrogen regulator from the cyanobacterium Synechococcus that belongs to the Crp family of bacterial regulators, Mol Microbiol 6, 1853-1859. [31] Liu, D., and Yang, C. (2014) The nitrogen-regulated response regulator NrrA controls cyanophycin synthesis and glycogen catabolism in the cyanobacterium Synechocystis sp. PCC 6803, J Biol Chem 289, 2055-2071. [32] Gaudana, S. B., Krishnakumar, S., Alagesan, S., Digmurti, M. G., Viswanathan, G. A., Chetty, M., and Wangikar, P. P. (2013) Rhythmic and sustained oscillations in metabolism and gene expression of Cyanothece sp ATCC 51142 under constant light, Front Microbiol 4. [33] Bradley, R. L., and Reddy, K. J. (1997) Cloning, sequencing, and regulation of the global nitrogen regulator gene ntcA in the unicellular diazotrophic cyanobacterium Cyanothece sp. strain BH68K, J Bacteriol 179, 4407-4410.

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629

[34] Marin, K., Suzuki, L., Yamaguchi, K., Ribbeck, K., Yamamoto, H., Kanesaki, Y., Hagemann, M., and Murata, N. (2003) Identification of histidine kinases that act as sensors in the perception of salt stress in Synechocystis sp PCC 6803, P Natl Acad Sci USA 100, 9061-9066. [35] Kanesaki, Y., Yamamoto, H., Paithoonrangsarid, K., Shoumskaya, M., Suzuki, I., Hayashi, H., and Murata, N. (2007) Histidine kinases play important roles in the perception and signal transduction of hydrogen peroxide in the cyanobacterium, Synechocystis sp PCC 6803, Plant J 49, 313-324. [36] Niyogi, K. K. (1999) Photoprotection revisited: Genetic and molecular approaches, Annu Rev Plant Phys 50, 333-359. [37] Oliveira, P., and Lindblad, P. (2009) Transcriptional regulation of the cyanobacterial bidirectional Hox-hydrogenase, Dalton T, 9990-9996. [38] Summerfield, T. C., Nagarajan, S., and Sherman, L. A. (2011) Gene expression under low-oxygen conditions in the cyanobacterium Synechocystis sp. PCC 6803 demonstrates Hik31-dependent and -independent responses, Microbiol-Sgm 157, 301-312. [39] Jones, K. M., Buikema, W. J., and Haselkorn, R. (2003) Heterocyst-specific expression of patB, a gene required for nitrogen fixation in Anabaena sp strain PCC 7120, J Bacteriol 185, 2306-2314. [40] Khudyakov, I. Y., and Golden, J. W. (2001) Identification and inactivation of three group 2 sigma factor genes in Anabaena sp strain PCC 7120, J Bacteriol 183, 6667-6675. [41] Yeh, K. C., Wu, S. H., Murphy, J. T., and Lagarias, J. C. (1997) A cyanobacterial phytochrome two-component light sensory system, Science 277, 1505-1508. [42] Garcia-Dominguez, M., Muro-Pastor, M. I., Reyes, J. C., and Florencio, F. J. (2000) Light-dependent regulation of cyanobacterial phytochrome expression, J Bacteriol 182, 38-44. [43] Berg, H., Ziegler, K., Piotukh, K., Baier, K., Lockau, W., and Volkmer-Engert, R. (2000) Biosynthesis of the cyanobacterial reserve polymer multi-L-arginylpoly-L-aspartic acid (cyanophycin) - Mechanism of the cyanophycin synthetase reaction studied with synthetic primers, Eur J Biochem 267, 55615570. [44] Flores, E., and Herrero, A. (2005) Nitrogen assimilation and nitrogen control in cyanobacteria, Biochem Soc T 33, 164-167. [45] Saha, R., Liu, D., Hoynes-O'Connor, A., Liberton, M., Yu, J., Bhattacharyya, M., Balassy, A., Zhang, F., Moon, T., Maranas, C. D., and Pakrasi, H. B. (2015) Diurnal Processes in the Cyanobacterium Synechocystis sp. PCC 6803: Insights from Transcriptomic, Fluxomic and Physiological Analyses. [46] Welsh, E. A., Eschrich, S. A., Berglund, A. E., and Fenstermacher, D. A. (2013) Iterative rank-order normalization of gene expression microarray data, BMC Bioinformatics 14, 153. [47] Kucho, K., Okamoto, K., Tsuchiya, Y., Nomura, S., Nango, M., Kanehisa, M., and Ishiura, M. (2005) Global analysis of circadian expression in the cyanobacterium Synechocystis sp strain PCC 6803, J Bacteriol 187, 21902199.

ACS Paragon Plus Environment

Page 26 of 33

Page 27 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647

ACS Synthetic Biology

[48] Halkidi, M., Batistakis, Y., and Vazirgiannis, M. (2001) On clustering validation techniques, J Intell Inf Syst 17, 107-145. [49] Sneath, P. H. A., and Sokal, R. R. (1973) Numerical taxonomy; the principles and practice of numerical classification, W. H. Freeman, San Francisco,. [50] Halkidi, M., Vazirgiannis, M., and Batistakis, Y. (2000) Quality Scheme Assessment in the Clustering Process, In Principles of Data Mining and Knowledge Discovery (Zighed, D., Komorowski, J., and Żytkow, J., Eds.), pp 265-276, Springer Berlin Heidelberg. [51] Singh, A. K., Elvitigala, T., Cameron, J. C., Ghosh, B. K., Bhattacharyya-Pakrasi, M., and Pakrasi, H. B. (2010) Integrative analysis of large scale expression profiles reveals core transcriptional response and coordination between multiple cellular processes in a cyanobacterium, Bmc Syst Biol 4. [52] R Core Team. (2014) R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna Austria. [53] Martin Maechler, Peter Rousseeuw, Anja Struyf, Mia Hubert, and Hornik, K. (2015) cluster: Cluster Analysis Basics and Extensions. [54] Desgraupes, B. (2014) clusterCrit: Clustering Indices.

648

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

649

For Table of Contents Use Only

650

Title: “Identifying regulatory changes to facilitate nitrogen fixation in the non-

651

diazotroph Synechocystis sp. PCC 6803”

652

Authors: Thomas J. Mueller, Eric A. Welsh, Himadri B. Pakrasi, and Costas D.

653

Maranas

654 655

Table of Contents Figure:

656

ACS Paragon Plus Environment

Page 28 of 33

Page 29 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

Figure 1: Parts of the regulatory networks for A) Synechocystis and B) Cyanothece. Edges are colored based on whether the predicted interaction is activating (green) or inhibiting (red). A subset of the gene clusters containing nif genes is shown above, all regulatory influences on the nif genes are listed in Supplemental File S2. Dashed lines indicate edges with associated literature support discussed in the Results section. 157x139mm (300 x 300 DPI)

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 2: Distribution of homolog pairs in which one peaks in the light period and one peaks in the dark between the different subsystems. Subsystem annotations from Saha et al. were used45. *“DNA replication and other processes” refers to the subsystem “DNA replication, restriction, modification, recombination, and repair”. 84x155mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 30 of 33

Page 31 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

Figure 3: Comparison of candidate native transcription factor expression profiles in Synechocystis (lexA: black line, rre34: cyan line)) with those of the nif cluster in Cyanothece (dashed lines). The dark periods of the light-dark cycle (hours 0-12 and 24-36) are represented with gray shading. 84x63mm (300 x 300 DPI)

ACS Paragon Plus Environment

ACS Synthetic Biology

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

Figure 4: All proposed regulatory changes. Genes native to Synechocystis are in green; those from Cyanothece are in orange. Proposed regulatory changes are represented as dashed lines. A) Proposed strategy for control of the nif cluster in Synechocystis using lexA and patB. B) Strategies for control of glycogen catabolism (horizontal stripes) and cyanophycin synthesis (vertical stripes) genes in Synechocystis using rcp1. 177x109mm (300 x 300 DPI)

ACS Paragon Plus Environment

Page 32 of 33

Page 33 of 33

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

ACS Synthetic Biology

Figure 5: Metabolic processes requiring modified regulation in a diazotrophic Synechocystis. A) Expression profiles of rcp1 (solid black line) and glycogen phosphorylase (solid brown line) in Synechocystis, and glycogen phosphorylase (dashed brown line) in Cyanothece. B) Expression profiles of rcp1 (solid black line) and cycling cyanophycin synthesis genes (solid lines, colors corresponding to C) in Synechocystis, and the expression profiles of the cycling cyanophycin synthesis genes in Cyanothece (dashed lines). Sll0573 and slr2002 are not shown, as they were not identified as cycling. The dark periods of the light-dark cycle (hours 0-12 and 24-36) are represented with gray shading. C) Reactions and genes identified by the iSyn731 and iCyt773 genome-scale models as being involved in cyanophycin synthesis. Reactions with associated genes whose elevated expression times differ between the two organisms are color-coded. 177x110mm (300 x 300 DPI)

ACS Paragon Plus Environment