Synthetic Transcription Factors Switch from Local to Long-Range Control during Cell Differentiation


Letter

Synthetic transcription factors switch from local to long-range control during cell differentiation Takeo Wada, Sandrine Wallerich, and Attila Becskei ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.8b00369 • Publication Date (Web): 09 Jan 2019 Downloaded from http://pubs.acs.org on January 10, 2019


ACS Synthetic Biology

Abstract Graphic 80x40mm (200 x 200 DPI)

ACS Paragon Plus Environment


Synthetic transcription factors switch from local to long-range control during cell differentiation

Takeo Wada, Sandrine Wallerich & Attila Becskei*

Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland

*Corresponding author: [email protected]

ABSTRACT

Genes, including promoters and enhancers, are regulated by short- and long-range interactions in higher eukaryotes. It is unclear how mammalian gene expression subject to such combinatorial regulation can be controlled by synthetic transcription factors (TFs). Here, we studied how synthetic TALE transcriptional activators and repressors affect the expression of genes in a gene array during cellular differentiation. The protocadherin gene array is silent in mouse embryonic stem (ES) and neuronal progenitor cells. The TALE transcriptional activator recruited to a promoter specifically activates the target gene in ES cells. Upon differentiation into neuronal progenitors, the transcriptional regulatory logic changes: the same activator behaves like an enhancer, activating distant genes in a correlated, stochastic fashion. The long-range effect is reflected by alterations in CpG methylation. Our findings reveal the limits of precision and the opportunities in the control of gene expression for TF-based therapies in cells of various differentiation stages.

KEYWORDS: gradient, CTCF, epigenetic, synthetic biology, hydroxymethylation, neuron.

The control of gene expression is of major biotechnological relevance; synthetic transcription factors have been widely used to examine transcriptional and posttranscriptional regulation and to correct pathological gene expression [1-6]. Repressors have been used to suppress the expression of aberrant genes, and activators can rescue promoter aberrations by enhancing the expression of genes or their substitutes [7, 8]. Designer transcription factors are increasingly popular in these and similar applications because they can be targeted to arbitrary DNA sequences [9-11]. Transcription activator-like effectors (TAL effectors, TALEs) and the RNA-guided clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein Cas9 can both target arbitrary sequences and control gene expression when fused to activator and repressor domains. While a gene can be efficiently repressed by both TALEs and CRISPR/dCas9, only TALEs have been shown to be able to efficiently activate gene expression [12].

Transcription in higher eukaryotes is controlled jointly by transcription factors bound to the promoters directly upstream of the coding region of a gene and by distant enhancers, which can be located megabases away from the promoter [13, 14]. Most TALE-based transcription factors have been targeted to promoters, but their recruitment to enhancers can also lead to the repression or amplification of enhancer-specific transcriptional programs [15].

The long-range interactions between promoters, enhancers and other regulatory sequences can be viewed as a regulatory landscape with multiple configurations and outcomes, the choice among which is influenced by the cellular differentiation state or by external stimuli. In such regulatory landscapes, it is challenging to predict the effect of designer transcription factors. Here, we aimed to study how the site-specific recruitment of TALE activators and repressors affects the expression of a chromosomal gene segment that is embedded in a chromosomal regulatory landscape. To this end, we selected the protocadherin (Pcdh) gene cluster, a prototypical tandem array, which consists of genes with similar sequences. This homogeneity enabled us to study the long-range effects of transcription factors in a long chromosomal segment.

RESULTS

Transcriptional activators switch from specific short-range to long-range activation during differentiation

To study how transcription factors control gene expression in a chromosomal segment, we targeted activators and repressors linked to TALE DNA-binding domains (TALE-A and TALE-R, respectively) to specific Pcdh isoforms. These TALE-based transcription factors were expressed under the control of the constitutive CAG promoter in a chromosomally integrated construct. First, we examined activators, which are fusion proteins of TALEs and the VP160 transcriptional activation domains (see Methods). By composing the TALE from single-nucleotide-specific domains [10], we targeted them to the promoters of chosen isoforms in the Pcdh-α cluster. The α-cluster in mouse is a gene array that comprises 14 gene isoforms, Pcdh-α1 to -α12 and -αC1 to -αC2 [16, 17]. We designed TALE activators that bind to the Pcdh-α6 and -α11 promoters (TALE-A-α6 and TALE-A-α11) (Figure 1A). The simultaneous targeting of multiple TALEs to a single gene can markedly increase the expression of the target gene [18]. However, we wanted to examine the effect of a single, well-defined binding event, and thus screened integrants with a single TALE-A (see SI Methods). We chose the TALE-As that most efficiently induce the expression of the target genes. We designed three TALE-As to recognize different parts of the Pcdh-α6 promoter sequence, but only a single construct resulted in high expression of the target gene (Figure S1).

The α-cluster in the protocadherin array is inactive in embryonic stem (ES) cells and neuronal progenitors (NP), and it is expressed only in neurons [19]. Therefore, we studied how TALE activators affect expression in ES and NP cells, in which the Pcdh-α cluster is inactive. The NP cells were differentiated in vitro from ES cells exposed to retinoic acid (Figure 1B). We measured the expression of Oct4, Pax6 and Synaptophysin as marker genes of ES cells, NPs and neurons, respectively (Figure S2A). The proportion of cells that express Synaptophysin reaches a maximum 4 to 7 days after the dissociation of the embryoid bodies (Figure S2B). We analyzed the expression of the Pcdh-α cluster at the single-cell level because expression is stochastic with a marked tendency to a binary response, implying that cells in a population either do not express a given gene at all or express it fully. Thus, we measured the frequency (percentage) of cells that express a specific gene isoform.

TALE-A-α6 and TALE-A-α11 activated the expression of their target genes in ES cells specifically and efficiently: 45% and 41% of the cells were Pcdh-α6+ and -α11+, respectively (Figure 1C, D). Off-target isoforms were not expressed or only minimally expressed.

We observed a marked shift in the expression pattern when ES cells were differentiated into NP cells: while the expression of the target gene did not change notably, a large number of off-target isoforms were also expressed (Figures 1E, F and S2C). The frequency of expression of the off-target genes was particularly high in TALE-A-α11 cells.

We wanted to assess whether the off-target genes may be activated directly by the TALE-A-α11 activator and performed a genome-wide search for similar sequences. Within the protocadherin array, the sequence most similar to the TALE-A-α11 target sequence is found in the β-cluster (Figure S3A). A 16 bp long sequence in the Pcdh-β19 promoter is identical to a continuous stretch of the 20 bp long target sequence of TALE-A-α11. In pairwise alignments of the target sequence and each promoter region of the Pcdh-α isoforms, the best matches were retrieved from the Pcdh-α1 and -α3 promoter sequences. In these two sequences, there is a 15 bp identity with the TALE-A-α11 target, and the matches in the alignment are interrupted by multiple mismatches. Thus, the likelihood of the activation of these genes is lower than that of the Pcdh-β19 gene.
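The notion of a "continuous stretch" of identity used above can be illustrated with a small sketch. This is not the alignment procedure of the study (which relied on ClustalW and BLAST); it is a minimal, same-strand search for the longest stretch of a target sequence that occurs verbatim in a promoter. The function name and the example sequences are hypothetical and serve only as an illustration.

```python
def longest_shared_stretch(target: str, promoter: str) -> tuple[int, str]:
    """Return (length, sequence) of the longest contiguous stretch of
    `target` that occurs verbatim in `promoter` (same strand only)."""
    target, promoter = target.upper(), promoter.upper()
    best = ""
    # For every start position, only test substrings longer than the
    # current best and extend them while they still occur in the promoter.
    for start in range(len(target)):
        end = start + len(best) + 1
        while end <= len(target) and target[start:end] in promoter:
            best = target[start:end]
            end += 1
    return len(best), best


# Made-up sequences, not the real TALE target or Pcdh promoters.
print(longest_shared_stretch("ACGTACGTAC", "TTTGTACGTAGGG"))  # (7, 'GTACGTA')
```

A longer uninterrupted stretch (such as the 16 bp reported for Pcdh-β19) would rank above a match of similar length that is broken up by mismatches, which is the reasoning applied in the paragraph above.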

So far, we quantified the frequency of ON cells in a population of sorted single cells (Figure 1). To assess the expression of the potential off-targets identified by the sequence alignments, we measured gene expression in bulk populations comprising approximately 50,000 cells. In this way, even low expression can be conveniently quantified. Furthermore, we constructed a cascade in which the expression of TALE-A-α11 was controlled by the TET (rtTA) system, so that expression of TALE-A-α11 can be induced by doxycycline. By titrating the doxycycline dose, a dose-response relation can be assessed. Neither Pcdh-β19 nor the upstream Pcdh-β18 was activated by TALE-A-α11 in ES and NP cells (Figures S3 and S4). Furthermore, no expression of the off-target candidates (Pcdh-α1 and -α3) was detected even at the highest expression level of TALE-A-α11 in ES cells (Figure 2A).

The frequency of the ON cells expressing the target genes of the TALE activators does not increase during differentiation (Figure 1). These findings suggest that the expression of the off-target genes in NP cells is triggered not by the binding of TALEs to the off-target genes but by some form of activity gradient around the target gene.

Formation of an activity gradient around transcriptional activators

The expression of off-target genes appears to decline with the distance from the target gene (Figures 1E, F and S2C), and we hypothesized the existence of a bilateral activity gradient flanking the TALE recognition site in NP cells. To quantify how Pcdh-α expression varies with the distance from the TALE target gene, we fitted two models, which incorporate either a simple gradient (equation (1)) or an asymmetric gradient (equation (2)) flanking the TALE target (Figure 2B). The asymmetric gradient was motivated by the observation that the target gene was expressed particularly strongly while the isoform directly downstream of it was expressed rather weakly. Specifically, Pcdh-α7 was considerably more expressed in TALE-A-α11 cells than in TALE-A-α6 cells, despite the larger distance to the target gene (Figure S2C). Similarly, the expression of α12 was higher in TALE-A-α6 cells than in TALE-A-α11 cells. Therefore, the expression of the target gene and the gene downstream of it was fitted independently of the other genes in the asymmetric gradient model.

To predict the expression of isoforms in the NP cells, we fitted the activity gradient and multiplied it by the intrinsic expression propensity of each isoform, which we define below. The activity gradient was formulated as an exponentially declining function with its origin at the TALE target gene (Figure 2B). We considered the expression propensity of the isoforms to be proportional to the expression observed in wild-type (control) mature neurons [19].

First, we fitted the gradient models to the expression in the TALE-A-α6 NP cells (Figure 2B, top panel). The simple model was in good agreement with the experimental data, but the asymmetric gradient model fitted the expression of the target gene (α6) and the gene downstream of it (α7) more closely (Figure 2B, bottom panel): Pcdh-α7 is expressed 2.5 times less than expected from the simple gradient model. The decline of the gradient is relatively mild; the half-range of the gradient, i.e. the number of genes over which the gradient declines to half of its initial value, was 4.3. The half-range was 12 for TALE-A-α11 cells (Figure S5), which indicates a broader gradient than in TALE-A-α6 cells.
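To make the half-range concrete, the following is a minimal sketch of a symmetric exponential gradient parameterized directly by its half-range. Equations (1) and (2) are not reproduced in this section, so the functional form used here, amplitude × 2^(-|d|/h) multiplied by the isoform propensity, is an assumption that merely agrees with the verbal description (exponential decline, origin at the target gene, multiplication by the intrinsic propensity). All names and the propensity value are hypothetical. The sketch does not cover the asymmetric model, in which the target gene and its downstream neighbor are fitted separately.

```python
def gradient_activity(distance_in_genes: float, amplitude: float,
                      half_range: float) -> float:
    """Exponentially declining activity at a given distance (in genes)
    from the TALE target; halves every `half_range` genes (assumed form)."""
    return amplitude * 2.0 ** (-abs(distance_in_genes) / half_range)


def predicted_expression(isoform_index: int, target_index: int,
                         amplitude: float, half_range: float,
                         propensity: float) -> float:
    """Gradient activity at the isoform multiplied by its intrinsic
    expression propensity (assumed to be supplied by the caller)."""
    distance = isoform_index - target_index
    return gradient_activity(distance, amplitude, half_range) * propensity


# With the reported half-ranges, the gradient four genes from the target
# retains more activity when h = 12 than when h = 4.3.
for h in (4.3, 12.0):
    print(h, round(gradient_activity(4, amplitude=1.0, half_range=h), 3))
```

In this parameterization the half-range h relates to an exponential decay constant λ through λ = ln 2 / h, so a larger half-range corresponds to a shallower decline.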

The switch from target-specific activation in ES cells to the broad gradient in NP cells either implies a qualitative shift in regulation or reflects a quantitative shift, in which case a gradient with a limited local effect is already present in ES cells. Indeed, the results with the TALE-A-α11 titration system indicate that the expression declines precipitously around the target gene: the expression of the α9 and α11 isoforms was around 100 times less than that of the target gene (α11) (Figure 2A). Expression was not detected in the upstream cluster (α1-6), whose genes are positioned at a larger distance from the target gene. The intensity of the gradient increased gradually in response to increasing TALE-activator expression when the doxycycline concentration was varied. Thus, these observations strengthen the alternative hypothesis, according to which the gradient is already present in ES cells but is restricted to the vicinity of the target gene, and strengthens and broadens only once the ES cells start to differentiate into NP cells.

The strong activation of expression in the entire cluster in NP cells (Figure 1F) may be a result of self-amplifying processes, such as positive feedback loops in epigenetic processes [20]. Positive feedback loops can result in memory [6]. To test whether there is memory in gene expression, we pre-expressed TALE-A-α11 in ES cells and examined whether this pre-induction affected gene expression 10 days later in NP cells. There was no difference in the expression (Figure S4D, E). This suggests that there is no long-term transcriptional memory that spans successive differentiation stages. Similarly, no memory was observed in ES cells (Figure S4A-C).

Both activators and repressors have asymmetric upstream and downstream effects

To determine whether the asymmetric effect is restricted to activators or is a general feature of transcription factors, including repressors, we built TALE-repressor fusions by incorporating the SID4X repressor domain [21]. We targeted these TALE-Rs to the Pcdh-α3 and -α4 promoters, whose genes are highly expressed in neurons and thus are well suited to study repression at the neuronal stage (Figure 2C). TALE-R-α4 repressed gene expression in neurons; the distribution of the number of expressed isoforms was very similar to that observed with the TALE-A-α6 activator in NP cells (Figures 3A and S6A). The gene cluster was more efficiently repressed by TALE-R-α4 than by TALE-R-α3.

To quantify this asymmetry in the activity gradient in a convenient way, we calculated the relative activation or repression for the two genes upstream of the target gene and for the two downstream genes (see equations (4) and (5)). The upstream genes were more efficiently activated than the downstream genes by both TALE-A-α6 and TALE-A-α11, but this difference is much larger in the case of TALE-A-α11 (Figure 3B).

By calculating the relative repression, we found that the upstream genes were more efficiently repressed than the downstream genes (Figure 3B). Thus, the strong upstream effect is a feature shared by both activators and repressors. This can also be visualized by comparing the expression frequencies in cells expressing an activator and a repressor that yield similar mean numbers of isoforms, which is the case for TALE-A-α6 and TALE-R-α4 (Figure 2C). The isoforms in the downstream part of the cluster are activated and repressed less by the respective TALEs than those in the upstream part. This difference increases with the distance from the target gene.

Next, we analyzed how the activity gradient flanking the TALE-target genes affects single-cell expression. We grouped those TALE-A-α6 cells that expressed the Pcdh-α1 isoform: only one out of eight cells expressed all isoforms in the α1-α6 segment (Figure 3C). Not a single cell in the examined population (N_Hprt = 244) had a continuous expression of all genes between the α6 and α12 isoforms. Thus, the expression of long segments of contiguous genes is the exception rather than the rule. Interestingly, most cells that express either the α1 or the α12 isoform did not express the target gene (α6) at all. Thus, the activity gradient must be viewed in probabilistic terms at the single-cell level: it can induce the expression of only a few genes along the gradient.
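The single-cell grouping described above can be sketched as a simple count: among cells that express an anchor isoform, how many express every isoform of a contiguous segment. The data layout (one set of ON isoform labels per cell), the labels `a1` to `a12` and the toy cells are assumptions for illustration only and are not measured data.

```python
def expresses_all(on_isoforms: set, segment: list) -> bool:
    """True if every isoform of the segment is ON in this cell."""
    return all(isoform in on_isoforms for isoform in segment)


def count_full_segment(cells: list, anchor: str, segment: list) -> tuple:
    """Among cells expressing `anchor`, count those that express the whole
    segment. Returns (n_full_segment, n_anchor_positive)."""
    anchored = [cell for cell in cells if anchor in cell]
    n_full = sum(expresses_all(cell, segment) for cell in anchored)
    return n_full, len(anchored)


# Toy example (made-up cells): each set lists the isoforms that are ON.
segment = [f"a{i}" for i in range(1, 7)]  # a1 ... a6
toy_cells = [
    {"a1", "a2", "a3", "a4", "a5", "a6"},
    {"a1", "a3", "a6"},
    {"a1", "a12"},
    {"a6", "a7"},
]
print(count_full_segment(toy_cells, "a1", segment))  # (1, 3)
```

Applied to real data, a result such as one full-segment cell among eight anchor-positive cells would correspond to the observation reported for the α1-α6 segment.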

To characterize this activity gradient at the single-cell level, we calculated the correlation matrix from the expression of the isoforms in single cells (Figure 3D). In WT neurons, only a few pairwise correlation coefficients were above 0.2 (r_S > 0.2). For the NP cells containing TALE-A-α6, we observed correlation coefficients higher than 0.2 for all neighboring isoform pairs upstream of α6, which indicates the presence of a stochastic activity gradient upstream of the target gene. High correlation was also observed between the upstream (α1-α5) and downstream (α10-αC1) genes.
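A pairwise correlation matrix of this kind can be sketched as follows. The subscript in r_S is read here as Spearman's rank correlation; that reading, the function names and the input layout (one expression vector per cell, one entry per isoform) are assumptions rather than details confirmed by the text. The sketch uses only the standard library and handles tied values by average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Ranks from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman_matrix(cells):
    """cells: list of per-cell expression vectors (one value per isoform).
    Returns the isoform-by-isoform Spearman correlation matrix."""
    ranked_columns = [average_ranks(list(col)) for col in zip(*cells)]
    return [[pearson(a, b) for b in ranked_columns] for a in ranked_columns]
```

Isoform pairs could then be flagged by comparing each off-diagonal entry with the 0.2 cutoff mentioned above; an isoform that is OFF (or ON) in every cell yields an undefined coefficient, returned here as NaN.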

Both activators and repressors introduce marked correlations between the expression of specific isoforms in single cells, which is also evident from the high co-occurrence of gene expression in the Pcdh-α cluster in the presence of activators and repressors (Figure S6B).

The transcriptional activity gradient is mirrored by decreased CpG methylation

To characterize the molecular mechanisms underlying the activity gradient and to explain why the stochastic correlations are more prominent upstream of the target gene, we measured the CpG methylation of the Pcdh-α promoters. CpG methylation is thought to block the binding of the CCCTC-binding factor, CTCF, to the promoter, preventing gene activation [17]. CTCF is a main mediator of DNA looping [22]. The CTCF-mediated interaction of the promoter of a specific isoform with the HS5-1 enhancer, which is positioned downstream of the α-cluster, is thought to play an important role in the stochastic promoter choice [23].

In NP cells and neurons, the promoters of most isoforms displayed moderate methylation (20-30%) (Figures 4A, S7A). Interestingly, TALE-A-α6 reduced the CpG methylation of most but not all Pcdh-α promoters in NP cells. The degree by which methylation was reduced by TALE-A-α6 is in very good agreement with the increase in the expression of the respective isoforms (Figure 4B; the agreement is indicated by the overlap of red and yellow colors, seen as an orange coloration). There were two exceptions to this agreement: the methylation did not change significantly in the target promoter (α6), and it increased considerably in the promoter of the downstream gene (α7) in response to the activator. These two exceptions correspond to the unconstrained genes in the asymmetric gradient model: the target gene (α6), which has a higher than expected expression, and the downstream gene (α7), which has a lower than expected expression. Thus, the CpG methylation pattern confirms the assumptions of the asymmetric gradient model and suggests that the target and the downstream genes are regulated independently of the methylation, whereas the long-range upstream effect of the transcriptional activator is mediated by CpG methylation.

CpG methylation affects the binding of CTCF to the promoters, which in turn controls gene expression. We measured CTCF binding with chromatin immunoprecipitation (ChIP) (Figure S8). Interestingly, TALE-A-α6 enhances the binding of CTCF to the promoters of the α2-α5 genes, upstream of the target gene, which mirrors the decreased methylation of the respective promoters (Figure 4C). This confirms that the reduced methylation leads to increased CTCF binding to the promoters in the α2-α5 segment, which explains the higher expression and the stochastic correlation of the genes between the α1 and α6 isoforms.

We also wanted to see how methylation is influenced by the TALE-A-α6 activator in ES cells, where it exerts target-specific control without a long-range effect. Most genes, including the upstream genes, did not show altered methylation, and the methylation frequency was close to zero in both WT and TALE-A-α6 cells (Figure 4A, bottom panel). Similarly, the CTCF binding did not display a specific pattern: TALE-A-α6 enhanced the CTCF binding to most isoforms throughout the cluster without a recognizable pattern (Figure S8B). The absence of gradients in methylation and CTCF binding is likely to explain the absence of the long-range effect in ES cells. Unexpectedly, TALE-A-α6 induced the methylation of the target promoter, Pcdh-α6, in ES cells. To explain the unusual increase in methylation upon activation by TALE-A-α6, we measured the hydroxylation of methyl-cytosine, since it has been implicated in gene activation [24-26]. Nearly all of the methylated CpG sites in the Pcdh-α6 promoter were hydroxylated (Figure S7B, C).

DISCUSSION

The term enhancer was coined to distinguish a viral regulatory sequence that activates gene expression at large distances from classic promoter sequences, which are positioned directly upstream of a gene and activate expression at short distances [27, 28]. Our results shift this paradigm, revealing that an activator can cross the boundaries of this classification during cellular differentiation. The activator that acts locally on the target gene in ES cells changes its range of action when the cells differentiate into the neuronal progenitor state (Figure 5). In NP cells, it activates nearly the entire Pcdh-α cluster, showing the features of an enhancer. This change in the regulatory logic coincides with the appearance of CpG methylation at the neuronal progenitor stage (Figure 4A). The activator reduces the methylation, which enhances CTCF binding and increases the probability of the Pcdh genes being expressed (Figure 4B). These epigenetic changes can be described as an activity gradient, in which the TF evokes a strong response in the upstream genes. Different mechanisms may explain this asymmetry. The asymmetry may arise due to CTCF: all CTCF binding sites in the promoters point to an enhancer in the downstream part of the Pcdh-α cluster, and it has been observed that the directionality of the CTCF binding sites strongly influences which combination of genes is activated in the gene array [29].

It is important to note that the asymmetry reflects a bulk effect. At the single-cell level, positive correlations dominate in the TALE-induced bilateral gradient, as evidenced by the positive correlations between the downstream and upstream genes (Figures 3D and S6B).

The methylation patterns of the target and off-target genes follow a different logic. We observed that TALE-A-α6 induces the appearance of 5-hydroxymethylcytosine in the Pcdh-α6 promoter. In this case, the usual inhibitory effect of CpG methylation is masked by hydroxymethylation. Our finding provides a direct link between targeted gene activation and hydroxymethylation, and thereby DNA demethylation. The DNA methylation status in embryonic stem cells changes dynamically through the actions of de novo methyltransferases (Dnmt) and the Tet enzymes [26]. The Pcdh-α promoters become methylated by Dnmt3b during mouse development [17]. Our observation suggests that the methylated Pcdh-α promoters can be demethylated by the Tet enzymes, and that the demethylation process can be triggered by transcriptional activators.

The fact that transcription factors can alternate between local gene-specific control and enhancer-like activity during cell differentiation also has practical implications when gene expression is controlled by TALE- or CRISPR-based activators or repressors, which are frequently employed for therapeutic purposes or to study the effect of regulators on chromosomal conformation and epigenetic changes [2-4, 6-8, 30].

Methods

Cell culturing, cell sorting and qPCR were performed as described in the SI Methods.

Construction of the cells containing the TALE-R and the expression cascade (rtTA and TALE-A-α11)

The repressor domain SID4X was cloned from the Enhanced Repressor Domain for TAL Effector (SID4X)-containing plasmid (Addgene: 43882) [21] to obtain pTW003 and pTW004, plasmids expressing TALE-repressor fusion proteins. Further steps were identical to those used for the activators (SI Methods).

To construct the cascade with the TET system, a DNA fragment expressing rtTA3G was amplified from pLenti CMV rtTA3G Blast (R980-M38-658) (Addgene: 31797) and cloned into the pCAGGS plasmid. A puromycin-resistance gene (Puro^r) transcribed from the SV40 promoter was cloned downstream of the rtTA3G gene. The resulting DNA fragment containing P_CAG-rtTA3G-P_SV40-Puro^r was cloned into the multiple cloning site of the pDonor MCS Rosa26 plasmid (Perez-Pinera et al., 2012) to obtain pTW005. The CAG promoter of the plasmid encoding TALE-A-α11 (pTW002) was replaced by the TRE3G promoter amplified from pLenti CMVTRE3G Puro DEST (w819-1) (Addgene: 27570). Both plasmids were integrated into the Rosa loci in J1 ES cells by electroporation.

Sequence alignment

Pairwise alignments of the TALE-A-α6 or TALE-A-α11 target sequences and Pcdh promoter regions were carried out using ClustalW [31] with the 'msa' R package [32]. Nucleotide sequences in the mouse genome that are similar to the T6 and T11 target sequences, respectively, were searched with Nucleotide BLAST (https://blast.ncbi.nlm.nih.gov/) using the Mouse genomic plus transcript database.

Quantification of gene expression

Cells that expressed the housekeeping gene (Hprt) were included in the further analysis. Cp values were calculated by the system software (LightCycler, Roche). A Cp value of 45 was set (by the machine) as the lowest expression level; hence, 45 - Cp corresponds to the log2 expression value. The Cp values measured for single cells are shown in Data S1-S5 (in the order of WT, TALE-R-α3, TALE-R-α4, TALE-A-α6 and TALE-A-α11). For binary expression, a (log2) threshold of 10 was taken to separate the ON and OFF cells. TaqMan Gene Expression assays were purchased from Applied Biosystems (Pcdhb18, Mm00474538_s1; Pcdhb19, Mm00474560_s1).

292

When cells were pooled upon sorting, no pre-amplification was performed in the qPCR. In this

293

case, RNA was quantified using TaqMan probes and CellsDirect One-step RT-qPCR kit

294

(Invitrogen) with LightCycler 480 (Roche).
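The conversion from Cp values to binary ON/OFF calls described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script. Two points are assumptions: a cell counts as ON only when its log2 value is strictly above the threshold, and the same threshold is used to decide whether Hprt is expressed. The function names and the data layout (one dictionary of Cp values per cell) are hypothetical.

```python
from typing import Dict, List

CP_FLOOR = 45.0      # Cp set by the instrument as the lowest expression level
ON_THRESHOLD = 10.0  # log2 threshold separating ON and OFF cells


def log2_expression(cp: float) -> float:
    """Log2 expression value, computed as 45 - Cp."""
    return CP_FLOOR - cp


def is_on(cp: float) -> bool:
    """Binary call. Using a strict '>' comparison is an assumption."""
    return log2_expression(cp) > ON_THRESHOLD


def expression_frequencies(cells: List[Dict[str, float]],
                           genes: List[str],
                           housekeeping: str = "Hprt") -> Dict[str, float]:
    """Fraction of housekeeping-positive cells in which each gene is ON.

    Applying the ON threshold to the housekeeping gene is an assumption.
    """
    kept = [cell for cell in cells if is_on(cell[housekeeping])]
    if not kept:
        raise ValueError("no housekeeping-positive cells to analyse")
    return {gene: sum(is_on(cell[gene]) for cell in kept) / len(kept)
            for gene in genes}
```

With per-cell Cp values loaded from a table such as Data S1-S5, `expression_frequencies(cells, ["Pcdha6"])` would return the fraction of Hprt+ cells that are ON for that isoform (the gene key is a placeholder).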

Memory Experiment

General growth conditions are described in the SI Methods. High and low expression states were created as initial conditions, termed ON and OFF history conditions, respectively. Memory experiments were performed in ES and NP cells.

In ES cells, the ON history condition was created by adding 0.1 µg/mL doxycycline to the culture, while no inducer was added for the OFF history condition. The cultures were incubated for one day, and the growth medium was then replaced with fresh medium without doxycycline. The cells were cultured for 7 days in growth medium without doxycycline before total RNA was extracted using the RNeasy Plus Mini Kit (Qiagen). Reverse transcription was carried out using an oligo-dT primer and SuperScript III reverse transcriptase (Invitrogen). Quantitative PCR was performed using TaqMan probes and GoTaq Probe qPCR Master Mix (Promega).

In the memory experiment during in vitro differentiation to NP cells, the ON history condition was generated by adding 0.1 µg/mL doxycycline to the ES cell culture at Day -10, while no inducer was added for the OFF history condition. The cultures were incubated for one day, and the growth medium was replaced with fresh medium without doxycycline at Day -9. At Day -8, embryoid body formation was initiated as described in the SI Methods. At Day -1, the embryoid bodies were split and exposed to a range of doxycycline concentrations so that cells with different histories were grown in identical conditions. The embryoid bodies were dissociated at Day 0 and sorted on a FACS Aria III (BD). RNA was quantified from pooled cells.

Titration of TALE-A-α11 expression

ES cells were exposed to a range of doxycycline concentrations for one day before total RNA was extracted using the RNeasy Plus Mini Kit. Reverse transcription was carried out using an oligo-dT primer and SuperScript III reverse transcriptase. Quantitative PCR was performed using TaqMan probes and GoTaq Probe qPCR Master Mix.

Models of the activity gradient flanking an activator

To predict the frequency of Pcdh-α+ cells, $f(\mathrm{TALE},\mathrm{NP})_i$, induced by the TALE activator for each isoform $i$, an activity gradient flanking the TALE target gene was multiplied by the inherent strength of the respective promoter, which was approximated by the frequency of Pcdh expression measured in WT neurons, $f(\mathrm{WT},\mathrm{N1})_i$, for each isoform.

$$f(\mathrm{TALE},\mathrm{NP})_i = c_p \, f(\mathrm{WT},\mathrm{N1})_i \, e^{-\gamma |x_{\mathrm{TALE}} - x_i|} \tag{1}$$

In the above simple gradient model, the gradient is an exponential function of the distance between the TALE target gene and a particular Pcdh-α isoform ($i$), $|x_{\mathrm{TALE}} - x_i|$; the subscript TALE denotes the order number of the Pcdh-α isoform targeted by the TALE. Due to the similarity of the lengths of the isoforms, we equated the gene position with the order number: $x_i = i$. The proportionality constant, $c_p$, and the gradient slope, $\gamma$, were fitted to the measured frequencies.
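The simple gradient model of equation (1) can be sketched in a few lines. This is only an illustration, not the authors' fitting code. It assumes that the exponent uses the absolute distance between order numbers, consistent with the description of the gradient as a function of distance. The coarse grid search is a hypothetical stand-in, since the text does not specify the fitting procedure. All function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def simple_gradient(f_wt: Sequence[float], target: int,
                    c_p: float, gamma: float) -> List[float]:
    """Predicted frequencies c_p * f_WT_i * exp(-gamma * |x_TALE - x_i|).

    Gene positions are the 1-based order numbers (x_i = i); `target` is the
    order number of the isoform bound by the TALE.
    """
    return [c_p * f * math.exp(-gamma * abs(target - i))
            for i, f in enumerate(f_wt, start=1)]


def sum_squared_error(observed: Sequence[float],
                      predicted: Sequence[float]) -> float:
    """Sum of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_grid(f_wt: Sequence[float], observed: Sequence[float], target: int,
             c_p_grid: Sequence[float],
             gamma_grid: Sequence[float]) -> Tuple[float, float]:
    """Return the (c_p, gamma) pair on the grid with the smallest error.

    A grid search is an assumed, simplified fitting procedure.
    """
    best = None
    for c_p in c_p_grid:
        for gamma in gamma_grid:
            err = sum_squared_error(
                observed, simple_gradient(f_wt, target, c_p, gamma))
            if best is None or err < best[0]:
                best = (err, c_p, gamma)
    return best[1], best[2]
```

Because the distance enters as an absolute value, the prediction is symmetric around the target gene whenever the WT frequencies on both sides are equal; deviations from that symmetry in measured data are what the asymmetric variant is meant to capture.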

For the asymmetric gradient, the following equation was used:

$$E(\mathrm{TALE},\mathrm{NP})_i = c_d \, \delta_{x_{\mathrm{TALE}}, x_i} + c_{inh} \, c_p \, E(\mathrm{WT},\mathrm{N1})_i \, e^{-\gamma |x_{\mathrm{TALE}} - x_i|} \tag{2}$$

The downstream inhibitory effect is $c_{inh} = 1 - c_{fi} \, \delta_{x_{\mathrm{TALE}}+1, x_i}$. $\delta_{i,j}$ is the Kronecker delta. $c_d$ reflects the direct transcriptional control of the TALE on the target gene. Thus, these two parameters ($c_{fi}$ and $c_d$) were also fitted in (2) to the experimental data.

Quantification of the variance of downstream and upstream genes by activators or repressors

To quantify the effect of TALEs, the control ratios were calculated. For TALE activators, the ratio is:

$$\varphi_i = \frac{f(\mathrm{TALEA},\mathrm{NP})_i}{f(\mathrm{WT},\mathrm{N1})_i} \tag{3}$$

The geometric mean of the control ratios for two genes upstream of the TALE-target gene was used to quantify the relative activation of the upstream genes:

$$\varphi_{up} = \frac{\sqrt{\varphi_{\mathrm{TALE}-1} \, \varphi_{\mathrm{TALE}-2}}}{\varphi_{\mathrm{TALE}}} \tag{4}$$

The downstream effect was calculated similarly:

$$\varphi_{down} = \frac{\sqrt{\varphi_{\mathrm{TALE}+1} \, \varphi_{\mathrm{TALE}+2}}}{\varphi_{\mathrm{TALE}}} \tag{5}$$

For the TALE repressors, the control ratios are defined as $\varphi_i = f(\mathrm{WT},\mathrm{N1})_i / f(\mathrm{TALER},\mathrm{N1})_i$. The upstream and downstream effects are calculated with equations (4) and (5).
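The asymmetric gradient and the control ratios can be combined in a short worked sketch. Several points are assumptions made for illustration: the expression term E in the asymmetric model is treated as the same frequency f used in the simple gradient model; the direct term is added only at the target gene; the inhibitory factor 1 - c_fi applies only to the first gene downstream of the target; and the upstream and downstream effects are the geometric mean of two control ratios divided by the ratio of the target gene. Function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def asymmetric_gradient(f_wt: Sequence[float], target: int, c_p: float,
                        gamma: float, c_d: float, c_fi: float) -> List[float]:
    """Asymmetric gradient prediction for 1-based gene order numbers."""
    predicted = []
    for i, f in enumerate(f_wt, start=1):
        direct = c_d if i == target else 0.0                 # Kronecker delta term
        c_inh = 1.0 - c_fi if i == target + 1 else 1.0       # downstream inhibition
        predicted.append(
            direct + c_inh * c_p * f * math.exp(-gamma * abs(target - i)))
    return predicted


def control_ratios(numerator: Sequence[float],
                   denominator: Sequence[float]) -> List[float]:
    """Element-wise ratio; pass (TALE-A, WT) for activators and
    (WT, TALE-R) for repressors."""
    return [a / b for a, b in zip(numerator, denominator)]


def flanking_effects(phi: Sequence[float], target: int) -> Tuple[float, float]:
    """Relative effect on the two upstream and the two downstream genes."""
    if target - 2 < 1 or target + 2 > len(phi):
        raise ValueError("target needs two flanking genes on each side")
    t = target - 1  # convert the order number to a 0-based index
    up = math.sqrt(phi[t - 1] * phi[t - 2]) / phi[t]
    down = math.sqrt(phi[t + 1] * phi[t + 2]) / phi[t]
    return up, down


if __name__ == "__main__":
    # Hypothetical, uniform WT frequencies for 12 isoforms; the parameter
    # values are those reported for the asymmetric model in the Figure 2 legend.
    f_wt = [0.3] * 12
    pred = asymmetric_gradient(f_wt, target=6, c_p=0.47, gamma=0.16,
                               c_d=0.17, c_fi=0.52)
    print(flanking_effects(control_ratios(pred, f_wt), target=6))
```

Under these assumptions the upstream effect exceeds the downstream effect whenever c_fi is between 0 and 1, because only the first downstream gene is attenuated by the inhibitory factor.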

Oxidative bisulfite sequencing

Oxidative bisulfite sequencing was carried out according to the method described previously [33]. Briefly, genomic DNA was denatured in 50 mM NaOH at 37°C for 30 min, and oxidation was performed using 15 mM KRuO4 (Alfa Aesar) on ice for 1 hour. Oxidised DNA was purified and then subjected to bisulfite sequencing.

ASSOCIATED CONTENT

Supporting Information

Supplementary Methods, Figures S1-S8, Data Files S1-S5.

ABBREVIATIONS

Pcdh: protocadherin.

AUTHOR CONTRIBUTIONS

A.B. designed the study, and A.B. and T.W. analyzed the data and wrote the paper. A.B. performed the mathematical modelling. T.W. and S.W. performed the experiments.

ACKNOWLEDGEMENTS

We thank Natalia Soshnikova for helpful discussions and Charlotte Simonin, Mélusine Bleu, Sebastian Wenk, Katell Kunin and Xavier Farge for technical help. Cell sorting was carried out by Janine Zankl, Anna Sproll and Stella Stefanova.

REFERENCES

[1] Shimoga, V., White, J. T., Li, Y., Sontag, E., and Bleris, L. (2013) Synthetic mammalian transgene negative autoregulation, Mol Syst Biol 9, 670.
[2] Wijchers, P. J., Krijger, P. H., Geeven, G., Zhu, Y., Denker, A., Verstegen, M. J., Valdes-Quezada, C., Vermeulen, C., Janssen, M., Teunissen, H., Anink-Groenen, L. C., Verschure, P. J., and de Laat, W. (2016) Cause and Consequence of Tethering a SubTAD to Different Nuclear Compartments, Mol Cell 61, 461-473.
[3] Vincent, K. A., Shyu, K. G., Luo, Y., Magner, M., Tio, R. A., Jiang, C., Goldberg, M. A., Akita, G. Y., Gregory, R. J., and Isner, J. M. (2000) Angiogenesis is induced in a rabbit model of hindlimb ischemia by naked DNA encoding an HIF-1alpha/VP16 hybrid transcription factor, Circulation 102, 2255-2261.
[4] Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P. D., Dean, A., and Blobel, G. A. (2012) Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor, Cell 149, 1233-1244.
[5] Baudrimont, A., Voegeli, S., Viloria, E. C., Stritt, F., Lenon, M., Wada, T., Jaquet, V., and Becskei, A. (2017) Multiplexed gene control reveals rapid mRNA turnover, Sci Adv 3, e1700006.
[6] Hsu, C., Jaquet, V., Gencoglu, M., and Becskei, A. (2016) Protein Dimerization Generates Bistability in Positive Feedback Loops, Cell Rep 16, 1204-1210.
[7] Fink, K. D., Deng, P., Gutierrez, J., Anderson, J. S., Torrest, A., Komarla, A., Kalomoiris, S., Cary, W., Anderson, J. D., Gruenloh, W., Duffy, A., Tempkin, T., Annett, G., Wheelock, V., Segal, D. J., and Nolta, J. A. (2016) Allele-Specific Reduction of the Mutant Huntingtin Allele Using Transcription Activator-Like Effectors in Human Huntington's Disease Fibroblasts, Cell Transplant 25, 677-686.
[8] Barbon, E., Pignani, S., Branchini, A., Bernardi, F., Pinotti, M., and Bovolenta, M. (2016) An engineered tale-transcription factor rescues transcription of factor VII impaired by promoter mutations and enhances its endogenous expression in hepatocytes, Sci Rep 6, 28304.
[9] Moore, R., Chandrahas, A., and Bleris, L. (2014) Transcription activator-like effectors: a toolkit for synthetic biology, ACS Synth Biol 3, 708-716.
[10] Boch, J., Scholze, H., Schornack, S., Landgraf, A., Hahn, S., Kay, S., Lahaye, T., Nickstadt, A., and Bonas, U. (2009) Breaking the code of DNA binding specificity of TAL-type III effectors, Science 326, 1509-1512.
[11] Toegel, M., Azzam, G., Lee, E. Y., Knapp, D., Tan, Y., Fa, M., and Fulga, T. A. (2017) A multiplexable TALE-based binary expression system for in vivo cellular interaction studies, Nat Commun 8, 1663.
[12] Gao, X., Tsang, J. C., Gaba, F., Wu, D., Lu, L., and Liu, P. (2014) Comparison of TALE designer transcription factors and the CRISPR/dCas9 in regulation of gene expression by targeting enhancers, Nucleic Acids Res 42, e155.

[13] Benabdallah, N. S. W. I. I. R. S., Boyle, S., Grimes, R., Therizols, T., and Bickmore, W. (2017) PARP mediated chromatin unfolding is coupled to long-range enhancer activation, bioRxiv.
[14] Kulaeva, O. I., Nizovtseva, E. V., Polikanov, Y. S., Ulianov, S. V., and Studitsky, V. M. (2012) Distant activation of transcription: mechanisms of enhancer action, Mol Cell Biol 32, 4892-4897.
[15] Crocker, J., and Stern, D. L. (2013) TALE-mediated modulation of transcriptional enhancers in vivo, Nat Methods 10, 762-767.
[16] Hirayama, T., and Yagi, T. (2017) Regulation of clustered protocadherin genes in individual neurons, Semin Cell Dev Biol.
[17] Toyoda, S., Kawaguchi, M., Kobayashi, T., Tarusawa, E., Toyama, T., Okano, M., Oda, M., Nakauchi, H., Yoshimura, Y., Sanbo, M., Hirabayashi, M., Hirayama, T., Hirabayashi, T., and Yagi, T. (2014) Developmental epigenetic modification regulates stochastic expression of clustered protocadherin genes, generating single neuron diversity, Neuron 82, 94-108.
[18] Perez-Pinera, P., Ousterout, D. G., Brunger, J. M., Farin, A. M., Glass, K. A., Guilak, F., Crawford, G. E., Hartemink, A. J., and Gersbach, C. A. (2013) Synergistic and tunable human gene activation by combinations of synthetic transcription factors, Nat Methods 10, 239-242.
[19] Wada, T., Wallerich, S., and Becskei, A. (2018) Stochastic Gene Choice during Cellular Differentiation, Cell Rep 24, 3503-3512.
[20] Kelemen, J. Z., Ratna, P., Scherrer, S., and Becskei, A. (2010) Spatial epigenetic control of mono- and bistable gene expression, PLoS Biol 8, e1000332.
[21] Cong, L., Zhou, R., Kuo, Y. C., Cunniff, M., and Zhang, F. (2012) Comprehensive interrogation of natural TALE DNA-binding modules and transcriptional repressor domains, Nat Commun 3, 968.
[22] Dekker, J., and Mirny, L. (2016) The 3D Genome as Moderator of Chromosomal Communication, Cell 164, 1110-1121.
[23] Guo, Y., Monahan, K., Wu, H., Gertz, J., Varley, K. E., Li, W., Myers, R. M., Maniatis, T., and Wu, Q. (2012) CTCF/cohesin-mediated DNA looping is required for protocadherin alpha promoter choice, Proc Natl Acad Sci U S A 109, 21081-21086.
[24] Cherepanova, O. A., Gomez, D., Shankman, L. S., Swiatlowska, P., Williams, J., Sarmento, O. F., Alencar, G. F., Hess, D. L., Bevard, M. H., Greene, E. S., Murgai, M., Turner, S. D., Geng, Y. J., Bekiranov, S., Connelly, J. J., Tomilin, A., and Owens, G. K. (2016) Activation of the pluripotency factor OCT4 in smooth muscle cells is atheroprotective, Nat Med 22, 657-665.
[25] Szyf, M. (2016) The elusive role of 5'-hydroxymethylcytosine, Epigenomics 8, 1539-1551.
[26] Ambrosi, C., Manzo, M., and Baubec, T. (2017) Dynamics and Context-Dependent Roles of DNA Methylation, J Mol Biol 429, 1459-1475.
[27] Halfon, M. S. (2006) (Re)modeling the transcriptional enhancer, Nat Genet 38, 1102-1103.
[28] Banerji, J., Rusconi, S., and Schaffner, W. (1981) Expression of a beta-globin gene is enhanced by remote SV40 DNA sequences, Cell 27, 299-308.
[29] Guo, Y., Xu, Q., Canzio, D., Shou, J., Li, J., Gorkin, D. U., Jung, I., Wu, H., Zhai, Y., Tang, Y., Lu, Y., Wu, Y., Jia, Z., Li, W., Zhang, M. Q., Ren, B., Krainer, A. R., Maniatis, T., and Wu, Q. (2015) CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function, Cell 162, 900-910.
[30] Tumbar, T., Sudlow, G., and Belmont, A. S. (1999) Large-scale chromatin unfolding and remodeling induced by VP16 acidic activation domain, J Cell Biol 145, 1341-1354.
[31] Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Res 22, 4673-4680.
[32] Bodenhofer, U., Bonatesta, E., Horejs-Kainrath, C., and Hochreiter, S. (2015) msa: an R package for multiple sequence alignment, Bioinformatics 31, 3997-3999.
[33] Booth, M. J., Branco, M. R., Ficz, G., Oxley, D., Krueger, F., Reik, W., and Balasubramanian, S. (2012) Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution, Science 336, 934-937.

Figure 1. Transcriptional activators switch from local control to enhancer-like activity during differentiation. (A) Schemes of the protocadherin array and TALE activators. (B) Scheme of the differentiation steps. The characteristic expression markers of the ES cells, NPs and neurons are Oct4, Pax6 and Synaptophysin, respectively. (C-F) Frequency of cells expressing the particular Pcdh isoform in Oct4+ ES (C, D) and Pax6+ NP (E, F) cells. The frequencies are shown for cells expressing the TALE recognizing Pcdh-α6 (yellow) or α11 (green), and are calculated from the total number of sorted Hprt+ cells (N_Hprt) from n independent biological replicates. Hprt, a housekeeping gene, was used as a marker of cell integrity. TALE-A-α6 (ES) (N_Hprt = 96, n = 2); TALE-A-α6 (NP) (N_Hprt = 244, n = 3); TALE-A-α11 (ES) (N_Hprt = 189, n = 4); TALE-A-α11 (NP) (N_Hprt = 202, n = 4). For comparison, the WT cells are displayed in black [19]. WT (ES) (N_Hprt = 122, n = 3); WT (NP) (N_Hprt = 426, n = 7).

Figure 2. Characterization of the activity gradient around the TALE-A activators. (A) Expression of the Pcdh-α isoforms in ES cells where TALE-A-α11 was induced by the indicated concentrations of doxycycline. The constitutively expressed rtTA protein (blue) induced the expression of TALE-A-α11 (orange) in a doxycycline concentration-dependent manner. (B) The prediction of Pcdh-α expression in TALE-A-α6 NP cells using the gradient models. The predicted expression is proportional to the frequencies of the Pcdh+ cells in (Pax6 or Syn)+ neurons (top panel). In the asymmetric gradient model, the production rates of the Pcdh-α6 and α7 isoforms were fitted independently of the gradient (orange line). The following parameter values were fitted for the asymmetric gradient model: γ = 0.16, c_d = 0.17, c_fi = 0.52, c_p = 0.47. (C) Comparison of the frequency of Pcdh-α isoform expression in the indicated (Pax6 or Syn)+ cells.

Figure 3. Asymmetric effect of TALE activators and repressors on the upstream and downstream genes. (A) The distribution of the number of expressed Pcdh-α isoforms in single (Pax6 or Syn)+ cells in NP and neuronal cultures. The mean number of expressed isoforms is calculated from n independent biological replicates, totaling N_Hprt Hprt+ single cells. Hprt, a housekeeping gene, was used as a marker of cell integrity. µ(WT) = 5.35 (n = 8, N_Hprt = 334), µ(TALE-R-α3) = 4.1 (n = 3, N_Hprt = 121), µ(TALE-A-α11) = 2.86 (n = 6, N_Hprt = 202), µ(TALE-R-α4) = 1.71 (n = 5, N_Hprt = 111) and µ(TALE-A-α6) = 1.38 (n = 6, N_Hprt = 244). (B) The effect of TALE activators and repressors on the expression of the Pcdh-α cluster. The activation or repression of the two genes upstream and downstream of the TALE target gene is normalized by the activation or repression of the direct target gene. The action of repressors was assessed in neurons, while that of the activators was assessed in NP cells. (C) Samples of single-cell expression in the TALE-A-α6 background are shown in which either Pcdh-α1 or Pcdh-α12 is expressed: each horizontal band represents a single cell. The yellow arrow indicates the TALE target gene. (D) The Spearman correlation coefficients were calculated for single-cell expression frequencies of Pcdh isoforms in (Pax6 or Syn)+ cells. Values higher than 0.2 are shown as indicated by the color scale. The WT cells were assessed at the neuron stage (upper triangle), while the cells containing TALE-A-α6 were measured at the progenitor stage (lower triangle).

Figure 4. Epigenetic changes in the Pcdh-α cluster due to the TALE-A-α6 activator. (A) The methylation of the Pcdh-α promoters in NP (top panel) and ES (bottom panel) cells with WT or TALE-A-α6 background. Increased and decreased methylation due to TALE-A-α6 is denoted by blue and red shading, respectively. The magenta square denotes the Pcdh-α6 promoter, which was assessed for hydroxymethylation. (B) Effect of TALE-A-α6 on the expression and methylation frequencies in NP cells. The difference in the expression frequencies is calculated from data shown in Figure 1 but including all Pax6+ or Syn+ cells (yellow). The color shades in the methylation differences (blue and red) are the same as those in (A). (C) The effect of TALE-A-α6 on CTCF binding to the promoters and the HS5-1 enhancer in NP cells, measured by ChIP. The CTCF binding is the mean value of two biological replicates, each calculated from two technical replicates. The difference is calculated from the measurements in WT and TALE-A-α6 cells (the difference is calculated in the order opposite to that for methylation, since CTCF binding and CpG methylation have opposite effects on gene expression).

Figure 5. Spatial stochastic control of gene arrays. The genes in a gene array are expressed stochastically (orange filled rectangles) in single cells (gray ellipsoids). The designer transcriptional activators change their range of action during differentiation. The activator controls only its target gene in ES cells (green). When ES cells differentiate into neuronal progenitors (red), the activator creates an activity gradient around the target gene with decreased CpG methylation in the promoters. The gradient is asymmetric, with a stronger effect on the genes upstream of the target gene. This long-range gradient induces the expression of the genes stochastically, in a correlated way, so that only some of the genes subject to the activity gradient are expressed.

[Figure 1 graphic. Panel A: Pcdh-α1-α12, Pcdh-αC1 and Pcdh-αC2 array (scale bar: 100 kb) with TALE-A-α6 and TALE-A-α11. Panel B: ES (Oct4), NP (Pax6), Neuron (Synaptophysin, Syn). Panels C-F: expression frequency versus Pcdh-α isoform in ES and NP cells.]

[Figure 2 graphic. Panel A: expression [/Hprt] of rtTA, TALE-A-α11 and the Pcdh-α isoforms in ES cells at Dox [μg/ml] 0, 0.002, 0.004, 0.008 and 0.5. Panel B: expression frequency versus Pcdh-α isoform; observation: WT (Neuron); model and prediction for TALE-A-α6 (NP): simple gradient, asymmetric gradient. Panel C: expression frequency versus Pcdh-α isoform; observation: WT (Neuron), TALE-R-α4 (Neuron), TALE-A-α6 (NP).]

[Figure 3 graphic. Panel B: relative activation and relative repression of upstream and downstream genes. Panel C: expression (Log2) of Hprt, Oct4, Pax6, Syn and Pcdh-α1-α12, C1, C2 in single TALE-A-α6 cells expressing α1 or α12. Panel D: Spearman correlation (rS) between Pcdh-α isoforms 1-12, C1 and C2; WT (Neuron) and TALE-A-α6 (NP).]

[Figure 4 graphic. Panel A: methylation in NP and ES cells (TALE-A-α6, WT); hydroxymethylation. Panel B: change in methylation due to TALE-A-α6; difference in methylation frequency (WT minus TALE-A-α6) and difference in ON cell frequency (TALE-A-α6 minus WT); Δ frequency versus Pcdh-α isoform. Panel C: difference in CTCF binding (TALE-A-α6 minus WT); Δ CTCF binding at the Pcdh-α promoters and HS5-1.]

[Figure 5 graphic. ES cells, TF: local control. Neuronal progenitor cells, TF: gradient, enhancer-like effect.]