Subscriber access provided by UNIVERSITY OF TOLEDO LIBRARIES
Ecotoxicology and Human Environmental Health
Reduced Zebrafish Transcriptome Atlas toward Understanding the Environmental Neurotoxicants Kun Zhang, and Yanbin Zhao Environ. Sci. Technol., Just Accepted Manuscript • DOI: 10.1021/acs.est.8b01350 • Publication Date (Web): 21 May 2018 Downloaded from http://pubs.acs.org on May 21, 2018
Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.
is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Page 1 of 36
Environmental Science & Technology
1
Reduced Zebrafish Transcriptome Atlas toward Understanding the Environmental
2
Neurotoxicants
3
Kun Zhanga, and Yanbin Zhaoa,*
4 5
a
6
02115, USA
Brigham and Women's Hospital, Harvard Medical School, 60 Fenwood Road, Boston, MA
7 8
*Corresponding author:
9
Yanbin Zhao
10
Tel.: +1 857 307 0318; E-mail:
[email protected] ACS Paragon Plus Environment
Environmental Science & Technology
11
Abstract
12
Transcriptomic approaches monitoring gene responses at genome-scale are increasingly used in
13
toxicological research and help to clarify the molecular mechanisms of adverse effects caused by
14
environmental toxicants. However, their applications for chemical assessment are hampered due
15
to high expenses required and more importantly the lack of in-depth data mining and mechanistic
16
perspectives. Here, we described a Reduced Transcriptome Atlas (RTA) approach which
17
integrates transcriptomic datasets and a comprehensive panel of genes generated to represent
18
neurogenesis and the early neuronal development of zebrafish, to determine the potential
19
neurodevelopmental toxicities of environmental chemicals. Transcriptomic datasets of 74
20
chemicals and 736 related gene expression profiles were integrated resulting in 135 exposure
21
signatures. Chemical prioritization demonstrated four sets of hits to be neurotoxic: neuro-active
22
chemicals (representatively, Valproic acid, VPA and Carbamazepine, CAR), xenoestrogens
23
(Bisphenol A, BPA; Genistein, GEN; 17-α Ethinylestradiol, EE2), microcystins (Cyanopeptolin,
24
CP1020; Microcystin-LR, MCLR) and heavy metals (AgNO3, AgNPs). The enriched biological
25
pathways and processes were distinct among the four sets, while the overlapping functional
26
enrichments were observed within each set, e.g. over 25% differentially expressed genes and
27
four of top five KEGG pathways were shared between VPA and CAR. Furthermore, gene
28
expression index (GEI) analysis demonstrated that a gene panel with 300 genes was sufficient to
29
effectively characterize and cluster chemicals and therefore offer an efficient and cost-effective
30
tool for the prioritization of neurotoxicants. Thus, the RTA approach provides novel insights into
31
the understanding of the in-depth molecular mechanisms of environmental neurotoxicants and
32
can be used as indication for potential adverse outcomes.
ACS Paragon Plus Environment
Page 2 of 36
Page 3 of 36
33
Environmental Science & Technology
Introduction
34
Transcriptomics technologies can characterize and quantify the entirety of gene transcripts
35
simultaneously and consequently are useful to identify changes in molecular patterns related to
36
environmental stressors, such as toxic chemicals. Transcriptomics have increasingly been applied
37
in toxicological research. They help to clarify the molecular mechanisms of adverse effects and
38
predict the potential toxic effects caused by environmental chemicals.1-3 The integration of
39
transcriptomics and high-throughput screening contribute to the adverse outcome pathway
40
analysis for chemical risk assessments4,5 and assist with the classification and prioritization in
41
large toxicity screening program, for instance, Tox21 and the human toxome project.6-7
42
High throughput transcriptomic approaches have received more and more attention recently,
43
however, several issues still limit their extensive utilizations in general toxicology. One major
44
concern that significantly hampered the progress is the lack of data mining tools and advanced
45
knowledge discovery.8 Due to the data complexity, most of the transcriptomics-based toxicology
46
studies still focus on the signaling pathways and biomarker genes related to the already known
47
adverse effects of the toxicants, which helps to explain the molecular mode of actions of the
48
toxic outcomes observed at higher organizational levels (e.g. the histological and physiological
49
toxicities),9,10 while other meaningful molecular responses were either overlooked or ignored as
50
not directly relevant to the expected mode of actions. For example, bisphenol A (BPA), with
51
weak estrogenic activities, has been classified as an endocrine disruptor and a “substance of very
52
high concern”.11 Transcriptomics highlighted its dysregulations on endocrine systems such as on
53
the receptor binding and estrogen stimulus, steroidogenesis and cell proliferation.12,13 However,
54
the significant changes were also observed for other pathways, such as the DNA metabolic
55
process, oxidative stress, apoptosis, neurogenesis and nervous system development. These
ACS Paragon Plus Environment
Environmental Science & Technology
56
dysregulations indicate that BPA would have unanticipated toxicities on DNA repair process and
57
zebrafish neurodevelopment that are beyond the adverse outcomes on the endocrine system.
58
Meanwhile, due to the high biological complexity, many adverse outcomes in animals under
59
chemical exposure cannot be easily detected. The lack of advanced knowledge and technologies
60
to qualitatively and quantitatively characterize the traits of different phenotypes hampered the
61
progress to characterize and prioritize toxic chemicals. Especially for the nervous system, there
62
are limited strategies that were commonly applied to assess and prioritize the neuronal disruptive
63
chemicals during neurogenesis and early neuronal development, e.g. by immunohistochemistry,
64
in situ hybridization and the emerging behavioral testing (e.g. locomotor activities in teleost).14,15
65
Transcriptomics approaches can help to predict these unanticipated toxic effects, however, an
66
effective data interpretation is still hard to perform due to the limited knowledge of gene-
67
function relationships, regulatory mechanisms and the immature data mining tools.8,14 For
68
instance, functional enrichment analysis of transcriptome can clarify the dysregulated biological
69
pathways and processes, such as the altered mRNA metabolic process, cellar protein catabolic
70
process and chromosome organization caused by silver ions.9 Nevertheless, limited perspectives
71
can be obtained based on these general biochemical alterations and it is still difficult to uncover
72
the related toxic outcomes at higher organizational levels. Thus, the challenges on data analysis
73
and interpretation call for novel methodological strategies, especially for the complex nervous
74
system where the toxicity outcomes were hard to determine qualitatively and quantitatively.
75
Another obstacle confronting the utilization of transcriptomics is the relatively high cost. In
76
spite of falling prices, transcriptomic profiling remains expensive to perform, especially for a
77
large scale toxicological research, such as the high throughput screening for chemical clustering
78
and prioritization. In recent years, alternative strategies were developed with substantially
ACS Paragon Plus Environment
Page 4 of 36
Page 5 of 36
Environmental Science & Technology
79
reduced costs.16-19 Of them, a reduced transcriptome approach was increasingly used. It includes
80
a small number of genes as a gene panel which can effectively represent the whole-genome
81
expressions, such as the L1000 landmark genes18 and 917 human pathway reporter gene panel.19
82
In this way, the activity of entire signaling networks can be assessed based on these established
83
key regulators/targets rather than the whole transcriptome.19 Especially, the LINCS (NIH library
84
of integrated network-based cellular signatures) program as well as the related L1000 expression
85
profiling assay has generated a large-scale public dataset that indicates how human cells respond
86
to various genetic and environmental stressors, which already included over 27,000 perturbagens
87
and produced over 1.3 million L1000 profiles.20,21 As a powerful and effective tool, the LINCS
88
program greatly accelerated the drug discovery. For toxicology, a human S1500 sentinel gene set
89
were generated from 3339 gene expression series by Tox21 program (Toxicology Testing in the
90
21st Century) recently with full pathway coverages. It can be used to accurately predict pathway
91
perturbations and biological relationships for samples under study.22 Similarly, a reduced human
92
transcriptome (RHT) approach and a reduced zebrafish transcriptome (RZT) approach were
93
sequentially developed via the integration of 1200 human Entrez genes and 1637 zebrafish
94
Entrez genes, respectively, to comprehensively represent the biological pathways and
95
toxicologically relevant processes.23,24 They were proven to be promising for profiling the
96
molecular pathways that modulated by reference chemicals and the water samples ranging from
97
wastewater to drinking water.23,24 The enriched pathways in RHT and RZT analysis could be
98
used to evaluate and prioritize chemicals for future assessment. Specifically, the most sensitive
99
biological pathways could be linked with adverse outcomes at higher organizational levels and
100
could identify the early molecular responses for adverse outcome pathway (AOP) construction
101
and assess the hazard potencies of environmental toxicants in early developmental stages.23,24
ACS Paragon Plus Environment
Environmental Science & Technology
102
On this basis, the aim of present study was to develop a Reduced Transcriptome Atlas (RTA)
103
by integrating a large number of transcriptomic datasets with a specific gene panel (L2000) to
104
comprehensively represent the neurogenesis and early neuronal development in zebrafish. This
105
approach was aimed to uncover and prioritize the potential neurodevelopmental toxicants by in-
106
depth transcriptomic data mining and interpretation. Meanwhile, a specific gene set was aimed to
107
be generated to effectively characterize and prioritize chemicals as L2000 and therefore serves to
108
establish a cost-effective reduced zebrafish transcriptome approach that specifically targets the
109
nervous system.
110 111
Materials and Methods
112
RTA gene panel development. The RTA gene panel was aimed to comprehensively represent
113
the biological pathways and processes involved in neurogenesis and early neuronal development
114
in zebrafish (Danio rerio). Thus, we propose three criteria to get a maximum coverage in the
115
present study: Representing the entire neural networks, maximal coverage of biological pathways
116
and processes and integrating novel neural factors (Figure 1). Our current knowledge and large-
117
scale signaling pathway databases provide us the opportunity to retrieve most of the currently
118
known neural factors involved in neural networks and signaling pathways. With a comprehensive
119
survey of the up-to-date literatures, novel annotated neural genes can be further integrated.
120
Therefore, at first, we constructed a general framework for the neural signaling pathways and
121
related genes based on the current knowledge,25-27 which included most of the classic extrinsic
122
and intrinsic factors, such as neurotransmitters and growth factors. Then, the associated genes
123
within each signaling pathway were curated and integrated from two databases: Kyoto
124
Encyclopedia of Genes and Genomes (KEGG) database and Molecular Signatures Database
ACS Paragon Plus Environment
Page 6 of 36
Page 7 of 36
Environmental Science & Technology
125
(MSigDB). For the pathways generated for other species, like Homo sapiens and Mus musculus,
126
the corresponding zebrafish homologous genes were retrieved from Ensembl zebrafish genome
127
database (http://useast.ensembl.org/Danio_rerio/Info/Index).
128
The Gene Ontology (GO) dataset from the European Bioinformatics Institute (EMBL-EBI)
129
and MSigDB helps to uncover more neural factors. Nervous system development (GO Biological
130
Process, GO: 0007399) and the child GO terms, such as neurogenesis (GO: 0022008) and neural
131
tube development (GO:0021915), provide a large set of genes involved in neurogenesis and early
132
neuronal development. This dataset was further reviewed and clustered. The neural genes beyond
133
the classic pathways described above were further integrated manually. Meanwhile, the gene sets
134
from Mammalian Neurogenesis Gene Ontology (MANGO), a database of genes mapped to adult
135
neurogenesis via a comprehensive survey of the literatures,28 were also checked to include some
136
novel annotated neural genes. At last, all genes were integrated and clustered manually and the
137
related detailed annotations were extracted from Ensembl database and Genebank database using
138
R (version 3.4.3). This information was stored in FASTA format. In total, the gene panel in the
139
present study contained 2015 genes (termed L2000) (Table S1), which were divided into six
140
groups: extrinsic morphogens, growth factors, neurotransmitters and synaptic vesicle cycle,
141
intrinsic transcription factors, cytoskeletal proteins and epigenetic regulators. At last, the
142
representation of RTA gene panel on neural development was further validated by the enrichment
143
analysis of KEGG_Pathway and GO_Biological Process on the online bioinformatics resources
144
database: The Database for Annotation, Visualization and Integrated Discovery (DAVID).
145
Datasets collection. The basic criteria for data inclusion were as follows: (I) Exposure durations
146
representing the neurogenesis processes/early neuronal development. In zebrafish embryos, the
147
brain ventricles have formed by 48 hours post-fertilization (hpf). Neurogenesis peaks between 18
ACS Paragon Plus Environment
Environmental Science & Technology
148
hpf and 36 hpf.29 Therefore, the exposures covered the 0-48 hpf neurogenesis or exposures after
149
the first 48 hpf to represent the early neuronal development were selected for the present study.
150
(II) Studies have appropriate experimental setups. Treatment groups and vehicle controls should
151
have sufficient replicates for the statistical analysis. (III) Complete raw/normalized dataset with
152
sufficient annotation. The raw data or normalized dataset should be completed and can be
153
retrieved from public repositories or related literature. Annotation repository should be supplied
154
with sufficient information to cover all gene probes (Figure 1). The criteria for data inclusion
155
basically followed the approaches proposed by Ramasamy et al.30 and the MIAME-Guidelines.31
156
A database query was then conducted in the publicly accessible NCBI Gene Expression
157
Omnibus (GEO) and EMBL-EBI ArrayExpress online repositories. Datasets satisfying the above
158
restrictions were manually selected and integrated. Besides, the PubMed literature database was
159
also employed to search for relevant studies. Dataset was included if the criteria met. In total, the
160
searching resulted in 27 studies, which contain 74 environmental chemicals and 736 related gene
161
expression profiles (SI Table S1).
162
Data normalization and analysis. The raw data and platform information were downloaded
163
from GEO or Array Express. For data normalization, if available, the processed signal intensity
164
was employed. The raw data were log2 transformed and then the sample qualities were checked
165
using R package arrayQualityMetrics. The differentially expressed gene (DEG) was identified
166
with a moderated t-test using R-package “limma”. Cutoff criterium for DEGs was an adjusted P-
167
value of < 0.05. Because the analyzed studies are usually highly variable in regard to the
168
experimental setup, such as the exposure time points and the applied concentrations, the Log2
169
fold change (LogFC) values were not applied for DEGs identification to avoid a biased impact
170
on the analysis and to obtain sufficient gene expression information. From each platform file, the
ACS Paragon Plus Environment
Page 8 of 36
Page 9 of 36
Environmental Science & Technology
171
full annotation information, such as the probe IDs, Ensembl IDs, Genebank associate numbers,
172
gene names and symbols, chromosomal locations were extracted and stored in FASTA format.
173
The annotation repository was then mapped against the DEGs generated in each related dataset.
174
The final dataset with DEGs annotations was then stored for the RTA construction.
175
RTA Platform Construction. After obtaining the RTA gene panel (L2000), the full annotations
176
for each gene were retrieved from Ensembl database and Genebank database, which include the
177
Ensembl gene ID, Ensembl transcript ID, Entrez Gene ID, Genebank associate number, official
178
symbol, official full name, synonyms and chromosome location. This annotation repository was
179
then imported into R (version 3.4.3), which was used to compare with the annotations of DEGs
180
in each transcriptomic dataset to characterize and cluster the hit neural genes. The complete atlas
181
was assembled based on 135 gene sets with differentially expressed neural genes. Afterwards, it
182
was further reviewed and corrected manually to avoid the potential mismatched or overlapped
183
genes in each dataset.
184
Chemical Prioritization and Functional Enrichment Analysis. To discriminate the most likely
185
neurotoxic substances, several criteria were set up for the chemical prioritization analysis. It
186
included: (I) The ratio of differentially expressed neural genes (DEG-N) compared to the whole
187
set of DEGs, (II) Gene expression intensity (expression levels) of differentially expressed neural
188
genes responses to each chemical, (III) Gene expression patterns to cluster the different types of
189
chemicals. Chemicals with higher DEG-N proportions, higher gene expression levels or
190
displaying similar gene expression patterns indicated a higher neurotoxic potential. This strategy
191
was similar to those reported in previous large-scale transcriptome analysis, in which gene
192
expression intensity and hierarchical clustering were employed to characterize and group the
193
toxic chemicals.32,33
ACS Paragon Plus Environment
Environmental Science & Technology
194
For functional enrichment analysis, we divided the transcriptomes into two sets with respect
195
to the effect concentrations to derive biologically meaningful information as described below.
196
Only the exposures with reported visible effects, sublethal concentrations or pharmacological
197
concentrations were included. We performed the functional annotation and gene ontology (GO)
198
clustering by use of DAVID (The Database for Annotation, Visualization and Integrated
199
Discovery) Bioinformatics Resources 6.8. The enriched KEGG pathways and Gene Ontology
200
terms (Biological Process (BP), Cellular Component (CC) and Molecular Function (MF)) were
201
demonstrated by analyzing the differentially expressed neural genes generated in RTA gene set.
202
Hierarchical clustering map was constructed by use of the MultiExperimental Viewer v4.9
203
(Dana-Farber Cancer Institute, Boston, MA, USA) with the default distance metric (Pearson
204
Correlation) and average linkage clustering. Bootstrapping with 100 replicates was used to
205
generate support trees. Venn diagrams were generated using Venny 2.1.0. Chord diagram and
206
bubble plot were implemented using R (version 3.4.3) together with the package GOplot (version
207
1.0.2).
208
Gene Expression Index. Considering that the overlapping and gene expression intensities were
209
the major contributors for the development of reduced transcriptome atlas, as described in the
210
S1500 platform construction,22 a similar approach was performed for reduced zebrafish
211
transcriptome construction in the present study. The gene expression index (GEI) was developed
212
to score and rank each gene by multiplying the overlapping frequency by the averaged gene
213
expression intensity: GEI Score = (N/81)*(S/81). N represents the number of occurrence of each
214
gene as a DEG; S represents the summed absolute value (Log2FC); 81 represents the number of
215
exposure signatures.
216
ACS Paragon Plus Environment
Page 10 of 36
Page 11 of 36
Environmental Science & Technology
217
Results and Discussion
218
RTA Approach Development. Following the workflow diagram (Figure 1a), the RTA approach
219
was developed by integrating a large number of transcriptomic datasets and a specific gene panel
220
to comprehensively represent the neurogenesis and early neuronal development in zebrafish
221
(Danio rerio). The developed RTA gene set (termed L2000) consists of 2015 zebrafish Entrez ID
222
genes, which include 1236 extrinsic genes and 779 intrinsic genes. The extrinsic factors include
223
the vast majority of biological pathways and processes involved in neuronal development, such
224
as the extrinsic morphogens (e.g. Wnt, Notch, SHH, BMP and Ephrin signaling pathways),
225
growth factors (e.g. VEGF, IGF, FGF, Neurotrophin and Neuregulin), neurotransmitters (e.g.
226
dopamine, GABA, glutamate, acetylcholine, serotonin) and synaptic vesicle cycle process. The
227
cell-intrinsic factors include most known transcription factors, cytoskeletal proteins and other
228
regulators clarified in recent literature. Besides, some epigenetic regulators have recently been
229
shown to be crucially involved in modulating neurogenesis were also added, e.g. genes involved
230
in DNA methylation, demethylation and histone modifications (Figure 1b).
231
This RTA approach aggregated transcriptomic datasets from 27 studies, which contain 74
232
environmental chemicals and in total 736 related gene expression profiles. The gene expression
233
signatures for different exposure conditions, generated as the genes within the L2000 RTA gene
234
panel that expressed differentially in exposure fish samples compared to control group, were then
235
obtained. In total, 135 gene expression signatures were generated and each signature contained a
236
number of differentially expressed neural genes (DEG-N) range from 1 to 1316 with a specific
237
expression pattern. Chemicals included in the present study can be roughly divided into eight
238
categories: environmental hormones, polycyclic aromatic hydrocarbons, pharmaceuticals, food
239
additives, biocides (insecticides, herbicides, fungicides and antibiotics), industrial compounds,
ACS Paragon Plus Environment
Environmental Science & Technology
240
microcystins, heavy metals and related nanoparticles. The duration of exposure for each
241
chemical was depicted in Figure S1. In total, 91 exposures were integrated, including 46
242
exposures covered the fertilization to 48 hpf embryo developmental stage and 45 other exposures
243
range from fertilization to 7 days post-fertilization.
244
Chemical Prioritization. The DEG-N proportions were first checked. As shown in Figure 2a,
245
the ratios of DEG-N compared to the L2000 gene set were varied among chemicals, range from
246
65.3% to < 0.2%, while the ratios of DEG-N compared to the whole set of DEGs were relatively
247
stable, ranging from 12.6% to 1.2%. The high variations when compared to L2000 were mainly
248
due to the chemical types and the applied concentrations. The surprisingly stable ratios when
249
compared to whole DEGs set, mostly at about 5% to 8%, indicate that most of the chemicals had
250
background dysregulations on neural genes and were possibly due to their general adverse effects
251
on embryo development. The prioritized chemicals with higher DEG-N proportions (>8%) were
252
found to be estrogenic hormones (GEN, BPA), PAHs-like (TCDD, Dibenzothiophene),
253
microcystins (MCLR, CP1020), a neuro-drug (Fluoxetine) and a heavy metal (CoSO4).
254
Similarly, the gene expression intensity analysis based on Log2 FC value identified less than 6%
255
(8/135) chemicals with Log2 FC>3, i.e. xenoestrogens (EE2, GEN), microcystins (MCLR), a
256
neuro-drug (VPA) and heavy metals (AgNO3, AgNPs) (Figure 2b).
257
Hierarchical clustering map depicts 135 exposure signatures for the L2000 gene set (Figure
258
2c). Of these signatures, only a small part displayed clear similar gene expression patterns. They
259
can be roughly divided into three sets: (a) the highest prioritization occurred for pharmaceuticals,
260
i.e. VPA, CAR, retinoic acid (RA) and caffeine (CAFF). They are all neuro-drugs or neuroactive
261
compounds. (b) the second prioritized group consisted of extrinsic environmental hormones,
262
including the classic xenoestrogens, like EE2, GEN, BPA, 17β-estradiol (E2), and others such as
ACS Paragon Plus Environment
Page 12 of 36
Page 13 of 36
Environmental Science & Technology
263
pentachlorophenol, triclosan, propiconazole and diisobutyl_phthalate. (c) the third group
264
consisted of microcystins and heavy metals, including CP1020, AgNO3 and the related silver
265
nanoparticles (Ag50, Ag150 and PVP) (Figure 2d).
266
Therefore, the chemical prioritization analyses provides a list of hit chemicals that displayed
267
remarkable dysregulations on genes involved in neurogenesis and early neuronal development in
268
zebrafish embryos. The most frequently occurred substances were: neuro-drugs and neuroactive
269
compounds (representatively, VPA and CAR), xenoestrogens (representatively, BPA, GEN and
270
EE2), microcystins (CP1020 and MCLR) and heavy metal forms (AgNO3 and AgNPs). VPA is a
271
broad-spectrum antiepileptic drug that has been used in the clinic for more than 30 years. It is
272
effective for many different forms of partial and generalized epileptic seizure and is prescribed to
273
treat bipolar disorders, schizoaffective disorders, social phobias, as well as the neuropathic pain
274
and migraine headache.34 Similarly, CAR was first marketed in 1963 and is typically used for the
275
treatment of seizure disorders and neuropathic pain. It is also very important as off-label for a
276
second-line treatment for bipolar disorder.35 These compounds have been frequently detected in
277
the aquatic environment and have been non-surprisingly proven to be neurotoxic to animals like
278
zebrafish. They disrupted behavioral activities with similar patterns such as the acceleration of
279
spontaneous movement, touch response and light stimulation response in zebrafish eleuthero-
280
embryos and even showed the persisting behavioral alterations later in life.36,37 Xenoestrogens
281
like BPA, EE2 and GEN exhibit estrogen-mimicking, hormone-like properties that raise concern
282
about their endocrine disrupting activities.38,39 In recent years, the neurotoxicity of endocrine
283
disruptive chemicals, especially the xenoestrogens, has been extensively discussed.38,39 Animal
284
studies demonstrated that xenoestrogens, such as BPA and EE2, can produce adverse outcomes
285
in rearing behavior, locomotive activity, anxiety, learning and memory, as well as the neuronal
ACS Paragon Plus Environment
Environmental Science & Technology
286
abnormalities.39 Some evidence further clarified that BPA exerts effects on neurogenesis within
287
zebrafish hypothalamus and therefore caused the later hyperactive behaviors in zebrafish
288
larvae.40 In contrast, fewer reports discussed the neurotoxicity of GEN, while its neuroprotective
289
effects against the neurotoxic amyloid β-peptide and heavy metals have been demonstrated
290
widely.41 These results were consistent with the transcriptomic data interpretations in the present
291
study that xenoestrogens like BPA, EE2 and GEN would most likely to be neurotoxic substances.
292
Microcystins as the secondary metabolites are produced via cyanobacterial blooms. Their toxic
293
targets are not only the liver but also other organs, i.e., the brain.42 The serious neurotoxicity
294
effects caused by microcystins can lead to various symptoms, such as dizziness, headache, visual
295
disturbances and even blindness. In zebrafish, microcystins like MCLR and CP1020 can cross
296
the blood brain barrier via the uptake transporters43 and cause neurostructural and behavioral
297
changes mainly through oxidative stress and the inhibition of protein phosphatases.44 Similarly,
298
the neurotoxicity effects of AgNO3 and silver nanoparticles have also been investigated widely.
299
The common phenomenon includes the reduced neurite formation and outgrowth, impaired cell
300
membrane integrity, blood-brain barrier dysfunction, cell necrosis, neurodegeneration, oxidative
301
stress and apoptosis.45,46 The induced oxidative stress and cell apoptosis by increasing reactive
302
oxygen species and caspase activity were generally considered to be the primary target of AgNO3
303
and silver nanoparticles exposure.45,46 This molecular mode of actions were similar to that for
304
microcystins, which was also in agreement and consistent with the hierarchical clustering where
305
microcystins (CP1020) and AgNO3 and AgNPs built the third group (Figure 2d).
306
It should be noted that during the transcriptomic data analysis, we observed the chemical
307
concentration was a key factor to effectively characterize and cluster chemicals. The exposures
308
with low concentrations usually dysregulated less DEGs and were not be able to show clear
ACS Paragon Plus Environment
Page 14 of 36
Page 15 of 36
Environmental Science & Technology
309
molecular mode of actions and the expression patterns during the hierarchical clustering. On the
310
contrary, the exposures with relatively high concentrations (such as pharmacological or lethal
311
concentration) generally did well. This phenomenon was quite consistent with the observations
312
in a similar report previously on zebrafish transcriptome meta-analysis.32 Thus, to be able to
313
derive biologically meaningful information of different substances via transcriptome analysis,
314
only the exposures with reported inducing visible effects, sublethal concentrations or
315
pharmacological concentrations were included in the following functional enrichment analysis,
316
as described above.
317
Functional Enrichment Analysis. To systemically characterize the hit chemicals, the DEGs
318
were subjected to DAVID for KEGG pathway and Gene Ontology enrichment analysis. The top
319
five KEGG pathways and GO Biological Processes enriched for each of the 12 hit chemicals
320
(VPA, CAR, Fluoxetine (FLO), BPA, EE2, GEN, CP1020, MC-LR, AgNO3, Ag150, Cobalt and
321
TCDD) were depicted (Figure 3a, S2). VPA and CAR, the commonly administered antiepileptic
322
drugs, showed very similar dysregulations on the KEGG pathways, four of the top five pathways
323
were identical between them. They were Wnt signaling, GnRH signaling, MAPK signaling and
324
Melanogenesis. In contrast, the antidepressant drug of the selective serotonin reuptake inhibitor
325
(SSRI) class, FLO, displayed a distinct dysregulation. Only the weak dysregulated Wnt signaling
326
and GnRH signaling were identical with others. BPA and EE2 also showed similar patterns, with
327
three of the top five pathways identical, i.e. Wnt signaling, GnRH signaling, and Notch signaling.
328
Interestingly, the enriched pathways of BPA and EE2 also resembled the neuro-drugs (VPA and
329
CAR). On the contrary, the microcystins and heavy metal forms showed quite different patterns
330
compared to the neuro-drugs and xenoestrogens. The enriched and identical Wnt signaling and
331
TGF-beta signaling was observed for CP1020 and MCLR and the enriched and identical Wnt
ACS Paragon Plus Environment
Environmental Science & Technology
332
signaling, VEGF signaling and Adrenergic signaling was observed for AgNO3 and AgNP-150. In
333
comparison, the GO_ Biological Processes were varied among the hit chemicals (Figure S2).
334
It should be noted that Wnt signaling was the most enriched pathway and biological process
335
among the hit chemicals, which was not surprising because it is indispensable in virtually every
336
aspect of embryonic development.47 However, the in-depth analysis of L2000 displayed quite
337
distinct dysregulations on Wnt signaling components as well as other pathways among different
338
types of chemicals. The transcriptional changes of the representative extrinsic signaling (i.e.
339
morphogens, growth factors, neurotransmitters and synaptic vesicle cycle) response to four types
340
of chemicals (VPA, BPA, MCLR and Ag150) is depicted in Figure 3b. For VPA (Figure 3b),
341
almost all signals were remarkably dysregulated; the ratio ranges from 54.2% (13/24, BMP
342
signaling) to 86.4% (19/22, FGF signaling). For example, 71.1% (59/83) genes involved in Wnt
343
signaling were significantly changed, with 31 genes up-regulated (representatively, sfrp2, wif1
344
and nlk2) and 28 genes down-regulated (representatively, apc2, wnt4b, gsk3b and ctnnd2b). This
345
result indicated that VPA had comprehensive adverse effects on the signals of neuronal
346
development in zebrafish embryos. In contrast, BPA had a specific enrichment on the extrinsic
347
signals. Compared to the enriched Wnt signal (15.6%, 13/83), the enrichment on
348
neurotransmitters (Glutamate signaling, 48.9%, 22/45) and synapse vesicle cycle process (52.3%,
349
46/88) was much stronger. For MCLR, it displayed less effect on most of the extrinsic signals
350
such as growth factors and neurotransmitters, the weak enrichments were only found for Wnt
351
signaling (26.5%, 22/83) and synapse vesicle cycle (22.7%, 22/88). For AgNP-150, multiple
352
signals were targeted. The highest enrichments were found for Neurotrophin signaling (53.2%,
353
25/47), BMP signaling (45.8%, 11/24) and Dopamine signaling (43.3%, 13/30). It should be
354
noted that the enriched signaling was sometimes identical among chemicals (as shown in KEGG
ACS Paragon Plus Environment
Page 16 of 36
Page 17 of 36
Environmental Science & Technology
355
pathways), such as the Wnt signaling and the synapse vesicle cycle process; however, the
356
detailed regulation patterns of targeted genes within each signaling were quite distinct (Figure
357
3b).
358
VPA, as a representative chemical, was further analyzed to better understand its potential
359
adverse effects on neurogenesis and early neuronal development (Figure 3c). The bioactivity
360
profile of VPA was shown as the radial pie diagram where the length of the slice indicates the
361
percentage of DEG-N/RTA gene panel for each signaling and the angle of the slice indicates the
362
total number of changed genes for a specific category term in the annotation level (e.g. Wnt
363
signaling, Glutamate signaling and synapse vesicle cycle). In total, 28 extrinsic neural signaling
364
pathways, cell-intrinsic transcription factors and epigenetic regulators were integrated. As shown
365
in Figure 3c, VPA had a significant effect on neural signaling pathways. Almost all signals were
366
remarkably dysregulated. The top five enriched signaling pathways were FGF signaling (86.4%),
367
GABA signaling (73.3%), Neurotrophin signaling (72.3%), Wnt signaling (71.1%) and Ephrin
368
signaling (69.7%) (Figure 3c). In-depth data analysis on the Wnt signaling demonstrated that
369
VPA is a potent inhibitor. Besides the up-regulations on the key inhibitory factors (e.g. wif1,
370
cer1, nlk, sfrp1a and sfrp2), VPA suppressed most essential components involved in canonical
371
Wnt signaling, such as wnt1, wnt2b, lrp5, gsk3aa, ctnnb1, tcf7l1b, tcf7l2, axin1 and apc2 (Figure
372
S3). Especially for apc2, the fold-changes were found to be four times higher compared with the
373
control. In contrast, the high enriched FGF signaling pathway was apparently activated by VPA
374
(Figure S4). Accompanied with the up-regulation on multiple FGF factors like fgf3, fgf8a and
375
fgf14, the transcriptional levels of FGF receptors (fgfr1a, fgfr1b and fgfr1bl) and downstream
376
components such as grb2a, grb2b, sos1, rras2, mras and mapk1 were significantly increased
377
(Figure S4). Wnt signaling and FGF signaling serve many functions in embryonic developmental
ACS Paragon Plus Environment
Environmental Science & Technology
378
processes such as cellular proliferation, differentiation and morphogenesis,47 therefore, the
379
significant alterations on these signals indicated that VPA had multiple adverse effects on
380
embryonic development that was beyond the nervous system. We further analyzed the
381
regulations for GABA signaling (Figure S4). Similar to Wnt signaling, most of the essential
382
components participated in GABAergic synaptic process were inhibited, especially the GABAA
383
and GABAC receptors such as gabra1, gabrb1, gabrg2, gabrr1 and gabrr2b. Because GABAA
384
and GABAC receptors are ionotropic and mediate fast GABA responses by triggering chloride
385
channel openings,48 the significant inhibitions indicated that VPA may affect the fast GABA
386
signal transductions by the blockage of chloride channels (Figure S4).
387
In addition, the differentially expressed neural genes generated in RTA approach were also
388
used to analyze the overlap within each type of chemical (Figure 4, S5 and S6). Venn diagram
389
shows the overlapping genes among the four neuroactive chemicals (VPA, CAR, RA and CAFF)
390
(Figure 4a) and three xenoestrogens (BPA, EE2 and GEN) (Figure S5a). For neuroactive
391
chemicals, there were 74 overlapping genes showing significant alterations. The responses of
392
these genes appeared very similar in terms of directionality among these chemicals with only a
393
few exceptions. The top four shared biological pathways generated by DAVID were Wnt
394
signaling, MAPK signaling, Calcium signaling and VEGF signaling (Figure 4b). Shared GO
395
terms turned to be involved in early embryonic development, such as anterior/posterior pattern
396
specification and multicellular organism development (Figure 4c) and therefore indicated that
397
these chemicals have broadly adverse effects on zebrafish embryogenesis. For the three
398
xenoestrogens, there were 87 overlapping genes showing significant changes (Figure S5a). The
399
three overlapped pathways generated were Wnt signaling, cell adhesion molecules and Notch
400
signaling (Figure S5b). The shared GO terms were specific on neuronal development, such as the
ACS Paragon Plus Environment
Page 18 of 36
Page 19 of 36
Environmental Science & Technology
401
nervous system development, neuron projection, synapse and neuron projection development
402
(Figure S5c). Similarly, for microcystins, 110 overlapping genes were observed between MCLR
403
and CP1020 (Figure S6a). They were involved in Wnt signaling, TGF-beta signaling, FoxO
404
signaling and Focal adhesion and the overlapped GO terms were specific on neuronal
405
development, such as the axon guidance and neuron projection morphogenesis (Figure S6b). For
406
heavy metals, there were 99 overlapping genes among AgNO3, AgNP-150 and PVP (Figure S6c).
407
The top four overlapped biological pathways were Hedgehog signaling, TGF-beta signaling, Wnt
408
signaling and Cytokine-cytokine receptor interaction. The shared GO terms were enriched in
409
nervous system development, cytoplasm and sprouting angiogenesis (Figure S6d).
410
Construction of L300 RZT Approach. The utilization of “omics” for chemical assessment is
411
hampered by to the high expenses and the lack of “standardized” toxicogenomic methods,22,23
412
therefore, alternative strategies, such as the reduced transcriptome approach with substantially
413
reduced costs, were developed in recent years.18-24 Similar to the L1000 landmark gene panel for
414
drug discovery and the S1500 sentinel gene set for toxicological studies, here, based on the
415
L2000 gene set developed to comprehensively represent biological pathways involved in
416
neurogenesis and early neuronal development in zebrafish, we aimed to generate an efficient and
417
cost-effective neural gene panel with similar approaches for high throughput developmental
418
neurotoxicology analysis. As described above and previous study,32 exposure concentrations and
419
observation time points were the major factors that influence the gene expressions. To reduce the
420
study bias and be able to derive more biologically meaningful information from the large set of
421
different treatments, in each exposure, only the signatures with the highest concentration and/or
422
latest time point were included. They were believed to represent the treatments showing the
423
strongest and most significant neural toxicities.
ACS Paragon Plus Environment
Environmental Science & Technology
424
In total, 81 exposure signatures containing 72 environmental chemicals were included in the
425
RZT construction. This broadly representative chemical set includes eight large categories, such
426
as the environmental hormones, PAHs, pharmaceuticals, biocides and heavy metals. To capture
427
maximal variations in gene expressions and dynamics and the maximal pathway coverage, a
428
principal component analysis were employed as a first analysis step, while not surprisingly, it
429
failed to cluster the L2000 neural genes due to the large variations among exposure signatures.
430
Therefore we employed the GEI to score and rank each gene as described above. Based on this
431
GEI score, the L2000 genes were completely evaluated to generate a representative gene panel
432
with a minimum number of genes. We selected the top 100, 200, 300 and 400 genes (as L100,
433
L200, L300 and L400, respectively) and compared their representativeness on neural signaling
434
pathways and whether they can be used to cluster and prioritize chemicals in the same manner as
435
L2000. The average percentage of covered genes within each signaling pathway increased
436
steadily from L100 to L400, while the percentage of covered pathways increased rapidly at L300
437
and L400 compared to L100 and L200. We demonstrated that when the gene numbers increase to
438
300 (L300, Table S1), it can cover most neural signaling pathways (90.3%, 28/31) with a gene
439
coverage percentage of about 10%-30% among pathways (Figure S7, S8). This was similar with
440
L400 (93.5%, 29/31), which covered one more signaling pathway, neuropeptide Y. To generate a
441
representative gene panel with a minimum number of genes, L300 was therefore employed for
442
the further analysis. In L300, we found that many key regulators involved in each pathway were
443
included, such as the neurotransmitter receptors (bmpr2a, gria1a, gabrg2 and fgfr1a) and
444
transporters (slc6a1a, slc6a9 and slc32a1) and the neurogenic differentiation factors (neurod1,
445
neurod2 and neurod6a). At the same time, it was sufficient to group the chemicals and display
446
the different expression patterns among them. The sorted three clusters closely resembled those
ACS Paragon Plus Environment
Page 20 of 36
Page 21 of 36
Environmental Science & Technology
447
from L2000 clustering, i.e. the neuro-active chemicals, xenoestrogens, microcystins and heavy
448
metals (Figure 2d, 5a).
449
Chord diagram further characterized the expression patterns via L300 for four neuro-active
450
chemicals (VPA, CAR, RA and CAFF) (Figure 5b). Thirty-seven genes involved in four enriched
451
pathways (FGF signaling, GABA signaling, Neurotrophin signaling and Wnt signaling) were
452
compared. Genes participated in GABAergic synaptic process were overlapped significantly and
453
were remarkable inhibited, such as the neurotransmitter transporters slc6a1a and slc32a1 and the
454
G proteins like gng3. Genes involved in FGF signaling were also similarly among chemicals and
455
showed upregulation, such as fgf8a and the related receptor fgfr1a. In contrast, the expression
456
levels of genes involved in Wnt signaling were distinct though they were highly overlapped as
457
well. For example, the protein kinase, prkcba, was upregulated by VPA and CAR, while it was
458
downregulated by RA and CAFF (Figure 5b). Similarly, the gene expression characterizations
459
can also be efficiently clarified for the cluster of microcystins and heavy metals. Forty-eight
460
genes involved in Wnt signaling, BMP signaling, Neurotrophin signaling or Dopamine signaling
461
were compared. Most genes involved in Wnt, BMP and Neurotrophin signaling were overlapped
462
among these four chemicals, especially for AgNO3 and Ag150, and significantly upregulated
463
such as SMAD family member smad7 and the C-terminal binding protein ctbp2a. The different
464
expression patterns among chemicals were observed for genes involved in dopamine signaling,
465
such as the dopa decarboxylase ddc and the G proteins like gng3 (Figure 5c).
466
In conclusion, we developed a Reduced Transcriptome Atlas approach to characterize and
467
prioritize the potential neurotoxicants using zebrafish. We successfully demonstrated four sets of
468
hits to be neurotoxic, i.e. neuro-active chemicals, xenoestrogens, microcystins and heavy metals,
469
from the transcriptomic dataset analysis and in-depth interpretation. These results were
ACS Paragon Plus Environment
Environmental Science & Technology
470
consistent with the previous reports on their neurotoxicological effects observed at higher
471
organizational levels (e.g. histological and physiological toxicities). We further generated a gene
472
panel with 300 neural genes, which could effectively characterize and cluster the chemicals as
473
L2000 and therefore would help to establish a cost-effective RZT approach specific targeting on
474
the nervous system. The RTA approach generated in the present study would be applied for
475
prioritization of environmental neurotoxicants and discovery of the in-depth molecular mode of
476
actions. It provides novel insights into the understanding of gene-function relationships and the
477
regulatory mechanisms and would contribute to the early molecular response identification,
478
adverse outcome pathway (AOP) construction and the further ecological risk assessments. The
479
strategy employed in the present study can also be applied to prioritize other types of toxicants,
480
such as the endocrine disrupting chemicals (EDCs) and cardiovascular disrupting chemicals.
481 482
ASSOCIATED CONTENT
483
Supporting information
484
Details of chemical abbreviation; transcriptomic datasets information (SI Table S1); descriptions
485
of the duration of exposure (Figure S1); gene ontology enrichment analysis (Figure S2); striped
486
view of the enriched Wnt signaling pathway (Figure S3) and FGF signaling pathway and GABA
487
signaling pathway (Figure S4); overlapping functional enrichment for xenoestrogens (Figure S5),
488
microcystins and heavy metals (Figure S6); gene expression index ranking (Figure S7); pathway
489
coverage of reduced L300 (Figure S8). This material is available free of charge via the internet at
490
http://pubs.acs.org.
491
Table S1. L2000 and L300 gene set generated in the present study.
492
ACS Paragon Plus Environment
Page 22 of 36
Page 23 of 36
Environmental Science & Technology
493
AUTHOR INFORMATION
494
Corresponding author
495
Tel.: +1 857 307 0318; E-mail:
[email protected] 496 497
Notes
498
The authors declare no competing financial interest.
499 500
ACKNOWLEDGMENTS
501
We thank Nadja Brun (Woods Hole Oceanographic Institution, MA), Karl Fent (University of
502
Applied Sciences and Arts Northwestern Switzerland and Swiss Federal Institute of Technology
503
(ETH Zürich)) and Zaniar Ghazizadeh (Harvard Medical School, MA) for critically reading the
504
manuscript and making valuable suggestions.
ACS Paragon Plus Environment
Environmental Science & Technology
505
Figure Legends
506
Figure 1. Construction of the Reduced Transcriptome Atlas (RTA) Approach in Zebrafish.
507
(A) Design and the workflow of the RTA approach development. (B) L2000 gene set represents
508
the extrinsic and intrinsic factors involved in neurogenesis and early neuronal development of
509
zebrafish.
510 511
Figure 2. Chemical Prioritization through RTA Approach. (A) Ratio of RTA genes compared
512
to the L2000 gene set (left y-axis, red bar graph) and ratio of RTA genes compared to the whole
513
number of differentially expressed genes (right y-axis, blue dots) for each exposure signature.
514
Top 9 chemicals with the highest ratio of RTA genes/total genes are displayed on the right. (B)
515
Total gene expressions (Log2 fold change (Log2FC) of the different expressed gene) for each
516
signature. Top 8 chemicals with highest gene expression levels are displayed on the right. (C)
517
Hierarchical clustering of L2000 gene transcriptional responses for 135 exposure signatures.
518
White boxes in the figure indicate the similar expression patterns among chemicals. The colored
519
boxes on the right (C1, Yellow; C2, Green and C3, Blue) indicate the chemical groups displayed
520
on the subfigure (D). (D) Detailed information on the chemical clusters (Cluster 1, yellow:
521
neuro-active chemicals; Cluster 2, green: xenoestrogens; Cluster 3, blue: heavy metals and
522
microcystins) sorted by the hierarchical clustering.
523 524
Figure 3. Functional Enrichment Analysis of the Twelve Hit Chemicals. (A) KEGG pathway
525
enrichment. Top five KEGG pathways are displayed for each of the twelve chemicals (Valproic
526
acid (VPA), Carbamazepine (CAR), Fluoxetine (FLO), Bisphenol A (BPA), 17-α Ethinylestradiol
527
(EE2), Genistein (GEN), Cyanopeptolin (CP1020), Microcystin-LR (MCLR), AgNO3, Ag150,
ACS Paragon Plus Environment
Page 24 of 36
Page 25 of 36
Environmental Science & Technology
528
Cobalt
and
2,3,7,8-Tetrachlorodibenzo-p-dioxin
529
transcriptional
530
neurotransmitters and synaptic vesicle cycle) under four types of chemicals exposure (VPA,
531
BPA, MCLR and Ag150). (C) Polar area diagram shows the enriched neural signaling pathways
532
in zebrafish embryos under VPA exposure. The top 10 enriched pathways are described on the
533
right. The percentage value shows the proportion of different expressed genes compared to the
534
related gene panel for each signaling pathway recruited for the present study.
responses
of
key
extrinsic
(TCDD)). signaling
(B)
Heatmap
(morphogens,
depicts
growth
the
factors,
535 536
Figure 4. Overlapping Functional Enrichments of the Neuro-active Chemicals. (A) Venn
537
diagram shows the overlap of different expressed RTA genes relative to control among the four
538
neuro-active chemicals (VPA, CAR, RA and CAFF). (B) The top four shared KEGG pathways
539
are shown at below. (C) Overlapped gene ontology terms (BP: Biological Process; CC: Cellular
540
Component; MF: Molecular Function) among the four neuroactive chemicals shown in a bubble
541
plot using the Goplot package. Top five GO terms with adjusted p-value