Article
Authentication of Zanthoxylum Species Based on Integrated Analysis of Complete Chloroplast Genome Sequences and Metabolite Profiles
Hyeon Ju Lee, Hyun Jo Koo, Jonghoon Lee, Dong Young Lee, Vo Ngoc Linh Giang, Minjung Kim, Hyeonah Shim, Jee Young Park, Ki-Oug Yoo, Sang Hyun Sung, and Tae-Jin Yang
J. Agric. Food Chem., Just Accepted Manuscript • DOI: 10.1021/acs.jafc.7b04167 • Publication Date (Web): 23 Oct 2017
Downloaded from http://pubs.acs.org on October 24, 2017
Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.
Journal of Agricultural and Food Chemistry
Authentication of Zanthoxylum Species Based on Integrated Analysis of Complete Chloroplast Genome Sequences and Metabolite Profiles
Hyeon Ju Lee1,4, Hyun Jo Koo1,4, Jonghoon Lee1,4, Sang-Choon Lee1, Dong Young Lee2, Vo Ngoc Linh Giang1, Minjung Kim1, Hyeonah Shim1, Jee Young Park1, Ki-Oug Yoo3, Sang Hyun Sung2*, Tae-Jin Yang1*
1 Department of Plant Science, Plant Genomics and Breeding Institute, and Research Institute of Agriculture and Life Sciences, College of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
2 College of Pharmacy and Research Institute of Pharmaceutical Science, Seoul National University, Seoul, 08826, Republic of Korea
3 Department of Biological Sciences, Kangwon National University, Chuncheon, Gangwon, 24341, Republic of Korea
4 These authors contributed equally to this work.
*Corresponding authors
Email: [email protected] Tel: +82-2-880-4547 Fax: +82-2-873-2056
Abstract
We performed chloroplast genome sequencing and comparative analysis of two Rutaceae species, Zanthoxylum schinifolium (Korean pepper tree) and Z. piperitum (Japanese pepper tree), which are medicinal and culinary crops in Asia. We identified more than 837 single nucleotide polymorphisms and 103 insertions/deletions (InDels) based on a comparison of the two chloroplast genomes and developed seven DNA markers derived from five tandem repeats and two InDel variations that discriminated between Korean Zanthoxylum species. Metabolite profile analysis pointed to three metabolic groups, one with Korean Z. piperitum samples, one with Korean Z. schinifolium samples and the last containing all the tested Chinese Zanthoxylum species samples, which are considered to be Z. bungeanum based on our results. Two markers were capable of distinguishing among these three groups. The chloroplast genome sequences identified in this study represent a valuable genomics resource for exploring diversity in Rutaceae, and the molecular markers will be useful for authenticating dried Zanthoxylum berries in the marketplace.
Key words: Zanthoxylum, Z. schinifolium, Z. piperitum, chloroplast genome, marker
Introduction
The chloroplast is an essential cytoplasmic organelle in plant cells, serving as the location for photosynthesis to produce energy via CO2 assimilation.1 Chloroplasts retain an autonomous organellar genome that encodes, among other proteins, the large subunit of the key enzyme in photosynthesis, RuBisCO (ribulose-1,5-bisphosphate carboxylase/oxygenase, rbcL).2 The circular chloroplast genome ranges in size from approximately 120 to 217 kb and is maternally inherited in most angiosperms.3 The chloroplast genome is usually divided into four parts: a large single copy (LSC) region and a small single copy (SSC) region separated by a pair of inverted repeats (IRs).4-6 Compared to nuclear and mitochondrial genomes, chloroplast genomes are highly conserved,7 and there is little variation within a single species. Even so, small nucleotide variations offer enough information to distinguish among different species and sometimes between different variants or cultivars within a species.
Sequence variations in chloroplast genomes can be found in both protein-coding genes (e.g., matK, rpoB, rpoC1 and rbcL) and intergenic regions (e.g., psbK-psbI, trnL-trnF and atpF-atpH). These variations have been used to study plant genetic diversity and evolution and to develop markers for authenticating plant species.8-13 Due to recent advances in sequencing and assembly technologies, the complete chloroplast genome sequences from more than 1800 species have been deposited in GenBank.14 Phylogenetic analyses based on chloroplast genome information have shed light on plant evolution. In phylogenomic studies using chloroplast genes, the selection of proper sequence datasets, taxon sampling techniques and methods for phylogenetic analysis (Bayesian analysis, maximum likelihood and so on) is important because different methods can produce different results; the correct methodology is therefore still under debate.15 To date, comprehensive surveys of genetic diversity using chloroplast genomes have been performed in many plant species using close relatives16-19 or within subspecies in plants such as Oryza sativa20 and Panax ginseng.14
The Zanthoxylum genus, which belongs to the Rutaceae family, comprises approximately 250 species of aromatic trees and shrubs.21 In Africa, the Americas and Asia, many Zanthoxylum species are traditionally used as food supplements or drugs due to the valuable aromatic oil compounds obtained from their pericarps and leaves.22-26 Some Asian species such as Z. piperitum (Japanese pepper), Z. schinifolium (Korean pepper) and Z. bungeanum (Szechuan pepper) are also used as condiments and spices due to their strong taste, especially in Eastern Asian countries including Korea, Japan and China,22-25, 27-29 while American and African Zanthoxylum species are not used for culinary purposes.25 Essential oils from Z. piperitum and Z. schinifolium contain beneficial compounds with anti-microbial, anti-inflammatory and antioxidant activities.23, 25, 27, 28, 30 Although Z. schinifolium and Z. piperitum plants appear similar, they can be distinguished based on the arrangement of their spikes on branches, which is alternate in Z. schinifolium and symmetrically opposite in Z. piperitum (Figure 1A, B). However, it is not easy to discriminate between fruits and seeds harvested from these plants due to their similar morphology, especially when their dried and ground pericarps are distributed in the marketplace. The chemical components of these species differ, including aromatic compounds (especially their isopulegol contents), but their metabolic profiles sometimes differ within a single species, such as in samples from different countries.22, 23, 30 It is even more difficult to distinguish between Z. schinifolium and Z. piperitum based on their metabolic profiles when the pericarp powders from the species are mixed. Differences in DNA sequences could be used to differentiate/identify these two Zanthoxylum species; however, the limited availability of genetic and genomic resources for both species represents an obstacle for the establishment of a clear molecular authentication system.31 Therefore, a reliable tool is needed for authenticating the pericarps from these Zanthoxylum species at the species level.
Several efforts have focused on developing markers to distinguish Z. schinifolium from Z. piperitum based on sequence variations in their internal transcribed spacer (ITS) regions in nuclear ribosomal DNA (nrDNA),31, 32 a well-known barcoding region.14, 33 However, complete chloroplast genome sequences exhibit less variation than nrDNA within species and can therefore provide important information for comprehensive analysis of genetic diversity and establishment of a clear molecular authentication system.17
In this study, we obtained the complete chloroplast genome sequences of Z. piperitum and Z. schinifolium by de novo assembly of whole-genome sequencing (WGS) data using next-generation sequencing (NGS) technology. We also carried out metabolic profiling of Korean and Chinese Zanthoxylum species, and developed practical molecular markers that distinguish Z. schinifolium, Z. piperitum and Chinese Zanthoxylum species.
Materials and Methods
Plant materials and DNA preparation
Twenty-three individual samples of Zanthoxylum species collected from Korea and China were used in this study, including 11 Chinese Zanthoxylum species (CZ) samples, eight Korean Z. piperitum (KZP) samples and four Korean Z. schinifolium (KZS) samples; their geographical origins are described in Table 1 and Figure 1C. Total genomic DNA was extracted from fresh leaves of KZP-08, KZS-03 and KZS-04 and from freeze-dried fruits of the other samples using a modified cetyltrimethylammonium bromide (CTAB) method.34 The quality and quantity of the extracted DNA samples were examined using a NanoDrop ND-1000 (Thermo Scientific, Wilmington, DE).
Whole-genome shotgun sequencing
To generate the chloroplast genome sequences, genomic DNA was extracted from the leaves of KZS-03 (Z. schinifolium) and KZP-08 (Z. piperitum) and used for whole-genome shotgun sequencing on the Illumina MiSeq platform (Illumina, San Diego, CA) and the Illumina NextSeq500 platform (Illumina, San Diego, CA), respectively (Table 1). A paired-end genomic library was constructed following the manufacturer’s instructions. Library construction and sequencing were carried out by Lab Genomics Co. (Seongnam, Korea).
Chloroplast genome assembly and gene annotation
Sequencing data with Phred scores of 20 or less were filtered out, and the remaining data were assembled de novo using CLC genome assembler (v. beta 4.6, CLC Inc., Aarhus, Denmark) according to the dnaLCW method described in Kim et al.14, 33 Principal contigs representing the chloroplast genome were combined into a draft sequence based on the linkages of overlapping contig sequences. Annotation of protein-coding genes in the chloroplast genome was carried out using the DOGMA program35 and manually confirmed using BLAST searches. Circular gene maps of the complete chloroplast genomes were drawn using OGDRAW (http://ogdraw.mpimp-golm.mpg.de/).36
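The quality filter mentioned at the start of this section can be illustrated with a minimal sketch. It assumes Phred+33 FASTQ quality encoding and applies the cutoff to the mean score of each read; the text does not state whether the threshold was applied per base or per read, and this is not the CLC assembler's own trimming routine. The function names and toy reads are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def passes_quality(quality_line, threshold=20):
    """Keep a read only if its mean Phred score is strictly above the threshold."""
    scores = phred_scores(quality_line)
    if not scores:
        return False
    return sum(scores) / len(scores) > threshold


def filter_reads(records, threshold=20):
    """Yield (header, sequence, quality) tuples whose mean quality passes the cutoff."""
    for header, seq, qual in records:
        if passes_quality(qual, threshold):
            yield header, seq, qual


# Toy reads, invented for illustration only.
reads = [
    ("@read1", "ACGT", "IIII"),  # 'I' encodes Phred 40
    ("@read2", "ACGT", "5555"),  # '5' encodes Phred 20, so "20 or less" is removed
    ("@read3", "ACGT", "####"),  # '#' encodes Phred 2
]
kept = [header for header, _, _ in filter_reads(reads)]
print(kept)
```

A per-base trimming strategy would be an equally plausible reading of the method; the sketch uses the per-read mean only because it is the simplest form of the filter.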
Comparative analysis of the chloroplast genomes of Zanthoxylum species
The assembled chloroplast genome sequence of Z. schinifolium was compared to the complete chloroplast genome sequence of Z. piperitum obtained from sample KZP-08 (GenBank No.: KT153018).37 The two sequences were aligned and compared using MAFFT (http://mafft.cbrc.jp/alignment/server/)38 and mVISTA (http://genome.lbl.gov/vista/mvista/submit.shtml).39 Annotation information for mVISTA was obtained using DOGMA35 and tRNAscan-SE,40 followed by manual curation that also included a comparison with published chloroplast genome sequences. In addition, tandem repeats (TRs) were identified from the chloroplast genomes of the two Zanthoxylum species using the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html)41 and compared to identify the regions that differ between Z. schinifolium and Z. piperitum. The ratios of nonsynonymous substitutions per nonsynonymous site (Ka) to synonymous substitutions per synonymous site (Ks) were calculated using PAL2NAL (http://www.bork.embl.de/pal2nal/).42
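Once two sequences have been aligned, substitutions and InDels can be tallied column by column. The sketch below shows one way to do this for a gapped pairwise alignment; it is an illustration only, not the procedure used by MAFFT or mVISTA. The function name, the convention of counting each contiguous gap run as one InDel event, and the choice to compute identity over ungapped columns are all assumptions, and the toy sequences are invented.

```python
def compare_alignment(seq_a, seq_b, gap="-"):
    """Tally matches, substitutions and InDel events in a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = snps = indel_events = 0
    gap_owner = None  # which sequence carries the currently open gap run
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column
        if a == gap or b == gap:
            owner = "a" if a == gap else "b"
            if owner != gap_owner:
                indel_events += 1  # a new gap run starts here
            gap_owner = owner
            continue
        gap_owner = None
        if a == b:
            matches += 1
        else:
            snps += 1
    compared = matches + snps
    identity = 100.0 * matches / compared if compared else 0.0
    return {"matches": matches, "snps": snps,
            "indel_events": indel_events, "identity": identity}


# Toy alignment, invented for illustration only.
result = compare_alignment("ACGTACGT--AC",
                           "ACGAAC-TTTAC")
print(result)
```

Other definitions of identity (for example, counting gap columns in the denominator) would give different percentages, so any comparison with published values depends on knowing which convention was used.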
To compare the ndhG sequences from Z. piperitum, Z. schinifolium and Z. bungeanum, these sequences and their translated sequences were aligned and compared using MAFFT (http://mafft.cbrc.jp/alignment/server/).38
Molecular marker analysis
To validate the inter-species polymorphisms in the chloroplast genomes and to develop DNA markers for discriminating these Zanthoxylum species, specific primers were designed based on polymorphic regions derived from InDels and copy number variation of the TRs between Z. piperitum and Z. schinifolium using the Primer 3 program (http://bioinfo.ut.ee/primer3-0.4.0/).43, 44 Seven molecular markers were developed based on the sequence variation between the Z. piperitum and Z. schinifolium chloroplast genomes. PCR amplifications were performed in a total volume of 25 µl containing 20 ng of genomic DNA template, 1× PCR buffer, 10 pM of each primer, 0.2 mM dNTPs and 1 U Taq DNA polymerase (Vivagen, Korea). The amplified PCR fragments were analyzed via size separation in 1.5% agarose gels or 9.0% polyacrylamide gels or by capillary electrophoresis using a Fragment Analyzer (Advanced Analytical Technologies Inc., Ankeny, IA), depending on the sizes of the PCR products.
Principal component analysis based on near-infrared reflectance spectroscopy analysis
The pericarp samples were cleaned, air-dried, placed into a stoppered glass vial and dried for 12 h in an oven at 60°C to remove moisture prior to near-infrared reflectance (NIR) spectroscopy analysis. NIR spectra were obtained from the samples using an NIR system (MPA; Bruker Optics, Ettlingen, Germany) over a wavenumber range of 10,000–4000 cm⁻¹ at a resolution of 8 cm⁻¹. Each spectrum represents an average of 32 scans. Approximately 1 g of sample was placed into a single glass sample vial. The spectra were acquired in reflectance mode using a glass sample vial as a reference standard. Each sample was measured three times, and the resulting spectra were averaged. NIR spectra are affected by both the chemical and physical properties of samples; the latter contribute the majority of unwanted variance among spectra. Therefore, spectral preprocessing must be performed to reduce systematic noise, such as light scattering, path length differences and baseline variation. In this study, several spectral preprocessing methods were compared to obtain the optimum results, including first derivative, second derivative, standard normal variate (SNV) and multiplicative scatter correction (MSC). To avoid the noise enhancement that occurs as a consequence of derivative analysis, a Savitzky-Golay smoothing filter was employed. NIR spectral data acquisition and spectral preprocessing were performed with OPUS 7.0 software (Bruker Optics, Ettlingen, Germany). SIMCA 13 software (Umetrics, Malmö, Sweden) was used for PCA. The data sets were Pareto-scaled prior to PCA.
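Two of the preprocessing steps named above, SNV and Pareto scaling, are simple enough to sketch in a few lines. The code below is a minimal illustration under stated assumptions: it uses the sample standard deviation, it does not reproduce the internal behaviour of OPUS or SIMCA, it omits the derivative and Savitzky-Golay steps, and the toy spectra are invented rather than taken from the study.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    if sd == 0:
        raise ValueError("flat spectrum cannot be SNV-corrected")
    return [(x - mu) / sd for x in spectrum]


def pareto_scale(matrix):
    """Pareto scaling: centre each variable (column) and divide by the square root of its SD."""
    scaled_columns = []
    for col in zip(*matrix):
        mu = mean(col)
        sd = stdev(col)
        scale = sd ** 0.5 if sd > 0 else 1.0  # leave constant columns centred only
        scaled_columns.append([(x - mu) / scale for x in col])
    return [list(row) for row in zip(*scaled_columns)]


# Toy spectra (rows = samples, columns = wavenumber points), invented for illustration.
# The second row is the first multiplied by 2, mimicking a pure scatter effect.
spectra = [
    [0.10, 0.20, 0.40],
    [0.20, 0.40, 0.80],
    [0.15, 0.35, 0.55],
]
corrected = [snv(row) for row in spectra]
scaled = pareto_scale(corrected)
print(corrected[0])
print(corrected[1])
```

After SNV the first two toy spectra coincide, which is the point of the correction: a purely multiplicative scatter difference is removed before the PCA sees the data.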
Phylogenetic analysis
The whole chloroplast genome sequences from 16 plant species were aligned using ClustalW, and a maximum likelihood tree was generated with very strong branch swap filter options using MEGA5 (version 5.2.2).45 To measure clade support, 1000 bootstrap replicates were generated. A Bayesian tree was generated from the same sequence alignment using BEAST (version 1.8.4)46 with the following options: substitution model, HKY; base frequencies, Estimated; site heterogeneity model, None; tree prior, Coalescent - Constant Size; tree model, Random starting model. Two Markov chain Monte Carlo searches were run with a chain length of 10,000,000 generations, with trees sampled every 1000 generations. TreeAnnotator was run with the following options: burn-in (as trees), 100; posterior probability limit, 0; target tree type, Maximum clade credibility tree; node heights, Mean heights. A final tree with posterior probability values at the clade nodes was generated with FigTree version 1.4.3 and edited with MEGA5 (version 5.2.2).45
Results and Discussion
Complete chloroplast genome sequence of Z. schinifolium
We obtained approximately 3.26 Gb and 4.23 Gb of paired-end sequences from whole-genome sequencing of KZP-08 (Z. piperitum) and KZS-03 (Z. schinifolium), respectively (Table 2), using low-coverage WGS, an efficient method that has been used to produce complete chloroplast genome sequences in many plant species.14, 19, 20, 33, 37 Compared to the raw sequence data for KZP-08, the sequencing data for KZS-03 contain many more raw sequence reads corresponding to the chloroplast genome (164.87x chloroplast coverage from 3.26 Gb in KZP-08 and 1069.04x chloroplast coverage from 4.23 Gb in KZS-03) (Table 2). Following de novo assembly of the KZS-03 data, five contigs were produced for the chloroplast genome, which were ordered based on the complete chloroplast genome sequence of Z. piperitum (GenBank No.: KT153018). The contigs were merged into a single circular draft sequence by combining overlapping sequences. After putative assembly errors were curated by mapping raw reads onto the draft sequence, we obtained a complete chloroplast genome sequence of 158,963 bp with 38.4% GC content. The chloroplast genome of Z. schinifolium has a typical quadripartite structure with a pair of inverted repeat regions (IRa and IRb, each 27,085 bp) separated by a large single copy (LSC) region (86,528 bp) and a small single copy (SSC) region (18,265 bp) (Table 2). In addition, analysis of GC contents (calculated based on the GC composition in 100 bp sliding windows) and raw read mapping depth revealed that parts of the IR regions next to the SSC have relatively high GC contents with lower sequencing depth in the chloroplast genomes of both Zanthoxylum species (Figure S1). In total, we identified 111 genes in the Z. piperitum and Z. schinifolium chloroplast genomes, including 78 protein-coding genes, 29 tRNA genes and 4 rRNA genes; 18 of these genes contain introns (Table S1). When counting gene numbers, duplicated genes in IRa and IRb were considered to be one gene instead of two. The complete chloroplast genome sequence of Z. schinifolium was deposited in GenBank under accession number KT321318.
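The sliding-window GC calculation mentioned in this section can be sketched as follows. The window size of 100 bp comes from the text; the step size is not stated, so it is left as a parameter here, and the function names and toy sequence are hypothetical.

```python
def gc_content(seq):
    """Return the GC percentage of a nucleotide sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq, window=100, step=1):
    """Return (start, GC%) pairs for every full window along the sequence."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy 200-bp sequence: an AT-only half followed by a GC-only half.
toy = "AT" * 50 + "GC" * 50
windows = sliding_gc(toy, window=100, step=50)
print(windows)
```

Plotting these values against the per-window read depth would give the kind of comparison summarized in Figure S1, although the exact plotting procedure is not described in the text.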
Comparative analysis of the chloroplast genomes of Z. piperitum versus Z. schinifolium
The chloroplast genome sequences of the two Zanthoxylum species are 97.1% identical, and their GC contents are also very similar (38.5% and 38.4% in Z. piperitum and Z. schinifolium, respectively) (Table 2). Compared to the Z. piperitum chloroplast genome, the Z. schinifolium chloroplast genome is 809 bp longer, with shorter IR regions (27,644 bp in Z. piperitum and 27,085 bp in Z. schinifolium) but longer LSC and SSC regions (85,340 bp and 17,526 bp, respectively, in Z. piperitum and 86,528 bp and 18,265 bp, respectively, in Z. schinifolium) (Table 2). Both chloroplast genomes contain 112 identical genes, which are present in the same order in the genome (Figure 2). The IR regions in both genomes contain completely duplicated genes, including eight protein-coding genes (rps19, rpl2, rpl23, ycf2, ycf15, ndhB, rps7 and rps12), seven tRNA genes (trnI-CAU, trnL-CAA, trnV-GAC, trnI-GAU, trnA-UGC, trnR-ACG and trnN-GUU) and four rRNA genes (rrn16, rrn23, rrn4.5 and rrn5) (Table S1).
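The region sizes quoted above are mutually consistent, which can be checked with a short calculation: in a quadripartite chloroplast genome the total length is LSC + SSC + 2 × IR. The sketch below uses only the sizes given in the text; the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlastomeRegions:
    """Region sizes (bp) of a quadripartite chloroplast genome."""
    lsc: int
    ssc: int
    ir: int  # length of one inverted repeat copy

    def total_length(self):
        # The IR is present twice (IRa and IRb).
        return self.lsc + self.ssc + 2 * self.ir


# Region sizes as reported in the text (Table 2 of the source).
z_schinifolium = PlastomeRegions(lsc=86_528, ssc=18_265, ir=27_085)
z_piperitum = PlastomeRegions(lsc=85_340, ssc=17_526, ir=27_644)

difference = z_schinifolium.total_length() - z_piperitum.total_length()
print(z_schinifolium.total_length(), z_piperitum.total_length(), difference)
```

The Z. schinifolium total reproduces the reported 158,963 bp, and the difference between the two totals reproduces the reported 809 bp.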
We performed comparative analysis using mVISTA to determine the level of sequence divergence, finding that intergenic regions are more divergent than genic regions (Figure S2). The nucleotide and amino acid sequences of protein-coding genes are highly similar, with average sequence similarities of 98.6% and 98.5%, respectively (Table S2). When we aligned both chloroplast sequences, there were two notably large InDels: one in the rps16–trnQ (UUG) intergenic region in the LSC (473 bp) and one in ycf1 in IRa (582 bp) (see red arrowheads in Figure 2 and red dashed boxes in Figure S2). The ycf1 genes are located at the borders between the SSC and the two IR regions, and an extended IR region in Z. piperitum (approximately 560 bp) caused the latter large InDel (Figure 3).
We compared TRs within the chloroplast genome between Z. piperitum and Z. schinifolium. Nineteen TR regions differ between the two Zanthoxylum species, and TRs ranging from 15–38 bp in length were repeated from 0.5 to 3 times (Table 3). All TRs were found in the 14 intergenic regions, including 13 located in LSC and one in an IR region (blue arrowheads in Figure 2).
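Copy number variation in a tandem repeat translates directly into a length difference between alleles, which is what makes TRs useful as size-based PCR markers. The following sketch counts consecutive full copies of a repeat unit and derives the expected size difference. It is an illustration only and not the Tandem Repeats Finder algorithm; the 15-bp unit, flanking bases and function names are invented, and partial copies (such as the 0.5 copies reported in the text) are handled only by the size calculation, not by the counter.

```python
def count_tandem_copies(sequence, unit, start=0):
    """Count consecutive full copies of `unit` beginning at position `start`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    copies = 0
    pos = start
    while sequence.startswith(unit, pos):
        copies += 1
        pos += len(unit)
    return copies


def expected_size_difference(unit_length, copies_a, copies_b):
    """Expected length difference (bp) between two alleles differing only in copy number."""
    return round(unit_length * (copies_a - copies_b))


# Hypothetical 15-bp repeat unit and two toy alleles (not sequences from the study).
unit = "ATTAGATTCTAATCA"
allele_a = "GG" + unit * 3 + "CC"
allele_b = "GG" + unit * 2 + "CC"

copies_a = count_tandem_copies(allele_a, unit, start=2)
copies_b = count_tandem_copies(allele_b, unit, start=2)
print(copies_a, copies_b, expected_size_difference(len(unit), copies_a, copies_b))
```

For these toy alleles the expected difference equals the actual difference in sequence length, which is the quantity a gel or capillary run would resolve.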
Divergence of coding gene sequences
Between the two Zanthoxylum species, 17 and 34 genes share identical nucleotide and amino acid sequences, respectively (Table S2). Several genes with higher Ka, Ks or Ka/Ks values are indicated in Figure S3. The average Ks values between the two Zanthoxylum species are 0.0185, 0.0250 and 0.0059 in the LSC, SSC and IR regions, respectively, with an overall average of 0.0165 (Table S2, Figure S3). This result is in agreement with the previous finding that IR regions are more conserved than other regions because they frequently compensate for each other.19 However, small variations were observed even in highly conserved coding regions. Genes in the SSC region had higher rates of change at non-synonymous sites and higher average Ka/Ks ratios, indicating that genes in the SSC region are relatively more variable between the two Zanthoxylum species than those in other regions. Only one gene had a Ka/Ks ratio >1 (ndhG in the SSC region, with a Ka/Ks ratio of 1.4324) (Table S2). NdhG is a subunit of the NADPH dehydrogenase complex, which provides electrons for cyclic electron flow47 and helps protect the plant against photo-oxidative stress.47, 48 NdhG might be structurally similar to the NuoJ subunit of Escherichia coli complex I (NADH:ubiquinone oxidoreductase) and the Nqo10 subunit of Thermus thermophilus complex I, and it appears that NdhG offers part of a plastoquinone-binding site on its surface and is not involved in electron transport.49 Therefore, NdhG is likely to be more of a structural subunit than a functional subunit of the NADPH dehydrogenase complex, and ZpNdhG and ZsNdhG, with four amino acid differences, might both be functional. These results imply that the accumulation of non-synonymous mutations that do not affect protein function can sometimes result in a high Ka/Ks value even without positive selection.
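The figures in this paragraph can be related with a short calculation. The sketch below takes the three regional Ks averages from the text and computes their unweighted mean, which comes to 0.0165 when rounded to four decimals; whether the authors derived their reported value this way is an assumption. The ratio helper is generic, and the Ka and Ks values passed to it are hypothetical placeholders, not entries from Table S2.

```python
from statistics import mean

# Regional average Ks values as reported in the text.
ks_by_region = {"LSC": 0.0185, "SSC": 0.0250, "IR": 0.0059}

# Unweighted mean of the three regional averages (an assumed way to combine them).
overall_ks = round(mean(list(ks_by_region.values())), 4)


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


# Hypothetical values for illustration only.
example_ratio = ka_ks_ratio(0.003, 0.002)
print(overall_ks, example_ratio)
```

A ratio above 1 is commonly read as a sign of positive selection, but as the ndhG case in the text shows, it can also arise when the few non-synonymous changes present do not affect protein function.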
Validation of inter-species polymorphism and development of authentication markers
We performed molecular classification of the two Zanthoxylum species by designing markers based on InDels and TR copy number variations. We designed seven primer sets derived from the two large InDels and five intergenic regions harboring TRs and confirmed them by PCR analysis of Zanthoxylum species using two accessions each of Chinese Zanthoxylum species (CZP-03, CZP-11), Korean Z. piperitum (KZP-01, KZP-08) and Korean Z. schinifolium (KZS-03, KZS-04) (Table 3 and Table 4). Schematic diagrams of the five TR markers are shown in Figure 4A–E. The lengths of the marker regions vary due to tandem repeats and insertions, and the sizes of the PCR products from these markers varied as expected from their sequences (Figure 4F–J). Although three markers (two TR markers and one InDel marker) gave the same product sizes in Z. piperitum and Z. bungeanum, all seven markers revealed inter-specific polymorphism and clearly discriminated between Z. piperitum and Z. schinifolium (see Figure 4 for TR markers, Figure S4 for InDel markers and Figure S5 for all samples).
Authentication of Zanthoxylum species using the newly developed markers and metabolite profile analysis
The sizes of PCR products from KZP and KZS amplified using the TR markers and InDel markers (Figure 4, Figure S4 and Figure S5) were as expected (Table 4). Since the CZ samples were purchased from markets in China, their tentative production area was known but their scientific names were uncertain, although we thought they could have been obtained from Z. piperitum or Z. schinifolium plants grown in China. When we analyzed metabolite data using near-infrared reflectance spectroscopy, principal component analysis (PCA) indicated that the CZ samples harbored distinct metabolites and therefore might have been different species from the KZP and KZS samples (Figure 5). Several Zanthoxylum species grow in China, and the chloroplast genome sequence of one of these species, Z. bungeanum, has been reported (GenBank No. KX497031). Notably, the sizes of PCR products from CZ matched the sizes expected from the Z. bungeanum chloroplast genome (Figure 4, Figure S4, Figure S5 and Table 4). Several varieties of Z. bungeanum also grow in Szechuan province; the fruits of these varieties, as well as of Z. armatum, are commonly referred to as Szechuan peppers.29 Of the 11 CZ samples, three were also obtained from Szechuan province, and we believe that the CZ samples are Szechuan peppers. To date, the chloroplast genomes of Chinese Zanthoxylum species other than Z. bungeanum have yet to be sequenced, so it is unclear whether the chloroplast genomes of several Chinese Zanthoxylum species are highly similar. However, we confirmed that the PCR product sizes of all CZ samples collected from China match those expected from Z. bungeanum. Therefore, our two TR markers, IMZanTR-1 and IMZanTR-3, can be used to distinguish Z. piperitum, Z. schinifolium and Z. bungeanum and to identify Korean pepper, Japanese pepper and some Szechuan peppers in the marketplace.
Zanthoxylum species in East Asia
Comparative analysis of metabolites revealed different profiles between Z. piperitum and Z. schinifolium. While the Z. piperitum pericarp produces oleic acid as a major fatty acid, the Z. schinifolium pericarp produces linolenic acid instead.22 Among terpenoids, isopulegol is produced at the highest levels in Z. piperitum pericarp, followed by myrcene, whereas myrcene is produced at the highest levels in Z. schinifolium pericarp, followed by citronellal.22 Since isopulegol can be synthesized from citronellal by squalene-hopene cyclase,50 perhaps Z. piperitum contains high levels of oxidosqualene cyclase for this conversion. However, Z. piperitum fruit from Japan contains high levels of limonene as a major terpene product and produces very little isopulegol,51 and Z. schinifolium from China produces linalool as a major product.52 The metabolite profiles of plants can vary based on the environment, developmental stage, storage conditions after harvest, metabolite extraction method and so on. To identify the differences in metabolite profiles among Zanthoxylum species, it is best to perform experiments using the same method. Unlike metabolite analysis, genetic information is highly reproducible and digital.
We constructed a maximum likelihood (ML) tree for several Rutaceae species using whole chloroplast genome sequences with a clade of Anacardiaceae species as an outgroup (Figure 6). In this phylogenetic tree, Z. piperitum and Z. bungeanum are closer to each other than to Z. schinifolium. The ndhG sequence data support this result; there are four amino acid differences between Z. piperitum and Z. schinifolium but none between Z. piperitum and Z. bungeanum. The Bayesian (B) tree has the same structure as the ML tree (Figure S6). Interestingly, Korean Z. piperitum is genetically closer to Chinese Z. bungeanum than to Korean Z. schinifolium. Perhaps Z. piperitum and Z. bungeanum diverged after their ancestor split from Z. schinifolium; this hypothesis is well supported by both bootstrap and posterior probability values from the ML and B trees, respectively.
Z. piperitum is preferred for use as a spice over Z. schinifolium in Korea and Japan, and Z. bungeanum is also widely used as a spice in China. There are additional Zanthoxylum species in East Asia, some of which also produce culinary seeds. More sequencing data from these species, such as Z. armatum and Z. simulans, will shed light on the evolution of the Zanthoxylum species used as spices in this area.
The sizes of the PCR products obtained using the newly developed markers were more similar between the KZP and CZ samples than between either of these groups and the KZS samples (Figure 4, Table 3). However, to discriminate species other than Z. piperitum, Z. schinifolium and Z. bungeanum, these markers would need to be applicable to a much broader range of Zanthoxylum species. The availability of additional sequencing data from other Zanthoxylum species will also facilitate the development of a system for discriminating all of these species in the marketplace.
Abbreviations Used
LSC, large single copy; SSC, small single copy; IR, inverted repeat; TR, tandem repeat; InDel, insertion or deletion; KZP, Korean Zanthoxylum piperitum; KZS, Korean Zanthoxylum schinifolium; CZ, Chinese Zanthoxylum species
350 351
Acknowledgments
352
We thank all members of the Laboratory of Functional Crop Genomics and Biotechnology,
353
Seoul National University and Phyzen Genomics Institute for their technical assistance.
354 355
Funding Sources
356
This research was supported by the Next-Generation BioGreen21 Program for Agriculture
357
and Technology Development (Project No. PJ01103001) of the Rural Development
358
Administration and the Bio and Medical Technology Development Program of the NRF
359
funded by the Korean government, MSIP (NRF-2015M3A9A5030733), Republic of Korea.
Literature cited
1. Sharkey, T. D., Photosynthesis in intact leaves of C3 plants: physics, physiology and rate limitations. The Botanical Review 1985, 51, 53-105.
2. Bedbrook, J. R.; Coen, D. M.; Beaton, A. R.; Bogorad, L.; Rich, A., Location of the single gene for the large subunit of ribulosebisphosphate carboxylase on the maize chloroplast chromosome. J Biol Chem 1979, 254, 905-10.
3. Hagemann, R., The foundation of extranuclear inheritance: plastid and mitochondrial genetics. Molecular Genetics and Genomics 2010, 283, 199-209.
4. CBOL Plant Working Group, A DNA barcode for land plants. Proceedings of the National Academy of Sciences of the United States of America 2009, 106, 12794-7.
5. Palmer, J. D., Contrasting modes and tempos of genome evolution in land plant organelles. Trends in Genetics 1990, 6, 115-120.
6. J.D., P., Cell organelles. Plant Gene Research 1992, 99-122.
7. Taberlet, P.; Gielly, L.; Pautou, G.; Bouvet, J., Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant molecular biology 1991, 17, 1105-1109.
8. Bortiri, E.; Oh, S.-H.; Jiang, J.; Baggett, S.; Granger, A.; Weeks, C.; Buckingham, M.; Potter, D.; Parfitt, D. E., Phylogeny and systematics of Prunus (Rosaceae) as determined by sequence analysis of ITS and the chloroplast trnL-trnF spacer DNA. Systematic Botany 2001, 26, 797-807.
9. Samuel, R.; Stuessy, T. F.; Tremetsberger, K.; Baeza, C. M.; Siljak-Yakovlev, S., Phylogenetic relationships among species of Hypochaeris (Asteraceae, Cichorieae) based on ITS, plastid trnL intron, trnL-F spacer, and matK sequences. American Journal of Botany 2003, 90, 496-507.
10. Lee, B.-R.; Kim, S.-H.; Huh, M.-K., Phylogenic Study of Genus Asarum (Aristolochiaceae) in Korea by trnL-trnT Region. Journal of Life Science 2010, 20, 1697-1703.
11. Huh, M.-K.; Yoon, H.-J.; Choi, J.-S., Phylogenic Study of Genus Citrus and Two Relative Genera in Korea by trnL-trnF Sequence. Journal of Life Science 2011, 21, 1452-1459.
12. Yang, J. Y.; Jang, S. Y.; Kim, H.-K.; Park, S. J., Development of a molecular marker to discriminate Korean Rubus species medicinal plants based on the nuclear ribosomal DNA internal transcribed spacer and chloroplast trnL-F intergenic region sequences. Journal of the Korean Society for Applied Biological Chemistry 2012, 55, 281-289.
13. Kim, J. H.; Jung, J.-Y.; Choi, H.-I.; Kim, N.-H.; Park, J. Y.; Lee, Y.; Yang, T.-J., Diversity and evolution of major Panax species revealed by scanning the entire chloroplast intergenic spacer sequences. Genetic resources and crop evolution 2013, 60, 413-425.
14. Kim, K.; Lee, S.-C.; Lee, J.; Lee, H. O.; Joh, H. J.; Kim, N.-H.; Park, H.-S.; Yang, T.-J., Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species. PLoS One 2015, 10, e0117159.
15. Jansen, R. K.; Kaittanis, C.; Saski, C.; Lee, S. B.; Tomkins, J.; Alverson, A. J.; Daniell, H., Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome sequences: effects of taxon sampling and phylogenetic methods on resolving relationships among rosids. BMC evolutionary biology 2006, 6, 32.
16. Terakami, S.; Matsumura, Y.; Kurita, K.; Kanamori, H.; Katayose, Y.; Yamamoto, T.; Katayama, H., Complete sequence of the chloroplast genome from pear (Pyrus pyrifolia): genome structure and comparative analysis. Tree Genetics & Genomes 2012, 8, 841-854.
17. Ku, C.; Chung, W.-C.; Chen, L.-L.; Kuo, C.-H., The complete plastid genome sequence of Madagascar periwinkle Catharanthus roseus (L.) G. Don: plastid genome evolution, molecular marker identification, and phylogenetic implications in asterids. PLoS One 2013, 8, e68518.
18. Su, H. J.; Hogenhout, S. A.; Al-Sadi, A. M.; Kuo, C. H., Complete chloroplast genome sequence of Omani lime (Citrus aurantiifolia) and comparative analysis within the rosids. PLoS One 2014, 9, e113049.
19. Cho, K. S.; Yun, B. K.; Yoon, Y. H.; Hong, S. Y.; Mekapogu, M.; Kim, K. H.; Yang, T. J., Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum). PLoS One 2015, 10, e0125332.
20. Tong, W.; He, Q.; Wang, X. Q.; Yoon, M. Y.; Ra, W. H.; Li, F.; Yu, J.; Oo, W. H.; Min, S. K.; Choi, B. W., A chloroplast variation map generated using whole genome re-sequencing of Korean landrace rice reveals phylogenetic relationships among Oryza sativa subspecies. Biological Journal of the Linnean Society 2015, 115, 940-952.
21. Arun, K.; Paridhavi, M., An ethno botanical phytochemical and pharmacological utilization of widely distributed species Zanthoxylum: a comprehensive overview. Int. J. Pharm. Invent 2012, 2, 24-35.
22. Ko, Y.-S.; Han, H.-J., Chemical Constituents of Korean Chopi (Zanthoxylum piperitum) and Sancho (Zanthoxylum schinifolium). Korean J. Food Sci. Technol. 1996, 28, 19-27.
23. Kim, J.; Jeong, C.-H.; Bae, Y.-I.; Shim, K.-H., Chemical components of Zanthoxylum schinifolium and Zanthoxylum piperitum leaves. Korean J. Postharvest Sci. Technol. 2000, 7, 189-194.
24. Adesina, S., The Nigerian Zanthoxylum; chemical and biological values. African Journal of Traditional, Complementary, and Alternative Medicines 2005, 2, 282-301.
25. Yang, X., Aroma constituents and alkylamides of red and green huajiao (Zanthoxylum bungeanum and Zanthoxylum schinifolium). Journal of agricultural and food chemistry 2008, 56, 1689-1696.
26. Gupta, D. D.; Mandi, S. S., Species Specific AFLP Markers for authentication of Zanthoxylum acanthopodium & Zanthoxylum oxyphyllum. J Med Plants 2013, 1, 1-9.
27. Paik, S.-Y.; Koh, K.-H.; Beak, S.-M.; Paek, S.-H.; Kim, J.-A., The essential oils from Zanthoxylum schinifolium pericarp induce apoptosis of HepG2 human hepatoma cells through increased production of reactive oxygen species. Biological and Pharmaceutical Bulletin 2005, 28, 802-807.
28. Yamazaki, E.; Inagaki, M.; Kurita, O.; Inoue, T., Antioxidant activity of Japanese pepper (Zanthoxylum piperitum DC.) fruit. Food chemistry 2007, 100, 171-177.
29. Xiang, L.; Liu, Y.; Xie, C.; Li, X.; Yu, Y.; Ye, M.; Chen, S., The Chemical and Genetic Characteristics of Szechuan Pepper (Zanthoxylum bungeanum and Z. armatum) Cultivars and Their Suitable Habitat. Front Plant Sci 2016, 7, 467.
30. Cho, S.-H.; Kwon, E.-H.; Oh, S.-H.; Woo, M.-H., Suppressive Effects of the Extract of Zanthoxylum schinifolium and Essential Oil from Zanthoxylum piperitum on Pacific Saury, Cololabis saira Kwamegi. Journal of the Korean Society of Food Science and Nutrition 2009, 38, 1753-1759.
31. Sun, Y.-L.; Park, W.-G.; Kwon, O.-W.; Hong, S.-K., The internal transcribed spacer rDNA specific markers for identification of Zanthoxylum piperitum. African Journal of Biotechnology 2013, 9, 6027-6039.
32. Kim, Y.-J.; Zhang, D.; Yang, D.-C., Biosynthesis and biotechnological production of ginsenosides. Biotechnology advances 2015, 33, 717-735.
33. Kim, W. J.; Ji, Y.; Lee, Y. M.; Kang, Y. M.; Choi, G.; Moon, B. C., Development of Molecular Markers for the authentication of Zanthoxyli Pericarpium by the analysis of rDNA-ITS DNA barcode regions. The Korea Journal of Herbology 2015, 30, 41-47.
34. Allen, G.; Flores-Vergara, M.; Krasynanski, S.; Kumar, S.; Thompson, W., A modified protocol for rapid DNA isolation from plant tissues using cetyltrimethylammonium bromide. Nature protocols 2006, 1, 2320-2325.
35. Wyman, S. K.; Jansen, R. K.; Boore, J. L., Automatic annotation of organellar genomes with DOGMA. Bioinformatics 2004, 20, 3252-3255.
36. Lohse, M.; Drechsel, O.; Bock, R., OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Current genetics 2007, 52, 267-274.
37. Kim, K.; Lee, S. C.; Lee, J.; Yu, Y.; Yang, K.; Choi, B. S.; Koh, H. J.; Waminal, N. E.; Choi, H. I.; Kim, N. H.; Jang, W.; Park, H. S.; Lee, J.; Lee, H. O.; Joh, H. J.; Lee, H. J.; Park, J. Y.; Perumal, S.; Jayakodi, M.; Lee, Y. S.; Kim, B.; Copetti, D.; Kim, S.; Kim, S.; Lim, K. B.; Kim, Y. D.; Lee, J.; Cho, K. S.; Park, B. S.; Wing, R. A.; Yang, T. J., Complete chloroplast and ribosomal sequences for 30 accessions elucidate evolution of Oryza AA genome species. Sci Rep 2015, 5, 15655.
38. Katoh, K.; Toh, H., Recent developments in the MAFFT multiple sequence alignment program. Briefings in bioinformatics 2008, 9, 286-298.
39. Frazer, K. A.; Pachter, L.; Poliakov, A.; Rubin, E. M.; Dubchak, I., VISTA: computational tools for comparative genomics. Nucleic Acids Res 2004, 32, W273-9.
40. Schattner, P.; Brooks, A. N.; Lowe, T. M., The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res 2005, 33, W686-9.
41. Benson, G., Tandem repeats finder: a program to analyze DNA sequences. Nucleic acids research 1999, 27, 573.
42. Suyama, M.; Torrents, D.; Bork, P., PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic acids research 2006, 34, W609-W612.
43. Koressaar, T.; Remm, M., Enhancements and modifications of primer design program Primer3. Bioinformatics 2007, 23, 1289-91.
44. Untergasser, A.; Cutcutache, I.; Koressaar, T.; Ye, J.; Faircloth, B. C.; Remm, M.; Rozen, S. G., Primer3--new capabilities and interfaces. Nucleic Acids Res 2012, 40, e115.
45. Tamura, K.; Peterson, D.; Peterson, N.; Stecher, G.; Nei, M.; Kumar, S., MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol 2011, 28, 2731-9.
46. Drummond, A. J.; Suchard, M. A.; Xie, D.; Rambaut, A., Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol 2012, 29, 1969-73.
47. Casano, L. M.; Zapata, J. M.; Martin, M.; Sabater, B., Chlororespiration and poising of cyclic electron transport. Plastoquinone as electron transporter between thylakoid NADH dehydrogenase and peroxidase. J Biol Chem 2000, 275, 942-8.
48. Martin, M.; Casano, L. M.; Sabater, B., Identification of the product of ndhA gene as a thylakoid protein synthesized in response to photooxidative treatment. Plant Cell Physiol 1996, 37, 293-8.
49. Battchikova, N.; Eisenhut, M.; Aro, E. M., Cyanobacterial NDH-1 complexes: novel insights and remaining puzzles. Biochim Biophys Acta 2011, 1807, 935-44.
50. Hammer, S. C.; Marjanovic, A.; Dominicus, J. M.; Nestl, B. M.; Hauer, B., Squalene hopene cyclases are protonases for stereoselective Bronsted acid catalysis. Nat Chem Biol 2015, 11, 121-6.
51. Jiang, L.; Kubota, K., Differences in the volatile components and their odor characteristics of green and ripe fruits and dried pericarp of Japanese pepper (Xanthoxylum piperitum DC.). J Agric Food Chem 2004, 52, 4197-203.
52. Yang, X., Aroma constituents and alkylamides of red and green huajiao (Zanthoxylum bungeanum and Zanthoxylum schinifolium). J Agric Food Chem 2008, 56, 1689-96.
53. Lee, J.; Lee, H. J.; Kim, K.; Lee, S. C.; Sung, S. H.; Yang, T. J., The complete chloroplast genome sequence of Zanthoxylum piperitum. Mitochondrial DNA A DNA Mapp Seq Anal 2016, 27, 3525-6.
Figure captions
Figure 1. Zanthoxylum species used in this study, and areas of origin of leaf samples or seed products collected from various markets. (A, B) Morphological characteristics of Z. piperitum and Z. schinifolium, respectively, used for genome and RNA sequencing. Opposite or alternately arranged spikes are indicated by orange circles. (C) Collection areas of seed products utilized for food or oriental medicine production. Collection areas for Korean Z. piperitum (KZP), Korean Z. schinifolium (KZS) and Chinese Zanthoxylum species (CZ) are indicated by circles on the map.
Figure 2. Circular gene maps of the chloroplast genomes of the two Zanthoxylum species. Genes shown inside the circle are transcribed clockwise, and those outside the circle are transcribed counterclockwise. The genes are colored according to their functions, as shown in the legend. Polymorphic sites between two Zanthoxylum species derived from the copy number variation of tandem repeat sequences are denoted with blue arrowheads at 14 locations, and the two InDel sites are marked by red arrows; those used for marker development are indicated by “*”.
Figure 3. Comparison of the borders of SSC and IR regions between the chloroplast genomes of the two Zanthoxylum species. Compared to the Z. piperitum sequences, IR regions in Z. schinifolium are shorter, and part of ycf1 in the SSC region (indicated with a triangle) was removed.
Figure 4. Schematic diagram of TRs and insertions in five TR markers (A–E), and confirmation of these markers for the discrimination of Zanthoxylum species (F–G). PCR products from two individuals each of CZ, KZP and KZS are of the predicted sizes using TR-type markers. The primer sequences and expected sizes are shown in Table 4, and PCR results from whole samples (11 CZ, 8 KZP and 4 KZS) are shown in Figure S5.
Figure 5. Comparison of the metabolite profiles of Zanthoxylum seed samples. Principal component analysis of pericarp metabolites based on near-infrared reflectance spectroscopy data formed three groups: Chinese Zanthoxylum species (CZ), Korean Z. piperitum (KZP) and Korean Z. schinifolium (KZS).
Figure 6. Phylogenetic tree including sequences from Z. piperitum, Z. schinifolium and Z. bungeanum. The maximum likelihood tree was generated using whole chloroplast genome sequences with 1000 bootstrap replicates. Bootstrap values are shown in percentages, and posterior probability values from the Bayesian tree (adapted from Figure S6) are in parentheses. A clade of Anacardiaceae species was used as an outgroup.
Table 1. List of Zanthoxylum species used in the study
Species                             Sample    Geographical origin
Chinese Zanthoxylum species (CZ)    CZ-01     Szechuan province
                                    CZ-02     Szechuan province
                                    CZ-03     Szechuan province
                                    CZ-04     Shantung province
                                    CZ-05     Shantung province
                                    CZ-06     Shanxi province
                                    CZ-07     Shanxi province
                                    CZ-08     Shanxi province
                                    CZ-09     Shanxi province
                                    CZ-10     Shanxi province
                                    CZ-11     Shanxi province
Korean Z. piperitum (KZP)           KZP-01    Geochang, Gyeongsangnam-do
                                    KZP-02    Mungyeong, Gyeongsangbuk-do
                                    KZP-03    Gurye, Jeollanam-do
                                    KZP-04    Gurye, Jeollanam-do
                                    KZP-05    Gurye, Jeollanam-do
                                    KZP-06    Gimje, Jeollabuk-do
                                    KZP-07    Imsil, Jeollabuk-do
                                    KZP-08*   Geoje, Gyeongsangnam-do
Korean Z. schinifolium (KZS)        KZS-01    Hongcheon, Gangwon-do
                                    KZS-02    Yangpyeong, Gyeonggi-do
                                    KZS-03*   Chuncheon, Gangwon-do
                                    KZS-04    Yongin, Gyeonggi-do
* Plant materials used for complete chloroplast genome sequencing
Table 2. Summary of whole genome sequencing and chloroplast genome assembly in Zanthoxylum species
Species (Sample No.)        Raw data amount (Gbp)   GenBank No.   Cp coverage (x)   LSC (bp)   SSC (bp)   IR (bp)   Total (bp)   GC content
Z. piperitum* (KZP-08)      3.26                    KT153018      164.87            85,340     17,526     27,644    158,154      38.5%
Z. schinifolium (KZS-03)    4.23                    KT321318      1069.04           86,528     18,265     27,085    158,963      38.4%
Sequence identity (%)                                                               96.4       89.9       97.1      97.1
* Cited from Lee et al. (ref 53)
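The region lengths in Table 2 can be cross-checked with simple arithmetic, because a quadripartite chloroplast genome contains one LSC, one SSC and two IR copies. The following Python sketch uses only the values reported in the table (the function and variable names are illustrative) to recompute each total and the size gap between the two genomes.

```python
# Region lengths (bp) as reported in Table 2.
genomes = {
    "Z. piperitum (KZP-08)": {"LSC": 85_340, "SSC": 17_526, "IR": 27_644, "Total": 158_154},
    "Z. schinifolium (KZS-03)": {"LSC": 86_528, "SSC": 18_265, "IR": 27_085, "Total": 158_963},
}


def quadripartite_total(lengths: dict) -> int:
    """Total length = LSC + SSC + two copies of the inverted repeat."""
    return lengths["LSC"] + lengths["SSC"] + 2 * lengths["IR"]


for name, lengths in genomes.items():
    computed = quadripartite_total(lengths)
    status = "matches" if computed == lengths["Total"] else "differs from"
    print(f"{name}: computed {computed} bp, {status} the reported total")

# Difference in total genome size between the two species.
size_gap = (
    genomes["Z. schinifolium (KZS-03)"]["Total"]
    - genomes["Z. piperitum (KZP-08)"]["Total"]
)
print(f"Z. schinifolium is {size_gap} bp longer")
```

Both reported totals agree with the sum of their parts, which also confirms that the IR column gives the length of a single repeat copy.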
Table 3. Intergenic regions containing tandem repeats with copy number variation between Zanthoxylum piperitum and Z. schinifolium
                                                                              Copy number
No.  Position                 Sequence (length)                               KZP   KZS
1    trnH(GUG) - psbA         TAATTTTCTTAGTAGTATTC (20 bp)                    2     1
2    psbK - psbI              AGAGCCAACCACAATGT (17 bp)                       1     2
3    trnS(GCU) - trnG(UCC)    GTTACATTGTTACATTACACA (21 bp)                   1.5   2
4-6  trnR(UCU) - atpA; atpH - atpI; petN - psbM
                              TTATATATTTATATT (15 bp)                         1     2
                              AAAGAAAATATTAAG (15 bp)                         2     1
                              AGTAATTTCATTATA (15 bp)                         0.5   2
                              TTTAATTCAGTAATTCAATT (20 bp)                    2     1
                              CCATTTAGAATTTTTCAGTAATTTAATT (28 bp)            1     1.5
7*   psbM - trnD(GUC)         AATACTAAAATACTAATA (18 bp)                      1.5   2
                              CTTTTTTTTATTTATCATT (17 bp)                     1     2
8*   psbZ - trnG(GCC)         AAATAAATATTAATATAATAATT (23 bp)                 1     2
                              TTATTAATAGAAATATATATTATTTATA (28 bp)            1     3
9*   trnS(GGA) - rps4         GGTGAAAGGGGAAATTTGTACGAGCCCGTTATTTTAGT (38 bp)  1     2
10   trnT(UGU) - trnL(UAA)    TCTTAATCTATTCTA (15 bp)                         1     2
11   ndhC - trnV(UAC)         TAGTTTCGTTTGTTTGTTGT (20 bp)                    1     2
                              TTTTGATTCTATTCTATA (18 bp)                      1     2
12*  rpl33 - rps18            TTATTTCATATATTTAAATAGAAACAA (27 bp)             1     3
13   rpl16 - rps3             TTTAGAGATAATCTCAA (17 bp)                       1     2
14   rrn4.5 - rrn5            ATTGTTCAACTCTTTGACAACATGAAAAAACC (32 bp)        1     2
* Regions where PCR markers were developed for validation
Table 4. Newly developed markers for validation of the polymorphic sites among Zanthoxylum species
                                                                                Product size (bp)
Marker name   Region              Primer sequence (5'-3')                       CZ*    KZP    KZS    Specific to   Type
IMZanTR-1     petN - psbM         Forward  AATTGAGTTGGGAAATCAAACTGTA            196    214    290    CZ/KZP/KZS    TR & InDel
                                  Reverse  CTCGCTAGAATCCAAGACAATAGAA
IMZanTR-2     psbM - trnD(GUC)    Forward  GATCTTTTATCCACACACCGAATAC            131    133    163    KZS           TR
                                  Reverse  GAAAAGACAGAATGGAAAAGAATGA
IMZanTR-3     psbZ - trnG(GCC)    Forward  GGGATCAAACTTCTGGAACTTGA              389    405    520    CZ/KZP/KZS    TR & InDel
                                  Reverse  TTATCCCGAGTTAGGCCAGATAC
IMZanTR-4     trnS(GGA) - rps4    Forward  CAATTCCCAGTTTCTGTGATACG              210    210    248    KZS           TR
                                  Reverse  CTCGTCAGACTTAAACCTAACTAAAAT
IMZanTR-5     rpl33 - rps18       Forward  GTGCTTGTGTGTCACCCTTG                 132    132    258    KZS           TR & InDel
                                  Reverse  GAGTCGCTTGGTTTTATCCAT
IMZanInDel-1  rps16 - trnQ(UUG)   Forward  AGTGGTAAGGCAACGGGTTT                 594    120    593    KZP           InDel
                                  Reverse  GATACAAAGACAAAAAGTCCCACA
IMZanInDel-2  ycf1                Forward  CAAAATCGAGGAAACGGAAGAGA              943    943    361    KZS           InDel
                                  Reverse  TTGATGGAATTACGAATGGGGTC
* Predicted CZ product size based on Zanthoxylum bungeanum (GenBank No. KX497031)
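The expected product sizes in Table 4 can serve as a lookup for assigning an unknown sample to CZ, KZP or KZS. The Python below is a minimal sketch of that idea, not the procedure used in this study: the sizes are transcribed from Table 4 (CZ values are predictions based on Z. bungeanum), while the function names and the 5 bp matching tolerance are assumptions made purely for illustration.

```python
# Expected PCR product sizes (bp) transcribed from Table 4.
# CZ sizes are predictions based on Z. bungeanum (GenBank No. KX497031).
EXPECTED_SIZES = {
    "IMZanTR-1": {"CZ": 196, "KZP": 214, "KZS": 290},
    "IMZanTR-2": {"CZ": 131, "KZP": 133, "KZS": 163},
    "IMZanTR-3": {"CZ": 389, "KZP": 405, "KZS": 520},
    "IMZanTR-4": {"CZ": 210, "KZP": 210, "KZS": 248},
    "IMZanTR-5": {"CZ": 132, "KZP": 132, "KZS": 258},
    "IMZanInDel-1": {"CZ": 594, "KZP": 120, "KZS": 593},
    "IMZanInDel-2": {"CZ": 943, "KZP": 943, "KZS": 361},
}


def matching_groups(marker: str, observed_bp: int, tolerance_bp: int = 5) -> set:
    """Groups whose expected size lies within the (assumed) tolerance of the observed band."""
    return {
        group
        for group, size in EXPECTED_SIZES[marker].items()
        if abs(size - observed_bp) <= tolerance_bp
    }


def identify(observations: dict, tolerance_bp: int = 5) -> set:
    """Intersect the candidate groups over all genotyped markers."""
    candidates = {"CZ", "KZP", "KZS"}
    for marker, observed_bp in observations.items():
        candidates &= matching_groups(marker, observed_bp, tolerance_bp)
    return candidates


# A KZS-specific marker alone cannot separate CZ from KZP ...
print(sorted(identify({"IMZanTR-4": 210})))
# ... but adding the KZP-specific InDel marker narrows the call.
print(sorted(identify({"IMZanTR-4": 210, "IMZanInDel-1": 120})))
```

An empty result indicates band sizes that are inconsistent with all three groups under the chosen tolerance, which is the expected outcome for a species not covered by Table 4.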